Learning Apache Cassandra - Second Edition

Ebook, 624 pages, 2 hours

Name: Learning Apache Cassandra - Second Edition
Author: Sandeep Yarabarla
ISBN: 9781787128408

About This Book

Install Cassandra and set up multi-node clusters
Design rich schemas that capture the relationships between different data types
Master the advanced features available in Cassandra 3.x through a step-by-step tutorial and build a scalable, high-performance database layer

Who This Book Is For

If you are a NoSQL developer who is new to Apache Cassandra and wants to learn both its common and its not-so-common features, this book is for you. Developers who want to enter the world of NoSQL will also find it useful.

It does not assume any prior experience in coding or any framework.

Language: English

Publisher: Packt Publishing

Release date: Apr 25, 2017

Related categories

Skip carousel

Reviews for Learning Apache Cassandra - Second Edition

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Learning Apache Cassandra - Second Edition - Sandeep Yarabarla

Title Page

Learning Apache Cassandra

Second Edition

Managing fault-tolerant and scalable data

Sandeep Yarabarla

BIRMINGHAM - MUMBAI

Learning Apache Cassandra

Second Edition

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: February 2015

Second Edition: April 2017

Production reference: 1200417

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78712-729-6

www.packtpub.com

Credits

About the Author

Sandeep Yarabarla is a professional software engineer working for Verizon Labs, based out of Palo Alto, CA. Since graduating from Carnegie Mellon University, he has worked on several big data technologies for a spectrum of companies. He has developed applications primarily in Java and Go.

His experience includes handling large amounts of unstructured and structured data in Hadoop, and developing data processing applications using Spark and MapReduce. Right now, he is working with some cutting-edge technologies such as Cassandra, Kafka, Mesos, and Docker to build fault-tolerant and highly scalable applications.

I would like to thank my mom and dad for their love and support throughout my career. I would also like to thank my relatives and friends for their help during various stages of my life. Lastly, I would like to thank Packt for giving me this opportunity to write this book and all the staff involved who helped me with the book's completion.

About the Reviewer

Graham Doman is a passionate software architect who has worked in a wide variety of business domains over his 20-year career. He started off as a junior working with C++, before moving on to C# and JavaScript, which have been his main languages for many years. He’s worked on a variety of projects and products, including recruitment agency systems, medical devices, back-of-bridge route planning software, air-powered printer drivers, and many more.

He had the opportunity to study for an MSc in Data Science, and having worked in data-focused projects throughout his career, he jumped at the chance, graduating in 2015. He has been passionate about NoSQL, big data, and their application in IoT projects ever since. As a result of this newfound passion, he’s delved into Hadoop, Cassandra, Spark, MQTT, Python, R, Scala, and Java. Though he’s not particularly mathematically minded, he’s even delved into the curious world of statistics.

He has his own IT consultancy company, Buteo Consultancy Ltd (http://www.bizdb.co.uk/), which specialises in data and software engineering, data science, and IoT. He is actively working on a number of different contracts and forging new connections.

I would like to thank my family, Sally, Ewan, Erin, William and Felix, who have supported me in all my endeavours these past few years. I couldn't do it without you guys.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/178712729X.

If you'd like to join our team of regular reviewers, you can e-mail us at customerreviews@packtpub.com. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Getting Up and Running with Cassandra

What is big data?

Challenges of modern applications

Why not relational databases?

How to handle big data

What is Cassandra and why Cassandra?

Horizontal scalability

High availability

Write optimization

Structured records

Secondary indexes

Materialized views

Efficient result ordering

Immediate consistency

Discretely writable collections

Relational joins

MapReduce and Spark

Rich and flexible data model

Lightweight transactions

Multidata center replication

Comparing Cassandra to the alternatives

Installing Cassandra

Installing the JDK

Installing on Debian-based systems (Ubuntu)

Installing on RHEL-based systems

Installing on Windows

Installing on Mac OS X

Installing the binary tarball

Bootstrapping the project

CQL—the Cassandra Query Language

Interacting with Cassandra

Getting started with CQL

Creating a keyspace

Selecting a keyspace

Creating a table

Inserting and reading data

New features in Cassandra 2.2, 3.0, and 3.X

Summary

The First Table

How to configure keyspaces

Creating the users table

Structuring of tables

Table and column options

The type system

Strings

Integers

Floating point and decimal numbers

Timestamp

UUIDs

Booleans

Blobs

Collections

Other data types

The purpose of types

Inserting data

Writing data does not yield feedback

Partial inserts

Selecting data

Missing rows

Selecting more than one row

Retrieving all the rows

Paginating through results

Inserts are always upserts

Developing a mental model for Cassandra

Summary

Organizing Related Data

A table for status updates

Creating a table with a compound primary key

The structure of the status updates table

UUIDs and timestamps

Working with status updates

Extracting timestamps

Looking up a specific status update

Automatically generating UUIDs

Anatomy of a compound primary key

Anatomy of a single-column primary key

Beyond two columns

Multiple clustering columns

Composite partition keys

Composite partition key table

Structure of composite partition key tables

Composite partition key with multiple clustering columns

Compound keys represent parent-child relationships

Coupling parents and children using static columns

Defining static columns

Working with static columns

Interacting only with the static columns

Static-only inserts

Static columns act like predefined joins

When to use static columns

Refining our mental model

Summary

Beyond Key-Value Lookup

Looking up rows by partition

The limits of the WHERE keyword

Restricting by clustering column

Restricting by part of a partition key

Retrieving status updates for a specific time range

Creating time UUID ranges

Selecting a slice of a partition

Paginating over rows in a partition

Counting rows

Reversing the order of rows

Reversing clustering order at query time

Reversing clustering order in the schema

Limitations of ORDER BY

ORDER BY summary

Paginating over multiple partitions

JSON support

INSERT JSON

SELECT JSON

Building an autocomplete function

Summary

Establishing Relationships

Modeling follow relationships

Outbound follows

Inbound follows

Storing follow relationships

Cassandra data modelling

Conceptual data model (entity relationship model)

Logical data model (query-driven design)

Physical data model

Denormalization

Looking up follow relationships

Unfollowing users

Using secondary indexes to avoid denormalization

The form of the single table

Adding a secondary index

Other uses of secondary indexes

Limitations of secondary indexes

Secondary indexes can only have one column

Secondary indexes can only be tested for equality

Secondary index lookup is not as efficient as primary key lookup

Materialized views

Adding a view

Summary

Denormalizing Data for Maximum Performance

A normalized approach

Generating the timeline

Ordering and pagination

Multiple partitions and read efficiency

Partial denormalization

Displaying the home timeline

Read performance and write complexity

Fully denormalizing the home timeline

Creating a status update

Displaying the home timeline

Write complexity and data integrity

Batching in Cassandra

Logged batches

Unlogged batches

When to use unlogged batches

Misuse of BATCH statements

Summary

Expanding Your Data Model

Viewing a keyspace schema

Viewing a table schema in cqlsh

Adding columns to tables

Deleting columns

Updating the existing rows

Updating multiple columns

Updating multiple rows

Removing a value from a column

Missing columns in Cassandra

Deleting specific columns

Syntactic sugar for deletion

Deleting table data (TRUNCATE)

Deleting table/keyspace with schema (DROP)

Inserts, updates, and upserts

Inserts can overwrite existing data

Checking before inserting isn't enough

Another advantage of UUIDs

Conditional inserts and lightweight transactions

Updates can create new rows

Optimistic locking with conditional updates

Optimistic locking in action

Optimistic locking and accidental updates

Lightweight transactions and their cost

When lightweight transactions aren't necessary

Summary

Collections, Tuples, and User-Defined Types

The problem with concurrent updates

Serializing the collection

Introducing concurrency

Collection columns and concurrent updates

Defining collection columns

Reading and writing sets

Advanced set manipulation

Removing values from a set

Sets and uniqueness

Collections and upserts

Using lists for ordered, non-unique values

Defining a list column

Writing a list

Discrete list manipulation

Writing data at a specific index

Removing elements from the list

Using maps to store key-value pairs

Writing a map

Updating discrete values in a map

Removing values from maps

Collections in inserts

Collections and secondary indexes

Secondary indexes on map columns

The limitations of collections

Reading discrete values from collections

Collection size limit

Reading a collection column from multiple rows

Unable to reuse collection names

Performance of collection operations

Working with tuples

Creating a tuple column

Writing to tuples

Indexing tuples

User-defined types

Creating a user-defined type

Assigning a user-defined type to a column

Adding data to a user-defined column

Indexing and querying user-defined types

Partial selection of user-defined types

Choosing between tuples and user-defined types

Nested collections

Nested tuples/UDTs

Comparing data structures

Summary

Aggregating Time-Series Data

Recording discrete analytics observations

Using discrete analytics observations

Slicing and dicing our data

Recording aggregate analytics observations

Answering the right question

Precomputation versus read-time aggregation

The many possibilities for aggregation

The role of discrete observations

Recording analytics observations

Updating a counter column

Counters and upserts

Setting and resetting counter columns

Counter columns and deletion

Counter columns need their own table

Cassandra configuration

Configuration location

Modifying configuration

Restarting Cassandra

User-defined functions

User-defined aggregate functions

Standard aggregate functions

Summary

How Cassandra Distributes Data

Data distribution in Cassandra

Cassandra's partitioning strategy - partition key tokens

Distributing partition tokens

Partitioners

Partition keys group data on the same node

Virtual nodes

Virtual nodes facilitate redistribution

Data replication in Cassandra

Masterless replication

Replication without a master

Gossip protocol

Multidata center cluster

Snitch

Replication strategy

Durable writes

Consistency

Immediate and eventual consistency

Consistency in Cassandra

The anatomy of a successful request

Tuning consistency

Eventual consistency with ONE

Immediate consistency with ALL

Fault-tolerant immediate consistency with QUORUM

Local consistency levels

Comparing consistency levels

Choosing the right consistency level

The CAP theorem

Handling conflicting data

Last-write-wins conflict resolution

Introspecting write timestamps

Overriding write timestamps

Distributed deletion

Stumbling on tombstones

Expiring columns with TTL

Table configuration options

Summary

Cassandra Multi-Node Cluster

3-node cluster

Prerequisites

Tuning configuration options setting up a 3-node cluster

Tuning configuration

Cassandra.yaml

Cassandra-env.sh

Starting the 3-node cluster

Consistency in action

Write consistency

Consistency QUORUM

Consistency ANY

Cassandra internals

The write path

Compaction

The read path

Cassandra repair mechanisms

Hinted handoff

Read repair

Anti-entropy repair

Summary

Application Development Using the Java Driver

A simple query

Cluster API

Getting metadata

Querying

Prepared statements

QueryBuilder API

Building an INSERT statement

Building an UPDATE statement

Building a SELECT statement

Asynchronous querying

Execute asynchronously

Processing future results

Driver policies

Load-balancing policy

RoundRobinPolicy

DCAwareRoundRobinPolicy

TokenAwarePolicy

Retry Policy

Summary

Peeking under the Hood

Using cassandra-cli

The structure of a simple primary key table

Exploring cells

A model of column families: RowKey and cells

Compound primary keys in column families

A complete mapping

The wide row data structure

The empty cell

Collection columns in column families

Set columns in column families

Map columns in column families

List columns in column families

Appending and prepending values to lists

Other list operations

Summary

Authentication and Authorization

Enabling authentication and authorization

Authentication, authorization, and fault-tolerance

Authentication with cqlsh

Authentication in your application

Setting up a user

Changing a user's password

Viewing user accounts

Controlling access

Viewing permissions

Revoking access

Authorization in action

Authorization as a hedge against mistakes

Security beyond authentication and authorization

Security protects against vulnerabilities

Summary

Wrapping up

Preface

The crop of distributed databases that have come to the market in recent years appeals to application developers for several reasons. Their storage capacity is nearly limitless, bounded only by the number of machines you can afford to spin up. Masterless replication makes them resilient to adverse events, handling even a complete machine failure without any noticeable effect on the applications that rely on them. Log-structured storage engines allow these databases to handle high-volume write loads without blinking an eye.

But compared to traditional relational databases, not to mention newer document stores, distributed databases are typically feature-poor and inconvenient to work with. Read and write functionality is frequently confined to simple key-value operations, with more complex operations demanding arcane map-reduce implementations. Happily, Cassandra provides all of the benefits of a fully distributed data store while also exposing a familiar, user-friendly data model and query interface.

By the time I began writing this book, Cassandra had seen plenty of improvements with regard to performance and feature set since its inception. The earliest versions of Cassandra were optimized for fast, high-volume writes. The read performance was good, but not on par with the write performance. Several improvements were made to make reads considerably faster, such as the addition of bloom filters, caching mechanisms, better indexing, and partitioning.

Over the past couple of years, we have had several successful deployments of Cassandra, both on premises and in the cloud. I have helped several teams migrate from traditional databases to Cassandra without a hitch. Since it is a fully distributed database with a masterless architecture, it works well with a scheduling framework such as Mesos. The toughest challenge when moving from a relational database to Cassandra is coming up with an optimal data model. While Cassandra allows flexible models, it is still vital to design them so that you get the maximum performance out of it.

The goal of this book is to teach you how to use Cassandra effectively, powerfully, and efficiently. We'll explore Cassandra's ins and outs by designing the persistence layer for a messaging service that allows users to post status updates that are visible to their friends. By the end of the book, you'll be fully prepared to build your own highly scalable and highly available applications.

What this book covers

Chapter 1, Getting Up and Running with Cassandra, introduces the major reasons to choose Cassandra over a traditional relational or document database. It then provides step-by-step instructions on installing Cassandra on various operating systems, creating a keyspace, and interacting with the database using CQL and the cqlsh tool.

Chapter 2, The First Table, is a walkthrough of creating a table, inserting data, and retrieving rows by primary key. Along the way, it discusses how Cassandra tables are structured, and provides a tour of the Cassandra type system.

Chapter 3, Organizing Related Data, introduces more complex table structures that group related data together using compound primary keys and composite partition keys.

Chapter 4, Beyond Key-Value Lookup, puts the more robust schema developed in the previous chapter to use, explaining how to query for sorted ranges of rows. It also touches upon the JSON support that was introduced in Cassandra 2.2.

Chapter 5, Establishing Relationships, develops table structures for modeling relationships between rows. The chapter introduces static columns and row deletion. This chapter also touches upon secondary indexes and materialized views, which can be used to avoid denormalization of data.

Chapter 6, Denormalizing Data for Maximum Performance, explains when and why storing multiple copies of the same data can make your application more efficient. The chapter introduces batching mechanisms in Cassandra and when to use them.

Chapter 7, Expanding Your Data Model, demonstrates the use of lightweight transactions to ensure data integrity. It also introduces schema alteration, row updates, and single-column deletion.

Chapter 8, Collections, Tuples, and User-Defined Types, introduces collection columns and explores Cassandra's support for advanced, atomic collection manipulation. It also introduces tuples, nested collections, and user-defined types.

Chapter 9, Aggregating Time-Series Data, covers the common use case of collecting high-volume time-series data and introduces counter columns. It also introduces user-defined functions and user-defined aggregates.

Chapter 10, How Cassandra Distributes Data, explores what happens when you save a row to Cassandra. It considers eventual consistency and teaches you how to use tunable consistency to get the right balance between consistency and fault-tolerance.

Chapter 11, Cassandra Multi-Node Cluster, explains how the dynamics of consistency levels and replication factor change in a multi-node cluster. This chapter also touches upon some of the architectural aspects of Cassandra, including the read/write paths and data repair mechanisms.

Chapter 12, Application Development Using the Java Driver, introduces the DataStax Java driver, which can be used to develop Java applications that work with Cassandra using appropriate load balancing, reconnection, and retry policies.

Appendix A, Peeking under the Hood, peels away the abstractions provided by CQL to reveal how Cassandra represents data at the lower column family level.

Appendix B, Authentication and Authorization, introduces ways to control access to your Cassandra cluster and specific data structures within it.
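
The balance between consistency and fault-tolerance that Chapter 10 describes comes down to simple arithmetic on the replication factor. The following Python sketch is only an illustration and is not taken from the book's example code: the function names quorum, reads_see_latest_write, and tolerated_failures are hypothetical helpers, and the rule it encodes (a read is guaranteed to overlap the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor) is the general quorum argument rather than a description of Cassandra's internals.

```python
def quorum(replication_factor: int) -> int:
    """Return the smallest number of replicas that forms a majority."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(write_replicas: int, read_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set.

    This is the R + W > RF overlap rule; it says nothing about timing or
    conflict resolution, only that the two sets cannot be disjoint.
    """
    return write_replicas + read_replicas > replication_factor


def tolerated_failures(required_replicas: int, replication_factor: int) -> int:
    """Number of replicas that can be down while a request still succeeds."""
    return replication_factor - required_replicas


if __name__ == "__main__":
    rf = 3  # assumed replication factor, chosen only for illustration
    q = quorum(rf)
    print("quorum size:", q)
    print("QUORUM/QUORUM overlaps:", reads_see_latest_write(q, q, rf))
    print("ONE/ONE overlaps:", reads_see_latest_write(1, 1, rf))
    print("failures tolerated at QUORUM:", tolerated_failures(q, rf))
    print("failures tolerated at ALL:", tolerated_failures(rf, rf))
```

Under these assumptions, a replication factor of 3 gives a quorum of two replicas, so quorum reads and writes always overlap while one replica may be unavailable. Reading and writing a single replica gives no such guarantee, and requiring all replicas leaves no room for failure.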

What you need for this book

You will need the following software to work with the examples in this book:

Java Runtime Environment 8.0 (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)

Apache Cassandra 3.X (http://cassandra.apache.org/download/)

Java IDE (IntelliJ or Eclipse) to edit, compile, and run Java code

Further instructions on installing these are presented in the upcoming chapters of the book.

Who this book is for

This book is for first-time users of Cassandra, as well as anyone who wants a better understanding of Cassandra in order to evaluate it as a solution for their application. Since Cassandra is a standalone database, we don't assume any particular coding language or framework; anyone who builds applications for a living, and who wants those applications to scale, will benefit from reading the book. Some of the later examples are presented in Java, but anyone with a minimal understanding of object-oriented programming should be able to grasp them.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: The next lines of code read the link and assign it to the BeautifulSoup function.

A block of code is set as follows:

Any command-line input or output is written as follows:

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: In order to download new modules, we will go to Files | Settings | Project Name | Project Interpreter.

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book.

Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Learning-Apache-Cassandra-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/LearningApacheCassandraSecondEdition_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at questions@packtpub.com, and we will do our best to address the problem.

Getting Up and Running with Cassandra

As an application developer, you have almost certainly worked with databases extensively. You have probably built products using relational databases such as MySQL and PostgreSQL, and perhaps experimented with NoSQL databases, including a document store such as MongoDB or a key-value store such as Redis. While each of these tools has its strengths, you will now consider whether a distributed database such as Cassandra might be the best choice for the task at hand.

In this chapter, we'll begin with why NoSQL databases are needed to cope with ever-growing data. We will see why NoSQL databases are becoming the de facto choice for big data and real-time web applications. We will also talk about the major reasons to choose Cassandra from among the many database options available to you. Having established that Cassandra is a great choice, we'll go through the nuts and bolts of getting a local Cassandra installation up and running. By the end of this chapter, you'll know the following:

What big data is and why relational databases are not a good choice

When and why Cassandra is a good choice for your application

How to install Cassandra on your development machine

How to interact with Cassandra using cqlsh

How to create a keyspace and a table, and write a simple query

What is big data?

Big data is a relatively new
