Ebook692 pages3 hours

Learning HBase

Name: Learning HBase
Author: Shashwat Shriparv
ISBN: 9781783985951

By Shashwat Shriparv

Rating: 0 out of 5 stars

()

Read preview

About this ebook

Apache HBase is a nonrelational NoSQL database management system that runs on top of HDFS. It is an open source, distributed, versioned, column-oriented store. It facilitates the tech industry with random, real-time read/write access to your Big Data with the benefit of linear scalability on the fly.

This book will take you through a series of core tasks in HBase. The introductory chapter will give you all the information you need about the HBase ecosystem. Furthermore, you'll learn how to configure, create, verify, and test clusters. The book also explores different parameters of Hadoop and HBase that need to be considered for optimization and a trouble-free operation of the cluster. It will focus more on HBase's data model, storage, and structure layout. You will also get to know the different options that can be used to speed up the operation and functioning of HBase. The book will also teach the users basic- and advance-level coding in Java for HBase. By the end of the book, you will have learned how to use HBase with large data sets and integrate them with Hadoop.

Skip carousel

LanguageEnglish

PublisherPackt Publishing

Release dateNov 25, 2014

ISBN9781783985951

Author

Shashwat Shriparv

Related authors

Skip carousel

Related to Learning HBase

Related ebooks

Skip carousel

Mastering Apache Cassandra - Second Edition
Ebook
Mastering Apache Cassandra - Second Edition
byNishant Neeraj
Rating: 0 out of 5 stars
0 ratings
Hadoop Cluster Deployment
Ebook
Hadoop Cluster Deployment
byDanil Zburivsky
Rating: 0 out of 5 stars
0 ratings
Hadoop Real-World Solutions Cookbook - Second Edition
Ebook
Hadoop Real-World Solutions Cookbook - Second Edition
byDeshpande Tanmay
Rating: 0 out of 5 stars
0 ratings
Scala for Data Science
Ebook
Scala for Data Science
byBugnion Pascal
Rating: 0 out of 5 stars
0 ratings
Apache Hive Cookbook
Ebook
Apache Hive Cookbook
byShrey Mehrotra
Rating: 0 out of 5 stars
0 ratings
Hadoop 2.x Administration Cookbook
Ebook
Hadoop 2.x Administration Cookbook
byGurmukh Singh
Rating: 0 out of 5 stars
0 ratings
Mastering Hadoop
Ebook
Mastering Hadoop
bySandeep Karanth
Rating: 0 out of 5 stars
0 ratings
HDInsight Essentials - Second Edition
Ebook
HDInsight Essentials - Second Edition
byRajesh Nadipalli
Rating: 0 out of 5 stars
0 ratings
Learning Apache Cassandra - Second Edition
Ebook
Learning Apache Cassandra - Second Edition
bySandeep Yarabarla
Rating: 0 out of 5 stars
0 ratings
OpenStack Sahara Essentials
Ebook
OpenStack Sahara Essentials
byOmar Khedher
Rating: 0 out of 5 stars
0 ratings
Hadoop Blueprints
Ebook
Hadoop Blueprints
byAnurag Shrivastava
Rating: 0 out of 5 stars
0 ratings
Learning Hadoop 2
Ebook
Learning Hadoop 2
byGarry Turkington
Rating: 4 out of 5 stars
4/5
Cassandra Design Patterns - Second Edition
Ebook
Cassandra Design Patterns - Second Edition
byThottuvaikkatumana Rajanarayanan
Rating: 0 out of 5 stars
0 ratings
Learning Apache Spark 2
Ebook
Learning Apache Spark 2
byMuhammad Asif Abbasi
Rating: 0 out of 5 stars
0 ratings
Apache Hive Essentials
Ebook
Apache Hive Essentials
byDayong Du
Rating: 0 out of 5 stars
0 ratings
HBase High Performance Cookbook
Ebook
HBase High Performance Cookbook
byRuchir Choudhry
Rating: 0 out of 5 stars
0 ratings
Cassandra High Availability
Ebook
Cassandra High Availability
byRobbie Strickland
Rating: 5 out of 5 stars
5/5
Hadoop MapReduce v2 Cookbook - Second Edition
Ebook
Hadoop MapReduce v2 Cookbook - Second Edition
byThilina Gunarathne
Rating: 0 out of 5 stars
0 ratings
Cloudera Administration Handbook
Ebook
Cloudera Administration Handbook
byRohit Menon
Rating: 0 out of 5 stars
0 ratings
Exploring Hadoop Ecosystem (Volume 1): Batch Processing
Ebook
Exploring Hadoop Ecosystem (Volume 1): Batch Processing
byWei Liu
Rating: 0 out of 5 stars
0 ratings
Scala Test-Driven Development
Ebook
Scala Test-Driven Development
byGaurav Sood
Rating: 0 out of 5 stars
0 ratings
Learning Underscore.js
Ebook
Learning Underscore.js
byPop Alex
Rating: 0 out of 5 stars
0 ratings
Monitoring Hadoop
Ebook
Monitoring Hadoop
byGurmukh Singh
Rating: 0 out of 5 stars
0 ratings
Amazon SimpleDB: LITE
Ebook
Amazon SimpleDB: LITE
byPrabhakar Chaganti
Rating: 0 out of 5 stars
0 ratings
Learning Kibana 5.0
Ebook
Learning Kibana 5.0
byBahaaldine Azarmi
Rating: 0 out of 5 stars
0 ratings
PostgreSQL Development Essentials
Ebook
PostgreSQL Development Essentials
byManpreet Kaur
Rating: 5 out of 5 stars
5/5
Kafka Streams - Real-time Streams Processing
Ebook
Kafka Streams - Real-time Streams Processing
byPrashant Kumar Pandey
Rating: 5 out of 5 stars
5/5
DynamoDB Applied Design Patterns
Ebook
DynamoDB Applied Design Patterns
byUchit Vyas
Rating: 3 out of 5 stars
3/5
Learning Elasticsearch
Ebook
Learning Elasticsearch
byAbhishek Andhavarapu
Rating: 4 out of 5 stars
4/5
Learning Azure DocumentDB
Ebook
Learning Azure DocumentDB
byBecker Riccardo
Rating: 0 out of 5 stars
0 ratings

Programming For You

Skip carousel

Python: For Beginners A Crash Course Guide To Learn Python in 1 Week
Ebook
Python: For Beginners A Crash Course Guide To Learn Python in 1 Week
byTimothy C. Needham
Rating: 4 out of 5 stars
4/5
HTML & CSS: Learn the Fundaments in 7 Days
Ebook
HTML & CSS: Learn the Fundaments in 7 Days
byMichael Knapp
Rating: 4 out of 5 stars
4/5
Python Programming : How to Code Python Fast In Just 24 Hours With 7 Simple Steps
Ebook
Python Programming : How to Code Python Fast In Just 24 Hours With 7 Simple Steps
byJason Scotts
Rating: 4 out of 5 stars
4/5
Java for Beginners: A Crash Course to Learn Java Programming in 1 Week
Ebook
Java for Beginners: A Crash Course to Learn Java Programming in 1 Week
byBrady Ellison
Rating: 5 out of 5 stars
5/5
Game Development with Unreal Engine 5: Learn the Basics of Game Development in Unreal Engine 5 (English Edition)
Ebook
Game Development with Unreal Engine 5: Learn the Basics of Game Development in Unreal Engine 5 (English Edition)
byMitchell Lynn
Rating: 0 out of 5 stars
0 ratings
Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1
Ebook
Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1
byKevin Clark
Rating: 5 out of 5 stars
5/5
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
Ebook
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
byNigel Tillery
Rating: 0 out of 5 stars
0 ratings
HTML & CSS QuickStart Guide: The Simplified Beginners Guide to Developing a Strong Coding Foundation, Building Responsive Websites, and Mastering the Fundamentals of Modern Web Design
Ebook
HTML & CSS QuickStart Guide: The Simplified Beginners Guide to Developing a Strong Coding Foundation, Building Responsive Websites, and Mastering the Fundamentals of Modern Web Design
byDavid DuRocher
Rating: 4 out of 5 stars
4/5
Grokking Algorithms: An illustrated guide for programmers and other curious people
Ebook
Grokking Algorithms: An illustrated guide for programmers and other curious people
byAditya Bhargava
Rating: 4 out of 5 stars
4/5
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
Ebook
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
bySteven Cooper
Rating: 4 out of 5 stars
4/5
Python Programming For Beginners: Learn The Basics Of Python Programming (Python Crash Course, Programming for Dummies)
Ebook
Python Programming For Beginners: Learn The Basics Of Python Programming (Python Crash Course, Programming for Dummies)
byJames Tudor
Rating: 5 out of 5 stars
5/5
Microsoft Office 365 Bible: 10:1 Mastery | Excel in Your Profession, Enhance Time Management, and Foster Exceptional Collaboration [III EDITION]: Career Elevator
Ebook
Microsoft Office 365 Bible: 10:1 Mastery | Excel in Your Profession, Enhance Time Management, and Foster Exceptional Collaboration [III EDITION]: Career Elevator
byKevin Pitch
Rating: 5 out of 5 stars
5/5
C# Programming from Zero to Proficiency (Beginner): C# from Zero to Proficiency, #2
Ebook
C# Programming from Zero to Proficiency (Beginner): C# from Zero to Proficiency, #2
byPatrick Felicia
Rating: 0 out of 5 stars
0 ratings
The Advanced Roblox Coding Book: An Unofficial Guide, Updated Edition: Learn How to Script Games, Code Objects and Settings, and Create Your Own World!
Ebook
The Advanced Roblox Coding Book: An Unofficial Guide, Updated Edition: Learn How to Script Games, Code Objects and Settings, and Create Your Own World!
byHeath Haskins
Rating: 5 out of 5 stars
5/5
Python Data Structures and Algorithms
Ebook
Python Data Structures and Algorithms
byBenjamin Baka
Rating: 5 out of 5 stars
5/5
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
Ebook
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
byWalter Shields
Rating: 4 out of 5 stars
4/5
PYTHON: Practical Python Programming For Beginners & Experts With Hands-on Project
Ebook
PYTHON: Practical Python Programming For Beginners & Experts With Hands-on Project
byMark Chan
Rating: 5 out of 5 stars
5/5
Learn SQL in 24 Hours
Ebook
Learn SQL in 24 Hours
byAlex Nordeen
Rating: 5 out of 5 stars
5/5
Learn JavaScript in 24 Hours
Ebook
Learn JavaScript in 24 Hours
byAlex Nordeen
Rating: 3 out of 5 stars
3/5
Python Programming for Beginners: A Comprehensive Crash Course With Practical Exercises to Quickly Learn Coding and Programming for Data Analysis and Machine Learning
Ebook
Python Programming for Beginners: A Comprehensive Crash Course With Practical Exercises to Quickly Learn Coding and Programming for Data Analysis and Machine Learning
byAnthony Adams
Rating: 4 out of 5 stars
4/5
Coding All-in-One For Dummies
Ebook
Coding All-in-One For Dummies
byNikhil Abraham
Rating: 4 out of 5 stars
4/5
Problem Solving in C and Python: Programming Exercises and Solutions, Part 1
Ebook
Problem Solving in C and Python: Programming Exercises and Solutions, Part 1
byYana Kortsarts
Rating: 5 out of 5 stars
5/5
Python QuickStart Guide: The Simplified Beginner's Guide to Python Programming Using Hands-On Projects and Real-World Applications
Ebook
Python QuickStart Guide: The Simplified Beginner's Guide to Python Programming Using Hands-On Projects and Real-World Applications
byRobert Oliver
Rating: 0 out of 5 stars
0 ratings
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
Ebook
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
byArthur T. Brooks
Rating: 0 out of 5 stars
0 ratings
Linux: Learn in 24 Hours
Ebook
Linux: Learn in 24 Hours
byAlex Nordeen
Rating: 5 out of 5 stars
5/5
Python Programming, Deep Learning: 3 Books in 1: A Complete Guide for Beginners, Python Coding for Ai, Neural Networks, & Machine Learning, Data Science/Analysis with Practical Exercises for Learners
Ebook
Python Programming, Deep Learning: 3 Books in 1: A Complete Guide for Beginners, Python Coding for Ai, Neural Networks, & Machine Learning, Data Science/Analysis with Practical Exercises for Learners
byAnthony Adams
Rating: 4 out of 5 stars
4/5
OneNote: The Ultimate Guide on How to Use Microsoft OneNote for Getting Things Done
Ebook
OneNote: The Ultimate Guide on How to Use Microsoft OneNote for Getting Things Done
byChris Will
Rating: 1 out of 5 stars
1/5
Learn to Code. Get a Job. The Ultimate Guide to Learning and Getting Hired as a Developer.
Ebook
Learn to Code. Get a Job. The Ultimate Guide to Learning and Getting Hired as a Developer.
byGwendolyn Faraday
Rating: 5 out of 5 stars
5/5
Python Machine Learning By Example
Ebook
Python Machine Learning By Example
byYuxi (Hayden) Liu
Rating: 4 out of 5 stars
4/5
Expert Python Programming - Third Edition: Become a master in Python by learning coding best practices and advanced programming concepts in Python 3.7, 3rd Edition
Ebook
Expert Python Programming - Third Edition: Become a master in Python by learning coding best practices and advanced programming concepts in Python 3.7, 3rd Edition
byMichał Jaworski
Rating: 0 out of 5 stars
0 ratings

Related podcast episodes

Skip carousel

Episode 101. Allright, let's talk about Kafka: Whew! So we took a big break over summer (like Bob said, we were just swamped with work.. oof), but we are BACK! and like always we are ready to explore even deeper Java topics for the professional developer. This time we set our sights in Apache...
Podcast episode
Episode 101. Allright, let's talk about Kafka: Whew! So we took a big break over summer (like Bob said, we were just swamped with work.. oof), but we are BACK! and like always we are ready to explore even deeper Java topics for the professional developer. This time we set our sights in Apache...
byJava Pub House
0 ratings
0% found this document useful
Taking A Tour Of PostgreSQL with Jonathan Katz - Episode 42: A Whirlwind Tour Of The PostgreSQL Database (Interview)
Podcast episode
Taking A Tour Of PostgreSQL with Jonathan Katz - Episode 42: A Whirlwind Tour Of The PostgreSQL Database (Interview)
byData Engineering Podcast
100%
100% found this document useful
433: Falling for FastAPI: Mike's falling in love with FastAPI and gives us a hint at the next project he's building.
Podcast episode
433: Falling for FastAPI: Mike's falling in love with FastAPI and gives us a hint at the next project he's building.
byCoder Radio
0 ratings
0% found this document useful
All Things Azure with Dwayne Monroe: Dwayne Monroe is a senior cloud architect at Cloudreach, an organization that helps enterprises maximize their cloud investments, who’s focused on Azure. Prior to joining Cloudreach, Dwayne worked as a senior Microsoft and cloud architect at High Availabi
Podcast episode
All Things Azure with Dwayne Monroe: Dwayne Monroe is a senior cloud architect at Cloudreach, an organization that helps enterprises maximize their cloud investments, who’s focused on Azure. Prior to joining Cloudreach, Dwayne worked as a senior Microsoft and cloud architect at High Availabi
byScreaming in the Cloud
0 ratings
0% found this document useful
55: Go on The Web: Summary Andrew Gerrand (@enneff), Developer Advocate at Google & Go core contributor, talks about GoLang and how it is being used in Web Development today as well as the plans for the future of the Go as a platform for the web. Resources Go...
Podcast episode
55: Go on The Web: Summary Andrew Gerrand (@enneff), Developer Advocate at Google & Go core contributor, talks about GoLang and how it is being used in Web Development today as well as the plans for the future of the Go as a platform for the web. Resources Go...
byThe Web Platform Podcast
100%
100% found this document useful
EP54 - What is the Map Operation in Java Streams?: Big announcement: today marks the launch of our brand new "Beginners only" Coding Bootcamp. If you're a beginner to coding and have spent less than about 6 months learning to code, you're a great fit for this new 16 week Coding Bootcamp. You can join...
Podcast episode
EP54 - What is the Map Operation in Java Streams?: Big announcement: today marks the launch of our brand new "Beginners only" Coding Bootcamp. If you're a beginner to coding and have spent less than about 6 months learning to code, you're a great fit for this new 16 week Coding Bootcamp. You can join...
byCoders Campus Podcast
0 ratings
0% found this document useful
Beam and Spark with Holden Karau: This week our colleague, Holden Karau, joins us to talk about Spark and Beam.
Podcast episode
Beam and Spark with Holden Karau: This week our colleague, Holden Karau, joins us to talk about Spark and Beam.
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
Podcast episode
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
byInvest Like the Best with Patrick O'Shaughnessy
0 ratings
0% found this document useful
25: Selenium, pytest, Mozilla – Dave Hunt: Interview with Dave Hunt @davehunt82. We Cover: Selenium Driver: http://www.seleniumhq.org/ pytest: http://docs.pytest.org/ pytest plugins: pytest-selenium: http://pytest-selenium.readthedocs.io/ pytest-html: https://pypi.python.
Podcast episode
25: Selenium, pytest, Mozilla – Dave Hunt: Interview with Dave Hunt @davehunt82. We Cover: Selenium Driver: http://www.seleniumhq.org/ pytest: http://docs.pytest.org/ pytest plugins: pytest-selenium: http://pytest-selenium.readthedocs.io/ pytest-html: https://pypi.python.
byTest and Code
0 ratings
0% found this document useful
Building Data Flows In Apache NiFi With Kevin Doran and Andy LoPresto - Episode 39: Self Service Data Flows With Apache NiFi (Interview)
Podcast episode
Building Data Flows In Apache NiFi With Kevin Doran and Andy LoPresto - Episode 39: Self Service Data Flows With Apache NiFi (Interview)
byData Engineering Podcast
0 ratings
0% found this document useful
Hasty Treat - Hireable Skills for 2021: In this Hasty Treat, Scott and Wes talk about hireable skills or 2021 — what you need to know to get a job and grow in your career this year! Freshbooks - Sponsor Get a 30 day free trial of Freshbooks at and put SYNTAX in the “How did...
Podcast episode
Hasty Treat - Hireable Skills for 2021: In this Hasty Treat, Scott and Wes talk about hireable skills or 2021 — what you need to know to get a job and grow in your career this year! Freshbooks - Sponsor Get a 30 day free trial of Freshbooks at and put SYNTAX in the “How did...
bySyntax - Tasty Web Development Treats
0 ratings
0% found this document useful
Building and Testing Web Services with Postman: With Guest Host Abhinov Asthana
Podcast episode
Building and Testing Web Services with Postman: With Guest Host Abhinov Asthana
byProgramming Throwdown
0 ratings
0% found this document useful
Python, Django, and Channels: with Andrew Godwin, creator of Django Channels
Podcast episode
Python, Django, and Channels: with Andrew Godwin, creator of Django Channels
byThe Changelog: Software Development, Open Source
0 ratings
0% found this document useful
Lessons Learned from Cloud Foundry
Podcast episode
Lessons Learned from Cloud Foundry
byThe Cloudcast
0 ratings
0% found this document useful
State In React: In this episode of Syntax, Scott and Wes talk about state in React: local state, global state, UI state, data state, caching, API data and more! LogRocket - Sponsor LogRocket lets you replay what users do on your site, helping you reproduce bugs and...
Podcast episode
State In React: In this episode of Syntax, Scott and Wes talk about state in React: local state, global state, UI state, data state, caching, API data and more! LogRocket - Sponsor LogRocket lets you replay what users do on your site, helping you reproduce bugs and...
bySyntax - Tasty Web Development Treats
0 ratings
0% found this document useful
Engineering interview tips & tricks: with Emma Draper & Jonas
Podcast episode
Engineering interview tips & tricks: with Emma Draper & Jonas
byGo Time: Golang, Software Engineering
0 ratings
0% found this document useful
EP 09: Application Contexts, Dependency Injection, and Inversion of Control - OH MY!
Podcast episode
EP 09: Application Contexts, Dependency Injection, and Inversion of Control - OH MY!
byPro Coder Show
0 ratings
0% found this document useful
Open Source at Google Cloud Platform with Sarah Novotny: Mark and Melanie are joined by Sarah Novotny, Head of Open Source Strategy for GCP, to talk all about Open Source, the Cloud Native Compute Foundation & their relationships to Google Cloud Platform.
Podcast episode
Open Source at Google Cloud Platform with Sarah Novotny: Mark and Melanie are joined by Sarah Novotny, Head of Open Source Strategy for GCP, to talk all about Open Source, the Cloud Native Compute Foundation & their relationships to Google Cloud Platform.
byGoogle Cloud Platform Podcast
100%
100% found this document useful
A Multipurpose Database For Transactions And Analytics To Simplify Your Data Architecture With Singlestore: An interview with Shireesh Thota about how the Singlestore database engine allows you to reduce architectural sprawl in your data systems by combining performant and scalable transactional and analytical capabilities into a single platform
Podcast episode
A Multipurpose Database For Transactions And Analytics To Simplify Your Data Architecture With Singlestore: An interview with Shireesh Thota about how the Singlestore database engine allows you to reduce architectural sprawl in your data systems by combining performant and scalable transactional and analytical capabilities into a single platform
byData Engineering Podcast
0 ratings
0% found this document useful
Putting Apache Spark Into Action with Jean Georges Perrin - Episode 60: Tackling Apache Spark From The Data Engineer's Perspective (Interview)
Podcast episode
Putting Apache Spark Into Action with Jean Georges Perrin - Episode 60: Tackling Apache Spark From The Data Engineer's Perspective (Interview)
byData Engineering Podcast
0 ratings
0% found this document useful
Yugabyte and Database Innovations with Karthik Ranganathan: This week Corey is joined by Karthik Ranganathan, CTO and Co-Founder of Yugabyte, to talk about databases of which YugabyteDB is one of the best. Karthik started at Facebook building distributed databases and now has moved onto building even more! Why? We
Podcast episode
Yugabyte and Database Innovations with Karthik Ranganathan: This week Corey is joined by Karthik Ranganathan, CTO and Co-Founder of Yugabyte, to talk about databases of which YugabyteDB is one of the best. Karthik started at Facebook building distributed databases and now has moved onto building even more! Why? We
byScreaming in the Cloud
0 ratings
0% found this document useful
Episode 102. Oh my... Spring Boot 3 is out! An interview with Dan Vega from the Pivotal Team!: Ok, so it's an incredible time to be in the Java Ecosystem, and one of the biggest frameworks out there just dropped their three-point-oh version! That's right! So Spring Boot is not officially 3.0, and it has as a Baseline Java 17! (oohh!!). So we...
Podcast episode
Episode 102. Oh my... Spring Boot 3 is out! An interview with Dan Vega from the Pivotal Team!: Ok, so it's an incredible time to be in the Java Ecosystem, and one of the biggest frameworks out there just dropped their three-point-oh version! That's right! So Spring Boot is not officially 3.0, and it has as a Baseline Java 17! (oohh!!). So we...
byJava Pub House
0 ratings
0% found this document useful
Episode 73. Spring Boot 2.0 is out! Hear all about it with Greg Turnquist: Episode 73. Spring Boot 2.0 is out! Hear all about it with Greg Turnquist It's new, it's shiny, and is powerful! The new Spring Boot 2.0 framework is out! And we interviewed Spring's own @gregturn to tell us what's new, what's improved and what has...
Podcast episode
Episode 73. Spring Boot 2.0 is out! Hear all about it with Greg Turnquist: Episode 73. Spring Boot 2.0 is out! Hear all about it with Greg Turnquist It's new, it's shiny, and is powerful! The new Spring Boot 2.0 framework is out! And we interviewed Spring's own @gregturn to tell us what's new, what's improved and what has...
byJava Pub House
0 ratings
0% found this document useful
20 JavaScript Array and Object Methods to make you a better developer: Wes and Scott rattle through ~20 different Object and Arra Methods that will make you a better JavaScript developer. Freshbooks - Sponsor This is episode Wes mentions the free book . Get a 30 day free trial of Freshbooks at . Netlify —...
Podcast episode
20 JavaScript Array and Object Methods to make you a better developer: Wes and Scott rattle through ~20 different Object and Arra Methods that will make you a better JavaScript developer. Freshbooks - Sponsor This is episode Wes mentions the free book . Get a 30 day free trial of Freshbooks at . Netlify —...
bySyntax - Tasty Web Development Treats
0 ratings
0% found this document useful
Anaconda + Pyston and more: with Peter Wang, CEO of Anaconda
Podcast episode
Anaconda + Pyston and more: with Peter Wang, CEO of Anaconda
byPractical AI: Machine Learning, Data Science
0 ratings
0% found this document useful
Making Automated Machine Learning More Accessible With EvalML: An interview with Angela Lin and Jeremy Shih about the open source EvalML framework for building automated machine learning workflows.
Podcast episode
Making Automated Machine Learning More Accessible With EvalML: An interview with Angela Lin and Jeremy Shih about the open source EvalML framework for building automated machine learning workflows.
byThe Python Podcast.__init__
100%
100% found this document useful
API Lifecycle with Alan Ho: This week Alan Ho from Apigee joins your cohosts Francesc and Mark to talk about the lifecycle of an API.
Podcast episode
API Lifecycle with Alan Ho: This week Alan Ho from Apigee joins your cohosts Francesc and Mark to talk about the lifecycle of an API.
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
Data Visualization and D3.js with Irene Ros: Scott talks to Data Visualization expert Irene Ros. When she isn't contributing to the Miso Project, teaching her d3.js class, or working on making OpenVis Conf the best data visualization conference it can be, she's working on projects that focus on creating engaging interactive visual displays of information.
Podcast episode
Data Visualization and D3.js with Irene Ros: Scott talks to Data Visualization expert Irene Ros. When she isn't contributing to the Miso Project, teaching her d3.js class, or working on making OpenVis Conf the best data visualization conference it can be, she's working on projects that focus on creating engaging interactive visual displays of information.
byHanselminutes with Scott Hanselman
0 ratings
0% found this document useful
346: Elixir and Phoenix with Jesse Herrick: Jesse Herrick is a software engineer based in Columbus, Ohio at Little Lines, a RoR development company. Jesse often works in Rails for work, but his main software passion is Elixir and Phoenix. He dazzles Brittany with how great Phoenix LiveView is.
Podcast episode
346: Elixir and Phoenix with Jesse Herrick: Jesse Herrick is a software engineer based in Columbus, Ohio at Little Lines, a RoR development company. Jesse often works in Rails for work, but his main software passion is Elixir and Phoenix. He dazzles Brittany with how great Phoenix LiveView is.
byThe Ruby on Rails Podcast
0 ratings
0% found this document useful
Panel Discussion: Cloud Identity and Access Management
Podcast episode
Panel Discussion: Cloud Identity and Access Management
byCloud Ace
0 ratings
0% found this document useful

Skip carousel

Join the Pod, Man!
Linux Format
Article
Join the Pod, Man!
May 30, 2023
8 min read
MARIADB Optimise And Control Your Databases
Linux Format
Article
MARIADB Optimise And Control Your Databases
Jul 30, 2019
9 min read
Your First Steps In Grafana
Linux Format
Article
Your First Steps In Grafana
Nov 17, 2020
The easiest way to get hold of Grafana and begin using it as soon as possible is by downloading and executing its official Docker image. This means that apart from the Docker image, you won’t need to download, set up or install anything else for Graf
1 min read
Grafana Terminology
Linux Format
Article
Grafana Terminology
Jan 14, 2020
A Grafana data source is a database, file or service that provides data to Grafana – it cannot operate without data. A Grafana panel is the basic building block of Grafana. Panels are made of visualisations or queries. A Grafana query is used for req
1 min read
Pull, Configure And Run
Linux Format
Article
Pull, Configure And Run
Apr 7, 2020
Guacamole offers ready-to-run installation packages that are available for Linux distros such as CentOS or Debian. However, the thrust of this article is to illustrate running Guacamole in a Docker container context. Fire up an environment where you
8 min read
Grafana, Telegraf And Influxdb
Linux Format
Article
Grafana, Telegraf And Influxdb
Jun 30, 2020
If you don’t like Netdata or if you want to try something else, you can give Grafana (https://grafana.com), Telegraf (www.influxdata.com/time-series-platform/telegraf) and InfluxDB (www.influxdata.com/products/influxdb-overview) a try. Grafana can’t
1 min read
Build A Static Analysis Development Pipeline
Linux Format
Article
Build A Static Analysis Development Pipeline
Jul 27, 2021
9 min read
Docker vs Podman
APC
Article
Docker vs Podman
Apr 19, 2021
When Cockpit was first developed, it had plug-in support for administering your Docker containers remotely via its user-friendly web interface. But then Red Hat OS became a major backer of Cockpit, and when Red Hat developed its own alternative to Do
1 min read
Build a Better nginx Reverse Proxy
Maximum PC
Article
Build a Better nginx Reverse Proxy
Feb 4, 2020
4 min read
Monitor Systems And Docker Deployments
Linux Format
Article
Monitor Systems And Docker Deployments
Jun 30, 2020
Welcome to Netdata, software for distributed real-time performance and health monitoring of UNIX machines. Don’t you dare turn that page! A key advantage of Netdata is that it collects all of its metrics without introducing too much load on to the Li
8 min read
The Return Of Gpu Computing
PC Pro Magazine
Article
The Return Of Gpu Computing
Jul 8, 2021
5 min read
Build A Search And Analytic Engine
Linux Format
Article
Build A Search And Analytic Engine
Mar 10, 2020
7 min read
Rokoko Studio 2.0
3D World
Article
Rokoko Studio 2.0
Feb 23, 2021
1 min read
KAFKA Build Utilities With The Kafka Server
Linux Format
Article
KAFKA Build Utilities With The Kafka Server
Jul 2, 2019
Nowadays, quite a few data architectures involve both a database and Apache Kafka, which is a distributed streaming platform and the subject of this tutorial. You can also find Kafka described as a publish-subscribe message system, which is a fancy w
7 min read
Elasticsearch And Kibana Basics
Linux Format
Article
Elasticsearch And Kibana Basics
Dec 15, 2020
1 min read
Add Military-level Security To Any Project
Linux Format
Article
Add Military-level Security To Any Project
Aug 27, 2019
7 min read
Picture In A Mainframe
Linux Format
Article
Picture In A Mainframe
Jul 2, 2019
11 min read
Create Asynchronous Code With Python
Linux Format
Article
Create Asynchronous Code With Python
Jun 29, 2021
8 min read
Develop TCP/IP Servers And Clients
Linux Format
Article
Develop TCP/IP Servers And Clients
Aug 23, 2022
RUST OUR EXPERT Get the code for this tutorial from the Linux Format archive: www. linuxformat. com/archives ?issue=293. You can learn more about Rust at www. rust-lang.org. This month we’ll learn how to develop TCP/IP servers and clients in Rust
10 min read
Updating Sites
Linux Format
Article
Updating Sites
Oct 19, 2021
1 min read
Rediscover Speed With The Redis Revolution
Linux Format
Article
Rediscover Speed With The Redis Revolution
Jul 25, 2023
Credit: https://redis.io Redis is an open-source, in-memory data structure store that has gained popularity R as a highly efficient caching and messaging system. It prioritises speed, efficiency and versatility, making it a top choice for various ap
8 min read
Set Up A Production- Ready Web Server
APC
Article
Set Up A Production- Ready Web Server
Nov 4, 2019
8 min read
Set Up A Production-ready Web Server
Linux Format
Article
Set Up A Production-ready Web Server
Sep 24, 2019
8 min read
Twenty Years Of WordPress Websites!
Linux Format
Article
Twenty Years Of WordPress Websites!
Oct 17, 2023
11 min read
Enterprise-grade Monitoring Made Easy
Linux Format
Article
Enterprise-grade Monitoring Made Easy
Mar 10, 2020
9 min read
AI Coded Bash Scripts
Linux Format
Article
AI Coded Bash Scripts
Nov 14, 2023
2 min read
Cloud Support
Linux Format
Article
Cloud Support
Jan 11, 2022
Rsync doesn’t have any specific network backup features, but it could, theoretically, make use of a networked resource (such as an SMB share) as the destination folder, but that could be said of practically any backup software, so it’s a bit of a str
1 min read
How We Tested…
Linux Format
Article
How We Tested…
Jan 12, 2021
You’ll find these applications in the software repositories of most desktop distributions, even if the featured version is not the latest. Some programs provide Snap packages, and others provide installable binaries for RPM- and DEB-based distributio
1 min read
HotPicks
Linux Format
Article
HotPicks
Nov 15, 2022
12 min read
How To Build The Linux Format Server
Linux Format
Article
How To Build The Linux Format Server
Oct 19, 2021
10 min read

Related categories

Skip carousel

Reviews for Learning HBase

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Learning HBase - Shashwat Shriparv

Learning HBase

Credits

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Understanding the HBase Ecosystem

HBase layout on top of Hadoop

Comparing architectural differences between RDBMs and HBase

HBase features

HBase in the Hadoop ecosystem

Data representation in HBase

Hadoop

Core daemons of Hadoop

Comparing HBase with Hadoop

Comparing functional differences between RDBMs and HBase

Logical view of row-oriented databases

Logical view of column-oriented databases

Pros and cons of column-oriented databases

About the internal storage architecture of HBase

Getting started with HBase

When it started

HBase components and functionalities

ZooKeeper

Why an odd number of ZooKeepers?

HMaster

If a master node goes down

RegionServer

Components of a RegionServer

Client

Catalog tables

Who is using HBase and why?

When should we think of using HBase?

When not to use HBase

Understanding some open source HBase tools

The Hadoop-HBase version compatibility table

Applications of HBase

HBase pros and cons

Summary

2. Let's Begin with HBase

Understanding HBase components in detail

HFile

Region

Scalability – understanding the scale up and scale out processes

Scale in

Scale out

Reading and writing cycle

Write-Ahead Logs

MemStore

HBase housekeeping

Compaction

Minor compaction

Major compaction

Region split

Region assignment

Region merge

RegionServer failovers

The HBase delete request

The reading and writing cycle

List of available HBase distributions

Prerequisites and capacity planning for HBase

The forward DNS resolution

The reverse DNS resolution

Java

SSH

Domain Name Server

Using Network Time Protocol to keep your node on time

OS-level changes and tuning up OS for HBase

Summary

3. Let's Start Building It

Downloading Java on Ubuntu

Considering host configurations

Host file based

Command based

File based

DNS based

Installing and configuring SSH

Installing SSH on Ubuntu/Red Hat/CentOS

Configuring SSH

Installing and configuring NTP

Performing capacity planning

Installing and configuring Hadoop

core-site.xml

hdfs-site.xml

yarn-site.xml

mapred-site.xml

hadoop-env.sh

yarn-env.sh

Slaves file

Hadoop start up steps

Configuring Apache HBase

Configuring HBase in the standalone mode

Configuring HBase in the distributed mode

hbase-site.xml

HBase-env.sh

regionservers

Installing and configuring ZooKeeper

Installing Cloudera Hadoop and HBase

Downloading the required RPM packages

Installing Cloudera in an easier way

Installing the Hadoop and MapReduce packages

Installing Hadoop on Windows

Summary

4. Optimizing the HBase/Hadoop Cluster

Setup types for Hadoop and HBase clusters

Recommendations for CDH cluster configuration

Capacity planning

Hadoop optimization

General optimization tips

Optimizing Java GC

Optimizing Linux OS

Optimizing the Hadoop parameter

Optimizing MapReduce

Rack awareness in Hadoop

Number of Map and Reduce limits in configuration files

Considering and deciding the maximum number of Map and Reduce tasks

Optimizing HBase

Hadoop

Memory

Java

HBase

Optimizing ZooKeeper

Important files in Hadoop

Important files in HBase

Summary

5. The Storage, Structure Layout, and Data Model of HBase

Data types in HBase

Storing data in HBase – logical view versus actual physical view

Namespace

Commands available for namespaces

Services of HBase

Row key

Column family

Column

Cell

Version

Timestamp

Data model operations

Get

Put

Scan

Delete

Versioning and why

Deciding the number of the version

Lower bound of versions

Upper bound of versions

Schema designing

Types of table designs

Benefits of Short Wide and Tall-Thin design patterns

Composite key designing

Real-time use case of schema in an HBase table

Schema change operations

Calculating the data size stored in HBase

Summary

6. HBase Cluster Maintenance and Troubleshooting

Hadoop shell commands

Types of Hadoop shell commands

Administration commands

User commands

File system-related commands

Difference between copyToLocal/copyFromLocal and get/put

HBase shell commands

HBase administration tools

hbck – HBase check

HBase health check script

Writing HBase shell scripts

Using the Hadoop tool or JARs for HBase

Connecting HBase with Hive

HBase region management

Compaction

Merge

HBase node management

Commissioning

Decommissioning

Implementing security

Secure access

Requirement

Kerberos KDC

Client-side security configuration

Client-side security configuration for thrift requests

Server-side security configuration

Simple security

Server-side configuration

Client-side configuration

The tag security feature

Access control in HBase

Server-side access control

Cell-level access using tags

Configuring ZooKeeper for security

Troubleshooting the most frequent HBase errors and their explanations

What might fail in cluster

Monitoring HBase health

HBase web UI

Master

RegionServer

ZooKeeper command line

Linux tools

Summary

7. Scripting in HBase

HBase backup and restore techniques

Offline backup / full-shutdown backup

Backup

Restore

Online backup

The HBase snapshot

Online

Offline

The HBase replication method

Setting up cluster replication

Backup and restore using Export and Import commands

Export

Import

Miscellaneous utilities

CopyTable

HTable API

Backup using a Mozilla tool

HBase on Windows

Scripting in HBase

The .irbrc file

Getting the HBase timestamp from HBase shell

Enabling debugging shell

Enabling the debug level in HBase shell

Enabling SQL in HBase

Contributing to HBase

Summary

8. Coding HBase in Java

Setting up the environment for development

Building a Java client to code in HBase

Data types

Data model Java operations

Read

Get()

Constructors

Supported methods

Scan()

Constructors

Methods

Write

Put()

Constructors

Methods

Modify

Delete()

Constructors

Methods

HBase filters

Types of filters

Client APIs

Summary

9. Advance Coding in Java for HBase

Interfaces, classes, and exceptions

Code related to administrative tasks

Data operation code

MapReduce and HBase

RESTful services and Thrift services interface

REST service interfaces

Thrift

Coding for HDFS operations

Some advance topics in brief

Coprocessors

Types of coprocessors

Bloom filters

The Lily project

Features

Summary

10. HBase Use Cases

HBase in industry today

The future of HBase against relational databases

Some real-world project examples' use cases

HBase at Facebook

Choosing HBase

Storing in HBase

The architecture of a Facebook message

Facts and figures

HBase at Pinterest

The layout architecture

HBase at Groupon

The layout architecture

HBase at LongTail Video

The layout architecture

HBase at Aadhaar (UIDAI)

The layout architecture

Useful links and references

Summary

Index

Learning HBase

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2014

Production reference: 1181114

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78398-594-4

www.packtpub.com

Credits

Author

Shashwat Shriparv

Reviewers

Ashutosh Bijoor

Chhavi Gangwal

Henry Garner

Nitin Pawar

Jing Song

Arun Vasudevan

Commissioning Editor

Akram Hussain

Acquisition Editor

Kevin Colaco

Content Development Editor

Prachi Bisht

Technical Editor

Pankaj Kadam

Copy Editors

Janbal Dharmaraj

Sayanee Mukherjee

Project Coordinator

Sageer Parkar

Proofreaders

Bridget Braund

Maria Gould

Lucy Rowland

Indexer

Tejal Soni

Graphics

Ronak Dhruv

Production Coordinator

Aparna Bhagat

Cover Work

Aparna Bhagat

About the Author

Shashwat Shriparv was born in Muzaffarpur, Bihar. He did his schooling from Muzaffarpur and Shillong, Meghalaya. He received his BCA degree from IGNOU, Delhi and his MCA degree from Cochin University of Science and Technology, Kerala (C-DAC Trivandrum).

He was introduced to Big Data technologies in early 2010 when he was asked to perform a proof of concept (POC) on Big Data technologies in storing and processing logs. He was also given another project, where he was required to store huge binary files with variable headers and process them. At this time, he started configuring, setting up, and testing Hadoop HBase clusters and writing sample code for them. After performing a successful POC, he initiated serious development using Java REST and SOAP web services, building a system to store and process logs to Hadoop using web services, and then storing these logs in HBase using homemade schema and reading data using HBase APIs and HBase-Hive mapped queries. Shashwat successfully implemented the project, and then moved on to work on huge binary files of size 1 to 3 TB, processing the header and storing metadata to HBase and files on HDFS.

Shashwat started his career as a software developer at C-DAC Cyber Forensics, Trivandrum, building mobile-related software for forensics analysis. Then, he moved to Genilok Computer Solutions, where he worked on cluster computing, HPC technologies, and web technologies. After this, he moved to Bangalore from Trivandrum and joined PointCross, where he started working with Big Data technologies, developing software using Java, web services, and platform as Big Data. He worked on many projects revolving around Big Data technologies, such as Hadoop, HBase, Hive, Pig, Sqoop, Flume, and so on at PointCross. From here, he moved to HCL Infosystems Ltd. to work on the UIDAI project, which is one of the most prestigious projects in India, providing a unique identification number to every resident of India. Here, he worked on technologies such as HBase, Hive, Hadoop, Pig, and Linux, scripting, managing HBase Hadoop clusters, writing scripts, automating tasks and processes, and building dashboards for monitoring clusters.

Currently, he is working with Cognilytics, Inc. on Big Data technologies, HANA, and other high-performance technologies.

You can find out more about him at https://github.com/shriparv and http://helpmetocode.blogspot.com. You can connect with him on LinkedIn at http://www.linkedin.com/pub/shashwat-shriparv/19/214/2a9. You can also e-mail him at .

Shashwat has worked as a reviewer on the book Pig Design Pattern, Pradeep Pasupuleti, Packt Publishing. He also contributed to his college magazine, InfinityTech, as an editor.

Acknowledgments

First, I would like to thank a few people from Packt Publishing: Kevin for encouraging me to write this book, Prachi for assisting and guiding me throughout the writing process, Pankaj for helping me out in technical editing, and all other contributors to this book.

I would like to thank all the developers, contributors, and forums of Hadoop, HBase, and Big Data technologies for giving the industry such awesome technologies and contributing to it continuously. Thanks to Lars and Noll for their contribution towards HBase and Hadoop, respectively.

I would like to thank some people who helped me to learn from life, including teachers at my college—Roshani ma'am (Principal), Namboothari sir, Santosh sir, Manjush ma'am, Hudlin Leo ma'am, and my seniors Jitesh sir, Nilanchal sir, Vaidhath sir, Jwala sir, Ashutosh sir, Anzar sir, Kishor sir, and all my friends in Batch 6. I dedicate this book to my friend, Nikhil, who is not in this world now. Special thanks to Ratnakar Mishra and Chandan Jha for always being with me and believing in me. Thanks also go out to Vineet, Shashi bhai, Shailesh, Rajeev, Pintu, Darshna, Priya, Amit, Manzar, Sunil, Ashok bhai, Pradeep, Arshad, Sujith, Vinay, Rachana, Ashwathi, Rinku, Pheona, Lizbeth, Arun, Kalesh, Chitra, Fatima, Rajesh, Jasmin, and all my friends from C-DAC Trivendrum college. I thank all my juniors, seniors, and friends in college. Thanks to all my colleagues at C-DAC Cyber Forensic: Sateesh sir, my project manager; Anwer Reyaz. J, an enthusiast who is always encouraging; Bibin bhai sahab; Ramani sir; Potty sir; Bhadran sir; Thomas sir; Satish sir; Nabeel sir; Balan sir; Abhin sir; and others. I would also like to thank Mani sir; Raja sir; my friends and teammates: Maruthi Kiran, Chethan, Alok, Tariq, Sujatha, Bhagya, and Mukesh; Sri Gopal sir, my team leader; and all my other colleagues from PointCross. I thank Ramesh Krishnan sir, Manoj sir, Vinod sir, Nand Kishor sir, and my teammates Varun bhai sahab, Preeti Gupta, Kuldeep bhai sahab, and all my colleagues at HCL Infosystems Ltd. and UIDAI. I would also like to thank Satish sir; Sudipta sir; my manager, Atul sir; Pradeep; Nikhil; Mohit; Brijesh; Kranth; Ashish Chopara; Sudhir; and all my colleagues at Cognilytics, Inc.

Last but not the least, I would like to thank papa, Dr. Rameshwar Dwivedi; mummy, Smt. Rewa Dwivedi; bhai, Vishwas Priambud; sister-in-law, Ragini Dwivedi; sweet sister, Bhumika; brother-in-law, Chandramauli Dwivedi; and new members of my family, Vasu and Atmana.

If I missed any names, it does not mean that I am not thankful to them, they all are in my heart and I am thankful to everyone who has come in my life and left their mark. Also, thanking is not in any order.

About the Reviewers

Ashutosh Bijoor (Ash) is Chief Technology Officer at Accion Labs India Private Limited. He has over 20 years of experience in the technology industry with customers ranging from start-ups to large multinationals in a wide range of industries, including high tech, engineering, software, insurance, banking, chemicals, pharmaceuticals, healthcare, media, and entertainment. He is experienced in leading and managing cross-functional teams through an entire product development life cycle.

Ashutosh is skilled in emerging technologies, software architectures, framework design, and agile process definition. He has implemented enterprise solutions as well as commercial products in domains such as Big Data, business intelligence, graphics and image processing, sound and video processing, and advanced text search and analytics.

His e-mail ID is <ashutosh.bijoor@accionlabs.com>. You can also visit his website at http://bijoor.me.

Chhavi Gangwal is currently associated with Impetus Infotech (India) Pvt. Ltd. as a technical lead. With over 7 years of experience in the IT industry, she has worked on various dimensions of social media and the Web and witnessed the rise of Big Data first hand.

Presently, Chhavi is leading the development of Kundera, a JPA 2.0-compliant object-datastore mapping library for NoSQL data stores. She is also actively involved in the product management and development of multitude of Big Data tools. Apart from a working knowledge of several NoSQL data stores, Java, PHP, and different JavaScript frameworks, her passion lies in product designing and learning the latest technologies. Connect with Chhavi on https://www.linkedin.com/profile/view?id=58308893.

Nitin Pawar started his career as a release engineer with Veritas Systems, and so, the quality of software systems is always the main goal in his approach towards work. He has been lucky to work in multiple work profiles at companies such as Yahoo! for almost 5 years, where he learned a lot about the Hadoop ecosystem. After this, he worked with start-ups in analytics and Big Data domains, helping them design backend analytics infrastructures and platforms.

He enjoys solving problems and helping others facing technical issues. Reviewing this book gave him a better understanding of the HBase system, and he hopes that the readers will like it too.

He has also reviewed the book Securing Hadoop, Sudheesh Narayanan, and a video, Building Hadoop Clusters [Video], Sean Mikha, both by Packt Publishing.

Jing Song has been working in the software industry as an engineer for more than 14 years since she graduated school. She enjoys solving problems and learning about new technologies in the Computer Science domain. Her interests and experiences lie across multiple tiers such as web-frontend GUI to middleware, middleware to backend SQL RDBMS, and NoSQL data storage. In the last 5 years, she has mainly focused on enterprise application performance and cloud computing areas. Jing currently works for Apple as a tech lead, leading various Java applications from design to implementation and performance tuning.

Arun Vasudevan is a technical lead at Accion Labs India Private Limited. He specializes in Business Analytics and Visualization and has worked on solutions in various industry verticals, including insurance, telecom, and retail. He specializes in developing applications on Big Data technologies, including Hadoop stack, Cloud technologies, and NoSQL databases. He also has expertise on cloud infrastructure setup and management using OpenStack and AWS APIs.

Arun is skilled in Java J2EE, JavaScript, relational databases, NoSQL technologies, and visualization using custom-built JavaScript visualization tools such as D3JS. Arun manages a team that delivers business analytics and visualization solutions.

His e-mail address is <arun.vasudevan@accionlabs.com>. You can also visit his LinkedIn account at https://www.linkedin.com/profile/view?id=40201159.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

I would like to thank god for giving me this opportunity. I dedicate this book to baba, dadi, nana, and nani.

Preface

This book will provide a top-down approach to learning HBase, which will be useful for both novices and experts. You will start learning configuration, code to maintenance, and troubleshooting—a kind of all-in-one HBase knowledge bank. This will be a step-by-step guide, which will help you work on HBase. The book will include day-to-day activities using HBase administration, and the implementation of Hadoop, plus HBase cluster setup from ground approach. The book will cover a complete list of use cases and explanations to implement HBase as an effective Big Data tool. It will also help you understand the layout and structure of HBase. There are lots of books available on the market on HBase, but they lack something in them; some of them focuses more on configuration and some on coding, but this book will provide a kind of start-to-end approach, which will be useful for a person with zero knowledge in HBase to the person proficient in HBase. This book is a complete guide to HBase administration and development with real-time scenarios and an operation guide.

This book will provide an understanding of what HBase is like, where it came from, who all are involved, why should we consider using it, why people are using it, when to use it, and how to use it. This book will give overall information about the HBase ecosystem. It's more like an HBase-confusion-buster book, a book to read and implement in real life. The book has in-depth theory and practical examples on HBase features. This theoretical and practical approach clears doubts on Hadoop and HBase. It provides complete guidance on configuration/management/troubleshooting of HBase clusters and their operations. The book is targeted at administration and development aspects of HBase; administration with troubleshooting, setup, and development with client and server APIs. This book also enables you to design schema, code in Java, and write shell scripts to work with HBase.

What this book covers

Chapter 1, Understanding the HBase Ecosystem, introduces HBase in detail, and discusses its features, its evolution, and its architecture. We will compare HBase with traditional databases and look at add-on features and the various underlying components, and its uses in the industry.

Chapter 2, Let's Begin with HBase, deals with the HBase components in detail, their internal architecture, communication between different components, how it provides scalability, as well as the HBase reading and writing cycle process, HBase housekeeping tasks, region-related operations, the different components needed for a HBase cluster configuration, and some basic OS tuning.

Chapter 3, Let's Start Building It, lets us proceed ahead with building an HBase cluster. In this chapter, you will find information on the various components and the places we can get it from. We will start configuring the cluster and consider all the parameters and optimization tweaks while building the Hadoop and HBase cluster. One section in the chapter will focus on the various component-level and OS-level parameters for an optimized cluster.

Chapter 4, Optimizing the HBase/Hadoop Cluster, teaches us to optimize the HBase cluster according to the production environment and running cluster troubleshooting tasks. We will look at optimization on hardware, OS, software, and network parameters. This chapter will also teach us how we can optimize Hadoop for a better HBase.

Chapter 5, The Storage, Structure Layout, and Data Model of HBase, discusses HBase's data model and its various data model operations for fetching and writing data in HBase tables. We will also consider some use cases in order to design schema in HBase.

Chapter 6, HBase Cluster Maintenance and Troubleshooting, covers all the aspects of HBase cluster management, operation, and maintenance. Once a cluster is built and in operation, we need to look after it, continuously tune it up, and troubleshoot in order to have a healthy HBase cluster. We will also study the commands available with HBase and Hadoop shell.

Chapter 7, Scripting in HBase, explains an automation process using HBase and shell scripts. We will learn to write scripts as an administrator or developer to automate various data-model-related tasks. We will also read about various backup and restore options available in HBase and how to perform them.

Chapter 8, Coding HBase in Java, teaches Java coding in HBase. We will start with basic Java coding in HBase and learn about Java APIs available for client requests. You will also learn to build a basic client in Java, which can be used to contact an HBase cluster for various operations using Java code.

Chapter 9, Advance Coding in Java for HBase, focuses more on Java coding in HBase. It is a more detailed learning about all the different kind of APIs, classes, methods,

Enjoying the preview?

Page 1 of 1

Learning HBase

About this ebook

Shashwat Shriparv

Related authors

Related to Learning HBase

Related ebooks

Programming For You

Related podcast episodes

Related articles

Related categories

Reviews for Learning HBase

What did you think?

Book preview

Learning HBase - Shashwat Shriparv

Table of Contents

Learning HBase

Learning HBase

Credits

About the Author

Acknowledgments

About the Reviewers

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers