Professional Documents
Culture Documents
Hyderabad
Office Address:
Flat No 204, Annapurna
Block
Aditya Enclave, Ameerpet
Hyderabad - 500038
Telangana, India.
Quick Contact:
info@OrienIT.com
+91 040 6514 2345
+91 970 320 2345
http://www.orienit.com/
HADOOP
Course
urse Details :
urse Duration 55 Days
Content
ainer: Kalyan
ainer Profile: SeniorBigDataTrainer At ORIENhttp://www.orienit
IT
ode Of Training: Both Online/Classroom .com/
1.Introduction to Big Data and Hadoop
Big Data
o What is Big Data?
o Why all industries are talking about Big Data?
o What are the issues in Big Data?
Storage
What are the challenges for storing big data?
Processing
What are the challenges for processing big data?
HDFS architecture
Name Node
Importance ofName Node
What are the roles ofName Node
What are the drawbacks in Name Node
Accessing HDFS
o CLI(Command Line Interface) using hdfs commands
o Java Based Approach
Configurations
o Can we change the existing configurations of hdfs or not?
http://www.orienit.com/
o Importance of configurations
How to overcome the Drawbacks in HDFS
o Name Node failures
o Secondary Name Node failures
o Data Node failures
Where does it fit and Where doesnt fit?
Exploring the Apache HDFS Web UI
http://www.orienit.com/
Joins
o Map Side Join
What is the importance of Map Side Join
Where we are using it
o Reduce Side Join
What is the importance of Reduce Side Join
Where we are usingFormat's
utput it in Map Reduce
o What is the difference between Map Side join and Reduce Side Join?
Text Output Format
Compression techniques
Sequence File Output Format
o Importance of Compression techniques in production environment
Importance of Output Format in
o Compression Types
NONE, RECORD and Map Reduce
BLOCK
o Compression Codecs How to use Output Format in Map
Reduce
Default, Gzip, Bzip, Snappy and LZO
How
o Enabling and Disabling to write
these custom
techniques Output
for all the Jobs
Format's
o Enabling and Disabling
Map Reduce Schedulers
and its Record
these techniques Writer Job
for a particular
HADOO
P
oFIFO Scheduler
oCapacity Scheduler
oFair Scheduler
oImportance of Schedulers in production environment
oHow to use Schedulers in production environment http://www.orienit.com/
Map Reduce Programming Model
o How to write the Map Reduce jobs in Java
o Running the Map Reduce jobs in local mode
o Running the Map Reduce jobs in pseudo mode
o Running the Map Reduce jobs in cluster mode
http://www.orienit.com/
Load Functions
o How to write the Load Functions in Pig
o How to use the Load Functions in Pig
o Importance of Load Functions in Pig
Store Functions
o How to use the Store Functions in Pig
o Importance of Store Functions in Pig
utput Format's in Map Reduce
Transformations in Pig
How to write the complexText Output Format
pig scripts
How to integrate the PigSequence
and Hbase File Output Format
Importance of Output Format in
6.Apache HIVE Map Reduce
Hive Introduction How to use Output Format in Map
Hive architecture Reduce
o Driver How to write custom Output HADOO
P
o Compiler Format's and its Record Writer
o Optimizer
o Semantic Analyzer
Hive Query Language(Hive QL)
SQL VS Hive QL
Hive Installation and Configuration
Hive DLL and DML Operations
http://www.orienit.com/
Hive Services
o CLI
o Hiveserver
o Hwi
Metastore
o embedded metastore configuration
utput
o external metastore Format's in Map Reduce
configuration
Text Output Format
UDF's Sequence File Output Format
Importance
o How to write the UDF's in Hive of Output Format in
o How to use the UDF'sMap Reduce
in Hive
o Importance of UDF's in
HowHive
to use Output Format in Map
UDAF's
Reduce
How to write custom Output
o How to use the UDAF's in Hive
HADOO
o
Format's and its Record Writer
Importance of UDAF's in Hive P
UDTF's
o How to use the UDTF's in Hive
o Importance of UDTF's in Hive
How to write a complex Hive queries
What is Hive Data Model? http://www.orienit.com/
Partitions
o Importance of Hive Partitionsin production
environment
o Limitations of Hive Partitions
o How to write Partitions
Buckets
utput Format's in Map Reduce
o Importance of Hive Bucketsin production
Text Output Format
environment Sequence File Output Format
o How to write Buckets
Importance of Output Format in
SerDe Map Reduce
o Importance of HiveHow to use Output
SerDe'sin Format in Map
production
Reduce
environment
o How to write SerDeHow to write custom Output
programs HADOO
How to integrate the Format's
Hive and and its Record Writer
Hbase
P
7.Cloudera Impala
Introduction to Impala
Impala Examples
o HMaster
o HRegionServer
o Zookeeper
Mapreduce integration
o Mapreduce over HBase http://www.orienit.com/
10.Apache Phoenix
Introduction to Phoenix
Installing Phoenix
Integrating with Hbase
Comparing Hbase & Phoenix
Practice on Phoenix examples
11.Apache Cassandra
Introduction to Cassandra
Installing Cassandra
Practice on Cassandra examples
12.MongoDB
Introduction to MongoDB
Installing MongoDB HADOO
Practice on MongoDB examples
P
13.Apache Drill
Introduction to Drill
Installing Drill
Practice on Drill examples
http://www.orienit.com/
14.Apache SQOOP
Introduction to Sqoop
MySQL client and Server Installation
Sqoop Installation
How to connect to Relational Database using Sqoop
Examples on Import and Export Sqoop commands
15.Apache FLUME
Introduction to flume
Flume installation
Flume Architecture
o Agent
o Sources
o Channels
o Sinks HADOO
Practice on Flume examples
P
16.Apache Kafka
Introduction to Kafka
Installing Kafka
Practice on Kafka examples
http://www.orienit.com/
17.Apache Spark
Introduction to Spark
Installing Spark
Spark Architecture
Introduction to Spark Components
o Spark Core
o Spark SQL
o Spark Streaming
o Spark MLLib
o Spark GraphX
Practice on Spark examples
Spark and Hive interation
18.Apache OOZIE
Introduction to oozie
Oozie installation
HADOO
Executing different oozie workflow jobs
Monitering Oozie workflow jobs P
19.Real Time Big Data Projects
We willl be sharingEnd-to-End Big Data Projects
We are providingBig Data Project Practice on Our Lab
We are providingImportantRecorded Videos on Our
YouTube Channel http://www.orienit.com/
Any information search inGoogle / YouTubeby keyword
20.Pre-Requisites for this Course
Java Basics like OOPS Concepts, Interfaces, Classes and
Abstract Classes etc (Free Java classes as part of course)
SQL Basic Knowledge ( Free SQL classes as part of course)
Linux Basic Commands (Provided in our blog)
Administration topics:
Hadoop Installations (Windows & Linux)
o Local mode (hands on installation on ur laptop)
o Pseudo mode (hands on installation on ur laptop)
o Cluster mode (hands on 40+node cluster setup
in our lab)
HADOO
o Nodes Commissioning and De-commissioning in
Hadoop Cluster
P
o Jobs Monitoring in Hadoop Cluster
o Fair Scheduler (hands on installation on ur laptop)
o Capacity Scheduler (hands on installation on ur laptop)
http://www.orienit.com/
Hive Installations
o Local mode (hands on installation on ur laptop)
With internal Derby
o Cluster mode (hands on installation on ur laptop)
With external Derby
With external MySql
o Hive Web Interface (HWI) mode (hands on installation on ur
laptop)
o Hive Thrift Server mode (hands on installation on ur laptop)
o Derby Installation (hands on installation on ur laptop)
o MySql Installation (hands on installation on ur laptop)
Pig Installations
o Local mode (hands on installation on ur laptop)
o Mapreduce mode (hands on installation on ur laptop)
HADOO
Hbase Installations
P
o Local mode (hands on installation on ur laptop)
o Psuedo mode (hands on installation on ur laptop)
o Cluster mode (hands on installation on ur laptop)
With internal Zookeeper
With external Zookeeper
http://www.orienit.com/
Zookeeper Installations
o Local mode (hands on installation on ur laptop)
o Cluster mode (hands on installation on ur laptop)
Sqoop Installations
oSqoop installation with MySql (hands on installation on ur
laptop)
oSqoop with hadoop integration (hands on installation on ur
laptop)
oSqoop with hive integration (hands on installation on ur
laptop)
oSqoop with hbase integration (hands on installation on ur
laptop)
HADOO
Flume Installation
o Psuedo mode (hands on installation on ur laptop) P
Oozie Installation
o Psuedo mode (hands on installation on ur laptop)
Hortonworks Distribution
Introduction to Hortonworks
Hortonworks Installation
Hortonworks Certification details
How to use Hortonworks hadoop
What are the main differences between Hortonworks and
Apache hadoop
HADOO
Amazon EMR
P
Introduction to Amazon EMR and Amazon EC2
How to use Amazon EMR and Amazon EC2
Why to use Amazon EMR and Importance of this
http://www.orienit.com/
Hadoop ecosystem Integrations:
Hive and Spark integration
Hive and HBase integration
Pig and HBase integration
Sqoop and RDBMS integration
Hbase and Phoenix integration
Flume and Phoenix integration
Kakfa and Phoenix integraion
http://www.orienit.com/
23.What we are offering to you:
Hadoop installation on bothWindows & Linux
Free WeeklyOnline Hadoop Certification
Real Time Big Data projects will be shared
FreeBig Data Workshopsonnew & advanced technologies
Hands on MapReduce programming around20+programs these will make you to
perfect in MapReduce both concept-wise and programmatically
Hands on5 POC'swill be provided (These POC's will help you perfect in Hadoop
and it's ecosystems)
Hands on practical 40+Node hadoop cluster setup in our Lab.
Well documentedHadoop material with all the topics covering in the course
Well documentedHadoop blogcontains frequent interview questions along with
the answers and latest updates on Big Data technology.
Discussing abouthadoop interview questions & answersdaily base.
Resume preparationwith POC's or Project's based on your experience.
http://www.orienit.com/
Thank You
Office Address:
Flat No 204, Annapurna Block
Aditya Enclave, Ameerpet
Hyderabad - 500038
Telangana, India.
+91 040 6514 2345
+91 970 320 2345
info@OrienIT.com
http://www.orienit.com/