to process and distribute large amounts of data on a regular basis. Basic database management systems and tools become ineffective at processing and storing such large amounts of data. Knowledge and expertise in Big Data management applications have become a necessity within the IT industry. CloudAce offers two separate Big Data Training programs that focus on the Apache Hadoop platform.

This 30-hour, instructor-led developer training course delivers the key concepts and expertise necessary to create robust data processing applications using Apache Hadoop. Through lectures and interactive, hands-on exercises, attendees will learn Hadoop and its ecosystem components. The training is designed with a vendor-neutral approach. However, upon completion of the course, attendees can pass the Hadoop developer certification exam from Cloudera or Hortonworks. Certification is a great differentiator; it helps establish individuals as leaders in their field, providing customers with tangible evidence of skills and expertise.

About Our Trainers

By participating in our Big Data Training programs, you will be placed under the guidance of a certified cloud computing professional who has worked with us as a Technical Lead for over 9 years, dealing extensively with Big Data analytics, development, and implementation. Our trainer holds the Hadoop developer and Hadoop administrator certifications and has a wealth of teaching experience. Our trainer also has intensive hands-on experience implementing algorithms such as decision trees, support vector machines, random forests, naive Bayes, neural networks, genetic algorithms, conjoint analysis, and principal component analysis.

Hadoop Developer Training

Our Hadoop Developer Big Data Training program consists of 14 modules that detail the platform's functionality, advantages, and drawbacks. Participants will gain an in-depth understanding of the Apache Hadoop platform and will learn how to program and tune it to perform relevant analytics. Participants will learn how to set up Hadoop clusters and will also be introduced to common and advanced algorithms and programs. The program also covers the various components of the Hadoop ecosystem extensively.
CLOUDACE TECHNOLOGIES, Regus Solitaire Business Centre (Hyderabad) Pvt Ltd, 4th Floor, Gumidelli Commercial Complex, 1-10-39 to 44, Old Airport Road, Begumpet, Hyderabad - 500016. www.cloudace.in
The duration of the program is 30 hours, completed over the course of 4 days. The Hadoop Developer training program will be conducted in a classroom. The fee for the Hadoop Developer training is 24,000 INR, exclusive of service taxes. Upon completion of this training, successful participants will receive the Hadoop Developer certification.
The agenda for the course is outlined below:
o What is Cloud Computing
o What is Grid Computing
o What is Virtualization
o How the above three are inter-related
o What is Big Data
o Introduction to Analytics and the need for big data analytics
o Hadoop Solutions - Big Picture
o Hadoop distributions
o Comparing Hadoop vs. Traditional systems
o Volunteer Computing
o Data Retrieval - Random Access vs. Sequential Access
o NoSQL Databases

o What is Hadoop?
o The Hadoop Distributed File System
o How MapReduce Works
o Anatomy of a Hadoop Cluster

o Blocks and Splits
o Replication
o Data high availability
o Data Integrity
o Cluster architecture and block placement
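
As a taste of the blocks and replication topics, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name hdfs_footprint is hypothetical, and the 128 MB block size and replication factor of 3 are assumptions based on common defaults in recent Hadoop releases (older releases used 64 MB); a given cluster may be configured differently.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, raw MB stored across the cluster).

    Assumed defaults: 128 MB blocks, 3 replicas. Both are configurable.
    """
    # A file is cut into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Each block is stored `replication` times; a partial last block
    # only occupies its actual size, so raw usage is size x replicas.
    raw_mb = file_size_mb * replication
    return blocks, raw_mb
```

Under these assumptions, hdfs_footprint(1000) reports 8 blocks and 3000 MB of raw storage for a 1000 MB file.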

o Developing MapReduce programs in local mode, pseudo-distributed mode, and fully distributed mode

o Examining a Sample MapReduce Program
o Basic API Concepts
o The Driver Code
o The Mapper
o The Reducer
o Hadoop's Streaming API
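
To give a flavor of the mapper and reducer roles, here is a minimal word-count sketch in the style of a Hadoop Streaming job, where records are tab-separated text lines. It is not course material: the function names are hypothetical, and run_local merely imitates the sort that Hadoop performs between the map and reduce phases so the example can run without a cluster.

```python
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a Streaming mapper prints."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"

def reducer(sorted_records):
    """Sum the counts for each word; records must arrive sorted by key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

def run_local(lines):
    """Local stand-in for the shuffle: sort mapper output, then reduce."""
    return list(reducer(sorted(mapper(lines))))
```

In a real Streaming job the mapper and reducer would be separate scripts reading standard input and writing standard output; they are kept as functions here only so the flow can be tried locally.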

o Install and configure Apache Hadoop
o Make a fully distributed Hadoop cluster on a single laptop/desktop
o Install and configure the Cloudera Hadoop distribution in fully distributed mode
o Install and configure the Hortonworks Hadoop distribution in fully distributed mode
o Monitoring the cluster
o Getting used to the management consoles of Cloudera and Hortonworks

o Using Combiners
o The configure and close Methods
o SequenceFiles
o Partitioners
o Counters
o Directly Accessing HDFS
o ToolRunner
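
The ideas behind combiners and partitioners can be illustrated with a short Python sketch. This is a simplified model, not Hadoop's API: combine and partition are hypothetical names, and CRC32 is used as a stand-in for the key's hash code so that results are repeatable across runs.

```python
import zlib
from collections import Counter

def combine(map_output):
    """Pre-aggregate (word, count) pairs on the map side, like a combiner.

    Fewer pairs leave the mapper, so less data crosses the network.
    """
    totals = Counter()
    for word, count in map_output:
        totals[word] += count
    return sorted(totals.items())

def partition(key, num_reducers):
    """Choose the reducer for a key by hashing it modulo the reducer count.

    CRC32 is an assumed stand-in hash, chosen only for determinism.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers
```

Because the partition depends only on the key, every record with the same key reaches the same reducer, which is what makes per-key aggregation possible.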

o Sorting and Searching
o Indexing
o Classification/Machine Learning
o Term Frequency - Inverse Document Frequency
o Word Co-Occurrence
o Hands-On Exercise: Creating an Inverted Index
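
For readers new to these algorithms, the following Python sketch shows the shape of an inverted index and one common TF-IDF variant on a small in-memory collection. It is an assumption-laden illustration rather than the course exercise: the function names are hypothetical, tokenization is a plain whitespace split, term frequency is the raw count divided by document length, and the inverse document frequency is the natural log of (documents / documents containing the term).

```python
import math
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():   # "map": emit (term, doc_id)
            index[term].add(doc_id)          # "reduce": collect ids per term
    return {term: sorted(ids) for term, ids in index.items()}

def tf_idf(term, doc_id, docs, index):
    """Score a term that appears in the index, for one document."""
    words = docs[doc_id].lower().split()
    tf = words.count(term) / len(words)
    idf = math.log(len(docs) / len(index[term]))
    return tf * idf
```

A term that occurs in every document scores zero under this variant, which is why common words contribute little to relevance.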

o A Recap of the MapReduce Flow
o Custom Writables and WritableComparables
o The Secondary Sort
o Creating InputFormats and OutputFormats
o Pipelining Jobs With Oozie
o Map-Side Joins
o Reduce-Side Joins
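
The reduce-side join pattern can be sketched in plain Python as follows. The dataset names, tags, and function name are hypothetical, and the grouping dictionary stands in for the shuffle that Hadoop would perform; the sketch produces an inner join on the customer id.

```python
from collections import defaultdict

def reduce_side_join(customers, orders):
    """Join two datasets on customer id by tagging records and grouping by key."""
    grouped = defaultdict(lambda: {"C": [], "O": []})
    for cust_id, name in customers:      # map phase: tag records with "C"
        grouped[cust_id]["C"].append(name)
    for cust_id, item in orders:         # map phase: tag records with "O"
        grouped[cust_id]["O"].append(item)
    joined = []
    for cust_id in sorted(grouped):      # reduce phase: one pass per key
        for name in grouped[cust_id]["C"]:
            for item in grouped[cust_id]["O"]:
                joined.append((cust_id, name, item))
    return joined
```

Keys that appear in only one dataset produce no output here. A map-side join avoids the shuffle entirely, but it typically requires one dataset to be small enough to hold in memory or both to be sorted and partitioned identically.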

o Counters
o Skipping Bad Records
o Rerunning failed tasks with IsolationRunner

o Reducing network traffic with a combiner
o Reducing the amount of input data
o Using Compression
o Reusing the JVM
o Running with speculative execution
o Refactoring code and rewriting algorithms
o Parameters affecting Performance
o Other Performance Aspects

o HBase concepts
o Install and configure HBase on the cluster
o Create a database, then develop and run sample applications

o ZooKeeper concepts
o Install and configure ZooKeeper
o Use ZooKeeper for cluster maintenance

o Hive concepts
o Install and configure Hive on the cluster
o Create a database and access it from the console
o Develop and run sample applications in Java/Python to access Hive

o Install and configure Sqoop on the cluster
o Import data from Oracle/MySQL to Hive

o Flume and Chukwa concepts
o Install and configure Flume on the cluster
o Create a sample application to capture logs from Apache using Flume
Analytics Basics
o Analytics and big data analytics
o Commonly used analytics algorithms
o Analytics tools like R and Weka
o Mahout

- 4 days of classroom training
- 24,000 INR + service taxes per participant (excludes exam fees)