
Presented by

Dr T.N.Sharma
` Data that is beyond the storage and
processing capacity of conventional systems
` Characteristics of Big Data
◦ Volume
◦ Velocity
◦ Variety
` Apache top-level project: an open-source
implementation of frameworks for reliable,
scalable, distributed computing and data
storage.
` It is a flexible and highly available
architecture for large-scale computation and
data processing on a network of commodity
hardware.
` Designed to answer the question: “How
to process big data with reasonable
cost and time?”
2005: Doug Cutting and Michael J. Cafarella
developed Hadoop to support distribution for
the Nutch search engine project.
The project was funded by Yahoo.

2006: Yahoo gave the project to the Apache
Software Foundation.
HDFS
` Responsible for storing data on the cluster
` Data files are split into blocks and distributed
across the nodes in the cluster
` Each block is replicated multiple times
` HDFS is a file system written in Java, based on
Google’s GFS
` Provides redundant storage for massive
amounts of data
` Files are split into blocks
` Blocks are split across many machines at load
time
◦ Different blocks from the same file will be stored on
different machines
` Blocks are replicated across multiple
machines
` The NameNode keeps track of which blocks
make up a file and where they are stored
` Default replication is 3-fold
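Both the replication factor and the block size are per-cluster settings in `hdfs-site.xml`. A minimal fragment, using the real Hadoop property names `dfs.replication` and `dfs.blocksize` with their common default values:

```xml
<configuration>
  <!-- Number of replicas kept for each block (HDFS default: 3) -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Block size in bytes (134217728 = 128 MB, the modern default) -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
```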
` When a client wants to retrieve data
◦ It communicates with the NameNode to determine
which blocks make up a file and on which data
nodes those blocks are stored
◦ It then communicates directly with the data nodes to
read the data
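The whole split/replicate/read flow above can be sketched as a toy in-memory model. This is illustrative only: the class names, the round-robin placement, and the in-process calls are simplifying assumptions, not the real HDFS code (real HDFS uses RPCs, rack-aware placement, and write pipelines).

```python
# Toy model of HDFS block placement and the client read path (assumption:
# simplified stand-ins for the real NameNode/DataNode protocols).

BLOCK_SIZE = 4    # bytes per block in this toy (real default: 128 MB)
REPLICATION = 3   # copies of each block (the HDFS default)

class DataNode:
    """Stores block replicas; in real HDFS this is a separate daemon."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def store(self, block_id, payload):
        self.blocks[block_id] = payload

    def read(self, block_id):
        return self.blocks[block_id]

class NameNode:
    """Keeps track of which blocks make up a file and where they are stored."""
    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.block_map = {}   # filename -> [(block_id, [replica nodes])]

    def write(self, filename, data):
        blocks = []
        for offset in range(0, len(data), BLOCK_SIZE):
            idx = offset // BLOCK_SIZE
            block_id = f"{filename}#{idx}"
            # Place replicas on REPLICATION distinct nodes; round-robin here,
            # whereas real HDFS placement is rack-aware.
            targets = [self.datanodes[(idx + r) % len(self.datanodes)]
                       for r in range(REPLICATION)]
            for node in targets:
                node.store(block_id, data[offset:offset + BLOCK_SIZE])
            blocks.append((block_id, targets))
        self.block_map[filename] = blocks

    def locate(self, filename):
        """Read step 1: the client asks the NameNode for block locations."""
        return self.block_map[filename]

def client_read(namenode, filename):
    """Read step 2: the client fetches each block directly from a DataNode."""
    return b"".join(nodes[0].read(block_id)
                    for block_id, nodes in namenode.locate(filename))

nodes = [DataNode(f"dn{i}") for i in range(5)]
nn = NameNode(nodes)
nn.write("log.txt", b"hello hdfs world")
print(client_read(nn, "log.txt"))  # reassembles the original bytes
```

Note how the NameNode never touches file contents: it only answers the "which blocks, on which nodes" question, and the bulk data moves between client and DataNodes, mirroring the two-step read described above.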
