
Install Hadoop on Ubuntu OS

First, update all software:

chandu@chandu-Inspiron-N5010:~$ sudo apt-get -y update
chandu@chandu-Inspiron-N5010:~$ sudo apt-get -y upgrade


chandu@chandu-Inspiron-N5010:~$ sudo mkdir /usr/local/java
chandu@chandu-Inspiron-N5010:~$ sudo chmod 777 /usr/local/java
chandu@chandu-Inspiron-N5010:~$ cd Desktop
chandu@chandu-Inspiron-N5010:~/Desktop$ cp <jdk archive>.tar.gz /usr/local/java
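Next, extract the JDK archive inside /usr/local/java. The tarball name below is an assumption (the Linux x64 build of Oracle JDK 8u111, which unpacks to the jdk1.8.0_111 directory used for JAVA_HOME below); adjust it to match your download:

chandu@chandu-Inspiron-N5010:~/Desktop$ cd /usr/local/java
chandu@chandu-Inspiron-N5010:/usr/local/java$ sudo tar -xvzf jdk-8u111-linux-x64.tar.gz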

Install java

Set the Java environment variables system-wide in /etc/profile:
chandu@chandu-Inspiron-N5010:~$ sudo vim /etc/profile

JAVA_HOME=/usr/local/java/jdk1.8.0_111
JRE_HOME=$JAVA_HOME/jre
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME
export JRE_HOME
export PATH

chandu@chandu-Inspiron-N5010:~$ sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jdk1.8.0_111/jre/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.8.0_111/bin/javac" 1
sudo update-alternatives --set java /usr/local/java/jdk1.8.0_111/jre/bin/java
sudo update-alternatives --set javac /usr/local/java/jdk1.8.0_111/bin/javac
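To confirm which JDK the alternatives system now points to, you can optionally display the registered entries:

chandu@chandu-Inspiron-N5010:~$ update-alternatives --display java
chandu@chandu-Inspiron-N5010:~$ update-alternatives --display javac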

chandu@chandu-Inspiron-N5010:~$ . /etc/profile

chandu@chandu-Inspiron-N5010:~$ java -version

java version "1.8.0_111"


Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

chandu@chandu-Inspiron-N5010:~$ jps
26824 Jps
chandu@chandu-Inspiron-N5010:~$ javac -version
javac 1.8.0_111
Installing Java automatically from the Ubuntu repositories (for example, the openjdk-8-jdk package):
chandu@chandu-Inspiron-N5010:~$ sudo apt-get install <java package name>
chandu@chandu-Inspiron-N5010:~$ java -version

openjdk version "1.8.0_111"


OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.10.2-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)

Path of Java (for the OpenJDK install)
chandu@chandu-Inspiron-N5010:~$ gedit ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
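To double-check which binary java now resolves to (readlink follows the symlink chain from /usr/bin/java; the output should point into the JVM directory under /usr/lib/jvm):

chandu@chandu-Inspiron-N5010:~$ readlink -f $(which java)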

Create a separate user for Hadoop

chandu@chandu-Inspiron-N5010:~$ sudo su
root@chandu-Inspiron-N5010:/home/chandu# whoami
root

Create group
root@chandu-Inspiron-N5010:/home/chandu# addgroup hadoop
Create user
root@chandu-Inspiron-N5010:/home/chandu# adduser hduser
Add hduser to group hadoop
root@chandu-Inspiron-N5010:/home/chandu# adduser hduser hadoop
Add hduser to the sudoers list
root@chandu-Inspiron-N5010:/home/chandu# visudo
Add the following line under "# Allow members of group sudo to execute any command":
hduser ALL=(ALL) ALL
Press Ctrl+X, then Y and Enter to save (visudo opens nano by default on Ubuntu).
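To verify that hduser can now use sudo (sudo -l lists the commands a user is allowed to run):

root@chandu-Inspiron-N5010:/home/chandu# su - hduser
hduser@chandu-Inspiron-N5010:~$ sudo -l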

Log out of your session and log back in as hduser.


First, make sure that SSH is installed and a server is running. On Ubuntu, for example,
this is achieved with:
sudo apt-get install ssh

Install the SSH server on your computer:

hduser@chandu-Inspiron-N5010:~$ sudo apt-get install openssh-server
hduser@chandu-Inspiron-N5010:~$ sudo apt-get install vim

An SSH key is used to log in to another machine without typing a password.
To enable passwordless login, generate a new SSH key with an empty passphrase:

Generate key
hduser@chandu-Inspiron-N5010:~$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
hduser@chandu-Inspiron-N5010:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
hduser@chandu-Inspiron-N5010:~$ chmod 700 $HOME/.ssh/authorized_keys
If the SSH connection fails, these general tips might help:
Enable debugging with ssh -vvv localhost and investigate the error in detail.
Check the SSH server configuration in /etc/ssh/sshd_config, in particular the options
PubkeyAuthentication (which should be set to yes) and AllowUsers (if this option is
active, add the hduser user to it). If you made any changes to the SSH server
configuration file, you can force a configuration reload with sudo /etc/init.d/ssh reload.

hduser@chandu-Inspiron-N5010:~$ sudo /etc/init.d/ssh restart


Test whether SSH is working properly by connecting to localhost:
hduser@chandu-Inspiron-N5010:~$ ssh localhost
If successful, you should not have to type in a password.

The following message should appear in the console:


The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To disable IPv6 on Ubuntu LTS, open /etc/sysctl.conf:


hduser@chandu-Inspiron-N5010:~$ sudo vim /etc/sysctl.conf
Press i (insert mode) and paste the following at the end of the file:
# disable ipv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
Press Esc, then type :wq (write and quit).

hduser@chandu-Inspiron-N5010:~$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
0
Before the change takes effect this prints 0. Reboot your system; it will then show 1:
hduser@chandu-Inspiron-N5010:~$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
1
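(A reboot is not strictly required: sudo sysctl -p reloads /etc/sysctl.conf and applies the settings immediately.)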
Download Hadoop 2.7.3 from an Apache mirror:
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Hadoop Installation
hduser@chandu-Inspiron-N5010:~$ cd ~/Desktop
hduser@chandu-Inspiron-N5010:~/Desktop$ sudo mv hadoop-2.7.3.tar.gz /usr/local/
hduser@chandu-Inspiron-N5010:~$ cd /usr/local/
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo tar -xvzf hadoop-2.7.3.tar.gz
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo ln -s hadoop-2.7.3 hadoop
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo chown -R hduser:hadoop hadoop-2.7.3
The symlink makes the installation available at /usr/local/hadoop.
Change permissions on the folder:
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo chmod 777 hadoop-2.7.3
hduser@chandu-Inspiron-N5010:~$ cd /usr/local/hadoop/etc/hadoop/
hduser@chandu-Inspiron-N5010:/usr/local/hadoop/etc/hadoop$ vim hadoop-env.sh
Press i and set JAVA_HOME to the Java path (/usr/local/java/jdk1.8.0_111), adding the following lines:

export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_HOME_WARN_SUPPRESS="TRUE"
export JAVA_HOME=/usr/local/java/jdk1.8.0_111
LD_LIBRARY_PATH=/usr/local/hadoop/lib/native/:$LD_LIBRARY_PATH

Press Esc, then :wq to save.
hduser@chandu-Inspiron-N5010:~$ vim ~/.bashrc

#HADOOP VARIABLES START


export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
#HADOOP VARIABLES END

# Native Path
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)

export JAVA_HOME=/usr/local/java/jdk1.8.0_111

# Some convenient aliases and functions for running Hadoop-related commands

unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and


# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
#lzohead () {
# hadoop fs -cat $1 | lzop -dc | head -1000 | less
#}
Close the terminal and reopen it so the new variables take effect (or run source ~/.bashrc).
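To confirm the Hadoop binaries are now on the PATH (assuming the variables above have been loaded):

hduser@laptop:~$ hadoop version
The first line of the output should read "Hadoop 2.7.3".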

hduser@laptop:~$ sudo mkdir -p /app/hadoop/tmp


hduser@laptop:~$ sudo chown hduser:hadoop /app/hadoop/tmp
hduser@laptop:~$ sudo chmod -R 777 /app/hadoop/tmp
hduser@laptop:~$ cd /usr/local/hadoop/etc/hadoop/

vim yarn-site.xml (add the following properties inside the <configuration> element)

<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

hduser@laptop:~$ cd /usr/local/hadoop/etc/hadoop/

vim core-site.xml (again, the properties go inside the <configuration> element)

<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:53410</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

hduser@laptop:~$ cd /usr/local/hadoop/etc/hadoop/
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job
tracker runs at. If "local", then jobs are run in-process as a
single map and reduce task.
</description>
</property>

sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode


sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store

hduser@laptop:~$ cd /usr/local/hadoop/etc/hadoop/

vim hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>

Formatting the HDFS filesystem


Before HDFS can be used for the first time, the filesystem must be formatted. This is
done by running the following command:

hadoop namenode -format
(In Hadoop 2.x this command is deprecated; the current form is hdfs namenode -format.)


Starting and stopping the daemons
To start the HDFS, YARN, and MapReduce daemons, type:

start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
jps
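If everything started, jps should list all the daemons, similar to the following (the process IDs will differ on your machine):

12075 NameNode
12211 DataNode
12400 SecondaryNameNode
12659 ResourceManager
12785 NodeManager
13089 JobHistoryServer
13180 Jps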

Stopping the daemons is done as follows:

mr-jobhistory-daemon.sh stop historyserver


stop-yarn.sh
stop-dfs.sh

Creating a user directory


Create a home directory for yourself by running the following:
hadoop fs -mkdir -p /user/$USER
Equivalently, since $USER here is hduser:
hdfs dfs -mkdir -p /user/hduser
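Verify that the directory exists in HDFS:
hadoop fs -ls /user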

Web UIs:
NameNode                http://localhost:50070
ResourceManager         http://localhost:8088
MapReduce JobHistory    http://localhost:19888

Pig installation
hduser@chandu-Inspiron-N5010:~/Desktop$ sudo mv pig-0.16.0.tar.gz /usr/local/
hduser@chandu-Inspiron-N5010:~$ cd /usr/local/
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo tar -xvzf pig-0.16.0.tar.gz
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo ln -s pig-0.16.0 pig

Add pig path


chandu@chandu-Inspiron-N5010:~$ nano ~/.bashrc

export PIG_HOME=/usr/local/pig
export PIG_CONF_DIR=$PIG_HOME/conf
export PIG_CLASSPATH=$PIG_CONF_DIR
export PATH=$PIG_HOME/bin:$PATH
(Note: there must be no spaces around = in shell variable assignments.)
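Reload ~/.bashrc and check the installed version with pig -version (pig -x local would start the Grunt shell in local mode, which does not need the Hadoop daemons):

hduser@chandu-Inspiron-N5010:~$ source ~/.bashrc
hduser@chandu-Inspiron-N5010:~$ pig -version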

Hive installation
hduser@chandu-Inspiron-N5010:~/Desktop$ sudo mv apache-hive-2.1.0-bin.tar.gz /usr/local/
hduser@chandu-Inspiron-N5010:~$ cd /usr/local/
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo tar -xvzf apache-hive-2.1.0-bin.tar.gz
hduser@chandu-Inspiron-N5010:/usr/local/$ sudo ln -s apache-hive-2.1.0-bin hive
Add hive path (again in ~/.bashrc):
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin

With all Hadoop daemons running, create the Hive warehouse folders:
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir -p /tmp/hive
hdfs dfs -chmod 777 /tmp
hdfs dfs -chmod 777 /user/hive/warehouse
hdfs dfs -chmod 777 /tmp/hive

If you get an SLF4J "multiple bindings" warning when starting Hive:

Delete the duplicate log4j-slf4j-impl.jar from Hive's lib directory, since Hadoop
already provides the same binding and we will use the Hadoop-supplied jar:
sudo rm /usr/local/hive/lib/log4j-slf4j-impl.jar
Initialize the metastore database to be used with Hive.
From version 2.1.0 onward we have to tell the schema tool which type of database
we are using:

Change directory: cd /usr/local

schematool -initSchema -dbType derby
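If initialization succeeds, a quick sanity check is to open the Hive shell and list databases (this assumes the daemons and warehouse directories above are in place):

hduser@laptop:~$ hive
hive> show databases;
OK
default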
