Zookeeper Setup
Required Software
ZooKeeper runs on Java release 1.6 or greater (JDK 6 or greater), so download and install a JDK first.
ZooKeeper runs as an ensemble of ZooKeeper servers. The ensemble should have an odd number of servers, because ZooKeeper needs a majority of them to be available. For example, with four machines ZooKeeper can only handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority. That is no better than three machines, which also tolerate a single failure. However, with five machines ZooKeeper can handle the failure of two machines.
Three ZooKeeper servers is the minimum recommended size for an ensemble, and it is also recommended that they run on separate machines.
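The majority arithmetic above can be sketched in a few lines of shell. This is only an illustration: the function names quorum_size and tolerated_failures are made up for this example and are not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of ZooKeeper.

# Smallest number of servers that forms a majority of an ensemble of size n.
quorum_size() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of servers that can fail while a majority is still running.
tolerated_failures() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 3 4 5 6; do
  echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Running it shows that 3 and 4 servers both tolerate one failure, and 5 and 6 both tolerate two, which is why an even-sized ensemble buys nothing extra.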
Download Zookeeper: http://www.apache.org/dyn/closer.cgi/zookeeper/
Or from: http://zookeeper.apache.org/releases.html#download
In this walkthrough we deploy ZooKeeper under "/etc/zookeeper" and use /data/zookeeper as the ZooKeeper data directory, so I have downloaded the zookeeper-3.4.6.tar.gz file into "/etc/zookeeper/" and extracted it there.
- Create the directory structure.
[root@dn2 hadoop_conf_bkp]# mkdir /etc/zookeeper/
[root@dn2 hadoop_conf_bkp]# mkdir /data/zookeeper
[root@dn2 hadoop_conf_bkp]# chown hadoop:hadoop /etc/zookeeper/ /data/zookeeper/
[root@dn2 hadoop_conf_bkp]# chmod 775 /etc/zookeeper/ /data/zookeeper/
- Download Zookeeper and extract the content.
cd /etc/zookeeper/
tar -xvzf zookeeper-3.4.6.tar.gz
- Now create or edit zoo.cfg under the conf directory and add the following entries (assuming three servers host ZooKeeper, i.e. nn1.hadoop.com, dn1.hadoop.com and dn2.hadoop.com).
cd /etc/zookeeper/zookeeper-3.4.6/conf
vi zoo.cfg
dataDir=/data/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=nn1.hadoop.com:2888:3888
server.2=dn1.hadoop.com:2888:3888
server.3=dn2.hadoop.com:2888:3888
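Because the server.N numbering has to be identical on every host, it can help to generate the file from one host list. The following is a minimal sketch under that assumption; render_zoo_cfg is a hypothetical helper name, and the port and limit values are simply copied from the configuration above.

```shell
#!/usr/bin/env bash
# render_zoo_cfg is a hypothetical helper, not part of ZooKeeper.
# Usage: render_zoo_cfg DATA_DIR HOST [HOST...]
# Prints a zoo.cfg with the same settings as the example above.
render_zoo_cfg() {
  local data_dir=$1
  shift
  echo "dataDir=$data_dir"
  echo "clientPort=2181"
  echo "initLimit=5"
  echo "syncLimit=2"
  local id=1 host
  for host in "$@"; do
    # 2888: followers connect to the leader; 3888: leader election.
    echo "server.$id=$host:2888:3888"
    id=$((id + 1))
  done
}

render_zoo_cfg /data/zookeeper/ nn1.hadoop.com dn1.hadoop.com dn2.hadoop.com
```

Redirecting the output to conf/zoo.cfg would reproduce the file shown above; the order of the hosts decides which ID each server gets.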
- Create the "myid" file under the ZooKeeper data directory (the dataDir set in zoo.cfg) and put the respective server ID in it (i.e. 1, 2, 3 on nn1, dn1 and dn2 respectively).
cd /data/zookeeper
vi myid
[hadoop@nn1 conf]$ cat /data/zookeeper/myid
1
[hadoop@nn1 conf]$
- Now copy the ZooKeeper directory to the remaining servers and create the "myid" file on each of them.
You can do this using scp or rsync.
e.g.
for i in `cat /opt/hadoop/hadoop-2.7.1/etc/hadoop/slaves`;
do
echo $i; rsync -avxP --exclude=logs /etc/zookeeper/zookeeper-3.4.6 $i:/etc/zookeeper/;
echo "sync completed for host $i"
done
Now edit the "/data/zookeeper/myid" file on dn1 and dn2 (each host that is part of your ZooKeeper quorum) and set the value to 2 and 3 respectively.
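Editing myid by hand on every host is easy to get wrong, so one option is to look the ID up from the server.N lines in zoo.cfg. The sketch below assumes the host name passed in matches the name used in zoo.cfg exactly; myid_for_host is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# myid_for_host is a hypothetical helper, not part of ZooKeeper.
# Usage: myid_for_host ZOO_CFG HOST
# Prints the N of the "server.N=HOST:..." line for HOST, or nothing
# if the host is not listed.
myid_for_host() {
  local cfg=$1 host=$2
  awk -F= -v h="$host" '
    $1 ~ /^server\.[0-9]+$/ {
      split($2, parts, ":")          # parts[1] is the host name
      if (parts[1] == h) {
        sub(/^server\./, "", $1)     # keep only the numeric ID
        print $1
        exit
      }
    }' "$cfg"
}

# Demo against a temporary copy of the server lines used in this post.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=nn1.hadoop.com:2888:3888
server.2=dn1.hadoop.com:2888:3888
server.3=dn2.hadoop.com:2888:3888
EOF
myid_for_host "$cfg" dn1.hadoop.com   # prints 2
rm -f "$cfg"
```

On a real host the result could be written straight to the data directory, for example myid_for_host /etc/zookeeper/zookeeper-3.4.6/conf/zoo.cfg "$(hostname -f)" > /data/zookeeper/myid, assuming hostname -f returns the name that appears in zoo.cfg.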
- The required configuration is now done and we are ready to start the service.
Start Zookeeper service
[hadoop@nn1 bin]$ pwd
/etc/zookeeper/zookeeper-3.4.6/bin
[hadoop@nn1 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /etc/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@nn1 bin]$
Note: A log file named "zookeeper.out" is created in the directory from which the start script was executed (this is the default when ZOO_LOG_DIR is not set).
To check status:
./zkServer.sh status
In a three-server ZooKeeper quorum only one server is the leader and the other two are followers. We can run the command in a loop to check the status of each ZooKeeper server. The remote command is wrapped in single quotes so that hostname runs on the remote host and not on the local one:
[hadoop@nn1 zookeeper-3.4.6]$ for i in nn1 dn1 dn2
> do
> ssh $i 'echo "Hostname: `hostname`"; sh /etc/zookeeper/zookeeper-3.4.6/bin/zkServer.sh status'
> echo " "
> done
Hostname: nn1.hadoop.com
JMX enabled by default
Using config: /etc/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

Hostname: dn1.hadoop.com
JMX enabled by default
Using config: /etc/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader

Hostname: dn2.hadoop.com
JMX enabled by default
Using config: /etc/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

[hadoop@nn1 zookeeper-3.4.6]$
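To avoid reading the status output by eye, the "Mode:" lines can be counted. This is a small sketch that assumes each server reports its role on a line starting with "Mode: ", as in the output above; summarize_modes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# summarize_modes is a hypothetical helper, not part of ZooKeeper.
# It reads the combined "zkServer.sh status" output of all servers
# on stdin and counts the leader and follower lines.
summarize_modes() {
  awk '
    /^Mode: leader/   { leaders++ }
    /^Mode: follower/ { followers++ }
    END { printf "leaders=%d followers=%d\n", leaders, followers }'
}

# Demo with the three modes reported above.
printf 'Mode: follower\nMode: leader\nMode: follower\n' | summarize_modes
# prints: leaders=1 followers=2
```

The output of the status loop can be piped into summarize_modes; for a healthy three-server ensemble the expected result is one leader and two followers.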
You can also run zkCli.sh, found under the bin directory, to connect to the ZooKeeper shell and confirm that you can connect successfully.
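zkCli.sh accepts the servers to connect to through its -server option as a comma-separated list of host:port pairs. The sketch below only builds that string from the hosts and client port used in this post; zk_connect_string is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# zk_connect_string is a hypothetical helper, not part of ZooKeeper.
# Usage: zk_connect_string PORT HOST [HOST...]
# Prints HOST:PORT pairs joined by commas.
zk_connect_string() {
  local port=$1
  shift
  local out="" host
  for host in "$@"; do
    out="${out:+$out,}$host:$port"   # add a comma only between entries
  done
  echo "$out"
}

zk_connect_string 2181 nn1.hadoop.com dn1.hadoop.com dn2.hadoop.com
# prints: nn1.hadoop.com:2181,dn1.hadoop.com:2181,dn2.hadoop.com:2181
```

The result can then be passed on, for example ./zkCli.sh -server "$(zk_connect_string 2181 nn1.hadoop.com dn1.hadoop.com dn2.hadoop.com)". Once connected, running ls / in the ZooKeeper shell is a quick way to confirm the session works, and quit leaves the shell.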
To stop the ZooKeeper service, run the command below from the zookeeper-3.4.6 directory.
./bin/zkServer.sh stop
Thank you