Oozie ssh action on EMR cluster

Prerequisites for Oozie ssh action on EMR cluster:

Note that for an Oozie ssh action, Oozie tries to ssh into the remote host as the oozie user. We therefore first need to make sure that we can ssh into the remote host from the Oozie server as the oozie user. We also need to keep in mind that, on an EMR cluster, by default we can only ssh into the EC2 instances using the key pair specified during cluster creation.

To set up an ssh connection from the Oozie server (master node) to any remote server as the oozie user, perform the following steps.

1. ssh into the master node using your key pair: ssh -i "<key file>" hadoop@<masternode_ip/dns>
2. Copy the key file to the master node and put it under "/home/hadoop/.ssh".
3. Rename the file to id_rsa and change its permissions to 400. (You should now be able to ssh into any slave node from the master node as the hadoop user; this check is optional.)
4. Run the following commands to enable su for the oozie user:
sudo chsh -s /bin/bash oozie
sudo su oozie (make sure you are able to su to oozie, then type exit to go back to the hadoop user session)
5. Now, as the hadoop user, run the following commands to put the key file under the oozie user's .ssh folder:

sudo cp ~/.ssh/id_rsa /var/lib/oozie/.ssh/id_rsa

sudo chown oozie:oozie /var/lib/oozie/.ssh/id_rsa

sudo chmod 400 /var/lib/oozie/.ssh/id_rsa
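
The copy, ownership, and permission commands above can also be wrapped in a small helper. The following is only a minimal sketch and not part of the original procedure: the function name install_key is hypothetical, and the extra mkdir and chmod 700 on the .ssh directory are assumptions, added in case the directory does not exist yet or is too permissive for ssh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original steps): install a private key
# as id_rsa in a target .ssh directory with strict permissions.
set -euo pipefail

# Usage: install_key SRC_KEY DEST_SSH_DIR [OWNER]
#   OWNER (e.g. oozie:oozie) is optional; changing ownership needs root.
install_key() {
  local src="$1" dest_dir="$2" owner="${3:-}"

  # Assumption: create the .ssh directory if it is missing.
  mkdir -p "$dest_dir"
  chmod 700 "$dest_dir"

  # Remove a previous read-only copy so the function can be re-run.
  rm -f "$dest_dir/id_rsa"
  cp "$src" "$dest_dir/id_rsa"
  chmod 400 "$dest_dir/id_rsa"

  # Hand the directory and key to the target user when one is given.
  if [ -n "$owner" ]; then
    chown -R "$owner" "$dest_dir"
  fi
}
```

On the master node this would be called from a root shell, since a shell function cannot be invoked through sudo directly, for example: install_key /home/hadoop/.ssh/id_rsa /var/lib/oozie/.ssh oozie:oozie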

Now switch to the oozie user (sudo su oozie) and verify that you are able to ssh into the required remote host, or even localhost.

e.g. ssh hadoop@localhost
ssh hadoop@<remotehost_ip>
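
When several hosts need to be checked, the manual logins above could be repeated non-interactively. This is a rough sketch under stated assumptions: the function names check_ssh and check_all and the SSH_BIN variable are hypothetical, and the 10-second timeout is an arbitrary choice. It relies on the standard OpenSSH options BatchMode and ConnectTimeout.

```shell
#!/usr/bin/env bash
# Hypothetical check (not from the original steps): test key-based ssh
# logins without any interactive prompt.
set -euo pipefail

# SSH_BIN is an assumed override hook, handy for testing; defaults to ssh.
SSH_BIN="${SSH_BIN:-ssh}"

# Usage: check_ssh user@host
# Returns 0 if a non-interactive login succeeds, non-zero otherwise.
check_ssh() {
  local target="$1"
  # BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
  "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=10 "$target" true
}

# Usage: check_all user@host [user@host ...]
# Prints one OK/FAILED line per target; returns non-zero if any failed.
check_all() {
  local rc=0 target
  for target in "$@"; do
    if check_ssh "$target" >/dev/null 2>&1; then
      echo "OK     $target"
    else
      echo "FAILED $target"
      rc=1
    fi
  done
  return "$rc"
}
```

Run as the oozie user, a call might look like: check_all hadoop@localhost hadoop@<remotehost_ip>. A host whose key has not yet been accepted into known_hosts may also be reported as FAILED in batch mode, so the manual logins above remain the first step.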

Once the ssh connection is successful, run the Oozie job with the ssh action and check that it executes successfully.

