Verifying HDFS in-transit encryption Using tcpdump and Wireshark


In this document we show how to verify whether data transferred to a Hadoop cluster with HDFS in-transit encryption enabled is actually encrypted on the wire.

Note: this document covers encryption of data in transit only. Encryption of data at rest will be covered in a separate document.

HDFS in-transit encryption:

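hadoop.rpc.protection protects RPC traffic, while the dfs.encrypt.data.transfer properties protect the block data transfer between clients and datanodes. As per the Apache Hadoop documentation (https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html), we can secure HDFS data in transit with the following properties: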

 

core-site.xml:
<property>
<name>hadoop.rpc.protection</name>
<value>privacy</value>
</property>

 

hdfs-site.xml:

<property>
<name>dfs.encrypt.data.transfer</name>
<value>true</value>
</property>

<property>
<name>dfs.encrypt.data.transfer.cipher.suites</name>
<value>AES/CTR/NoPadding</value>
</property>

 

<property>
<name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
<value>256</value>
</property>
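
Before capturing any traffic, it can help to confirm that the configuration actually carries these settings. The following is a minimal Python sketch, assuming the configuration is available as standard Hadoop XML (a `<configuration>` root wrapping the `<property>` elements shown above). The names `read_properties` and `missing_settings` are hypothetical helpers for illustration, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Settings from this document that switch on in-transit protection.
EXPECTED = {
    "hadoop.rpc.protection": "privacy",
    "dfs.encrypt.data.transfer": "true",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_settings(props, expected=EXPECTED):
    """Return the expected settings that are absent or set to a different value."""
    return {
        name: value
        for name, value in expected.items()
        if props.get(name, "").lower() != value
    }
```

On a real node you would pass in the contents of core-site.xml and hdfs-site.xml (their location depends on the installation) and merge the two dictionaries before calling `missing_settings`. An empty result only says the files contain the settings; the packet capture below is still what shows the effect on the wire.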

 

We will initiate an HDFS copy from a client machine (edge/gateway node) to:

  1. A Hadoop cluster without HDFS in-transit encryption.
  2. A Hadoop cluster with HDFS in-transit encryption enabled.

 

Cluster details:

Clusters without HDFS in-transit encryption

Cluster 1:

NN: 172.31.19.201

Datanodes:
172.31.20.196
172.31.29.174

Cluster 2:

NN: 172.31.18.62

Datanodes:
172.31.17.87
172.31.20.79

Cluster with HDFS in-transit encryption

NN: 172.31.22.103

Datanodes:
172.31.22.102
172.31.22.57

  1. Copy from a non-encrypted cluster to a non-encrypted cluster (or from a client node to a non-encrypted cluster):
  • SSH into the client node or any node of the non-encrypted cluster. Create a test file and copy it to HDFS:
  • hdfs dfs -put test_file.txt /tmp
  • Open another terminal on the same node and start tcpdump:
  • sudo tcpdump -i eth0 -vvv -w sha01.pcap
  • In the first terminal, run the HDFS copy command (copy to the non-encrypted cluster):

hdfs --loglevel DEBUG dfs -cp /tmp/test_file.txt hdfs://172.31.18.62/tmp

 

As soon as the copy command finishes, switch to the other terminal and stop tcpdump (Ctrl+C). tcpdump then completes the capture and writes it to a file named “sha01.pcap” in the current directory.

 

If you run the copy in debug mode, you will observe the following line:

—————

17/07/06 07:25:45 DEBUG sasl.SaslDataTransferClient: SASL client skipping handshake in unsecured configuration for addr = /172.31.17.87, datanodeId = DatanodeInfoWithStorage[172.31.17.87:50010,DS-b6c67c1f-d306-49b9-ada1-787ff0fde14b,DISK]

—————

Now let's inspect the tcpdump capture to see how the data packets were transmitted.

Steps to analyze tcpdump (pcap file):

  1. Download and install Wireshark on a Windows machine (if you don’t have it already).
  2. Copy the file “sha01.pcap” to the Windows machine to analyze the tcpdump output.
  3. Open the file in Wireshark and find a packet whose destination IP is one of the target cluster's datanodes and whose destination port is 50010. Right-click it and select Follow -> TCP Stream.
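
As a scripted alternative to finding the stream by hand, the same filtering can be sketched in Python using only the standard library. This is a rough sketch under several assumptions: the capture is a classic pcap file (which is what `tcpdump -w` writes) with microsecond timestamps, captured on an Ethernet interface, carrying untagged IPv4 traffic. The function name `tcp_payloads` is hypothetical, and it performs no TCP stream reassembly, so it is not a replacement for Wireshark's Follow TCP Stream.

```python
import socket
import struct


def tcp_payloads(pcap_bytes, dst_ip, dst_port):
    """Yield non-empty TCP payloads of IPv4 packets sent to dst_ip:dst_port."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    offset = 24  # size of the pcap global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        _, _, incl_len, _ = struct.unpack(endian + "IIII", pcap_bytes[offset:offset + 16])
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len

        # Need an Ethernet header plus a minimal IPv4 header; EtherType 0x0800 is IPv4.
        if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
            continue
        ip = frame[14:]
        if ip[9] != 6:  # IP protocol 6 is TCP
            continue
        if socket.inet_ntoa(ip[16:20]) != dst_ip:
            continue
        ihl = (ip[0] & 0x0F) * 4
        total_len = struct.unpack("!H", ip[2:4])[0]
        # Captures taken with segmentation offload may report 0 here;
        # fall back to the captured length in that case.
        end = total_len if total_len else len(ip)
        tcp = ip[ihl:end]
        if len(tcp) < 20:
            continue
        if struct.unpack("!H", tcp[2:4])[0] != dst_port:
            continue
        data_offset = (tcp[12] >> 4) * 4
        payload = tcp[data_offset:]
        if payload:
            yield payload
```

For the capture above, you could read “sha01.pcap” in binary mode and call `tcp_payloads(data, "172.31.17.87", 50010)`, using whichever datanode address appears in your own debug output.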

 

 

 

In the followed TCP stream, the file contents are readable because they were sent in clear text. Now let’s look at the TCP stream for a copy operation to the secure cluster.

 

  2. Copy from a non-encrypted cluster to the encrypted cluster (or from a client node to the encrypted cluster):
  • SSH into the NN of the non-encrypted cluster.
  • Create a test file and copy it to HDFS:
  • hdfs dfs -put test_file.txt /tmp
  • Open another terminal on the same node and start tcpdump:
  • sudo tcpdump -i eth0 -vvv -w sha02.pcap
  • In the first terminal, run the HDFS copy command (copy from the non-encrypted cluster to the encrypted cluster):

hdfs --loglevel DEBUG dfs -cp /tmp/test_file.txt hdfs://172.31.22.103/tmp

 

You will observe the following lines in the debug output, which confirm that an encrypted transfer is being used:

————

17/07/06 07:16:35 DEBUG hdfs.DFSClient: Getting new encryption token from NN

17/07/06 07:16:35 DEBUG ipc.ProtobufRpcEngine: Call: getDataEncryptionKey took 1ms

17/07/06 07:16:35 DEBUG sasl.SaslDataTransferClient: SASL client doing encrypted handshake for addr = /172.31.22.102, datanodeId = DatanodeInfoWithStorage[172.31.22.102:50010,DS-146d0277-eff0-4d49-a1c4-32b2f321f3bf,DISK]

17/07/06 07:16:35 DEBUG sasl.SaslDataTransferClient: Client using encryption algorithm null

17/07/06 07:16:35 DEBUG sasl.DataTransferSaslUtil: Verifying QOP, requested QOP = [auth-conf], negotiated QOP = auth-conf

17/07/06 07:16:35 DEBUG security.SaslInputStream: Actual length is 22

—–
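
The two kinds of debug output shown in this document can also be told apart programmatically. Below is a minimal Python sketch that scans saved debug output for the two SaslDataTransferClient messages quoted above and reports, per datanode address, whether the handshake was skipped or encrypted. The function name `handshake_modes` is hypothetical, and the message wording is taken only from the log lines in this document; other Hadoop versions may phrase these messages differently.

```python
import re

# Matches the two SaslDataTransferClient messages quoted in this document.
_HANDSHAKE = re.compile(
    r"SASL client (skipping handshake in unsecured configuration"
    r"|doing encrypted handshake) for addr = /([0-9.]+)"
)


def handshake_modes(debug_output):
    """Map each datanode address found in the debug output to 'skipped' or 'encrypted'."""
    modes = {}
    for match in _HANDSHAKE.finditer(debug_output):
        message, addr = match.groups()
        modes[addr] = "skipped" if message.startswith("skipping") else "encrypted"
    return modes
```

To use it, redirect the output of the `hdfs --loglevel DEBUG dfs -cp ...` command to a file and pass the file's text to `handshake_modes`. A datanode reported as "skipped" received its data without the encrypted handshake.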

Now let's look at the tcpdump capture in Wireshark. As in the previous step, copy the “sha02.pcap” file to the Windows machine and open it in Wireshark.

 

Encrypted Message:

As you can see, the contents are encrypted and not human readable. This confirms that the data was encrypted while it was being transmitted, so in-transit encryption worked as expected.
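
The visual check of "readable" versus "not readable" can be backed by a rough numeric heuristic. The Python sketch below computes the share of printable ASCII bytes and the Shannon entropy of a byte string, for example the raw bytes saved from a followed TCP stream. The function names and the 0.9 threshold are assumptions chosen for illustration. High entropy alone does not prove encryption, because compressed or binary files look similar, so this is best used with a plain-text test file as in this walkthrough.

```python
import math
import os
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_ratio(data):
    """Return the fraction of bytes that are printable ASCII, tab, LF, or CR."""
    if not data:
        return 0.0
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data)


def looks_like_clear_text(data, min_printable=0.9):
    """Heuristic only: treat mostly-printable payloads as clear text."""
    return printable_ratio(data) >= min_printable


if __name__ == "__main__":
    sample_text = b"this is a test file for hdfs\n" * 50
    sample_random = os.urandom(4096)  # stand-in for encrypted payload bytes
    for label, blob in (("text", sample_text), ("random", sample_random)):
        print(label, round(printable_ratio(blob), 2), round(shannon_entropy(blob), 2))
```

A payload from the unencrypted copy should score close to 1.0 on the printable ratio, while a payload from the encrypted copy should look much more like the random sample.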

