Posted to common-user@hadoop.apache.org by Lin Zhao <li...@exabeam.com> on 2016/04/06 22:36:44 UTC

Is it possible to turn on data node encryption without kerberos?

I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Musty Rehmani <mu...@yahoo.com.INVALID>.
What version of Hadoop are you running?

Sent from Yahoo Mail on Android 
 
On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:

I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.

<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"

I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Musty,

Thanks for the response. I saw the same issue in two environments, one running 2.6.0-cdh5.4.7 (rcf5ade14581a102afdf8b7689b31ef225e7362fc) and the other 2.7.1 (r15ecc87ccf4a0228f35af08fc56de536e6ce657a).

I've tried different combinations of the algorithm and key length. The latest I've tried is:

<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>dfs.encrypt.data.transfer.algorithm</name>
  <value>3des</value>
</property>

<property>
  <name>dfs.encrypt.data.transfer.cipher.suites</name>
  <value>AES/CTR/NoPadding</value>
</property>

<property>
  <name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
  <value>128</value>
</property>

From: Musty Rehmani <mu...@yahoo.com>
Reply-To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>
Date: Wednesday, April 6, 2016 at 3:54 PM
To: Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming in and out of the cluster.
Sent from Yahoo Mail on Android

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:
I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Lin,

Great catch!  Yes, there is an implicit assumption that dfs.block.access.token.enable would have to be set to true.

The main purpose of this property is to activate usage of block access tokens for client DataNode interactions.  For example, upon opening a file, the user authenticates to the NameNode, the NameNode issues block access tokens that declare the user is authorized to access the blocks of the file, and then the client presents those block access tokens to the DataNodes to authorize access to the block.

The implicit dependency for data transfer protocol encryption comes about because that feature also relies on the same infrastructure within the NameNode for management of encryption keys.  Without the property enabled, the NameNode doesn't activate its secret management infrastructure, and therefore there is no encryption key available to use for data transfer protocol encryption.

Most fully secured deployments would set dfs.block.access.token.enable=true as part of the full security enablement procedure.  Since you're not fully enabling security, your case is the first time that I've seen this happen.
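For reference, enabling block access tokens is a single hdfs-site.xml property, applied to the NameNode and DataNodes and picked up after a restart. A minimal sketch, assuming Hadoop 2.x property names:

<property>
  <!-- Activates the NameNode's secret manager, which issues both block
       access tokens and the data transfer encryption key. -->
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>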

--Chris Nauroth

From: Lin Zhao <li...@exabeam.com>
Date: Wednesday, April 6, 2016 at 7:48 PM
To: Chris Nauroth <cn...@hortonworks.com>, "musty_rehmani@yahoo.com" <mu...@yahoo.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

Chris,

I followed the TestEncryptedTransfer test case, set dfs.block.access.token.enable to true, and the issue is resolved. I can't find documentation saying this property is mandatory for encrypted data transfer. What does this property do?

From: Chris Nauroth <cn...@hortonworks.com>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>, Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.

The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.
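The SASL quality-of-protection levels correspond to the following core-site.xml values. A sketch for illustration only, effective once SASL (e.g. Kerberos) authentication is actually in use:

<property>
  <name>hadoop.rpc.protection</name>
  <!-- authentication: SASL authenticates the connection only.
       integrity: adds per-message integrity checks.
       privacy: adds encryption of the RPC payload. -->
  <value>privacy</value>
</property>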

Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.

--Chris Nauroth

From: Musty Rehmani <mu...@yahoo.com.INVALID>
Reply-To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming in and out of the cluster.
Sent from Yahoo Mail on Android

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:
I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Chris,

I followed the TestEncryptedTransfer test case, set dfs.block.access.token.enable to true, and the issue is resolved. I can't find documentation saying this property is mandatory for encrypted data transfer. What does this property do?
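For anyone landing on this thread: the combination that resolved it, without Kerberos, was the following in hdfs-site.xml on the NameNode and DataNodes. A sketch of the settings described in this thread, not a hardened production configuration:

<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <!-- Implicitly required: without it the NameNode never starts the
       secret manager that generates the data transfer encryption key. -->
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>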

From: Chris Nauroth <cn...@hortonworks.com>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>, Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.

The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.

Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.

--Chris Nauroth

From: Musty Rehmani <mu...@yahoo.com.INVALID>
Reply-To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming in and out of the cluster.
Sent from Yahoo Mail on Android

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:
I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Musty Rehmani <mu...@yahoo.com.INVALID>.
We use SSH for all clients to interact with our cluster nodes. In addition, we use Ranger to authorize and audit users' access to data and job execution.
Just FYI.
Sent from Yahoo Mail on Android 
 
On Wed, Apr 6, 2016 at 7:37 PM, Lin Zhao <li...@exabeam.com> wrote:

Chris,

Thanks a lot for the detailed response. A little background on our deployment: our HDFS cluster is a single-tenant cluster deployed on a Docker cluster running across several hosts. Inside the Docker containers there's only one root user running everything, and there's no external network access to the container network. So we are ensuring authentication by controlling access to the physical boxes. Our major concern is sniffing of the data in transfer.

How does hadoop.rpc.protection work if set to privacy but without Kerberos? Is the communication still encrypted?
From: Chris Nauroth <cn...@hortonworks.com>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>, Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.
The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.
Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.
--Chris Nauroth
From: Musty Rehmani <mu...@yahoo.com.INVALID>
Reply-To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming in and out of the cluster.
Sent from Yahoo Mail on Android

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:

I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.

<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"

I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Chris,

Thanks a lot for the detailed response. A little background on our deployment: our HDFS cluster is a single-tenant cluster deployed on a Docker cluster running across several hosts. Inside the Docker containers there's only one root user running everything, and there's no external network access to the container network. So we are ensuring authentication by controlling access to the physical boxes. Our major concern is sniffing of the data in transfer.

How does hadoop.rpc.protection work if set to privacy but without Kerberos? Is the communication still encrypted?

From: Chris Nauroth <cn...@hortonworks.com>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>, Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.

The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.

Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.

--Chris Nauroth

From: Musty Rehmani <mu...@yahoo.com.INVALID>
Reply-To: "musty_rehmani@yahoo.com" <mu...@yahoo.com>
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <li...@exabeam.com>, "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming in and out of the cluster.
Sent from Yahoo Mail on Android

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao <li...@exabeam.com> wrote:
I've been trying to secure block data transferred by HDFS. I added the properties below to hdfs-site.xml and core-site.xml on the data node and the name node and restarted both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the HDFS command-line shell, the operation fails with "connection is reset" and I see the following in the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on Kerberos authentication; simple (no) authentication is acceptable in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Chris,

I followed the TestEncryptedTransfer test case and set dfs.block.access.token.enable to true and this issue is resolved. I can't find documentation this property is mandatory for encrypted data transfer. What does this property do?

From: Chris Nauroth <cn...@hortonworks.com>>
Date: Wednesday, April 6, 2016 at 4:02 PM
To: "musty_rehmani@yahoo.com<ma...@yahoo.com>" <mu...@yahoo.com>>, Lin Zhao <li...@exabeam.com>>, "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: Re: Is it possible to turn on data node encryption without kerberos?

It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.

The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.

Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.

--Chris Nauroth

From: Musty Rehmani <mu...@yahoo.com.INVALID>>
Reply-To: "musty_rehmani@yahoo.com<ma...@yahoo.com>" <mu...@yahoo.com>>
Date: Wednesday, April 6, 2016 at 2:54 PM
To: Lin Zhao <li...@exabeam.com>>, "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: Re: Is it possible to turn on data node encryption without kerberos?


Kerberos is used to authenticate user or service principal to grant access to cluster. It doesn't encrypt data blocks coming in and out of cluster.
Sent from Yahoo Mail on Android<https://overview.mail.yahoo.com/mobile/?.src=Android>

On Wed, Apr 6, 2016 at 4:36 PM, Lin Zhao
<li...@exabeam.com>> wrote:
I've been trying to secure block data transferred by HDFS. I added below to hdfs-site.xml and core-site xml to the data node and name node and restart both.


<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>


When I try to put a file from the hdfs command line shell, the operation fails with "connection is reset" and I see following from the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client at /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"


I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication, but didn't turn on kerberos authentication. No authentication works in my environment. Can this be the reason the handshake fails?

Any help is appreciated.

Thanks,

Lin Zhao

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Chris,

Thanks a lot for the detailed response. A little background on our deployment. Our HDFS cluster is a single-tenant  deployed on a Docker cluster running across several hosts. Inside of the docker containers there's only one root user running everything. And there's no external network access to the container network. So we are ensurring authentication by controlling access to the physical boxes. Our major concern is sniffing on the data in transfer.

How does hadoop.rpc.protection work if set to privacy but without kerberos? Is the communication still encrypted?

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Chris,

I followed the TestEncryptedTransfer test case, set dfs.block.access.token.enable to true, and the issue is resolved. I can't find any documentation saying this property is mandatory for encrypted data transfer. What does this property do?
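
In case it helps anyone searching the archives, the minimal hdfs-site.xml combination that worked for me looks like the sketch below (just the two relevant properties from my setup, not a complete secure-mode configuration):

<!-- hdfs-site.xml on the NameNode and all DataNodes; restart both
     after changing. As I understand it, block access tokens supply
     the shared secret from which the transfer encryption key is
     derived, which is why the handshake fails without them. -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>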

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Chris Nauroth <cn...@hortonworks.com>.
It is possible to turn on data transfer protocol encryption without enabling Kerberos authentication.  We have a test suite in the Hadoop codebase named TestEncryptedTransfer that configures data transfer encryption, but not Kerberos, and those tests are passing.

The hadoop.rpc.protection setting is unrelated to data transfer protocol.  Instead, it controls the SASL quality of protection for the RPC connections used by many Hadoop client/server interactions.  This won't really be active unless Kerberos authentication is enabled though.

Please note that even though it's possible to enable data transfer protocol encryption without using Kerberos authentication in the cluster, the benefit of that is questionable in a production deployment.  Without Kerberos authentication, it's very easy for an unauthenticated user to spoof another user and access their HDFS files.  Whether or not the data is encrypted in transit becomes irrelevant at that point.
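
For illustration, the setting is only meaningful alongside Kerberos authentication, roughly like this in core-site.xml (a minimal sketch, not a complete secure-mode configuration):

<!-- core-site.xml: hadoop.rpc.protection is a SASL quality-of-protection
     setting, so it only takes effect once Kerberos authentication is on. -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value> <!-- authentication | integrity | privacy -->
</property>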

--Chris Nauroth

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Lin Zhao <li...@exabeam.com>.
Musty,

Thanks for the response. I saw the same issue in two environments, one versioned 2.6.0-cdh5.4.7, rcf5ade14581a102afdf8b7689b31ef225e7362fc, and the other 2.7.1, r15ecc87ccf4a0228f35af08fc56de536e6ce657a.

I've tried different combinations of algorithm and key length. The latest I've tried is:

  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.algorithm</name>
    <value>3des</value>
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.cipher.suites</name>
    <value>AES/CTR/NoPadding</value>
  </property>

  <property>
    <name>dfs.encrypt.data.transfer.cipher.key.bitlength</name>
    <value>128</value>
  </property>
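
Side note, in case it matters: my reading of the 2.7 SecureMode docs (I may be wrong) is that these settings cooperate rather than conflict, roughly:

<!-- Assumption based on the SecureMode docs: with a cipher suite
     configured, the 3des/rc4 algorithm protects only the initial key
     exchange, and AES/CTR/NoPadding encrypts the block streams. -->
<property>
  <name>dfs.encrypt.data.transfer.algorithm</name>
  <value>3des</value> <!-- initial key exchange only -->
</property>

<property>
  <name>dfs.encrypt.data.transfer.cipher.suites</name>
  <value>AES/CTR/NoPadding</value> <!-- block data streams -->
</property>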

Re: Is it possible to turn on data node encryption without kerberos?

Posted by Musty Rehmani <mu...@yahoo.com.INVALID>.
Kerberos is used to authenticate a user or service principal to grant access to the cluster. It doesn't encrypt data blocks coming into and out of the cluster.
Sent from Yahoo Mail on Android 
 