Posted to mapreduce-user@hadoop.apache.org by Jingfei Hu <ji...@hotmail.com> on 2015/11/24 02:44:06 UTC

RE: Am I understanding right?

Anyone?

 

From: Jingfei Hu [mailto:jingfei.hu@gmail.com] 
Sent: Monday, November 23, 2015 6:26 PM
To: user@hadoop.apache.org
Cc: jingfei_hu@hotmail.com
Subject: Am I understanding right?

 

Hi team,

I am having trouble accessing a Kerberos-enabled HDFS using the webhdfs
protocol. The Hadoop deployment is an HDP sandbox in Windows Azure (just
one node). I tried several things.

1.       Enabled Kerberos according to the wizard

a.       I can access the hdfs file using webhdfs on that node with the correct
Kerberos user name and password. (I am using curl --negotiate ...)

b.       But I can't access the hdfs file from outside the hdfs cluster, say
from a Windows 10 client in our corp network.

2.       Enabled Kerberos and connected it with an LDAP

a.       I can access the hdfs file using webhdfs on that node with the correct
Kerberos user name and password. (I am using curl --negotiate ...)

b.       I can access the hdfs file using webhdfs from a machine within the
domain which is connected with the KDC, using the KDC user name and password

c.       I can access the hdfs file using webhdfs from a machine within the
domain which is connected with the KDC, using the domain account and password

So my question is: will 1.b work under any circumstances? Or is it not working
by design?

 

Thanks,

Jingfei


Re: Am I understanding right?

Posted by Larry McCay III <lm...@hortonworks.com>.
Well, it isn’t necessarily difficult if you can kinit to the same KDC as is used inside the cluster OR to one that is explicitly set up to be trusted by the cluster KDC.
I don’t know what is involved for cross domain trust with a cluster on Azure.

If you are able to do that and you have line of sight to the webhdfs host:port then it should work with a kinit before using curl --negotiate.
You need to understand the network security involved: is the cluster firewalled off, is there cross domain trust set up or are you sharing the same KDC as the cluster, etc.
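
As a rough sketch of the kinit-then-curl sequence described above, assuming a client that can reach both the KDC and the webhdfs endpoint. The principal, host name and port in the example call are placeholders rather than values from this thread, and the helper function names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: every host, port, realm and user name here is a placeholder.

# Build a WebHDFS REST URL of the form http://<host:port>/webhdfs/v1<path>?op=<OP>
webhdfs_url() {
  # $1 = host:port of the NameNode HTTP endpoint
  # $2 = absolute HDFS path, $3 = WebHDFS operation (e.g. OPEN)
  printf 'http://%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3"
}

# Get a Kerberos ticket, then let curl authenticate with it via SPNEGO.
webhdfs_open() {
  # $1 = Kerberos principal, $2 = host:port, $3 = absolute HDFS path
  kinit "$1" || return 1   # prompts for the password; needs network access to the KDC
  klist                    # shows the ticket cache so you can confirm a ticket was issued
  # "-u :" passes an empty user so curl uses the ticket cache instead of a password;
  # -L follows the redirect that an OPEN request normally returns.
  curl -i -L --negotiate -u : "$(webhdfs_url "$2" "$3" OPEN)"
}

# Example call (hypothetical names):
# webhdfs_open alice@EXAMPLE.COM sandbox.example.com:50070 /tmp/filename
```

If kinit itself fails on the client, the problem is reaching or trusting the KDC rather than anything specific to webhdfs; if kinit succeeds but curl cannot connect, the host:port is probably not reachable from that machine.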

You haven’t said what error you receive, so it is a bit tough to speculate any further.

On Nov 27, 2015, at 4:01 AM, Jingfei Hu <ji...@gmail.com>> wrote:

Hi Larry,
Thanks for your reply. But I am still confused about this statement:
"Direct access to webhdfs will be difficult from your desktop."

What do you mean by difficult? Is it impossible? My main question is: can I use the user name and password recognized by the KDC (of my kerberized hdfs cluster) with the webhdfs protocol to access hdfs files and directories? My thinking goes like ‘Hey, I’ve got the user name and password, that’s all the KDC needs to verify I am a valid user, so why can’t I access the files?’. What else do I need to do to get this working? Does the KDC always require a request from a trusted machine instead of just a user name and password?

Thanks,
Jingfei
From: Larry McCay III [mailto:lmccay@hortonworks.com]
Sent: Tuesday, November 24, 2015 1:15 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Cc: Jingfei Hu <ji...@gmail.com>
Subject: Re: Am I understanding right?

Hi Jingfei -

Once you kerberize your cluster, you will generally need to be able to authenticate to a KDC that is either shared with the cluster or linked to the cluster’s KDC through some sort of cross domain trust.
You might consider using Apache Knox to authenticate an external client to Knox via LDAP or some other mechanism; Knox will then take care of the strong authentication required to access secured Hadoop resources.

You may access a file in HDFS this way with curl using HTTP basic auth against LDAP, for example:

curl -ivku username:password -X GET 'https://host:port/gateway/sandbox/webhdfs/v1/tmp/filename?op=OPEN'

Direct access to webhdfs will be difficult from your desktop.

Hope that helps,

—larry

On Nov 23, 2015, at 8:44 PM, Jingfei Hu <ji...@hotmail.com>> wrote:


Anyone?

From: Jingfei Hu [mailto:jingfei.hu@gmail.com]
Sent: Monday, November 23, 2015 6:26 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Cc: jingfei_hu@hotmail.com<ma...@hotmail.com>
Subject: Am I understanding right?

Hi team,
I am having trouble accessing a Kerberos-enabled HDFS using the webhdfs protocol. The Hadoop deployment is an HDP sandbox in Windows Azure (just one node). I tried several things.
1.       Enabled Kerberos according to the wizard
a.       I can access the hdfs file using webhdfs on that node with the correct Kerberos user name and password. (I am using curl --negotiate ...)
b.       But I can’t access the hdfs file from outside the hdfs cluster, say from a Windows 10 client in our corp network.
2.       Enabled Kerberos and connected it with an LDAP
a.       I can access the hdfs file using webhdfs on that node with the correct Kerberos user name and password. (I am using curl --negotiate ...)
b.       I can access the hdfs file using webhdfs from a machine within the domain which is connected with the KDC, using the KDC user name and password
c.       I can access the hdfs file using webhdfs from a machine within the domain which is connected with the KDC, using the domain account and password
So my question is: will 1.b work under any circumstances? Or is it not working by design?

Thanks,
Jingfei


Re: Am I understanding right?

Posted by Larry McCay III <lm...@hortonworks.com>.
Hi Jingfei -

Once you kerberize your cluster, you will generally need to be able to authenticate to a KDC that is either shared with the cluster or linked to the cluster’s KDC through some sort of cross domain trust.
You might consider using Apache Knox to authenticate an external client to Knox via LDAP or some other mechanism; Knox will then take care of the strong authentication required to access secured Hadoop resources.

You may access a file in HDFS this way with curl using HTTP basic auth against LDAP, for example:

curl -ivku username:password -X GET 'https://host:port/gateway/sandbox/webhdfs/v1/tmp/filename?op=OPEN'
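
The same Knox request can be wrapped in a small helper, shown here as a sketch under assumptions: the gateway host, port, user and function names are hypothetical, and only the /gateway/sandbox/webhdfs/v1 path layout comes from the command above. It leaves the password off the command line so that curl prompts for it:

```shell
#!/bin/sh
# Sketch only: gateway address, topology and user are placeholders.

# Build a Knox-proxied WebHDFS URL:
# https://<host:port>/gateway/<topology>/webhdfs/v1<path>?op=<OP>
knox_webhdfs_url() {
  # $1 = gateway host:port, $2 = topology name
  # $3 = absolute HDFS path, $4 = WebHDFS operation
  printf 'https://%s/gateway/%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Read a file through Knox using HTTP basic auth.
knox_open() {
  # $1 = LDAP user, $2 = gateway host:port, $3 = topology, $4 = HDFS path
  # "-u user" without a password makes curl prompt for it, which keeps the
  # password out of the shell history and the process list.
  # The -k flag (skip TLS certificate checks) is deliberately omitted; it is
  # only needed when the gateway presents a self-signed certificate.
  curl -i -L -u "$1" "$(knox_webhdfs_url "$2" "$3" "$4" OPEN)"
}

# Example call (hypothetical names):
# knox_open alice knox.example.com:8443 sandbox /tmp/filename
```

With this route the desktop client never talks to the KDC; it only needs to reach the Knox gateway over HTTPS.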

Direct access to webhdfs will be difficult from your desktop.

Hope that helps,

—larry

On Nov 23, 2015, at 8:44 PM, Jingfei Hu <ji...@hotmail.com>> wrote:

Anyone?

From: Jingfei Hu [mailto:jingfei.hu@gmail.com]
Sent: Monday, November 23, 2015 6:26 PM
To: user@hadoop.apache.org<ma...@hadoop.apache.org>
Cc: jingfei_hu@hotmail.com<ma...@hotmail.com>
Subject: Am I understanding right?

Hi team,
I am having trouble accessing a Kerberos-enabled HDFS using the webhdfs protocol. The Hadoop deployment is an HDP sandbox in Windows Azure (just one node). I tried several things.
1.       Enabled Kerberos according to the wizard
a.       I can access the hdfs file using webhdfs on that node with the correct Kerberos user name and password. (I am using curl --negotiate ...)
b.       But I can’t access the hdfs file from outside the hdfs cluster, say from a Windows 10 client in our corp network.
2.       Enabled Kerberos and connected it with an LDAP
a.       I can access the hdfs file using webhdfs on that node with the correct Kerberos user name and password. (I am using curl --negotiate ...)
b.       I can access the hdfs file using webhdfs from a machine within the domain which is connected with the KDC, using the KDC user name and password
c.       I can access the hdfs file using webhdfs from a machine within the domain which is connected with the KDC, using the domain account and password
So my question is: will 1.b work under any circumstances? Or is it not working by design?

Thanks,
Jingfei

