Posted to common-user@hadoop.apache.org by jiang licht <li...@yahoo.com> on 2010/03/05 22:57:24 UTC

Security issue: hadoop fs shell bypass authentication?

I am considering the following problem: if someone knows the master address and ports of a Hadoop cluster, can they run the hadoop fs shell to read, write, update, or delete data in the cluster without any authentication? Of course, if the cluster is built on an isolated private network, there is no such issue. But since ssh has to be set up to maintain the cluster (the namenode talking to datanodes and the jobtracker talking to tasktrackers), I am wondering whether the hadoop fs shell also needs such authentication.

The following is what I did and found. It seems that hadoop fs shell bypasses any authentication. Any thoughts?

Hadoop cluster HC: master A as namenode/jobtracker, slave B as datanode/tasktracker.

A third box, C, knows the address of A and the namenode port NAMENODEPORT. So I set "fs.default.name" to "hdfs://A:NAMENODEPORT" in core-site.xml on C. There is no public key of A on C. C is not part of any cluster; it only has the Hadoop program and the "core-site.xml" setting above.
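
For illustration, the client-side setting described above amounts to a single property. The following is a minimal Python sketch that builds such a core-site.xml fragment. The function name core_site_xml and the port 8020 in the usage lines are assumptions made for the example, not values taken from this thread.

```python
import xml.etree.ElementTree as ET


def core_site_xml(namenode_host, namenode_port):
    """Build a core-site.xml fragment that points a client at a namenode."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "fs.default.name"
    ET.SubElement(prop, "value").text = f"hdfs://{namenode_host}:{namenode_port}"
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    # "A" is the master from the description above; 8020 is an assumed port.
    print(core_site_xml("A", 8020))
```

Nothing in this fragment carries a credential: the only thing the client needs is the namenode address and port.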

Then, sitting on C, I can run various fs shell commands! For example, I can upload a file from C to HC using "hadoop fs -copyFromLocal file@C dst@HC", or delete a file in HC from C using "hadoop fs -rm file@HC".

So, this means that hadoop fs shell does not require any authentication and can be fired from anywhere?

Thanks,
--

Michael


      

Re: Security issue: hadoop fs shell bypass authentication?

Posted by Allen Wittenauer <aw...@linkedin.com>.


On 3/8/10 11:06 AM, "Michael Segel" <mi...@hotmail.com> wrote:

> The other issue is how secure is 'secure enough' ?
> If you limit the physical access to the cloud, limit connectivity to the cloud
> to certain 'choke' points, and then authenticate at the client level prior to
> connection,

... which is exactly what the Kerberos piece will do. :)  Right now, Hadoop
has no authentication prior to client connect.  Instead, the client supplies
the output of whoami by default, or you can override that by setting
hadoop.job.ugi as part of your configuration object.
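
To make the consequence concrete, here is a toy Python model (not Hadoop code) of a service that authorizes requests using only the user name the client reports. All names in it (OWNERS, client_identity, server_allows_delete) and the sample path are hypothetical.

```python
import os

# Toy model, NOT Hadoop code: path -> owning user (hypothetical sample data).
OWNERS = {"/user/alice/report.txt": "alice"}


def client_identity(override=None):
    """Name the client will claim: an explicit override, else the local login name."""
    if override is not None:
        return override
    # Stand-in for the output of `whoami`; falls back when USER is unset.
    return os.environ.get("USER") or "unknown"


def server_allows_delete(path, claimed_user):
    """Permission check that trusts the claimed name; nothing verifies it."""
    return OWNERS.get(path) == claimed_user


if __name__ == "__main__":
    path = "/user/alice/report.txt"
    # The caller's real login name is irrelevant once the override is set.
    print(server_allows_delete(path, client_identity("alice")))
    print(server_allows_delete(path, client_identity("mallory")))
```

The check passes for whoever claims to be the owner, which is why permissions alone do not help until the claimed identity can be verified.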

> you can put in enough security to satisfy most of today's
> applications. Is this perfect? No, but it's going to take time to add security
> that will even get down to the column level in HBase. Granted at the Hadoop
> level, you can add more security to custom job control packages.

I mostly agree, but...

Putting controls at the Java level (what I assume you meant by custom job
control packages) doesn't really help that much when user code executes as
the same user as the one who can write to the framework's config files.
Thus LinuxTaskController--it forks the child processes as the user who
submitted the job.  It is trivial to expand that to do things like run a job
in a container or other fancy things if you need an even finer level of
control (processor sets, better memory limits, privilege escalation, whatever).

It is hoped that with these two parts in place, Hadoop will meet that
"secure enough" level for the vast majority of users. Sure it'd be great
(and a fairly logical extension) to have things like labels. Without these
basic parts of the plumbing in place though, those more advanced features
are completely out of reach.


RE: Security issue: hadoop fs shell bypass authentication?

Posted by Michael Segel <mi...@hotmail.com>.


> Date: Sun, 7 Mar 2010 22:36:30 -0800
> Subject: Re: Security issue: hadoop fs shell bypass authentication?
> From: awittenauer@linkedin.com
> To: common-user@hadoop.apache.org
>
> Any user with access can impersonate any other user, including the hadoop
> root user, without something like Kerberos in play.  Permissions are
> meaningless without the ability to verify someone is who they say they
> are.
> 
> For companies that are dealing with data that needs to be SOX, PCI, etc,
> compliant, this is part of a major requirement to make Hadoop truly viable.
> Otherwise all the users on the grid are technically insiders, etc.
> 
> [See also the stuff around LinuxTaskController, which runs user code as the
> user who submitted rather than a common user.]
> 

I don't disagree with you but I will have to say that if these are concerns, then you are using the wrong tool for the job.

Now don't get me wrong, but if you look at the relational database engines, they've had 20+ years to get it right and they still have issues. HBase? It's still relatively in diapers. (What else can you compare Hadoop to but RDBMSs?)

The other issue is how secure is 'secure enough' ?
If you limit the physical access to the cloud, limit connectivity to the cloud to certain 'choke' points, and then authenticate at the client level prior to connection, you can put in enough security to satisfy most of today's applications. Is this perfect? No, but it's going to take time to add security that will even get down to the column level in HBase. Granted, at the Hadoop level, you can add more security to custom job control packages.





 		 	   		  

Re: Security issue: hadoop fs shell bypass authentication?

Posted by Allen Wittenauer <aw...@linkedin.com>.


On 3/6/10 10:41 PM, "jiang licht" <li...@yahoo.com> wrote:

> I can feel that pain; Kerberos makes you pull more hair from your head :) I
> worked on it a while back and now remember only a bit of it.

The only other real choice is PKI. CRLs? Blech.   I'd much rather tie the
grid into my pre-existing Active Directory structure, so I can use its
Kerberos functionality and get SSO for free in the process.
 
> But anyway, a secured hadoop cluster can be created on top of a carefully
> designed and deployed network and firewall system anyway, that's what ppl are
> now using, so, no worry actually ...

That only gets you so far.

Any user with access can impersonate any other user, including the hadoop
root user, without something like Kerberos in play.  Permissions are
meaningless without the ability to verify someone is who they say they
are.

For companies that are dealing with data that needs to be SOX, PCI, etc,
compliant, this is part of a major requirement to make Hadoop truly viable.
Otherwise all the users on the grid are technically insiders, etc.

[See also the stuff around LinuxTaskController, which runs user code as the
user who submitted rather than a common user.]


Re: Security issue: hadoop fs shell bypass authentication?

Posted by jiang licht <li...@yahoo.com>.
I can feel that pain; Kerberos makes you pull more hair from your head :) I worked on it a while back and now remember only a bit of it.

But anyway, a secured Hadoop cluster can be built on top of a carefully designed and deployed network and firewall system. That's what people are using now, so no worry actually ...

Thanks,
--
Michael




      

Re: Security issue: hadoop fs shell bypass authentication?

Posted by Edward Capriolo <ed...@gmail.com>.
The upcoming security will work with Kerberos. Actions like running a
map reduce job will involve getting a Kerberos ticket and passing it
along. I have dodged Kerberos for a long time and am not looking forward
to much more complexity, but it will almost certainly be a switchable
on/off config option.


Re: Security issue: hadoop fs shell bypass authentication?

Posted by Huy Phan <da...@gmail.com>.
IMO, we should handle the security part at the system level. In this case,
you can configure iptables to restrict connections to the namenode.
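
As a rough sketch of that approach, the Python helper below only builds and prints iptables commands that accept namenode traffic from a list of trusted client addresses and drop everything else on that port. The helper name, the addresses, and port 8020 are assumptions for illustration; substitute your own namenode port and review the rules against your existing firewall policy before applying anything.

```python
def namenode_firewall_rules(allowed_sources, port):
    """Return iptables commands: accept the listed sources, drop the rest."""
    rules = [
        f"iptables -A INPUT -p tcp -s {src} --dport {port} -j ACCEPT"
        for src in allowed_sources
    ]
    # Appended last, so it only matches traffic not accepted above.
    rules.append(f"iptables -A INPUT -p tcp --dport {port} -j DROP")
    return rules


if __name__ == "__main__":
    # Hypothetical trusted client addresses and an assumed namenode port.
    for rule in namenode_firewall_rules(["10.0.0.5", "10.0.0.6"], 8020):
        print(rule)
```

Rules like these narrow which hosts can reach the namenode; they do not verify which user on an allowed host is issuing the commands.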



Re: Security issue: hadoop fs shell bypass authentication?

Posted by jiang licht <li...@yahoo.com>.
Good to know. I look forward to seeing the next release of Hadoop with these new security features...
 
Thanks,
--
Michael




      

Re: Security issue: hadoop fs shell bypass authentication?

Posted by Owen O'Malley <om...@apache.org>.
On Mar 5, 2010, at 4:49 PM, Allen Wittenauer wrote:

> On 3/5/10 1:57 PM, "jiang licht" <li...@yahoo.com> wrote:
>> So, this means that hadoop fs shell does not require any authentication and
>> can be fired from anywhere?
>
> There is no authentication/security layer in any released version of Hadoop.

True, although we are busily adding it. *Smile* It is going into trunk
and Yahoo is backporting all of the security work on top of the Yahoo
0.20 branch. The primary coding is done; it is undergoing QA now. The
plan is to get it onto the alpha clusters by April, and production
clusters by August. Although we haven't pushed the security branch out
to our github repository yet, we should soon. (http://github.com/yahoo/hadoop-common)

-- Owen

Re: Security issue: hadoop fs shell bypass authentication?

Posted by jiang licht <li...@yahoo.com>.
Thanks, Allen. I figured that out: the namenode/jobtracker faithfully services any incoming request, apart from checking its access control list if one is specified by dfs.hosts and dfs.hosts.exclude ...

Thanks,
--

Michael

--- On Fri, 3/5/10, Allen Wittenauer <aw...@linkedin.com> wrote:

From: Allen Wittenauer <aw...@linkedin.com>
Subject: Re: Security issue: hadoop fs shell bypass authentication?
To: common-user@hadoop.apache.org
Date: Friday, March 5, 2010, 6:49 PM




On 3/5/10 1:57 PM, "jiang licht" <li...@yahoo.com> wrote:
> So, this means that hadoop fs shell does not require any authentication and
> can be fired from anywhere?

There is no authentication/security layer in any released version of Hadoop.




      

Re: Security issue: hadoop fs shell bypass authentication?

Posted by Allen Wittenauer <aw...@linkedin.com>.


On 3/5/10 1:57 PM, "jiang licht" <li...@yahoo.com> wrote:
> So, this means that hadoop fs shell does not require any authentication and
> can be fired from anywhere?

There is no authentication/security layer in any released version of Hadoop.