Posted to hdfs-dev@hadoop.apache.org by Steve Loughran <st...@hortonworks.com> on 2018/07/04 15:58:57 UTC

[DISCUSS]: securing ASF Hadoop releases out of the box

Bitcoins are profitable enough to justify writing malware to run on Hadoop clusters & schedule mining jobs: there have been a couple of incidents of this in the wild, generally going in through no security, well-known passwords, and open ports.

Vendors of Hadoop-related products get to deal with their lockdown themselves, which they often do by installing Kerberos from the outset, making users make up their own passwords for admin accounts, etc.

The ASF releases, though: we just provide something insecure out of the box and some docs saying "use Kerberos if you want security".

What can we do here?

Some things to think about

* docs explaining IN CAPITAL LETTERS why you need to lock down your cluster to a private subnet or use Kerberos
* Anything which can be done to make Kerberos easier (?). I see there are some outstanding patches for HADOOP-12649 which need review, but what else?

Could we have Hadoop determine when it's coming up on an open network and start warning? And how? 
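
As a purely illustrative sketch (not existing Hadoop code), such a check could enumerate the local interfaces and warn whenever a bound address is neither loopback, link-local nor site-local, i.e. potentially reachable from the open internet:

    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.util.Collections;

    // Hypothetical start-up check; illustration only, not part of Hadoop today.
    public class OpenNetworkCheck {
      public static void main(String[] args) throws Exception {
        for (NetworkInterface nic
            : Collections.list(NetworkInterface.getNetworkInterfaces())) {
          for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
            if (!addr.isLoopbackAddress()
                && !addr.isLinkLocalAddress()
                && !addr.isSiteLocalAddress()) {
              System.err.println("WARNING: interface " + nic.getName()
                  + " has publicly routable address " + addr.getHostAddress()
                  + "; this cluster may be reachable from the open internet.");
            }
          }
        }
      }
    }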

At the very least, single-node Hadoop should be locked down. You shouldn't have to bring up Kerberos to run it like that. And for more sophisticated multi-node deployments, should the scripts refuse to work without Kerberos unless you pass in some argument like "--Dinsecure-clusters-permitted"?
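
As a sketch of what that guard might look like (the property name below is invented for illustration, not an existing Hadoop switch):

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical start-up guard: refuse to come up insecure on a
    // multi-node cluster unless the operator explicitly opts in.
    // "insecure.clusters.permitted" is an invented system property.
    public class InsecureStartupGuard {
      public static void check(Configuration conf, boolean multiNode) {
        String auth = conf.get("hadoop.security.authentication", "simple");
        boolean overridden = Boolean.getBoolean("insecure.clusters.permitted");
        if (multiNode && "simple".equals(auth) && !overridden) {
          throw new IllegalStateException(
              "Refusing to start a multi-node cluster without Kerberos. "
              + "Pass -Dinsecure.clusters.permitted=true to override.");
        }
      }
    }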

Any other ideas?



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Steve Loughran <st...@hortonworks.com>.

> On 6 Jul 2018, at 00:04, Eric Yang <ey...@hortonworks.com> wrote:
> 
> +1 on Non-routable IP idea.  My preference is to start in Hadoop-common to minimize the scope and incrementally improve.  However, this will be incompatible change for initial user experience on public cloud.  What would be the right release vehicle for this work (3.2+ or 4.x)?

3.2+

As for public cloud, that's precisely where you don't want to be wide open unless you are in some VPN setup. If you can't set up network rules here, should you be trying to install ASF Hadoop out of the box into a VM?

We should ping the Bigtop people here for their input.

> 
> Regards,
> Eric
> 
> On 7/5/18, 2:33 PM, "larry mccay" <lm...@apache.org> wrote:
> 
>    +1 from me as well.
> 
>    On Thu, Jul 5, 2018 at 5:19 PM, Steve Loughran <st...@hortonworks.com>
>    wrote:
> 
>> 
>> 
>>> On 5 Jul 2018, at 23:15, Anu Engineer <ae...@hortonworks.com> wrote:
>>> 
>>> +1, on the Non-Routable Idea. We like it so much that we added it to the
>> Ozone roadmap.
>>> https://issues.apache.org/jira/browse/HDDS-231
>>> 
>>> If there is consensus on bringing this to Hadoop in general, we can
>> build this feature in common.
>>> 
>>> --Anu
>>> 
>> 
>> 
>> +1 to out the box, everywhere. Web UIs included
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> 
>> 
> 
> 



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Eric Yang <ey...@hortonworks.com>.
+1 on the Non-routable IP idea.  My preference is to start in Hadoop-common to minimize the scope and incrementally improve.  However, this will be an incompatible change for the initial user experience on public cloud.  What would be the right release vehicle for this work (3.2+ or 4.x)?

Regards,
Eric

On 7/5/18, 2:33 PM, "larry mccay" <lm...@apache.org> wrote:

    +1 from me as well.
    
    On Thu, Jul 5, 2018 at 5:19 PM, Steve Loughran <st...@hortonworks.com>
    wrote:
    
    >
    >
    > > On 5 Jul 2018, at 23:15, Anu Engineer <ae...@hortonworks.com> wrote:
    > >
    > > +1, on the Non-Routable Idea. We like it so much that we added it to the
    > Ozone roadmap.
    > > https://issues.apache.org/jira/browse/HDDS-231
    > >
    > > If there is consensus on bringing this to Hadoop in general, we can
    > build this feature in common.
    > >
    > > --Anu
    > >
    >
    >
    > +1 to out the box, everywhere. Web UIs included
    >
    >
    > ---------------------------------------------------------------------
    > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
    > For additional commands, e-mail: common-dev-help@hadoop.apache.org
    >
    >
    


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by larry mccay <lm...@apache.org>.
+1 from me as well.

On Thu, Jul 5, 2018 at 5:19 PM, Steve Loughran <st...@hortonworks.com>
wrote:

>
>
> > On 5 Jul 2018, at 23:15, Anu Engineer <ae...@hortonworks.com> wrote:
> >
> > +1, on the Non-Routable Idea. We like it so much that we added it to the
> Ozone roadmap.
> > https://issues.apache.org/jira/browse/HDDS-231
> >
> > If there is consensus on bringing this to Hadoop in general, we can
> build this feature in common.
> >
> > --Anu
> >
>
>
> +1 to out the box, everywhere. Web UIs included
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
>


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Steve Loughran <st...@hortonworks.com>.

> On 5 Jul 2018, at 23:15, Anu Engineer <ae...@hortonworks.com> wrote:
> 
> +1, on the Non-Routable Idea. We like it so much that we added it to the Ozone roadmap.
> https://issues.apache.org/jira/browse/HDDS-231
> 
> If there is consensus on bringing this to Hadoop in general, we can build this feature in common.
> 
> --Anu
> 


+1 to out the box, everywhere. Web UIs included



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Anu Engineer <ae...@hortonworks.com>.
+1 on the Non-Routable Idea. We like it so much that we added it to the Ozone roadmap.
https://issues.apache.org/jira/browse/HDDS-231

If there is consensus on bringing this to Hadoop in general, we can build this feature in common.

--Anu


On 7/5/18, 1:09 PM, "Sean Busbey" <bu...@cloudera.com.INVALID> wrote:

    I really, really like the approach of defaulting to only non-routeable
    IPs allowed. it seems like a good tradeoff for complexity of
    implementation, pain to reconfigure, and level of protection.
    
    On Thu, Jul 5, 2018 at 2:25 PM, Todd Lipcon <to...@cloudera.com.invalid> wrote:
    > The approach we took in Apache Kudu is that, if Kerberos hasn't been
    > enabled, we default to a whitelist of subnets. The default whitelist is
    > 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
    > matches the IANA "non-routeable IP" subnet list.
    >
    > In other words, out-of-the-box, you get a deployment that works fine within
    > a typical LAN environment, but won't allow some remote hacker to locate
    > your cluster and access your data. We thought this was a nice balance
    > between "works out of the box without lots of configuration" and "decent
    > security". In my opinion a "localhost-only by default" would be be overly
    > restrictive since I'd usually be deploying on some datacenter or EC2
    > machine and then trying to access it from a client on my laptop.
    >
    > We released this first a bit over a year ago if my memory serves me, and
    > we've had relatively few complaints or questions about it. We also made
    > sure that the error message that comes back to clients is pretty
    > reasonable, indicating the specific configuration that is disallowing
    > access, so if people hit the issue on upgrade they had a clear idea what is
    > going on.
    >
    > Of course it's not foolproof, since as Eric says, you're still likely open
    > to the entirety of your corporation, and you may not want that, but as he
    > also pointed out, that might be true even if you enable Kerberos
    > authentication.
    >
    > -Todd
    >
    > On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang <ey...@hortonworks.com> wrote:
    >
    >> Hadoop default configuration aimed for user friendliness to increase
    >> adoption, and security can be enabled one by one.  This approach is most
    >> problematic to security because system can be compromised before all
    >> security features are turned on.
    >> Larry's proposal will add some safety to remind system admin if security
    >> is disabled.  However, reducing the number of knobs on security configs are
    >> likely required to make the system secure for the banner idea to work
    >> without writing too much guessing logic to determine if UI is secured.
    >> Penetration test can provide better insights of what hasn't been secured to
    >> improve the next release.  Thankfully most Hadoop vendors have done this
    >> work periodically to help the community secure Hadoop.
    >>
    >> There are plenty of company advertised if you want security, use
    >> Kerberos.  This statement is not entirely true.  Kerberos makes security
    >> more difficult to crack for external parties, but it shouldn't be the only
    >> method to secure Hadoop.  When the Kerberos environment is larger than
    >> Hadoop cluster, anyone within Kerberos environment can access Hadoop
    >> cluster freely without restriction.  In large scale enterprises or some
    >> cloud vendors that sublet their resources, this might not be acceptable.
    >>
    >> From my point of view, a secure Hadoop release must default all settings
    >> to localhost only and allow users to add more hosts through authorized
    >> white list of servers.  This will keep security perimeter in check.  All
    >> wild card ACLs will need to be removed or default to current user/current
    >> host only.  Proxy user/host ACL list must be enforced on http channels.
    >> This is basically realigning the default configuration to single node
    >> cluster or firewalled configuration.
    >>
    >> Regards,
    >> Eric
    >>
    >> On 7/5/18, 8:24 AM, "larry mccay" <la...@gmail.com> wrote:
    >>
    >>     Hi Steve -
    >>
    >>     This is a long overdue DISCUSS thread!
    >>
    >>     Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
    >>     ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
    >> warning
    >>     to get to the page like SSL exceptions in the browser do?
    >>     Similar tactic for UI access without SSL?
    >>     A new AuthenticationFilter can be added to the filter chains that
    >> blocks
    >>     API calls unless explicitly configured to be open and obvious log a
    >> similar
    >>     message?
    >>
    >>     thanks,
    >>
    >>     --larry
    >>
    >>
    >>
    >>
    >>     On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
    >> stevel@hortonworks.com>
    >>     wrote:
    >>
    >>     > Bitcoins are profitable enough to justify writing malware to run on
    >> Hadoop
    >>     > clusters & schedule mining jobs: there have been a couple of
    >> incidents of
    >>     > this in the wild, generally going in through no security, well known
    >>     > passwords, open ports.
    >>     >
    >>     > Vendors of Hadoop-related products get to deal with their lockdown
    >>     > themselves, which they often do by installing kerberos from the
    >> outset,
    >>     > making users make up their own password for admin accounts, etc.
    >>     >
    >>     > The ASF releases though: we just provide something insecure out the
    >> box
    >>     > and some docs saying "use kerberos if you want security"
    >>     >
    >>     > What we can do here?
    >>     >
    >>     > Some things to think about
    >>     >
    >>     > * docs explaining IN CAPITAL LETTERS why you need to lock down your
    >>     > cluster to a private subnet or use Kerberos
    >>     > * Anything which can be done to make Kerberos easier (?). I see
    >> there are
    >>     > some oustanding patches for HADOOP-12649 which need review, but what
    >> else?
    >>     >
    >>     > Could we have Hadoop determine when it's coming up on an open
    >> network and
    >>     > start warning? And how?
    >>     >
    >>     > At the very least, single node hadoop should be locked down. You
    >> shouldn't
    >>     > have to bring up kerberos to run it like that. And for more
    >> sophisticated
    >>     > multinode deployments, should the scripts refuse to work without
    >> kerberos
    >>     > unless you pass in some argument like "--Dinsecure-clusters-
    >> permitted"
    >>     >
    >>     > Any other ideas?
    >>     >
    >>     >
    >>     > ------------------------------------------------------------
    >> ---------
    >>     > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
    >>     > For additional commands, e-mail: common-dev-help@hadoop.apache.org
    >>     >
    >>     >
    >>
    >>
    >>
    >
    >
    > --
    > Todd Lipcon
    > Software Engineer, Cloudera
    
    
    
    -- 
    busbey
    
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
    For additional commands, e-mail: common-dev-help@hadoop.apache.org
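
To make larry mccay's filter suggestion from the quoted text above concrete, here is a minimal sketch using a plain servlet filter (the init parameter name is invented; this is not Hadoop's existing AuthenticationFilter):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Illustration of the "block API calls unless explicitly configured to be
    // open" idea. "insecure.api.access.permitted" is an invented parameter.
    public class InsecureAccessBlockingFilter implements Filter {
      private boolean insecureAccessPermitted;

      @Override
      public void init(FilterConfig config) {
        insecureAccessPermitted = Boolean.parseBoolean(
            config.getInitParameter("insecure.api.access.permitted"));
      }

      @Override
      public void doFilter(ServletRequest req, ServletResponse resp,
          FilterChain chain) throws IOException, ServletException {
        if (!insecureAccessPermitted) {
          ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_FORBIDDEN,
              "WARNING: UNSECURED UI ACCESS - OPEN TO COMPROMISE. "
              + "Blocked until security is configured or insecure access "
              + "is explicitly permitted.");
          return;
        }
        chain.doFilter(req, resp);
      }

      @Override
      public void destroy() {
      }
    }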
    
    


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Sean Busbey <bu...@cloudera.com.INVALID>.
I really, really like the approach of defaulting to allowing only
non-routable IPs. It seems like a good tradeoff between complexity of
implementation, pain to reconfigure, and level of protection.

On Thu, Jul 5, 2018 at 2:25 PM, Todd Lipcon <to...@cloudera.com.invalid> wrote:
> The approach we took in Apache Kudu is that, if Kerberos hasn't been
> enabled, we default to a whitelist of subnets. The default whitelist is
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
> matches the IANA "non-routeable IP" subnet list.
>
> In other words, out-of-the-box, you get a deployment that works fine within
> a typical LAN environment, but won't allow some remote hacker to locate
> your cluster and access your data. We thought this was a nice balance
> between "works out of the box without lots of configuration" and "decent
> security". In my opinion a "localhost-only by default" would be be overly
> restrictive since I'd usually be deploying on some datacenter or EC2
> machine and then trying to access it from a client on my laptop.
>
> We released this first a bit over a year ago if my memory serves me, and
> we've had relatively few complaints or questions about it. We also made
> sure that the error message that comes back to clients is pretty
> reasonable, indicating the specific configuration that is disallowing
> access, so if people hit the issue on upgrade they had a clear idea what is
> going on.
>
> Of course it's not foolproof, since as Eric says, you're still likely open
> to the entirety of your corporation, and you may not want that, but as he
> also pointed out, that might be true even if you enable Kerberos
> authentication.
>
> -Todd
>
> On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang <ey...@hortonworks.com> wrote:
>
>> Hadoop default configuration aimed for user friendliness to increase
>> adoption, and security can be enabled one by one.  This approach is most
>> problematic to security because system can be compromised before all
>> security features are turned on.
>> Larry's proposal will add some safety to remind system admin if security
>> is disabled.  However, reducing the number of knobs on security configs are
>> likely required to make the system secure for the banner idea to work
>> without writing too much guessing logic to determine if UI is secured.
>> Penetration test can provide better insights of what hasn't been secured to
>> improve the next release.  Thankfully most Hadoop vendors have done this
>> work periodically to help the community secure Hadoop.
>>
>> There are plenty of company advertised if you want security, use
>> Kerberos.  This statement is not entirely true.  Kerberos makes security
>> more difficult to crack for external parties, but it shouldn't be the only
>> method to secure Hadoop.  When the Kerberos environment is larger than
>> Hadoop cluster, anyone within Kerberos environment can access Hadoop
>> cluster freely without restriction.  In large scale enterprises or some
>> cloud vendors that sublet their resources, this might not be acceptable.
>>
>> From my point of view, a secure Hadoop release must default all settings
>> to localhost only and allow users to add more hosts through authorized
>> white list of servers.  This will keep security perimeter in check.  All
>> wild card ACLs will need to be removed or default to current user/current
>> host only.  Proxy user/host ACL list must be enforced on http channels.
>> This is basically realigning the default configuration to single node
>> cluster or firewalled configuration.
>>
>> Regards,
>> Eric
>>
>> On 7/5/18, 8:24 AM, "larry mccay" <la...@gmail.com> wrote:
>>
>>     Hi Steve -
>>
>>     This is a long overdue DISCUSS thread!
>>
>>     Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
>>     ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
>> warning
>>     to get to the page like SSL exceptions in the browser do?
>>     Similar tactic for UI access without SSL?
>>     A new AuthenticationFilter can be added to the filter chains that
>> blocks
>>     API calls unless explicitly configured to be open and obvious log a
>> similar
>>     message?
>>
>>     thanks,
>>
>>     --larry
>>
>>
>>
>>
>>     On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
>> stevel@hortonworks.com>
>>     wrote:
>>
>>     > Bitcoins are profitable enough to justify writing malware to run on
>> Hadoop
>>     > clusters & schedule mining jobs: there have been a couple of
>> incidents of
>>     > this in the wild, generally going in through no security, well known
>>     > passwords, open ports.
>>     >
>>     > Vendors of Hadoop-related products get to deal with their lockdown
>>     > themselves, which they often do by installing kerberos from the
>> outset,
>>     > making users make up their own password for admin accounts, etc.
>>     >
>>     > The ASF releases though: we just provide something insecure out the
>> box
>>     > and some docs saying "use kerberos if you want security"
>>     >
>>     > What we can do here?
>>     >
>>     > Some things to think about
>>     >
>>     > * docs explaining IN CAPITAL LETTERS why you need to lock down your
>>     > cluster to a private subnet or use Kerberos
>>     > * Anything which can be done to make Kerberos easier (?). I see
>> there are
>>     > some oustanding patches for HADOOP-12649 which need review, but what
>> else?
>>     >
>>     > Could we have Hadoop determine when it's coming up on an open
>> network and
>>     > start warning? And how?
>>     >
>>     > At the very least, single node hadoop should be locked down. You
>> shouldn't
>>     > have to bring up kerberos to run it like that. And for more
>> sophisticated
>>     > multinode deployments, should the scripts refuse to work without
>> kerberos
>>     > unless you pass in some argument like "--Dinsecure-clusters-
>> permitted"
>>     >
>>     > Any other ideas?
>>     >
>>     >
>>     > ------------------------------------------------------------
>> ---------
>>     > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>     > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>     >
>>     >
>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera



-- 
busbey


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Todd Lipcon <to...@cloudera.com.INVALID>.
The approach we took in Apache Kudu is that, if Kerberos hasn't been
enabled, we default to a whitelist of subnets. The default whitelist is
127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
matches the IANA "non-routeable IP" subnet list.
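
That whitelist covers loopback, the RFC 1918 private ranges, and the
link-local range. As a rough sketch of the check (this is not Kudu's or
Hadoop's actual code; the class name and the idea of a configurable
"trusted subnets" string are invented for illustration), a server could
test each incoming client address against the configured CIDR blocks:

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Sketch only: checks whether a client address falls inside a comma-separated
// list of allowed CIDR blocks (IPv4 only; IPv6 is handled very conservatively).
public class SubnetAllowList {

  // Loopback, RFC 1918 private ranges, link-local.
  static final String DEFAULT_SUBNETS =
      "127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16";

  private final List<String> cidrs;

  SubnetAllowList(String subnets) {
    this.cidrs = Arrays.asList(subnets.split(","));
  }

  // True if the address is inside any configured CIDR block.
  boolean isAllowed(InetAddress addr) throws UnknownHostException {
    byte[] ip = addr.getAddress();
    if (ip.length != 4) {
      // Be conservative for IPv6 in this sketch.
      return addr.isLoopbackAddress() || addr.isLinkLocalAddress();
    }
    long ipValue = toLong(ip);
    for (String cidr : cidrs) {
      String[] parts = cidr.trim().split("/");
      long network = toLong(InetAddress.getByName(parts[0]).getAddress());
      int prefix = Integer.parseInt(parts[1]);
      long mask = prefix == 0 ? 0 : (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
      if ((ipValue & mask) == (network & mask)) {
        return true;
      }
    }
    return false;
  }

  private static long toLong(byte[] ip) {
    long value = 0;
    for (byte b : ip) {
      value = (value << 8) | (b & 0xFF);
    }
    return value;
  }

  public static void main(String[] args) throws Exception {
    SubnetAllowList allowList = new SubnetAllowList(DEFAULT_SUBNETS);
    System.out.println(allowList.isAllowed(InetAddress.getByName("192.168.1.20"))); // true
    System.out.println(allowList.isAllowed(InetAddress.getByName("8.8.8.8")));      // false
  }
}

A real implementation would read the list from configuration, fail fast on
malformed entries, and need proper IPv6 handling, but the matching logic is
no more involved than this.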

In other words, out-of-the-box, you get a deployment that works fine within
a typical LAN environment, but won't allow some remote hacker to locate
your cluster and access your data. We thought this was a nice balance
between "works out of the box without lots of configuration" and "decent
security". In my opinion a "localhost-only by default" would be be overly
restrictive since I'd usually be deploying on some datacenter or EC2
machine and then trying to access it from a client on my laptop.

We first released this a bit over a year ago, if memory serves, and
we've had relatively few complaints or questions about it. We also made
sure that the error message that comes back to clients is pretty
reasonable, indicating the specific configuration that is disallowing
access, so if people hit the issue on upgrade they have a clear idea of
what is going on.

Of course it's not foolproof, since as Eric says, you're still likely open
to the entirety of your corporation, and you may not want that, but as he
also pointed out, that might be true even if you enable Kerberos
authentication.

-Todd

On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang <ey...@hortonworks.com> wrote:

> Hadoop default configuration aimed for user friendliness to increase
> adoption, and security can be enabled one by one.  This approach is most
> problematic to security because system can be compromised before all
> security features are turned on.
> Larry's proposal will add some safety to remind system admin if security
> is disabled.  However, reducing the number of knobs on security configs are
> likely required to make the system secure for the banner idea to work
> without writing too much guessing logic to determine if UI is secured.
> Penetration test can provide better insights of what hasn't been secured to
> improve the next release.  Thankfully most Hadoop vendors have done this
> work periodically to help the community secure Hadoop.
>
> There are plenty of company advertised if you want security, use
> Kerberos.  This statement is not entirely true.  Kerberos makes security
> more difficult to crack for external parties, but it shouldn't be the only
> method to secure Hadoop.  When the Kerberos environment is larger than
> Hadoop cluster, anyone within Kerberos environment can access Hadoop
> cluster freely without restriction.  In large scale enterprises or some
> cloud vendors that sublet their resources, this might not be acceptable.
>
> From my point of view, a secure Hadoop release must default all settings
> to localhost only and allow users to add more hosts through authorized
> white list of servers.  This will keep security perimeter in check.  All
> wild card ACLs will need to be removed or default to current user/current
> host only.  Proxy user/host ACL list must be enforced on http channels.
> This is basically realigning the default configuration to single node
> cluster or firewalled configuration.
>
> Regards,
> Eric
>
> On 7/5/18, 8:24 AM, "larry mccay" <la...@gmail.com> wrote:
>
>     Hi Steve -
>
>     This is a long overdue DISCUSS thread!
>
>     Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
>     ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
> warning
>     to get to the page like SSL exceptions in the browser do?
>     Similar tactic for UI access without SSL?
>     A new AuthenticationFilter can be added to the filter chains that
> blocks
>     API calls unless explicitly configured to be open and obvious log a
> similar
>     message?
>
>     thanks,
>
>     --larry
>
>
>
>
>     On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
> stevel@hortonworks.com>
>     wrote:
>
>     > Bitcoins are profitable enough to justify writing malware to run on
> Hadoop
>     > clusters & schedule mining jobs: there have been a couple of
> incidents of
>     > this in the wild, generally going in through no security, well known
>     > passwords, open ports.
>     >
>     > Vendors of Hadoop-related products get to deal with their lockdown
>     > themselves, which they often do by installing kerberos from the
> outset,
>     > making users make up their own password for admin accounts, etc.
>     >
>     > The ASF releases though: we just provide something insecure out the
> box
>     > and some docs saying "use kerberos if you want security"
>     >
>     > What we can do here?
>     >
>     > Some things to think about
>     >
>     > * docs explaining IN CAPITAL LETTERS why you need to lock down your
>     > cluster to a private subnet or use Kerberos
>     > * Anything which can be done to make Kerberos easier (?). I see
> there are
>     > some oustanding patches for HADOOP-12649 which need review, but what
> else?
>     >
>     > Could we have Hadoop determine when it's coming up on an open
> network and
>     > start warning? And how?
>     >
>     > At the very least, single node hadoop should be locked down. You
> shouldn't
>     > have to bring up kerberos to run it like that. And for more
> sophisticated
>     > multinode deployments, should the scripts refuse to work without
> kerberos
>     > unless you pass in some argument like "--Dinsecure-clusters-
> permitted"
>     >
>     > Any other ideas?
>     >
>     >
>     > ------------------------------------------------------------
> ---------
>     > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>     > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>     >
>     >
>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by Eric Yang <ey...@hortonworks.com>.
Hadoop's default configuration aims for user friendliness to increase adoption, with security features that can be enabled one by one.  This approach is problematic for security because the system can be compromised before all the security features are turned on.
Larry's proposal will add some safety by reminding system admins when security is disabled.  However, reducing the number of knobs on the security configs is likely required for the banner idea to work without writing too much guessing logic to determine whether the UI is secured.  Penetration testing can provide better insight into what hasn't been secured, to improve the next release.  Thankfully, most Hadoop vendors have done this work periodically to help the community secure Hadoop.

Plenty of companies advertise that if you want security, you should use Kerberos.  This statement is not entirely true.  Kerberos makes the system more difficult to crack for external parties, but it shouldn't be the only method of securing Hadoop.  When the Kerberos environment is larger than the Hadoop cluster, anyone within that environment can access the Hadoop cluster freely, without restriction.  In large-scale enterprises, or for cloud vendors that sublet their resources, this might not be acceptable.

From my point of view, a secure Hadoop release must default all settings to localhost only and allow users to add more hosts through an authorized whitelist of servers.  This will keep the security perimeter in check.  All wildcard ACLs will need to be removed, or default to the current user/current host only.  Proxy user/host ACL lists must be enforced on HTTP channels.  This is basically realigning the default configuration to a single-node cluster or firewalled configuration.
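
As a rough sketch of what that could look like for a handful of existing
settings (the property names below are real Hadoop/HDFS keys, but the
values and the choice of the "hdfs" proxy user are purely illustrative,
not a proposed default):

import java.util.Map;

import org.apache.hadoop.conf.Configuration;

// Sketch: a "localhost-only unless whitelisted" layout expressed through the
// normal Configuration API; the same values could live in core-site.xml /
// hdfs-site.xml instead.
public class LocalhostOnlyDefaults {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);

    // Bind RPC and HTTP endpoints to loopback instead of the 0.0.0.0 wildcard.
    conf.set("dfs.namenode.rpc-bind-host", "127.0.0.1");
    conf.set("dfs.namenode.http-bind-host", "127.0.0.1");

    // Replace wildcard proxy-user ACLs ("*") with an explicit host/group list.
    conf.set("hadoop.proxyuser.hdfs.hosts", "localhost");
    conf.set("hadoop.proxyuser.hdfs.groups", "hadoop");

    for (Map.Entry<String, String> entry : conf) {
      System.out.println(entry.getKey() + " = " + entry.getValue());
    }
  }
}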

Regards,
Eric

On 7/5/18, 8:24 AM, "larry mccay" <la...@gmail.com> wrote:

    Hi Steve -
    
    This is a long overdue DISCUSS thread!
    
    Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
    ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the warning
    to get to the page like SSL exceptions in the browser do?
    Similar tactic for UI access without SSL?
    A new AuthenticationFilter can be added to the filter chains that blocks
    API calls unless explicitly configured to be open and obvious log a similar
    message?
    
    thanks,
    
    --larry
    
    
    
    
    On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <st...@hortonworks.com>
    wrote:
    
    > Bitcoins are profitable enough to justify writing malware to run on Hadoop
    > clusters & schedule mining jobs: there have been a couple of incidents of
    > this in the wild, generally going in through no security, well known
    > passwords, open ports.
    >
    > Vendors of Hadoop-related products get to deal with their lockdown
    > themselves, which they often do by installing kerberos from the outset,
    > making users make up their own password for admin accounts, etc.
    >
    > The ASF releases though: we just provide something insecure out the box
    > and some docs saying "use kerberos if you want security"
    >
    > What we can do here?
    >
    > Some things to think about
    >
    > * docs explaining IN CAPITAL LETTERS why you need to lock down your
    > cluster to a private subnet or use Kerberos
    > * Anything which can be done to make Kerberos easier (?). I see there are
    > some oustanding patches for HADOOP-12649 which need review, but what else?
    >
    > Could we have Hadoop determine when it's coming up on an open network and
    > start warning? And how?
    >
    > At the very least, single node hadoop should be locked down. You shouldn't
    > have to bring up kerberos to run it like that. And for more sophisticated
    > multinode deployments, should the scripts refuse to work without kerberos
    > unless you pass in some argument like "--Dinsecure-clusters-permitted"
    >
    > Any other ideas?
    >
    >
    > ---------------------------------------------------------------------
    > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
    > For additional commands, e-mail: common-dev-help@hadoop.apache.org
    >
    >
    


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

Posted by larry mccay <la...@gmail.com>.
Hi Steve -

This is a long overdue DISCUSS thread!

Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the warning
to get to the page like SSL exceptions in the browser do?
Similar tactic for UI access without SSL?
A new AuthenticationFilter could be added to the filter chains that blocks
API calls unless explicitly configured to be open, and logs a similarly
obvious message?
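
As a rough sketch of that last idea, a plain servlet Filter (not Hadoop's
actual AuthenticationFilter; the "insecure.access.permitted" init parameter
is made up for illustration) could refuse API calls until an operator
explicitly opts in:

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Sketch: rejects every call unless the deployment has explicitly opted in
// to insecure access, and makes the reason obvious in the response and logs.
public class InsecureAccessGuardFilter implements Filter {

  private boolean insecureAccessPermitted;

  @Override
  public void init(FilterConfig config) {
    insecureAccessPermitted =
        Boolean.parseBoolean(config.getInitParameter("insecure.access.permitted"));
  }

  @Override
  public void doFilter(ServletRequest request, ServletResponse response,
      FilterChain chain) throws IOException, ServletException {
    if (!insecureAccessPermitted) {
      System.err.println("WARNING: unsecured API access rejected; set "
          + "insecure.access.permitted=true to allow it (not recommended).");
      ((HttpServletResponse) response).sendError(
          HttpServletResponse.SC_FORBIDDEN,
          "Cluster security is not configured; API access is disabled by default.");
      return;
    }
    chain.doFilter(request, response);
  }

  @Override
  public void destroy() {
  }
}

Wired into the existing filter chains, that would turn "insecure by default"
into "refused by default", with a log line telling the operator exactly which
switch would open it back up.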

thanks,

--larry




On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <st...@hortonworks.com>
wrote:

> Bitcoins are profitable enough to justify writing malware to run on Hadoop
> clusters & schedule mining jobs: there have been a couple of incidents of
> this in the wild, generally going in through no security, well known
> passwords, open ports.
>
> Vendors of Hadoop-related products get to deal with their lockdown
> themselves, which they often do by installing kerberos from the outset,
> making users make up their own password for admin accounts, etc.
>
> The ASF releases though: we just provide something insecure out the box
> and some docs saying "use kerberos if you want security"
>
> What we can do here?
>
> Some things to think about
>
> * docs explaining IN CAPITAL LETTERS why you need to lock down your
> cluster to a private subnet or use Kerberos
> * Anything which can be done to make Kerberos easier (?). I see there are
> some oustanding patches for HADOOP-12649 which need review, but what else?
>
> Could we have Hadoop determine when it's coming up on an open network and
> start warning? And how?
>
> At the very least, single node hadoop should be locked down. You shouldn't
> have to bring up kerberos to run it like that. And for more sophisticated
> multinode deployments, should the scripts refuse to work without kerberos
> unless you pass in some argument like "--Dinsecure-clusters-permitted"
>
> Any other ideas?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
>
