Posted to common-dev@hadoop.apache.org by Karthik P <ka...@gmail.com> on 2019/04/25 01:55:44 UTC

CredentialProvider API

Team,

The datanode fails to restart after configuring a credential provider that
stores the credential in HDFS
(jceks://hdfs@hostname:9001/credential/keys.jceks).

We get a StackOverflowError in the datanode jsvc.out file, similar to
HADOOP-11934 <https://issues.apache.org/jira/browse/HADOOP-11934>.

As per the documentation
<https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Supported_Features>,
storing credentials in HDFS is supported:

*URI jceks://file|hdfs/path-to-keystore, is used to retrieve credentials
from a Java keystore. The underlying use of the Hadoop filesystem
abstraction allows credentials to be stored on the local filesystem or
within HDFS.*
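
For context, the provider path under discussion is configured in
core-site.xml roughly like this (a sketch; the hostname, port, and keystore
path are taken from the example above):

<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs@hostname:9001/credential/keys.jceks</value>
</property>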

Assume a scenario where all of our datanodes are down and
hadoop.security.credential.provider.path is configured to an HDFS location.
When FileSystem.get() is called during datanode restart, we end up in a
recursive call because HDFS is inaccessible (see the sketch after the
snippet below).


// From org.apache.hadoop.util.ReflectionUtils; posted here as part of the
// recursive call chain seen in the stack trace.
/**
 * Check and set 'configuration' if necessary.
 *
 * @param theObject object for which to set configuration
 * @param conf Configuration
 */
public static void setConf(Object theObject, Configuration conf) {
  if (conf != null) {
    if (theObject instanceof Configurable) {
      ((Configurable) theObject).setConf(conf);
    }
    setJobConf(theObject, conf);
  }
}
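
To make the cycle concrete, here is a minimal sketch of the failure mode as
I understand it (not actual Hadoop internals; the alias name is
hypothetical):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class CredentialCycleSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // core-site.xml points hadoop.security.credential.provider.path at
    // jceks://hdfs@hostname:9001/credential/keys.jceks (see above).

    // Any password lookup consults the configured credential providers...
    char[] secret = conf.getPassword("some.password.alias");

    // ...and a jceks://hdfs provider must open the keystore through the
    // Hadoop filesystem abstraction. Initializing that filesystem while
    // HDFS is unreachable triggers another credential lookup, and the
    // call chain recurses until it ends in a StackOverflowError.
  }
}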


There are no issues if we store the credential on the local filesystem
(localjceks://file); the problem occurs only with jceks://hdfs/.

Can I change the Hadoop docs to say that we do not support storing
credentials in HDFS? Or shall I handle this scenario only as a startup issue?


Thanks,
Karthik

Re: CredentialProvider API

Posted by Wei-Chiu Chuang <we...@cloudera.com.INVALID>.
A similar issue is https://issues.apache.org/jira/browse/HADOOP-15412
You can't put the keystore for KMS in HDFS, and I filed
https://issues.apache.org/jira/browse/HADOOP-15413 to document it.

Karthik, would you consider using HADOOP-15413 to document the scenarios
where it is not supported?

Additionally, I worked on HADOOP-13548
<https://issues.apache.org/jira/browse/HADOOP-13548> a while ago to remove
the circular dependency for the LdapGroupsMapping case.
You could also consider taking that over to resolve this case.

Re: CredentialProvider API

Posted by Steve Loughran <st...@cloudera.com.INVALID>.
I didn't think that worked at all; that is, you can't store the credentials
to access an FS in that same FS. There's some configuration option to
disable specific sources of JCEKS files, and the one which reads in through
the Hadoop FS API shouldn't be included when reading any credentials needed
to work with HDFS.

JCEKS files in HDFS work for storing object storage credentials.
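
For example (a sketch; the namenode address and keystore path are
placeholders), S3A secrets can be written into an HDFS-hosted keystore with
the credential CLI, since reading them back needs only HDFS access, not the
object store's own credentials:

hadoop credential create fs.s3a.access.key -provider jceks://hdfs@nn:9001/s3/s3.jceks
hadoop credential create fs.s3a.secret.key -provider jceks://hdfs@nn:9001/s3/s3.jceks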

Re: CredentialProvider API

Posted by larry mccay <la...@gmail.com>.
This is likely an issue only for cases where we need the password from
HDFS in order to access HDFS.
This should definitely be avoided by not configuring a static credential
provider path for startup that includes such a dependency.

For instance, the JIRA you cite is an example where we need to do group
lookup in order to determine whether you are allowed to access the HDFS
resource that provides the password required to do group lookup.

Storing passwords in credential stores within HDFS should be perfectly safe
for things like SSL that don't have a dependency on HDFS itself.
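
As a sketch of that safe case (the namenode address and keystore path here
are placeholders), an SSL keystore password has no HDFS dependency of its
own, so it can live in an HDFS-backed store:

hadoop credential create ssl.server.keystore.password -provider jceks://hdfs@nn:9001/ssl/ssl.jceks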

Those details are in the documentation page that you referenced, but if
they need to be made clearer, that completely makes sense.
