Posted to dev@zeppelin.apache.org by rconline <gi...@git.apache.org> on 2016/08/10 13:50:23 UTC

[GitHub] zeppelin pull request #1315: [ZEPPELIN-530] Added changes for Credential Pro...

GitHub user rconline opened a pull request:

    https://github.com/apache/zeppelin/pull/1315

    [ZEPPELIN-530] Added changes for Credential Provider, using hadoop commons Credential apis

    ### What is this PR for?
    This is the first step toward ensuring that clear-text passwords are not stored in configuration files. To start with, this PR reads the AD system password from a .jceks file whose location the user configures in shiro.ini. Going forward, the same keystore can be used to read passwords for other systems as well.
    
    If hadoopSecurityCredentialPath is present and not empty in shiro.ini, the password is read from the keystore file and need not be stored in shiro.ini itself.
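
The rule described above can be sketched in Java as follows. This is a minimal illustration, not the code in this PR: the class name, the `keystoreLookup` function, and the clear-text key `activeDirectoryRealm.systemPassword` are assumptions introduced for the example, and how the real change behaves when the keystore cannot be read is not stated here.

```java
import java.util.Map;
import java.util.function.Function;

public class CredentialPathSketch {
    // Key name taken from the PR description.
    static final String PATH_KEY = "activeDirectoryRealm.hadoopSecurityCredentialPath";
    // Assumed name of the clear-text setting; it is not given in this thread.
    static final String CLEAR_TEXT_KEY = "activeDirectoryRealm.systemPassword";

    /**
     * Returns the password from the keystore when a credential path is
     * configured and non-empty; otherwise falls back to the clear-text value.
     * keystoreLookup is a hypothetical stand-in for the credential provider.
     */
    static String resolvePassword(Map<String, String> ini,
                                  Function<String, String> keystoreLookup) {
        String path = ini.get(PATH_KEY);
        if (path != null && !path.trim().isEmpty()) {
            return keystoreLookup.apply(path.trim());
        }
        return ini.get(CLEAR_TEXT_KEY);
    }
}
```

With the path set, the clear-text entry is never consulted, so it can be removed from shiro.ini.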
    
    
    ### What type of PR is it?
    [Improvement]
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/ZEPPELIN-530
    
    
    ### How should this be tested?
    Create a keystore file using the hadoop credential command line; for this, hadoop-common must be on the classpath:
    
    `hadoop credential create systempassword -provider jceks://user/zeppelin/zeppelin.jceks`
    
    In the shiro.ini file, uncomment the following line and set its value:
    
    `activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://user/zeppelin/zeppelin.jceks`
    
    ### Questions:
    * Do the license files need an update?
    No
    * Are there breaking changes for older versions?
    No. This is an additional option.
    * Does this need documentation?
    Yes
    
    ### Tasks
    * Documentation
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rconline/zeppelin ZEPPELIN-530

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/1315.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1315
    
----
commit cfecf74215c22cdde94aa6a60a8ac372afc8cfdc
Author: Rohit Choudhary <rc...@gmail.com>
Date:   2016-08-10T11:01:29Z

    [ZEPPELIN-530] Added changes for Credential Provider, using hadoop commons and credential api's.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by prabhjyotsingh <gi...@git.apache.org>.
Github user prabhjyotsingh commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    Merging this if no more discussion.
    CI fails for #6786.9, which looks unrelated.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by rconline <gi...@git.apache.org>.
Github user rconline commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @jongyoul In that case, they will not be able to take advantage of storing AD passwords in encrypted form. That said, there will be no loss of functionality.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by rconline <gi...@git.apache.org>.
Github user rconline commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @jongyoul Let me take a step back and try to explain.
    
    Zeppelin is going to be used for various use cases. Some will involve HDFS-based systems such as Hive, Spark, Phoenix and HBase; others will need support for non-HDFS systems such as Postgres and MySQL.
    
    **Problem** - All of these end systems may require users to store passwords. Currently Zeppelin has two locations for storing these passwords: 1. shiro.ini for AD passwords, and 2. interpreter.json for the rest of the data systems. As of now, these passwords are stored in clear text.
    
    **Solution** - Encrypt the password and store it in a file that is read only at runtime by the Zeppelin process, so that it can connect successfully. The question is where: either on the Zeppelin host system, or on HDFS, where big-data users are accustomed to storing passwords. JCEKS is a Java-supported keystore format that has worked well for most users, and can therefore be used. A .jceks file can be created on a host, for example
    `jceks://file/tmp/test.jceks`, whereas for a password stored on HDFS the user has to connect to HDFS and create the file there, for example `jceks://hdfs@nn1.example.com/my/path/test.jceks`.
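
To make the two path forms above concrete, here is a small Java sketch of how such a nested provider URI can be unpacked into the location of the underlying file. It follows the Hadoop convention as I understand it (the scheme of the underlying file system sits in the authority part, before an optional `@`); the class and method names are made up for illustration and this is not code from the PR.

```java
import java.net.URI;

public class JceksUriSketch {
    /**
     * Maps a nested jceks:// provider URI to the URI of the keystore file.
     * The part before '@' (or the whole authority) is treated as the
     * scheme of the underlying file system.
     */
    static String underlyingLocation(String providerUri) {
        URI uri = URI.create(providerUri);
        if (!"jceks".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not a jceks URI: " + providerUri);
        }
        StringBuilder result = new StringBuilder();
        String authority = uri.getAuthority();
        if (authority != null) {
            String[] parts = authority.split("@", 2);
            result.append(parts[0]).append("://");
            if (parts.length == 2) {
                result.append(parts[1]); // host of the underlying file system
            }
        }
        return result.append(uri.getPath()).toString();
    }

    public static void main(String[] args) {
        // Local file on the host.
        System.out.println(underlyingLocation("jceks://file/tmp/test.jceks"));
        // File on HDFS, served by the given namenode.
        System.out.println(underlyingLocation("jceks://hdfs@nn1.example.com/my/path/test.jceks"));
    }
}
```

Under this reading, the first example resolves to `file:///tmp/test.jceks` and the second to `hdfs://nn1.example.com/my/path/test.jceks`.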
    
    At this point we have solved the problem of storing clear-text passwords in shiro.ini; the keystore for that can live on the Zeppelin host itself. However, we have to extend this solution to the rest of the use cases, and that is where the Credential API comes into play.
    
    The Credential API is a generic solution that allows users to create password files in both cases: on the host and on HDFS. It is also used across Knox, which is a good standard for security.
    
    Please let me know if this makes sense, or if you have any more questions. 




[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by jongyoul <gi...@git.apache.org>.
Github user jongyoul commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @rconline It's not clear to me. Are you saying that users store their Hive and HBase passwords in HDFS? Or does this support reading `jceks` from HDFS? According to your description, this PR is for encrypting the plain-text password, and I don't see how that relates to HDFS. I think you should describe in detail why you must use hadoop-common.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by prabhjyotsingh <gi...@git.apache.org>.
Github user prabhjyotsingh commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    Merging this if no more discussion.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by rconline <gi...@git.apache.org>.
Github user rconline commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @jongyoul This work currently solves the problem for AD, but is not limited to it. Going forward, users may choose to store their Hive, HBase and other data system passwords the same way. By convention these passwords are stored on HDFS, where we will need a simple credential API; hence the choice of the Hadoop Credential API. It is also a fairly standard approach in the Big Data world.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by jongyoul <gi...@git.apache.org>.
Github user jongyoul commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @rconline This PR makes the Zeppelin server module include hadoop-common 2.6. How can I change the Hadoop version? Or do we not need to change it even if we use Hadoop 1.x?



[GitHub] zeppelin pull request #1315: [ZEPPELIN-530] Added changes for Credential Pro...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/zeppelin/pull/1315



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by prabhjyotsingh <gi...@git.apache.org>.
Github user prabhjyotsingh commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    Thank you @rconline for taking care of this. LGTM!



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by jongyoul <gi...@git.apache.org>.
Github user jongyoul commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @rconline Do you mean it's fine if some users don't use Hadoop 2 and don't want to use the jceks feature? Are there no side effects at all?



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by rconline <gi...@git.apache.org>.
Github user rconline commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @jongyoul It does not matter, because we depend only on the Hadoop 2.6 credential APIs. A user on Hadoop 1.x just needs to add the hadoop-common 2.6 jars to the classpath to create the jceks file. At authentication time, this jar is already present inside the Zeppelin artifact, so we are good there.



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by jongyoul <gi...@git.apache.org>.
Github user jongyoul commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @rconline Thanks for explaining it. I've researched `jceks`. AFAIK, it doesn't need the hadoop-common dependency; it's supported by java.security by default. Can you remove hadoop-common and still support `jceks`? I've googled it and found some examples. Basically, `jceks` would be a good candidate for securing passwords.
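
For reference, the point about `java.security` can be illustrated with a short sketch that writes and reads a password in a JCEKS keystore using only JDK classes. It is an illustration under assumptions, not the code in this PR: the class and method names are invented, the keystore is kept in memory instead of in a file, and wrapping the password bytes in a `SecretKeySpec` is just one possible encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.spec.SecretKeySpec;

public class JceksSketch {
    /** Stores the secret under the alias and returns the serialized keystore. */
    static byte[] store(String alias, char[] secret, char[] storePass) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, storePass); // start from an empty keystore
        byte[] raw = new String(secret).getBytes(StandardCharsets.UTF_8);
        // The algorithm name is only a label here; the bytes are the password itself.
        ks.setKeyEntry(alias, new SecretKeySpec(raw, "AES"), storePass, null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, storePass);
        return out.toByteArray();
    }

    /** Reads the secret back, or returns null if the alias is absent. */
    static char[] read(byte[] keystoreBytes, String alias, char[] storePass) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(keystoreBytes), storePass);
        Key key = ks.getKey(alias, storePass);
        if (key == null) {
            return null;
        }
        return new String(key.getEncoded(), StandardCharsets.UTF_8).toCharArray();
    }
}
```

A plain-JDK approach like this covers a keystore on the local host. Reading the same file from HDFS would still need a file system client, which is the case the Hadoop Credential API is meant to cover in this discussion.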



[GitHub] zeppelin issue #1315: [ZEPPELIN-530] Added changes for Credential Provider, ...

Posted by jongyoul <gi...@git.apache.org>.
Github user jongyoul commented on the issue:

    https://github.com/apache/zeppelin/pull/1315
  
    @rconline I understand what you are focusing on and the problem you are trying to solve. I can't actually judge whether your solution is perfect, but I agree with the problem you described. Thanks for explaining it. Your code looks good to me.

