Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2021/01/14 08:50:47 UTC

[GitHub] [incubator-dolphinscheduler] zhuangchong opened a new issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

zhuangchong opened a new issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450


   
   **Describe the bug**
   When Kerberos authentication is enabled, Hive/Spark JDBC should support manually configuring keytab-related parameters, so that Hive/Spark data sources can support multi-tenancy under Kerberos.
   
   Currently, Hive/Spark JDBC always authenticates with the keytab configured in the common.properties file, which is a problem when a Hive/Spark data source needs to authenticate as another user or as several different users.
   
   **To Reproduce**
   1. The keytab file configured in common.properties belongs to the dolphinscheduler tenant;
   2. The Hive data source authenticates via Kerberos as the hive_exec_1 tenant;
   3. Queries against that Hive data source then fail with a "no permission" error;
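
   For reference, the Kerberos settings involved in step 1 live in common.properties; a sketch of such a configuration (key names as documented for DolphinScheduler to the best of my knowledge; values are placeholders) might look like:

   ```properties
   # enable Kerberos authentication at startup
   hadoop.security.authentication.startup.state=true
   # krb5 configuration and the single, system-wide keytab the issue describes
   java.security.krb5.conf.path=/opt/krb5.conf
   login.user.keytab.username=dolphinscheduler@EXAMPLE.COM
   login.user.keytab.path=/opt/dolphinscheduler.headless.keytab
   ```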
   
   **Which version of Dolphin Scheduler:**
    - [dev]
   
   **Requirement or improvement**
   Improvement: add Kerberos-related parameters to the "other parameters" or principal section of the DataSource page, so that users can enter them manually.
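
   In effect, the improvement would let each data source carry its own Kerberos identity. A minimal sketch (hypothetical helper class and placeholder values; the actual DataSource form fields may differ) of a per-source Hive JDBC URL carrying a principal:

   ```java
   // Hypothetical sketch: compose a Hive JDBC URL that carries the Kerberos
   // principal per data source instead of relying on common.properties.
   public class HiveUrlBuilder {
       static String buildUrl(String host, int port, String db, String principal) {
           // the ";principal=..." suffix is standard HiveServer2 JDBC syntax
           return String.format("jdbc:hive2://%s:%d/%s;principal=%s",
                   host, port, db, principal);
       }

       public static void main(String[] args) {
           System.out.println(buildUrl("hs2.example.com", 10000, "default",
                   "hive/_HOST@EXAMPLE.COM"));
       }
   }
   ```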
   
   **Additional context**
   Points needing attention:
   Kerberos ticket expiration (I don't think this expiration issue is related to scheduling; we will consider later whether to handle it)
   
   Solution 1:
   Use a shell task node to run a script such as `kinit -kt /xxx/xxx.keytab xxx@xxx.xxx.com`, and schedule it so the ticket obtained from the keytab is refreshed regularly.
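
   The refresh schedule could look roughly like this (illustrative only; the keytab path and principal are the placeholders from the issue text, and a cron-triggered DolphinScheduler shell task could play the same role as crontab):

   ```crontab
   # Refresh the Kerberos ticket from the keytab every 8 hours so it does
   # not expire mid-workflow. Path and principal are placeholders.
   0 */8 * * * kinit -kt /xxx/xxx.keytab xxx@xxx.xxx.com
   ```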
   
   If you have any good ideas, please leave a message!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-dolphinscheduler] zhuangchong commented on issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
zhuangchong commented on issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450#issuecomment-763294081


   > I think we shouldn't use kinit in this context.
   
   In DolphinScheduler, the Kerberos Java API is called for authentication, but Kerberos tickets expire. For workflows with many task nodes and long execution times, how do we ensure Kerberos authentication remains valid for the entire workflow? Using kinit is just one idea for handling ticket expiration; alternatively, the ticket lifetime could be set to a long value such as one or two days. Do you have any good ideas?
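
   The relogin-before-use idea discussed here can be sketched in plain Java (hypothetical names throughout; a real implementation would delegate to Hadoop's `UserGroupInformation` keytab-login methods rather than the stand-in below): before each task uses a connection, re-authenticate if the cached ticket has outlived its lifetime.

   ```java
   import java.time.Duration;
   import java.time.Instant;

   // Hypothetical sketch: re-login from the keytab when the cached ticket is
   // past its lifetime, so long workflows keep a valid ticket per task.
   public class TicketGuard {
       private final Duration lifetime;
       private Instant lastLogin = Instant.MIN;
       int logins = 0;

       TicketGuard(Duration lifetime) { this.lifetime = lifetime; }

       // stand-in for a keytab login such as UserGroupInformation.loginUserFromKeytab(...)
       private void loginFromKeytab() { logins++; lastLogin = Instant.now(); }

       // call before each task creates/uses a connection
       void ensureFresh() {
           if (Duration.between(lastLogin, Instant.now()).compareTo(lifetime) > 0) {
               loginFromKeytab();
           }
       }

       public static void main(String[] args) {
           TicketGuard guard = new TicketGuard(Duration.ofHours(8));
           guard.ensureFresh(); // first use: logs in
           guard.ensureFresh(); // ticket still fresh: no re-login
           System.out.println(guard.logins); // prints 1
       }
   }
   ```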
   





[GitHub] [incubator-dolphinscheduler] CalvinKirs closed issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
CalvinKirs closed issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450


   





[GitHub] [incubator-dolphinscheduler] zhuangchong commented on issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
zhuangchong commented on issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450#issuecomment-760032589


   I will fix it.





[GitHub] [incubator-dolphinscheduler] brucemen711 commented on issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
brucemen711 commented on issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450#issuecomment-768916988


   As far as I know, every SqlTask runs this code (it creates the connection from the keytab, not from a cache), so as long as a single SqlTask does not run too long (over one day), I think this is workable:
   
   ```java
   // if the upload resource is HDFS and Kerberos is enabled
   CommonUtils.loadKerberosConf();
   // create connection
   connection = createConnection();
   ```
   








[GitHub] [incubator-dolphinscheduler] brucemen711 commented on issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
brucemen711 commented on issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450#issuecomment-768919035


   I will create a pull request for this issue, but it needs additional configuration (Hadoop, Impala) from the user (maybe we should add it to the docs).





[GitHub] [incubator-dolphinscheduler] brucemen711 commented on issue #4450: [Bug][datasource] Hive/Spark data sources do not support multi-tenancy when Kerberos authentication is enabled

Posted by GitBox <gi...@apache.org>.
brucemen711 commented on issue #4450:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4450#issuecomment-763290466


   I think we shouldn't use kinit in this context.




