Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/04/12 12:04:52 UTC

[GitHub] [dolphinscheduler] lenboo opened a new issue, #9462: [Feature][Improvement] Support multi cluster environments

lenboo opened a new issue, #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   Take two clusters as an example: Shanghai (sh) and Hangzhou (hz). Each cluster may have its own independent data sources, especially in multinational companies that are restricted by local laws and policies, so the same task may need to run in different data centers.
   - With multiple clusters, split common operations from cluster-specific operations.
   - Decide whether the button for a single cluster remains unchanged, or whether the complete set is determined by the available clusters.
   
   ### Use case
   
   Workflow A is associated with two clusters: C1 and C2.
   Running workflow A generates two workflow instances:
   instance1 with cluster C1;
   instance2 with cluster C2.
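
The use case above can be sketched roughly as follows. This is only an illustration, not DolphinScheduler code: `WorkflowDefinition`, `WorkflowInstance`, `run_workflow`, and the `clusters` field are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkflowDefinition:
    name: str
    clusters: List[str]  # hypothetical field: clusters associated with the workflow


@dataclass(frozen=True)
class WorkflowInstance:
    workflow: str
    cluster: str  # each instance is bound to exactly one cluster


def run_workflow(definition: WorkflowDefinition) -> List[WorkflowInstance]:
    """Create one workflow instance per associated cluster."""
    return [WorkflowInstance(definition.name, c) for c in definition.clusters]


# Workflow A associated with C1 and C2 yields two instances.
instances = run_workflow(WorkflowDefinition("A", ["C1", "C2"]))
for inst in instances:
    print(inst.workflow, inst.cluster)
```

Keeping the cluster on the instance rather than only on the definition means each run can be tracked and retried per cluster.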
   
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462#issuecomment-1096639019

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://join.slack.com/t/asf-dolphinscheduler/shared_invite/zt-omtdhuio-_JISsxYhiVsltmC5h38yfw) and send your question to the `#troubleshooting` channel.




[GitHub] [dolphinscheduler] BoYiZhang commented on issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
BoYiZhang commented on issue #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462#issuecomment-1104665023

   This is a good requirement.
   
   There are several issues to consider when supporting multiple environments:
   1. The resource center is currently stored on a single HDFS cluster, and DS pulls files from HDFS at execution time.
       With multiple clusters, the design of file storage needs to be considered.
   2. DS currently supports Kerberos authentication. If Kerberos authentication is enabled on multiple clusters, how should authentication work?
       Should Kerberos mutual trust be enabled, or should all clusters share the same Kerberos authentication setup?
   3. The motivation for supporting multiple environments is to push tasks from the development environment to the production environment. Is there a review and release process for this?
       If the goal is only to synchronize data between two clusters, deploying two sets of DS and synchronizing data through scripts may be more lightweight.
   
   




[GitHub] [dolphinscheduler] Amy0104 closed issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
Amy0104 closed issue #9462: [Feature][Improvement] Support multi cluster environments
URL: https://github.com/apache/dolphinscheduler/issues/9462




[GitHub] [dolphinscheduler] lenboo commented on issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
lenboo commented on issue #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462#issuecomment-1096640359

   Here is my proposed solution:
   1. Redefine the concept of the environment: in addition to env (the current behavior), the environment supports config/xml and other types. Modify the environment metadata and add an environment type field.
   2. Add an 'environment' field to the workflow definition that supports association with multiple clusters (environments). When the workflow runs, a workflow instance is generated for each cluster.
   3. Add an 'environment' field to the workflow instance. Each workflow instance specifies one cluster, but it cannot override the environment specified by a task (environment priority: Task Configuration > Workflow Configuration).
   4. How the environment configuration is used is determined by the task execution logic itself.
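
The priority rule in point 3 could look roughly like the sketch below. It is an assumption-laden illustration rather than DolphinScheduler's implementation: `Environment`, `Task`, `resolve_environment`, and the `env_type` field are hypothetical names standing in for the proposed environment metadata.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    name: str      # e.g. "C1"
    env_type: str  # hypothetical type field, e.g. "env" or "config/xml"


@dataclass(frozen=True)
class Task:
    name: str
    environment: Optional[Environment] = None  # task-level setting, if any


def resolve_environment(task: Task, workflow_env: Environment) -> Environment:
    """Task configuration takes priority over workflow configuration."""
    return task.environment if task.environment is not None else workflow_env


c1 = Environment("C1", "env")
c2 = Environment("C2", "config/xml")
pinned = Task("export", environment=c2)  # task pins its own environment
unpinned = Task("load")                  # task inherits from the workflow instance

print(resolve_environment(pinned, c1).name)    # C2
print(resolve_environment(unpinned, c1).name)  # C1
```

A task that sets its own environment keeps it regardless of which cluster the workflow instance was created for; only tasks without one fall back to the instance's environment.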




[GitHub] [dolphinscheduler] caishunfeng commented on issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on issue #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462#issuecomment-1096723595

   What about adding a region to the environment? Users could switch regions in the UI and see the data for the selected region.




[GitHub] [dolphinscheduler] lenboo commented on issue #9462: [Feature][Improvement] Support multi cluster environments

Posted by GitBox <gi...@apache.org>.
lenboo commented on issue #9462:
URL: https://github.com/apache/dolphinscheduler/issues/9462#issuecomment-1097513453

   Is region a field of the environment?

