Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/04/27 11:43:13 UTC

[GitHub] [dolphinscheduler] EricGao888 opened a new issue, #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

EricGao888 opened a new issue, #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   By integrating Apache DolphinScheduler with Apache Zeppelin through the `zeppelin task plugin`, we aim to provide Dolphin users, especially big data engineers, with a so-called `Big Data Studio` experience: users can develop and debug big data tasks interactively in a `zeppelin notebook` and schedule them directly from dolphin with 'one click'. This feature should significantly boost the development efficiency of big data engineers and lower the bar for those who do not have much experience in the big data area.
   
   However, dolphin currently has only basic integration with zeppelin, and we need more features in the zeppelin task plugin to achieve our goal. Here are a few points I have come up with so far:
   
   * Enable note-level zeppelin task scheduling, which is already described here: #9798 
   * Add an authentication feature to the zeppelin task plugin so that a dolphin user only has access to the notes owned by the zeppelin user with the same username.
   * Add `custom variables` support to the zeppelin task plugin.
   * Enable users to switch the zeppelin server endpoint. Specifically, add an `endpoint` field in the UI; by filling in different endpoints, users can use dolphin to schedule zeppelin tasks on zeppelin servers deployed on different remote clusters. In this way, once configuration is completed on the zeppelin side, users will be able to schedule big data tasks in different environments without worrying about configuration.
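
As a rough illustration of the authentication point in the list above, the sketch below narrows a set of notes down to those owned by a zeppelin user with the same name as the dolphin user. `NoteInfo`, `notes_visible_to`, and the sample note IDs are hypothetical names made up for this example; a real plugin would presumably read ownership from zeppelin's note permissions rather than from an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NoteInfo:
    """Hypothetical view of a zeppelin note and the users who own it."""
    note_id: str
    path: str
    owners: frozenset = field(default_factory=frozenset)


def notes_visible_to(ds_username, notes):
    """Return only the notes whose owners include the given dolphin username."""
    return [note for note in notes if ds_username in note.owners]


# Sample data, invented for illustration only.
notes = [
    NoteInfo("2A94M5J1Z", "/etl/daily_load", frozenset({"alice"})),
    NoteInfo("2BWJFTXKJ", "/ml/train", frozenset({"bob"})),
    NoteInfo("2C3RWCVAG", "/shared/report", frozenset({"alice", "bob"})),
]
visible = [note.note_id for note in notes_visible_to("alice", notes)]
```

Matching on the exact username is the simplest reading of the proposal; mapping dolphin users to differently named zeppelin users would need an extra lookup that this sketch leaves out.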
   
   I would like to invite Apache Zeppelin PMC, Jeff Zhang @zjffdu to help with the review.
   
   BTW, the idea of zeppelin task plugin is inspired by Jeff's previous work on `Apache Airflow Zeppelin Operator`, kudos to Jeff.
   
   ![image](https://user-images.githubusercontent.com/34905992/165510579-c9d79a91-9ed4-42fd-abcd-7699b6e6ae75.png)
   
   
   ### Use case
   
   * Already described above
   
   ### Related issues
   
   related: #9201 #9798 #5271
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] zhongjiajie commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1112825170

   > BTW, since Dolphin also supports `workflow as code`, we could add a `dolphin-scheduler-workflow-interpreter` on the Zeppelin side. With this feature, users will be able to write dolphin workflow python scripts in a notebook interactively. I will open a related issue in the Apache Zeppelin community later. Just writing this idea down in case I forget it : ) @dailidong @zhongjiajie
   
   Sounds good! Thanks for bringing this up!




[GitHub] [dolphinscheduler] zhongjiajie commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1175842663

   > https://github.com/apache/dolphinscheduler/blob/719a9d4532733c0d7b4a54ed52b15dca7982ca8d/dolphinscheduler-common/src/main/resources/common.properties#L103-L106
   > 
   > I think it was not a good idea to put the zeppelin endpoint in `common.properties`. I will submit a PR to remove it from `common.properties` and put it into the task parameters. For the default endpoint, maybe we could add it to the `configuration center` in the future. See: #10283
   
   sure, agree with that




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1155413787

   To make the zeppelin task plugin more user-friendly, we could add some UI interaction features. For example, once a user fills in the noteId, there could be a button linking to the page of the zeppelin note with the same noteId. In that case, the user could open and edit the connected note conveniently.
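
A minimal sketch of how such a button could derive its target, assuming the zeppelin web UI serves notes under `<endpoint>/#/notebook/<noteId>`. The helper name `note_page_url` and the example host are hypothetical.

```python
from urllib.parse import quote


def note_page_url(endpoint, note_id):
    """Build the browser URL of a note on the given zeppelin server.

    Assumes the UI route is '<endpoint>/#/notebook/<noteId>'.
    """
    if not note_id:
        raise ValueError("note_id must not be empty")
    # Trim a trailing slash so the result never contains '//#'.
    return f"{endpoint.rstrip('/')}/#/notebook/{quote(note_id, safe='')}"


url = note_page_url("http://zeppelin.example.com:8080/", "2A94M5J1Z")
```

Because the link depends only on the endpoint and the noteId, it could be computed in the UI as soon as both fields are filled in, without calling the zeppelin server.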




[GitHub] [dolphinscheduler] caishunfeng commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1110933909

   Looks good to me, but I have the following questions:
   
   >Enable note-level zeppelin task
   
   Does it mean that it needs database storage or resource management?
   
   >Add custom variables support in zeppelin task plugin.
   
   What's the difference from ds?
   




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1110902031

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://join.slack.com/t/asf-dolphinscheduler/shared_invite/zt-omtdhuio-_JISsxYhiVsltmC5h38yfw) and send your question to channel `#troubleshooting`




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1110972502

   > Looks good to me, but I have the following questions:
   > 
   > > Enable note-level zeppelin task
   > 
   > Does it mean that it needs database storage or resource management?
   > 
   > > Add custom variables support in zeppelin task plugin.
   > 
   > What's the difference from ds?
   
   About `custom variables`: yes, I mean exactly those of ds. I just want to combine them with [zeppelin dynamic form](https://zeppelin.apache.org/docs/0.8.0/usage/dynamic_form/intro.html).
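
One way to picture that combination, as a sketch under stated assumptions: zeppelin text-input forms are written as `${name}` or `${name=default}` in the paragraph text, and the run request accepts form values in a JSON body with a `params` field. The helper names and the sample variables below are hypothetical, and only text-style forms are handled.

```python
import json
import re

# Matches zeppelin forms such as ${day} or ${table=events}.
FORM_PATTERN = re.compile(r"\$\{(\w+)(?:=[^}]*)?\}")


def form_names(paragraph_text):
    """Return the distinct form names used in a paragraph, in order of appearance."""
    return list(dict.fromkeys(FORM_PATTERN.findall(paragraph_text)))


def build_params(paragraph_text, custom_variables):
    """Keep only the ds custom variables that a form in the paragraph asks for."""
    return {
        name: str(custom_variables[name])
        for name in form_names(paragraph_text)
        if name in custom_variables
    }


paragraph = "%spark.sql\nSELECT * FROM ${table=events} WHERE dt = '${day}'"
ds_variables = {"day": "2022-04-27", "owner": "eric"}  # illustrative values

params = build_params(paragraph, ds_variables)
payload = json.dumps({"params": params})
```

Forms without a matching ds variable, such as `table` here, are left out of the payload so that zeppelin falls back to the default written in the note.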




[GitHub] [dolphinscheduler] zhongjiajie commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1159366316

   > To make zeppelin task plugin more user-friendly, we could add some UI interaction features. For example, once a user fills in the noteId, there could be a button linking to the page of the zeppelin note with same noteId. In that case, the user could open and edit the connected note conveniently.
   
   Yes, of course. We have a similar function in the sub_process task, but it jumps to a dolphinscheduler resource instead of zeppelin's.




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1110949404

   > Looks good to me, but I have the following questions:
   > 
   > > Enable note-level zeppelin task
   > 
   > Does it mean that it needs database storage or resource management?
   > 
   > > Add custom variables support in zeppelin task plugin.
   > 
   > What's the difference from ds?
   
   @caishunfeng My bad, I think I didn't make it clear. When developing big data tasks in zeppelin, users write a zeppelin `note`, which consists of one or more `paragraphs`. You can run the whole note or just a specific paragraph. Currently the ds zeppelin task plugin only supports triggering zeppelin paragraphs. Enabling note-level zeppelin tasks means ds will be able to trigger a whole zeppelin note. To enable `note-level` zeppelin task scheduling, we just need to call the `submitNote` method from the `Zeppelin Client APIs`. The reason I didn't use this API earlier is that, at that time, there was no `cancelNote` method in the `Zeppelin Client APIs`: once we triggered a zeppelin note from ds, we would not have been able to cancel it, which could lead to issues. Since the `Zeppelin Client API` now includes a `cancelNote` method, we can add this feature to the ds zeppelin task plugin. It doesn't need database storage or resource management. I hope this explanation makes sense to you. : )
   
   ![image](https://user-images.githubusercontent.com/34905992/165517836-62d4fb44-7270-429a-a8a4-49d4c5fc9322.png)
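
To make the note-level versus paragraph-level distinction concrete, here is a small sketch that only builds the requests and does not send them. It is not the Java `Zeppelin Client API` mentioned above. It assumes zeppelin's notebook REST routes, where `POST /api/notebook/job/<noteId>` runs a whole note, appending `/<paragraphId>` targets one paragraph, and `DELETE` on the same paths stops the run. `ZeppelinRequest`, the helper names, and the IDs are hypothetical, so check the routes against the zeppelin version in use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ZeppelinRequest:
    """An HTTP request description; nothing is sent in this sketch."""
    method: str
    url: str


def job_url(endpoint: str, note_id: str, paragraph_id: Optional[str] = None) -> str:
    """Note-level URL when paragraph_id is None, paragraph-level otherwise."""
    base = f"{endpoint.rstrip('/')}/api/notebook/job/{note_id}"
    return base if paragraph_id is None else f"{base}/{paragraph_id}"


def run_request(endpoint, note_id, paragraph_id=None):
    """Request that would trigger a whole note or a single paragraph."""
    return ZeppelinRequest("POST", job_url(endpoint, note_id, paragraph_id))


def cancel_request(endpoint, note_id, paragraph_id=None):
    """Request that would cancel a running note or paragraph."""
    return ZeppelinRequest("DELETE", job_url(endpoint, note_id, paragraph_id))


run_note = run_request("http://localhost:8080", "2A94M5J1Z")
run_paragraph = run_request("http://localhost:8080", "2A94M5J1Z", "paragraph_1")
cancel_note = cancel_request("http://localhost:8080", "2A94M5J1Z")
```

Pairing every run request with a cancel request on the same path reflects the concern above: a task that ds can start but not cancel is risky to schedule.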
   




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1110902198

   Any ideas or suggestions on this feature are appreciated!




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1111998083

   BTW, since Dolphin also supports `workflow as code`, we could add a `dolphin-scheduler-workflow-interpreter` on the Zeppelin side. With this feature, users will be able to write dolphin workflow python scripts in a notebook interactively. I will open a related issue in the Apache Zeppelin community later. Just writing this idea down in case I forget it : ) @dailidong @zhongjiajie




[GitHub] [dolphinscheduler] EricGao888 commented on issue #9814: [Feature][Task Plugin] Dolphin zeppelin task plugin improvement plan

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #9814:
URL: https://github.com/apache/dolphinscheduler/issues/9814#issuecomment-1166545646

   https://github.com/apache/dolphinscheduler/blob/719a9d4532733c0d7b4a54ed52b15dca7982ca8d/dolphinscheduler-common/src/main/resources/common.properties#L103-L106
   
   I think it was not a good idea to put the zeppelin endpoint in `common.properties`. I will submit a PR to remove it from `common.properties` and put it into the task parameters. For the default endpoint, maybe we could add it to the `configuration center` in the future. See: #10283
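
A possible shape for that lookup, sketched under assumptions: the task parameters carry an optional `endpoint` entry, and a default may later come from somewhere like the configuration center. The function name `resolve_endpoint`, the `endpoint` key, and the example hosts are hypothetical.

```python
from urllib.parse import urlparse


def resolve_endpoint(task_params, default_endpoint=None):
    """Prefer the endpoint set on the task; fall back to a default if one exists."""
    candidate = (task_params.get("endpoint") or "").strip() or default_endpoint
    if not candidate:
        raise ValueError("no zeppelin endpoint configured for this task")
    parsed = urlparse(candidate)
    # Reject values that are not plain http(s) URLs with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"invalid zeppelin endpoint: {candidate!r}")
    return candidate.rstrip("/")


from_task = resolve_endpoint({"endpoint": "http://cluster-a:8080/"})
from_default = resolve_endpoint({}, default_endpoint="http://localhost:8080")
```

Keeping the endpoint in the task parameters lets two tasks in the same workflow target different zeppelin servers, which a single entry in `common.properties` cannot express.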

