Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/08/25 15:06:39 UTC

[GitHub] [dolphinscheduler] DarkAssassinator opened a new issue, #11652: [Feature] DS can support task running on remote host, not just worker server.

DarkAssassinator opened a new issue, #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   Currently, DS tasks can only run on worker servers, but in real business scenarios we often need to send tasks, such as Shell tasks, to a remote server for execution.
   So I suggest that DS add a Host management module under the Security menu, similar to the Environment management module. Users could then select the remote host on which to run a task. 
   
   ### Use case
   
   1. Users can manage hosts in the Security menu.
   2. When configuring a task node, the user can select the remote host on which to run the task.
   
   ### Related issues
   
   N/A
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] zhuxt2015 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
zhuxt2015 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250037791

   Furthermore, I suggest adding the server management page to another menu instead of the Security menu. Remote server management is not related to security.




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1237115621

   As far as I can tell, `t_ds_cluster` was originally designed for K8s. Would combining the two cause misunderstanding and confusion?
   As for failover, we can get the task's executePath and then ssh to the remote server to kill the process, for example: `ps -ef | grep "sh xxxxxxx" | awk '{print $2}' | xargs kill -9`
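
A minimal sketch of the remote kill step described above, written in Python only for illustration. The function name `build_remote_kill_command` and its parameters are hypothetical, and the extra `grep -v grep` stage is an assumption added so that the pipeline does not match its own `grep` process. The sketch only builds the command line; it does not open a connection.

```python
import shlex


def build_remote_kill_command(user, host, execute_path):
    """Build an argv list that asks `host` to kill the task's shell process.

    Hypothetical helper: it mirrors the ps/grep/awk/xargs pipeline from the
    comment above and does not execute anything.
    """
    # Quote the search pattern so spaces or shell metacharacters in the
    # execute path cannot break the remote command line.
    pattern = shlex.quote("sh " + execute_path)
    remote = (
        "ps -ef | grep " + pattern
        + " | grep -v grep"            # assumption: skip the grep process itself
        + " | awk '{print $2}' | xargs kill -9"
    )
    # ssh takes the remote command as a single trailing argument.
    return ["ssh", f"{user}@{host}", remote]


if __name__ == "__main__":
    print(build_remote_kill_command("ds", "10.10.125.1", "/tmp/ds/exec/1_2.command"))
```

The returned list can be passed to `subprocess.run` as is. Matching on the full script path rather than a short name reduces the chance of killing an unrelated process.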




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1255008530

   > @DarkAssassinator Would u like to bring up this issue in community bi-weekly conference for further discussion? Thanks.
   
    @EricGao888 Sure, it would be my pleasure. May I know when it takes place and how to join?




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1245477937

   Hi @caishunfeng & @SbloodyS so can i start coding this case?




[GitHub] [dolphinscheduler] zhuxt2015 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
zhuxt2015 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250037109

   > > resource files send to remote server by `scp` command, stop/kill/timeout/failover handled by the shell task itself.
   > 
   > right, this is my initial design, and our test environment is also implemented like this
   
   Good job.  Looking forward to your contribution.




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250052244

   > Furthermore, Suggest add server manage page to other menu instead of Security menu. Remote server manage not related to security side.
   
   Actually, I agree with you, and I think Cluster/Environment are not related to security either.




[GitHub] [dolphinscheduler] EricGao888 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250483654

   > Please make things simple. We just only want to execute command in remote server. just like `ssh user@10.10.125.1 "echo 'hello world'"`.
   > 
   > My suggestion is as follows:
   > 
   > 1. So we just add a ssh server source, config ip、port、username、 password or select a id_ras.pub file from Resources to use login with key-based authentication.
   > 
   > <img alt="image" width="591" src="https://user-images.githubusercontent.com/13765310/190841320-79961704-a4ac-46a9-8ae6-17d73fcd2d3e.png">
   > 
   > 2. Select the ssh server source in Shell task。
   > 
   > <img alt="image" width="600" src="https://user-images.githubusercontent.com/13765310/190841512-10f65629-23b6-4531-8b6f-d851e36a8533.png">
   > 
   > 3. When server source exists in shell task, then ssh to server source  to execute shell script.
   
   @zhuxt2015 @DarkAssassinator I'm +1 to this comment. Currently we could only manage connections for `datasource` task plugins instead of all task plugins. From my perspective, we could improve this feature and make it apply to all task plugins. Actually, we have a related issue here: #10283 but we haven't got time to design and implement it. Rather than configure ssh connections in `Security Center` or a totally new `Server Management Page`, I prefer to upgrade our current `Datasource Center` to `Connection Center` or `Configuration Center` and do it there. cc @caishunfeng @SbloodyS




[GitHub] [dolphinscheduler] caishunfeng commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
caishunfeng commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1236308432

   Good idea. 
   What about expanding `Cluster`? See `t_ds_cluster`.
   Another question: how does a remote shell task fail over when the worker goes down?




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1251163022

   
   > @zhuxt2015 @DarkAssassinator I'm +1 to this comment. Currently we could only manage connections for `datasource` task plugins instead of all task plugins. From my perspective, we could improve this feature and make it apply to all task plugins. Actually, we have a related issue here: #10283 but we haven't got time to design and implement it. Rather than configure ssh connections in `Security Center` or a totally new `Server Management Page`, I prefer to upgrade our current `Datasource Center` to `Connection Center` or `Configuration Center` and do it there. cc @caishunfeng @SbloodyS
   
   @EricGao888 I agree with you; I think `Configuration Center` would be better.
   




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1228405815

   > That's great! Can you provide the details of the design architecture? @DarkAssassinator
   
   Sure, i will  provide a related design later for the community to review. 




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1235565446

   **What is the purpose of this feature?**
   DS can support the task instance running on the remote servers (task server), not just worker nodes. 
   Users can manage these task servers on the page, such as create, edit, delete and test connect.
   Each task node in DAG can specify the task server that needs to be executed. 
   The task server property belongs only to task instances, not workflow instances.
   **MOP**
   1. Add a TaskServer entity in the dolphinscheduler-dao module, and create a table named t_ds_task_server in the DB. Add a TaskServer API.
   ![image](https://user-images.githubusercontent.com/20518339/188169786-fc95225a-3b57-4c80-bdb5-63740d151d26.png)
   
   2. Add a task server management page in the Security menu, supporting create, delete, edit and test connect.
   3. Add a task server select input (field: taskServerCode) to the Shell and Python task forms.
   4. Add a taskServerCode column to the t_ds_task_instance, t_ds_task_definition and t_ds_task_definition_log tables, and change the related APIs.
   5. Add a taskServerCode field to TaskInstance, TaskDefinition, TaskDefinitionLog and TaskNode, and add a taskServerInfo field to TaskInstance (not a table field).
   6. Add a taskServerInfo entity (containing just ip, user, password, name) in the dolphinscheduler-task-plugin module, and add a taskServerInfo field to TaskExecutionContext. 
   7. Shell and Python tasks will check TaskExecutionContext.getTaskServerInfo(); if it is not null, the task will scp the command files and resource files to the task server and send the start command to the remote host to execute it. If other task plugins need this feature, they can check this task server field as well.
   8. The task will ssh and scp to the task server as the DS running user, not the tenant.
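
The dispatch rule in steps 6 and 7 can be sketched as follows. This is an illustration in Python, not the proposed Java implementation: the class and field names follow the proposal, while `plan_commands`, `execute_path`, `command_file` and `resource_files` are hypothetical names introduced here. The sketch only plans the command lines and does not handle the password, which `ssh` and `scp` do not accept as an argument.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskServerInfo:
    # Fields named in step 6 of the proposal.
    ip: str
    user: str
    password: str
    name: str


@dataclass
class TaskExecutionContext:
    execute_path: str                      # hypothetical: task working directory
    command_file: str                      # hypothetical: generated script name
    resource_files: List[str] = field(default_factory=list)
    task_server_info: Optional[TaskServerInfo] = None


def plan_commands(ctx: TaskExecutionContext) -> List[List[str]]:
    """Return the argv lists a Shell task would run, locally or remotely."""
    script = f"{ctx.execute_path}/{ctx.command_file}"
    server = ctx.task_server_info
    if server is None:
        # No task server selected: run on the worker as today.
        return [["sh", script]]
    target = f"{server.user}@{server.ip}"
    files = [script] + [f"{ctx.execute_path}/{r}" for r in ctx.resource_files]
    return [
        # Create the same working directory on the task server.
        ["ssh", target, "mkdir -p " + shlex.quote(ctx.execute_path)],
        # Copy the command file and the resource files.
        ["scp", *files, f"{target}:{ctx.execute_path}/"],
        # Start the task remotely.
        ["ssh", target, "sh " + shlex.quote(script)],
    ]
```

Keeping the remote working directory identical to the local one means the generated script does not need to be rewritten before it is copied.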
   
   Look forward to your comments. 




[GitHub] [dolphinscheduler] simsicon commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
simsicon commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1283258853

   I second that, no need to create table, a simple ssh task type will do this feature.




[GitHub] [dolphinscheduler] SbloodyS commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1246191405

   > 6. Add **taskServerInfo** (just contain ip, user, password, name) entity in dolphinscheduler-task-plugin model. And add taskServerInfo field in TaskExecutionContext.
   
   I think it's better not to use userName/passWord for ssh, since that carries security risks. Using a pam file or authorized_keys for ssh is more secure.
   
   BTW, I have some different views on the implementation method.
   
   Currently, DS supports S3 and HDFS as the storage mode of the resource center. In the future, it may also support other object storage, such as alibaba cloud oss, tencent cloud cos, etc.
   
   1. In common usage scenarios, the masterServer/apiServer nodes usually do not have permission to use HDFS and S3. These permissions are usually granted to the workerServer nodes. Using the scp command to transfer the files to the task server would require these permissions on the user's masterServer/apiServer nodes. In addition, downloading files on the masterServer/apiServer node and then scp-ing them to the task node wastes network and disk IO for large files or large numbers of small files.
   
   2. Using SSH to execute shell commands usually requires escaping a lot of special characters for the different task types. I think this is a huge workload for subsequent maintenance.
   
   3. Using SSH means that the task running status and running logs need to be monitored by the masterServer. This may lead to high load on the masterServer node when the number of tasks is large.
   
   This is not reasonable for users or maintainers.
   
   Based on all the above issues, I suggest implementing this in the following steps.
   
   1. Create a task level callback in the masterServer to provide a single task with task monitoring information.
   
   2. Create an executable task for each task, which can be a jar package or an executable file compiled through golang or any other languages, and transfer it to the task node for execution through asynchronous ssh. This executable task contains the actual execution of the task and the monitoring information reported by the task to the master.
   
   3. After the task is finished, the masterServer deletes the task's executable file through ssh or any other ways for clean up.
   
   In this way, all task types can be supported seamlessly with high performance, and the task content can be executed without any escaping. It also reduces the monitoring load on the master, which is more reasonable for distributed processing.
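
The executable-task idea in the three steps above can be sketched as a small wrapper that runs the real command and pushes status and log events to a callback. This is a rough illustration in Python under stated assumptions: `run_and_report`, the event dictionaries and the state names are hypothetical, and in the proposal the callback would be a network call to the masterServer rather than a local function.

```python
import subprocess
import sys


def run_and_report(argv, report):
    """Run `argv` and send status and log events to `report`.

    `report` stands in for the task-level callback on the master; here it is
    any callable that accepts one dict.
    """
    report({"state": "RUNNING"})
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into the same log stream
        text=True,
    )
    # Forward each output line as it is produced.
    for line in proc.stdout:
        report({"state": "LOG", "line": line.rstrip("\n")})
    code = proc.wait()
    report({"state": "SUCCESS" if code == 0 else "FAILURE", "exit_code": code})
    return code


if __name__ == "__main__":
    run_and_report([sys.executable, "-c", "print('hello')"], print)
```

Because the wrapper runs next to the task, the master only receives events and does not have to hold an SSH session open per task.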
   
   These are my humble opinions. If you have any questions, please let me know. @DarkAssassinator 




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1239439080

   And I think `failover` is a general issue: if a process is running and only the worker service breaks down, DS currently just clones a new TaskInstance and dispatches it to a new worker to run again. 




[GitHub] [dolphinscheduler] zhuxt2015 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
zhuxt2015 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250032762

   Resource files are sent to the remote server with the `scp` command; stop/kill/timeout/failover are handled by the shell task itself.




[GitHub] [dolphinscheduler] zhuxt2015 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
zhuxt2015 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250002021

   Please make things simple.
   We just only want to execute command in remote server. just like `ssh user@10.10.125.1  "echo 'hello world'"`.
   
   My suggestion is as follows:
   
   1. So we just add an ssh server source and configure ip, port, username, and password, or select an id_rsa.pub file from Resources to log in with key-based authentication.
   <img width="591" alt="image" src="https://user-images.githubusercontent.com/13765310/190841320-79961704-a4ac-46a9-8ae6-17d73fcd2d3e.png">
   
   2. Select the ssh server source in the Shell task.
   <img width="600" alt="image" src="https://user-images.githubusercontent.com/13765310/190841512-10f65629-23b6-4531-8b6f-d851e36a8533.png">
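
A minimal sketch of the ssh server source described above, in Python for illustration. The class name `SshSource` and its methods are hypothetical. `-p` and `-i` are standard OpenSSH client options for the port and the identity file. The OpenSSH client does not take a password as an argument, so the password case would need a library or an interactive prompt; the sketch only records which mode was chosen.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SshSource:
    ip: str
    username: str
    port: int = 22
    password: Optional[str] = None
    key_file: Optional[str] = None   # path to the identity file for key-based login

    def validate(self) -> None:
        # Exactly one authentication method must be configured.
        if (self.password is None) == (self.key_file is None):
            raise ValueError("set exactly one of password or key_file")
        if not 1 <= self.port <= 65535:
            raise ValueError("port out of range")

    def ssh_argv(self, command: str) -> List[str]:
        """Build the ssh command line that runs `command` on this source."""
        self.validate()
        argv = ["ssh", "-p", str(self.port)]
        if self.key_file is not None:
            argv += ["-i", self.key_file]
        argv += [f"{self.username}@{self.ip}", command]
        return argv
```

For example, `SshSource(ip="10.10.125.1", username="user", key_file="/home/ds/.ssh/id_rsa").ssh_argv("echo 'hello world'")` yields the argument list for the one-line command quoted in the comment, with the port and key made explicit.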
   
   
   
   




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1255232235

   > > > @DarkAssassinator Would u like to bring up this issue in community bi-weekly conference for further discussion? Thanks.
   > > 
   > > 
   > > @EricGao888 sure, it will be my pleasure. and may I know what time and way it is?
   > 
   > @DarkAssassinator https://docs.qq.com/doc/DQ3BGVHZ1bXp5R2FZ
   
   @EricGao888 done
   ![image](https://user-images.githubusercontent.com/20518339/191795326-cfd64d05-7350-4bd8-b031-857850a2e314.png)
   




[GitHub] [dolphinscheduler] EricGao888 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1255039366

   > > @DarkAssassinator Would u like to bring up this issue in community bi-weekly conference for further discussion? Thanks.
   > 
   > @EricGao888 sure, it will be my pleasure. and may I know what time and way it is?
   
   @DarkAssassinator https://docs.qq.com/doc/DQ3BGVHZ1bXp5R2FZ




[GitHub] [dolphinscheduler] EricGao888 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1253238108

   > > @zhuxt2015 @DarkAssassinator I'm +1 to this comment. Currently we could only manage connections for `datasource` task plugins instead of all task plugins. From my perspective, we could improve this feature and make it apply to all task plugins. Actually, we have a related issue here: #10283 but we haven't got time to design and implement it. Rather than configure ssh connections in `Security Center` or a totally new `Server Management Page`, I prefer to upgrade our current `Datasource Center` to `Connection Center` or `Configuration Center` and do it there. cc @caishunfeng @SbloodyS
   > 
   > @EricGao888 i agree with u, i think `Configuration Center` should be better
   
   @DarkAssassinator FYI, you could refer to ssh operator in Apache Airflow and see how it is implemented and what we could do for `DS SSH Task Plugin`. https://github.com/apache/airflow/blob/main/airflow/providers/ssh/hooks/ssh.py




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1285588315

   > I second that, no need to create table, a simple ssh task type will do this feature.
   
   Do you mean adding an SSH task plugin? That was actually the original idea, but we also need to manage the remote hosts, and we can't be sure that only the SSH task will need them; other tasks may need them in the future.




[GitHub] [dolphinscheduler] SbloodyS commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1227905796

   That's great! Can you provide the details of the design architecture? @DarkAssassinator 




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250031996

   > Please make things simple. We just only want to execute command in remote server. just like `ssh user@10.10.125.1 "echo 'hello world'"`.
   > 
   > My suggestion is as follows:
   > 
   > 1. So we just add a ssh server source, config ip、port、username、 password or select a id_ras.pub file from Resources to use login with key-based authentication.
   > 
   > <img alt="image" width="591" src="https://user-images.githubusercontent.com/13765310/190841320-79961704-a4ac-46a9-8ae6-17d73fcd2d3e.png">
   > 
   > 2. Select the ssh server source in Shell task。
   > 
   > <img alt="image" width="600" src="https://user-images.githubusercontent.com/13765310/190841512-10f65629-23b6-4531-8b6f-d851e36a8533.png">
   > 
   > 3. When server source exists in shell task, then ssh to server source  to execute shell script.
   
   Yes, but many tasks include many resource files, so a simple ssh command is not enough, and we also need to handle stop/kill/timeout/failover.




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250034498

   > These are my humble opinions. If you have any questions, please let me know. @DarkAssassinator
   
   Hi @SbloodyS, thank you so much for your suggestions. 
   > I think it's better not to use userName/passWord in ssh since there are some security risks. Using pam file or authrized_key in ssh is a more secure way. 
   
   You are right from a security point of view, but this raises the cost of use, because users need to create and download many host authorized_keys files. Maybe we can offer password and authorized_keys options in the UI so that users can select the ssh policy.
   
   
   > In common usage scenarios, the masterServer/apiServer's node usually does not contain the permission to use HDFS and S3. These permissions are usually included in the workerServer's node. It requires these permissions on user's masterServer/apiServer's node if using scp command to trasfer the files to the task server. In addition, downloading files from the masterServer/apiServer's node and then scping them to the task node will waste network IO and hard disk IO for some large files or large number of small files.
   
   Sure, but not all tasks are suitable for running on remote servers; maybe just Shell/Python/Java.
   If a task depends on the environment or on cluster services, it would not get this setting. As for I/O, I think users can perceive and accept this overhead, or we could add an I/O monitor and reject the command when I/O is busy.
   
   > Using SSH to execute shell commands usually requires escaping a lot of special characters for different task type. And I think this is a huge workload for subsequent maintenance.
   
   Running over ssh is the same as running the command on the local machine: we only need to send one ssh execute command to the remote server, because all the detailed commands are saved in a separate script. So no big change is needed. Just scp all tmp files to the remote server and run the main script. 
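
The difference between the two approaches can be sketched in Python as follows. Both function names are hypothetical. The first passes the whole task body through the remote shell and therefore has to quote it; the second only passes the path of a script that was copied beforehand, so the task content itself is never re-parsed on the command line.

```python
import shlex


def inline_remote_command(target, script_body):
    """Send the task body itself over ssh: the whole body must be quoted."""
    return ["ssh", target, "sh -c " + shlex.quote(script_body)]


def script_file_remote_command(target, remote_script_path):
    """Run a script that was already copied with scp: only the path is quoted."""
    return ["ssh", target, "sh " + shlex.quote(remote_script_path)]


if __name__ == "__main__":
    body = """echo "it's $HOME" | awk '{print $1}'"""
    print(inline_remote_command("user@10.10.125.1", body)[2])
    print(script_file_remote_command("user@10.10.125.1", "/tmp/ds/1.command")[2])
```

`shlex.quote` covers POSIX shells only, which is one reason the script-file route is the simpler one to maintain across task types.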
   
   > Using SSH means that the task running status and running logs need to be monitored by the masterServer. This may lead to high load on masterServer's node when the number of tasks is quite large.
   That is not needed, because the worker monitors the input stream and error stream and prints them to the worker logs, so this part needs no change.
   
   For this case, I think we only need the following changes:
   1. Add an ssh model and a management UI.
   2. Add an SSH selection to the Shell/Python task settings page.
   3. Add the ssh information to the task instance and context. 
   4. If ssh != null, shell/python will scp all tmp files to the ssh server and run the execute command, then collect all results and print them to the logs. Stop/kill/timeout/failover are handled by the shell task itself.
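
For the timeout part of step 4, a worker-side sketch in Python could look like the following. The function name `run_with_timeout` is hypothetical. Killing the local `ssh` client may leave the remote process running, so a real implementation would probably also need a remote kill such as the `ps`/`kill` pipeline discussed earlier in this thread; that part is not shown here.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run `argv`; return its exit code, or None if it was killed on timeout."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # Stop the local process (for a remote task this is the ssh client).
        proc.kill()
        proc.wait()   # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5))
```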
   
   
   
   
   
   
   
   
   
   
   
   




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1227462281

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the channel `#troubleshooting`.




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1229428080

   Dear @SbloodyS, I have shared the architecture by email; please help to review it. Thanks.




[GitHub] [dolphinscheduler] DarkAssassinator commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
DarkAssassinator commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1250034656

   > Resource files are sent to the remote server by the `scp` command; stop/kill/timeout/failover are handled by the shell task itself.
   
   Right, this is my initial design, and our test environment is also implemented like this.




[GitHub] [dolphinscheduler] SbloodyS commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
SbloodyS commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1253275455

   1. Upgrading our current `DataSource Center` to a `Configuration Center` sounds good to me.
   
   2. The problem of network IO and disk IO when using SSH has not been solved. High IO load will lead to high CPU load. In extreme cases, the server will hang and cannot be reached for remote debugging; the problem can then only be solved by restarting the server. This would become the first time bomb in DS, since the current DS services are relatively lightweight.
   
   3. Long-lived SSH connections will cause abnormal task failures due to problems such as network jitter. Also, the default maximum number of SSH connections is very small, which is not suitable for large-scale use. To solve this, all the relevant tuning configuration would also have to be documented for users.
   
   @EricGao888 @DarkAssassinator 
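
One possible mitigation for point 3, sketched in Python under stated assumptions: cap the number of concurrent SSH sessions per remote host on the sending side, and pass the OpenSSH client options `ServerAliveInterval` and `ServerAliveCountMax` so that a dead connection is detected instead of hanging. The class name, the limit and the option values are hypothetical, and this does not address the IO load in point 2. On the server side, the settings in question are presumably `MaxStartups` and `MaxSessions` in `sshd_config`.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator, Optional

# OpenSSH client options: probe the server every 30 s, give up after 3 missed replies.
KEEPALIVE_OPTIONS = ["-o", "ServerAliveInterval=30", "-o", "ServerAliveCountMax=3"]


class HostSessionLimiter:
    """Bound the number of concurrent SSH sessions opened to each host."""

    def __init__(self, max_sessions_per_host: int) -> None:
        self._max = max_sessions_per_host
        self._lock = threading.Lock()
        self._semaphores: Dict[str, threading.BoundedSemaphore] = {}

    def _semaphore(self, host: str) -> threading.BoundedSemaphore:
        # Create the per-host semaphore lazily, guarded by a lock.
        with self._lock:
            if host not in self._semaphores:
                self._semaphores[host] = threading.BoundedSemaphore(self._max)
            return self._semaphores[host]

    @contextmanager
    def session(self, host: str, timeout: Optional[float] = None) -> Iterator[None]:
        """Hold one session slot for the host; raise TimeoutError if none frees up in time."""
        sem = self._semaphore(host)
        if not sem.acquire(timeout=timeout):
            raise TimeoutError(f"no free SSH session slot for {host}")
        try:
            yield
        finally:
            sem.release()
```

A task would wrap its scp and ssh calls in `with limiter.session(host):` so that excess tasks wait locally rather than being rejected by the remote sshd.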




[GitHub] [dolphinscheduler] EricGao888 commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1253277639

   @DarkAssassinator Would you like to bring up this issue in the community bi-weekly conference for further discussion? Thanks.




[GitHub] [dolphinscheduler] simsicon commented on issue #11652: [Feature] DS can support task running on remote host, not just worker server.

Posted by GitBox <gi...@apache.org>.
simsicon commented on issue #11652:
URL: https://github.com/apache/dolphinscheduler/issues/11652#issuecomment-1283258330

   > Please keep things simple. We only want to execute a command on a remote server, just like `ssh user@10.10.125.1 "echo 'hello world'"`.
   > 
   > My suggestion is as follows:
   > 
   > 1. Add an SSH server source: configure the IP, port, username, and password, or select an id_rsa.pub file from Resources to log in with key-based authentication.
   > 
   > <img alt="image" width="591" src="https://user-images.githubusercontent.com/13765310/190841320-79961704-a4ac-46a9-8ae6-17d73fcd2d3e.png">
   > 
   > 2. Select the SSH server source in the Shell task.
   > 
   > <img alt="image" width="600" src="https://user-images.githubusercontent.com/13765310/190841512-10f65629-23b6-4531-8b6f-d851e36a8533.png">
   > 
   > 3. When a server source is set in the shell task, SSH to that server to execute the shell script.
   
   I second that. There is no need to create a table; a simple SSH task type will cover this feature.
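
For illustration, the quoted suggestion could look roughly like the sketch below. It is written in Python for brevity and is not DS code; `SshServerSource` and its fields are hypothetical names for the IP, port, username and key file mentioned above. Note that the OpenSSH `-i` option expects the private key file, and that `ssh` accepts no password on the command line, so password login would need a separate mechanism.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SshServerSource:
    """Connection settings for one remote host, as selected in the Shell task."""

    host: str
    username: str
    port: int = 22
    identity_file: Optional[str] = None  # path to a private key; None means another auth method

    def ssh_argv(self, remote_command: str) -> List[str]:
        """Build the local ssh argv that runs remote_command on this host."""
        argv = ["ssh", "-p", str(self.port)]
        if self.identity_file:
            argv += ["-i", self.identity_file]
        argv += [f"{self.username}@{self.host}", remote_command]
        return argv
```

With `SshServerSource(host="10.10.125.1", username="user")`, calling `ssh_argv("echo 'hello world'")` yields the same command as the example in the quote, plus an explicit `-p 22`.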

