You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/11/09 02:03:30 UTC

[GitHub] [dolphinscheduler] darkz1984 opened a new pull request, #12826: Speed Up delete process from Web UI

darkz1984 opened a new pull request, #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826

   ## Purpose of the pull request
   Speed Up delete process from Web UI
   
   ## Brief change log
   add an index on table t_ds_task_instance
   
   ```
   create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);
   ```
   
   I use postgresql as my dolphinscheduler meta data base
   
   Because I have some process run in each 5 minutes , so lead much process instance and task instance in the database , the most top instance of  task node may reach 2000 and the total t_ds_task_instance table hold nearly 1000000 records in it.
   
   I had try delete process or process instance from the web ui. But I found it too slow and lead my postgresql server cpu usage too high, and cost nearly about one hours I foud the sql is running long time is :
   
   ```
   select                                                                                                             
   
                                                                                                                      
   
           id, name, task_type, process_instance_id, task_code, task_definition_version, state, submit_time,          
   
           start_time, end_time, host, execute_path, log_path, alert_flag, retry_times, pid, app_link,                
   
           flag, retry_interval, max_retry_times, task_instance_priority, worker_group,environment_code , executor_id,
   
           first_submit_time, delay_time, task_params, var_pool, dry_run                                              
   
                                                                                                                      
   
           from t_ds_task_instance                                                                                    
   
           WHERE process_instance_id = $1                                                                             
   
           and flag = $2                                                                                              
   
           order by start_time desc                                                                                   
   
   
   ```
   
   then I try to add the index  , and try delete the related process definition or process instance , it Just finsh in few seconds !!!
   
   
   
   ## Verify this pull request
   then I try to add the index  , and try delete the related process definition or process instance , it Just finsh in few seconds !!!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] codecov-commenter commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1308111734

   # [Codecov](https://codecov.io/gh/apache/dolphinscheduler/pull/12826?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#12826](https://codecov.io/gh/apache/dolphinscheduler/pull/12826?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (68f541d) into [dev](https://codecov.io/gh/apache/dolphinscheduler/commit/04c6b5e4131c87258e47cdcfb4d5b3f21f9e78e4?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (04c6b5e) will **decrease** coverage by `0.02%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##                dev   #12826      +/-   ##
   ============================================
   - Coverage     39.13%   39.10%   -0.03%     
   + Complexity     4212     4208       -4     
   ============================================
     Files          1046     1046              
     Lines         39765    39733      -32     
     Branches       4576     4558      -18     
   ============================================
   - Hits          15562    15538      -24     
   + Misses        22433    22423      -10     
   - Partials       1770     1772       +2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/dolphinscheduler/pull/12826?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...erver/master/processor/queue/TaskEventService.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1tYXN0ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvc2VydmVyL21hc3Rlci9wcm9jZXNzb3IvcXVldWUvVGFza0V2ZW50U2VydmljZS5qYXZh) | `69.64% <0.00%> (-10.72%)` | :arrow_down: |
   | [...er/master/dispatch/host/assign/RandomSelector.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1tYXN0ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvc2VydmVyL21hc3Rlci9kaXNwYXRjaC9ob3N0L2Fzc2lnbi9SYW5kb21TZWxlY3Rvci5qYXZh) | `77.77% <0.00%> (-5.56%)` | :arrow_down: |
   | [...e/dolphinscheduler/remote/NettyRemotingClient.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1yZW1vdGUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvcmVtb3RlL05ldHR5UmVtb3RpbmdDbGllbnQuamF2YQ==) | `50.00% <0.00%> (-2.78%)` | :arrow_down: |
   | [...hinscheduler/plugin/alert/script/ScriptSender.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1hbGVydC9kb2xwaGluc2NoZWR1bGVyLWFsZXJ0LXBsdWdpbnMvZG9scGhpbnNjaGVkdWxlci1hbGVydC1zY3JpcHQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvcGx1Z2luL2FsZXJ0L3NjcmlwdC9TY3JpcHRTZW5kZXIuamF2YQ==) | `70.58% <0.00%> (-1.64%)` | :arrow_down: |
   | [...r/plugin/task/sqoop/parameter/SqoopParameters.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci10YXNrLXBsdWdpbi9kb2xwaGluc2NoZWR1bGVyLXRhc2stc3Fvb3Avc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvcGx1Z2luL3Rhc2svc3Fvb3AvcGFyYW1ldGVyL1Nxb29wUGFyYW1ldGVycy5qYXZh) | `51.19% <0.00%> (-1.20%)` | :arrow_down: |
   | [...cheduler/plugin/alert/dingtalk/DingTalkSender.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1hbGVydC9kb2xwaGluc2NoZWR1bGVyLWFsZXJ0LXBsdWdpbnMvZG9scGhpbnNjaGVkdWxlci1hbGVydC1kaW5ndGFsay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZG9scGhpbnNjaGVkdWxlci9wbHVnaW4vYWxlcnQvZGluZ3RhbGsvRGluZ1RhbGtTZW5kZXIuamF2YQ==) | `34.13% <0.00%> (-0.78%)` | :arrow_down: |
   | [...rver/master/runner/task/BlockingTaskProcessor.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1tYXN0ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvc2VydmVyL21hc3Rlci9ydW5uZXIvdGFzay9CbG9ja2luZ1Rhc2tQcm9jZXNzb3IuamF2YQ==) | `75.86% <0.00%> (-0.55%)` | :arrow_down: |
   | [...inscheduler/plugin/registry/etcd/EtcdRegistry.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci1yZWdpc3RyeS9kb2xwaGluc2NoZWR1bGVyLXJlZ2lzdHJ5LXBsdWdpbnMvZG9scGhpbnNjaGVkdWxlci1yZWdpc3RyeS1ldGNkL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9kb2xwaGluc2NoZWR1bGVyL3BsdWdpbi9yZWdpc3RyeS9ldGNkL0V0Y2RSZWdpc3RyeS5qYXZh) | `50.69% <0.00%> (-0.35%)` | :arrow_down: |
   | [.../dolphinscheduler/plugin/task/datax/DataxTask.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci10YXNrLXBsdWdpbi9kb2xwaGluc2NoZWR1bGVyLXRhc2stZGF0YXgvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvcGx1Z2luL3Rhc2svZGF0YXgvRGF0YXhUYXNrLmphdmE=) | `36.62% <0.00%> (-0.26%)` | :arrow_down: |
   | [.../dolphinscheduler/plugin/task/spark/SparkTask.java](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZG9scGhpbnNjaGVkdWxlci10YXNrLXBsdWdpbi9kb2xwaGluc2NoZWR1bGVyLXRhc2stc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2RvbHBoaW5zY2hlZHVsZXIvcGx1Z2luL3Rhc2svc3BhcmsvU3BhcmtUYXNrLmphdmE=) | `72.56% <0.00%> (-0.25%)` | :arrow_down: |
   | ... and [18 more](https://codecov.io/gh/apache/dolphinscheduler/pull/12826/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] sonarcloud[bot] commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1308115925

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache-dolphinscheduler&pullRequest=12826)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL) [0 Code Smells](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12826&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12826&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] EricGao888 commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336366608

   > @darkz1984 @EricGao888 I close this PR since the index(process_instance_id, flag) is not needed, and I find the logic in our code has a bug, when we delete the task instance by workflow instance id, we will query the validate task by tag, and then delete the query result, this will cause the invalidate task instance will never be deleted.
   > 
   > I submit #13091 to fix this.
   
   Sounds good to me. Thanks : )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1017339520


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   Maybe we should also add to specific update directory



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1017343520


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   I just use postgresql
   
   and I found there are some foreign key in mysql
   I think it could be removed to improve



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] davidzollo commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
davidzollo commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1032867859


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   @darkz1984 you did a good job, 
   
   can you add these changes to mysql and H2 to keep the db file sync, I think it's not difficult for you ^_^



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038102349


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   https://github.com/apache/dolphinscheduler/pull/13081
   the post upgrade ddl



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038102896


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   @davidzollo 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] ruanwenjun commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336091686

   > > Every workflow instance has a distinct id, you will rerun the workflow instance in every 5 minutes?
   > > In general, the index `process_instance_id` is enough.
   > 
   > But delete process instance will delete the task instance casde, and the task instance may be with a huge number of rows in it. In my system, it reach 10,000 of rows one day, and total reach 4,000,000 rows. So it need to add this index to speed up delete the task instance with this index
   
   We delete the task instance by process_instance_id, I am not clear what is your problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] ruanwenjun commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336362760

   @darkz1984 There already exist process_instance_id in `t_ds_task_instance`, you don't need to add again.
   
   The problem is we will query the validate task instance when delete, you need to refactor this logic, directly delete by process_instance_id.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] ruanwenjun closed pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
ruanwenjun closed pull request #12826: Speed Up delete process from Web UI
URL: https://github.com/apache/dolphinscheduler/pull/12826


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336049996

   > Every workflow instance has a distinct id, you will rerun the workflow instance in every 5 minutes?
   > 
   > In general, the index `process_instance_id` is enough.
   
   But delete process instance will delete the task instance casde, and the task instance may be with a huge number of rows in it. In my system, it reach 10,000 of rows one day, and total reach 4,000,000 rows. So it need to add this index to speed up delete the task instance with this index


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] ruanwenjun commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336366274

   @darkz1984 @EricGao888 I close this PR since the index(process_instance_id, flag) is not needed, and I find the logic in our code has a bug, when we delete the task instance by workflow instance id, we will query the validate task by tag, and then delete the query result, this will cause the invalidate task instance will never be deleted.
   
   I submit #13091 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1017338107


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   It is only need to change postgresql, or did we miss adding to h2 and mysql?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] mergeable[bot] commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
mergeable[bot] commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1308095466

   :warning: This PR do not change database DDL synchronize.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038085202


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   > @darkz1984 you did a good job,
   > 
   > can you add these changes to mysql and H2 to keep the db file sync, I think it's not difficult for you ^_^
   
   I am not use mysql or H2 with dolphinscheduler on my server, So, I think I could do this...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038089138


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   And add an index for a big t_ds_task_instance with much rows online when dolphinscheduler is running in PostgreSQL is very easy and just cost few milliseconds. 
   But this action in MySQL is very danger, It may lead table lock and long time cost,  My be an production accident.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038100817


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   > @darkz1984 you did a good job,
   > 
   > can you add these changes to mysql and H2 to keep the db file sync, I think it's not difficult for you ^_^
   
   I had try mysql :
   https://github.com/apache/dolphinscheduler/pull/13080



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] EricGao888 commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
EricGao888 commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336167856

   @darkz1984 I don't doubt that creating an index on the combination of `(process_instance_id, flag)` for table `t_ds_task_instance` could significantly improve the querying efficiency of the sql sentence you post in this issue, which you used to manually deleted the stuff from db. But I don't think this index could improve any of the queries in `TaskInstanceMapper.xml`. It is not meaningful for the upstream code.
   
   https://github.com/apache/dolphinscheduler/blob/dev/dolphinscheduler-dao/src/main/resources/org/apache/dolphinscheduler/dao/mapper/TaskInstanceMapper.xml


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336380114

   > @darkz1984 @EricGao888 I close this PR since the index(process_instance_id, flag) is not needed, and I find the logic in our code has a bug, when we delete the task instance by workflow instance id, we will query the validate task by tag, and then delete the query result, this will cause the invalidate task instance will never be deleted.
   > 
   > I submit #13091 to fix this.
   
   OK, good news


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1038105653


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   And I think H2 database has no index, so, no need to add index for it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] ruanwenjun commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1335488166

   Every workflow instance has a distinct id, you will rerun the workflow instance in every 5 minutes?
   
   In general, the index `process_instance_id` is enough.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] darkz1984 commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
darkz1984 commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1336306485

   @EricGao888 @ruanwenjun 
   I detect this problem in my dolphinscheduler, and delete the process instance too slow, may be more than 10 minutes.
   And i follow the database process of dolphinscheduler, see this slow database query.
   Then I add this index,  the action end in  some seconds.
   
   I have some process in dolphinscheduler run each 5 minutes, so some process definition has much process instance and mo task instance.  
   About 20,000+ task instance of the process definition. So the delete of process definition too slow.
   if the process in dolphinscheduler just run every day, delete of the process definition is fast.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] sonarcloud[bot] commented on pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
sonarcloud[bot] commented on PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#issuecomment-1308115458

   Kudos, SonarCloud Quality Gate passed!&nbsp; &nbsp; [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache-dolphinscheduler&pullRequest=12826)
   
   [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=BUG)  
   [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=VULNERABILITY)  
   [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=SECURITY_HOTSPOT)  
   [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL) [0 Code Smells](https://sonarcloud.io/project/issues?id=apache-dolphinscheduler&pullRequest=12826&resolved=false&types=CODE_SMELL)
   
   [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12826&metric=coverage&view=list) No Coverage information  
   [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache-dolphinscheduler&pullRequest=12826&metric=duplicated_lines_density&view=list) No Duplication information
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #12826: Speed Up delete process from Web UI

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on code in PR #12826:
URL: https://github.com/apache/dolphinscheduler/pull/12826#discussion_r1017939754


##########
dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql:
##########
@@ -796,6 +797,7 @@ CREATE TABLE t_ds_task_instance (
 ) ;
 
 create index idx_task_instance_code_version on t_ds_task_instance (task_code, task_definition_version);
+create index idx_task_instance_process_instance_flag on t_ds_task_instance (process_instance_id, flag);

Review Comment:
   > I just use postgresql
   > 
   > and I found there are some foreign key in mysql I think it could be removed to improve
   
   But as a PR, you should also add your change to other type of database, such as mysql and H2 to keep synchronous.
   Also, you should add the index add in https://github.com/apache/dolphinscheduler/blob/489e7fe4e2d980031c91eac2a4823b9f1713659a/dolphinscheduler-dao/src/main/resources/sql/upgrade/3.2.0_schema/postgresql/dolphinscheduler_ddl.sql and https://github.com/apache/dolphinscheduler/blob/489e7fe4e2d980031c91eac2a4823b9f1713659a/dolphinscheduler-dao/src/main/resources/sql/upgrade/3.2.0_schema/mysql/dolphinscheduler_ddl.sql



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org