Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/12/08 08:39:49 UTC

[GitHub] [dolphinscheduler] jieguangzhou opened a new issue, #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks

jieguangzhou opened a new issue, #13133:
URL: https://github.com/apache/dolphinscheduler/issues/13133

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Description
   
   Like this: [Flyte Caching](https://docs.flyte.org/projects/cookbook/en/latest/auto/core/flyte_basics/task_cache.html#sphx-glr-auto-core-flyte-basics-task-cache-py)
   
   In machine learning workflows, if the results of some tasks are cached, the workflows will run faster.
   
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [dolphinscheduler] caishunfeng closed issue #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks

Posted by GitBox <gi...@apache.org>.
caishunfeng closed issue #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks
URL: https://github.com/apache/dolphinscheduler/issues/13133




[GitHub] [dolphinscheduler] Radeity commented on issue #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks

Posted by GitBox <gi...@apache.org>.
Radeity commented on issue #13133:
URL: https://github.com/apache/dolphinscheduler/issues/13133#issuecomment-1343784855

   Hi, @jieguangzhou, wonderful idea! In addition, there is one thing I think we should take into consideration:
   > A well-behaved Flyte task should generate **deterministic** output given the same inputs and task functionality.
   
   How can we make sure the task is deterministic? Inference is deterministic, but training can only be relatively deterministic because of some random steps. Also, other types of tasks are not always deterministic. Should we add a task-level configuration option for this cache mechanism, so that users can decide whether the task gives deterministic output or whether the nondeterminism can be ignored? WDYT?




[GitHub] [dolphinscheduler] github-actions[bot] commented on issue #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #13133:
URL: https://github.com/apache/dolphinscheduler/issues/13133#issuecomment-1342274162

   Thank you for your feedback. We have received your issue; please wait patiently for a reply.
   * So that we can understand your request as soon as possible, please provide detailed information, the version, or pictures.
   * If you haven't received a reply for a long time, you can [join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to the `#troubleshooting` channel.




[GitHub] [dolphinscheduler] jieguangzhou commented on issue #13133: [Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks

Posted by GitBox <gi...@apache.org>.
jieguangzhou commented on issue #13133:
URL: https://github.com/apache/dolphinscheduler/issues/13133#issuecomment-1344365660

   
   
   
   
   > Hi, @jieguangzhou, wonderful idea! In addition, there is one thing I think we should take into consideration:
   > 
   > > A well-behaved Flyte task should generate **deterministic** output given the same inputs and task functionality.
   > 
   > How can we make sure the task is deterministic? Inference is deterministic, but training can only be relatively deterministic because of some random steps. Also, other types of tasks are not always deterministic. Should we add a task-level configuration option for this cache mechanism, so that users can decide whether the task gives deterministic output or whether the nondeterminism can be ignored? WDYT?
   
   Yes, I think we can add a new flag that lets users decide whether a task should use the cache.
   
   Maybe like this:
   ![image](https://user-images.githubusercontent.com/31528124/206722152-1aa8a7d5-0a84-43c9-ac22-d98d57a1bce5.png)
   
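The opt-in flag discussed in this thread could be sketched roughly as follows. This is only an illustration, not DolphinScheduler's implementation: `TaskDefinition`, `use_cache`, `cache_key`, and `TaskCache` are hypothetical names, the store is a plain in-memory dict, and the key is assumed to be a hash of the task name, a task version, and the JSON-serializable inputs.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class TaskDefinition:
    """Hypothetical task description; not DolphinScheduler's actual model."""
    name: str
    version: int                           # bump when the task logic changes
    run: Callable[[Dict[str, Any]], Any]   # the work the task performs
    use_cache: bool = False                # opt-in flag set by the user


def cache_key(task: TaskDefinition, inputs: Dict[str, Any]) -> str:
    """Derive a key from the task identity and its inputs.

    sort_keys=True makes the key independent of dict insertion order.
    Inputs are assumed to be JSON-serializable.
    """
    payload = json.dumps(
        {"name": task.name, "version": task.version, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TaskCache:
    """In-memory result store; a real system would need persistent storage."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def execute(self, task: TaskDefinition, inputs: Dict[str, Any]) -> Any:
        # Tasks that did not opt in always run, so tasks the user considers
        # non-deterministic are never served a stale result.
        if not task.use_cache:
            return task.run(inputs)
        key = cache_key(task, inputs)
        if key not in self._store:
            self._store[key] = task.run(inputs)
        return self._store[key]
```

Leaving `use_cache` off by default keeps the decision with the user, which matches the concern raised above: only the task author can judge whether the output is deterministic enough to reuse.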

