Posted to notifications@shardingsphere.apache.org by GitBox <gi...@apache.org> on 2020/05/06 14:40:40 UTC

[GitHub] [shardingsphere] avalon5666 opened a new issue #5446: Improve scaling abnormal recovery ability

avalon5666 opened a new issue #5446:
URL: https://github.com/apache/shardingsphere/issues/5446


   I think a scaling job should always succeed in the end, so we should do our best to make it complete. Network exceptions and crashes of the node executing the task are the two main reasons synchronization fails.
   - We can re-run the executor when a network exception occurs.
   - For an executor node crash, we can rely on the rescheduling mechanism of a scheduling framework, such as ElasticJob or Kubernetes.
   - For other failures, we need an admin to fix the problem, but we should not shut down other tasks.
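
As a rough illustration of the first and third points, the sketch below retries a step only when it fails with an `IOException` (used here as a stand-in for a network exception) and lets every other failure propagate so that someone can look at it. `NetworkTask`, `NetworkRetry`, the attempt limit and the linear backoff are hypothetical names and choices for this example, not existing ShardingSphere APIs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical unit of work that may fail with a network-style error. */
interface NetworkTask<T> {
    T run() throws IOException;
}

/** Hypothetical helper: retries a task only for IOException, assumed to mean "network problem". */
final class NetworkRetry {

    private NetworkRetry() {
    }

    /**
     * Runs the task up to maxAttempts times. IOException is treated as transient
     * and retried; any other (runtime) exception propagates immediately.
     */
    static <T> T call(NetworkTask<T> task, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.run();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    pause(backoffMillis * attempt); // linear backoff between attempts
                }
            }
        }
        // All attempts failed with a network error; report the last one.
        throw new UncheckedIOException("gave up after " + maxAttempts + " attempts", last);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

A caller would wrap a single step, for example `NetworkRetry.call(() -> runOneStep(), 5, 200L)`, where `runOneStep` is whatever the executor does. Because the retry is scoped to one task, a task that finally gives up does not affect the others.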
   
   ## Breakpoint resume ability
   This is necessary for abnormal recovery.
   - For inventory data, we read rows in primary key order, which lets us record the last key read and resume from it. If there is no primary key, we can only re-read the full inventory data.
   - For incremental data, there is a log position, and we just need to persist it.
   - As for persisting the position, I think we can persist it to ZooKeeper (zk) asynchronously.
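
The inventory case can be sketched as follows. This is a minimal example under stated assumptions: the source table is simulated by an in-memory `TreeMap` with positive integer primary keys, and `InventoryTable` and `ResumableInventoryDumper` are hypothetical names, not classes from the scaling module. The point is only that reading "rows with key greater than the last one seen, in key order" makes the last key a usable breakpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a source table with an integer primary key (assumption for this sketch). */
final class InventoryTable {

    private final NavigableMap<Integer, String> rows = new TreeMap<>();

    void put(int id, String value) {
        rows.put(id, value);
    }

    /** Similar in spirit to: SELECT ... WHERE id > ? ORDER BY id LIMIT ? */
    List<Map.Entry<Integer, String>> readAfter(int lastId, int limit) {
        List<Map.Entry<Integer, String>> batch = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : rows.tailMap(lastId, false).entrySet()) {
            if (batch.size() == limit) {
                break;
            }
            batch.add(entry);
        }
        return batch;
    }
}

/** Hypothetical dumper that remembers the last primary key it has handed over. */
final class ResumableInventoryDumper {

    private final InventoryTable table;
    private final int batchSize;
    private int lastId; // the breakpoint; a real job would persist this value

    ResumableInventoryDumper(InventoryTable table, int batchSize, int startAfterId) {
        this.table = table;
        this.batchSize = batchSize;
        this.lastId = startAfterId;
    }

    /** Dumps at most maxBatches batches into sink; returns true once the table is exhausted. */
    boolean dump(List<String> sink, int maxBatches) {
        for (int i = 0; i < maxBatches; i++) {
            List<Map.Entry<Integer, String>> batch = table.readAfter(lastId, batchSize);
            if (batch.isEmpty()) {
                return true;
            }
            for (Map.Entry<Integer, String> entry : batch) {
                sink.add(entry.getValue());
            }
            // Advance the breakpoint only after the whole batch has been delivered.
            lastId = batch.get(batch.size() - 1).getKey();
        }
        return false;
    }

    int lastId() {
        return lastId;
    }
}
```

If a dumper stops after some batches, a new `ResumableInventoryDumper` created with the recorded `lastId()` continues from the next row instead of re-reading the table. Without a primary key there is no such ordering to resume from, which is why the full inventory has to be read again.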


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] kimmking closed issue #5446: Improve scaling abnormal recovery ability

Posted by GitBox <gi...@apache.org>.
kimmking closed issue #5446:
URL: https://github.com/apache/shardingsphere/issues/5446


   





[GitHub] [shardingsphere] KomachiSion edited a comment on issue #5446: Improve scaling abnormal recovery ability

Posted by GitBox <gi...@apache.org>.
KomachiSion edited a comment on issue #5446:
URL: https://github.com/apache/shardingsphere/issues/5446#issuecomment-624965640


   Good suggestion.
   
   I have one question about
   > We can re run executor when network exception.
   
   Should we re-run the executor, or just retry the command that hit the network exception?
   
   For example, suppose the dumper runs a SQL `select` query and it fails with a network exception. Can we just retry that SQL instead of restarting the dumper?





[GitHub] [shardingsphere] avalon5666 commented on issue #5446: Improve scaling abnormal recovery ability

Posted by GitBox <gi...@apache.org>.
avalon5666 commented on issue #5446:
URL: https://github.com/apache/shardingsphere/issues/5446#issuecomment-626332785


   ## Task
   - [ ] Persist task position info
   - [ ] Importer retries the insert on network exception
   - [ ] Dumper restarts on network exception
   - [ ] Dumper starts from the persisted task position info
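
One possible shape for the first task, persisting position info without slowing the dumper down, is sketched below. Everything here is an assumption for illustration: `PositionStore` is a hypothetical abstraction, the in-memory implementation stands in for a ZooKeeper-backed one, and the position is treated as an opaque string. The dumper only overwrites the newest position in memory; a background thread writes it out at a fixed interval.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical storage abstraction; a ZooKeeper-backed implementation is not shown. */
interface PositionStore {
    void save(String taskId, String position);

    String load(String taskId);
}

/** In-memory stand-in used only to keep this sketch self-contained. */
final class InMemoryPositionStore implements PositionStore {

    private final Map<String, String> data = new ConcurrentHashMap<>();

    @Override
    public void save(String taskId, String position) {
        data.put(taskId, position);
    }

    @Override
    public String load(String taskId) {
        return data.get(taskId);
    }
}

/** Keeps only the newest position in memory and writes it out on a background thread. */
final class AsyncPositionPersister implements AutoCloseable {

    private final String taskId;
    private final PositionStore store;
    private final AtomicReference<String> pending = new AtomicReference<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    AsyncPositionPersister(String taskId, PositionStore store, long flushIntervalMillis) {
        this.taskId = taskId;
        this.store = store;
        scheduler.scheduleWithFixedDelay(this::flush, flushIntervalMillis, flushIntervalMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Called on the task's hot path; it never waits for the store. */
    void update(String position) {
        pending.set(position);
    }

    /** Writes the newest pending position, if there is one. */
    synchronized void flush() {
        String position = pending.getAndSet(null);
        if (position == null) {
            return;
        }
        try {
            store.save(taskId, position);
        } catch (RuntimeException e) {
            // Keep the position for the next flush unless a newer one has arrived meanwhile.
            pending.compareAndSet(null, position);
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
        try {
            scheduler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        flush(); // final synchronous flush so the last position is not lost on a clean stop
    }
}
```

The trade-off of asynchronous persistence is that a crash can lose the positions recorded since the last flush, so a restarted task may re-process a small window of data. That is only safe if the importer side tolerates replays.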








[GitHub] [shardingsphere] avalon5666 edited a comment on issue #5446: Improve scaling abnormal recovery ability

Posted by GitBox <gi...@apache.org>.
avalon5666 edited a comment on issue #5446:
URL: https://github.com/apache/shardingsphere/issues/5446#issuecomment-624990318


   If we want breakpoint resume in the inventory dumper, I think we should restart the dumper rather than retry the SQL. Otherwise, it is just as you say.
   In the importer, we just retry the SQL rather than restart the importer.




