Posted to dev@uniffle.apache.org by GitBox <gi...@apache.org> on 2022/07/26 10:34:56 UTC

[GitHub] [incubator-uniffle] xianjingfeng opened a new issue, #80: [Feature Request] Support shuffle server decommissioned

xianjingfeng opened a new issue, #80:
URL: https://github.com/apache/incubator-uniffle/issues/80

   Killing the process is not graceful, so we need the shuffle server to support decommissioning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1290017350

   > we can start an offline meeting to discuss this issue.
   
   I am looking forward to it.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by "xianjingfeng (via GitHub)" <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1411385326

   As we discussed in the design doc, I will make the following adjustments:
   1. Add a concrete list of REST APIs to the design doc.
   2. Remove the token from this design.
   
   Any other suggestions? @jerqi @zuston @advancedxy 


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1375397876

   Thanks a lot for proposing this. I will take a look ASAP.


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1289902149

   > @jerqi @colinmjj I want to know whether you have any plans for this in the near future. We have some features that need to be built on top of decommissioning, such as auto scaling. We don't want to deviate too much from the community.
   
   We have no related plans at the moment. If you are interested in this topic, we can start an offline meeting to discuss this issue.


[GitHub] [incubator-uniffle] kaijchen commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1463421654

   Thanks @xianjingfeng for working on this feature.
   I'm closing this issue now. Please feel free to reopen it if there is more work.


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1195315483

   +1. For now, decommissioning can be achieved with the exclude-node file on the coordinator side.
   
   Besides, the exclude-node file could be stored in HDFS.
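
A minimal sketch of the exclude-node idea mentioned above, not Uniffle's actual implementation. It assumes the file lists one server ID per line and that blank lines and `#` comments are ignored; the function names and server IDs are hypothetical. Servers named in the file stay registered but receive no new assignments.

```python
from typing import Iterable, List, Set


def parse_exclude_nodes(text: str) -> Set[str]:
    """Parse exclude-file content: one node ID per line (assumed format).

    Blank lines and lines starting with '#' are skipped (also an assumption).
    """
    nodes = set()
    for line in text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            nodes.add(entry)
    return nodes


def assignable_servers(registered: Iterable[str], excluded: Set[str]) -> List[str]:
    """Return registered servers that may still receive new shuffle assignments."""
    return [server for server in registered if server not in excluded]


# Hypothetical file content and server IDs, for illustration only.
exclude_file = """
# servers being drained
10.0.0.2-19999
"""
servers = ["10.0.0.1-19999", "10.0.0.2-19999", "10.0.0.3-19999"]
print(assignable_servers(servers, parse_exclude_nodes(exclude_file)))
```

Excluding a node this way only stops new assignments; it does not tell the operator when the node is safe to stop, which is the gap the feature request is about.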


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1291440080

   @xianjingfeng I have already sent an email: https://lists.apache.org/thread/2jlm3fswmsxy619ldyo4px700p3ybnvc. Do you have time at 11 am (UTC+8) on Thursday this week?


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1290042208

   @zuston @xianjingfeng There are some other issues that we need to discuss, so I will send an email to our dev mailing list and pick a suitable date for the meeting.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1291570814

   > > > @xianjingfeng I have already sent an email: https://lists.apache.org/thread/2jlm3fswmsxy619ldyo4px700p3ybnvc. Do you have time at 11 am (UTC+8) on Thursday this week?
   > > 
   > > 
   > > Yes, I have time.
   > 
   > Meeting link is https://meeting.tencent.com/dm/oR95wASCNe91
   
   Got it.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1291497534

   > @xianjingfeng I have already sent an email: https://lists.apache.org/thread/2jlm3fswmsxy619ldyo4px700p3ybnvc. Do you have time at 11 am (UTC+8) on Thursday this week?
   
   Yes, I have time.


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1375420047

   > Design doc: https://docs.google.com/document/d/1p1PksBN2LJ-OtGEHvdyEuH9b1Mv1aD_exMPl4TNaTs0/edit?usp=sharing PTAL @jerqi @zuston
   
   Commented. @xianjingfeng PTAL


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1288975766

   @jerqi @colinmjj I want to know whether you have any plans for this in the near future. We have some features that need to be built on top of decommissioning, such as auto scaling. We don't want to deviate too much from the community.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1200327637

   I think we should consider a few more things, such as:
   1. Is it easy to use if we deploy on k8s, where IPs are not fixed?
   2. Split-brain. If we pass commands through the heartbeat, a shuffle server may receive different messages at the same time. How do we ensure correctness? This can be solved for decommissioning, but what about other functions in the future?
   3. Compatibility. If we pass commands through the heartbeat, we will need to modify this interface frequently.
   


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1200436991

   cc @colinmjj. What do you think? I remember that you wanted to use the coordinator to dispatch configuration to shuffle servers. That is similar to using the coordinator to decommission.


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196808782

   Could you write a design doc (using Google Docs)? This issue is a little complex.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196656326

   > I understand that you need a `rolling upgrade` feature. In our plan, we want to accomplish this feature with a k8s operator. For the standalone mode, we don't have any plans. It's also necessary to survey this feature first, and I think we should discuss this problem further.
   
   Deploying on k8s is a good choice, but having one more option is not a bad thing. Not all teams are willing to use k8s. I have created a PR.


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by "zuston (via GitHub)" <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1411430506

   > As we discussed in the design doc, I will make the following adjustments:
   > 
   > 1. Add a concrete list of REST APIs to the design doc.
   > 2. Remove the token from this design.
   > 
   > Any other suggestions? @jerqi @zuston @advancedxy
   
   +1. Thanks for your effort


[GitHub] [incubator-uniffle] kaijchen closed issue #80: [Feature Request] Support shuffle server decommissioned

Posted by "kaijchen (via GitHub)" <gi...@apache.org>.
kaijchen closed issue #80: [Feature Request]  Support shuffle server decommissioned
URL: https://github.com/apache/incubator-uniffle/issues/80


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196241494

   I understand that you need a `rolling upgrade` feature. In our plan, we want to accomplish this feature with a k8s operator. For the standalone mode, we don't have any plans. It's also necessary to survey this feature first, and I think we should discuss this problem further.


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1292974565

   Offline discussion result:
   The coordinator provides an admin REST API. The coordinator is only used as a proxy: it redirects each request to the shuffle server over RPC.
   Currently, we need the following APIs:
   1. Decommission
   2. UpdateConfiguration
   3. Upgrade
   cc @zuston, you can give us your advice.
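
The proxy arrangement described above can be sketched as follows. This is an illustration under stated assumptions rather than the agreed design: the class name `CoordinatorAdminProxy`, the status codes, the lower-case API names, and the `fake_rpc` transport are all hypothetical. Only the three operations and the "coordinator as proxy" role come from the discussion result.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Admin operations named in the discussion result (spelling here is assumed).
ADMIN_APIS = ("decommission", "updateConfiguration", "upgrade")

# Hypothetical RPC transport: (server_id, api, payload) -> response text.
RpcSender = Callable[[str, str, dict], str]


@dataclass
class CoordinatorAdminProxy:
    """Forwards admin requests to shuffle servers without acting on them itself."""

    send_rpc: RpcSender
    known_servers: List[str] = field(default_factory=list)

    def handle(self, api: str, server_id: str, payload: dict) -> Tuple[int, str]:
        """Validate a REST-style request and relay it; returns (status, body)."""
        if api not in ADMIN_APIS:
            return 404, f"unknown admin api: {api}"
        if server_id not in self.known_servers:
            return 404, f"unknown shuffle server: {server_id}"
        # The coordinator is only a proxy: the shuffle server does the work.
        return 200, self.send_rpc(server_id, api, payload)


def fake_rpc(server_id: str, api: str, payload: dict) -> str:
    """Stand-in for a real RPC client, used only to make the sketch runnable."""
    return f"{server_id} accepted {api}"


proxy = CoordinatorAdminProxy(send_rpc=fake_rpc, known_servers=["server-1"])
print(proxy.handle("decommission", "server-1", {}))
```

Keeping the coordinator stateless with respect to these commands means a new admin operation only needs a new entry in the API list and a handler on the shuffle server side.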


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1291558298

   > > @xianjingfeng I have already sent an email: https://lists.apache.org/thread/2jlm3fswmsxy619ldyo4px700p3ybnvc. Do you have time at 11 am (UTC+8) on Thursday this week?
   > 
   > Yes, I have time.
   
   Meeting link is https://meeting.tencent.com/dm/oR95wASCNe91


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by "jerqi (via GitHub)" <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1411386029

   I'm OK.


[GitHub] [incubator-uniffle] xianjingfeng commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
xianjingfeng commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1375395560

   Design doc: https://docs.google.com/document/d/1p1PksBN2LJ-OtGEHvdyEuH9b1Mv1aD_exMPl4TNaTs0/edit?usp=sharing
   PTAL @jerqi @zuston 


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1198810860

   > YARN node decommission: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html 
   
   Yes. In #85, I follow the YARN decommission mechanism, so I think it's better to control decommissioning from the coordinator. Feel free to discuss more.
   
   > Maybe we should also look at how other systems implement decommissioning.
   
   I looked at HDFS DataNode decommissioning, and it is also similar to YARN's. See: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html
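
As a rough illustration of the graceful style of decommission discussed here (stop accepting new work, let running work drain, then stop), a toy state machine might look like the one below. The state names, the `DrainingServer` class, and its methods are assumptions made for this sketch; they are not taken from #85 or from the YARN or HDFS code.

```python
from enum import Enum
from typing import Set


class ServerState(Enum):
    ACTIVE = "ACTIVE"
    DECOMMISSIONING = "DECOMMISSIONING"
    DECOMMISSIONED = "DECOMMISSIONED"


class DrainingServer:
    """Toy model of a server that drains running applications before stopping."""

    def __init__(self) -> None:
        self.state = ServerState.ACTIVE
        self.running_apps: Set[str] = set()

    def register_app(self, app_id: str) -> bool:
        """Accept a new application only while the server is ACTIVE."""
        if self.state is not ServerState.ACTIVE:
            return False
        self.running_apps.add(app_id)
        return True

    def decommission(self) -> None:
        """Stop taking new work; finish immediately if nothing is running."""
        if self.state is ServerState.ACTIVE:
            self.state = ServerState.DECOMMISSIONING
            self._finish_if_drained()

    def app_finished(self, app_id: str) -> None:
        """Drop a finished application and complete the drain if it was the last."""
        self.running_apps.discard(app_id)
        self._finish_if_drained()

    def _finish_if_drained(self) -> None:
        # Only a draining server with no remaining applications may stop.
        if self.state is ServerState.DECOMMISSIONING and not self.running_apps:
            self.state = ServerState.DECOMMISSIONED
```

A real implementation would likely also need a timeout for applications that never finish, which this sketch leaves out.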


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1198369508

   YARN node decommission: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html Maybe we should also look at how other systems implement decommissioning.


[GitHub] [incubator-uniffle] jerqi commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1196888898

   If we want to add an interface to control the shuffle server's behavior, we should have a complete design, and we think this needs detailed discussion. We had a similar idea in issue #37.


[GitHub] [incubator-uniffle] colinmjj commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
colinmjj commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1200618427

   I think this feature is about a command line or some API to manage the behavior of the coordinator and shuffle servers.
   There should be an overall picture that describes how to make this happen.
   Besides decommissioning, how about updating some configuration in the shuffle server, clearing shuffle data (which may be useful for streaming jobs), etc.?
   All of the above features are management related, so I would prefer a framework that can cover all of these things.


[GitHub] [incubator-uniffle] zuston commented on issue #80: [Feature Request] Support shuffle server decommissioned

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #80:
URL: https://github.com/apache/incubator-uniffle/issues/80#issuecomment-1289972807

   > > @jerqi @colinmjj I want to know whether you have any plans for this in the near future. We have some features that need to be built on top of decommissioning, such as auto scaling. We don't want to deviate too much from the community.
   > 
   > 
   > 
   > We have no related plans at the moment. If you are interested in this topic, we can start an offline meeting to discuss this issue.
   
   +1. 

