You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@uniffle.apache.org by GitBox <gi...@apache.org> on 2022/08/21 14:57:15 UTC

[GitHub] [incubator-uniffle] zuston opened a new issue, #172: [Feature] Support short-circuit read in local-file

zuston opened a new issue, #172:
URL: https://github.com/apache/incubator-uniffle/issues/172

   ### Motivation
   When uniffle shuffle servers are co-located with Yarn nodemanagers, we could use the short-circuit read to improve performance and reduce the overhead.
   
   ### How to do
   There are two options to solve this
   1. Directly read shuffle-data files by client side. 
   2. Use the domain socket.
   
   #### Option 1 - Directly read
   This is the fastest way, but the local-files' read permission should be open for client side. This maybe have security problem.
   
   
   #### Option2 - Domain socket
   This way could be avoid security problem, but will be slower than above. And its implementation will be complex.
   
   ### Conclusion
   In my opinion, above two ways could be all supported in uniffle, which could be as different policies for users to choose.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221761020

   > > When the taskScheduler allocate task, we don't consider reduce partition distribution, fewer task can use the feature.
   > 
   > If having this feature, maybe we could update the Spark data-locality policy of `preferLocation` for uniffle. Anyway, this is a improvement. Do u think so?
   
   Although it's an improvement, if it may bring more complexity and less effectiveness, I prefer not merging it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1229465774

   > For your thought, do I need to submit a data-locality shuffle patch about Spark firstly? As I know, we need to change the Spark codebase about `MapOutputTracker.getPreferredLocationsForShuffle` and `ShuffleRowRDD.getPreferredLocations` . When enable RSS, the perferred locations could be gotten from the dep.shuffleHandle(RssShuffleHandle of Uniffle partitionToServers vars). If I'm wrong, plz point out it. Thanks
   > 
   > I still think this is a necessary feature for co-location deployment. Besides this has been implemented by other RSS project.
   
   Not only the spark patch, but also we need to modify the strategy of coordinator. I know the Bytedance RSS implement this feature, if the cluster is big enough, the feature will be useless.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221738131

   Oh, this feature is not bound to Yarn. 
   We will deploy shuffle-servers on Yarn nodemanager machine using ansible, instead of using YARN Applications to deploy.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1223921407

   Do u have some ideas? @jerqi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221751157

   > How to know the local directory for Spark client?
   
   The stored local path should be retrieved from shuffle server by client side. So we need to add a new grpc api.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221758478

   > When the taskScheduler allocate task, we don't consider reduce partition distribution, fewer task can use the feature.
   
   If having this feature, maybe we could update the Spark data-locality policy of `preferLocation` for uniffle. Do u think so? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1229441977

   For your thought, do I need to submit a data locality shuffle patch about Spark firstly? As I know, we need to change the Spark codebase about `MapOutputTracker.getPreferredLocationsForShuffle` and `ShuffleRowRDD.getPreferredLocations` . When enable RSS, the perferred locations could be gotten from the dep.shuffleHandle(RssShuffleHandle of Uniffle partitionToServers vars). If I'm wrong, plz point out it. Thanks
   
   I still think this is a necessary feature for co-location deployment, it have been implemented by other RSS project.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221729871

   I think it's the most important thing that we don't official deployment plan now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1223930666

   > Do u have some ideas? @jerqi
   
   This feature need to increase a new rpc interface. The interfaces are very important. I need to guarantee the compatibility of interfaces. I think this is the biggest complexity which this feature brings. For effectiveness, we don't have data locality, we can't get too much performance improvement. And we have no plans about data locality, if we want to have this feature, we should achieve the feature of data locality.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221562215

   I think it's necessary feature, what do u think? @jerqi 
   
   Besides, I have implement one policy of `read directly` and want to contribute.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221750359

   > > Which option do you prefer?
   > 
   > I prefer `directly read` in the initial version.
   
   How to know the local directory for Spark client?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1229468191

   > but also we need to modify the strategy of coordinator
   
   Yes, we’d better to assign shuffle servers with data locality. This is a long way to get the best optimization.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221733373

   > Yes. we need a detailed deployment instruction for users in doc. But I dont think this feature depends on the former.
   
   Sorry, i means that we don't official  yarn deployment plan now. I think this feature depends yarn.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221732163

   Yes. we need a detailed deployment instruction for users in doc. But I dont think this feature depends on the former.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221754513

   When the taskScheduler allocate task, we don't consider reduce partition distribution,  fewer task can use the feature.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] jerqi commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221739650

   > Oh, this feature is not bound to Yarn. We will deploy shuffle-servers on Yarn nodemanager machine using ansible, instead of using [YARN Applications](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html) to deploy.
   
   Which option do you prefer?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221749575

   > Which option do you prefer?
   
   I prefer `directly read` in the initial version.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #172: [Feature] Support short-circuit read in local-file

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #172:
URL: https://github.com/apache/incubator-uniffle/issues/172#issuecomment-1221771080

   Got your thought.
   
   ### complexity
   This feature only introduces the short-circuit read handler. In my POC, it wont break down the current implementation. Besides, we could introduce the config to control whether to enable this.
   
   ### effectiveness
   The effectiveness depends on the Spark knowledge about the shuffle-data-locality. I think this can be improved by the later patches for Spark, like Alluxio does. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org