Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/06 11:32:49 UTC

[GitHub] [spark] attilapiros commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service on Kubernetes

attilapiros commented on pull request #32031:
URL: https://github.com/apache/spark/pull/32031#issuecomment-814047657


   I think the following are good reasons to keep `remote-shuffle-service` as a separate plugin:
   - most of the changes are brand-new files added under the `remote-shuffle-service` directory (the only exceptions are two `pom.xml` files: one in the root and the other in the assembly directory)
   - it is even deployed separately 
   - as you wrote, `There are several disaggregated/remote shuffle solutions in different companies`, so why should this one be chosen for inclusion? (although the rest of the solutions are not open sourced, or not open sourced yet)
   - if it remains a separate plugin, it can be released separately (independence from Spark releases might be needed: this code is fresh and not as well tested, so `remote-shuffle-service` will probably need many more releases in the beginning than Spark)
   - as a separate plugin, the boundaries are cleaner and easier to keep clean
   - as a separate plugin, the developer who knows everything about it has full control over the source code
   
   What would be the advantage of integrating `remote-shuffle-service` into Spark?
   
   I can see only one advantage: this solution would get more visibility as the one that is officially included.
   But I think widespread usage has to be achieved differently: making remote-shuffle-service much easier to use on k8s is a good direction, for example by providing, in addition to the docker image build provided here, scripts, tools and guidance on how to set it up and configure it for your Spark jobs.
   
   As more and more users adopt it, we will gain more experience, and more developers will definitely join to work on it.
   
   WDYT?

