Posted to dev@systemds.apache.org by GitBox <gi...@apache.org> on 2021/10/25 22:33:31 UTC

[GitHub] [systemds] ywcb00 opened a new pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

ywcb00 opened a new pull request #1421:
URL: https://github.com/apache/systemds/pull/1421


   Hi,
   This PR introduces the Federated Lookup Table as a first step for supporting Federated Multi Tenancy.
   The Federated Lookup Table maps a coordinator identifier (FedUniqueCoordID) to an individual ExecutionContextMap at the federated worker. Each federated worker creates a FederatedLookupTable and selects the corresponding ExecutionContextMap for each FederatedRequest based on the coordinator host (obtained from the netty channel) and the pid attached to the FederatedRequest. Because every coordinator has its own ExecutionContextMap, each coordinator can create variable IDs locally and sequentially without colliding with those of any other coordinator.
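
To make the isolation idea above concrete, here is a minimal sketch of a per-coordinator lookup, assuming a key built from host and pid. All class and method names (`LookupTableSketch`, `CoordinatorId`, `ContextMap`, `contextFor`) are hypothetical stand-ins and do not mirror the actual SystemDS classes.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: names are illustrative, not the SystemDS API.
final class LookupTableSketch {

    // Identifies a coordinator by its host and process id.
    static final class CoordinatorId {
        final String host;
        final long pid;

        CoordinatorId(String host, long pid) {
            this.host = host;
            this.pid = pid;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CoordinatorId)) return false;
            CoordinatorId other = (CoordinatorId) o;
            return pid == other.pid && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, pid);
        }
    }

    // Stand-in for an ExecutionContextMap: variable ID -> value.
    static final class ContextMap {
        private final Map<Long, Object> vars = new ConcurrentHashMap<>();
        void put(long varId, Object value) { vars.put(varId, value); }
        Object get(long varId) { return vars.get(varId); }
    }

    private final Map<CoordinatorId, ContextMap> table = new ConcurrentHashMap<>();

    // Returns the map of the given coordinator, creating it on first use.
    ContextMap contextFor(String host, long pid) {
        return table.computeIfAbsent(new CoordinatorId(host, pid), id -> new ContextMap());
    }

    public static void main(String[] args) {
        LookupTableSketch worker = new LookupTableSketch();
        // Two coordinators both use variable ID 1 without overwriting each other.
        worker.contextFor("host-a", 100L).put(1L, "X of coordinator A");
        worker.contextFor("host-b", 100L).put(1L, "X of coordinator B");
        System.out.println(worker.contextFor("host-a", 100L).get(1L));
        System.out.println(worker.contextFor("host-b", 100L).get(1L));
    }
}
```

In this sketch both coordinators store a value under variable ID 1, and each one reads back its own value, because the ID is only unique within the map selected for that coordinator.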
   I also added some shell scripts to test the setup with multiple coordinators locally, although I will most likely change the tests again in a later PR, since I'm not happy with them myself. :D
   
   Thanks for the review :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@systemds.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemds] ywcb00 commented on pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

Posted by GitBox <gi...@apache.org>.
ywcb00 commented on pull request #1421:
URL: https://github.com/apache/systemds/pull/1421#issuecomment-954216164


   The _LineageFedReuseAlg_ test was failing because running the test _without lineage_ reuse overwrote the local _DMLScript.LINEAGE_ flag of the federated workers. As a result, the federated workers created the execution map without a lineage object for the coordinator of the actual test _with lineage_ reuse.
   This problem can only occur in a single-process setup, because otherwise each instance has its own DMLScript flags, so I simply changed the execution order of the test so that the test without reuse gets executed last.
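
The interference described above comes down to a static flag being shared by everything that runs in one JVM. The following is a deliberately simplified sketch of that effect, not the SystemDS code: `SharedFlagSketch`, `lineage`, and `WorkerContext` are hypothetical names, and the assumption is that a worker-side context reads the flag at the moment it is created.

```java
// Hypothetical sketch: names are illustrative, not the SystemDS API.
final class SharedFlagSketch {

    // Stand-in for a JVM-wide static flag such as DMLScript.LINEAGE.
    static boolean lineage = false;

    // Stand-in for a worker-side context that captures the flag when created.
    static final class WorkerContext {
        final boolean tracksLineage = lineage;
    }

    public static void main(String[] args) {
        lineage = true;   // the run with lineage reuse configures the shared JVM
        lineage = false;  // the run without reuse, in the same JVM, overwrites the flag
        // A context created now for the lineage run's coordinator sees the overwritten value.
        WorkerContext ctx = new WorkerContext();
        System.out.println(ctx.tracksLineage); // prints false
    }
}
```

With separate processes each JVM would hold its own copy of the flag, which is why the problem is limited to the single-process setup.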





[GitHub] [systemds] asfgit closed pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #1421:
URL: https://github.com/apache/systemds/pull/1421


   





[GitHub] [systemds] mboehm7 commented on pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

Posted by GitBox <gi...@apache.org>.
mboehm7 commented on pull request #1421:
URL: https://github.com/apache/systemds/pull/1421#issuecomment-961373148


   Thanks @ywcb00 - that's a great start. As just discussed offline, let's move the tests into our Java test suite and create individual worker and coordinator processes accordingly.





[GitHub] [systemds] ywcb00 commented on pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

Posted by GitBox <gi...@apache.org>.
ywcb00 commented on pull request #1421:
URL: https://github.com/apache/systemds/pull/1421#issuecomment-980445356


   Thank you for the advice to add the multi-tenant test cases to our test suite, @mboehm7. Now I can withdraw my statement that I am not satisfied with the tests :smiley:
   I added the JUnit class FederatedMultiTenantTest, which includes two different test setups:
   
   1. SameWorkers: all the different coordinators share the same federated sites with the same data and partitioning
   2. SharedWorkers: each coordinator shares some federated sites with the prior coordinator and some with the subsequent coordinator, with different data/partitions on the federated sites
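
The two setups above can be pictured as worker assignments per coordinator. The sketch below is only an illustration: the real layout used by FederatedMultiTenantTest is not given here, so the sliding-window scheme for SharedWorkers, the worker indices, and the names `TenantLayoutSketch`, `sameWorkers`, and `sharedWorkers` are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the assignment scheme is assumed, not taken from the test class.
final class TenantLayoutSketch {

    // SameWorkers: every coordinator uses workers 0 .. numWorkers-1.
    static List<List<Integer>> sameWorkers(int numCoordinators, int numWorkers) {
        List<List<Integer>> layout = new ArrayList<>();
        for (int c = 0; c < numCoordinators; c++) {
            List<Integer> workers = new ArrayList<>();
            for (int w = 0; w < numWorkers; w++) workers.add(w);
            layout.add(workers);
        }
        return layout;
    }

    // SharedWorkers (assumed sliding window): coordinator c uses workers c .. c+window-1,
    // so it overlaps with coordinator c-1 and with coordinator c+1 when window >= 2.
    static List<List<Integer>> sharedWorkers(int numCoordinators, int window) {
        List<List<Integer>> layout = new ArrayList<>();
        for (int c = 0; c < numCoordinators; c++) {
            List<Integer> workers = new ArrayList<>();
            for (int w = c; w < c + window; w++) workers.add(w);
            layout.add(workers);
        }
        return layout;
    }

    public static void main(String[] args) {
        System.out.println(sameWorkers(3, 2));   // [[0, 1], [0, 1], [0, 1]]
        System.out.println(sharedWorkers(3, 2)); // [[0, 1], [1, 2], [2, 3]]
    }
}
```

With a window of two, the middle coordinator shares worker 1 with the prior coordinator and worker 2 with the subsequent one.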





[GitHub] [systemds] mboehm7 commented on pull request #1421: [SYSTEMDS-3185] Federated Lookup Table

Posted by GitBox <gi...@apache.org>.
mboehm7 commented on pull request #1421:
URL: https://github.com/apache/systemds/pull/1421#issuecomment-996242254


   LGTM - this is a great start toward multi-tenant federated learning where multiple data scientists (coordinators) are properly isolated at the federated sites.
   
   Initially, the tests did not work for me because the wait for process completion blocked due to filled-up queues. After reimplementing how we obtain the output before going into the wait state, this issue was resolved. However, there still seems to be an issue:
   
   ```
   org.apache.spark.SparkException: A master URL must be set in your configuration
   	at org.apache.spark.SparkContext.<init>(SparkContext.scala:380)
   ```
   
   because `DMLScript.USE_LOCAL_SPARK_CONFIG` is still set to false in the spawned coordinator processes. Could you have a look? Maybe a new SystemDS-config property could do the trick.
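
The blocking wait mentioned above is a common pitfall when spawning processes from Java: if the child writes more output than the pipe buffer holds and nobody reads it, the child stalls and `waitFor()` never returns. A minimal sketch of the general remedy, reading the output to the end before waiting, could look like this. It is not the code from this PR, and `ProcessOutputSketch` and `runAndCapture` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names are illustrative, not taken from the SystemDS test utilities.
final class ProcessOutputSketch {

    // Starts the command, drains its merged stdout/stderr, then waits for it to exit.
    static String runAndCapture(String... command) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectErrorStream(true); // one stream to drain instead of two
            Process process = builder.start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                // Reading until end-of-stream keeps the pipe from filling up.
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            process.waitFor(); // safe now: all output has been consumed
            return output.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the process", e);
        }
    }

    public static void main(String[] args) {
        // Uses the running JVM's launcher as a command that exists on any platform.
        String java = System.getProperty("java.home") + "/bin/java";
        System.out.print(runAndCapture(java, "-version"));
    }
}
```

Merging stderr into stdout keeps the sketch single-threaded. If the two streams must stay separate, each one needs its own reader thread or a redirect to a file.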




