Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/01 15:20:00 UTC

[GitHub] [arrow-datafusion] edrevo opened a new pull request #232: Make external hostname in executor optional

edrevo opened a new pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232


   # Which issue does this PR close?
   
   Closes #76.
   
   # Rationale for this change
   
   From the issue:
   
   > Having the ability to set an external/advertised hostname is great since it provides users a lot of flexibility in network deployments. However, having it a required argument is a pain for the most common scenario, where the scheduler, client and executors talk to each other in the same network (e.g. k8s or docker-compose).
   >
   > We should make the external hostname optional. If the scheduler receives executor metadata without a hostname, it should register the caller's IP address as the hostname.
   >
   > This will make it easier to deploy the executors as a kubernetes deployment, or to docker-compose scale ballista-executor= in the integration tests.
   
   # What changes are included in this PR?
   
   - Make hostname optional upon executor registration
   - Remove a bunch of old code from the planner
   
   # Are there any user-facing changes?
   
   The advertised hostname is now optional when launching the executor.
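
A minimal sketch of the fallback quoted from the issue, not the code merged in this PR. The function name `registered_host` and its parameters are hypothetical; it assumes the scheduler can obtain the caller's socket address from its gRPC layer as an `Option<SocketAddr>`.

```rust
use std::net::SocketAddr;

/// Returns the host the scheduler should record for an executor.
/// `advertised` is the optional external hostname sent by the executor;
/// `caller` is the peer address of the registration request, if known.
/// Both names are illustrative and do not come from the Ballista code base.
fn registered_host(advertised: Option<&str>, caller: Option<SocketAddr>) -> Option<String> {
    advertised
        // An explicitly advertised hostname always wins.
        .map(str::to_owned)
        // Otherwise fall back to the IP address the request came from.
        .or_else(|| caller.map(|addr| addr.ip().to_string()))
}
```

With this shape, an executor started without an advertised hostname is still reachable as long as the scheduler and executors share a network, which is the docker-compose and Kubernetes case the issue describes.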


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] edrevo commented on a change in pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
edrevo commented on a change in pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#discussion_r624524348



##########
File path: ballista/rust/scheduler/src/main.rs
##########
@@ -62,17 +62,20 @@ async fn start_server(
         BALLISTA_VERSION, addr
     );
 
-    let scheduler_server =
-        SchedulerServer::new(config_backend.clone(), namespace.clone());
     Ok(Server::bind(&addr)

Review comment:
       I'm a bit concerned this code is creating a new server per client. See https://github.com/apache/arrow/pull/9987/files#r624519673







[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#issuecomment-830651290


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#232](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f9861a8) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.24%`.
   > The diff coverage is `27.02%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/232/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #232      +/-   ##
   ==========================================
   + Coverage   76.46%   76.70%   +0.24%     
   ==========================================
     Files         135      134       -1     
     Lines       23250    23174      -76     
   ==========================================
   - Hits        17777    17776       -1     
   + Misses       5473     5398      -75     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [ballista/rust/core/src/client.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9jbGllbnQucnM=) | `0.00% <ø> (ø)` | |
   | [...ust/core/src/execution\_plans/unresolved\_shuffle.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9leGVjdXRpb25fcGxhbnMvdW5yZXNvbHZlZF9zaHVmZmxlLnJz) | `50.00% <ø> (ø)` | |
   | [ballista/rust/executor/src/execution\_loop.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZXhlY3V0aW9uX2xvb3AucnM=) | `0.00% <ø> (ø)` | |
   | [ballista/rust/executor/src/flight\_service.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZmxpZ2h0X3NlcnZpY2UucnM=) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/executor/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvbWFpbi5ycw==) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/scheduler/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL21haW4ucnM=) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/scheduler/src/planner.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL3BsYW5uZXIucnM=) | `69.46% <50.00%> (+23.53%)` | :arrow_up: |
   | [ballista/rust/scheduler/src/lib.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL2xpYi5ycw==) | `21.13% <88.88%> (+1.62%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...f9861a8](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [arrow-datafusion] edrevo commented on pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
edrevo commented on pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#issuecomment-830648865


   cc @andygrove 





[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#issuecomment-830651290


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#232](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1898ade) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.24%`.
   > The diff coverage is `27.02%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/232/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #232      +/-   ##
   ==========================================
   + Coverage   76.46%   76.70%   +0.24%     
   ==========================================
     Files         135      134       -1     
     Lines       23250    23174      -76     
   ==========================================
   - Hits        17777    17776       -1     
   + Misses       5473     5398      -75     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [ballista/rust/core/src/client.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9jbGllbnQucnM=) | `0.00% <ø> (ø)` | |
   | [...ust/core/src/execution\_plans/unresolved\_shuffle.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9leGVjdXRpb25fcGxhbnMvdW5yZXNvbHZlZF9zaHVmZmxlLnJz) | `50.00% <ø> (ø)` | |
   | [ballista/rust/executor/src/execution\_loop.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZXhlY3V0aW9uX2xvb3AucnM=) | `0.00% <ø> (ø)` | |
   | [ballista/rust/executor/src/flight\_service.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZmxpZ2h0X3NlcnZpY2UucnM=) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/executor/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvbWFpbi5ycw==) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/scheduler/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL21haW4ucnM=) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/scheduler/src/planner.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL3BsYW5uZXIucnM=) | `69.46% <50.00%> (+23.53%)` | :arrow_up: |
   | [ballista/rust/scheduler/src/lib.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/232/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL2xpYi5ycw==) | `21.13% <88.88%> (+1.62%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...1898ade](https://codecov.io/gh/apache/arrow-datafusion/pull/232?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [arrow-datafusion] andygrove merged pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
andygrove merged pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232


   





[GitHub] [arrow-datafusion] andygrove commented on pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#issuecomment-830671774


   Thanks @edrevo, this looks like a good cleanup. I will find time this weekend (likely tomorrow) to pull this PR and do some testing to make sure I understand everything that is happening here.





[GitHub] [arrow-datafusion] edrevo commented on a change in pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
edrevo commented on a change in pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#discussion_r624557196



##########
File path: ballista/rust/executor/src/main.rs
##########
@@ -158,16 +166,22 @@ async fn main() -> Result<()> {
     let scheduler = SchedulerGrpcClient::connect(scheduler_url)
         .await
         .context("Could not connect to scheduler")?;
-    let executor = Arc::new(BallistaExecutor::new(config));
-    let service = BallistaFlightService::new(executor);
+    let service = BallistaFlightService::new(work_dir);
 
     let server = FlightServiceServer::new(service);
     info!(
         "Ballista v{} Rust Executor listening on {:?}",
         BALLISTA_VERSION, addr
     );
     let server_future = tokio::spawn(Server::builder().add_service(server).serve(addr));
-    let client = BallistaClient::try_new(&external_host, port).await?;
+    let client_host = external_host.as_deref().unwrap_or_else(|| {
+        if bind_host == "0.0.0.0" {

Review comment:
       I have added a comment to clarify. Right now the executor does a really weird thing which has a big TODO here:
   
   https://github.com/apache/arrow-datafusion/blob/70afe4c459af33b8cb190383c923fcee09cde252/ballista/rust/executor/src/execution_loop.rs#L100-L101
   
   Basically, the executor needs to connect to itself through a BallistaClient in order to work. If there is an external host defined, then it is clear how to connect to oneself. If not, we need to check the bind address, but since `0.0.0.0` is a meta-address, for that case we can just use localhost to connect to ourselves. Does that make sense?
   
   I started working on the TODO to get rid of this ugliness, but then the PR would have been too big, so I was planning on tackling that separately.
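
The selection rule described in this comment can be sketched as a standalone function. This is an illustration under assumptions, not the code from the diff: the hunk above is truncated after the `0.0.0.0` check, and `resolve_client_host` is a hypothetical name.

```rust
/// Picks the host the executor uses to connect to itself.
/// `external_host` is the optional advertised hostname and `bind_host`
/// is the address the Flight server binds to.
fn resolve_client_host<'a>(external_host: Option<&'a str>, bind_host: &'a str) -> &'a str {
    match external_host {
        // An explicit external host says exactly how to reach this executor.
        Some(host) => host,
        // 0.0.0.0 is a meta-address meaning "all interfaces"; it can be bound to
        // but is not a sensible target to connect to, so use localhost instead.
        None if bind_host == "0.0.0.0" => "localhost",
        // Any other bind address is also a valid address to connect to.
        None => bind_host,
    }
}
```

Only the self-connection target is affected here; the server can still bind to `0.0.0.0` and accept connections on every interface.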







[GitHub] [arrow-datafusion] andygrove commented on a change in pull request #232: Make external hostname in executor optional

Posted by GitBox <gi...@apache.org>.
andygrove commented on a change in pull request #232:
URL: https://github.com/apache/arrow-datafusion/pull/232#discussion_r624544329



##########
File path: ballista/rust/executor/src/main.rs
##########
@@ -158,16 +166,22 @@ async fn main() -> Result<()> {
     let scheduler = SchedulerGrpcClient::connect(scheduler_url)
         .await
         .context("Could not connect to scheduler")?;
-    let executor = Arc::new(BallistaExecutor::new(config));
-    let service = BallistaFlightService::new(executor);
+    let service = BallistaFlightService::new(work_dir);
 
     let server = FlightServiceServer::new(service);
     info!(
         "Ballista v{} Rust Executor listening on {:?}",
         BALLISTA_VERSION, addr
     );
     let server_future = tokio::spawn(Server::builder().add_service(server).serve(addr));
-    let client = BallistaClient::try_new(&external_host, port).await?;
+    let client_host = external_host.as_deref().unwrap_or_else(|| {
+        if bind_host == "0.0.0.0" {

Review comment:
       I'm not sure I understand what the intent is here. According to https://en.wikipedia.org/wiki/0.0.0.0, binding to `0.0.0.0` means binding to `"any IPv4 address at all"`. This seems to change that behavior and prevents the user from doing that. Perhaps you could add some documentation here to explain this?



