You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/05/30 07:19:13 UTC

[GitHub] [spark] beliefer opened a new pull request, #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

beliefer opened a new pull request, #41379:
URL: https://github.com/apache/spark/pull/41379

   ### What changes were proposed in this pull request?
   `SparkConnectStreamHandler` treats the proto requests from connect client and send the responses back to connect client. `SparkConnectStreamHandler` holds a component `StreamObserver` to send responses.
   So I think we should keep the `StreamObserver` could be accessed only with `SparkConnectStreamHandler`.
   
   This PR want decouple the process handle commands and the other process send responses on server side.
   After this PR, we can remove `MockObserver` which seems is not useful.
   
   
   ### Why are the changes needed?
   Decouple handle command and send response on server side.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   Just update the inner implementation.
   
   
   ### How was this patch tested?
   Exists test cases.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1568115945

   ping @HyukjinKwon @hvanhovell @zhengruifeng 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1576127284

   cc @grundprinzip @hvanhovell  what do you think of this PR?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer closed pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer closed pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side
URL: https://github.com/apache/spark/pull/41379


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1569978050

   > In case of multi `ExecutePlanResponse`s, does this change mean we can not send the first response before the last response is ready?
   
   Yeah! This is an issue. I will improve it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1568420485

   In case of multi `ExecutePlanResponse`s, does this change mean we can not send the first response before the last response to be ready?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1573130534

   @zhengruifeng Currently, except for `handleSqlCommand`, all of them only return a reply. The `handleSqlCommand` returns two responses, the first being the result of executing the SQL command, and the second being the Metrics information. The time required to create Metrics information is minimal. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1576583751

   even if there is no other commands returning multiple response, I don't feel strongly about the idea of introducing such a limitation. I also think we should send them asap.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1576541575

   Unfortunately, I think this might not be the best possible approach. Any command has the ability to send any response "right now" and expects that it will be returned. The client can change their behavior based on this and we should not simply buffer all responses. In addition, if I understand correctly, buffering the responses is not great for memory consumption reasons. 
   
   In the asynchronous query processing case, we want to immediately return a message for example that returns a query ID for the client to re-attache to.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1576567216

   > In the asynchronous query processing case, we want to immediately return a message for example that returns a query ID for the client to re-attache to.
   
   I agree your opinion. In fact, all the has implemented command will not occur the issues you commented. So, do we really have the scene you comment above ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on pull request #41379: [SPARK-43879][CONNECT] Decouple handle command and send response on server side

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41379:
URL: https://github.com/apache/spark/pull/41379#issuecomment-1578318511

   👎 on this PR in the current state as the protocol is using a streaming response. That today this is not directly used is not a good enough answer. Users can stream responses back using custom commands from the extension registry. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org