You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "xieshuaihu (via GitHub)" <gi...@apache.org> on 2023/10/31 11:43:32 UTC

[PR] [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and… [spark]

xieshuaihu opened a new pull request, #43601:
URL: https://github.com/apache/spark/pull/43601

   ### What changes were proposed in this pull request?
   
   Fix issue [spark-45738](https://issues.apache.org/jira/projects/SPARK/issues/SPARK-45738?filter=reportedbyme)
   
   Make `ExecuteEventsManager` more gracefully assert status in `SessionEventsManager`, don't throw exception when handling `Finished/Failed/Cancelled/Closed` events if status in SessionEventsManager is `Closed`. So ExecuteThreadRunner has chance to  response error to client when session is evicted.
   
   ### Why are the changes needed?
   
   I inspected spark sever session manager codes, and found that this bug is caused by ExecuteEventsManager and SessionEventsManager status conflict.
   
   When a session is `evicted`(SessionHolder.scala#L170), the status in SessionEventsManager(SessionEventsManager.scala#L72) will be changed to `Closed`, and the query within this session is cancelled. At the same time, the thread running the evicted session will throw exception(ExecuteThreadRunner.scala#L111) which is handled by ErrorUtils.handleError(ErrorUtils.scala#L211). However, this `handleError` will post failed or canceled event to ExecuteEventsManager(ErrorUtils.scala#L258), which will check the status of session in SessionEventsManager. If the status is `Closed`, it will throw exception. And **no chance to send error response**(ErrorUtils.scala#L265) to client, then the client wait forever.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   
   ### How was this patch tested?
   
   Manually test. After this patch, the client will receive error when its session is evicted.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and… [spark]

Posted by "xieshuaihu (via GitHub)" <gi...@apache.org>.
xieshuaihu closed pull request #43601: [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and…
URL: https://github.com/apache/spark/pull/43601


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and… [spark]

Posted by "xieshuaihu (via GitHub)" <gi...@apache.org>.
xieshuaihu commented on PR #43601:
URL: https://github.com/apache/spark/pull/43601#issuecomment-1792034871

   cc @juliuszsompolski  @HyukjinKwon @grundprinzip 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and… [spark]

Posted by "xieshuaihu (via GitHub)" <gi...@apache.org>.
xieshuaihu commented on PR #43601:
URL: https://github.com/apache/spark/pull/43601#issuecomment-1793936234

   Thanks for your explanation @juliuszsompolski 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45738][CONNECT]fix status conflict in ExecuteEventsManager and… [spark]

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on PR #43601:
URL: https://github.com/apache/spark/pull/43601#issuecomment-1792287685

   Hi @xieshuaihu ,
   
   I ran into this issue when implementing and testing `ReleaseSession`, and I modified the order of things happening during session eviction in https://github.com/apache/spark/pull/43546 so that this doesn't happen again.
   Now, before the session is evicted, all the executions in it are cancelled, and it waits for ExecuteThreadRunner threads to be joined before sending the session.close(). Could you review if I got that right?
   
   Before that, I talked with @jdesjean about doing what you are doing here, and he was not in favor of relaxing these assertions, saying that it's a good invariant to not have any events concerning the session be coming after the SessionClose event. So I fixed it to make the session eviction preserve this invariant.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org