Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/29 15:58:52 UTC

[GitHub] [spark] matar993 opened a new pull request, #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

matar993 opened a new pull request, #37026:
URL: https://github.com/apache/spark/pull/37026

   ### What changes were proposed in this pull request?
   This PR aims to allow the user to check the state during a streaming query execution.
   It is possible to check how the state is updated on each batch execution.
   
   ### Why are the changes needed?
   The changes are needed in order to:
   - check whether a state exists for a specific key
   - check the value of the state for a specific key
   - check the state expiration.
   
   All these checks can be done during a streaming query execution.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Added tests in the StatefulStreamSuite class.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] commented on pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #37026:
URL: https://github.com/apache/spark/pull/37026#issuecomment-1272420493

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] HeartSaVioR commented on pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on PR #37026:
URL: https://github.com/apache/spark/pull/37026#issuecomment-1171711978

   The new test utility methods rely heavily on the assumption that the operation is flatMapGroupsWithState, whereas it is just one of many stateful operations. I don't see this as a general way to query the state, even in a test context. In reality, it would require at least the operator ID and the store name (if the operator leverages multiple state stores). Please refer to the implementation of stream-stream join to see how it leverages multiple state stores.
   
   It would be nice to have a general way to query the state while a streaming query is running (in general, reading the checkpoint while the query is running concurrently is a bad idea). Once we have that feature, we can leverage it to achieve the same.




[GitHub] [spark] AmplabJenkins commented on pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37026:
URL: https://github.com/apache/spark/pull/37026#issuecomment-1171142982

   Can one of the admins verify this patch?




[GitHub] [spark] HeartSaVioR commented on pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on PR #37026:
URL: https://github.com/apache/spark/pull/37026#issuecomment-1170842397

   Hello, I see this is mostly about making flatMapGroupsWithState easier to test. In Spark 3.2.0 we added a utility class that allows end users to test the user function of flatMapGroupsWithState with a test-purpose GroupState implementation. With this approach, end users don't even need to run the streaming query; they just need to call their user function with the TestGroupState.
   https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/streaming/TestGroupState.html
   
   Could you please check whether this utility class fulfills your requirement? Thanks in advance.
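
To make the suggestion above concrete, here is a minimal sketch of testing a user function with TestGroupState and no running query. The `RunningCount` state type and the `updateCount` function are hypothetical names invented for this illustration; they are not part of the PR or of Spark. Only `TestGroupState.create` and the `GroupState` accessors come from the Spark API linked above (Spark 3.2.0 or later is assumed).

```scala
import org.apache.spark.api.java.Optional
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, TestGroupState}

// Hypothetical state type, for illustration only.
final case class RunningCount(count: Long)

object RunningCountExample {
  // Hypothetical user function of the shape passed to flatMapGroupsWithState:
  // it adds the number of new values to the count stored for the key.
  def updateCount(
      key: String,
      values: Iterator[Int],
      state: GroupState[RunningCount]): Iterator[(String, Long)] = {
    val previous = state.getOption.map(_.count).getOrElse(0L)
    val updated = RunningCount(previous + values.size)
    state.update(updated)
    Iterator((key, updated.count))
  }

  def main(args: Array[String]): Unit = {
    // Start from an empty state, with no timeout and no event-time watermark.
    val state = TestGroupState.create[RunningCount](
      optionalState = Optional.empty[RunningCount],
      timeoutConf = GroupStateTimeout.NoTimeout,
      batchProcessingTimeMs = 1L,
      eventTimeWatermarkMs = Optional.empty[Long],
      hasTimedOut = false)

    // Nothing has touched the state yet.
    assert(!state.exists)
    assert(!state.isUpdated)

    // Call the user function directly; no streaming query is started.
    val output = updateCount("user-1", Iterator(10, 20, 30), state).toList

    // The function saw three values, so the stored count is 3.
    assert(output == List(("user-1", 3L)))
    assert(state.exists)
    assert(state.isUpdated)
    assert(!state.isRemoved)
    assert(state.get == RunningCount(3L))
  }
}
```

This covers the "does a state exist for this key" and "what is its value" checks at the unit-test level. It does not exercise the state store or checkpointing, which is the integration-level gap discussed elsewhere in this thread.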




[GitHub] [spark] matar993 commented on pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
matar993 commented on PR #37026:
URL: https://github.com/apache/spark/pull/37026#issuecomment-1171044253

   Hello, it can certainly be done with TestGroupState, which doesn't need to start a streaming query.
   It can be seen as a unit test of the user function of flatMapGroupsWithState.
   I agree with that, but what I intended to do in this PR was just to add two StreamActions to an existing test framework that is used to test a streaming query (more like an integration test).
   I know that this kind of test can be done separately using the TestGroupState utility class, but I would just like to add the ability to also check the state when you test your streaming query using that framework.




[GitHub] [spark] github-actions[bot] closed pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #37026: [SPARK-39632][SQL][Tests] Add state utils to StreamTest to check states during streaming queries
URL: https://github.com/apache/spark/pull/37026

