Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/06 22:37:47 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request, #38528: [SPARK-41025][SS] Introduce ComparableOffset to support offset range validation

HeartSaVioR opened a new pull request, #38528:
URL: https://github.com/apache/spark/pull/38528

   ### What changes were proposed in this pull request?
   
   This PR introduces a new interface, ComparableOffset, a mix-in for the streaming Offset interface that enables comparison between two offset instances. MicroBatchExecution validates the offset range whenever the offset instance implements ComparableOffset.
   
   The new interface can be mixed in with both the DSv1 streaming Offset and the DSv2 streaming Offset.
   
   This PR also implements the interface for the streaming offsets of the built-in data sources.
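   
   As a rough illustration of the idea, the following self-contained Java sketch shows an opt-in mix-in and an engine-side range check. It does not use Spark classes: `Offset` here is a stand-in, and `CompareResult`, `compareWith`, `LongOffset`, and `validateOffsetRange` are hypothetical names chosen for the example, not the API proposed in this PR.

```java
public class ComparableOffsetSketch {
    /** Hypothetical comparison outcome; the PR's real type may differ. */
    enum CompareResult { LESS, EQUAL, GREATER, UNDETERMINED }

    /** Stand-in for Spark's streaming Offset, reduced to its JSON form. */
    interface Offset {
        String json();
    }

    /** Hypothetical mix-in: a source opts in by implementing it on its offset. */
    interface ComparableOffset {
        CompareResult compareWith(Offset other);
    }

    /** A toy offset wrapping a single long, which is trivially comparable. */
    static final class LongOffset implements Offset, ComparableOffset {
        final long value;

        LongOffset(long value) {
            this.value = value;
        }

        @Override
        public String json() {
            return Long.toString(value);
        }

        @Override
        public CompareResult compareWith(Offset other) {
            if (!(other instanceof LongOffset)) {
                return CompareResult.UNDETERMINED;
            }
            int c = Long.compare(value, ((LongOffset) other).value);
            if (c < 0) return CompareResult.LESS;
            if (c > 0) return CompareResult.GREATER;
            return CompareResult.EQUAL;
        }
    }

    /** Engine-side check: only offsets that opt in are validated. */
    static void validateOffsetRange(Offset start, Offset end) {
        if (!(end instanceof ComparableOffset)) {
            return; // opaque offset: nothing to check
        }
        CompareResult result = ((ComparableOffset) end).compareWith(start);
        if (result == CompareResult.LESS) {
            throw new IllegalStateException(
                "End offset " + end.json() + " is behind start offset " + start.json());
        }
    }

    public static void main(String[] args) {
        validateOffsetRange(new LongOffset(5), new LongOffset(8)); // passes
        Offset opaque = () -> "{\"opaque\":true}";
        validateOffsetRange(new LongOffset(5), opaque); // skipped: no mix-in
        try {
            validateOffsetRange(new LongOffset(8), new LongOffset(5));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

   Because the check is keyed on the mix-in, offsets that do not implement it keep their current behavior in this sketch.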
   
   ### Why are the changes needed?
   
   Currently, Spark performs no assertions on offsets; the data source implementation is fully responsible for validating them. It seems more useful for Spark to provide offset validation than to merely document the responsibility and leave the work to each data source implementation.
   
   Offset validation has become more important with Trigger.AvailableNow, which gradually advances the offset and terminates when the offset equals the desired offset. A bug in a data source may stall query progress or even cause data duplication.
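   
   To make the failure modes concrete, here is a deliberately simplified Java model of that termination rule. It is not Spark's MicroBatchExecution: offsets are plain longs, and `runUntil` and the `nextOffset` function are hypothetical names standing in for the engine loop and the data source.

```java
import java.util.function.LongUnaryOperator;

public class AvailableNowSketch {
    /**
     * Runs batches until the current offset equals the target offset and
     * returns the number of batches. nextOffset models the data source: given
     * the current offset, it returns the end offset of the next batch.
     */
    static int runUntil(long start, long target, LongUnaryOperator nextOffset) {
        long current = start;
        int batches = 0;
        while (current != target) {
            long end = nextOffset.applyAsLong(current);
            if (end < current) {
                // Going backwards would reprocess data that was already read.
                throw new IllegalStateException("offset regressed: " + current + " -> " + end);
            }
            if (end == current) {
                // No progress short of the target: the loop would spin forever.
                throw new IllegalStateException("offset stalled at " + current);
            }
            if (end > target) {
                // Overshooting means current can never equal target.
                throw new IllegalStateException("offset " + end + " is beyond target " + target);
            }
            current = end;
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Well-behaved source: advances by at most 3 and never passes the target.
        int batches = runUntil(0, 10, cur -> Math.min(cur + 3, 10));
        System.out.println("batches: " + batches); // 0 -> 3 -> 6 -> 9 -> 10, prints "batches: 4"

        // Buggy source: jumps back to 2 once it reaches 6.
        try {
            runUntil(0, 10, cur -> cur == 6 ? 2 : cur + 3);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

   Without the three checks, the buggy source in this model would loop indefinitely or re-read earlier data, which is the class of problem the validation is meant to surface early.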
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, not for end users.
   
   ### How was this patch tested?
   
   New unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #38528: [SPARK-41025][SS] Introduce ComparableOffset to support offset range validation

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on PR #38528:
URL: https://github.com/apache/spark/pull/38528#issuecomment-1304913137

   cc. @zsxwing @jerrypeng Appreciate your review. Thanks in advance!




[GitHub] [spark] HeartSaVioR closed pull request #38528: [SPARK-41025][SS] Introduce ValidateOffsetRange/ComparableOffset to support offset range validation

Posted by GitBox <gi...@apache.org>.
HeartSaVioR closed pull request #38528: [SPARK-41025][SS] Introduce ValidateOffsetRange/ComparableOffset to support offset range validation
URL: https://github.com/apache/spark/pull/38528




[GitHub] [spark] HeartSaVioR commented on pull request #38528: [SPARK-41025][SS] Introduce ValidateOffsetRange/ComparableOffset to support offset range validation

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on PR #38528:
URL: https://github.com/apache/spark/pull/38528#issuecomment-1314833214

   Let me just handle this in each data source instead. I got some internal feedback that this approach seems to be over-engineering.

