You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "anishshri-db (via GitHub)" <gi...@apache.org> on 2023/03/30 05:25:30 UTC

[GitHub] [spark] anishshri-db opened a new pull request, #40600: [SPARK-42968][SS] Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks

anishshri-db opened a new pull request, #40600:
URL: https://github.com/apache/spark/pull/40600

   ### What changes were proposed in this pull request?
   Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks. This option was already present as part of the BatchWrite API
   
   ### Why are the changes needed?
   Sinks such as the following are atleast-once for which we do not need to go through the commit coordinator on the driver to ensure that a single partition commits. This is even less useful for streaming use-cases where batches could be replayed from the checkpoint dir.
   
   - memory sink
   - console sink
   - no-op sink
   - Kafka v2 sink
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added unit test for the change
   ```
   [info] ReportSinkMetricsSuite:
   22:23:01.276 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   22:23:03.139 WARN org.apache.spark.sql.execution.streaming.ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled.
   [info] - test ReportSinkMetrics with useCommitCoordinator=true (2 seconds, 709 milliseconds)
   22:23:04.522 WARN org.apache.spark.sql.execution.streaming.ResolveWriteToStream: spark.sql.adaptive.enabled is not supported in streaming DataFrames/Datasets and will be disabled.
   [info] - test ReportSinkMetrics with useCommitCoordinator=false (373 milliseconds)
   22:23:04.941 WARN org.apache.spark.sql.streaming.ReportSinkMetricsSuite:
   
   ===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.streaming.ReportSinkMetricsSuite, threads: ForkJoinPool.commonPool-worker-19 (daemon=true), rpc-boss-3-1 (daemon=true), shuffle-boss-6-1 (daemon=true) =====
   [info] Run completed in 4 seconds, 934 milliseconds.
   [info] Total number of tests run: 2
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 2, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] anishshri-db commented on pull request #40600: [SPARK-42968][SS] Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on PR #40600:
URL: https://github.com/apache/spark/pull/40600#issuecomment-1489721186

   @HeartSaVioR - PTAL when you get a chance. Thx
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR closed pull request #40600: [SPARK-42968][SS] Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR closed pull request #40600: [SPARK-42968][SS] Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks
URL: https://github.com/apache/spark/pull/40600


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #40600: [SPARK-42968][SS] Add option to skip commit coordinator as part of StreamingWrite API for DSv2 sources/sinks

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #40600:
URL: https://github.com/apache/spark/pull/40600#issuecomment-1490244143

   Thanks! Merging to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org