Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2022/12/20 07:40:00 UTC

[jira] [Assigned] (SPARK-39591) SPIP: Asynchronous Offset Management in Structured Streaming

     [ https://issues.apache.org/jira/browse/SPARK-39591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim reassigned SPARK-39591:
------------------------------------

    Assignee: Boyang Jerry Peng

> SPIP: Asynchronous Offset Management in Structured Streaming
> ------------------------------------------------------------
>
>                 Key: SPARK-39591
>                 URL: https://issues.apache.org/jira/browse/SPARK-39591
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.3.0
>            Reporter: Boyang Jerry Peng
>            Assignee: Boyang Jerry Peng
>            Priority: Major
>              Labels: SPIP
>
> Currently in Structured Streaming, at the beginning of every micro-batch, the offset to process up to for the current batch is persisted to durable storage. At the end of every micro-batch, a marker indicating the completion of that micro-batch is persisted to durable storage. For pipelines, such as ones that read from Kafka and write to Kafka, where end-to-end exactly-once is not supported and latency is a concern, we can allow users to configure offset commits to be written asynchronously, so that the commit operation does not contribute to the batch duration, effectively lowering the overall latency of the pipeline.
>  
> SPIP Doc: 
>  
> https://docs.google.com/document/d/1iPiI4YoGCM0i61pBjkxcggU57gHKf2jVwD7HWMHgH-Y/edit?usp=sharing
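
To illustrate the idea described in the issue, here is a minimal, self-contained sketch in plain Python. It is not Spark code and does not reflect the design in the SPIP document: SlowLog, run_batches, and the simulated write delay are all hypothetical names and values chosen for illustration. It only shows why moving the offset and commit writes off the micro-batch loop shortens the time the loop spends per batch.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SlowLog:
    """Stand-in for a durable offset or commit log (hypothetical, not Spark's API)."""

    def __init__(self, write_delay_s):
        self.write_delay_s = write_delay_s
        self.entries = []

    def write(self, batch_id, value):
        time.sleep(self.write_delay_s)  # simulate durable-storage latency
        self.entries.append((batch_id, value))


def run_batches(num_batches, write_delay_s, async_writes):
    """Run a toy micro-batch loop and return (loop_seconds, offsets, commits)."""
    offset_log = SlowLog(write_delay_s)
    commit_log = SlowLog(write_delay_s)
    # A single worker thread keeps log writes in submission order.
    writer = ThreadPoolExecutor(max_workers=1) if async_writes else None
    processed = []
    start = time.perf_counter()
    for batch_id in range(num_batches):
        end_offset = (batch_id + 1) * 10  # offset to process up to (made-up value)
        if writer:
            writer.submit(offset_log.write, batch_id, end_offset)
        else:
            offset_log.write(batch_id, end_offset)
        processed.append(batch_id)  # placeholder for the actual batch work
        if writer:
            writer.submit(commit_log.write, batch_id, "done")
        else:
            commit_log.write(batch_id, "done")
    loop_seconds = time.perf_counter() - start
    if writer:
        writer.shutdown(wait=True)  # flush pending writes before returning
    return loop_seconds, offset_log.entries, commit_log.entries


if __name__ == "__main__":
    sync_seconds, _, _ = run_batches(5, 0.02, async_writes=False)
    async_seconds, _, _ = run_batches(5, 0.02, async_writes=True)
    print(f"sync loop:  {sync_seconds:.3f}s")
    print(f"async loop: {async_seconds:.3f}s")
```

In this sketch both modes end up with the same log contents; the difference is that in the asynchronous mode the loop does not wait for each write. A consequence, stated here as an assumption rather than something taken from the issue, is that a failure may occur before recent offsets are persisted, so some batches could be reprocessed on restart.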



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
