Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2022/07/05 02:49:00 UTC

[jira] [Commented] (SPARK-39592) Asynchronous State Checkpointing in Structured Streaming

    [ https://issues.apache.org/jira/browse/SPARK-39592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562324#comment-17562324 ] 

Hyukjin Kwon commented on SPARK-39592:
--------------------------------------

cc [~kabhwan] FYI

> Asynchronous State Checkpointing in Structured Streaming
> --------------------------------------------------------
>
>                 Key: SPARK-39592
>                 URL: https://issues.apache.org/jira/browse/SPARK-39592
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.3.0
>            Reporter: Boyang Jerry Peng
>            Priority: Major
>
> We can reduce the latency of stateful pipelines in Structured Streaming by making state checkpoints asynchronous.  Checkpointing the state changes of every micro-batch can be one of the major contributors to latency in stateful pipelines.  If state checkpointing is asynchronous, pipeline latency can potentially drop significantly, because checkpointing then contributes less, or nothing at all, to the batch latency.
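
The idea in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's implementation or API: AsyncCheckpointer, run_batches, and the persist callback are hypothetical names invented for illustration, and the "state" is just a per-key counter. Each micro-batch hands a snapshot of its state to a background thread and continues, instead of blocking until the snapshot is written.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


class AsyncCheckpointer:
    """Toy model (hypothetical, not a Spark class): persist state snapshots
    on a background thread so the write is off the batch's critical path."""

    def __init__(self, persist: Callable[[int, Dict[str, int]], None]) -> None:
        self._persist = persist
        # A single worker keeps snapshots persisted in batch order.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None

    def commit(self, batch_id: int, state: Dict[str, int]) -> None:
        # Wait for the previous checkpoint first, so at most one batch
        # is ever un-persisted (and any write error surfaces here).
        if self._pending is not None:
            self._pending.result()
        snapshot = dict(state)  # copy, so later batches cannot mutate it
        self._pending = self._pool.submit(self._persist, batch_id, snapshot)

    def close(self) -> None:
        # Drain the last in-flight checkpoint before shutting down.
        if self._pending is not None:
            self._pending.result()
        self._pool.shutdown(wait=True)


def run_batches(batches: List[List[str]],
                checkpointer: AsyncCheckpointer) -> Dict[str, int]:
    """Count keys across micro-batches, checkpointing after each batch."""
    state: Dict[str, int] = {}
    for batch_id, batch in enumerate(batches):
        for key in batch:
            state[key] = state.get(key, 0) + 1
        # Returns as soon as the snapshot is handed off, not when it is written.
        checkpointer.commit(batch_id, state)
    checkpointer.close()
    return state
```

In this sketch the trade-off is that a failure may lose the most recent, not yet persisted batch, so recovery would have to replay from the last completed checkpoint. How the actual proposal handles recovery is not stated in the issue description.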



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org