Posted to issues@spark.apache.org by "Juliusz Sompolski (Jira)" <ji...@apache.org> on 2023/10/06 10:27:00 UTC

[jira] [Updated] (SPARK-45435) Document that lazy checkpoint may not be a consistent

     [ https://issues.apache.org/jira/browse/SPARK-45435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juliusz Sompolski updated SPARK-45435:
--------------------------------------
    Summary: Document that lazy checkpoint may not be a consistent  (was: Document that lazy checkpoint may cause undeterministm)

> Document that lazy checkpoint may not be a consistent
> -----------------------------------------------------
>
>                 Key: SPARK-45435
>                 URL: https://issues.apache.org/jira/browse/SPARK-45435
>             Project: Spark
>          Issue Type: Documentation
>          Components: Spark Core, SQL
>    Affects Versions: 4.0.0
>            Reporter: Juliusz Sompolski
>            Priority: Major
>              Labels: pull-request-available
>
> Some users may rely on checkpoint to get a consistent snapshot of a Dataset / RDD. The documentation should warn that a lazy checkpoint does not provide this: the checkpoint is computed only at the end of the first action, so the data that the first action used may differ from the checkpointed data because of non-determinism and retries.
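
The behavior described in the issue can be illustrated with a toy model in plain Python. This is only a sketch and does not use Spark's API: the classes LazyCheckpointedDataset and EagerCheckpointedDataset and the helper make_nondeterministic_source are hypothetical names invented for illustration. The model assumes that the checkpoint is written by evaluating the lineage again after the first action has finished, and it stands in for non-determinism and retries with a run counter so that every evaluation visibly returns different rows.

```python
class LazyCheckpointedDataset:
    """Toy model (not Spark's API) of a lazily checkpointed dataset."""

    def __init__(self, compute):
        self._compute = compute   # the "lineage": recomputes the data from scratch
        self._checkpoint = None   # snapshot, filled in only after the first action

    def collect(self):
        if self._checkpoint is not None:
            # Later actions read the materialized checkpoint.
            return list(self._checkpoint)
        # The first action evaluates the lineage directly...
        result = self._compute()
        # ...and the checkpoint is written afterwards by a separate evaluation
        # (an assumption of this model), which may see different data.
        self._checkpoint = self._compute()
        return result


class EagerCheckpointedDataset:
    """Toy model of an eager checkpoint: materialized before any action runs."""

    def __init__(self, compute):
        self._checkpoint = compute()

    def collect(self):
        return list(self._checkpoint)


def make_nondeterministic_source():
    """Return a lineage whose output changes on every evaluation."""
    runs = 0

    def compute():
        nonlocal runs
        runs += 1
        return [f"row-{i}-run{runs}" for i in range(3)]

    return compute


lazy = LazyCheckpointedDataset(make_nondeterministic_source())
first = lazy.collect()    # computed by the first action
second = lazy.collect()   # read back from the checkpoint
third = lazy.collect()    # read back from the checkpoint again

eager = EagerCheckpointedDataset(make_nondeterministic_source())
e1 = eager.collect()
e2 = eager.collect()

print("lazy, first action: ", first)
print("lazy, later actions:", second)
print("eager, every action:", e1)
```

In this model the first action on the lazy variant returns rows that never reappear, while every later action agrees with the checkpoint. The eager variant returns the same rows from the first action onward, which is the consistent-snapshot property some users expect.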



