Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2022/03/11 07:49:00 UTC

[jira] [Created] (SPARK-38522) Strengthen the contract on iterator method in StateStore

Jungtaek Lim created SPARK-38522:
------------------------------------

             Summary: Strengthen the contract on iterator method in StateStore
                 Key: SPARK-38522
                 URL: https://issues.apache.org/jira/browse/SPARK-38522
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 3.3.0
            Reporter: Jungtaek Lim


The root cause of SPARK-38320 was that the logic initialized the iterator first, then performed some updates against the state store, and then iterated through the iterator, expecting all of the intervening updates to be visible through it.

That is not guaranteed by the RocksDB state store, and the contract of Java's ConcurrentHashMap, which is used in HDFSBackedStateStore, does not guarantee it either.

It would be clearer if we updated the contract of the iterator method to draw an explicit line on the behavioral guarantees given to callers, so that callers do not form such an expectation.
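
As a rough illustration of the ordering problem described above, here is a minimal sketch using a plain Java ConcurrentHashMap. It is not Spark's StateStore API: the class name IteratorContractSketch and its helper methods are hypothetical and exist only for this example. The first method creates the iterator before updating, so whether it sees the new keys is left open by ConcurrentHashMap's weakly consistent iterators. The second method applies the updates first and only then creates the iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is NOT Spark's StateStore API.
final class IteratorContractSketch {

    // Fragile ordering: the iterator is created BEFORE the updates.
    // ConcurrentHashMap iterators are weakly consistent, so the returned
    // keys may or may not include "b" and "c".
    static List<String> iterateThenUpdate(ConcurrentHashMap<String, Integer> store) {
        Iterator<Map.Entry<String, Integer>> it = store.entrySet().iterator();
        store.put("b", 2);
        store.put("c", 3);
        return drainKeys(it);
    }

    // Robust ordering: apply every update first, then create the iterator,
    // so the caller does not depend on mid-iteration visibility.
    static List<String> updateThenIterate(ConcurrentHashMap<String, Integer> store) {
        store.put("b", 2);
        store.put("c", 3);
        return drainKeys(store.entrySet().iterator());
    }

    // Collects the keys the iterator yields, in iteration order.
    private static List<String> drainKeys(Iterator<Map.Entry<String, Integer>> it) {
        List<String> keys = new ArrayList<>();
        while (it.hasNext()) {
            keys.add(it.next().getKey());
        }
        return keys;
    }

    // Builds a store holding a single pre-existing entry.
    private static ConcurrentHashMap<String, Integer> newStore() {
        ConcurrentHashMap<String, Integer> store = new ConcurrentHashMap<>();
        store.put("a", 1);
        return store;
    }

    public static void main(String[] args) {
        // The first result is deliberately not predicted here: it is unspecified.
        System.out.println("iterator first: " + iterateThenUpdate(newStore()));
        System.out.println("updates first:  " + updateThenIterate(newStore()));
    }
}
```

Only the second ordering gives a result the caller can rely on. That is the kind of line a strengthened contract would draw: callers should not assume that updates made after obtaining an iterator are reflected in it.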



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
