You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2021/08/10 03:17:00 UTC

[jira] [Created] (SPARK-36463) Prohibit update mode in native support of session window

Jungtaek Lim created SPARK-36463:
------------------------------------

             Summary: Prohibit update mode in native support of session window
                 Key: SPARK-36463
                 URL: https://issues.apache.org/jira/browse/SPARK-36463
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.2.0
            Reporter: Jungtaek Lim


The semantic of update mode for native support of session window seems to be broken.

Strictly saying, it doesn't break the semantic based on our explanation of update mode:

{quote}
Update Mode - Only the rows that were updated in the Result Table since the last trigger will be written to the external storage (available since Spark 2.1.1). Note that this is different from the Complete Mode in that this mode only outputs the rows that have changed since the last trigger. If the query doesn’t contain aggregations, it will be equivalent to Append mode.
{quote}

But given the grouping key is changing due to the nature of session window, there is no way to "upsert" the output into destination. If end users try to "upsert" the output based on the grouping key, it is high likely that a single session window output will be written into multiple rows across multiple updates.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org