You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Gengliang Wang (Jira)" <ji...@apache.org> on 2021/08/10 08:06:00 UTC

[jira] [Comment Edited] (SPARK-36463) Prohibit update mode in native support of session window

    [ https://issues.apache.org/jira/browse/SPARK-36463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396434#comment-17396434 ] 

Gengliang Wang edited comment on SPARK-36463 at 8/10/21, 8:05 AM:
------------------------------------------------------------------

[~kabhwan] Thanks for the info. I plan to cut RC1 next week. Please try to finish the PR this week. Thanks!


was (Author: gengliang.wang):
[~kabhwan] Thanks for the info. I plan to cut RC1 next Monday. Please try to finish the PR this week. Thanks!

> Prohibit update mode in native support of session window
> --------------------------------------------------------
>
>                 Key: SPARK-36463
>                 URL: https://issues.apache.org/jira/browse/SPARK-36463
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: Jungtaek Lim
>            Priority: Blocker
>
> The semantic of update mode for native support of session window seems to be broken.
> Strictly saying, it doesn't break the semantic based on our explanation of update mode:
> {quote}
> Update Mode - Only the rows that were updated in the Result Table since the last trigger will be written to the external storage (available since Spark 2.1.1). Note that this is different from the Complete Mode in that this mode only outputs the rows that have changed since the last trigger. If the query doesn’t contain aggregations, it will be equivalent to Append mode.
> {quote}
> But given the grouping key is changing due to the nature of session window, there is no way to "upsert" the output into destination. If end users try to "upsert" the output based on the grouping key, it is high likely that a single session window output will be written into multiple rows across multiple updates.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org