Posted to issues@spark.apache.org by "Anish Shrigondekar (Jira)" <ji...@apache.org> on 2022/04/05 05:14:00 UTC

[jira] [Commented] (SPARK-38787) Possible correctness issue on stream-stream join when handling edge case

    [ https://issues.apache.org/jira/browse/SPARK-38787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517217#comment-17517217 ] 

Anish Shrigondekar commented on SPARK-38787:
--------------------------------------------

CC - [~kabhwan] 

> Possible correctness issue on stream-stream join when handling edge case
> ------------------------------------------------------------------------
>
>                 Key: SPARK-38787
>                 URL: https://issues.apache.org/jira/browse/SPARK-38787
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.2.1
>            Reporter: Anish Shrigondekar
>            Priority: Major
>
> There was an NPE issue in stream-stream join. SPARK-35659 fixed the issue “partially”: part of the fix is to ignore a null value at the last index when swapping elements in the list, so that the null value at the last index is effectively dropped. If the null is caused by numValues being out of sync with the actual number of elements, this effectively works as a correction.
> Unfortunately, this opens the possibility of another “correctness” issue. The reason we swap in the value from the last index is to remove the value at the current index. Doing nothing in that case means “we don’t remove the value at the current index”, whereas the caller expects the value to have been dropped. For outer joins, the value may even be emitted as left/right null-joined output and then be re-evaluated and emitted again.
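
To make the scenario in the description concrete, here is a minimal toy model in Python. It is a sketch only, not Spark's actual state store code: the dict-based store and the names remove_at, values and num_values are hypothetical and merely mimic "remove by swapping in the value at the last index, and skip the swap when that value is null".

```python
def remove_at(values, num_values, index):
    """Toy 'remove by swap' over an index -> value dict.

    Returns the new value count. Hypothetical helper, not a Spark API.
    """
    last = num_values - 1
    if index != last:
        last_value = values.get(last)
        if last_value is not None:
            # Normal case: overwrite the removed slot with the last value.
            values[index] = last_value
        # Otherwise the null at the last index is ignored and the slot at
        # `index` keeps its old value, i.e. nothing is actually removed.
    values.pop(last, None)
    return num_values - 1


# In sync: removing index 0 really drops "a".
ok_values = {0: "a", 1: "b", 2: "c"}
ok_count = remove_at(ok_values, 3, 0)

# Out of sync: the count says 3, but index 2 holds no value.
bad_values = {0: "a", 1: "b"}
bad_count = remove_at(bad_values, 3, 0)

print(ok_values, ok_count)    # {0: 'c', 1: 'b'} 2
print(bad_values, bad_count)  # {0: 'a', 1: 'b'} 2
```

In the out-of-sync run the count is corrected to 2, but "a" is still stored at index 0 even though the caller asked for it to be removed. That is the situation in which the value could be evaluated, and possibly emitted, a second time.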


