Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/09/02 09:08:00 UTC

[jira] [Resolved] (SPARK-32776) Limit in streaming should not be optimized away by PropagateEmptyRelation

     [ https://issues.apache.org/jira/browse/SPARK-32776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-32776.
----------------------------------
    Fix Version/s: 3.1.0
                   3.0.1
       Resolution: Fixed

> Limit in streaming should not be optimized away by PropagateEmptyRelation
> -------------------------------------------------------------------------
>
>                 Key: SPARK-32776
>                 URL: https://issues.apache.org/jira/browse/SPARK-32776
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Liwen Sun
>            Priority: Major
>             Fix For: 3.0.1, 3.1.0
>
>
> Currently, the Limit operator in a streaming query can be optimized away when its input relation is empty. This is a problem for stateful streaming: the empty batch then writes no state store files, and the next batch fails with a file-not-found error when it tries to read them.
> We should not let PropagateEmptyRelation optimize away the Limit operator for streaming queries.
> This ticket is intended to apply a small and safe fix to PropagateEmptyRelation. A more fundamental fix, one that prevents the same problem from recurring in other optimizer rules, would be preferable, but that is a much larger task.
>  
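
The behaviour described above can be illustrated with a toy model. This is a minimal sketch in Python, not Spark's Catalyst code: the classes `EmptyRelation`, `Scan`, and `Limit` and the function `propagate_empty` are hypothetical stand-ins invented for illustration. The only idea taken from the ticket is that a Limit over an empty relation may be collapsed for batch plans but should be kept for streaming plans, so that the stateful operator still runs and writes its state.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EmptyRelation:
    """Toy stand-in for an empty input relation (hypothetical)."""
    is_streaming: bool


@dataclass(frozen=True)
class Scan:
    """Toy stand-in for a non-empty source (hypothetical)."""
    name: str
    is_streaming: bool


@dataclass(frozen=True)
class Limit:
    """Toy stand-in for a limit operator over a child plan (hypothetical)."""
    n: int
    child: "Plan"


Plan = Union[EmptyRelation, Scan, Limit]


def propagate_empty(plan: Plan) -> Plan:
    """Collapse Limit over an empty relation, but only for batch plans.

    For a streaming child the Limit node is kept, so that in a real
    engine the stateful operator would still execute on the empty batch.
    """
    if isinstance(plan, Limit):
        child = propagate_empty(plan.child)
        if isinstance(child, EmptyRelation) and not child.is_streaming:
            # Batch case: the limit of nothing is nothing, safe to drop.
            return child
        # Streaming case (or non-empty child): keep the Limit node.
        return Limit(plan.n, child)
    return plan


batch_plan = propagate_empty(Limit(5, EmptyRelation(is_streaming=False)))
stream_plan = propagate_empty(Limit(5, EmptyRelation(is_streaming=True)))
print(batch_plan)
print(stream_plan)
```

In this sketch the batch plan reduces to the bare empty relation, while the streaming plan keeps its Limit node. The guard on `is_streaming` is the whole point; everything else is scaffolding.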


