Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/12/16 20:38:00 UTC

[jira] [Commented] (SPARK-33814) Provide preferred locations for stateful operations without reported state store locations

    [ https://issues.apache.org/jira/browse/SPARK-33814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250623#comment-17250623 ] 

Apache Spark commented on SPARK-33814:
--------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/30812

> Provide preferred locations for stateful operations without reported state store locations
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-33814
>                 URL: https://issues.apache.org/jira/browse/SPARK-33814
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> Stateful operators in Structured Streaming (SS) provide preferred locations based on the previous batches, if any. However, if there is no previous batch to follow, Spark may schedule stateful tasks in an inefficient distribution. Because stateful operations often need to maintain large state stores, it is better to schedule stateful tasks across all executors.
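
The scheduling idea in the description can be sketched roughly as follows. This is a minimal illustration in plain Python, not the change in the linked pull request and not Spark's API: the function name `assign_preferred_executors`, its parameters, and the round-robin policy are all assumptions made for the example. It keeps a reported state store location when one exists and otherwise spreads the remaining partitions evenly across the known executors.

```python
from typing import Dict, List, Optional


def assign_preferred_executors(
    num_partitions: int,
    executors: List[str],
    reported: Optional[Dict[int, str]] = None,
) -> Dict[int, Optional[str]]:
    """Pick a preferred executor for each stateful partition.

    Hypothetical helper for illustration only; not part of Spark.
    `reported` maps a partition id to the executor that held its
    state store in a previous batch, if any.
    """
    reported = reported or {}
    if not executors:
        # No executors known: express no preference at all.
        return {p: None for p in range(num_partitions)}

    ordered = sorted(executors)  # deterministic order across calls
    live = set(ordered)
    assignment: Dict[int, Optional[str]] = {}
    next_slot = 0
    for p in range(num_partitions):
        loc = reported.get(p)
        if loc in live:
            # A previous batch reported a location: keep state local.
            assignment[p] = loc
        else:
            # No usable report: spread partitions round-robin (assumed policy).
            assignment[p] = ordered[next_slot % len(ordered)]
            next_slot += 1
    return assignment
```

With no reported locations, which is the case this issue targets, four partitions over two executors would be split two and two instead of possibly landing on a single executor.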



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
