Posted to issues@spark.apache.org by "Karthik Palaniappan (JIRA)" <ji...@apache.org> on 2018/07/16 01:23:00 UTC

[jira] [Created] (SPARK-24815) Structured Streaming should support dynamic allocation

Karthik Palaniappan created SPARK-24815:
-------------------------------------------

             Summary: Structured Streaming should support dynamic allocation
                 Key: SPARK-24815
                 URL: https://issues.apache.org/jira/browse/SPARK-24815
             Project: Spark
          Issue Type: Improvement
          Components: Scheduler, Structured Streaming
    Affects Versions: 2.3.1
            Reporter: Karthik Palaniappan


Dynamic allocation is very useful for adding and removing containers to match the actual workload. On multi-tenant clusters, it ensures that a Spark job takes no more resources than necessary. In cloud environments, it enables autoscaling.

However, if you set spark.dynamicAllocation.enabled=true and run a structured streaming job, Core's dynamic allocation algorithm kicks in: it requests executors when the task backlog exceeds a certain size and removes executors when they have been idle for a certain period of time.
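To make the current behavior concrete, here is a minimal sketch (Scala) of the Core properties that take effect today even for a streaming query; the values are illustrative, not recommendations:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("structured-streaming-with-core-dra")
      // Core's dynamic allocation: executor-count decisions are based on
      // task backlog and executor idle time, not on streaming progress.
      .config("spark.dynamicAllocation.enabled", "true")
      // Request executors once tasks have been backlogged this long...
      .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
      // ...and release executors after they sit idle this long, which can
      // deallocate everything between sparse micro-batches.
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
      .config("spark.dynamicAllocation.minExecutors", "1")
      .config("spark.dynamicAllocation.maxExecutors", "20")
      .getOrCreate()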

This does not make sense for streaming jobs, as outlined in SPARK-12133 (https://issues.apache.org/jira/browse/SPARK-12133), which introduced a separate dynamic allocation mechanism for the old DStreams API.
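For comparison, SPARK-12133 added a DStreams-only allocation manager driven by the properties sketched below; they have no effect on structured streaming. It periodically compares the average batch processing time to the batch interval and scales up or down around two ratios (values shown are illustrative):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.streaming.dynamicAllocation.enabled", "true")
      // How often (seconds) the manager re-evaluates the executor count.
      .set("spark.streaming.dynamicAllocation.scalingInterval", "60")
      // Scale up when avg batch time / batch interval >= this ratio...
      .set("spark.streaming.dynamicAllocation.scalingUpRatio", "0.9")
      // ...and scale down when it falls to or below this ratio.
      .set("spark.streaming.dynamicAllocation.scalingDownRatio", "0.3")
      .set("spark.streaming.dynamicAllocation.minExecutors", "1")
      .set("spark.streaming.dynamicAllocation.maxExecutors", "20")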

First, Spark should print a warning if you run a structured streaming job while Core's dynamic allocation is enabled.
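As a purely hypothetical illustration (the placement and the logWarning helper are assumptions here, not existing Spark internals), the check could be as simple as:

    // Hypothetical placement at query start; logWarning is assumed to come
    // from a Logging-style trait in the surrounding class.
    if (spark.conf.get("spark.dynamicAllocation.enabled", "false").toBoolean) {
      logWarning("spark.dynamicAllocation.enabled is set, but Core dynamic " +
        "allocation does not understand streaming workloads: executors are " +
        "added on task backlog and removed on idle time, which fits poorly " +
        "with micro-batch execution.")
    }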

Second, structured streaming should have its own support for dynamic allocation. It would be convenient if it reused the same set of properties as Core's dynamic allocation, but I don't have a strong opinion on that.
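To sketch what I mean, purely hypothetically (the class name, its parameters, and the ratios below are made up, loosely mirroring the SPARK-12133 heuristic), a StreamingQueryListener could compare the micro-batch duration to the trigger interval and resize through the existing developer APIs on SparkContext:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.StreamingQueryListener
    import org.apache.spark.sql.streaming.StreamingQueryListener._

    // Hypothetical sketch, not an existing feature: scale on the ratio of
    // micro-batch duration to the trigger interval.
    class NaiveStreamingAllocator(spark: SparkSession,
                                  triggerIntervalMs: Long,
                                  scaleUpRatio: Double = 0.9,
                                  scaleDownRatio: Double = 0.3)
        extends StreamingQueryListener {

      override def onQueryStarted(event: QueryStartedEvent): Unit = ()
      override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()

      override def onQueryProgress(event: QueryProgressEvent): Unit = {
        val batchMs = event.progress.durationMs.get("triggerExecution")
        if (batchMs != null) {
          val ratio = batchMs.doubleValue() / triggerIntervalMs
          if (ratio >= scaleUpRatio) {
            // Falling behind: ask Core for one more executor. This is an
            // existing developer API on SparkContext.
            spark.sparkContext.requestExecutors(1)
          } else if (ratio <= scaleDownRatio) {
            // Over-provisioned: a real implementation would pick a victim
            // and call spark.sparkContext.killExecutors(...).
          }
        }
      }
    }

    // Usage: spark.streams.addListener(
    //   new NaiveStreamingAllocator(spark, triggerIntervalMs = 10000L))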

If somebody can give me pointers on how to add dynamic allocation support, I'd be happy to take a stab.


