Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/01/22 01:25:00 UTC

[jira] [Commented] (SPARK-26679) Deconflict spark.executor.pyspark.memory and spark.python.worker.memory

    [ https://issues.apache.org/jira/browse/SPARK-26679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748307#comment-16748307 ] 

Hyukjin Kwon commented on SPARK-26679:
--------------------------------------

Having two different configurations is fine with me if there is a possible case we can think of.

Do you have any case in mind? If so, I think we can go with renaming. If both are expected to be the same, I think it is better to use {{spark.executor.pyspark.memory}} for both cases.
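
For reference, a minimal sketch of how both settings can be supplied from PySpark today (the app name and the values are placeholders, purely illustrative):

{code:python}
from pyspark.sql import SparkSession

# Illustrative values only; this just shows where each setting is supplied today.
spark = (
    SparkSession.builder
    .appName("pyspark-memory-settings")  # placeholder app name
    # Hard cap on the Python worker memory per executor (added in 2.4.0).
    .config("spark.executor.pyspark.memory", "2g")
    # Threshold at which PySpark's aggregation/sort code starts spilling to disk.
    .config("spark.python.worker.memory", "512m")
    .getOrCreate()
)
{code}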

> Deconflict spark.executor.pyspark.memory and spark.python.worker.memory
> -----------------------------------------------------------------------
>
>                 Key: SPARK-26679
>                 URL: https://issues.apache.org/jira/browse/SPARK-26679
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.4.0
>            Reporter: Ryan Blue
>            Priority: Major
>
> In 2.4.0, spark.executor.pyspark.memory was added to limit the total memory space of a Python worker. There is another RDD setting, spark.python.worker.memory, that controls when Spark decides to spill data to disk. These are currently similar, but not related to one another.
> PySpark should probably use spark.executor.pyspark.memory to limit or default the setting of spark.python.worker.memory, because the latter property controls spilling and should be lower than the total memory limit. Renaming spark.python.worker.memory would also improve clarity, because it sounds like it should control the limit but is actually more like the JVM setting spark.memory.fraction.
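
As a rough sketch of the defaulting rule proposed in the issue above (not actual Spark code; the helper names and the 0.6 fraction are arbitrary placeholders, loosely analogous to spark.memory.fraction):

{code:python}
def _parse_mb(value):
    """Turn strings like '512m' or '2g' into megabytes (sketch only)."""
    units = {"k": 1.0 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024}
    suffix = value.strip().lower()[-1]
    if suffix not in units:
        raise ValueError("expected a size suffix such as k/m/g/t: %r" % value)
    return int(float(value.strip()[:-1]) * units[suffix])

def effective_worker_memory_mb(conf):
    """conf is a plain dict of Spark settings, used here only for illustration."""
    explicit = conf.get("spark.python.worker.memory")
    if explicit is not None:
        return _parse_mb(explicit)          # user set the spill threshold directly
    total = conf.get("spark.executor.pyspark.memory")
    if total is not None:
        return int(_parse_mb(total) * 0.6)  # keep spilling well below the hard cap
    return 512                              # current default when nothing is set

# Example: with only the hard limit set, the spill threshold is derived from it.
print(effective_worker_memory_mb({"spark.executor.pyspark.memory": "2g"}))  # 1228
{code}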



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org