Posted to reviews@spark.apache.org by falaki <gi...@git.apache.org> on 2014/07/26 01:06:24 UTC

[GitHub] spark pull request: [Core][SPARK-2696] Reduce default value of spa...

GitHub user falaki opened a pull request:

    https://github.com/apache/spark/pull/1595

    [Core][SPARK-2696] Reduce default value of spark.serializer.objectStreamReset

    The current default value of spark.serializer.objectStreamReset is 10,000.
    When re-partitioning a large file (e.g., 500 MB) of 1 MB records into, say, 64 partitions, the serializer can cache up to 10,000 x 1 MB x 64 ~= 640 GB, which causes out-of-memory errors.

    This patch lowers the default to a more reasonable value (100).
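
A rough sketch of the estimate in the description, for comparing the old and new defaults. It assumes the worst case the description implies: each of the 64 output streams keeps a reference to every record written since its last reset. The function name `worst_case_cached_bytes` is hypothetical, and decimal units (1 MB = 10^6 bytes) are assumed for simplicity.

```python
MB = 10**6  # assumption: decimal megabytes
GB = 10**9  # assumption: decimal gigabytes


def worst_case_cached_bytes(reset_interval, record_bytes, num_streams):
    """Upper bound on retained bytes if every stream holds all records
    written since its last reset (hypothetical helper, not a Spark API)."""
    return reset_interval * record_bytes * num_streams


# Old default (10,000) versus the default proposed in this patch (100).
old = worst_case_cached_bytes(10_000, 1 * MB, 64)
new = worst_case_cached_bytes(100, 1 * MB, 64)

print(old / GB)  # 640.0
print(new / GB)  # 6.4
```

Under the same assumptions, lowering the reset interval by a factor of 100 lowers the bound by the same factor, from roughly 640 GB to roughly 6.4 GB.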

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/falaki/spark objectStreamReset

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1595.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1595
    
----
commit 1aa0df87db69d3c814b827e27673b198acf49edb
Author: Hossein <ho...@databricks.com>
Date:   2014-07-25T22:56:06Z

    Reduce default value of spark.serializer.objectStreamReset

commit 650a935cdd810fe7bbc43555ad126cb2bebaab92
Author: Hossein <ho...@databricks.com>
Date:   2014-07-25T23:05:05Z

    Updated documentation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by mateiz <gi...@git.apache.org>.
GitHub user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1595#issuecomment-50221355
  
    Please update this in docs/configuration.md as well



[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1595



[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1595#issuecomment-50216324
  
    QA results for PR 1595:
    - This patch PASSES unit tests.
    - This patch merges cleanly
    - This patch adds no public classes

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17206/consoleFull



[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by falaki <gi...@git.apache.org>.
GitHub user falaki commented on the pull request:

    https://github.com/apache/spark/pull/1595#issuecomment-50221558
  
    It is already done :)



[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by mateiz <gi...@git.apache.org>.
GitHub user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1595#issuecomment-50226681
  
    Oh sorry, I missed that! Merging it.



[GitHub] spark pull request: [SPARK-2696] Reduce default value of spark.ser...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1595#issuecomment-50214227
  
    QA tests have started for PR 1595. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17206/consoleFull

