You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by liyezhang556520 <gi...@git.apache.org> on 2014/08/11 16:37:21 UTC

[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

GitHub user liyezhang556520 opened a pull request:

    https://github.com/apache/spark/pull/1892

    [SPARK-1777 (partial)] bugfix: make size of requested memory correctly

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liyezhang556520/spark lazy_memory_request

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1892.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1892
    
----
commit 335ab61adc41f49dd53cbec2d586be423ccd2474
Author: Zhang, Liye <li...@intel.com>
Date:   2014-08-11T14:34:46Z

    [SPARK-1777 (partial)] bugfix: make size of requested memory correctly

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52002658
  
    Ah I see. `memoryThreshold` is the amount of memory we currently hold in the global memory map. To end up at `currentSize * memoryGrowthFactor`, we need a delta of `currentSize * memoryGrowthFactor - memoryThreshold`. This makes sense.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1892#discussion_r16154792
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala ---
    @@ -254,7 +254,7 @@ private[spark] class MemoryStore(blockManager: BlockManager, maxMemory: Long)
                   }
                 }
                 // New threshold is currentSize * memoryGrowthFactor
    -            memoryThreshold = currentSize + amountToRequest
    +            memoryThreshold += amountToRequest
    --- End diff --
    
    Looks like this becomes `currentSize * memoryGrowthFactor` if you do the math. Why not just set it to this value? (Then you can remove the comment above)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1892


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by liyezhang556520 <gi...@git.apache.org>.
Github user liyezhang556520 commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-51861318
  
    Hi @andrewor14, 
    Yes, after requesting, `memoryThreshold` is supposed to be `currentSize * memoryGrowthFactor`. And the memory we have now is equals to `memoryThreshold` instead of the `currentSize`, so the delta between what we will have and what we have now should be `currentSize * memoryGrowthFactor - memoryThreshold`, if you are using `currentSize * (memoryGrowthThreshold - 1)` as the delta, will memory size `currentSize - memoryThreshold` missed (not calculated in memory request by the current thread in `unrollMemoryMap`)? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-51788524
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52005609
  
    QA results for PR 1892:<br>- This patch PASSES unit tests.<br>- This patch merges cleanly<br>- This patch adds no public classes<br><br>For more information see test ouptut:<br>https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18404/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by liyezhang556520 <gi...@git.apache.org>.
Github user liyezhang556520 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1892#discussion_r16154965
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala ---
    @@ -254,7 +254,7 @@ private[spark] class MemoryStore(blockManager: BlockManager, maxMemory: Long)
                   }
                 }
                 // New threshold is currentSize * memoryGrowthFactor
    -            memoryThreshold = currentSize + amountToRequest
    +            memoryThreshold += amountToRequest
    --- End diff --
    
    I think it's better use `memoryThreshold += amountToRequest` than `memoryThreshold = currentSize * memoryGrowthFactor`, because `memoryThreshold += amountToRequest` is more obvious for coder reader how `memoryThreshold` increases, and the comment above is more easy to understand.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52002744
  
    add to whitelist


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-51818571
  
    Hi @liyezhang556520, can you explain your changes?
    
    The idea of the existing code is that after requesting, we will end up at `currentSize * memoryGrowthFactor`, so the delta between what we will have and what we have now is `currentSize * (memoryGrowthThreshold - 1)`. The problem with using `memoryThreshold` instead is that the current size of the buffer may overshoot the threshold by a lot, e.g. if we have one very large item.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52015074
  
    Thanks. Merging in master & branch-1.1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1892#discussion_r16155842
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala ---
    @@ -254,7 +254,7 @@ private[spark] class MemoryStore(blockManager: BlockManager, maxMemory: Long)
                   }
                 }
                 // New threshold is currentSize * memoryGrowthFactor
    -            memoryThreshold = currentSize + amountToRequest
    +            memoryThreshold += amountToRequest
    --- End diff --
    
    Ok, that's also fine


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52005688
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1892#issuecomment-52002989
  
    QA tests have started for PR 1892. This patch merges cleanly. <br>View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18404/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org