You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by zhichao-li <gi...@git.apache.org> on 2015/04/14 10:39:29 UTC

[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

GitHub user zhichao-li opened a pull request:

    https://github.com/apache/spark/pull/5508

    [SPARK-6897][Streaming]remove volatile from BlockingGenerator.currentBuffer to reduce unnecessary overhead

    currentBuffer has been protected by synchronized, so it would introduce unnecessary overhead for adding an extra volatile modifier here.
    
    Hi @srowen , Found this while walking through the source code, could you pls review if we need to take this since it's only a trivial stuff ? 


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhichao-li/spark volatile

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5508.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5508
    
----
commit 40f0215190f4c5edd42e3dee17356a32cf11a4f0
Author: lisurprise <zh...@intel.com>
Date:   2015-04-14T08:16:08Z

    remove volatile to reduce unnecessary overhead

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92776528
  
    `volatile` may still matter if this were read outside a `synchronized` method here. Although I doubt that would happen, I suppose my instinct is to leave it if there's likely no benefit.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92697977
  
      [Test build #30238 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30238/consoleFull) for   PR 5508 at commit [`40f0215`](https://github.com/apache/spark/commit/40f0215190f4c5edd42e3dee17356a32cf11a4f0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92757494
  
      [Test build #30239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30239/consoleFull) for   PR 5508 at commit [`7052a7a`](https://github.com/apache/spark/commit/7052a7ae7421a7972d22680f36cb1e7719b8ae0e).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92757565
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30239/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by zhichao-li <gi...@git.apache.org>.
Github user zhichao-li commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92773590
  
    >These are slightly different ideas. I agree in practice here that there is no visibility problem that volatile would solve that isn't already taken care of by the memory barrier from synchronized. However I also don't imagine this would cause any measurable overhead from the extra read. Have you measured some improvement? If not, I am not sure this is worth changing since the moment this field becomes read outside synchronized blocks it's technically unsafe again.
    
    Just thinking of it would be a good practice to avoid the usage of `volatile` and  `synchronized` together  
    
    Each small batch currentBuffer would be r/w and it's private too, so even if it's only a tiny overhead it would be much if they are sum up together(haven't done benchmark regarding this though).
    
    This field is a mutable array, so even if it's modified by `volatile`, I guess technically it's not safe to use it in the other place without locking.
    
    I'm fine to stay the same,  given that we always set the batch interval to be no less then 500ms and also there would be no measurable improvement compared with that. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92730223
  
    These are slightly different ideas. I agree in practice here that there is no visibility problem that volatile would solve that isn't already taken care of by the memory barrier from `synchronized`. However I also don't imagine this would cause any measurable overhead from the extra read. Have you measured some improvement? If not, I am not sure this is worth changing since the moment this field becomes read outside `synchronized` blocks it's technically unsafe again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92698353
  
      [Test build #30238 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30238/consoleFull) for   PR 5508 at commit [`40f0215`](https://github.com/apache/spark/commit/40f0215190f4c5edd42e3dee17356a32cf11a4f0).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92698359
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30238/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by zhichao-li <gi...@git.apache.org>.
Github user zhichao-li closed the pull request at:

    https://github.com/apache/spark/pull/5508


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by zhichao-li <gi...@git.apache.org>.
Github user zhichao-li commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-93124531
  
    @srowen , Thanks for the comments. I'm closing it :) . 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92705040
  
      [Test build #30239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30239/consoleFull) for   PR 5508 at commit [`7052a7a`](https://github.com/apache/spark/commit/7052a7ae7421a7972d22680f36cb1e7719b8ae0e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6897][Streaming]remove volatile from Bl...

Posted by zhichao-li <gi...@git.apache.org>.
Github user zhichao-li commented on the pull request:

    https://github.com/apache/spark/pull/5508#issuecomment-92776100
  
    oops... Maybe it's not always true that avoid the use of `volatile` and `synchronized` like double check singleton(although it's a must in that case). Anyway, just post a PR here to see what's the opinion of your guys :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org