Posted to reviews@spark.apache.org by AnthonyTruchet <gi...@git.apache.org> on 2016/07/21 09:19:44 UTC

[GitHub] spark pull request #14299: Ensure broadcasted variables are destroyed even i...

GitHub user AnthonyTruchet opened a pull request:

    https://github.com/apache/spark/pull/14299

    Ensure broadcasted variables are destroyed even in case of exception

    ## What changes were proposed in this pull request?
    
    Ensure broadcasted variables are destroyed even when an exception is thrown.
    
    ## How was this patch tested?
    
    Word2VecSuite was run locally
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/criteo-forks/spark SPARK-16440

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14299.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14299
    
----
commit 4ad38360290d59fdf25a009bc65823553cea9b10
Author: Anthony Truchet <a....@criteo.com>
Date:   2016-07-08T12:54:24Z

    [SPARK-16440][MLlib] Destroy broadcasted variables even on driver
    
    This contribution is done on behalf of Criteo, according to the
    terms of the Apache license.

commit 53911e04f82f58ee936edededea1d8e72bcb4ea8
Author: Anthony Truchet <a....@criteo.com>
Date:   2016-07-19T17:42:26Z

    [SPARK-16440][MLlib] Destroy broadcasted variables in a try finally
    
    This contribution is done on behalf of Criteo, according to the
    terms of the Apache license.

commit 568e4915f6b1c3cd30c1b9796764f543e27f91fc
Author: Anthony Truchet <a....@criteo.com>
Date:   2016-07-21T08:45:44Z

    Merge remote-tracking branch 'apache/master' into SPARK-16440

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by AnthonyTruchet <gi...@git.apache.org>.

Github user AnthonyTruchet commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    I'm fixing this (issues compiling Spark locally delayed me). The whole point is that they are *not* destroyed within the method if an exception is thrown.



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    Merged build finished. Test PASSed.



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    Oh, these broadcasts are already destroyed actually, inside the method. This isn't needed then. I kinda thought we had taken care of most or all of these already.



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by vanzin <gi...@git.apache.org>.

Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    ok to test



[GitHub] spark pull request #14299: [SPARK-16440][MLlib] Ensure broadcasted variables...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14299#discussion_r103420891
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
    @@ -314,6 +315,20 @@ class Word2Vec extends Serializable with Logging {
         val expTable = sc.broadcast(createExpTable())
         val bcVocab = sc.broadcast(vocab)
         val bcVocabHash = sc.broadcast(vocabHash)
    +      do_fit(dataset, sc, expTable, bcVocab, bcVocabHash)
    --- End diff --
    
    Are we missing a try? The formatting should be
    
    ```
    try {
      ...
    } finally {
      ...
    }
    ```
    
    Also use camelCase rather than underscore_naming (doFit, not do_fit).
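
The pattern requested in this review comment can be sketched as follows. This is only an illustration, not the code that was merged: `FakeBroadcast`, `TryFinallySketch`, and the bodies of `fit` and `doFit` are hypothetical stand-ins, used so the example runs without a Spark cluster. The point it shows is that the cleanup in `finally` runs whether the training step returns or throws.

```scala
// Hypothetical stand-in for a Spark broadcast variable (an assumption made so
// that this sketch is runnable on its own). Only `value` and `destroy()` are
// modeled; `isDestroyed` exists purely so the effect can be observed.
final class FakeBroadcast[T](val value: T) {
  private var destroyedFlag = false
  def destroy(): Unit = { destroyedFlag = true }
  def isDestroyed: Boolean = destroyedFlag
}

object TryFinallySketch {
  // Hypothetical training step, named in camelCase as the review asks.
  // It throws on empty input to exercise the failure path.
  def doFit(words: Seq[String], bcVocab: FakeBroadcast[Map[String, Int]]): Int = {
    require(words.nonEmpty, "dataset must not be empty")
    words.count(w => bcVocab.value.contains(w))
  }

  // The broadcast is released in `finally`, so it is destroyed both when
  // doFit returns normally and when it throws.
  def fit(words: Seq[String], bcVocab: FakeBroadcast[Map[String, Int]]): Int = {
    try {
      doFit(words, bcVocab)
    } finally {
      bcVocab.destroy()
    }
  }
}
```

In real Spark code the stand-in would presumably be a `Broadcast[T]` obtained from `sc.broadcast(...)` inside `fit` itself; it is passed in here only so that a caller can check afterwards that `destroy()` was reached.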



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    Yes, but they are still cleaned up (eventually). It can only happen in one place. Unless the exception path is common, I don't know if it's worth changing this everywhere, because for consistency it would really have to apply everywhere, and we declined to do that before.



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14299
  
    Merged to master

