You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jbencook <gi...@git.apache.org> on 2014/12/11 16:53:24 UTC

[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

GitHub user jbencook opened a pull request:

    https://github.com/apache/spark/pull/3679

    [SPARK-2980][mllib] testing the Chi-squared hypothesis test

    This PR tests the pyspark Chi-squared hypothesis test from this commit: c8abddc5164d8cf11cdede6ab3d5d1ea08028708 and moves some of the error messaging in to python.
    
    It is a port of the Scala tests here: [HypothesisTestSuite.scala](https://github.com/apache/spark/blob/master/mllib/src/test/scala/orgapache/spark/mllib/stat/HypothesisTestSuite.scala)
    
    Hopefully, SPARK-2980 can be closed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jbencook/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3679.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3679
    
----
commit 3aeb0d91007960f33076b6e6775944bb9d81ead8
Author: jbencook <jb...@gmail.com>
Date:   2014-12-11T15:44:08Z

    [SPARK-2980][mllib] bringing Chi-squared error messages to the python side

commit a17ee843185bdb1ee96574712450243d112fbce6
Author: jbencook <jb...@gmail.com>
Date:   2014-12-11T15:44:34Z

    [SPARK-2980][mllib] adding unit tests for the pyspark chi-squared test

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4855][mllib] testing the Chi-squared hy...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67119469
  
    @jbencook  Thanks for the updates!  (Your comment about checking for exceptions makes me wonder if you were right before to throw a more meaningful exception than a Py4JJavaError.  But at least the error message includes the Scala-side error message.  If you prefer to catch the Py4JJavaError, identify the issue, and report a simpler error message, that could be nice, though it'd require reinstating some of your code.  Whatever you think best.  I'd say it's good to go.)
    
    CC: @mengxr  LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67095129
  
    I would recommend not testing for invalid input in stat.py as long as it is tested on the Scala side in ChiSqTest.scala.  It will be faster to only test once, and it will still be safe.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67096627
  
      [Test build #546 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/546/consoleFull) for   PR 3679 at commit [`a17ee84`](https://github.com/apache/spark/commit/a17ee843185bdb1ee96574712450243d112fbce6).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4855][mllib] testing the Chi-squared hy...

Posted by jbencook <gi...@git.apache.org>.
Github user jbencook commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67107800
  
    Thanks for the comments @jkbradley.
    
    I revised the JIRA tag to reference the new ticket you added, updated the URL by adding the "/" (oops), and removed the python side input checking. I also removed the tests for `ValueError` exceptions. Although, now that I think of it I might as well check for the `Py4JJavaError` exceptions that should get thrown instead.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67094622
  
    I made a new JIRA for the unit tests.  Could you please swap the JIRA tag for this one?
    [https://issues.apache.org/jira/browse/SPARK-4855]
    Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4855][mllib] testing the Chi-squared hy...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67218225
  
    Merged into master. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-66639796
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67093985
  
    @jbencook  I think that JIRA was supposed to be closed; I'll fix that.  But adding some Python tests will be good---I'll take a look!
    
    Btw, the link for HypothesisTestSuite.scala above has a missing "/"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4855][mllib] testing the Chi-squared hy...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/3679


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4855][mllib] testing the Chi-squared hy...

Posted by jbencook <gi...@git.apache.org>.
Github user jbencook commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67108325
  
    OK, should be good to go now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2980][mllib] testing the Chi-squared hy...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3679#issuecomment-67103266
  
      [Test build #546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/546/consoleFull) for   PR 3679 at commit [`a17ee84`](https://github.com/apache/spark/commit/a17ee843185bdb1ee96574712450243d112fbce6).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org