Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2016/05/29 10:40:42 UTC

[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/13393

    [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use the new ML Vector and Matrix APIs in the ML pipeline based algorithms

    ## What changes were proposed in this pull request?
    
    This PR fixes Python examples to use the new ML Vector and Matrix APIs in the ML pipeline based algorithms.
    
    I first ran the shell command `grep -r "from pyspark.mllib" .` to list the affected files, and then ran each of them.
    Some of the tests in `ml` produced the error message below:
    
    ```
    pyspark.sql.utils.IllegalArgumentException: u'requirement failed: Input type must be VectorUDT but got org.apache.spark.mllib.linalg.VectorUDT@f71b0bce.'
    ```
    
    So I changed them to use the new APIs, in the same way as the Python tests fixed in https://github.com/apache/spark/pull/12627
    
    ## How was this patch tested?
    
    Manually tested for all the examples listed by `grep -r "from pyspark.mllib" .`.
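
The search-and-fix workflow described above could be scripted roughly as follows. This is a hypothetical sketch, not the method used in the PR (which was done by hand): the helper names `find_mllib_imports` and `rewrite_linalg_imports` are made up for illustration, and unlike `grep -r` the search is restricted to `.py` files. Only `pyspark.mllib.linalg` imports are rewritten; other `pyspark.mllib` imports are left alone, because only the ML pipeline examples need the new vector types.

```python
import re
from pathlib import Path

# Matches lines such as "from pyspark.mllib.linalg import Vectors".
MLLIB_LINALG_IMPORT = re.compile(
    r"^(\s*from\s+pyspark\.)mllib(\.linalg\s+import\s+)", re.MULTILINE
)


def find_mllib_imports(root):
    """Yield (path, line_number, line) for each line under `root` that
    contains 'from pyspark.mllib' (a rough stand-in for the grep command)."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if "from pyspark.mllib" in line:
                yield path, number, line.strip()


def rewrite_linalg_imports(source):
    """Return `source` with pyspark.mllib.linalg imports pointed at
    pyspark.ml.linalg; all other text is returned unchanged."""
    return MLLIB_LINALG_IMPORT.sub(r"\g<1>ml\g<2>", source)


if __name__ == "__main__":
    # List candidate files first; review each hit before rewriting anything.
    for path, number, line in find_mllib_imports("."):
        print(f"{path}:{number}: {line}")
```

Each rewritten example would still need to be run by hand, as the PR description notes, since a changed import does not guarantee the example works.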


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-14615

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13393.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13393
    
----
commit 0ddb1f40c0be65dea3fa6372c40fed026eb0ff7c
Author: hyukjinkwon <gu...@gmail.com>
Date:   2016-05-29T10:22:21Z

    Fix Python examples

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222354078
  
    @viirya Thank you for your quick reply on #12627. Please feel free to take this over if there is a lot wrong here.




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    Please let me ping @mengxr again. Thanks!




[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222354345
  
    **[Test build #59588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59588/consoleFull)** for PR 13393 at commit [`0ddb1f4`](https://github.com/apache/spark/commit/0ddb1f40c0be65dea3fa6372c40fed026eb0ff7c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222354105
  
    **[Test build #59588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59588/consoleFull)** for PR 13393 at commit [`0ddb1f4`](https://github.com/apache/spark/commit/0ddb1f40c0be65dea3fa6372c40fed026eb0ff7c).




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    **[Test build #3076 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3076/consoleFull)** for PR 13393 at commit [`0ddb1f4`](https://github.com/apache/spark/commit/0ddb1f40c0be65dea3fa6372c40fed026eb0ff7c).




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    @jkbradley Sure, thanks!




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    Hi @yanboliang, could you take a quick look, please?




[GitHub] spark pull request #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples t...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13393




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    **[Test build #3076 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3076/consoleFull)** for PR 13393 at commit [`0ddb1f4`](https://github.com/apache/spark/commit/0ddb1f40c0be65dea3fa6372c40fed026eb0ff7c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    Hi @mengxr, could you please take a look?




[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222354363
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59588/




[GitHub] spark pull request #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples t...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13393#discussion_r66687946
  
    --- Diff: examples/src/main/python/ml/aft_survival_regression.py ---
    @@ -19,7 +19,7 @@
     
     # $example on$
     from pyspark.ml.regression import AFTSurvivalRegression
    -from pyspark.mllib.linalg import Vectors
    +from pyspark.ml.linalg import Vectors
    --- End diff --
    
    Does this example run for you?  It seems broken (not due to your PR though).  Would you mind checking to identify the last time it worked?
    
    ```
    Traceback (most recent call last):                                              
      File "/Users/josephkb/spark/examples/src/main/python/ml/aft_survival_regression.py", line 49, in <module>
        model = aft.fit(training)
      File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/base.py", line 64, in fit
      File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 213, in _fit
      File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 210, in _fit_java
      File "/Users/josephkb/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
      File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 79, in deco
    pyspark.sql.utils.IllegalArgumentException: u'requirement failed: The number of instances should be greater than 0.0, but got 0.'
    ```




[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222356558
  
    LGTM and cc @mengxr 




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    Well, that test works in 1.6 but fails in branch-2.0.  I'll merge your PR.  Thanks!
    
    I created a JIRA for the bug.  Would you have time to look into it?  [https://issues.apache.org/jira/browse/SPARK-15892]




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    I'll take a look




[GitHub] spark pull request: [SPARK-14615][ML][FOLLOWUP] Fix Python example...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13393#issuecomment-222354362
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #13393: [SPARK-14615][ML][FOLLOWUP] Fix Python examples to use t...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/13393
  
    LGTM except for the broken example, but I don't think that's from this PR.  I'll rerun tests before merging it.

