Posted to reviews@spark.apache.org by hhbyyh <gi...@git.apache.org> on 2015/12/02 11:32:03 UTC

[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

GitHub user hhbyyh opened a pull request:

    https://github.com/apache/spark/pull/10102

    [SPARK-11605] [MLlib] ML 1.6 QA: API: Java compatibility, docs

    jira: https://issues.apache.org/jira/browse/SPARK-11605
    Check Java compatibility for MLlib for this release.
    
    fix: 
    1. `StreamingTest.registerStream` needs a Java-friendly interface.
    
    2. `GradientBoostedTreesModel.computeInitialPredictionAndError` and `GradientBoostedTreesModel.updatePredictionError` have Java compatibility issues. Mark them as `DeveloperApi`.
    
    TBD: 
    `org.apache.spark.mllib.classification.LogisticRegressionModel`
    `public scala.Option<java.lang.Object> getThreshold();` has the wrong return type for Java invocation.
    `SVMModel` has a similar issue.
    
    Yet adding a `scala.Option<java.lang.Double> getThreshold()` would result in an overloading error, because both methods would have the same signature and differ only in return type. Adding a new function with a different name does not seem necessary. Suggest no fix.
    
    cc @jkbradley @feynmanliang 
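
The `getThreshold` point above can be illustrated with a small, Spark-free Java sketch. It assumes `java.util.Optional` as a stand-in for `scala.Option`, and `ThresholdDemo`, `Model`, and `thresholdOrDefault` are hypothetical names that do not exist in Spark. It only shows why a second getter with a narrower element type cannot be added as an overload, and how a Java caller might cast the boxed value instead.

```java
import java.util.Optional;

public class ThresholdDemo {
    // Hypothetical stand-in for a model whose getter is seen from Java as
    // returning an option of Object rather than an option of Double.
    static final class Model {
        private final Double threshold; // null means "no threshold set"

        Model(Double threshold) {
            this.threshold = threshold;
        }

        // Element type is Object, mirroring the signature quoted in the description.
        Optional<Object> getThreshold() {
            return Optional.<Object>ofNullable(threshold);
        }

        // A second getThreshold() returning Optional<Double> cannot be declared here:
        // Java rejects two methods with the same name and parameter list that
        // differ only in return type.
    }

    // Caller-side workaround: cast the boxed value back to Double.
    static double thresholdOrDefault(Model model, double fallback) {
        Optional<Object> raw = model.getThreshold();
        return raw.isPresent() ? (Double) raw.get() : fallback;
    }

    public static void main(String[] args) {
        System.out.println(thresholdOrDefault(new Model(0.5), 0.0));
        System.out.println(thresholdOrDefault(new Model(null), 0.0));
    }
}
```

The cast is unchecked in spirit: it works only because the value stored behind `Object` is in fact a boxed `Double`.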

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hhbyyh/spark javaAPI

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10102
    
----
commit 5550a9723c391ee2d908059b1a36c6c05b327b78
Author: Yuhao Yang <hh...@gmail.com>
Date:   2015-12-02T10:03:18Z

    fix some java compatiblity issues

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162977622
  
    **[Test build #2185 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2185/consoleFull)** for PR 10102 at commit [`5ef6ff8`](https://github.com/apache/spark/commit/5ef6ff832b72cc712f76001667d593bdb54f0458).




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-161264647
  
    **[Test build #47060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47060/consoleFull)** for PR 10102 at commit [`5550a97`](https://github.com/apache/spark/commit/5550a9723c391ee2d908059b1a36c6c05b327b78).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162710956
  
    I'll take a look now




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162808916
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162808918
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47320/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-161264777
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10102#discussion_r46901077
  
    --- Diff: mllib/src/test/java/org/apache/spark/mllib/stat/JavaStatisticsSuite.java ---
    @@ -20,7 +20,9 @@
     import java.io.Serializable;
     
     import java.util.Arrays;
    +import java.util.List;
     
    +import org.apache.spark.SparkConf;
    --- End diff --
    
    organize imports




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162993998
  
    Merging with master and branch-1.6




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162711408
  
    I wouldn't worry about the getThreshold issue.  I haven't seen users complain about it, and they will hopefully switch over to the spark.ml API anyways.
    
    Can you please modify the old registerStream taking tuples to use BinarySample as well?  We should be consistent across Scala and Java.
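
As a rough illustration of why a named sample type is easier to use from Java than a tuple, here is a minimal, Spark-free sketch. `SampleDemo`, `Sample`, and `fromPairs` are hypothetical names chosen for this example; `Sample` only mimics the two fields of `BinarySample` shown in the diff and is not Spark's class, and `Map.Entry` stands in for a Scala tuple.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SampleDemo {
    // Hypothetical Java stand-in for the BinarySample case class in the diff.
    static final class Sample {
        private final boolean isExperiment; // true if the sample is in the experiment group
        private final double value;         // numeric value of the observation

        Sample(boolean isExperiment, double value) {
            this.isExperiment = isExperiment;
            this.value = value;
        }

        boolean getIsExperiment() {
            return isExperiment;
        }

        double getValue() {
            return value;
        }
    }

    // Convert pair-style records into named samples with typed accessors.
    static List<Sample> fromPairs(List<Map.Entry<Boolean, Double>> pairs) {
        List<Sample> out = new ArrayList<Sample>();
        for (Map.Entry<Boolean, Double> pair : pairs) {
            out.add(new Sample(pair.getKey(), pair.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<Boolean, Double>> pairs = new ArrayList<Map.Entry<Boolean, Double>>();
        pairs.add(new SimpleEntry<Boolean, Double>(true, 1.5));
        pairs.add(new SimpleEntry<Boolean, Double>(false, 0.5));
        for (Sample s : fromPairs(pairs)) {
            System.out.println(s.getIsExperiment() + " " + s.getValue());
        }
    }
}
```

With a named type, callers read `getIsExperiment()` and `getValue()` rather than positional tuple fields, which is the consistency across Scala and Java that the comment asks for.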




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-161264779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47060/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10102#discussion_r46431705
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala ---
    @@ -17,13 +17,31 @@
     
     package org.apache.spark.mllib.stat.test
     
    +import scala.beans.BeanInfo
    +
     import org.apache.spark.Logging
     import org.apache.spark.annotation.{Experimental, Since}
    -import org.apache.spark.rdd.RDD
    +import org.apache.spark.streaming.api.java.JavaDStream
     import org.apache.spark.streaming.dstream.DStream
     import org.apache.spark.util.StatCounter
     
     /**
    + * Class that represents the group and value of a sample.
    + *
    + * @param isExperiment if the sample is of the experiment group.
    + * @param value numeric value of the observation.
    + */
    +@Since("1.6.0")
    +@BeanInfo
    +case class BinarySample @Since("1.6.0") (
    +    @Since("1.6.0") isExperiment: Boolean,
    --- End diff --
    
    This can be Int to support multiClass sample data.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10102




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162992057
  
    **[Test build #2185 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2185/consoleFull)** for PR 10102 at commit [`5ef6ff8`](https://github.com/apache/spark/commit/5ef6ff832b72cc712f76001667d593bdb54f0458).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162797184
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162797178
  
    **[Test build #47316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47316/consoleFull)** for PR 10102 at commit [`5077aa7`](https://github.com/apache/spark/commit/5077aa7a8cadd7d14c1c1696876e23e0fd501f54).
     * This patch **fails Java style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `public class JavaQuantileDiscretizerExample`




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10102#discussion_r46898407
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/test/StreamingTest.scala ---
    @@ -17,13 +17,31 @@
     
     package org.apache.spark.mllib.stat.test
     
    +import scala.beans.BeanInfo
    +
     import org.apache.spark.Logging
     import org.apache.spark.annotation.{Experimental, Since}
    -import org.apache.spark.rdd.RDD
    +import org.apache.spark.streaming.api.java.JavaDStream
     import org.apache.spark.streaming.dstream.DStream
     import org.apache.spark.util.StatCounter
     
     /**
    + * Class that represents the group and value of a sample.
    + *
    + * @param isExperiment if the sample is of the experiment group.
    + * @param value numeric value of the observation.
    + */
    +@Since("1.6.0")
    +@BeanInfo
    +case class BinarySample @Since("1.6.0") (
    +    @Since("1.6.0") isExperiment: Boolean,
    --- End diff --
    
    This was not part of the design, but I agree it would be nice someday.  I'll ping @mengxr  since he reviewed the original PRs, but I think we'll keep it as is for now.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162810114
  
    @jkbradley Thanks for the review. Updated.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by hhbyyh <gi...@git.apache.org>.
Github user hhbyyh commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162810065
  
    hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/spark.git
    	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:763)
    	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1012)
    
    need a retest




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162797185
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47316/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162968904
  
    Thank you for updating!  LGTM pending tests.




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-161251850
  
    **[Test build #47060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47060/consoleFull)** for PR 10102 at commit [`5550a97`](https://github.com/apache/spark/commit/5550a9723c391ee2d908059b1a36c6c05b327b78).




[GitHub] spark pull request: [SPARK-11605] [MLlib] ML 1.6 QA: API: Java com...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10102#issuecomment-162796594
  
    **[Test build #47316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47316/consoleFull)** for PR 10102 at commit [`5077aa7`](https://github.com/apache/spark/commit/5077aa7a8cadd7d14c1c1696876e23e0fd501f54).

