Posted to reviews@spark.apache.org by MLnick <gi...@git.apache.org> on 2016/06/01 17:41:58 UTC

[GitHub] spark pull request #13353: [SPARK-15605] [ML] [Examples] Fix broken ML JavaD...

Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13353#discussion_r65408086
  
    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaDeveloperApiExample.java ---
    @@ -88,153 +95,156 @@ public static void main(String[] args) throws Exception {
         }
         if (sumPredictions != 0.0) {
           throw new Exception("MyJavaLogisticRegression predicted something other than 0," +
    -          " even though all coefficients are 0!");
    +        " even though all coefficients are 0!");
         }
     
         spark.stop();
       }
    -}
    -
    -/**
    - * Example of defining a type of {@link Classifier}.
    - *
    - * Note: Some IDEs (e.g., IntelliJ) will complain that this will not compile due to
    - *       {@link org.apache.spark.ml.param.Params#set} using incompatible return types.
    - *       However, this should still compile and run successfully.
    - */
    -class MyJavaLogisticRegression
    -  extends Classifier<Vector, MyJavaLogisticRegression, MyJavaLogisticRegressionModel> {
    -
    -  MyJavaLogisticRegression() {
    -    init();
    -  }
    -
    -  MyJavaLogisticRegression(String uid) {
    -    this.uid_ = uid;
    -    init();
    -  }
    -
    -  private String uid_ = Identifiable$.MODULE$.randomUID("myJavaLogReg");
    -
    -  @Override
    -  public String uid() {
    -    return uid_;
    -  }
     
       /**
    -   * Param for max number of iterations
    -   * <p>
    -   * NOTE: The usual way to add a parameter to a model or algorithm is to include:
    -   * - val myParamName: ParamType
    -   * - def getMyParamName
    -   * - def setMyParamName
    +   * Example of defining a type of {@link Classifier}.
    +   *
    +   * Note: Some IDEs (e.g., IntelliJ) will complain that this will not compile due to
    +   *       {@link org.apache.spark.ml.param.Params#set} using incompatible return types.
    +   *       However, this should still compile and run successfully.
        */
    -  IntParam maxIter = new IntParam(this, "maxIter", "max number of iterations");
    -
    -  int getMaxIter() { return (Integer) getOrDefault(maxIter); }
    -
    -  private void init() {
    -    setMaxIter(100);
    -  }
    -
    -  // The parameter setter is in this class since it should return type MyJavaLogisticRegression.
    -  MyJavaLogisticRegression setMaxIter(int value) {
    -    return (MyJavaLogisticRegression) set(maxIter, value);
    -  }
    +  public abstract static class MyJavaLogisticRegression
    +    extends Classifier<Vector, MyJavaLogisticRegression, MyJavaLogisticRegressionModel> {
     
    -  // This method is used by fit().
    -  // In Java, we have to make it public since Java does not understand Scala's protected modifier.
    -  public MyJavaLogisticRegressionModel train(Dataset<?> dataset) {
    -    // Extract columns from data using helper method.
    -    JavaRDD<LabeledPoint> oldDataset = extractLabeledPoints(dataset).toJavaRDD();
    -
    -    // Do learning to estimate the coefficients vector.
    -    int numFeatures = oldDataset.take(1).get(0).features().size();
    -    Vector coefficients = Vectors.zeros(numFeatures); // Learning would happen here.
    -
    -    // Create a model, and return it.
    -    return new MyJavaLogisticRegressionModel(uid(), coefficients).setParent(this);
    -  }
    -
    -  @Override
    -  public MyJavaLogisticRegression copy(ParamMap extra) {
    -    return defaultCopy(extra);
    -  }
    -}
    +    MyJavaLogisticRegression() {
    +      init();
    +    }
     
    -/**
    - * Example of defining a type of {@link ClassificationModel}.
    - *
    - * Note: Some IDEs (e.g., IntelliJ) will complain that this will not compile due to
    - *       {@link org.apache.spark.ml.param.Params#set} using incompatible return types.
    - *       However, this should still compile and run successfully.
    - */
    -class MyJavaLogisticRegressionModel
    -  extends ClassificationModel<Vector, MyJavaLogisticRegressionModel> {
    +    MyJavaLogisticRegression(String uid) {
    +      this.uid_ = uid;
    +      init();
    +    }
     
    -  private Vector coefficients_;
    -  public Vector coefficients() { return coefficients_; }
    +    private String uid_;
    +    public abstract String uid();
    +
    +    /**
    +     * Param for max number of iterations
    +     * <p>
    +     * NOTE: The usual way to add a parameter to a model or algorithm is to include:
    +     * - val myParamName: ParamType
    --- End diff --
    
    Isn't it confusing for a Java example to describe the parameter convention in Scala syntax (`val`, `def`)?
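
For illustration only, here is a rough sketch of how that Javadoc note might read if the convention were spelled out with Java declarations instead. It is not taken from the PR: `ParamConventionSketch`, `MyEstimator`, and the nested `IntParam` are hypothetical stand-ins so the snippet compiles without Spark on the classpath, and the wording of the note is only a suggestion.

```java
import java.util.HashMap;
import java.util.Map;

public class ParamConventionSketch {

  /** Hypothetical stand-in for Spark's IntParam; it only carries a name and a doc string. */
  public static final class IntParam {
    public final String name;
    public final String doc;

    public IntParam(String name, String doc) {
      this.name = name;
      this.doc = doc;
    }
  }

  /** Hypothetical estimator that follows the field/getter/setter convention. */
  public static final class MyEstimator {
    // Simplified parameter storage; the real Params machinery is not modeled here.
    private final Map<IntParam, Integer> values = new HashMap<>();

    /**
     * Param for max number of iterations
     * <p>
     * NOTE: The usual way to add a parameter to a model or algorithm is to include:
     * - a field: IntParam maxIter
     * - a getter: int getMaxIter()
     * - a setter: MyEstimator setMaxIter(int value)
     */
    public final IntParam maxIter = new IntParam("maxIter", "max number of iterations");

    public int getMaxIter() {
      return values.get(maxIter);
    }

    // The setter returns the concrete type so calls can be chained.
    public MyEstimator setMaxIter(int value) {
      values.put(maxIter, value);
      return this;
    }
  }

  public static void main(String[] args) {
    MyEstimator est = new MyEstimator().setMaxIter(100);
    System.out.println(est.getMaxIter());
  }
}
```

The point is only the shape of the three bullet lines in the Javadoc: each one names a Java declaration a reader of the example can find directly below it.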

