Posted to user@spark.apache.org by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/10/25 18:14:30 UTC

[SPARK MLLIB] could not understand the wrong and inscrutable result of Linear Regression codes

Dear All,
I have a program, shown below, whose result I cannot make sense of. It fits a multiple linear regression model. The weights (coefficients) come out perfect whenever the feature dimension is smaller than 4, but they are wrong every time otherwise. Or should something other than LinearRegressionWithSGD be used here?

import java.util.List;

import com.google.common.collect.Lists;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;
import org.apache.spark.sql.SQLContext;

public class JavaLinearRegression {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("Linear Regression Example");
    JavaSparkContext sc = new JavaSparkContext(conf);
    SQLContext jsql = new SQLContext(sc);

    // Ax = b: x = [1, 2, 3, 4] is the only weight vector that fits these points.
    // The multiple linear model is x1 + 2 * x2 + 3 * x3 + 4 * x4 = y.
    List<LabeledPoint> localTraining = Lists.newArrayList(
        new LabeledPoint(30.0, Vectors.dense(1.0, 2.0, 3.0, 4.0)),
        new LabeledPoint(29.0, Vectors.dense(0.0, 2.0, 3.0, 4.0)),
        new LabeledPoint(25.0, Vectors.dense(0.0, 0.0, 3.0, 4.0)),
        new LabeledPoint(16.0, Vectors.dense(0.0, 0.0, 0.0, 4.0)));

    JavaRDD<LabeledPoint> training = sc.parallelize(localTraining).cache();

    // Building the model
    int numIterations = 1000; // this number could be set larger
    final LinearRegressionModel model =
        LinearRegressionWithSGD.train(JavaRDD.toRDD(training), numIterations);

    // The weights are perfect while the dimension of the LabeledPoint is SMALLER than 4.
    // Otherwise the output is always wrong and inscrutable.
    // For instance, one output is
    // Final w: [2.537341836047772E25,-7.744333206289736E24,6.697875883454909E23,-2.6704705246777624E22]
    System.out.print("Final w: " + model.weights() + "\n\n");
  }
}

I would appreciate your kind help or guidance very much.

Thank you!
Zhiliang
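
One possible explanation, not confirmed anywhere in this thread, is that the gradient descent step size is too large for these unscaled features, so the iterates blow up instead of settling on [1, 2, 3, 4]. The sketch below reproduces that effect without Spark. It is plain full-batch gradient descent with a fixed step and no intercept, not Spark's implementation, and the class name GradientDescentSketch, the method fit, and the step sizes 1.0 and 0.05 are assumptions chosen for illustration. (The two-argument train call above uses a default step size of 1.0, as far as I know.)

```java
import java.util.Arrays;

public class GradientDescentSketch {
    // Same four points as in the program above: y = x1 + 2*x2 + 3*x3 + 4*x4.
    static final double[][] FEATURES = {
        {1.0, 2.0, 3.0, 4.0},
        {0.0, 2.0, 3.0, 4.0},
        {0.0, 0.0, 3.0, 4.0},
        {0.0, 0.0, 0.0, 4.0}
    };
    static final double[] LABELS = {30.0, 29.0, 25.0, 16.0};

    // Full-batch gradient descent on the mean squared error,
    // with no intercept and a fixed step size (an illustrative simplification).
    static double[] fit(double[][] x, double[] y, double stepSize, int numIterations) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d]; // start from all zeros
        for (int iter = 0; iter < numIterations; iter++) {
            double[] grad = new double[d];
            for (int i = 0; i < n; i++) {
                // Residual of the current prediction for point i.
                double diff = -y[i];
                for (int j = 0; j < d; j++) {
                    diff += x[i][j] * w[j];
                }
                // Accumulate the averaged gradient.
                for (int j = 0; j < d; j++) {
                    grad[j] += diff * x[i][j] / n;
                }
            }
            for (int j = 0; j < d; j++) {
                w[j] -= stepSize * grad[j];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // A large step overshoots on every iteration and the weights explode.
        System.out.println("step 1.0:  " + Arrays.toString(fit(FEATURES, LABELS, 1.0, 50)));
        // A small step converges toward the exact solution [1, 2, 3, 4].
        System.out.println("step 0.05: " + Arrays.toString(fit(FEATURES, LABELS, 0.05, 20000)));
    }
}
```

If the same thing is happening in Spark, passing a smaller step size to the train overload that accepts one, or rescaling the features before training, would be the first things to try.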

