Posted to user@spark.apache.org by "Bui, Tri" <Tr...@VerizonWireless.com.INVALID> on 2014/12/05 23:01:14 UTC

Cannot predictOnValues or predictOn based on the model built with StreamingLinearRegressionWithSGD

Hi,

The following example code builds the correct model.weights, but its predicted values are zero. Am I calling predictOnValues incorrectly? I also coded a batch version based on LinearRegressionWithSGD() with the same training and test data, number of iterations, and step size, and there model.predict gave pretty good results.

I don't know why predictOnValues is coming out zero. Is there another way to predict with StreamingLinearRegressionWithSGD()?

Attached are the test and training data I am using.

The number of iterations and the step size needed for the model to converge are 600 and 0.0001.

    val trainingData = ssc.textFileStream(inp(0)).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(inp(1)).map(LabeledPoint.parse)
    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(inp(3).toInt))
      .setNumIterations(inp(4).toInt)
      .setStepSize(inp(5).toFloat)
    model.algorithm.setIntercept(true)
    model.trainOn(trainingData)
    //model.predictOnValues(testData.map(xp => (xp.label, xp.features))).print()
    model.predictOn(testData.map(xp => (xp.features))).print()
    ssc.start()
    ssc.awaitTermination()
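
For comparison, a batch check of the kind mentioned above could be sketched roughly as follows. This is only an illustrative sketch against the Spark 1.x MLlib API, not the exact code that was run: the object name BatchCheck and the argument layout (train path, test path, number of iterations, step size) are placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

// Hypothetical batch counterpart of the streaming job (names are placeholders).
object BatchCheck {
  def main(args: Array[String]): Unit = {
    // Assumed arguments: <trainPath> <testPath> <numIterations> <stepSize>
    val sc = new SparkContext(new SparkConf().setAppName("BatchCheck"))

    // Read the same text format that LabeledPoint.parse handles in the streaming code.
    val train = sc.textFile(args(0)).map(LabeledPoint.parse).cache()
    val test = sc.textFile(args(1)).map(LabeledPoint.parse)

    // Configure SGD with an intercept, mirroring the streaming settings.
    val algorithm = new LinearRegressionWithSGD()
    algorithm.setIntercept(true)
    algorithm.optimizer
      .setNumIterations(args(2).toInt)
      .setStepSize(args(3).toDouble)

    val model = algorithm.run(train)

    // Print (label, prediction) pairs for the test set.
    test.map(p => (p.label, model.predict(p.features))).collect().foreach(println)

    sc.stop()
  }
}
```

Running something like this on the same files gives a baseline of (label, prediction) pairs to compare against whatever the streaming predictOn or predictOnValues call prints.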

Thanks for the help.
Tri