Posted to issues@spark.apache.org by "steven taylor (Jira)" <ji...@apache.org> on 2020/07/08 15:58:00 UTC
[jira] [Created] (SPARK-32232) IllegalArgumentException: MultilayerPerceptronClassifier_... parameter solver given invalid value auto

steven taylor created SPARK-32232:
-------------------------------------

             Summary: IllegalArgumentException: MultilayerPerceptronClassifier_... parameter solver given invalid value auto
                 Key: SPARK-32232
                 URL: https://issues.apache.org/jira/browse/SPARK-32232
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 3.0.0
            Reporter: steven taylor


I believe I have found a bug when loading a MultilayerPerceptronClassificationModel in Spark 3.0.0 (Scala 2.12). I have tested that it does not occur in at least Spark 2.4.3 with Scala 2.11. (I'm not sure whether the Scala version matters.)

 

I am using PySpark on a Databricks cluster and importing the class with "from pyspark.ml.classification import MultilayerPerceptronClassificationModel".

 

When running model = MultilayerPerceptronClassificationModel.load(path) and then model.transform(df), I get the following error: IllegalArgumentException: MultilayerPerceptronClassifier_8055d1368e78 parameter solver given invalid value auto.

 

 

This issue can be easily reproduced by running the example given in the Spark documentation: http://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier

 

Then add save, load and transform statements as follows:

 

from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# Load training data
data = spark.read.format("libsvm")\
    .load("data/mllib/sample_multiclass_classification_data.txt")

# Split the data into train and test
splits = data.randomSplit([0.6, 0.4], 1234)
train = splits[0]
test = splits[1]

# specify layers for the neural network:
# input layer of size 4 (features), two intermediate of size 5 and 4
# and output of size 3 (classes)
layers = [4, 5, 4, 3]

# create the trainer and set its parameters
trainer = MultilayerPerceptronClassifier(maxIter=100, layers=layers, blockSize=128, seed=1234)

# train the model
model = trainer.fit(train)

# compute accuracy on the test set
result = model.transform(test)
predictionAndLabels = result.select("prediction", "label")
evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
print("Test set accuracy = " + str(evaluator.evaluate(predictionAndLabels)))

 

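from pyspark.ml.classification import MultilayerPerceptronClassifier, MultilayerPerceptronClassificationModel

# Save_location: path where the model is written (not defined in this snippet)
model.save(Save_location)
model2 = MultilayerPerceptronClassificationModel.load(Save_location)

# Loading and then transforming produces the IllegalArgumentException
result_from_loaded = model2.transform(test)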
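
To help narrow down where the value auto comes from, here is a minimal sketch that reads the parameters recorded in the saved model's metadata without going through Spark. It makes several assumptions: the model directory is reachable as a local filesystem path, it follows the usual Spark ML writer layout (a metadata/part-* file holding a single JSON line), and that JSON has a paramMap key and, on newer Spark versions, a defaultParamMap key. The helper names read_saved_params and solver_in_metadata are hypothetical and are not part of PySpark.

```python
import glob
import json
import os


def read_saved_params(model_dir):
    """Return (paramMap, defaultParamMap) from a saved Spark ML model directory.

    Assumption: model_dir is a local path and the metadata is stored as one
    JSON line in <model_dir>/metadata/part-*.
    """
    part_files = sorted(glob.glob(os.path.join(model_dir, "metadata", "part-*")))
    if not part_files:
        raise FileNotFoundError("no metadata/part-* file under " + model_dir)
    with open(part_files[0]) as f:
        metadata = json.loads(f.readline())
    # defaultParamMap may be absent in older saved models, so fall back to {}
    return metadata.get("paramMap", {}), metadata.get("defaultParamMap", {})


def solver_in_metadata(model_dir):
    """Report the solver value recorded as explicitly set and as a default."""
    params, defaults = read_saved_params(model_dir)
    return {"set": params.get("solver"), "default": defaults.get("solver")}
```

Calling solver_in_metadata(Save_location) after model.save(Save_location) would show whether solver was written to the saved metadata at all, which may help tell a problem in the saved files apart from one introduced at load or transform time. A model saved to a distributed filesystem would first have to be made available as a local path.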

 


