Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2020/06/22 20:19:00 UTC

[jira] [Commented] (SPARK-32050) GBTClassifier not working with OnevsRest

    [ https://issues.apache.org/jira/browse/SPARK-32050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142387#comment-17142387 ] 

L. C. Hsieh commented on SPARK-32050:
-------------------------------------

I think this was fixed in SPARK-27007.

> GBTClassifier not working with OnevsRest
> ----------------------------------------
>
>                 Key: SPARK-32050
>                 URL: https://issues.apache.org/jira/browse/SPARK-32050
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.0
>         Environment: spark 2.4.0
>            Reporter: Raghuvarran V H
>            Priority: Minor
>
> I am trying to use GBTClassifier for multiclass classification with OneVsRest.
>  
> {code:python}
> from pyspark.ml.classification import MultilayerPerceptronClassifier, OneVsRest, GBTClassifier
> from pyspark.ml import Pipeline, PipelineModel
> gbt = GBTClassifier(featuresCol='features', labelCol='label', predictionCol='prediction', maxDepth=5,
>                     maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False,
>                     checkpointInterval=10, lossType='logistic', maxIter=20, stepSize=0.1, seed=None,
>                     subsamplingRate=1.0, featureSubsetStrategy='auto')
> classifier = OneVsRest(featuresCol='features', labelCol='label', predictionCol='prediction',
>                        classifier=gbt, weightCol=None, parallelism=1)
> # str_indxr, ohe, vecAssembler and normalizer are earlier pipeline stages not shown in this report
> pipeline = Pipeline(stages=[str_indxr, ohe, vecAssembler, normalizer, classifier])
> model = pipeline.fit(train_data)
> {code}
>  
>  
> When I try this I get this error:
> /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark/python/pyspark/ml/classification.py in _fit(self, dataset)
>  1800 classifier = self.getClassifier()
>  1801 assert isinstance(classifier, HasRawPredictionCol),\
>  -> 1802 "Classifier %s doesn't extend from HasRawPredictionCol." % type(classifier)
>  1803 
>  1804 numClasses = int(dataset.agg({labelCol: "max"}).head()["max("+labelCol+")"]) + 1
> AssertionError: Classifier <class 'pyspark.ml.classification.GBTClassifier'> doesn't extend from HasRawPredictionCol.
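
The assertion in the traceback above is an isinstance check against a mixin class, so it fails for any classifier whose class hierarchy does not include HasRawPredictionCol. A minimal sketch of that pattern follows. The classes and helper functions here are hypothetical stand-ins written for illustration; they are not the PySpark classes, and only the assert statement mirrors the traceback.

```python
# Stand-in classes for illustration only; these are NOT the PySpark classes.
class HasRawPredictionCol:
    """Hypothetical mixin marking estimators that expose a raw prediction column."""


class WithRawPrediction(HasRawPredictionCol):
    """Hypothetical classifier that inherits the mixin."""


class WithoutRawPrediction:
    """Hypothetical classifier that does not inherit the mixin."""


def check_classifier(classifier):
    # Same shape as the assertion shown in the traceback above.
    assert isinstance(classifier, HasRawPredictionCol), \
        "Classifier %s doesn't extend from HasRawPredictionCol." % type(classifier)
    return True


def fails_check(classifier):
    # Report whether the check rejects the classifier instead of raising.
    try:
        check_classifier(classifier)
    except AssertionError:
        return True
    return False
```

With these stand-ins, check_classifier accepts a WithRawPrediction instance and raises AssertionError for a WithoutRawPrediction instance, which is the same kind of failure the report shows for GBTClassifier on Spark 2.4.0.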



--
This message was sent by Atlassian Jira
(v8.3.4#803005)