Posted to user@spark.apache.org by Daniel Du <yu...@usc.edu> on 2018/05/31 08:25:40 UTC

[PySpark Pipeline XGBoost] How to use XGBoost in a PySpark Pipeline

Dear all, 

I want to update my PySpark code. In PySpark, the base model has to be
placed in a pipeline, and the official pipeline demo uses LogisticRegression
as the base model. However, it does not seem possible to use an XGBoost model
in the Pipeline API. How can I use PySpark like this:

from pyspark.ml import Pipeline
from xgboost import XGBClassifier
...
model = XGBClassifier()
model.fit(X_train, y_train)
pipeline = Pipeline(stages=[..., model, ...])

The Pipeline API is convenient to use, so could anybody give me some advice?
Thank you! 

Daniel


