Posted to user@spark.apache.org by jglov <ja...@capsenrobotics.com> on 2016/10/20 17:10:18 UTC

Re: MLlib RandomForest (Spark 2.0) predict a single vector

I would also like to know if there is a way to predict a single vector with
the new spark.ml API. In my case it's because I want to do the prediction
inside a map(), so that I can avoid calling groupByKey() after a flatMap():

*Current code (pyspark):*

# Given 'model', 'rdd', and a function 'split_element' that splits an
# element of the RDD into a list of elements (assuming each element has
# both a key and a value, so that groupByKey will work to merge them later)

split_rdd = rdd.flatMap(split_element)
split_results = model.transform(split_rdd.toDF()).rdd
return split_results.groupByKey()
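
One way I could imagine avoiding the groupByKey() without a per-element
predict is to stay in DataFrames after the flatMap and regroup with
groupBy()/collect_list(). This is just a sketch: it assumes split_element
emits (key, features) tuples and that the model's featuresCol is "features",
which are my assumptions rather than anything from the code above:

from pyspark.sql import functions as F

# flatMap as before, but name the columns explicitly when building the DataFrame
split_df = rdd.flatMap(split_element).toDF(["key", "features"])
# transform() carries the "key" column through, so we can regroup without
# dropping back to the RDD API
predicted = model.transform(split_df)
return predicted.groupBy("key").agg(
    F.collect_list("prediction").alias("predictions"))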

*Desired code:*

split_rdd = rdd.map(split_element)
split_results = split_rdd.map(lambda elem_list: [model.transformOne(elem) for elem in elem_list])
return split_results
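
As far as I can tell there is no public transformOne()/single-vector predict
in spark.ml 2.0, so the closest workaround I know of is to wrap a single
vector in a one-row DataFrame and call the usual transform(). This only works
on the driver (the model is JVM-backed), so it does not solve the inside-map()
case, but it does cover the "predict a single vector" part. It assumes an
active SparkSession 'spark'; the feature values and the "features" column
name are just placeholders:

from pyspark.ml.linalg import Vectors

# one-row DataFrame around the single vector, then the normal transform()
single_df = spark.createDataFrame([(Vectors.dense([0.1, 0.2, 0.3]),)],
                                  ["features"])
prediction = model.transform(single_df).head().prediction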




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Mlib-RandomForest-Spark-2-0-predict-a-single-vector-tp27447p27931.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org