Posted to user@spark.apache.org by jglov <ja...@capsenrobotics.com> on 2016/10/20 17:11:49 UTC
Predict a single vector with the new spark.ml API to avoid groupByKey() after a flatMap()?
Is there a way to predict a single vector with the new spark.ml API? In my
case, I want to do this within a map() so that I can avoid calling
groupByKey() after a flatMap():
*Current code (pyspark):*
# Given 'model', 'rdd', and a function 'split_element' that splits an
# element of the RDD into a list of elements (and assuming each element
# has both a value and a key so that groupByKey will work to merge them
# later)
split_rdd = rdd.flatMap(split_element)
split_results = model.transform(split_rdd.toDF()).rdd
return split_results.groupByKey()
*Desired code:*
split_rdd = rdd.map(split_element)
# 'transformOne' is hypothetical: a per-vector predict that I wish existed
split_results = split_rdd.map(
    lambda elem_list: [model.transformOne(elem) for elem in elem_list])
return split_results
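
*Possible workaround (a sketch, not a spark.ml API):*
If the model happens to be a binary linear classifier, one option might be to copy its parameters to plain Python on the driver and score each vector inside the map() without touching the model object. The sketch below assumes a binary logistic regression model, assumes each element produced by 'split_element' is a plain sequence of feature values (any key would have to be carried separately), and uses hypothetical helper names 'make_predict_one' and 'predict_split'. It only mimics the thresholded prediction, not the other output columns of transform().

```python
import math


def make_predict_one(coefficients, intercept, threshold=0.5):
    """Build a pure-Python predictor for a binary logistic model.

    Assumption: 'coefficients' and 'intercept' were read from a fitted
    binary logistic regression model on the driver. The returned closure
    holds only floats, so it does not reference the model itself.
    """
    coefs = [float(c) for c in coefficients]
    bias = float(intercept)

    def predict_one(features):
        # Linear margin: intercept + dot(coefficients, features)
        margin = bias + sum(c * x for c, x in zip(coefs, features))
        # Numerically stable sigmoid (avoids overflow for large |margin|)
        if margin >= 0:
            prob = 1.0 / (1.0 + math.exp(-margin))
        else:
            e = math.exp(margin)
            prob = e / (1.0 + e)
        return 1.0 if prob > threshold else 0.0

    return predict_one


def predict_split(elem_list, predict_one):
    """Score every feature vector in one already-split element list."""
    return [predict_one(elem) for elem in elem_list]


# Intended use on the driver (assumed, requires a fitted binary
# LogisticRegressionModel named 'model'):
#   predict_one = make_predict_one(model.coefficients.toArray(),
#                                  model.intercept)
#   split_results = rdd.map(split_element).map(
#       lambda elems: predict_split(elems, predict_one))
```

Because each input element stays in one record, the results never need to be regrouped, so there is no groupByKey(). This would not carry over to models whose prediction cannot be reproduced from a few parameters.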
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Predict-a-single-vector-with-the-new-spark-ml-API-to-avoid-groupByKey-after-a-flatMap-tp27932.html