Posted to user@spark.apache.org by Sebastian Kuepers <se...@publicispixelpark.de> on 2016/07/04 19:40:03 UTC

pyspark aggregate vectors from onehotencoder

hey,

what is the best practice for aggregating the vectors produced by OneHotEncoder in pyspark?

UDAFs are still not available in python.
is there any way to do it with spark sql?

or do you have to switch to rdds and do it with a reduceByKey for example?
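
to make the reduceByKey idea concrete, here is a minimal sketch of the per-key merge logic. it assumes "aggregate" means an element-wise sum of the one-hot vectors per key, which may not be what is wanted. FakeSparse is a hypothetical stand-in that only mimics the size, indices and values attributes of pyspark's SparseVector, so the sketch runs without spark; to_pair, add_sparse and aggregate_by_key are illustrative names, not part of any spark API.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the same attributes as pyspark's SparseVector
# (size, indices, values), so this sketch runs without a Spark installation.
FakeSparse = namedtuple("FakeSparse", ["size", "indices", "values"])


def to_pair(vec):
    """Convert a sparse vector into a plain (size, {index: value}) pair."""
    return vec.size, {int(i): float(v) for i, v in zip(vec.indices, vec.values)}


def add_sparse(a, b):
    """Element-wise sum of two (size, dict) pairs.

    The operation is associative and commutative, which is what a
    reduceByKey-style reduction needs.
    """
    size_a, vals_a = a
    size_b, vals_b = b
    if size_a != size_b:
        raise ValueError("vector sizes differ")
    merged = dict(vals_a)
    for i, v in vals_b.items():
        merged[i] = merged.get(i, 0.0) + v
    return size_a, merged


def aggregate_by_key(rows):
    """Local imitation of reducing (key, vector) rows with add_sparse."""
    out = {}
    for key, vec in rows:
        pair = to_pair(vec)
        out[key] = add_sparse(out[key], pair) if key in out else pair
    return out


# Assumed toy data: four one-hot vectors of size 4 spread over two keys.
rows = [
    ("u1", FakeSparse(4, [0], [1.0])),
    ("u1", FakeSparse(4, [2], [1.0])),
    ("u1", FakeSparse(4, [0], [1.0])),
    ("u2", FakeSparse(4, [3], [1.0])),
]
result = aggregate_by_key(rows)
```

on a real dataframe with assumed columns "key" and "vec", the same two functions could presumably be used as df.rdd.map(lambda r: (r["key"], to_pair(r["vec"]))).reduceByKey(add_sparse), with the merged pairs converted back to vectors afterwards if needed.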

thanks,
sebastian