Posted to user@spark.apache.org by pseudo oduesp <ps...@gmail.com> on 2016/06/16 13:42:28 UTC

advise please

Hi,
How can I quickly dummy-encode a large set of columns with StringIndexer?
I tested with 89 variables, each with at most 10 distinct values,
and it takes a lot of time.
Thanks

Re: advise please

Posted by pseudo oduesp <ps...@gmail.com>.
Hi,
I use PySpark 1.5.0 on a YARN cluster with 19 nodes, with 200 GB
and 4 cores each (including the driver).
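
For illustration, here is a minimal plain-Python sketch of one possible way around the cost, assuming the slowness comes from fitting each column separately (one pass over the data per column). It is not StringIndexer itself and not a Spark API: the helper names fit_label_indexes and transform_rows and the sample rows are hypothetical. It counts the values of every column in a single pass and, like StringIndexer, gives index 0 to the most frequent label. With 89 columns of at most 10 distinct values, the resulting maps are small enough to hold in memory.

```python
from collections import Counter


def fit_label_indexes(rows, columns):
    """Count the values of every column in ONE pass over rows and
    return {column: {label: index}}, most frequent label first."""
    counts = {c: Counter() for c in columns}
    for row in rows:                  # single pass, all columns at once
        for c in columns:
            counts[c][row[c]] += 1

    indexes = {}
    for c in columns:
        # Descending frequency; ties broken by label, which is an
        # assumption made here only to keep the result deterministic.
        ordered = sorted(counts[c].items(), key=lambda kv: (-kv[1], kv[0]))
        indexes[c] = {label: i for i, (label, _) in enumerate(ordered)}
    return indexes


def transform_rows(rows, indexes):
    """Return copies of rows with a '<column>_idx' field per fitted column."""
    out = []
    for row in rows:
        new_row = dict(row)
        for c, mapping in indexes.items():
            new_row[c + "_idx"] = mapping[row[c]]
        out.append(new_row)
    return out


# Hypothetical sample data standing in for the real table.
rows = [
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "M"},
    {"color": "red", "size": "M"},
    {"color": "red", "size": "M"},
]
indexes = fit_label_indexes(rows, ["color", "size"])
indexed = transform_rows(rows, indexes)
```

On a cluster, the counting step would have to be expressed as one distributed aggregation over all columns rather than a Python loop. Whether that beats per-column fitting on this data is something to measure, not a given.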
