Posted to user@spark.apache.org by pseudo oduesp <ps...@gmail.com> on 2016/06/17 08:25:42 UTC

Update a DataFrame inside a function

Hi,
How can I update a DataFrame inside a function?

Why do I need this?

I have to apply StringIndexer multiple times. I tried a Pipeline, but it is
still extremely slow: with 84 columns to index, each having 10 modalities
(distinct values), on a DataFrame with 21 million rows, processing takes
15 hours.

Now I want to try indexing the columns one by one to see whether it makes a
difference. If you have other suggestions, they are welcome.
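
For reference, here is a minimal sketch of what the one-by-one approach could
look like, assuming PySpark (the post does not say which language is used).
A Spark DataFrame is immutable, so a function cannot update it in place; it
can only return a new DataFrame that the caller assigns back. The names
apply_estimators and demo, the "_idx" column suffix, and the toy data are
hypothetical and only for illustration.

```python
def apply_estimators(df, estimators):
    """Fit each estimator on the current DataFrame and keep the transformed result.

    Nothing is modified in place: the local name `df` is rebound to each new
    DataFrame, and the last one is returned to the caller.
    """
    for estimator in estimators:
        model = estimator.fit(df)   # for StringIndexer, this learns the column's labels
        df = model.transform(df)    # returns a new DataFrame with the output column added
    return df


def demo():
    """Tiny local run; needs pyspark installed (SparkSession requires Spark 2.0+)."""
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import StringIndexer

    spark = SparkSession.builder.master("local[1]").appName("index-demo").getOrCreate()
    df = spark.createDataFrame([("a", "x"), ("b", "y"), ("a", "y")], ["c1", "c2"])

    # Hypothetical naming scheme: one "<col>_idx" output column per input column.
    indexers = [StringIndexer(inputCol=c, outputCol=c + "_idx") for c in ["c1", "c2"]]

    df = apply_estimators(df, indexers)  # the caller has to reassign as well
    df.show()
    spark.stop()
```

Calling demo() runs the sketch on a three-row DataFrame. Whether this is any
faster than a Pipeline on the real data is not something the sketch can show;
it only illustrates the reassign-and-return pattern.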

Thanks