Posted to user@spark.apache.org by pseudo oduesp <ps...@gmail.com> on 2016/06/08 12:05:15 UTC
Comparing rows in a PySpark DataFrame
Hi,
How can we compare multiple columns in a DataFrame? I mean, if df is a DataFrame like this:

df.col1 | df.col2 | df.col3
   0.2  |   0.3   |   0.4

how can we get the maximum across each row (not across each column), and the name of the column where that maximum occurs?
Thanks
Re: Comparing rows in a PySpark DataFrame
Posted by Jacek Laskowski <ja...@japila.pl>.
On Wed, Jun 8, 2016 at 2:05 PM, pseudo oduesp <ps...@gmail.com> wrote:
> how can we get the maximum across each row (not across each column), and
> the name of the column where that maximum occurs?
First thought - a UDF.
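A minimal sketch of that UDF idea, assuming the columns hold numeric values and the column names are fixed and known up front. The helper name `argmax_column` is hypothetical; the plain function below is the logic that `pyspark.sql.functions.udf` would wrap:

```python
def argmax_column(*values, names=("col1", "col2", "col3")):
    """Return the name of the column holding the row's maximum value.

    `names` must line up positionally with `values`; ties go to the
    leftmost column, because max() keeps the first maximal element.
    """
    # Pair each column name with its value, then pick the pair whose
    # value (second element) is largest.
    best_name, _ = max(zip(names, values), key=lambda pair: pair[1])
    return best_name

print(argmax_column(0.2, 0.3, 0.4))  # col3
```

On a real DataFrame you would register it with something like `udf(argmax_column, StringType())` (from `pyspark.sql.functions` and `pyspark.sql.types`) and call it as `df.withColumn("max_col", the_udf(df.col1, df.col2, df.col3))` -- untested here, so treat it as a sketch.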
Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: Comparing rows in a PySpark DataFrame
Posted by Ted Yu <yu...@gmail.com>.
Do you mean returning col3 and 0.4 for the example row below?
> On Jun 8, 2016, at 5:05 AM, pseudo oduesp <ps...@gmail.com> wrote:
>
> Hi,
> How can we compare multiple columns in a DataFrame? I mean, if df is a DataFrame like this:
>
> df.col1 | df.col2 | df.col3
>    0.2  |   0.3   |   0.4
>
> how can we get the maximum across each row (not across each column), and the name of the column where that maximum occurs?
>
> Thanks