Posted to user@spark.apache.org by Divya Gehlot <di...@gmail.com> on 2016/08/09 12:34:00 UTC

[Spark 1.6]-increment value column based on condition + Dataframe

Hi,
I have a column with values like:
Value
30
12
56
23
12
16
12
89
12
5
6
4
8

I need to create another column that resets to 0 when col("Value") >= 30
and otherwise increments by 1 from the previous row:
newColValue
0
1
0
1
2
3
4
0
1
2
3
4
5
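
To make the rule concrete, here is a minimal plain-Scala sketch (no Spark) that reproduces the newColValue list above from the Value list. The threshold of 30 and the "reset on >= 30" reading are inferred from the sample data, and the names RunCounterSketch and runCounter are made up for illustration.

```scala
object RunCounterSketch {
  // Reset the counter to 0 when a value reaches the threshold,
  // otherwise add 1 to the previous row's counter.
  // Assumption: the threshold is 30 and the comparison is >=, as the sample data suggests.
  def runCounter(values: Seq[Int], threshold: Int = 30): Seq[Int] =
    values
      // -1 is a seed, so a first row below the threshold starts at 0.
      .scanLeft(-1) { (prev, v) => if (v >= threshold) 0 else prev + 1 }
      // Remove the seed element that scanLeft puts in front.
      .drop(1)

  def main(args: Array[String]): Unit = {
    val values = Seq(30, 12, 56, 23, 12, 16, 12, 89, 12, 5, 6, 4, 8)
    println(runCounter(values).mkString(", "))
  }
}
```

This only illustrates the expected result on a local collection. It says nothing about how to compute it on a distributed DataFrame.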

How can I create such an incrementing column?
The grouping is based on some other columns that are not shown here.
When I try the window sum function, it sums, but instead of incrementing,
the total sum is displayed in all the rows:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum
val overWin = Window.partitionBy('col1, 'col2, 'col3).orderBy('Value)
val total = sum('Value).over(overWin)

With this logic I get the following result:
0
1
0
4
4
4
4
0
5
5
5
5
5

I also wrote my own UDF, but custom UDFs are not supported in window
functions in Spark 1.6.
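
One possible approach that needs only built-in window functions is sketched below, under some assumptions. It assumes a column, here called "seq" (hypothetical, not in the original data), that records the original row order, because ordering by Value itself would reorder the rows. It also assumes the counter resets when Value >= 30, as the sample output suggests. The idea is to take a running sum of a reset flag to label each run, then number the rows inside each run. The explicit rowsBetween frame matters: with orderBy alone, the default frame treats rows with equal ordering values as peers, which can give the same total on several rows. As far as I know, window functions in Spark 1.6 also require a HiveContext.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number, sum, when}

object IncrementColumnSketch {
  // Assumptions: "seq" is a hypothetical column holding the original row order;
  // col1, col2, col3 are the grouping columns mentioned in the post.
  def withNewColValue(df: DataFrame): DataFrame = {
    // 1 on rows that reset the counter (Value >= 30), 0 elsewhere.
    val resetFlag = when(col("Value") >= 30, 1).otherwise(0)

    // Running sum over an explicit ROWS frame: start of partition to current row.
    // Long.MinValue stands for "unbounded preceding" in the 1.6 API.
    val runningWin = Window
      .partitionBy("col1", "col2", "col3")
      .orderBy("seq")
      .rowsBetween(Long.MinValue, 0)

    // runId stays constant within a run and increases by 1 at every reset row.
    val withRun = df.withColumn("runId", sum(resetFlag).over(runningWin))

    // Number the rows inside each run, starting at 0 on the reset row.
    val runWin = Window
      .partitionBy("col1", "col2", "col3", "runId")
      .orderBy("seq")

    withRun
      .withColumn("newColValue", row_number().over(runWin) - 1)
      .drop("runId")
  }
}
```

Calling IncrementColumnSketch.withNewColValue(df) on a DataFrame with those columns should add newColValue without any custom UDF. If no ordering column exists, one would have to be added before this can work, since a window needs a deterministic order.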

Would really appreciate the help.


Thanks,
Divya




Am I missing something?