Posted to user@spark.apache.org by "Carlo.Allocca" <ca...@open.ac.uk> on 2017/06/25 19:37:42 UTC

How to Fill Sparse Data With the Previous Non-Empty Value in a Spark SQL Dataset

Dear All,

I need to apply a Dataset transformation that replaces null values with the previous non-null value.
As an example, consider the following:

from:

id | col1
---+-----
1  | null
1  | null
2  | 4
2  | null
2  | null
3  | 5
3  | null
3  | null

to:

id | col1
---+-----
1  | null
1  | null
2  | 4
2  | 4
2  | 4
3  | 5
3  | 5
3  | 5

I am using Spark SQL 2 and the Dataset API.

I searched on Google, but I only found solutions in the context of SQL databases, e.g. https://blog.jooq.org/2015/12/17/how-to-fill-sparse-data-with-the-previous-non-empty-value-in-sql/

Could anyone help me implement this in Spark? I understand that I should use Window functions and lag, but I cannot put them together.
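
For reference, a minimal sketch of how the Window approach might be put together in Scala. It rests on several assumptions that are not in the data above: a hypothetical `seq` column that defines the row order (without some ordering column, "previous" is not well defined in Spark), filling only within the same `id`, and the helper name `forwardFill`. It uses `last(..., ignoreNulls = true)` over a running window rather than `lag`, because `lag` only looks back a fixed number of rows and so would not fill a run of consecutive nulls.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, last}

object ForwardFill {
  // Hypothetical helper: carries the last non-null `valueCol` forward
  // within each `idCol` group. `orderCol` is an ASSUMED column that
  // defines what "previous" means; the example tables do not have one.
  def forwardFill(df: DataFrame, idCol: String, orderCol: String, valueCol: String): DataFrame = {
    val w = Window
      .partitionBy(col(idCol))
      .orderBy(col(orderCol))
      // Frame: unbounded preceding .. current row.
      .rowsBetween(Long.MinValue, 0L)

    // last(..., ignoreNulls = true) returns the most recent non-null value
    // in the frame, or null if the frame has none.
    df.withColumn(valueCol, last(col(valueCol), ignoreNulls = true).over(w))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("forward-fill-sketch")
      .getOrCreate()
    import spark.implicits._

    // The "from" table, plus the assumed `seq` ordering column.
    val df = Seq[(Int, Int, Option[Int])](
      (1, 1, None), (1, 2, None),
      (2, 1, Some(4)), (2, 2, None), (2, 3, None),
      (3, 1, Some(5)), (3, 2, None), (3, 3, None)
    ).toDF("id", "seq", "col1")

    forwardFill(df, "id", "seq", "col1").orderBy("id", "seq").show()
    spark.stop()
  }
}
```

Under these assumptions the rows for id 1 stay null, since no earlier non-null value exists in that group, which is consistent with the "to" table. If values should instead carry over across different ids, the `partitionBy` call could be dropped, at the cost of moving all rows into a single partition. On Spark 2.1 and later, `Window.unboundedPreceding` and `Window.currentRow` can be used in place of the literal frame bounds.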


Thank you in advance for your help.

Best Regards,
Carlo