Posted to issues@spark.apache.org by "Stanislav Chernichkin (JIRA)" <ji...@apache.org> on 2017/06/10 09:42:18 UTC
[jira] [Updated] (SPARK-21037) ignoreNulls does not working properly with window functions
[ https://issues.apache.org/jira/browse/SPARK-21037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stanislav Chernichkin updated SPARK-21037:
------------------------------------------
Description:
The following code reproduces the issue:
// partitionBy is not in scope by default; it comes from the Window object
import org.apache.spark.sql.expressions.Window.partitionBy
spark
  .sql("select 0 as key, null as value, 0 as order union select 0 as key, 'value' as value, 1 as order")
  .select($"*", first($"value", true).over(partitionBy($"key").orderBy("order")).as("first_value"))
  .show()
Since the documentation claims that the {{first}} function will return the first non-null result, I expect to get:
+---+-----+-----+-----------+
|key|value|order|first_value|
+---+-----+-----+-----------+
|  0| null|    0|      value|
|  0|value|    1|      value|
+---+-----+-----+-----------+
But the actual result is:
+---+-----+-----+-----------+
|key|value|order|first_value|
+---+-----+-----+-----------+
|  0| null|    0|       null|
|  0|value|    1|      value|
+---+-----+-----+-----------+
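A possible explanation, offered as a sketch rather than a confirmed diagnosis: when a window specification has an ordering but no explicit frame, Spark uses a growing frame from the start of the partition up to the current row. Under that frame the first row sees only its own null value, so there is no non-null value for {{first}} to return. The snippet below assumes the intended frame is the whole partition and sets it explicitly with rowsBetween. The DataFrame is built with toDF instead of SQL, and the names fullPartition, withFirst and result are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.first

// Local session for the sketch; in spark-shell this returns the existing session.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("SPARK-21037-sketch")
  .getOrCreate()
import spark.implicits._

// Same two rows as in the report, built without SQL.
val df = Seq(
  (0, Option.empty[String], 0),
  (0, Option("value"), 1)
).toDF("key", "value", "order")

// Assumption: the frame the reporter has in mind is the whole partition,
// not the default frame that ends at the current row.
val fullPartition = Window
  .partitionBy($"key")
  .orderBy($"order")
  .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)

val withFirst = df.select(
  $"*",
  first($"value", ignoreNulls = true).over(fullPartition).as("first_value")
)

withFirst.orderBy($"order").show()

// Collect the computed column so it can be inspected programmatically.
val result: Array[String] = withFirst.select($"first_value").as[String].collect()

spark.stop()
```

With the frame widened this way, every row in the partition can see the non-null row, so first_value should be non-null on both rows. Window.unboundedPreceding and Window.unboundedFollowing are available from Spark 2.1.0, which matches the affected versions listed for this issue.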
was:
The following code reproduces the issue:
// partitionBy is not in scope by default; it comes from the Window object
import org.apache.spark.sql.expressions.Window.partitionBy
spark
  .sql("select 0 as key, null as value, 0 as order union select 0 as key, 'value' as value, 1 as order")
  .select($"*", first($"value", true).over(partitionBy($"key").orderBy("order")).as("first_value"))
  .show()
Since the documentation claims that the {{first}} function will return the first non-null result, I expect to get:
+---+-----+-----+-----------+
|key|value|order|first_value|
+---+-----+-----+-----------+
|  0| null|    0|      value|
|  0|value|    1|      value|
+---+-----+-----+-----------+
But the actual result is:
+---+-----+-----+-----------+
|key|value|order|first_value|
+---+-----+-----+-----------+
|  0| null|    0|       null|
|  0|value|    1|      value|
+---+-----+-----+-----------+
> ignoreNulls does not working properly with window functions
> -----------------------------------------------------------
>
> Key: SPARK-21037
> URL: https://issues.apache.org/jira/browse/SPARK-21037
> Project: Spark
> Issue Type: Bug
> Components: Optimizer
> Affects Versions: 2.1.0, 2.1.1
> Reporter: Stanislav Chernichkin
>
> The following code reproduces the issue:
> // partitionBy is not in scope by default; it comes from the Window object
> import org.apache.spark.sql.expressions.Window.partitionBy
> spark
>   .sql("select 0 as key, null as value, 0 as order union select 0 as key, 'value' as value, 1 as order")
>   .select($"*", first($"value", true).over(partitionBy($"key").orderBy("order")).as("first_value"))
>   .show()
> Since the documentation claims that the {{first}} function will return the first non-null result, I expect to get:
> +---+-----+-----+-----------+
> |key|value|order|first_value|
> +---+-----+-----+-----------+
> |  0| null|    0|      value|
> |  0|value|    1|      value|
> +---+-----+-----+-----------+
> But the actual result is:
> +---+-----+-----+-----------+
> |key|value|order|first_value|
> +---+-----+-----+-----------+
> |  0| null|    0|       null|
> |  0|value|    1|      value|
> +---+-----+-----+-----------+