Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/07/24 00:49:34 UTC
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25195:
[SPARK-28288][SQL][PYTHON][TESTS] Convert and port 'window.sql' into UDF
test base
URL: https://github.com/apache/spark/pull/25195#discussion_r306589637
##########
File path: sql/core/src/test/resources/sql-tests/results/udf/udf-window.sql.out
##########
@@ -0,0 +1,389 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 23
+
+
+-- !query 0
+CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
+(null, 1L, 1.0D, date("2017-08-01"), timestamp(1501545600), "a"),
+(1, 1L, 1.0D, date("2017-08-01"), timestamp(1501545600), "a"),
+(1, 2L, 2.5D, date("2017-08-02"), timestamp(1502000000), "a"),
+(2, 2147483650L, 100.001D, date("2020-12-31"), timestamp(1609372800), "a"),
+(1, null, 1.0D, date("2017-08-01"), timestamp(1501545600), "b"),
+(2, 3L, 3.3D, date("2017-08-03"), timestamp(1503000000), "b"),
+(3, 2147483650L, 100.001D, date("2020-12-31"), timestamp(1609372800), "b"),
+(null, null, null, null, null, null),
+(3, 1L, 1.0D, date("2017-08-01"), timestamp(1501545600), null)
+AS testData(val, val_long, val_double, val_date, val_timestamp, cate)
+-- !query 0 schema
+struct<>
+-- !query 0 output
+
+
+
+-- !query 1
+SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY val ROWS CURRENT ROW) FROM testData
+ORDER BY cate, val
+-- !query 1 schema
+struct<CAST(udf(cast(val as string)) AS INT):int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY val ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>
+-- !query 1 output
+NULL NULL 0
+3 NULL 1
+NULL a 0
+1 a 1
+1 a 1
+2 a 1
+1 b 1
+2 b 1
+3 b 1
+
+
+-- !query 2
+SELECT udf(val), cate, sum(val) OVER(PARTITION BY cate ORDER BY val
+ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) FROM testData ORDER BY cate, val
+-- !query 2 schema
+struct<CAST(udf(cast(val as string)) AS INT):int,cate:string,sum(val) OVER (PARTITION BY cate ORDER BY val ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING):bigint>
+-- !query 2 output
+NULL NULL 3
+3 NULL 3
+NULL a 1
+1 a 2
+1 a 4
+2 a 4
+1 b 3
+2 b 6
+3 b 6
+
+
+-- !query 3
Review comment:
Hmm ...
```diff
-- !query 3
-SELECT val_long, cate, sum(val_long) OVER(PARTITION BY cate ORDER BY val_long
-ROWS BETWEEN CURRENT ROW AND 2147483648 FOLLOWING) FROM testData ORDER BY cate, val_long
+SELECT val_long, udf(cate), sum(val_long) OVER(PARTITION BY cate ORDER BY val_long
+ROWS BETWEEN CURRENT ROW AND CAST(2147483648 AS int) FOLLOWING) FROM testData ORDER BY cate, val_long
-- !query 3 schema
-struct<>
+struct<val_long:bigint,CAST(udf(cast(cate as string)) AS STRING):string,sum(val_long) OVER (PARTITION BY cate ORDER BY val_long ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CAST(2147483648 AS INT) FOLLOWING):bigint>
-- !query 3 output
-org.apache.spark.sql.AnalysisException
-cannot resolve 'ROWS BETWEEN CURRENT ROW AND 2147483648L FOLLOWING' due to data type mismatch: The data type of the upper bound 'bigint' does not match the expected data type 'int'.; line 1 pos 41
+NULL NULL 1
+1 NULL 1
+1 a 2147483654
+1 a 2147483653
+2 a 2147483652
+2147483650 a 2147483650
+NULL b 2147483653
+3 b 2147483653
+2147483650 b 2147483650
```
Do you know why this works when it's wrapped by udf?
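
For reference, the new output rows for query 3 look like a plain "current row through end of partition" sum, which is what a very large `FOLLOWING` bound would give on partitions this small (the largest has four rows). Below is a minimal Python sketch that recomputes the column under that assumption. The helper names `nulls_first` and `suffix_sums` are hypothetical, the rows are copied from the `testData` view in query 0, and the sketch assumes `ASC NULLS FIRST` ordering and that `sum` skips NULLs. It only checks what the output matches; it does not explain why the analyzer accepts the cast bound.

```python
from itertools import groupby

# (val_long, cate) pairs copied from the testData view in query 0.
ROWS = [
    (1, "a"), (1, "a"), (2, "a"), (2147483650, "a"),
    (None, "b"), (3, "b"), (2147483650, "b"),
    (None, None), (1, None),
]


def nulls_first(value):
    # Sort key mimicking ASC NULLS FIRST: None sorts before any real value.
    return (value is not None, value if value is not None else 0)


def suffix_sums(rows):
    """Sum val_long from each row to the end of its cate partition.

    Assumption: the frame effectively spans CURRENT ROW to the last row
    of the partition; this is inferred from the output, not from Spark.
    """
    ordered = sorted(rows, key=lambda r: (nulls_first(r[1]), nulls_first(r[0])))
    result = []
    for cate, group in groupby(ordered, key=lambda r: r[1]):
        values = [val for val, _ in group]
        for i, val in enumerate(values):
            # SQL sum() ignores NULLs and yields NULL when nothing is left.
            tail = [v for v in values[i:] if v is not None]
            result.append((val, cate, sum(tail) if tail else None))
    return result


if __name__ == "__main__":
    for val_long, cate, total in suffix_sums(ROWS):
        print(val_long, cate, total)
```

The rows this produces agree with the `+` lines in the diff above, so the frame seems to reach the end of each partition regardless of what the cast evaluates to.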