Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/07/11 00:47:38 UTC
[GitHub] [spark] huaxingao opened a new pull request #25103: [SPARK-28285][SQL][PYTHON][TESTS] Convert and port 'outer-join.sql' into UDF test base

huaxingao opened a new pull request #25103: [SPARK-28285][SQL][PYTHON][TESTS] Convert and port 'outer-join.sql' into UDF test base
URL: https://github.com/apache/spark/pull/25103
 
 
   
   
   ## What changes were proposed in this pull request?
   
   This PR adds tests converted from ```outer-join.sql``` so that the same queries also exercise UDFs. Please see the contribution guide in the umbrella ticket, [SPARK-27921](url).
   <details><summary>Diff comparing to 'outer-join.sql'</summary>
   <p>
   
   ```diff
   diff --git a/sql/core/src/test/resources/sql-tests/results/outer-join.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/udf-outer-join.sql.out
   index 5db3bae5d0..6394dad0f4 100644
   --- a/sql/core/src/test/resources/sql-tests/results/outer-join.sql.out
   +++ b/sql/core/src/test/resources/sql-tests/results/udf/udf-outer-join.sql.out
   @@ -24,22 +24,22 @@ struct<>
    
    -- !query 2
    SELECT
   -  (SUM(COALESCE(t1.int_col1, t2.int_col0))),
   -     ((COALESCE(t1.int_col1, t2.int_col0)) * 2)
   +  (udf(SUM(COALESCE(t1.int_col1, t2.int_col0)))),
   +     (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
    FROM t1
    RIGHT JOIN t2
      ON (t2.int_col0) = (t1.int_col1)
   -GROUP BY GREATEST(COALESCE(t2.int_col1, 109), COALESCE(t1.int_col1, -449)),
   +GROUP BY udf(GREATEST(COALESCE(t2.int_col1, 109), COALESCE(t1.int_col1, -449))),
             COALESCE(t1.int_col1, t2.int_col0)
   -HAVING (SUM(COALESCE(t1.int_col1, t2.int_col0)))
   -            > ((COALESCE(t1.int_col1, t2.int_col0)) * 2)
   +HAVING (udf(SUM(COALESCE(t1.int_col1, t2.int_col0))))
   +            > (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
    -- !query 2 schema
   -struct<sum(coalesce(int_col1, int_col0)):bigint,(coalesce(int_col1, int_col0) * 2):int>
   +struct<udf(sum(cast(coalesce(int_col1, int_col0) as bigint))):string,(CAST(udf(coalesce(int_col1, int_col0)) AS DOUBLE) * CAST(2 AS DOUBLE)):double>
    -- !query 2 output
   --367   -734
   --507   -1014
   --769   -1538
   --800   -1600
   +-367   -734.0
   +-507   -1014.0
   +-769   -1538.0
   +-800   -1600.0
    
    
    -- !query 3
   @@ -70,12 +70,12 @@ spark.sql.crossJoin.enabled true
    SELECT *
    FROM (
    SELECT
   -    COALESCE(t2.int_col1, t1.int_col1) AS int_col
   +    udf(COALESCE(t2.int_col1, udf(t1.int_col1))) AS int_col
        FROM t1
        LEFT JOIN t2 ON false
   -) t where (t.int_col) is not null
   +) t where (udf(t.int_col)) is not null
    -- !query 6 schema
   -struct<int_col:int>
   +struct<int_col:string>
    -- !query 6 output
    97
   ```
   
   </p>
   </details>
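
The schema and output changes in the diff above (`bigint`/`int` becoming `string`, and `-734` becoming `-734.0`) can be reproduced with a small sketch. This is plain Python, not Spark: the `udf` function below is a hypothetical stand-in that assumes the test UDF returns its input as a string, as the `:string` result type in the new schema suggests, and `times_two` imitates the `CAST(... AS DOUBLE) * CAST(2 AS DOUBLE)` coercion shown there. The input values are derived from the expected output of query 2.

```python
def udf(value):
    """Hypothetical stand-in for the test UDF: return the input as a string.

    None (SQL NULL) is passed through unchanged.
    """
    return None if value is None else str(value)


def times_two(value):
    """Imitate `udf(x) * 2` after coercion: both operands are cast to double."""
    as_string = udf(value)
    if as_string is None:
        return None
    return float(as_string) * float(2)


# Values of COALESCE(t1.int_col1, t2.int_col0), taken from the query 2 output.
coalesced = [-367, -507, -769, -800]

for v in coalesced:
    # First column is a string, second column is a double.
    print(udf(v), times_two(v))
```

Under these assumptions the second column prints as `-734.0`, `-1014.0`, and so on, matching the new expected results, while the first column keeps the same digits but is now typed as a string.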
   ## How was this patch tested?
   
   Tested as guided in [SPARK-27921](url).
   

