Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:01:07 UTC

[jira] [Updated] (SPARK-19731) IN Operator should support arrays

     [ https://issues.apache.org/jira/browse/SPARK-19731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-19731:
---------------------------------
    Labels: bulk-closed  (was: )

> IN Operator should support arrays
> ---------------------------------
>
>                 Key: SPARK-19731
>                 URL: https://issues.apache.org/jira/browse/SPARK-19731
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.2, 2.0.0, 2.1.0
>            Reporter: Shawn Lavelle
>            Priority: Minor
>              Labels: bulk-closed
>
> When the column type and the array element type match, the IN operator should operate on the elements of the array. This is useful for UDFs and predicate subqueries that return arrays.
> (This doesn't necessarily extend to all collections, but it certainly applies to arrays.)
> Example:
> select 5 in array(1,2,3) should return false instead of throwing a ParseException, since the element type of the array and the type of the tested value match.
> create table test (val int);
> insert into test values (1);
> select * from test;
> +------+--+
> | val  |
> +------+--+
> | 1    |
> +------+--+
> *select val from test where array_contains(array(1,2,3), val);*
> +------+--+
> | val  |
> +------+--+
> | 1    |
> +------+--+
> {panel}
> *select val from test where val in (array(1,2,3));*
> Error: org.apache.spark.sql.AnalysisException: cannot resolve '(test.`val` IN (array(1, 2, 3)))' due to data type mismatch: Arguments must be same type; line 1 pos 31;
> 'Project ['val]
> +- 'Filter val#433 IN (array(1, 2, 3))
>    +- MetastoreRelation test (state=,code=0)
> {panel}
> {panel}
> *select val from test where val in (select array(1,2,3));*
> Error: org.apache.spark.sql.AnalysisException: cannot resolve '(test.`val` = `array(1, 2, 3)`)' due to data type mismatch: differing types in '(test.`val` = `array(1, 2, 3)`)' (int and array<int>).;;
> 'Project ['val]
> +- 'Filter predicate-subquery#434 [(val#435 = array(1, 2, 3)#436)]
>    :  +- Project [array(1, 2, 3) AS array(1, 2, 3)#436]
>    :     +- OneRowRelation$
>    +- MetastoreRelation test (state=,code=0)
> {panel}
> {panel}
> *select val from test where val in (select explode(array(1,2,3)));*
> +------+--+
> | val  |
> +------+--+
> | 1    |
> +------+--+
> Note: See [SPARK-19730|https://issues.apache.org/jira/browse/SPARK-19730] for how a predicate subquery breaks when applied to the DataSource API.
> {panel}
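
The behaviour requested above can be modelled outside Spark. The following is a minimal plain-Python sketch of the proposed semantics only; it is not Spark code, and the function name in_operator and its error message are hypothetical, introduced purely for illustration. It assumes that a single array operand whose element type matches the tested value is searched element by element, and that any other operand list is compared value by value as a conventional IN list.

```python
from typing import Any, Sequence


def in_operator(value: Any, operands: Sequence[Any]) -> bool:
    """Model of the proposed IN semantics (hypothetical, not Spark code).

    If the only operand is a list whose elements all share the type of
    `value`, test membership in that list. Otherwise compare `value`
    against each operand directly, as a conventional IN list does.
    """
    if len(operands) == 1 and isinstance(operands[0], list):
        array = operands[0]
        if all(type(item) is type(value) for item in array):
            # Element type matches: behave like array_contains(array, value).
            return value in array
        # Element type differs: reject, mirroring a type-mismatch error.
        raise TypeError("array element type does not match value type")
    # Ordinary IN list: val IN (a, b, c).
    return any(value == operand for operand in operands)


if __name__ == "__main__":
    print(in_operator(1, [[1, 2, 3]]))  # models: val in (array(1,2,3)) with val = 1
    print(in_operator(5, [[1, 2, 3]]))  # models: select 5 in array(1,2,3)
    print(in_operator(1, [1, 2, 3]))    # models: val in (1, 2, 3)
```

Under this model the array case gives the same answer as the array_contains and explode workarounds shown in the description, while a genuine type mismatch is still rejected.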



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
