You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "philipse (Jira)" <ji...@apache.org> on 2020/07/16 02:15:00 UTC
[jira] [Updated] (SPARK-32324) Fix error messages during using
PIVOT and lateral view
[ https://issues.apache.org/jira/browse/SPARK-32324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
philipse updated SPARK-32324:
-----------------------------
Description:
Currently when we use `lateral view` and `pivot` together in from clause, if `lateral view` is before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause".if if `lateral view` is after `pivot`,the query will be normal ,So the error messages "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate, we may improve it.
Steps to reproduce:
{code:java}
CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
{code}
Query1:
{code:java}
SELECT * FROM person
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
)
{code}
Result 1:
{code:java}
Error: org.apache.spark.sql.catalyst.parser.ParseException:
LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9)
== SQL ==
SELECT * FROM person
---------^^^
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
) (state=,code=0)
{code}
Query2:
{code:java}
SELECT * FROM person
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
)
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
{code}
Reuslt2:
+-------+------++---------------++--------
|id|Mary|John|c_age|d_age|
+-------+------++---------------++--------
|300|NULL|NULL|30|40|
|300|NULL|NULL|30|80|
|300|NULL|NULL|60|40|
|300|NULL|NULL|60|80|
|100|0|NULL|30|40|
|100|0|NULL|30|80|
|100|0|NULL|60|40|
|100|0|NULL|60|80|
|400|NULL|NULL|30|40|
|400|NULL|NULL|30|80|
|400|NULL|NULL|60|40|
|400|NULL|NULL|60|80|
|200|NULL|1|30|40|
|200|NULL|1|30|80|
|200|NULL|1|60|40|
|200|NULL|1|60|80|
+-------+------++---------------++--------
```
was:
Currently when we use `lateral view` and `pivot` together in from clause, if `lateral view` is before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause".if if `lateral view` is after `pivot`,the query will be normal ,So the error messages "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate, we may improve it.
Steps to reproduce:
```
CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
```
Query1:
```
SELECT * FROM person
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
)
```
Result 1:
```
Error: org.apache.spark.sql.catalyst.parser.ParseException:
LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9)
== SQL ==
SELECT * FROM person
---------^^^
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
) (state=,code=0)
```
Query2:
```
SELECT * FROM person
PIVOT (
count(distinct age) as a
for name in ('Mary','John')
)
lateral view outer explode(array(30,60)) tabelName as c_age
lateral view explode(array(40,80)) as d_age
```
Reuslt2:
```
+------+-------+-------+--------+--------+
| id | Mary | John | c_age | d_age |
+------+-------+-------+--------+--------+
| 300 | NULL | NULL | 30 | 40 |
| 300 | NULL | NULL | 30 | 80 |
| 300 | NULL | NULL | 60 | 40 |
| 300 | NULL | NULL | 60 | 80 |
| 100 | 0 | NULL | 30 | 40 |
| 100 | 0 | NULL | 30 | 80 |
| 100 | 0 | NULL | 60 | 40 |
| 100 | 0 | NULL | 60 | 80 |
| 400 | NULL | NULL | 30 | 40 |
| 400 | NULL | NULL | 30 | 80 |
| 400 | NULL | NULL | 60 | 40 |
| 400 | NULL | NULL | 60 | 80 |
| 200 | NULL | 1 | 30 | 40 |
| 200 | NULL | 1 | 30 | 80 |
| 200 | NULL | 1 | 60 | 40 |
| 200 | NULL | 1 | 60 | 80 |
+------+-------+-------+--------+--------+
```
> Fix error messages during using PIVOT and lateral view
> ------------------------------------------------------
>
> Key: SPARK-32324
> URL: https://issues.apache.org/jira/browse/SPARK-32324
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: philipse
> Priority: Minor
>
> Currently when we use `lateral view` and `pivot` together in from clause, if `lateral view` is before `pivot`, the error message is "LATERAL cannot be used together with PIVOT in FROM clause".if if `lateral view` is after `pivot`,the query will be normal ,So the error messages "LATERAL cannot be used together with PIVOT in FROM clause" is not accurate, we may improve it.
>
> Steps to reproduce:
> {code:java}
> CREATE TABLE person (id INT, name STRING, age INT, class int, address STRING);
> INSERT INTO person VALUES
> (100, 'John', 30, 1, 'Street 1'),
> (200, 'Mary', NULL, 1, 'Street 2'),
> (300, 'Mike', 80, 3, 'Street 3'),
> (400, 'Dan', 50, 4, 'Street 4');
> {code}
>
> Query1:
>
> {code:java}
> SELECT * FROM person
> lateral view outer explode(array(30,60)) tabelName as c_age
> lateral view explode(array(40,80)) as d_age
> PIVOT (
> count(distinct age) as a
> for name in ('Mary','John')
> )
> {code}
> Result 1:
>
> {code:java}
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> LATERAL cannot be used together with PIVOT in FROM clause(line 1, pos 9)
> == SQL ==
> SELECT * FROM person
> ---------^^^
> lateral view outer explode(array(30,60)) tabelName as c_age
> lateral view explode(array(40,80)) as d_age
> PIVOT (
> count(distinct age) as a
> for name in ('Mary','John')
> ) (state=,code=0)
> {code}
>
>
> Query2:
>
> {code:java}
> SELECT * FROM person
> PIVOT (
> count(distinct age) as a
> for name in ('Mary','John')
> )
> lateral view outer explode(array(30,60)) tabelName as c_age
> lateral view explode(array(40,80)) as d_age
> {code}
>
> Reuslt2:
> +-------+------++---------------++--------
> |id|Mary|John|c_age|d_age|
> +-------+------++---------------++--------
> |300|NULL|NULL|30|40|
> |300|NULL|NULL|30|80|
> |300|NULL|NULL|60|40|
> |300|NULL|NULL|60|80|
> |100|0|NULL|30|40|
> |100|0|NULL|30|80|
> |100|0|NULL|60|40|
> |100|0|NULL|60|80|
> |400|NULL|NULL|30|40|
> |400|NULL|NULL|30|80|
> |400|NULL|NULL|60|40|
> |400|NULL|NULL|60|80|
> |200|NULL|1|30|40|
> |200|NULL|1|30|80|
> |200|NULL|1|60|40|
> |200|NULL|1|60|80|
> +-------+------++---------------++--------
> ```
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org