Posted to issues@spark.apache.org by "Linhong Liu (Jira)" <ji...@apache.org> on 2022/08/31 17:29:00 UTC
[jira] [Updated] (SPARK-40292) arrays_zip output unexpected alias column names
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Linhong Liu updated SPARK-40292:
--------------------------------
Description:
For the query below:
{code:sql}
with q as (
  select
    named_struct(
      'my_array', array(named_struct('x', 1, 'y', 2))
    ) as my_struct
)
select
  arrays_zip(my_struct.my_array)
from
  q
{code}
The latest Spark produces the schema below; the field name "my_array" has been changed to "0":
{code:java}
root
|-- arrays_zip(my_struct.my_array): array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- 0: struct (nullable = true)
| | | |-- x: integer (nullable = true)
| | | |-- y: integer (nullable = true){code}
Spark 3.1, by contrast, gives the expected result:
{code:java}
root
|-- arrays_zip(my_struct.my_array): array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- my_array: struct (nullable = true)
| | | |-- x: integer (nullable = true)
| | | |-- y: integer (nullable = true)
{code}
was:
For the below query:
{code:java}
with q as (
select
named_struct(
'my_array', array(named_struct('x', 1, 'y', 2))
) as my_struct
)
select
arrays_zip(my_struct.my_array)
from
q {code}
The latest spark gives the below schema, the field name "my_array" was changed to "0"
root
|-- arrays_zip(my_struct.my_array): array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- 0: struct (nullable = true)
| | | |-- x: integer (nullable = true)
| | | |-- y: integer (nullable = true)
But the Spark 3.1 gives expected result
root
|-- arrays_zip(my_struct.my_array): array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- my_array: struct (nullable = true)
| | | |-- x: integer (nullable = true)
| | | |-- y: integer (nullable = true)
> arrays_zip output unexpected alias column names
> -----------------------------------------------
>
> Key: SPARK-40292
> URL: https://issues.apache.org/jira/browse/SPARK-40292
> Project: Spark
> Issue Type: Task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Linhong Liu
> Priority: Major
>
> For the query below:
>
> {code:sql}
> with q as (
>   select
>     named_struct(
>       'my_array', array(named_struct('x', 1, 'y', 2))
>     ) as my_struct
> )
> select
>   arrays_zip(my_struct.my_array)
> from
>   q
> {code}
> The latest Spark produces the schema below; the field name "my_array" has been changed to "0":
> {code:java}
> root
> |-- arrays_zip(my_struct.my_array): array (nullable = true)
> | |-- element: struct (containsNull = false)
> | | |-- 0: struct (nullable = true)
> | | | |-- x: integer (nullable = true)
> | | | |-- y: integer (nullable = true){code}
> Spark 3.1, by contrast, gives the expected result:
> {code:java}
> root
> |-- arrays_zip(my_struct.my_array): array (nullable = true)
> | |-- element: struct (containsNull = false)
> | | |-- my_array: struct (nullable = true)
> | | | |-- x: integer (nullable = true)
> | | | |-- y: integer (nullable = true)
> {code}
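A possible workaround, offered as an untested sketch rather than a confirmed fix: project the nested field to a top-level column before zipping, so that arrays_zip receives a plain named column instead of a struct field access. The CTE name p is hypothetical and used only for illustration, and whether the affected build then keeps "my_array" as the field name is an assumption that should be checked with printSchema.
{code:sql}
with q as (
  select
    named_struct(
      'my_array', array(named_struct('x', 1, 'y', 2))
    ) as my_struct
),
-- p is a hypothetical helper CTE: it exposes the nested array as a top-level column
p as (
  select
    my_struct.my_array as my_array
  from
    q
)
select
  -- the argument is now a named column rather than a nested field reference
  arrays_zip(my_array)
from
  p
{code}
Note that the outer output column is then named after the new expression, arrays_zip(my_array), so downstream references to the original column name would need an explicit alias.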