Posted to issues@spark.apache.org by "Touopi Touopi (Jira)" <ji...@apache.org> on 2020/05/14 14:22:00 UTC
[jira] [Reopened] (SPARK-31686) Return of String instead of array in function get_json_object
[ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Touopi Touopi reopened SPARK-31686:
-----------------------------------
I don't really understand the purpose of changing the return type.
{code:sql}
select
  v1.brandedcustomernumber as brandedcustomernumber
from
  uniquecustomer.UniqueCustomer
  lateral view explode(
    from_json(get_json_object(string(brandedCustomerInfoAggregate), '$.brandedCustomers[*].customerNumber'), 'array<string>')
  ) v1 as brandedcustomernumber
{code}
Consider this example.
Since I am using the wildcard [*], I can get 0..n elements back.
Luckily, my brandedCustomerInfoAggregate object has more than one brandedCustomers element, so get_json_object returns, for instance, ["customer1","customer2"].
But explode expects an array, so what happens when only one brandedCustomers element is filled in?
The value then comes back as a string (I found that the " characters are included in it), like "customer1", and from_json breaks.
I expect that when the path contains [*], the parsing and node selection should always return an array.
Currently, for another query where only one element is returned, I wrap the result in an array and cast it to a string:
(from_json(cast(array(get_json_object(string(customer),'$.addresses[*].location')) as string),'array<string>'))
But the results are wrong when more than one element is returned.
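To illustrate the behavior requested above, here is a minimal sketch in plain Python (not Spark code) of a normalization step that always yields a list from the text returned by a [*] lookup. The function name normalize_wildcard_result is hypothetical, and the sketch assumes that a JSON array in the output means several matches and anything else means a single match. That assumption is ambiguous when the single matched value is itself an array.

```python
import json


def normalize_wildcard_result(raw):
    """Return a list of values for the text produced by a [*] path lookup.

    `raw` is the string returned by the lookup, or None when nothing matched.
    Hypothetical helper for illustration only; not part of Spark.
    """
    if raw is None:
        return []
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # Unquoted scalar text (e.g. arizona): keep it as a single element.
        return [raw]
    # Assumption: a JSON array means several matches were returned;
    # any other JSON value means exactly one match.
    return value if isinstance(value, list) else [value]
```

With this step, both "customer1" and ["customer1","customer2"] end up as lists, so a later explode-style operation sees the same shape in both cases.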
> Return of String instead of array in function get_json_object
> -------------------------------------------------------------
>
> Key: SPARK-31686
> URL: https://issues.apache.org/jira/browse/SPARK-31686
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5
> Environment: {code:json}
> {
>   "customer": {
>     "addresses": [
>       {
>         "location": "arizona"
>       }
>     ]
>   }
> }
> {code}
> get_json_object(string(customer),'$.addresses[*].location')
> returns "arizona"
> The expected result is
> ["arizona"]
> Reporter: Touopi Touopi
> Priority: Major
>
> When we select a node of a JSON object that is an array
> and the array contains one element, get_json_object returns a string with " characters instead of an array of one element.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)