Posted to issues@spark.apache.org by "Touopi Touopi (Jira)" <ji...@apache.org> on 2020/05/14 14:22:00 UTC

[jira] [Reopened] (SPARK-31686) Return of String instead of array in function get_json_object

     [ https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Touopi Touopi reopened SPARK-31686:
-----------------------------------

I don't really understand the purpose of changing the return type.
{code:sql}
select
v1.brandedcustomernumber as brandedcustomernumber
from
uniquecustomer.UniqueCustomer
lateral view explode(from_json(get_json_object(string(brandedCustomerInfoAggregate), '$.brandedCustomers[*].customerNumber'), 'array<string>')) v1 as brandedcustomernumber
{code}
Look at this example.
 Since I am using the wildcard [*], I can get 0..n elements back.
 Luckily, my brandedCustomerInfoAggregate object has more than one brandedCustomers element, so the result of the get_json_object function will be ["customer1","customer2"], for instance.

The explode function expects an array, so what happens when only one brandedCustomers element is filled in?

The value is then returned as a String (I actually discovered the " characters added around the value), like this: "customer1", and the from_json function breaks.

I expect that, when the path contains [*], parsing and node selection should always return an array.
 Currently, when one element is returned for another query, I convert it to an array and cast it to a string:
 (from_json(cast(array(get_json_object(string(customer),'$.addresses[*].location')) as string),'array<string>'))

But the results are wrong when more than one element is returned.
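
To illustrate the normalization I am asking for, here is a minimal Python sketch. It is not Spark code: the helper name as_json_array is hypothetical, and it assumes that a matched element is never itself an array and that a path with no match yields NULL (None here). It only shows that a single rule can cover both the one-element and the many-element case.

```python
import json
from typing import Any, List, Optional


def as_json_array(raw: Optional[str]) -> List[Any]:
    """Normalize the string returned for a wildcard ([*]) path into a list."""
    if raw is None:
        # Assumption: no match is reported as NULL; treat it as an empty list.
        return []
    try:
        value = json.loads(raw)
    except ValueError:
        # Not valid JSON (for example an unquoted scalar): keep the raw text.
        value = raw
    # Assumption: a matched element is never itself an array, so a list
    # means "several matches" and anything else means "one match".
    return value if isinstance(value, list) else [value]


print(as_json_array('["customer1","customer2"]'))
print(as_json_array('"customer1"'))
print(as_json_array(None))
```

With such a rule, both "customer1" and ["customer1","customer2"] end up as arrays, so a following explode would not depend on how many elements matched.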

> Return of String instead of array in function get_json_object
> -------------------------------------------------------------
>
>                 Key: SPARK-31686
>                 URL: https://issues.apache.org/jira/browse/SPARK-31686
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.5
>         Environment: {code:json}
> {
>   "customer": {
>     "addresses": [
>       {
>         "location": "arizona"
>       }
>     ]
>   }
> }
> {code}
> get_json_object(string(customer), '$.addresses[*].location')
> returns "arizona"
> The expected result is
> ["arizona"]
>            Reporter: Touopi Touopi
>            Priority: Major
>
> When we select a node of a JSON object that is an array
> and the array contains one element, get_json_object returns a String with " characters instead of an array of one element.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)