Posted to issues@spark.apache.org by "gabrywu (Jira)" <ji...@apache.org> on 2022/04/03 02:21:00 UTC

[jira] [Comment Edited] (SPARK-38769) [SQL] behavior of schema_of_json not same with 2.4.0

    [ https://issues.apache.org/jira/browse/SPARK-38769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516421#comment-17516421 ] 

gabrywu edited comment on SPARK-38769 at 4/3/22 2:20 AM:
---------------------------------------------------------

[~hyukjin.kwon] No matter which UDF it is combined with, I believe we should not change its behavior, right?

For example, the following JSON contains a field ato_long_v2, which will later become ato_long_v3, ato_long_v4, and so on. We want to extract the version strings (v2, v3, v4), and we use schema_of_json for this:
{code:java}
{
  "tt_v1": 165,
  "tt_long_v2": 474,
  "ato_long_v2": 431,
  "tt_short_v2": 338,
  "ato_v1": 408,
  "ato_short_v2": 358,
  "sf_long_v3": 400,
  "sf_short_v3": 498
}{code}
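
To make the goal concrete, here is a minimal sketch in plain Python (not Spark, and not the query from the report) of the extraction described above: read the field names from the sample and pull out the trailing version suffix. The names extract_versions and VERSION_SUFFIX are hypothetical, the sample is written with separating commas so that it parses, and the sketch assumes the version is always a trailing "_v<digits>" suffix.

```python
import json
import re

# Sample payload from the comment, written with commas so it parses as JSON.
SAMPLE = """
{
  "tt_v1": 165,
  "tt_long_v2": 474,
  "ato_long_v2": 431,
  "tt_short_v2": 338,
  "ato_v1": 408,
  "ato_short_v2": 358,
  "sf_long_v3": 400,
  "sf_short_v3": 498
}
"""

# Assumption: the version is always a trailing "_v<digits>" suffix.
VERSION_SUFFIX = re.compile(r"_(v\d+)$")


def extract_versions(json_text):
    """Map each field name, minus its version suffix, to its version string."""
    versions = {}
    for key in json.loads(json_text):
        match = VERSION_SUFFIX.search(key)
        if match:
            versions[key[:match.start()]] = match.group(1)
    return versions


versions = extract_versions(SAMPLE)
print(versions)
```

Because the field names change from one record to the next (ato_long_v2 today, ato_long_v3 later), they have to be discovered per record rather than hard-coded, which is why the schema is inferred from the column value instead of from a literal.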


was (Author: gabry.wu):
No matter which UDF it is combined with, I believe we should not change its behavior, right?

For example, the following JSON contains a field ato_long_v2, which will later become ato_long_v3, ato_long_v4, and so on. We want to extract the version strings (v2, v3, v4), and we use schema_of_json for this:
{code:java}
{
  "tt_v1": 165,
  "tt_long_v2": 474,
  "ato_long_v2": 431,
  "tt_short_v2": 338,
  "ato_v1": 408,
  "ato_short_v2": 358,
  "sf_long_v3": 400,
  "sf_short_v3": 498
}{code}

> [SQL] behavior of schema_of_json not same with 2.4.0
> ----------------------------------------------------
>
>                 Key: SPARK-38769
>                 URL: https://issues.apache.org/jira/browse/SPARK-38769
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: gabrywu
>            Priority: Minor
>
> After switching from Spark 2.4.0 to Spark 3.1.1, I found that a built-in function throws an error:
> |== Physical Plan == org.apache.spark.sql.AnalysisException: cannot resolve 'schema_of_json(get_json_object(`adtnl_info_txt`, '$.all_model_scores'))' due to data type mismatch: The input json should be a foldable string expression and not null; however, got get_json_object(`adtnl_info_txt`, '$.all_model_scores').; line 3 pos 2; |
> However, schema_of_json worked in 2.4.0. Is this a bug, or an intentional change so that non-literal expressions are no longer supported?



