Posted to issues@hive.apache.org by "lihz (Jira)" <ji...@apache.org> on 2021/12/21 03:57:00 UTC

[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

     [ https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lihz updated HIVE-25541:
------------------------
    Description: 
Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested JSON directly into a string type. It requires declaring the column as a complex type (struct, map, array) to unpack nested JSON data.

Even though the data field is not a valid JSON string type, there is value in treating it as a plain string instead of throwing an exception as we currently do.

{code:java}
create table json_table(data string, messageid string, publish_time bigint, attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra table property that allows complex JSON values to be stringified instead of forcing the user to define the complete nested structure.

  was:
Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested JSON directly into a string type. It requires declaring the column as a complex type (struct, map, array) to unpack nested JSON data.

Even though the data field is not a valid JSON string type, there is value in treating it as a plain string instead of throwing an exception as we currently do.

{code:java}
create table json_table(data string, messageid string, publish_time bigint, attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra table property that allows complex JSON values to be stringified instead of forcing the user to define the complete nested structure.


> JsonSerDe: TBLPROPERTY treating nested json as String
> -----------------------------------------------------
>
>                 Key: HIVE-25541
>                 URL: https://issues.apache.org/jira/browse/HIVE-25541
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not support loading nested JSON directly into a string type. It requires declaring the column as a complex type (struct, map, array) to unpack nested JSON data.
> Even though the data field is not a valid JSON string type, there is value in treating it as a plain string instead of throwing an exception as we currently do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra table property that allows complex JSON values to be stringified instead of forcing the user to define the complete nested structure.
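
The behavior requested in the description can be sketched outside Hive. The Python snippet below is only an illustration under stated assumptions: read_row, COLUMN_TYPES, and the stringify_complex flag are hypothetical names, not Hive's JsonSerDe API, and the name of the actual table property is not given in this message. It shows a nested JSON value being kept as JSON text when the target column is declared as a string, rather than raising an error.

```python
import json

# Hypothetical column schema mirroring the json_table example: every column
# except publish_time is declared as a plain string.
COLUMN_TYPES = {
    "data": "string",
    "messageid": "string",
    "publish_time": "bigint",
    "attributes": "string",
}


def read_row(line, column_types, stringify_complex=True):
    """Map one JSON record onto the declared columns.

    Illustrative only, not Hive's JsonSerDe code. Field names are matched
    case-insensitively here as a simplifying assumption.
    """
    record = {key.lower(): value for key, value in json.loads(line).items()}
    row = {}
    for name, col_type in column_types.items():
        value = record.get(name)
        if col_type == "string" and isinstance(value, (dict, list)):
            if not stringify_complex:
                # Stand-in for the exception the description complains about.
                raise TypeError(
                    f"column {name!r} is a string but the JSON value is nested"
                )
            # Keep the nested value as compact JSON text instead of failing.
            value = json.dumps(value, separators=(",", ":"))
        row[name] = value
    return row


line = (
    '{"data":{"H":{"event":"track_active","platform":"Android"}},'
    '"messageId":"2475185636801962","publish_time":1622514629783,'
    '"attributes":{"region":"IN"}}'
)
row = read_row(line, COLUMN_TYPES)
print(row["attributes"])  # prints {"region":"IN"}
```

With stringify_complex=False the same call raises, which corresponds to the current behavior of requiring a complex column type for nested data.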



--
This message was sent by Atlassian Jira
(v8.20.1#820001)