Posted to issues@hive.apache.org by "Harish Jaiprakash (JIRA)" <ji...@apache.org> on 2019/03/01 02:17:00 UTC

[jira] [Commented] (HIVE-21362) Add an input format and serde to read from protobuf files.

    [ https://issues.apache.org/jira/browse/HIVE-21362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781212#comment-16781212 ] 

Harish Jaiprakash commented on HIVE-21362:
------------------------------------------

Thanks, [~jdere].

{noformat}
Is there a RB for this somewhere?
{noformat}

I created one but forgot to link it: https://reviews.apache.org/r/70075

{noformat}
Does ProtoMessageSerDe.createStructObjectInspector() need to handle repeated struct fields like createObjectInspector() does?
{noformat}

It does. createStructObjectInspector() calls createObjectInspector() for each field, which in turn calls createStructObjectInspector() for struct types, so repeated struct fields go through the same handling.
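
To illustrate the mutual recursion described above, here is a minimal, self-contained sketch. It is not the code from the patch: the Field class and the two method names createStructInspector/createInspector are hypothetical stand-ins, and it builds Hive-style type strings instead of real ObjectInspector instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class InspectorSketch {
    // Hypothetical stand-in for a protobuf field descriptor (not a protobuf API).
    static final class Field {
        final String name;
        final String primitiveType; // e.g. "string"; null for message (struct) fields
        final List<Field> children; // sub-fields; empty for primitive fields
        final boolean repeated;

        Field(String name, String primitiveType, List<Field> children, boolean repeated) {
            this.name = name;
            this.primitiveType = primitiveType;
            this.children = children;
            this.repeated = repeated;
        }
    }

    // Struct level: delegates every field to createInspector.
    static String createStructInspector(List<Field> fields) {
        List<String> parts = new ArrayList<>();
        for (Field f : fields) {
            parts.add(f.name + ":" + createInspector(f));
        }
        return "struct<" + String.join(",", parts) + ">";
    }

    // Field level: recurses back into createStructInspector for struct types,
    // then wraps the result in a list when the field is repeated.
    static String createInspector(Field f) {
        String element = (f.primitiveType != null)
                ? f.primitiveType
                : createStructInspector(f.children);
        return f.repeated ? "array<" + element + ">" : element;
    }

    public static void main(String[] args) {
        Field key = new Field("key", "string", Collections.<Field>emptyList(), false);
        Field value = new Field("value", "string", Collections.<Field>emptyList(), false);
        Field otherInfo = new Field("otherInfo", null, Arrays.asList(key, value), true);
        Field id = new Field("id", "string", Collections.<Field>emptyList(), false);
        // prints struct<id:string,otherInfo:array<struct<key:string,value:string>>>
        System.out.println(createStructInspector(Arrays.asList(id, otherInfo)));
    }
}
```

Because the repeated check sits in the field-level method, a repeated struct nested at any depth gets the list wrapper without the struct-level method needing its own special case.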

{noformat}
Also createObjectInspector() has a call to System.out.println(), please remove that or convert to debug logging.
{noformat}

Will remove this; it was debugging code.

{noformat}
I don't quite get the proto.maptypes property - is this just some special condition due to the data format you are trying to read, or is this the only way to specify that a field is of Map type? If the latter, doesn't the descriptor have a way to specify a map type?
{noformat}

The Proto 2 compiler does not support map types. The proto.maptypes property is a way to configure the conversion of a repeated struct(key, value) field into map<key, value>. That also makes the data easier to process in Hive: no explode followed by a filter is required, and several values can be extracted from the map without joins.
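
The conversion described above can be sketched as follows. This is only an illustration under stated assumptions, not the SerDe's actual code: KVPair and toMap are hypothetical names, and letting the last value win on duplicate keys is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapTypeSketch {
    // Hypothetical stand-in for a proto2 message with two fields, key and value.
    static final class KVPair {
        final String key;
        final String value;

        KVPair(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Collapse a repeated struct(key, value) field into a single map.
    // Assumption: on duplicate keys, the later entry overwrites the earlier one.
    static Map<String, String> toMap(List<KVPair> entries) {
        Map<String, String> result = new LinkedHashMap<>();
        for (KVPair e : entries) {
            result.put(e.key, e.value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<KVPair> repeatedField = Arrays.asList(
                new KVPair("queue", "default"),
                new KVPair("user", "hive"));
        Map<String, String> asMap = toMap(repeatedField);
        // Several values can be read directly by key, with no explode or filter step.
        System.out.println(asMap.get("queue") + " " + asMap.get("user")); // prints "default hive"
    }
}
```

With the field exposed as a map, a query can read several keys from the same row by direct lookup instead of exploding the list once per key and joining the results.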

> Add an input format and serde to read from protobuf files.
> ----------------------------------------------------------
>
>                 Key: HIVE-21362
>                 URL: https://issues.apache.org/jira/browse/HIVE-21362
>             Project: Hive
>          Issue Type: Task
>          Components: HiveServer2
>            Reporter: Harish Jaiprakash
>            Assignee: Harish Jaiprakash
>            Priority: Critical
>         Attachments: HIVE-21362.01.patch
>
>
> Logs are being generated using the HiveProtoLoggingHook and tez ProtoHistoryLoggingService. These are sequence files written using ProtobufMessageWritable.
> Implement a SerDe and input format to be able to create tables using these files.


