Posted to commits@hudi.apache.org by "liwei (Jira)" <ji...@apache.org> on 2020/10/09 16:03:00 UTC

[jira] [Closed] (HUDI-974) Fields out of order in MOR mode when using Hive

     [ https://issues.apache.org/jira/browse/HUDI-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liwei closed HUDI-974.
----------------------

> Fields out of order in MOR mode when using Hive
> -----------------------------------------------
>
>                 Key: HUDI-974
>                 URL: https://issues.apache.org/jira/browse/HUDI-974
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Hive Integration
>            Reporter: leesf
>            Assignee: liwei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.6.0
>
>         Attachments: image-2020-05-28-21-06-02-396.png, image-2020-05-28-21-07-30-803.png
>
>
> When querying a MOR Hudi dataset via Hive, the selected fields are returned out of order if hoodie.realtime.merge.skip = true.
> Hive table:
> CREATE EXTERNAL TABLE `unknown_rt`(
>  `_hoodie_commit_time` string,
>  `_hoodie_commit_seqno` string,
>  `_hoodie_record_key` string,
>  `_hoodie_partition_path` string,
>  `_hoodie_file_name` string,
>  `age` bigint,
>  `name` string,
>  `sex` string,
>  `ts` bigint)
>  PARTITIONED BY (
>  `location` string)
>  ROW FORMAT SERDE
>  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>  STORED AS INPUTFORMAT
>  'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
>  OUTPUTFORMAT
>  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
>  LOCATION
>  'file:/Users/sflee/personal/backup_demo'
>  TBLPROPERTIES (
>  'last_commit_time_sync'='20200528153331',
>  'transient_lastDdlTime'='1590650733')
>  
> sql:
> set hoodie.realtime.merge.skip = true;
> select sex, name, age from unknown_rt;
> result:
> !image-2020-05-28-21-06-02-396.png!
> The fields are out of order when setting hoodie.realtime.merge.skip = true.
> sql:
> set hoodie.realtime.merge.skip = false;
> select sex, name, age from unknown_rt;
> !image-2020-05-28-21-07-30-803.png!
> The query result is correct when setting hoodie.realtime.merge.skip = false.
> After debugging, I found that Hudi uses getWriterSchema in RealtimeUnmergedRecordReader instead of getHiveSchema; we need to fix this.
>  
> cc [~vbalaji]
>  
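
The kind of mismatch described in the report can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Hudi's code: the helpers to_row and read_columns, the sample record, and both field orders are hypothetical and only illustrate what can happen when values are laid out positionally by one schema while the consumer resolves positions with another.

```python
# Simplified model of a positional schema mismatch (hypothetical, not Hudi code).

def to_row(record, schema):
    """Lay out a record's values positionally, following the field order of `schema`."""
    return [record.get(name) for name in schema]


def read_columns(row, reader_schema, columns):
    """Resolve each requested column to the position the reader believes it has."""
    return {col: row[reader_schema.index(col)] for col in columns}


# Hypothetical sample record and field orders, chosen only for illustration.
record = {"age": 30, "name": "alice", "sex": "F", "ts": 1590650733}
writer_schema = ["age", "name", "sex", "ts"]  # order the data was written in
reader_schema = ["sex", "name", "age", "ts"]  # order the query side expects

projection = ["sex", "name", "age"]

# Row built with the writer's order, but read with the reader's order:
# values end up under the wrong column names.
mismatched = read_columns(to_row(record, writer_schema), reader_schema, projection)

# Row built and read with the same schema: every value lands in its own column.
consistent = read_columns(to_row(record, reader_schema), reader_schema, projection)

print(mismatched)
print(consistent)
```

In this model the fix is to build the row with the same schema the reader uses, which mirrors the report's suggestion of using the Hive schema rather than the writer schema in the unmerged reader.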



--
This message was sent by Atlassian Jira
(v8.3.4#803005)