Posted to commits@hudi.apache.org by "leesf (Jira)" <ji...@apache.org> on 2020/05/28 13:11:00 UTC

[jira] [Created] (HUDI-974) Fields out of order in MOR mode when using Hive

leesf created HUDI-974:
--------------------------

             Summary: Fields out of order in MOR mode when using Hive
                 Key: HUDI-974
                 URL: https://issues.apache.org/jira/browse/HUDI-974
             Project: Apache Hudi
          Issue Type: Bug
          Components: Hive Integration
            Reporter: leesf
            Assignee: liwei
             Fix For: 0.6.0
         Attachments: image-2020-05-28-21-06-02-396.png, image-2020-05-28-21-07-30-803.png

When querying a MOR Hudi dataset via Hive with hoodie.realtime.merge.skip = true, the selected fields are returned out of order.

hive table:

CREATE EXTERNAL TABLE `unknown_rt`(
 `_hoodie_commit_time` string,
 `_hoodie_commit_seqno` string,
 `_hoodie_record_key` string,
 `_hoodie_partition_path` string,
 `_hoodie_file_name` string,
 `age` bigint,
 `name` string,
 `sex` string,
 `ts` bigint)
PARTITIONED BY (
 `location` string)
ROW FORMAT SERDE
 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
 'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
 'file:/Users/sflee/personal/backup_demo'
TBLPROPERTIES (
 'last_commit_time_sync'='20200528153331',
 'transient_lastDdlTime'='1590650733')

 

sql:

set hoodie.realtime.merge.skip = true;

select sex, name, age from unknown_rt;

result:

!image-2020-05-28-21-06-02-396.png!

The fields are out of order when hoodie.realtime.merge.skip = true.

sql:

set hoodie.realtime.merge.skip = false;

select sex, name, age from unknown_rt;

!image-2020-05-28-21-07-30-803.png!

The query result is correct when hoodie.realtime.merge.skip = false.

After debugging, I found that Hudi uses getWriterSchema in RealtimeUnmergedRecordReader instead of getHiveSchema. We need to fix it.
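
To illustrate why the choice of schema matters, here is a minimal sketch in Python. It is not Hudi code. It assumes that rows are handed to Hive positionally, that the writer schema keeps the table's column order (age, name, sex), and that Hive labels the values in the order of the SELECT (sex, name, age). The helper names and the sample row are hypothetical.

```python
# Hypothetical illustration only; none of this is Hudi or Hive API.

# Assumed order in which the reader emits values (the writer schema order).
writer_order = ["age", "name", "sex"]
# Order in which the query expects the values (select sex, name, age).
hive_order = ["sex", "name", "age"]

# Made-up sample row.
record = {"age": 30, "name": "alice", "sex": "F"}


def emit(row, field_order):
    """Flatten a row into a positional list following field_order."""
    return [row[field] for field in field_order]


def read_as(values, field_order):
    """Label positional values with the names the consumer expects."""
    return dict(zip(field_order, values))


# Reader emits in writer order, consumer labels in Hive order: values are shifted.
mismatched = read_as(emit(record, writer_order), hive_order)

# Reader emits in the order the consumer expects: values line up.
aligned = read_as(emit(record, hive_order), hive_order)

print(mismatched)
print(aligned)
```

Under these assumptions, the mismatched case puts the age value under sex and the sex value under age, which resembles the symptom in the first screenshot. Emitting values in the Hive schema order keeps each value under its own column.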

 


