Posted to commits@hudi.apache.org by "leesf (Jira)" <ji...@apache.org> on 2020/05/28 13:11:00 UTC
[jira] [Created] (HUDI-974) Fields out of order in MOR mode when using Hive
leesf created HUDI-974:
--------------------------
Summary: Fields out of order in MOR mode when using Hive
Key: HUDI-974
URL: https://issues.apache.org/jira/browse/HUDI-974
Project: Apache Hudi
Issue Type: Bug
Components: Hive Integration
Reporter: leesf
Assignee: liwei
Fix For: 0.6.0
Attachments: image-2020-05-28-21-06-02-396.png, image-2020-05-28-21-07-30-803.png
The problem appears when querying a MOR Hudi dataset via Hive.
Hive table:
CREATE EXTERNAL TABLE `unknown_rt`(
`_hoodie_commit_time` string,
`_hoodie_commit_seqno` string,
`_hoodie_record_key` string,
`_hoodie_partition_path` string,
`_hoodie_file_name` string,
`age` bigint,
`name` string,
`sex` string,
`ts` bigint)
PARTITIONED BY (
`location` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
'file:/Users/sflee/personal/backup_demo'
TBLPROPERTIES (
'last_commit_time_sync'='20200528153331',
'transient_lastDdlTime'='1590650733')
sql:
set hoodie.realtime.merge.skip = true;
select sex, name, age from unknown_rt;
result:
!image-2020-05-28-21-06-02-396.png!
The fields are out of order when hoodie.realtime.merge.skip = true.
sql:
set hoodie.realtime.merge.skip = false;
select sex, name, age from unknown_rt;
!image-2020-05-28-21-07-30-803.png!
The query result is correct when hoodie.realtime.merge.skip = false.
After debugging, I found that Hudi uses getWriterSchema in RealtimeUnmergedRecordReader instead of getHiveSchema; this needs to be fixed.
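To illustrate why reading with the wrong schema can shuffle columns, here is a minimal, language-neutral sketch in Python. It is not Hudi code: the writer field order, the sample record, and the helper names (to_row, read_column) are all hypothetical, and only the Hive column order is taken from the DDL above. It assumes the reader lays values out positionally by one schema while Hive interprets positions by another.

```python
# Hypothetical writer field order; the real writer schema is not shown in this report.
WRITER_SCHEMA = ["name", "age", "sex", "ts"]
# Order of the data columns in the Hive DDL above (meta and partition columns omitted).
HIVE_SCHEMA = ["age", "name", "sex", "ts"]


def to_row(record, schema):
    """Lay a record out positionally, following the field order of `schema`."""
    return [record[field] for field in schema]


def read_column(row, column, schema):
    """Read `column` from a positional row, trusting `schema` for its position."""
    return row[schema.index(column)]


# Made-up sample record, for illustration only.
record = {"name": "alice", "age": 30, "sex": "F", "ts": 1590650733}
query_columns = ("sex", "name", "age")

# Row built in writer order, but consumed as if it were in Hive order.
mismatched_row = to_row(record, WRITER_SCHEMA)
mismatched = [read_column(mismatched_row, c, HIVE_SCHEMA) for c in query_columns]

# Row built and consumed with the same (Hive) schema.
aligned_row = to_row(record, HIVE_SCHEMA)
aligned = [read_column(aligned_row, c, HIVE_SCHEMA) for c in query_columns]

print(mismatched)  # ['F', 30, 'alice'] -> name and age are swapped
print(aligned)     # ['F', 'alice', 30]
```

In this sketch the values only land in the expected columns when the row is built and consumed with the same schema, which is the effect the proposed switch from getWriterSchema to getHiveSchema is meant to achieve.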