Posted to issues@hive.apache.org by "Panagiotis Garefalakis (Jira)" <ji...@apache.org> on 2020/11/13 13:12:00 UTC
[jira] [Updated] (HIVE-24224) Fix skipping header/footer for Hive on Tez on compressed files
[ https://issues.apache.org/jira/browse/HIVE-24224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Panagiotis Garefalakis updated HIVE-24224:
------------------------------------------
Parent: HIVE-22769
Issue Type: Sub-task (was: Bug)
> Fix skipping header/footer for Hive on Tez on compressed files
> --------------------------------------------------------------
>
> Key: HIVE-24224
> URL: https://issues.apache.org/jira/browse/HIVE-24224
> Project: Hive
> Issue Type: Sub-task
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Querying a compressed file with Hive on Tez returns the header and footer rows, for both SELECT * and SELECT COUNT(*):
> {noformat}
> printf "offset,id,other\n9,\"20200315 X00 1356\",123\n17,\"20200315 X00 1357\",123\nrst,rst,rst" > data.csv
> hdfs dfs -put -f data.csv /apps/hive/warehouse/bz2test/bz2tbl1/
> bzip2 -f data.csv
> hdfs dfs -put -f data.csv.bz2 /apps/hive/warehouse/bz2test/bz2tbl2/
> beeline -e "CREATE EXTERNAL TABLE default.bz2tst2 (
> sequence int,
> id string,
> other string)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
> LOCATION '/apps/hive/warehouse/bz2test/bz2tbl2'
> TBLPROPERTIES (
> 'skip.header.line.count'='1',
> 'skip.footer.line.count'='1');"
> beeline -e "
> SET hive.fetch.task.conversion = none;
> SELECT * FROM default.bz2tst2;"
> +-------------------+--------------------+----------------+
> | bz2tst2.sequence | bz2tst2.id | bz2tst2.other |
> +-------------------+--------------------+----------------+
> | offset | id | other |
> | 9 | 20200315 X00 1356 | 123 |
> | 17 | 20200315 X00 1357 | 123 |
> | rst | rst | rst |
> +-------------------+--------------------+----------------+
> {noformat}
> PS: HIVE-22769 addressed the issue for Hive on LLAP.
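For comparison, the rows the query should return once 'skip.header.line.count'='1' and 'skip.footer.line.count'='1' take effect can be worked out locally. The following is only a sketch: it emulates the intended skip semantics with sed on the uncompressed sample data and does not exercise Hive, Tez, or bzip2. The variable names expected_rows and expected_count are illustrative and do not appear in the report.

```shell
# Recreate the sample data from the report (no trailing newline after the footer line).
printf 'offset,id,other\n9,"20200315 X00 1356",123\n17,"20200315 X00 1357",123\nrst,rst,rst' > data.csv

# Emulate skip.header.line.count=1 and skip.footer.line.count=1:
# delete the first line (header) and the last line (footer).
expected_rows=$(sed '1d;$d' data.csv)

# Number of data rows that SELECT COUNT(*) should report after skipping.
expected_count=$(printf '%s\n' "$expected_rows" | wc -l | tr -d '[:space:]')

printf '%s\n' "$expected_rows"
printf 'expected row count: %s\n' "$expected_count"
```

With the sample above, only the two data rows (sequence 9 and 17) remain, so a correct SELECT * should return two rows and SELECT COUNT(*) should return 2, rather than the four rows shown in the output above.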
--
This message was sent by Atlassian Jira
(v8.3.4#803005)