Posted to reviews@impala.apache.org by "Attila Jeges (Code Review)" <ge...@cloudera.org> on 2017/03/10 16:11:53 UTC
[Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Attila Jeges has uploaded a new patch set (#2).
Change subject: IMPALA-3079: Fix sequence file writer
......................................................................
IMPALA-3079: Fix sequence file writer
Before this fix, the sequence file writer produced corrupt files in some
cases. Steps to reproduce:
  SET ALLOW_UNSUPPORTED_FORMATS=1;
  create table store_sales_seq_snap like tpcds_parquet.store_sales
    stored as SEQUENCEFILE;
  insert into store_sales_seq_snap partition(ss_sold_date_sk)
    select * from tpcds_parquet.store_sales
    where ss_sold_date_sk between 2450816 and 2451200;
The insert statement produces a corrupt file that cannot be read back.
This change fixes:
- The implementation of zero-compressed encoding in the ReadWriteUtil
class.
- The calculation of block sizes in the SnappyBlockCompressor class.
- The creation of record-compressed and block-compressed sequence files
in the HdfsSequenceTableWriter class.
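For readers unfamiliar with the first item, the following is a minimal
sketch of zero-compressed integer encoding, assuming it refers to the
variable-length format used by Hadoop's WritableUtils.writeVLong and
readVLong. The function names put_vlong and get_vlong are hypothetical
and are not taken from this patch; the code illustrates the byte layout
only and is not the ReadWriteUtil implementation.

```python
# Sketch of Hadoop-style zero-compressed (VLong) encoding.
# Assumption: the layout matches Hadoop's WritableUtils.writeVLong/readVLong.
# put_vlong and get_vlong are hypothetical names, not Impala functions.

INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1


def put_vlong(value: int) -> bytes:
    """Encode a signed 64-bit integer in 1 to 9 bytes."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError("value does not fit in a signed 64-bit integer")
    # Values in [-112, 127] are stored directly in a single byte.
    if -112 <= value <= 127:
        return value.to_bytes(1, "big", signed=True)
    # Otherwise the first byte is a marker that encodes sign and length.
    marker = -112
    if value < 0:
        value = ~value  # one's complement, so the payload is non-negative
        marker = -120
    nbytes = (value.bit_length() + 7) // 8  # payload bytes, 1 to 8
    marker -= nbytes
    # The payload follows the marker, most significant byte first.
    return marker.to_bytes(1, "big", signed=True) + value.to_bytes(nbytes, "big")


def get_vlong(buf: bytes, offset: int = 0):
    """Decode one value; return (value, number of bytes consumed)."""
    if offset >= len(buf):
        raise ValueError("buffer is empty at the given offset")
    first = int.from_bytes(buf[offset:offset + 1], "big", signed=True)
    if first >= -112:
        return first, 1
    if first < -120:
        nbytes, negative = -120 - first, True
    else:
        nbytes, negative = -112 - first, False
    payload = buf[offset + 1:offset + 1 + nbytes]
    if len(payload) != nbytes:
        raise ValueError("buffer is truncated")
    magnitude = int.from_bytes(payload, "big")
    return (~magnitude if negative else magnitude), 1 + nbytes
```

A reader and a writer must agree on this layout exactly: a marker byte
that is off by one shifts every later field in the file, which is one
way a writer bug of this kind can make a file unreadable.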
Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
---
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-sequence-table-writer.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/util/compress.cc
M be/src/util/decompress-test.cc
M testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
M tests/query_test/test_compressed_formats.py
8 files changed, 385 insertions(+), 77 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6107/2
--
To view, visit http://gerrit.cloudera.org:8080/6107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>