You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2016/07/15 18:27:23 UTC
[5/5] incubator-impala git commit: IMPALA-3845: Split up
hdfs-parquet-scanner.cc into more files/components.
IMPALA-3845: Split up hdfs-parquet-scanner.cc into more files/components.
This patch refactors hdfs-parquet-scanner.cc into several files.
The new responsibilities of each file/component are roughly as follows:
hdfs-parquet-scanner.h/cc
- Creates column readers and uses them to materializes row batches.
- Evaluates runtime filters and conjuncts, populates row batch queue.
parquet-metadata-utils.h/cc
- Contains utilities for validating Parquet file metadata.
- Parses the schema of a Parquet file into our internal schema
representation.
- Resolves SchemaPaths (e.g. from a table descriptor) against
the internal representation of the Parquet file schema.
parquet-column-readers.h/cc
- Contains the per-column data reading, parsing and value
materialization logic.
Testing: A private core/hdfs run passed.
Change-Id: I4c5fd46f9c1a0ff2a4c30ea5a712fbae17c68f92
Reviewed-on: http://gerrit.cloudera.org:8080/3596
Tested-by: Internal Jenkins
Reviewed-by: Alex Behm <al...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/6ee15fad
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/6ee15fad
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/6ee15fad
Branch: refs/heads/master
Commit: 6ee15fadedcac9d41f8ad660caf8d4a60267df8e
Parents: baf8fe2
Author: Alex Behm <al...@cloudera.com>
Authored: Tue May 17 10:46:36 2016 -0700
Committer: Taras Bobrovytsky <ta...@apache.org>
Committed: Fri Jul 15 18:27:05 2016 +0000
----------------------------------------------------------------------
be/src/exec/CMakeLists.txt | 2 +
be/src/exec/base-sequence-scanner.cc | 2 +-
be/src/exec/hdfs-parquet-scanner.cc | 2316 +-----------------------
be/src/exec/hdfs-parquet-scanner.h | 221 +--
be/src/exec/hdfs-rcfile-scanner.cc | 2 +-
be/src/exec/hdfs-scanner.cc | 20 -
be/src/exec/hdfs-scanner.h | 11 -
be/src/exec/hdfs-text-scanner.cc | 3 +-
be/src/exec/parquet-column-readers.cc | 1093 +++++++++++
be/src/exec/parquet-column-readers.h | 500 +++++
be/src/exec/parquet-metadata-utils.cc | 647 +++++++
be/src/exec/parquet-metadata-utils.h | 202 +++
be/src/exec/parquet-scratch-tuple-batch.h | 72 +
be/src/exec/parquet-version-test.cc | 7 +-
be/src/exprs/expr-value.h | 2 +-
be/src/runtime/runtime-state.cc | 14 +
be/src/runtime/runtime-state.h | 6 +
be/src/util/debug-util.cc | 8 +
be/src/util/debug-util.h | 18 +
19 files changed, 2684 insertions(+), 2462 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6ee15fad/be/src/exec/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/be/src/exec/CMakeLists.txt b/be/src/exec/CMakeLists.txt
index 876fc7e..7cf4267 100644
--- a/be/src/exec/CMakeLists.txt
+++ b/be/src/exec/CMakeLists.txt
@@ -62,6 +62,8 @@ add_library(Exec
hbase-table-scanner.cc
incr-stats-util.cc
nested-loop-join-node.cc
+ parquet-column-readers.cc
+ parquet-metadata-utils.cc
partitioned-aggregation-node.cc
partitioned-aggregation-node-ir.cc
partitioned-hash-join-node.cc
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6ee15fad/be/src/exec/base-sequence-scanner.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/base-sequence-scanner.cc b/be/src/exec/base-sequence-scanner.cc
index dc7a983..268fdae 100644
--- a/be/src/exec/base-sequence-scanner.cc
+++ b/be/src/exec/base-sequence-scanner.cc
@@ -124,7 +124,7 @@ Status BaseSequenceScanner::ProcessSplit() {
header_ = state_->obj_pool()->Add(AllocateFileHeader());
Status status = ReadFileHeader();
if (!status.ok()) {
- RETURN_IF_ERROR(LogOrReturnError(status.msg()));
+ RETURN_IF_ERROR(state_->LogOrReturnError(status.msg()));
// We need to complete the ranges for this file.
CloseFileRanges(stream_->filename());
return Status::OK();