Posted to reviews@impala.apache.org by "Gabor Kaszab (Code Review)" <ge...@cloudera.org> on 2017/08/09 10:12:32 UTC

[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.

Gabor Kaszab has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7625

Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
......................................................................

IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions
pointing to the same filesystem location.

The maps storing file descriptors and file metadata used the file name as a key.
When multiple partitions pointed to the same filesystem location, these map entries
were occasionally overwritten by another partition pointing to the same location.

As a solution, the map key was changed to a pair of partition ID and file name.
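
To illustrate the collision described above, here is a minimal C++ sketch. It is not the Impala code: FileDesc, kSharedPath and the Build/Demo functions are hypothetical names, and std::map is used only to keep the example short. PartitionFileKey is borrowed from the review thread as a name for the (partition ID, file name) pair.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a per-file descriptor; only the owning
// partition ID matters for this illustration.
struct FileDesc {
  int64_t partition_id;
};

// Hypothetical path shared by two partitions.
const char* const kSharedPath = "/warehouse/t/data.txt";

// Old scheme: keyed on file name alone. The second partition that points
// at the same file replaces the first partition's entry.
inline std::map<std::string, FileDesc> BuildByFilename() {
  std::map<std::string, FileDesc> descs;
  descs[kSharedPath] = FileDesc{1};
  descs[kSharedPath] = FileDesc{2};  // overwrites the entry for partition 1
  return descs;
}

// New scheme: keyed on (partition ID, file name), so both entries survive.
using PartitionFileKey = std::pair<int64_t, std::string>;

inline std::map<PartitionFileKey, FileDesc> BuildByPartitionAndFilename() {
  std::map<PartitionFileKey, FileDesc> descs;
  descs[PartitionFileKey(1, kSharedPath)] = FileDesc{1};
  descs[PartitionFileKey(2, kSharedPath)] = FileDesc{2};
  return descs;
}

// Checks that the file-name key loses an entry and the pair key does not.
inline void DemoKeyCollision() {
  assert(BuildByFilename().size() == 1);
  assert(BuildByPartitionAndFilename().size() == 2);
}
```

Calling DemoKeyCollision() from a main function runs both checks. With the file-name key, only the descriptor of the partition registered last remains, which would be consistent with the wrong partition-column values reported in the JIRA.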

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M tests/metadata/test_partition_metadata.py
7 files changed, 109 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1040/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................

IMPALA-5412 Fix scan result with partitions on same file

The maps storing file descriptors and file metadata used the
file name as a key. When multiple partitions pointed to the
same filesystem location, these map entries were occasionally
overwritten by another partition pointing to the same
location.

As a solution, the map key was changed to a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 109 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1040/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 7: Verified+1

https://jenkins.impala.io/job/parallel-all-tests/1284/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7625/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 365:   typedef std::unordered_map<PartitionFileKey , HdfsFileDesc*, pair_hash> FileDescMap;
> extra space after PartitionFileKey
Done


http://gerrit.cloudera.org:8080/#/c/7625/5/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 52:   def test_multiple_partitions_same_location(self, vector, unique_database):
> I think this test could use some cleanup, but I'm ok to accept this patch i
I see your point but I don't have the time right now to do this refactoring. I think we should get this in and improve the tests as a separate step (if we think it's important).


Line 103:     # check if using num_nodes=1 has the same behaviour
> # force all scan ranges to be on the same node
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................


Patch Set 3:

(1 comment)

> (2 comments)
 > 
 > Gabor told me that he won't be able to work on this for around 10
 > days. From my point of view this is pretty close, so I think we
 > should consider filing a follow-on JIRA for the extra test coverage
 > and getting this in soonish to give it time to bake.

Works for me.

http://gerrit.cloudera.org:8080/#/c/7625/3/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS3, Line 589:   c
nit: 4 spaces



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has uploaded a new patch set (#3).

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................

IMPALA-5412 Fix scan result with partitions on same file

The maps storing file descriptors and file metadata used the
file name as a key. When multiple partitions pointed to the
same filesystem location, these map entries were occasionally
overwritten by another partition pointing to the same
location.

As a solution, the map key was changed to a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 120 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 8: Code-Review+2 Verified+1

Carry +2
https://jenkins.impala.io/job/parallel-all-tests/1284/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................

IMPALA-5412: Fix scan result with partitions on same file

The maps storing file descriptors and file metadata used the
file name as a key. When multiple partitions pointed to the
same filesystem location, these map entries were occasionally
overwritten by another partition pointing to the same
location.

As a solution, the map key was changed to a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 109 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/5

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................


Patch Set 3:

(2 comments)

Gabor told me that he won't be able to work on this for around 10 days. From my point of view this is pretty close, so I think we should consider filing a follow-on JIRA for the extra test coverage and getting this in soonish to give it time to bake.

http://gerrit.cloudera.org:8080/#/c/7625/3/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

PS3, Line 86: Imapala
There's still an extra 'a' :)


PS3, Line 125:  
Trailing whitespace.



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................


Patch Set 3:

(3 comments)

I ended up combining the tests in test_partition_metadata and adding parquet and seq. Reran the tests to confirm that it still failed before the fix.

http://gerrit.cloudera.org:8080/#/c/7625/3/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS3, Line 589:   c
> nit: 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/3/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

PS3, Line 86: Imapala
> There's still an extra 'a' :)
Done


PS3, Line 125:  
> Trailing whitespace.
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 7: Code-Review+2

Rebase


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has uploaded a new patch set (#2).

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................

IMPALA-5412 Fix scan result with partitions on same file

The maps storing file descriptors and file metadata used the file name as a key.
When multiple partitions pointed to the same filesystem location, these map entries
were occasionally overwritten by another partition pointing to the same location.

As a solution, the map key was changed to a pair of partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M tests/metadata/test_partition_metadata.py
7 files changed, 109 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
......................................................................


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

Line 90:           context->partition_descriptor()->id(),stream_->filename()));
nit: space after ,


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS1, Line 266: string
Do we need to call the string constructor explicitly? Seems a bit weird.


PS1, Line 634: name) {
Nit: long line > 90 chars


Line 648: void* HdfsScanNodeBase::GetFileMetadata(
Nit: I think this line fits in 90 characters.


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 197:   /// Allocate a new scan range object, stored in the runtime state's object pool. For
This comment needs updating since partition_id is now required.


Line 360:   struct pair_hash {
We have some utilities like this already in util/container-util.h, let's move it there.


Line 363:         return std::hash<T2>{}(p.second);
Let's hash both values in the pair and combine them with boost::hash_combine. The current approach probably works ok for this use case but it seems better to implement a general-purpose pair hash that can be reused, given it's not much more work.
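
A minimal sketch of what such a general-purpose pair hash might look like. The names HashCombine, PairHash, FileSizeMap and DemoPairHash are hypothetical, and the mixing step is written out by hand in the style commonly associated with boost::hash_combine so that the example needs no Boost headers. It is an illustration under those assumptions, not the code that was merged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Mixes the hash of `v` into `seed`. Hand-written stand-in for
// boost::hash_combine (an assumption made to avoid a Boost dependency).
template <typename T>
inline void HashCombine(std::size_t& seed, const T& v) {
  seed ^= std::hash<T>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical hasher for std::pair in which both members contribute,
// unlike a hasher that only looks at p.second.
struct PairHash {
  template <typename T1, typename T2>
  std::size_t operator()(const std::pair<T1, T2>& p) const {
    std::size_t seed = 0;
    HashCombine(seed, p.first);
    HashCombine(seed, p.second);
    return seed;
  }
};

// (partition ID, file name) key, as discussed in this review.
using PartitionFileKey = std::pair<int64_t, std::string>;
// Hypothetical map from that key to a file size, just to exercise the hasher.
using FileSizeMap = std::unordered_map<PartitionFileKey, int64_t, PairHash>;

// Inserts the same file name under two partition IDs and checks both survive.
inline void DemoPairHash() {
  FileSizeMap sizes;
  sizes[PartitionFileKey(1, "/t/data.txt")] = 10;
  sizes[PartitionFileKey(2, "/t/data.txt")] = 20;
  assert(sizes.size() == 2);
  assert(sizes.at(PartitionFileKey(1, "/t/data.txt")) == 10);
  assert(sizes.at(PartitionFileKey(2, "/t/data.txt")) == 20);
}
```

Hashing only the second member still gives correct lookups, because key equality compares both members, but every partition sharing a file name would land in the same bucket. Combining both hashes avoids that and makes the hasher reusable for other pair keys.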


http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 43:     """Regression test for IMPALA-597. Verifies Impala is able to properly read
Can we run this test for Parquet too?

TestInsertQueries has text and parquet in its test matrix and uses the "STORED AS" clause too, so we can copy its approach.


Line 71:     data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME)
Can we run this query with num_nodes=1 to verify that the same bug doesn't exist with text file?


Line 80:   def test_multiple_partitions_same_location_avro(self, vector, unique_database):
Test looks good, I think we should just make sure we address the full test coverage gap for the other file formats too.

Let's also run this test for the other affected formats with unsupported writers: SequenceFile.

If we change the test matrix for this class as I suggest above, we probably want to pull out this test into a separate class with a different test matrix.


PS1, Line 81: imapala
Impala


Line 110:     # (note, that shortcoming on avro writer would result the inserted value as NULL,
How about we only query the second column to avoid this complication?



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


IMPALA-5412: Fix scan result with partitions on same file

The maps storing file descriptors and file metadata used the
file name as a key. When multiple partitions pointed to the
same filesystem location, these map entries were occasionally
overwritten by another partition pointing to the same
location.

As a solution, the map key was changed to a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Reviewed-on: http://gerrit.cloudera.org:8080/7625
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Tim Armstrong <ta...@cloudera.com>
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 110 insertions(+), 47 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................


Patch Set 1:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

Line 90:           context->partition_descriptor()->id(),stream_->filename()));
> nit:space after ,
Done


PS1, Line 163: ,s
> space
Done


PS1, Line 309:     
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 613:     context_->partition_descriptor()->id(), filename());
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS1, Line 266: ,
> space
Done


PS1, Line 266: string
> Do we need to call the string constructor explicitly? Seems a bit weird.
Done


PS1, Line 266: string
> remove string constructor here
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 197:   /// Allocate a new scan range object, stored in the runtime state's object pool. For
> This comment needs updating since partition_id is now required.
Done


PS1, Line 237:   /// Returns nullptr if the search doesn't find the descriptor.
> not true; it has a dcheck
Done


PS1, Line 360: pair_hash
> misleading name because the hash is computed on the value.
Apparently, unordered_map doesn't accept a pair as a key by default; it only works if a custom hash is provided.


Line 360:   struct pair_hash {
> We have some utilities like this already in util/container-util.h, let's mo
Done


Line 363:         return std::hash<T2>{}(p.second);
> Let's hash both values in the pair and combine them with boost::hash_combin
Done


PS1, Line 369: pair<int64_t, std::string>
> Can you typedef this and use that here and in the .cc file?
Done


PS1, Line 381: pair<int64_t, std::string>
> same
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 71:     data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME)
> Can we run this query with num_nodes=1 to verify that the same bug doesn't 
Done


PS1, Line 81: ad
> as?
Done


PS1, Line 81: imapala
> Impala
Done


PS1, Line 109: output
> 'output' isn't clear (it doesn't match up with the queries). say this is wh
Done


Line 110:     # (note, that shortcoming on avro writer would result the inserted value as NULL,
> How about we only query the second column to avoid this complication?
Done


PS1, Line 122: output:
             :     # [NULL, 1] 3 times
             :     # [NULL, 2] 3 times
> again, this is confusing
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7625/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 365:   typedef std::unordered_map<PartitionFileKey , HdfsFileDesc*, pair_hash> FileDescMap;
extra space after PartitionFileKey


http://gerrit.cloudera.org:8080/#/c/7625/5/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 52:   def test_multiple_partitions_same_location(self, vector, unique_database):
I think this test could use some cleanup, but I'm ok to accept this patch if you feel the cleanup is too cumbersome.

* instead of limiting the file formats and relying on allow_unsupported_formats, we could use existing alltypes data (create table like, then create partitions pointing to known locations with data)
* split up the read and write tests; it's good to have coverage of the write path, but none of the JIRAs mentioned here were bugs in the write path


Line 103:     # check if using num_nodes=1 has the same behaviour
# force all scan ranges to be on the same node


-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 7:

Running tests against the Impala-lzo branch with matching change: https://jenkins.impala.io/job/parallel-all-tests/1281/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
......................................................................


Patch Set 1:

(2 comments)

Looks like MJ and I collided. I think we mostly agree here.

http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

PS1, Line 80: avro
> does this need to be avro? seems like text would work, and this test class 
The bug affects Seq/RC/Avro only.


PS1, Line 110: note, that shortcoming on avro writer would result the inserted value as NULL,
             :     # but the point is the second column for partition i
> This isn't a clear sentence; why does the avro writer insert NULL?
The Avro writer is broken. It always represents null values as the second element of the union, ignoring any schema. But the default avro schema in the frontend represents null values as the first element of the union.

I guess we could flip the 0 and 1 in HdfsAvroTableWriter::AppendField() and at least make the writer usable for this test.
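
Editor's note: the union ordering described above can be made concrete with a small sketch. In Avro's binary encoding, a union value is written as the zero-based position of the chosen branch (a zigzag varint), followed by the value itself. The helper names below are hypothetical and are not taken from HdfsAvroTableWriter; the sketch only assumes a two-branch union of "null" and "int".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends n using Avro's zigzag + variable-length integer encoding.
void AppendZigZagLong(int64_t n, std::vector<uint8_t>* out) {
  uint64_t z = (static_cast<uint64_t>(n) << 1) ^ static_cast<uint64_t>(n >> 63);
  while (z >= 0x80) {
    out->push_back(static_cast<uint8_t>((z & 0x7F) | 0x80));
    z >>= 7;
  }
  out->push_back(static_cast<uint8_t>(z));
}

// Hypothetical helper: encodes a nullable int for a two-branch union.
// null_branch is the position of "null" in the union:
// 0 for ["null", "int"], 1 for ["int", "null"].
std::vector<uint8_t> EncodeNullableInt(bool is_null, int32_t value,
                                       int null_branch) {
  std::vector<uint8_t> out;
  const int value_branch = 1 - null_branch;
  // First the branch index, then the value (null has no payload).
  AppendZigZagLong(is_null ? null_branch : value_branch, &out);
  if (!is_null) AppendZigZagLong(value, &out);
  return out;
}
```

If a writer treats null as branch 1 while the reader's schema lists null as branch 0, a non-null value is written with branch index 0 and the reader decodes it as null. That would be consistent with the NULL values the test comment mentions, though this sketch does not reproduce the writer's actual code path.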



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
......................................................................


Patch Set 1:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/7625/1//COMMIT_MSG
Commit Message:

PS1, Line 7: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions
           : pointing to the same filesystem location.
           : 
           : The maps storing file descriptors and file metadata were using filename as a key.
           : Multiple partitions pointing to the same filesystem location resulted that these
           : map entries were occasionally overwritted by the other partition poing to the same.
           : 
           : As a solution the map key was enhanced to contain a pair of partition ID and file name.
wrap at 60cols in git commit msgs


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS1, Line 163: ,s
space


PS1, Line 309:     
4 spaces


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 613:     context_->partition_descriptor()->id(), filename());
4 spaces


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS1, Line 266: string
remove string constructor here


PS1, Line 266: ,
space


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS1, Line 237:   /// Returns nullptr if the search doesn't find the descriptor.
not true; it has a dcheck


PS1, Line 360: pair_hash
misleading name because the hash is computed on the value.

why is it necessary to provide a custom hash?
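
Editor's note on the question above: the C++ standard library does not specialize std::hash for std::pair, so a std::unordered_map keyed by a pair will not compile without an explicit hasher. The sketch below is illustrative only. PairHash and the int payload are hypothetical stand-ins, not the definitions from the patch, and the mixing step imitates the well-known boost::hash_combine formula rather than calling Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical key type: (partition id, file name).
using PartitionFileKey = std::pair<int64_t, std::string>;

// Explicit hasher for the pair key; std::hash<std::pair<...>> does not exist.
struct PairHash {
  std::size_t operator()(const PartitionFileKey& key) const {
    std::size_t seed = std::hash<int64_t>()(key.first);
    // Mix in the second element, in the style of boost::hash_combine.
    seed ^= std::hash<std::string>()(key.second) + 0x9e3779b9 +
            (seed << 6) + (seed >> 2);
    return seed;
  }
};

// The int payload stands in for a file descriptor pointer.
using FileDescMap = std::unordered_map<PartitionFileKey, int, PairHash>;

// Registers the same file under two partition ids and returns the entry count.
std::size_t CountEntriesForSharedFile() {
  FileDescMap m;
  m[PartitionFileKey(1, "/data/t/f.avro")] = 10;
  m[PartitionFileKey(2, "/data/t/f.avro")] = 20;
  return m.size();
}
```

An ordered std::map would avoid the hasher entirely, since std::pair already provides operator<, at the cost of logarithmic lookups.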


PS1, Line 369: pair<int64_t, std::string>
Can you typedef this and use that here and in the .cc file?


PS1, Line 381: pair<int64_t, std::string>
same


http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

PS1, Line 80: avro
does this need to be avro? seems like text would work, and this test class is using the text/none dimension (see line 37)


PS1, Line 81: ad
as?


PS1, Line 109: output
'output' isn't clear (it doesn't match up with the queries). say this is what would get returned by 
 select i, j from %s


PS1, Line 110: note, that shortcoming on avro writer would result the inserted value as NULL,
             :     # but the point is the second column for partition i
This isn't a clear sentence; why does the avro writer insert NULL?


PS1, Line 122: output:
             :     # [NULL, 1] 3 times
             :     # [NULL, 2] 3 times
again, this is confusing



Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 5: Code-Review+1

Looks good


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7625

to look at the new patch set (#6).

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................

IMPALA-5412: Fix scan result with partitions on same file

The maps storing file descriptors and file metadata were using
filename as a key. When multiple partitions pointed to the same
filesystem location, these map entries were occasionally
overwritten by another partition pointing to the same
location.

As a solution the map key was changed to a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 110 insertions(+), 47 deletions(-)
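
Editor's note: the overwrite described in the commit message can be illustrated with a minimal sketch. FileDesc, the path, and the function names are hypothetical stand-ins; the real maps live in hdfs-scan-node-base.h and hold richer descriptors.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a file descriptor; it only records its partition.
struct FileDesc {
  int64_t partition_id;
};

const char* const kSharedFile = "/warehouse/t/shared/data.seq";

// Keyed by file name only: the second registration overwrites the first,
// so every lookup sees the last partition that was registered.
int64_t PartitionSeenByFilenameKey() {
  std::map<std::string, FileDesc> by_name;
  by_name[kSharedFile] = FileDesc{1};
  by_name[kSharedFile] = FileDesc{2};  // silently replaces partition 1
  return by_name[kSharedFile].partition_id;
}

// Keyed by (partition id, file name): both registrations survive.
int64_t PartitionSeenByPairKey(int64_t partition_id) {
  std::map<std::pair<int64_t, std::string>, FileDesc> by_pair;
  by_pair[std::make_pair(int64_t{1}, std::string(kSharedFile))] = FileDesc{1};
  by_pair[std::make_pair(int64_t{2}, std::string(kSharedFile))] = FileDesc{2};
  return by_pair[std::make_pair(partition_id, std::string(kSharedFile))]
      .partition_id;
}
```

With the file-name key, a scan of partition 1 would pick up partition 2's descriptor, which matches the wrong partition-column values reported in the JIRA title.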


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/6

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-5412: Fix scan result with partitions on same file

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5412: Fix scan result with partitions on same file
......................................................................


Patch Set 6: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No