You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2019/04/30 14:22:13 UTC
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13190
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 78 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13190/1
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13190
to look at the new patch set (#5).
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- fixed a similar test, test_explain_validate_cardinality_estimates
(the format of the line it looks for has changed, which lead to
skipping the actual verification and accepting everything)
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 80 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13190/5
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 2:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/3002/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Apr 2019 15:16:33 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 2:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/13190/2/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
File fe/src/test/java/org/apache/impala/catalog/CatalogTest.java:
http://gerrit.cloudera.org:8080/#/c/13190/2/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@469
PS2, Line 469: // Note that this test used to be flaky, so it was removed (6 years ago). I added it
> Yeah I think we'd have noticed if stats were disappearing in our Jenkin's r
Done
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py@141
PS2, Line 141: def test_explain_row_size_estimates(self, vector, unique_database):
> My initial reaction was that this should be a planner test. I can see, thou
I agree that it would be better to check this among planner tests - if the planner acts in an unexpected way, then it is the best to catch it in planner tests. My laziness says that it has a nice place near the similar test_explain_validate_cardinality_estimates.
There was a benefit in doing it here - it made me realize that test_explain_validate_cardinality_estimates is totally broken, it accepts any cardinality as its regexp no longer matches the lines it should.
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py@142
PS2, Line 142: # Tests that EXPLAIN returns the expected row sizes with and without stats.
> Should use a """ """ docstring for method comments
Done
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 14:31:46 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13190
to look at the new patch set (#3).
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- fixed a similar test, test_explain_validate_cardinality_estimates
(the format of the line it looks for has changed, which lead to
skipping the actual verification and accepting everything)
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 85 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13190/3
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 3:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/3036/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 15:01:33 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- fixed a similar test, test_explain_validate_cardinality_estimates
(the format of the line it looks for has changed, which lead to
skipping the actual verification and accepting everything)
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Reviewed-on: http://gerrit.cloudera.org:8080/13190
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 80 insertions(+), 50 deletions(-)
Approvals:
Impala Public Jenkins: Looks good to me, approved; Verified
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 5:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/3041/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 19:23:09 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 2:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/13190/2/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
File fe/src/test/java/org/apache/impala/catalog/CatalogTest.java:
http://gerrit.cloudera.org:8080/#/c/13190/2/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@469
PS2, Line 469: // Note that this test used to be flaky, so it was removed (6 years ago). I added it
Yeah I think we'd have noticed if stats were disappearing in our Jenkin's runs. Could just remove this comment.
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py@141
PS2, Line 141: def test_explain_row_size_estimates(self, vector, unique_database):
My initial reaction was that this should be a planner test. I can see, though, that there's some value in having an end-to-end test like this since it does actually test that the stats are handled correctly end-to-end from compute stats to the planner. Can you add a sentence to the method comment to justify why this should be an end-to-end test. We don't want future people blindly copying this instead of writing a planner test.
http://gerrit.cloudera.org:8080/#/c/13190/2/tests/metadata/test_explain.py@142
PS2, Line 142: # Tests that EXPLAIN returns the expected row sizes with and without stats.
Should use a """ """ docstring for method comments
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Apr 2019 16:02:09 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 4: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/13190/4/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:
http://gerrit.cloudera.org:8080/#/c/13190/4/tests/metadata/test_explain.py@87
PS4, Line 87: assert found_match
Can you make sure these asserts print the output with the missing string? Makes it easier to debug if it fails in future.
assert found_match, query_result
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 17:35:18 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13190
to look at the new patch set (#4).
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- fixed a similar test, test_explain_validate_cardinality_estimates
(the format of the line it looks for has changed, which lead to
skipping the actual verification and accepting everything)
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 80 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13190/4
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 6: Code-Review+2
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 18:40:28 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13190
to look at the new patch set (#2).
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
IMPALA-8409: Fix row-size for STRING columns with unknown stats
Explain returned row-size=11B for STRING columns without statistics.
The issue was caused by adding -1 (meaning unknown) to the 12 byte
slot size (sizeof(StringValue)). The code in TupleDescriptor.java
tried to handle this by checking if the size is -1, but it was
already 11 at this point.
There is more potential for cleanup, but I wanted to keep this
change minimal.
Testing:
- revived some tests in CatalogTest.java that were removed
in 2013 due to flakiness
- added an EE test that checks row size with and without stats
- ran core FE and EE tests
Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
---
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/metadata/test_explain.py
5 files changed, 78 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/13190/2
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 6: Verified+1
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 23:58:25 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 4:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/3037/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 15:11:04 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 1:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/3001/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Apr 2019 15:08:43 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 5: Code-Review+2
(1 comment)
Carry +2 from Tim.
http://gerrit.cloudera.org:8080/#/c/13190/4/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:
http://gerrit.cloudera.org:8080/#/c/13190/4/tests/metadata/test_explain.py@87
PS4, Line 87: assert found_match, query_result
> Can you make sure these asserts print the output with the missing string? M
Thanks, good idea!
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 18:39:14 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8409: Fix row-size for STRING columns with unknown stats
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13190 )
Change subject: IMPALA-8409: Fix row-size for STRING columns with unknown stats
......................................................................
Patch Set 6:
Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4143/ DRY_RUN=false
--
To view, visit http://gerrit.cloudera.org:8080/13190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
Gerrit-Change-Number: 13190
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 May 2019 18:40:29 +0000
Gerrit-HasComments: No