Posted to reviews@impala.apache.org by "Paul Rogers (Code Review)" <ge...@cloudera.org> on 2019/03/04 20:16:57 UTC

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Paul Rogers has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12628 )


Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................

IMPALA-8258: Realistic star-schema tables

The tables in the `functional` db provide many interesting cases. The
tables in TPC-H and TPC-DS simulate a well-behaved application.
We also need some tables that show messy, real-world cases:

- Correlated filters (same filter on multiple tables)
- Compound join (primary, foreign) keys
- Extreme data skew
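
As a rough illustration of two of the cases listed above (compound keys and
extreme skew), here is a minimal sketch. It is not the code in build-star.py,
which is not shown in this thread; the function name, column names, and the
90/10 split are all assumptions made for the example.

```python
from collections import Counter


def fact_rows(n_rows, n_tenants=4, clients_per_tenant=3):
    """Yield hypothetical (tenant_id, client_id, amount) fact rows."""
    for i in range(n_rows):
        # Extreme skew (assumed ratio): nine of every ten rows belong to
        # tenant 0; the remaining rows rotate over the other tenants.
        tenant_id = 0 if i % 10 else 1 + (i // 10) % (n_tenants - 1)
        # Compound key: client_id is unique only within its tenant, so a
        # join to the client table needs (tenant_id, client_id) together.
        client_id = i % clients_per_tenant
        yield (tenant_id, client_id, i)


rows = list(fact_rows(100))
counts = Counter(tenant_id for tenant_id, _, _ in rows)
print(sorted(counts.items()))
```

A correlated filter would then be the same predicate, for example
tenant_id = 0, applied to both the fact table and a dimension table that
share the tenant_id half of the compound key.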

This patch adds scripts to generate five new "star_" tables, modifies the
functional scripts to load the tables, adds a text description of the
schema, and adds tests that use the schema.

Tests: This is a test-only patch. It adds a new PlannerTest that highlights
current limitations in the planner: data skew and wrong plans when
correlated filters are present.

Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
A testdata/datasets/functional/build-star.py
M testdata/datasets/functional/functional_schema_template.sql
A testdata/datasets/functional/preload
M testdata/datasets/functional/schema_constraints.csv
A testdata/datasets/functional/star-schema.txt
A testdata/workloads/functional-planner/queries/PlannerTest/card-star-schema.test
8 files changed, 579 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/12628/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12628
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Paul Rogers (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12628

to look at the new patch set (#5).

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................

IMPALA-8258: Realistic star-schema tables

The tables in the `functional` db provide many interesting cases. The
tables in TPC-H and TPC-DS simulate a well-behaved application.
We also need some tables that show messy, real-world cases:

- Correlated filters (same filter on multiple tables)
- Compound join (primary, foreign) keys
- Extreme data skew

This patch adds scripts to generate five new "star_" tables, modifies the
functional scripts to load the tables, adds a text description of the
schema, and adds tests that use the schema.

Tests: This is a test-only patch. It adds a new PlannerTest that highlights
current limitations in the planner: data skew and wrong plans when
correlated filters are present.

Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
A testdata/datasets/functional/build-star.py
M testdata/datasets/functional/functional_schema_template.sql
A testdata/datasets/functional/preload
M testdata/datasets/functional/schema_constraints.csv
A testdata/datasets/functional/star-schema.txt
A testdata/workloads/functional-planner/queries/PlannerTest/card-star-schema.test
8 files changed, 589 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/12628/5
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py
File testdata/datasets/functional/build-star.py:

http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@67
PS5, Line 67:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@82
PS5, Line 82:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@84
PS5, Line 84:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@132
PS5, Line 132:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@143
PS5, Line 143:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@160
PS5, Line 160:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@163
PS5, Line 163:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@174
PS5, Line 174:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/5/testdata/datasets/functional/build-star.py@179
PS5, Line 179:  
flake8: E203 whitespace before ':'



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:34:14 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Paul Rogers (Code Review)" <ge...@cloudera.org>.
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................


Patch Set 4:

Passed pre-review tests: https://jenkins.impala.io/job/pre-review-test/328/


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:17:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................


Patch Set 4:

(41 comments)

http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py
File testdata/datasets/functional/build-star.py:

http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@47
PS4, Line 47: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@50
PS4, Line 50: def buildTenantTable() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@50
PS4, Line 50:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@53
PS4, Line 53:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@60
PS4, Line 60: def buildVendorTable() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@60
PS4, Line 60:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@63
PS4, Line 63:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@65
PS4, Line 65:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@74
PS4, Line 74: def buildProductTable() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@74
PS4, Line 74:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@77
PS4, Line 77:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@79
PS4, Line 79:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@81
PS4, Line 81:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@91
PS4, Line 91: def buildClientTable() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@91
PS4, Line 91:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@94
PS4, Line 94:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@96
PS4, Line 96:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@114
PS4, Line 114: def buildFactTable() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@114
PS4, Line 114:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@121
PS4, Line 121:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@123
PS4, Line 123:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@127
PS4, Line 127:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@136
PS4, Line 136: def buildTables() :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@136
PS4, Line 136:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@137
PS4, Line 137:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@139
PS4, Line 139:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@147
PS4, Line 147: def makeTable(name) :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@147
PS4, Line 147:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@151
PS4, Line 151: def writeHeader(out, fields) :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@151
PS4, Line 151:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@152
PS4, Line 152:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@154
PS4, Line 154:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@155
PS4, Line 155:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@162
PS4, Line 162: def writeRow(out, fields) :
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@162
PS4, Line 162:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@164
PS4, Line 164:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@165
PS4, Line 165:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@166
PS4, Line 166:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@170
PS4, Line 170:  
flake8: E203 whitespace before ':'


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@175
PS4, Line 175: if __name__ == "__main__":
flake8: E305 expected 2 blank lines after class or function definition, found 1


http://gerrit.cloudera.org:8080/#/c/12628/4/testdata/datasets/functional/build-star.py@176
PS4, Line 176:  
flake8: E203 whitespace before ':'
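
For reference, the four codes reported above are: E231 (missing whitespace
after ','), E302 (two blank lines expected before a top-level def), E203
(whitespace before ':'), and E305 (two blank lines expected after the last
function definition). The sketch below shows what a conforming layout might
look like. The function names and signatures are taken from the flagged
lines, but the bodies are invented for illustration and are not the actual
contents of build-star.py.

```python
import io


def writeRow(out, fields):
    # E302: two blank lines above each top-level def.
    # E203: 'fields):' rather than 'fields) :'.
    out.write(",".join(str(f) for f in fields) + "\n")


def buildTenantTable():
    # Hypothetical body: writes three rows to an in-memory buffer.
    out = io.StringIO()
    for i in range(3):  # E203: no space before the ':' that ends the line
        writeRow(out, [i, "tenant-%d" % i])  # E231: space after each ','
    return out.getvalue()


if __name__ == "__main__":  # E305: two blank lines after the last def
    print(buildTenantTable(), end="")
```

Running flake8 locally on the script before uploading a patch set should
surface the same findings that the review bot posts here.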



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:17:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................


Patch Set 5:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/2343/ : Initial code review checks failed. See linked job for details on the failure.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Mar 2019 21:01:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
......................................................................


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/2342/ : Initial code review checks failed. See linked job for details on the failure.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers <pr...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:45:26 +0000
Gerrit-HasComments: No