You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Michael Brown (Code Review)" <ge...@cloudera.org> on 2017/01/25 23:01:59 UTC

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5795

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................

IMPALA-4359: qgen: add UPSERT support

UPSERTs are very similar to INSERTs, so the UPSERT support is simply
folded into that of INSERT. We do this by adding another "conflict
action", CONFLICT_ACTION_UPDATE. The object responsible for holding the
conflict_action attribute is now the InsertClause. This is needed here
because the SqlWriter now needs to know the conflict_action both when
writing the InsertClause (Impala) and at the tail end of the
InsertStatement (PostgreSQL). We also add a few properties to the
InsertStatement interface so that the PostgresqlSqlWriter can form the
correct "DO UPDATE" conflic action, in which primary key columns and
updatable columns must be known. More information on that here:

https://www.postgresql.org/docs/9.5/static/sql-insert.html

By default, we will tend to generate 3 UPSERTs for every 1 INSERT.

In addition to adding unit tests to make sure UPSERTs are properly
written, I used discrepancy_searcher.py --profile dmlonly, both with and
without --explain-only, do run tests. I made sure we were generating
syntactically valid UPSERT statements, and that the INSERT/UPSERT ratio
was roughly 1/3 after 100 statements.

Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
---
M tests/comparison/discrepancy_searcher.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_profile.py
M tests/comparison/statement_generator.py
M tests/comparison/tests/query_object_testdata.py
6 files changed, 256 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/5795/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 2:

(3 comments)

Thanks for the review, Taras. Please see patch set 2.

http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

PS1, Line 534: be
> Why 1 here? Are we expecting there to be several "INSERT" keywords in the q
Done


http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/query.py
File tests/comparison/query.py:

Line 705:   # This enum represents possibilities for different types of INSERTs. A user of this
> I see that you explained what these mean below, but it's not to perfectly c
Done


PS1, Line 727:  so th
> inserting into a Kudu table
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

PS1, Line 534: 1)
Why 1 here? Are we expecting there to be several "INSERT" keywords in the query sql?


http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/query.py
File tests/comparison/query.py:

Line 705:   (CONFLICT_ACTION_DEFAULT,
I see that you explained what these mean below, but it's not to perfectly clear to me. Maybe explain what these mean exactly here too?

Conflict is when there is a row with the same primary key as you're trying to insert, right? Ignore means insert anyways (so duplicate the primary key?). What is default exactly?


PS1, Line 727: ng a K
inserting into a Kudu table


http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/statement_generator.py
File tests/comparison/statement_generator.py:

Line 70:     if dml_table.primary_keys:
Add a comment that if the table has primary keys then it's a Kudu table and Kudu supports UPSERT. (Otherwise upsert is not generated because it's not supported)


-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/237/

-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#2).

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................

IMPALA-4359: qgen: add UPSERT support

UPSERTs are very similar to INSERTs, so the UPSERT support is simply
folded into that of INSERT. We do this by adding another "conflict
action", CONFLICT_ACTION_UPDATE. The object responsible for holding the
conflict_action attribute is now the InsertClause. This is needed here
because the SqlWriter now needs to know the conflict_action both when
writing the InsertClause (Impala) and at the tail end of the
InsertStatement (PostgreSQL). We also add a few properties to the
InsertStatement interface so that the PostgresqlSqlWriter can form the
correct "DO UPDATE" conflic action, in which primary key columns and
updatable columns must be known. More information on that here:

https://www.postgresql.org/docs/9.5/static/sql-insert.html

By default, we will tend to generate 3 UPSERTs for every 1 INSERT.

In addition to adding unit tests to make sure UPSERTs are properly
written, I used discrepancy_searcher.py --profile dmlonly, both with and
without --explain-only, do run tests. I made sure we were generating
syntactically valid UPSERT statements, and that the INSERT/UPSERT ratio
was roughly 1/3 after 100 statements.

Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
---
M tests/comparison/discrepancy_searcher.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_profile.py
M tests/comparison/statement_generator.py
M tests/comparison/tests/query_object_testdata.py
6 files changed, 296 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/5795/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5795/1/tests/comparison/statement_generator.py
File tests/comparison/statement_generator.py:

Line 70:     if dml_table.primary_keys:
> Add a comment that if the table has primary keys then it's a Kudu table and
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

Looks good to me.

http://gerrit.cloudera.org:8080/#/c/5795/2/tests/comparison/query.py
File tests/comparison/query.py:

Line 705:   # This enum represents possibilities for different types of INSERTs. A user of this
Great explanation, thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Hello David Knupp, Taras Bobrovytsky,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5795

to look at the new patch set (#3).

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................

IMPALA-4359: qgen: add UPSERT support

UPSERTs are very similar to INSERTs, so the UPSERT support is simply
folded into that of INSERT. We do this by adding another "conflict
action", CONFLICT_ACTION_UPDATE. The object responsible for holding the
conflict_action attribute is now the InsertClause. This is needed here
because the SqlWriter now needs to know the conflict_action both when
writing the InsertClause (Impala) and at the tail end of the
InsertStatement (PostgreSQL). We also add a few properties to the
InsertStatement interface so that the PostgresqlSqlWriter can form the
correct "DO UPDATE" conflic action, in which primary key columns and
updatable columns must be known. More information on that here:

https://www.postgresql.org/docs/9.5/static/sql-insert.html

By default, we will tend to generate 3 UPSERTs for every 1 INSERT.

In addition to adding unit tests to make sure UPSERTs are properly
written, I used discrepancy_searcher.py --profile dmlonly, both with and
without --explain-only, do run tests. I made sure we were generating
syntactically valid UPSERT statements, and that the INSERT/UPSERT ratio
was roughly 1/3 after 100 statements.

Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
---
M tests/comparison/discrepancy_searcher.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_profile.py
M tests/comparison/statement_generator.py
M tests/comparison/tests/query_object_testdata.py
6 files changed, 296 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/5795/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


IMPALA-4359: qgen: add UPSERT support

UPSERTs are very similar to INSERTs, so the UPSERT support is simply
folded into that of INSERT. We do this by adding another "conflict
action", CONFLICT_ACTION_UPDATE. The object responsible for holding the
conflict_action attribute is now the InsertClause. This is needed here
because the SqlWriter now needs to know the conflict_action both when
writing the InsertClause (Impala) and at the tail end of the
InsertStatement (PostgreSQL). We also add a few properties to the
InsertStatement interface so that the PostgresqlSqlWriter can form the
correct "DO UPDATE" conflic action, in which primary key columns and
updatable columns must be known. More information on that here:

https://www.postgresql.org/docs/9.5/static/sql-insert.html

By default, we will tend to generate 3 UPSERTs for every 1 INSERT.

In addition to adding unit tests to make sure UPSERTs are properly
written, I used discrepancy_searcher.py --profile dmlonly, both with and
without --explain-only, do run tests. I made sure we were generating
syntactically valid UPSERT statements, and that the INSERT/UPSERT ratio
was roughly 1/3 after 100 statements.

Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Reviewed-on: http://gerrit.cloudera.org:8080/5795
Reviewed-by: Taras Bobrovytsky <tb...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M tests/comparison/discrepancy_searcher.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_profile.py
M tests/comparison/statement_generator.py
M tests/comparison/tests/query_object_testdata.py
6 files changed, 296 insertions(+), 70 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Taras Bobrovytsky: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 3:

rebase + fix 1 typo

-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4359: qgen: add UPSERT support

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4359: qgen: add UPSERT support
......................................................................


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5795
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6382f6ab22ba29c117e39a5d90592d3637df4b25
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No