Posted to dev@impala.apache.org by "Anonymous Coward (Code Review)" <ge...@cloudera.org> on 2016/09/09 22:13:00 UTC

[Impala-ASF-CR] IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter

stakiar@cloudera.com has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4357

Change subject: IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................

IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter

IMPALA-4100:

* Postgres and Impala support EXTRACT([field] from [date-type]), but Hive doesn't
* Hive has other UDFs that perform the same function, but they have different names
* This commit modifies the HiveSqlWriter to use the corresponding Hive functions
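
As a rough illustration of the IMPALA-4100 rewrite, the sketch below maps EXTRACT fields onto Hive date functions. The HIVE_EXTRACT_FUNCS table and the hive_extract helper are hypothetical names chosen for this example and are not taken from model_translator.py. The field list is an assumption, limited to Hive built-ins (year, month, day, hour, minute, second).

```python
# Hypothetical sketch: NOT the actual HiveSqlWriter code from this change.
# Maps an EXTRACT field name to a Hive built-in that returns the same part.
HIVE_EXTRACT_FUNCS = {
    'YEAR': 'year',
    'MONTH': 'month',
    'DAY': 'day',
    'HOUR': 'hour',
    'MINUTE': 'minute',
    'SECOND': 'second',
}


def hive_extract(field, expr_sql):
    """Render EXTRACT(<field> FROM <expr>) as a Hive function call."""
    try:
        func = HIVE_EXTRACT_FUNCS[field.upper()]
    except KeyError:
        # Fail loudly rather than emit SQL that Hive cannot parse.
        raise ValueError('No Hive equivalent known for EXTRACT field: %s' % field)
    return '{0}({1})'.format(func, expr_sql)


# EXTRACT(year FROM ts_col) would be written for Hive as year(ts_col).
print(hive_extract('year', 'ts_col'))
```

Raising on an unknown field is a design choice for the sketch: a generator that silently passes EXTRACT through would produce queries that fail only when they reach Hive.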

IMPALA-4112:

* Postgres and Impala support IS [NOT] DISTINCT FROM clauses as a null-safe comparison
* Hive doesn't support this clause, but has a null-safe equals operator: <=>
* This commit modifies the HiveSqlWriter to use <=> for IS NOT DISTINCT FROM and
  NOT(<=>) for IS DISTINCT FROM
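
To make the null-safe semantics behind IMPALA-4112 concrete, here is a small Python model of the three comparisons, with None standing in for SQL NULL. The function names are invented for this example. It only models truth values and is not how the query generator evaluates anything.

```python
def sql_equals(a, b):
    """Model of SQL '=': the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_equals(a, b):
    """Model of Hive's '<=>' and of IS NOT DISTINCT FROM: never returns NULL."""
    if a is None or b is None:
        # Two NULLs compare as equal; NULL versus a value compares as not equal.
        return a is None and b is None
    return a == b


def is_distinct_from(a, b):
    """Model of IS DISTINCT FROM: the negation of the null-safe equality."""
    return not null_safe_equals(a, b)


for left, right in [(1, 1), (1, 2), (None, 1), (None, None)]:
    print(left, right, sql_equals(left, right), null_safe_equals(left, right),
          is_distinct_from(left, right))
```

Because null_safe_equals always returns True or False, negating it is safe, which is why IS DISTINCT FROM can be expressed as NOT(a <=> b).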

Testing:

* This commit only modifies the HiveSqlWriter, so no testing against Impala was done
* Tested locally against Hive

Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
---
M tests/comparison/model_translator.py
1 file changed, 25 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/4357/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 2: Code-Review+1

This needs a committer's +2.

Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 2:

Note to committer: there's no GVO path for this, so I suggest you submit along with +2.

Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 2: Code-Review+2 Verified+1

Have we considered adding some basic sanity tests for the query generator? That might help prevent it from bitrotting in the future.
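
One possible shape for such a sanity test is a check that SQL generated for Hive contains none of the constructs this change replaces. The helper below is only a sketch. UNSUPPORTED_IN_HIVE and find_unsupported_hive_syntax are hypothetical names, the sample queries are made up, and nothing here is part of the query generator.

```python
import re

# Constructs this change identifies as unsupported by Hive (assumed list,
# for illustration only).
UNSUPPORTED_IN_HIVE = {
    'EXTRACT': re.compile(r'\bEXTRACT\s*\(', re.IGNORECASE),
    'IS [NOT] DISTINCT FROM': re.compile(
        r'\bIS\s+(NOT\s+)?DISTINCT\s+FROM\b', re.IGNORECASE),
}


def find_unsupported_hive_syntax(sql):
    """Return the sorted names of unsupported constructs found in the SQL text."""
    return sorted(name for name, pattern in UNSUPPORTED_IN_HIVE.items()
                  if pattern.search(sql))


# A query already rewritten for Hive should report nothing.
print(find_unsupported_hive_syntax('SELECT year(ts) FROM t WHERE (a) <=> (b)'))
# A Postgres/Impala-style query should be flagged.
print(find_unsupported_hive_syntax(
    'SELECT EXTRACT(year FROM ts) FROM t WHERE a IS NOT DISTINCT FROM b'))
```

A text-level check like this is deliberately crude. It would not replace running queries against Hive, but it could catch a regression in the writer cheaply.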

Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 1:

(2 comments)

Thanks for the patch! This is easy to review since the changes are small and isolated to HiveSqlWriter.

http://gerrit.cloudera.org:8080/#/c/4357/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4100, IMPALA-4112: RQG-on-Hive: Replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter
Please keep commit message lines under 90 characters.

For this line, you can convey the same info with:

  IMPALA-4100,4112: qgen: replace EXTRACT UDF and IS [NOT] DISTINCT FROM in HiveSqlWriter


http://gerrit.cloudera.org:8080/#/c/4357/1/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

PS1, Line 462:     self.operator_funcs['IsNotDistinctFrom'] = '({0}) <=> ({1})'
             :     self.operator_funcs['IsNotDistinctFromOp'] = '({0}) <=> ({1})'
             :     self.operator_funcs['IsDistinctFrom'] = 'NOT(({0}) <=> ({1}))'
What do you think about:

  self.operator_funcs.update({
      'IsNotDistinctFrom': '({0}) <=> ({1})',
      # etc.
  })
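
For reference, a self-contained sketch of the update() form with the three entries from patch set 1 filled in. SqlWriterSketch, HiveSqlWriterSketch, the 'Equals' entry and the render method are hypothetical stand-ins so the snippet runs on its own; the real classes in model_translator.py may look different.

```python
class SqlWriterSketch(object):
    """Hypothetical stand-in for the base writer; holds operator templates."""

    def __init__(self):
        # Assumed pre-existing entry, present only to show update() preserves it.
        self.operator_funcs = {'Equals': '({0}) = ({1})'}

    def render(self, op_name, left, right):
        """Fill the operator's template with the two operand SQL strings."""
        return self.operator_funcs[op_name].format(left, right)


class HiveSqlWriterSketch(SqlWriterSketch):
    """Hypothetical stand-in for HiveSqlWriter with the Hive overrides."""

    def __init__(self):
        super(HiveSqlWriterSketch, self).__init__()
        # One update() call instead of three separate item assignments.
        self.operator_funcs.update({
            'IsNotDistinctFrom': '({0}) <=> ({1})',
            'IsNotDistinctFromOp': '({0}) <=> ({1})',
            'IsDistinctFrom': 'NOT(({0}) <=> ({1}))',
        })


writer = HiveSqlWriterSketch()
print(writer.render('IsNotDistinctFrom', 't1.a', 't2.a'))
print(writer.render('IsDistinctFrom', 't1.a', 't2.a'))
```

Grouping the overrides in one literal keeps related templates together and makes later additions a one-line diff.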


Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
stakiar@cloudera.com has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4357/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
> Please keep commit message lines under 90 characters.
Will do, fixed


http://gerrit.cloudera.org:8080/#/c/4357/1/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

PS1, Line 462:     self.operator_funcs.update({
             :       'IsNotDistinctFrom': '({0}) <=> ({1})',
             :       'IsNotDistinctFromOp': '({0}) <=> ({1})',
> What do you think about:
Sounds good to me, updated.


Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 3:

> Have we considered adding some basic sanity tests for the query generator?

I've got a few directed unit tests for a single area and plan to add more as I add features.

Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
stakiar@cloudera.com has uploaded a new patch set (#2).

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................

IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

IMPALA-4100:

* Postgres and Impala support EXTRACT([field] from [date-type]), but Hive doesn't
* Hive has other UDFs that perform the same function, but they have different names
* This commit modifies the HiveSqlWriter to use the corresponding Hive functions

IMPALA-4112:

* Postgres and Impala support IS [NOT] DISTINCT FROM clauses as a null-safe comparison
* Hive doesn't support this clause, but has a null-safe equals operator: <=>
* This commit modifies the HiveSqlWriter to use <=> for IS NOT DISTINCT FROM and
  NOT(<=>) for IS DISTINCT FROM

Testing:

* This commit only modifies the HiveSqlWriter, so no testing against Impala was done
* Tested locally against Hive

Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
---
M tests/comparison/model_translator.py
1 file changed, 27 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/4357/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
stakiar@cloudera.com has posted comments on this change.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


Patch Set 3:

> > Have we considered adding some basic sanity tests for the query
 > generator?
 > 
 > I've got a few directed unit tests for a single area and plan to
 > add more as I add features.

Thanks for the quick code review! I'll start working on adding some unit tests as well.

Gerrit-MessageType: comment
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
......................................................................


IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter

IMPALA-4100:

* Postgres and Impala support EXTRACT([field] from [date-type]), but Hive doesn't
* Hive has other UDFs that perform the same function, but they have different names
* This commit modifies the HiveSqlWriter to use the corresponding Hive functions

IMPALA-4112:

* Postgres and Impala support IS [NOT] DISTINCT FROM clauses as a null-safe comparison
* Hive doesn't support this clause, but has a null-safe equals operator: <=>
* This commit modifies the HiveSqlWriter to use <=> for IS NOT DISTINCT FROM and
  NOT(<=>) for IS DISTINCT FROM

Testing:

* This commit only modifies the HiveSqlWriter, so no testing against Impala was done
* Tested locally against Hive

Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Reviewed-on: http://gerrit.cloudera.org:8080/4357
Reviewed-by: Michael Brown <mi...@cloudera.com>
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Tim Armstrong <ta...@cloudera.com>
---
M tests/comparison/model_translator.py
1 file changed, 27 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Tim Armstrong: Looks good to me, approved; Verified



Gerrit-MessageType: merged
Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stakiar@cloudera.com
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: stakiar@cloudera.com