You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Deepak Jaiswal <dj...@hortonworks.com> on 2017/05/03 20:12:03 UTC

Review Request 58973: HIVE-16578

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
-----------------------------------------------------------

Review request for hive and Jason Dere.


Repository: hive-git


Description
-------

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 


Diff: https://reviews.apache.org/r/58973/diff/1/


Testing
-------


Thanks,

Deepak Jaiswal


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.

> On May 4, 2017, 7:20 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Line 234 (original), 234 (patched)
> > <https://reviews.apache.org/r/58973/diff/1-2/?file=1707234#file1707234line234>
> >
> >     is colName ever non-null, after the change on line 226? Was under the impression that an empty StringBuilder would simply return an empty string, so this would always be non-null.

Thanks for bringing this up. I will fix it.


- Deepak


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173944
-----------------------------------------------------------


On May 4, 2017, 7:34 a.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> -----------------------------------------------------------
> 
> (Updated May 4, 2017, 7:34 a.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
>   ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_2.q.out a5fdd90811 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_3.q.out 8950b70b09 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out b910df4f34 
>   ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>


Re: Review Request 58973: HIVE-16578

Posted by Jason Dere <jd...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173944
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Line 234 (original), 234 (patched)
<https://reviews.apache.org/r/58973/#comment247015>

    is colName ever non-null, after the change on line 226? Was under the impression that an empty StringBuilder would simply return an empty string, so this would always be non-null.


- Jason Dere


On May 4, 2017, 7:34 a.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> -----------------------------------------------------------
> 
> (Updated May 4, 2017, 7:34 a.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
>   ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_2.q.out a5fdd90811 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_3.q.out 8950b70b09 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out b910df4f34 
>   ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
-----------------------------------------------------------

(Updated May 4, 2017, 8 p.m.)


Review request for hive and Jason Dere.


Changes
-------

Somehow results for a test got updated with incorrect stats.


Repository: hive-git


Description
-------

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
  ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_2.q.out a5fdd90811 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_3.q.out 8950b70b09 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out b910df4f34 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 


Diff: https://reviews.apache.org/r/58973/diff/5/

Changes: https://reviews.apache.org/r/58973/diff/4-5/


Testing
-------


Thanks,

Deepak Jaiswal


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
-----------------------------------------------------------

(Updated May 4, 2017, 7:43 p.m.)


Review request for hive and Jason Dere.


Changes
-------

Implemented the review comment.


Repository: hive-git


Description
-------

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
  ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_2.q.out a5fdd90811 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_3.q.out 8950b70b09 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out b910df4f34 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 


Diff: https://reviews.apache.org/r/58973/diff/4/

Changes: https://reviews.apache.org/r/58973/diff/3-4/


Testing
-------


Thanks,

Deepak Jaiswal


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
-----------------------------------------------------------

(Updated May 4, 2017, 7:34 a.m.)


Review request for hive and Jason Dere.


Changes
-------

Fixed failing tests.


Repository: hive-git


Description
-------

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
  ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_2.q.out a5fdd90811 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_3.q.out 8950b70b09 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out b910df4f34 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 


Diff: https://reviews.apache.org/r/58973/diff/3/

Changes: https://reviews.apache.org/r/58973/diff/2-3/


Testing
-------


Thanks,

Deepak Jaiswal


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/
-----------------------------------------------------------

(Updated May 4, 2017, 1:19 a.m.)


Review request for hive and Jason Dere.


Changes
-------

Worked on the review comments. Added some more tests.


Repository: hive-git


Description
-------

Semijoin Hints should use column name, if provided for partition key check.
Involves some code refactoring.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
  ql/src/test/queries/clientpositive/semijoin_hint.q a3cd1d664d 
  ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 1d1f86bfaa 
  ql/src/test/results/clientpositive/llap/semijoin_hint.q.out 388888ef1c 


Diff: https://reviews.apache.org/r/58973/diff/2/

Changes: https://reviews.apache.org/r/58973/diff/1-2/


Testing
-------


Thanks,

Deepak Jaiswal


Re: Review Request 58973: HIVE-16578

Posted by Deepak Jaiswal <dj...@hortonworks.com>.

> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 225 (patched)
> > <https://reviews.apache.org/r/58973/diff/1/?file=1707234#file1707234line225>
> >
> >     Can't totally tell the nesting here, but it seems like if getColumnName() returns false this will not do semijoin reduction for this column, regardless of if there is a hint or not .. is that intended?

Yes. From the old logic, in generateSemiJoinOperator(), we would return if we fail to obtain the appropriate ExprNodeColumnDesc. Instead of returning there, now it never calls the function.
Basically, the logic has been pushed down.


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 235 (patched)
> > <https://reviews.apache.org/r/58973/diff/1/?file=1707234#file1707234line237>
> >
> >     Remove your name from the log line.

Thanks! Forgot after debugging the code.


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 306 (patched)
> > <https://reviews.apache.org/r/58973/diff/1/?file=1707234#file1707234line308>
> >
> >     Not sure if you can do this and truly say this is the column for this expression - the expression could be a function, such as replace(col1, col2).
> >     
> >     How about ExprNodeDescUtils.getColumnExpr()?

Thanks for telling me about the method, will see if it works in this case.


> On May 3, 2017, 10:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
> > Lines 320 (patched)
> > <https://reviews.apache.org/r/58973/diff/1/?file=1707234#file1707234line322>
> >
> >     same here, about ExprNodeDescUtils.getColumnExpr()

ditto


- Deepak


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173818
-----------------------------------------------------------


On May 3, 2017, 8:12 p.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> -----------------------------------------------------------
> 
> (Updated May 3, 2017, 8:12 p.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>


Re: Review Request 58973: HIVE-16578

Posted by Jason Dere <jd...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58973/#review173818
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 225 (patched)
<https://reviews.apache.org/r/58973/#comment246865>

    Can't totally tell the nesting here, but it seems like if getColumnName() returns false this will not do semijoin reduction for this column, regardless of if there is a hint or not .. is that intended?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 235 (patched)
<https://reviews.apache.org/r/58973/#comment246858>

    Remove your name from the log line.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 306 (patched)
<https://reviews.apache.org/r/58973/#comment246862>

    Not sure if you can do this and truly say this is the column for this expression - the expression could be a function, such as replace(col1, col2).
    
    How about ExprNodeDescUtils.getColumnExpr()?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
Lines 320 (patched)
<https://reviews.apache.org/r/58973/#comment246868>

    same here, about ExprNodeDescUtils.getColumnExpr()


- Jason Dere


On May 3, 2017, 8:12 p.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58973/
> -----------------------------------------------------------
> 
> (Updated May 3, 2017, 8:12 p.m.)
> 
> 
> Review request for hive and Jason Dere.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Semijoin Hints should use column name, if provided for partition key check.
> Involves some code refactoring.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b8c01020b7 
> 
> 
> Diff: https://reviews.apache.org/r/58973/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>