Posted to dev@hive.apache.org by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com> on 2015/07/01 00:56:39 UTC

Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
-----------------------------------------------------------

Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Repository: hive-git


Description
-------

Improve RuleRegExp when the Expression node stack gets huge


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExp.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleStringExp.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketMapJoinOptimizer.java 6f35b87 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java a090a5b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java b8f5c71 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagate.java b5ee4ef 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 8546d21 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupByOptimizer.java af54286 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java e3d3ce6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java e850550 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 4d84f0f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java 2764ae1 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerUtils.java 108177e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 37f9473 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchAggregation.java 39e11a2 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SkewJoinOptimizer.java dc885ab 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 7bcb797 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java 51f1b74 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java bc8d8f7 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverterPostProc.java d861682 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java c1f1519 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java b56b608 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/ExprProcFactory.java c930b80 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java 51bef04 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java c304e97 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PartitionConditionRemover.java cbed375 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java d5102bc 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java f370d4d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CrossProductCheck.java 6bdb0a7 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java c0a72b6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java eb8597d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanOptimizer.java 080a0e6 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java f48d118 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 6e86d69 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java ae96def 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkReduceSinkMapJoinProc.java fd42959 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinResolver.java 608a0de 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/AnnotateWithStatistics.java 4aeeff2 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcessor.java 9937343 
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java eeccc4b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TableAccessAnalyzer.java cc0a7d1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 8ab7cd4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 0e97530 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 3a07b17 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 7f26f0f 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java ea1f713 
  ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 363e49e 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
-------

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan


Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90982
-----------------------------------------------------------

Ship it!


+1 conditional on QA clean run on Patch 5.

- John Pullokkaran




Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.

(Updated July 7, 2015, 7:54 p.m.)


Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Repository: hive-git


Description
-------

Improve RuleRegExp when the Expression node stack gets huge


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
-------

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan


Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.

(Updated July 7, 2015, 6:12 p.m.)


Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Repository: hive-git


Description
-------

Improve RuleRegExp when the Expression node stack gets huge


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
-------

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan


Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by John Pullokkaran <jp...@hortonworks.com>.

> On July 7, 2015, 6:12 p.m., Hari Sankar Sivarama Subramaniyan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java, line 52
> > <https://reviews.apache.org/r/36069/diff/2/?file=997802#file997802line52>
> >
> >     I don't clearly understand what you suggested, but I can assure you that the implementation here won't take much time, even in the worst case.

Currently, for each pattern character we traverse the whole string. Instead, create a static hash set of wildcard characters, then for each character in the rule string do a lookup to see whether it is in that set.
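
A minimal sketch of that lookup, assuming Java. The class name WildCardCheck, the helper containsWildCard, and the exact contents of the wildcard set are all hypothetical and are not taken from the patch under review:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class WildCardCheck {
  // Assumed set of regex metacharacters; built once and shared (static).
  private static final Set<Character> WILD_CARDS = Collections.unmodifiableSet(
      new HashSet<Character>(Arrays.asList(
          '[', ']', '^', '$', '*', '+', '|', '(', ')', '\\', '.', '?', '&')));

  // One pass over the rule string, with a constant-time set lookup per character.
  static boolean containsWildCard(String rule) {
    for (int i = 0; i < rule.length(); i++) {
      if (WILD_CARDS.contains(rule.charAt(i))) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(containsWildCard("TS%FIL%"));             // plain operator names
    System.out.println(containsWildCard("(TS%FIL%)|(TS%SEL%)")); // uses grouping and alternation
  }
}
```

With this shape the work is proportional to the length of the rule string, rather than to the rule length times the number of wildcard characters.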


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
-----------------------------------------------------------




Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 52)
<https://reviews.apache.org/r/36069/#comment143877>

    I don't clearly understand what you suggested, but I can assure you that the implementation here won't take much time, even in the worst case.


- Hari Sankar Sivarama Subramaniyan




Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90502
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 49)
<https://reviews.apache.org/r/36069/#comment143613>

    What about '.', '?', '&&'?



ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 52)
<https://reviews.apache.org/r/36069/#comment143618>

    Isn't it better to change wildCards to a HashSet and then use contains?


- John Pullokkaran




Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.

(Updated July 2, 2015, 1:10 a.m.)


Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Changes
-------

Thanks John for the review. Addressed John's comments.


Repository: hive-git


Description
-------

Improve RuleRegExp when the Expression node stack gets huge


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
-------

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan


Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90087
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExp.java (line 2)
<https://reviews.apache.org/r/36069/#comment143042>

    1. How about moving RuleStringExp into RuleRegExp?
    2. That way RuleRegExp can decide dynamically whether to take the string route or the regex route, which avoids all changes to the callers.
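
A rough sketch of what such a single rule class could look like, assuming Java. HybridRule, its cost method, the use of a list of node names in place of the node stack, and the wildcard set are illustrative assumptions, not the code in the patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class HybridRule {
  // Assumed set of regex metacharacters.
  private static final Set<Character> WILD_CARDS = new HashSet<Character>(Arrays.asList(
      '[', ']', '^', '$', '*', '+', '|', '(', ')', '\\', '.', '?', '&'));

  private final String literal; // non-null when the pattern is a plain string
  private final Pattern regex;  // non-null when the pattern needs the regex engine

  HybridRule(String pattern) {
    boolean wild = false;
    for (int i = 0; i < pattern.length() && !wild; i++) {
      wild = WILD_CARDS.contains(pattern.charAt(i));
    }
    // The route is chosen once, here, so callers keep constructing one rule type.
    this.literal = wild ? null : pattern;
    this.regex = wild ? Pattern.compile(pattern) : null;
  }

  // stackNames lists node names from the bottom of the stack to the top.
  // Returns the length of the matched suffix, or -1 if nothing matches.
  int cost(List<String> stackNames) {
    String suffix = "";
    for (int pos = stackNames.size() - 1; pos >= 0; pos--) {
      suffix = stackNames.get(pos) + "%" + suffix;
      boolean hit = (literal != null)
          ? suffix.equals(literal)
          : regex.matcher(suffix).matches();
      if (hit) {
        return suffix.length();
      }
      // String route: once the suffix is longer than the literal it can never
      // match, so a deep stack is not walked to the bottom.
      if (literal != null && suffix.length() > literal.length()) {
        return -1;
      }
    }
    return -1;
  }
}
```

In this sketch the early exit on the string route is what keeps the cost bounded by the pattern length when the stack is very deep; the regex route still walks the whole stack.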


- John Pullokkaran

