Posted to dev@hive.apache.org by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com> on 2015/07/01 00:56:39 UTC
Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
-----------------------------------------------------------
Review request for hive, John Pullokkaran and Mostafa Mokhtar.
Repository: hive-git
Description
-------
Improve RuleRegExp when the Expression node stack gets huge
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExp.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleStringExp.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketMapJoinOptimizer.java 6f35b87
ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java a090a5b
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java b8f5c71
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagate.java b5ee4ef
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 8546d21
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupByOptimizer.java af54286
ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java e3d3ce6
ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java e850550
ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 4d84f0f
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java 2764ae1
ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerUtils.java 108177e
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 37f9473
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchAggregation.java 39e11a2
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SkewJoinOptimizer.java dc885ab
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 7bcb797
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java 51f1b74
ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java bc8d8f7
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverterPostProc.java d861682
ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java c1f1519
ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java b56b608
ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/ExprProcFactory.java c930b80
ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java 51bef04
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java c304e97
ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PartitionConditionRemover.java cbed375
ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java d5102bc
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java f370d4d
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CrossProductCheck.java 6bdb0a7
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java c0a72b6
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MemoryDecider.java eb8597d
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanOptimizer.java 080a0e6
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java f48d118
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 6e86d69
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java ae96def
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkReduceSinkMapJoinProc.java fd42959
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSkewJoinResolver.java 608a0de
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/AnnotateWithStatistics.java 4aeeff2
ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcessor.java 9937343
ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java eeccc4b
ql/src/java/org/apache/hadoop/hive/ql/parse/TableAccessAnalyzer.java cc0a7d1
ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 8ab7cd4
ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 0e97530
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70
ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 3a07b17
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 7f26f0f
ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicateTransitivePropagate.java ea1f713
ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 363e49e
Diff: https://reviews.apache.org/r/36069/diff/
Testing
-------
Local testing.
Thanks,
Hari Sankar Sivarama Subramaniyan
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90982
-----------------------------------------------------------
Ship it!
+1 conditional on QA clean run on Patch 5.
- John Pullokkaran
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
-----------------------------------------------------------
(Updated July 7, 2015, 7:54 p.m.)
Review request for hive, John Pullokkaran and Mostafa Mokhtar.
Repository: hive-git
Description
-------
Improve RuleRegExp when the Expression node stack gets huge
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2
ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION
Diff: https://reviews.apache.org/r/36069/diff/
Testing
-------
Local testing.
Thanks,
Hari Sankar Sivarama Subramaniyan
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
-----------------------------------------------------------
(Updated July 7, 2015, 6:12 p.m.)
Review request for hive, John Pullokkaran and Mostafa Mokhtar.
Repository: hive-git
Description
-------
Improve RuleRegExp when the Expression node stack gets huge
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2
ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION
Diff: https://reviews.apache.org/r/36069/diff/
Testing
-------
Local testing.
Thanks,
Hari Sankar Sivarama Subramaniyan
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by John Pullokkaran <jp...@hortonworks.com>.
> On July 7, 2015, 6:12 p.m., Hari Sankar Sivarama Subramaniyan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java, line 52
> > <https://reviews.apache.org/r/36069/diff/2/?file=997802#file997802line52>
> >
> > I don't clearly understand what you suggested, but I can assure you that the implementation here won't take much time in the worst case.
Currently, for each pattern char we traverse the whole string. Instead, create a static HashSet of wildcard chars, then for each char in the rule string do a lookup to see if it exists in the wildcard set.
- John
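A minimal sketch of the lookup suggested above, assuming a hypothetical helper class (WildcardCheck) and an assumed list of wildcard characters; neither is taken from the patch, and the set actually used in RuleRegExp may differ. The rule string is scanned once, and each character costs one HashSet lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not the code from the patch.
final class WildcardCheck {
  // Assumed set of regex metacharacters; the list in RuleRegExp may differ.
  private static final Set<Character> WILD_CHARS = new HashSet<>(Arrays.asList(
      '[', ']', '^', '$', '*', '+', '|', '(', ')', '\\', '.', '?', '{', '}'));

  // Returns true if the rule string contains at least one wildcard character.
  // Single pass over the string, one set lookup per character.
  static boolean hasWildcard(String ruleString) {
    for (int i = 0; i < ruleString.length(); i++) {
      if (WILD_CHARS.contains(ruleString.charAt(i))) {
        return true;
      }
    }
    return false;
  }
}
```

With this sketch, an illustrative rule string such as "TS%FIL%" would be classified as a plain string, while "TS%.*RS%" would be classified as a regex.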
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
-----------------------------------------------------------
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 52)
<https://reviews.apache.org/r/36069/#comment143877>
I don't clearly understand what you suggested, but I can assure you that the implementation here won't take much time in the worst case.
- Hari Sankar Sivarama Subramaniyan
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90502
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 49)
<https://reviews.apache.org/r/36069/#comment143613>
What about '.', '?', and '&&'?
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 52)
<https://reviews.apache.org/r/36069/#comment143618>
Isn't it better to change wildCards to a HashSet and then use contains()?
- John Pullokkaran
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by Hari Sankar Sivarama Subramaniyan <hs...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
-----------------------------------------------------------
(Updated July 2, 2015, 1:10 a.m.)
Review request for hive, John Pullokkaran and Mostafa Mokhtar.
Changes
-------
Thanks John for the review. Addressed John's comments.
Repository: hive-git
Description
-------
Improve RuleRegExp when the Expression node stack gets huge
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2
ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION
Diff: https://reviews.apache.org/r/36069/diff/
Testing
-------
Local testing.
Thanks,
Hari Sankar Sivarama Subramaniyan
Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the
Expression node stack gets huge
Posted by John Pullokkaran <jp...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90087
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExp.java (line 2)
<https://reviews.apache.org/r/36069/#comment143042>
1. How about moving RuleStringExp into RuleRegExp?
2. This way RuleRegExp can decide dynamically whether to take the string route or the regex route, and avoid all changes to callers.
- John Pullokkaran
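One way to read the suggestion above, as a rough sketch only: a single rule class picks the string route or the regex route once, at construction time, so callers keep using one type. The class name (DualModeRule), the metacharacter list, and the full-match semantics are all assumptions made for illustration; the real RuleRegExp computes a cost over the node stack and is not reproduced here.

```java
import java.util.regex.Pattern;

// Hypothetical class for illustration; not the RuleRegExp from the patch.
final class DualModeRule {
  // Assumed metacharacter list; the real check in Hive may use a different set.
  private static final String META_CHARS = "[]^$*+|()\\.?{}";

  private final String patternText;
  // Compiled only when the pattern needs the regex route; null otherwise.
  private final Pattern compiled;

  DualModeRule(String patternText) {
    this.patternText = patternText;
    this.compiled = needsRegex(patternText) ? Pattern.compile(patternText) : null;
  }

  // True if the pattern contains any assumed regex metacharacter.
  private static boolean needsRegex(String s) {
    for (int i = 0; i < s.length(); i++) {
      if (META_CHARS.indexOf(s.charAt(i)) >= 0) {
        return true;
      }
    }
    return false;
  }

  boolean usesRegex() {
    return compiled != null;
  }

  // Full match in both routes: plain equality for literal patterns,
  // Matcher.matches() for patterns that contain metacharacters.
  boolean matches(String name) {
    if (compiled == null) {
      return patternText.equals(name);
    }
    return compiled.matcher(name).matches();
  }
}
```

Because the choice is made inside the constructor, code that builds rules would not need to change, which is the point of the second item above.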