Posted to dev@hive.apache.org by Amareshwari Sriramadasu <am...@apache.org> on 2011/05/09 15:36:29 UTC
Review Request: HIVE-2056
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/700/
-----------------------------------------------------------
Review request for hive.
Summary
-------
The attached patch generates a single M/R job for a multi-group-by query whose group-by clauses share a non-empty common key set. It adds the configuration parameter hive.multigroupby.singlemr to turn the optimization on and off.
This addresses bug HIVE-2056.
https://issues.apache.org/jira/browse/HIVE-2056
Diffs
-----
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1100910
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1100910
trunk/ql/src/test/queries/clientpositive/groupby10.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby8.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby8_noskew.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby9.q 1100910
trunk/ql/src/test/results/clientpositive/groupby10.q.out 1100910
trunk/ql/src/test/results/clientpositive/groupby8.q.out 1100910
trunk/ql/src/test/results/clientpositive/groupby9.q.out 1100910
Diff: https://reviews.apache.org/r/700/diff
Testing
-------
Updated jira with performance tests.
Thanks,
Amareshwari
Re: Review Request: HIVE-2056
Posted by Amareshwari Sriramadasu <am...@apache.org>.
> On 2011-05-09 17:07:16, namit jain wrote:
> > Change hive-default.xml with the new parameter.
> > Add the new parameter in the name of the jira.
Done
> On 2011-05-09 17:07:16, namit jain wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 5518
> > <https://reviews.apache.org/r/700/diff/1/?file=18439#file18439line5518>
> >
> > Add a comment - this optimization is not enabled
> > if one of the sub-queries does not involve an
> > aggregation
Done
> On 2011-05-09 17:07:16, namit jain wrote:
> > trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 5524
> > <https://reviews.apache.org/r/700/diff/1/?file=18439#file18439line5524>
> >
> > The code is not performing a prefix match.
> > I mean,
> >
> > if the query is:
> >
> > from T
> > insert overwrite T1 select ... group by c1
> > insert overwrite T1 select ... group by c2, c1
> >
> >
> > c1 will still be returned.
> >
> > Is that desirable?
> >
> > I don't think this will work - can you add a testcase
> > for this - I mean, with an explain which shows that
> > the parameter does not make a difference
> >
Agreed. I missed this.
Updated the patch with prefix matching
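For illustration only, the difference between intersecting the group-by keys and prefix matching them can be sketched as follows. This is a minimal sketch and not the code from the patch: the class and method names (CommonGroupByKeys, intersection, commonPrefix) are hypothetical, and group-by keys are modeled as plain strings rather than Hive AST nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hive: contrasts two ways of
// finding the "common" group-by keys of a multi-insert query.
public class CommonGroupByKeys {

    // Set intersection ignores key order, so "group by c1" and
    // "group by c2, c1" are reported as sharing c1.
    public static List<String> intersection(List<List<String>> keyLists) {
        if (keyLists.isEmpty()) {
            return new ArrayList<>();
        }
        Set<String> common = new LinkedHashSet<>(keyLists.get(0));
        for (List<String> keys : keyLists) {
            common.retainAll(keys);
        }
        return new ArrayList<>(common);
    }

    // Prefix matching compares keys position by position and stops
    // at the first position where any clause differs or runs out.
    public static List<String> commonPrefix(List<List<String>> keyLists) {
        List<String> prefix = new ArrayList<>();
        if (keyLists.isEmpty()) {
            return prefix;
        }
        List<String> first = keyLists.get(0);
        for (int i = 0; i < first.size(); i++) {
            String key = first.get(i);
            for (List<String> keys : keyLists) {
                if (i >= keys.size() || !keys.get(i).equals(key)) {
                    return prefix;
                }
            }
            prefix.add(key);
        }
        return prefix;
    }

    public static void main(String[] args) {
        // The reviewer's example: group by c1 / group by c2, c1
        List<List<String>> reviewerExample = Arrays.asList(
                Arrays.asList("c1"),
                Arrays.asList("c2", "c1"));
        System.out.println(intersection(reviewerExample)); // [c1]
        System.out.println(commonPrefix(reviewerExample)); // []
    }
}
```

With the reviewer's query, the intersection still returns c1, while the prefix match returns no common key, so the two clauses would not be treated as sharing a group-by key.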
- Amareshwari
Re: Review Request: HIVE-2056
Posted by namit jain <nj...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/700/#review651
-----------------------------------------------------------
Change hive-default.xml with the new parameter.
Add the new parameter in the name of the jira.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/700/#comment1306>
Add a comment - this optimization is not enabled
if one of the sub-queries does not involve an
aggregation
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/700/#comment1307>
The code is not performing a prefix match.
I mean,
if the query is:
from T
insert overwrite T1 select ... group by c1
insert overwrite T1 select ... group by c2, c1
c1 will still be returned.
Is that desirable?
I don't think this will work - can you add a testcase
for this - I mean, with an explain which shows that
the parameter does not make a difference
- namit
Re: Review Request: HIVE-2056
Posted by Amareshwari Sriramadasu <am...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/700/
-----------------------------------------------------------
(Updated 2011-05-11 13:14:36.754720)
Review request for hive.
Changes
-------
Updated the patch to do prefix matching and added a testcase.
Summary
-------
The attached patch generates a single M/R job for a multi-group-by query whose group-by clauses share a non-empty common key set. It adds the configuration parameter hive.multigroupby.singlemr to turn the optimization on and off.
This addresses bug HIVE-2056.
https://issues.apache.org/jira/browse/HIVE-2056
Diffs (updated)
-----
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1100910
trunk/conf/hive-default.xml 1100910
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1100910
trunk/ql/src/test/queries/clientpositive/groupby10.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby8.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby8_noskew.q 1100910
trunk/ql/src/test/queries/clientpositive/groupby9.q 1100910
trunk/ql/src/test/queries/clientpositive/multigroupby_singlemr.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/groupby10.q.out 1100910
trunk/ql/src/test/results/clientpositive/groupby8.q.out 1100910
trunk/ql/src/test/results/clientpositive/groupby9.q.out 1100910
trunk/ql/src/test/results/clientpositive/multigroupby_singlemr.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/700/diff
Testing (updated)
-------
Updated jira with performance tests.
All unit tests passed with the patch.
Thanks,
Amareshwari