Posted to dev@hive.apache.org by "Kevin Wilfong (Created) (JIRA)" <ji...@apache.org> on 2012/01/21 01:44:39 UTC
[jira] [Created] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Reduce Sink deduplication fails if the child reduce sink is followed by a join
------------------------------------------------------------------------------
Key: HIVE-2732
URL: https://issues.apache.org/jira/browse/HIVE-2732
Project: Hive
Issue Type: Bug
Reporter: Kevin Wilfong
set hive.optimize.reducededuplication=true;
set hive.auto.convert.join=true;
explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key;
The explain statement above fails with the following exception:
java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
If hive.auto.convert.join is set to false, Hive instead produces an incorrect plan in which the two halves of the join are processed in two separate MapReduce tasks and the reducers of both tasks contain the join operator, which results in an exception.
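One plausible reading of the ClassCastException above is that code on the map-join conversion path casts a join's parent operator to ReduceSinkOperator without checking its type, and that after deduplication a SelectOperator sits in that position. The sketch below reproduces only that failure mode in isolation. The classes are hypothetical stand-ins named after the ones in the stack trace, not Hive's real operator classes, and the operator layout is an assumption rather than something taken from the Hive source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the operator classes named in the stack trace.
// None of this is Hive's actual API; it only illustrates the failing cast.
public class CastSketch {
    static class Operator {
        final List<Operator> parents = new ArrayList<>();
    }
    static class SelectOperator extends Operator {}
    static class ReduceSinkOperator extends Operator {}
    static class JoinOperator extends Operator {}

    // Mimics code that assumes every parent of a join is a reduce sink.
    static ReduceSinkOperator uncheckedParent(JoinOperator join, int index) {
        return (ReduceSinkOperator) join.parents.get(index);
    }

    // Returns true when the unchecked cast throws ClassCastException.
    static boolean castFails(JoinOperator join, int index) {
        try {
            uncheckedParent(join, index);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        JoinOperator join = new JoinOperator();
        // Assumed layout: one side of the join lost its reduce sink and is now
        // fed by a select operator, while the other side is still intact.
        join.parents.add(new SelectOperator());
        join.parents.add(new ReduceSinkOperator());

        System.out.println("parent 0 cast fails: " + castFails(join, 0));
        System.out.println("parent 1 cast fails: " + castFails(join, 1));
    }
}
```

An instanceof check before the cast would avoid the exception in this sketch, but it would not by itself repair a plan in which the join no longer has a reduce sink on each input.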
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274797#comment-13274797 ]
Namit Jain commented on HIVE-2732:
----------------------------------
+1
Running tests
[jira] [Assigned] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis reassigned HIVE-2732:
---------------------------
Assignee: Navis
[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2732:
-----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
Committed. Thanks Navis
[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-2732:
------------------------
Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-2732:
------------------------------
Attachment: HIVE-2732.D1809.2.patch
navis updated the revision "HIVE-2732 [jira] Reduce Sink deduplication fails if the child reduce sink is followed by a join".
Reviewers: JIRA
1. Rebased & Fixed test case
REVISION DETAIL
https://reviews.facebook.net/D1809
AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
ql/src/test/queries/clientpositive/reduce_deduplicate_exclude_join.q
ql/src/test/results/clientpositive/reduce_deduplicate_exclude_join.q.out
[jira] [Commented] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276395#comment-13276395 ]
Hudson commented on HIVE-2732:
------------------------------
Integrated in Hive-trunk-h0.21 #1433 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1433/])
HIVE-2732 Reduce Sink deduplication fails if the child reduce sink is followed by a join
(Navis via namit) (Revision 1338871)
Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1338871
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate_exclude_join.q
* /hive/trunk/ql/src/test/results/clientpositive/reduce_deduplicate_exclude_join.q.out
[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join
Posted by "Phabricator (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-2732:
------------------------------
Attachment: HIVE-2732.D1809.1.patch
navis requested code review of "HIVE-2732 [jira] Reduce Sink deduplication fails if the child reduce sink is followed by a join".
Reviewers: JIRA
DPAL-854 Reduce Sink deduplication fails if the child reduce sink is followed by a join
set hive.optimize.reducededuplication=true;
set hive.auto.convert.join=true;
explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key;
fails with the following exception
java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
If hive.auto.convert.join is set to false, the query instead produces an incorrect plan: the two halves of the join are processed in two separate MapReduce tasks, and the reducers of both tasks contain the join operator, which results in an exception.
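For reference, the second failure mode can be reproduced with the same query by flipping only the map-join setting. This is a minimal sketch that reuses the `src` table from the report. The final `set` line is an assumed workaround (disabling the optimization named in the issue title) and has not been verified here.

```sql
-- Same query as in the report, with automatic map-join conversion disabled.
-- Per the description, this yields an incorrect plan rather than the ClassCastException.
set hive.optimize.reducededuplication=true;
set hive.auto.convert.join=false;
explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key;

-- Assumption (unverified): turning off reduce sink deduplication should bypass the failing path.
set hive.optimize.reducededuplication=false;
```

Comparing the `explain` output with the last setting on and off should show whether the join ends up in the reducers of two separate tasks.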
TEST PLAN
EMPTY
REVISION DETAIL
https://reviews.facebook.net/D1809
AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
ql/src/test/queries/clientpositive/reduce_deduplicate_exclude_gby.q
ql/src/test/results/clientpositive/reduce_deduplicate_exclude_gby.q.out
> Reduce Sink deduplication fails if the child reduce sink is followed by a join
> ------------------------------------------------------------------------------
>
> Key: HIVE-2732
> URL: https://issues.apache.org/jira/browse/HIVE-2732
> Project: Hive
> Issue Type: Bug
> Reporter: Kevin Wilfong
> Attachments: HIVE-2732.D1809.1.patch
>
>
> set hive.optimize.reducededuplication=true;
> set hive.auto.convert.join=true;
> explain select * from (select * from src distribute by key sort by key) a join src b on a.key = b.key;
> fails with the following exception
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
> If hive.auto.convert.join is set to false, the query instead produces an incorrect plan: the two halves of the join are processed in two separate MapReduce tasks, and the reducers of both tasks contain the join operator, which results in an exception.