You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Navis (JIRA)" <ji...@apache.org> on 2012/08/28 04:59:07 UTC
[jira] [Created] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Navis created HIVE-3411:
---------------------------
Summary: Filter predicates on outer join overlapped on single alias is not handled properly
Key: HIVE-3411
URL: https://issues.apache.org/jira/browse/HIVE-3411
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Affects Versions: 0.10.0
Environment: ubuntu 10.10
Reporter: Navis
Assignee: Navis
Priority: Minor
Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
For example with table a with values,
{noformat}
100 40
100 50
100 60
{noformat}
Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
{noformat}
select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
NULL NULL 100 40 NULL NULL
NULL NULL 100 50 NULL NULL
NULL NULL 100 60 NULL NULL
-- Join predicate
Join Operator
condition map:
Right Outer Join0 to 1
Left Outer Join1 to 2
condition expressions:
0 {VALUE._col0} {VALUE._col1}
1 {VALUE._col0} {VALUE._col1}
2 {VALUE._col0} {VALUE._col1}
filter predicates:
0
1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
2
{noformat}
but this should be
{noformat}
NULL NULL 100 40 NULL NULL
100 50 100 50 NULL NULL
NULL NULL 100 60 100 60
{noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463496#comment-13463496 ]
Hudson commented on HIVE-3411:
------------------------------
Integrated in Hive-trunk-h0.21 #1697 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1697/])
HIVE-3411 Filter predicates on outer join overlapped on single alias is not handled properly
(Navis via namit) (Revision 1390010)
Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1390010
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
* /hive/trunk/ql/src/test/queries/clientpositive/join_filters_overlap.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_join29.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join_filters_overlap.q.out
* /hive/trunk/ql/src/test/results/clientpositive/louter_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/outer_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/router_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/union22.q.out
* /hive/trunk/ql/src/test/results/compiler/plan/join1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join7.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join8.q.xml
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448563#comment-13448563 ]
Navis commented on HIVE-3411:
-----------------------------
A boolean tag for filter result is replaced with a byte tag, limiting number of aliases for single join to max 8.
I'm not yet decided whether to extended the limit or to throw exception at compile time.
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Sub-task
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Status: Patch Available (was: Open)
Addressed comments
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-3411:
-----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
Committed. Thanks Navis
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461657#comment-13461657 ]
Phabricator commented on HIVE-3411:
-----------------------------------
njain has accepted the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
REVISION DETAIL
https://reviews.facebook.net/D5169
BRANCH
DPAL-1835
To: JIRA, njain, navis
Cc: njain
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Issue Type: Bug (was: Sub-task)
Parent: (was: HIVE-3381)
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Attachment: HIVE-3411.1.patch.txt
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Sub-task
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Attachment: HIVE-3411.2.patch.txt
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Status: Patch Available (was: Open)
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461675#comment-13461675 ]
Phabricator commented on HIVE-3411:
-----------------------------------
njain has commented on the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
can you refresh ?
REVISION DETAIL
https://reviews.facebook.net/D5169
BRANCH
DPAL-1835
To: JIRA, njain, navis
Cc: njain
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Status: Patch Available (was: Open)
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461652#comment-13461652 ]
Namit Jain commented on HIVE-3411:
----------------------------------
Can you refresh ?
It is not applying cleanly.
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460248#comment-13460248 ]
Namit Jain commented on HIVE-3411:
----------------------------------
Do you want to abandon this and get https://issues.apache.org/jira/browse/HIVE-3381 instead ?
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-3411:
-----------------------------
Status: Open (was: Patch Available)
comments on phabricator
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462427#comment-13462427 ]
Namit Jain commented on HIVE-3411:
----------------------------------
+1
running tests
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-3411:
-----------------------------
Status: Open (was: Patch Available)
comments
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-3411:
------------------------------
Attachment: HIVE-3411.D5169.6.patch
navis updated the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
Reviewers: JIRA, njain
Rebased to trunk
REVISION DETAIL
https://reviews.facebook.net/D5169
AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
ql/src/test/queries/clientpositive/join_filters_overlap.q
ql/src/test/results/clientpositive/auto_join29.q.out
ql/src/test/results/clientpositive/join_filters_overlap.q.out
To: JIRA, njain, navis
Cc: njain
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460301#comment-13460301 ]
Navis commented on HIVE-3411:
-----------------------------
HIVE-3381 depends on this (and contains this) and it seemed better to be processed separately.
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-3411:
------------------------------
Attachment: HIVE-3411.D5169.5.patch
navis updated the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
Reviewers: JIRA
Fixed comments
REVISION DETAIL
https://reviews.facebook.net/D5169
AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
ql/src/test/queries/clientpositive/join_filters_overlap.q
ql/src/test/results/clientpositive/auto_join29.q.out
ql/src/test/results/clientpositive/join_filters_overlap.q.out
To: JIRA, navis
Cc: njain
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Navis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-3411:
------------------------
Status: Patch Available (was: Open)
https://reviews.facebook.net/D5169
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Sub-task
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Phabricator (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461629#comment-13461629 ]
Phabricator commented on HIVE-3411:
-----------------------------------
navis has commented on the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java:56 I mean cadinality of '*'. I'll change it.
REVISION DETAIL
https://reviews.facebook.net/D5169
To: JIRA, navis
Cc: njain
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461650#comment-13461650 ]
Namit Jain commented on HIVE-3411:
----------------------------------
+1
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3411) Filter predicates on outer join
overlapped on single alias is not handled properly
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-3411:
-----------------------------
Status: Open (was: Patch Available)
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0
> Environment: ubuntu 10.10
> Reporter: Navis
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result.
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL NULL 100 40 NULL NULL
> NULL NULL 100 50 NULL NULL
> NULL NULL 100 60 NULL NULL
> -- Join predicate
> Join Operator
> condition map:
> Right Outer Join0 to 1
> Left Outer Join1 to 2
> condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
> filter predicates:
> 0
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2
> {noformat}
> but this should be
> {noformat}
> NULL NULL 100 40 NULL NULL
> 100 50 100 50 NULL NULL
> NULL NULL 100 60 100 60
> {noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira