You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Navis (JIRA)" <ji...@apache.org> on 2012/08/28 04:59:07 UTC

[jira] [Created] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Navis created HIVE-3411:
---------------------------

             Summary: Filter predicates on outer join overlapped on single alias is not handled properly
                 Key: HIVE-3411
                 URL: https://issues.apache.org/jira/browse/HIVE-3411
             Project: Hive
          Issue Type: Sub-task
          Components: Query Processor
    Affects Versions: 0.10.0
         Environment: ubuntu 10.10
            Reporter: Navis
            Assignee: Navis
            Priority: Minor


Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 

For example with table a with values,
{noformat}
100 40
100 50
100 60
{noformat}

Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
{noformat}
select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);

NULL	NULL	100	40	NULL	NULL
NULL	NULL	100	50	NULL	NULL
NULL	NULL	100	60	NULL	NULL

-- Join predicate
Join Operator
  condition map:
       Right Outer Join0 to 1
       Left Outer Join1 to 2
  condition expressions:
    0 {VALUE._col0} {VALUE._col1}
    1 {VALUE._col0} {VALUE._col1}
    2 {VALUE._col0} {VALUE._col1}
  filter predicates:
    0 
    1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
    2 
{noformat}

but this should be 

{noformat}
NULL	NULL	100	40	NULL	NULL
100	50	100	50	NULL	NULL
NULL	NULL	100	60	100	60
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463496#comment-13463496 ] 

Hudson commented on HIVE-3411:
------------------------------

Integrated in Hive-trunk-h0.21 #1697 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1697/])
    HIVE-3411 Filter predicates on outer join overlapped on single alias is not handled properly
(Navis via namit) (Revision 1390010)

     Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1390010
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
* /hive/trunk/ql/src/test/queries/clientpositive/join_filters_overlap.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_join29.q.out
* /hive/trunk/ql/src/test/results/clientpositive/join_filters_overlap.q.out
* /hive/trunk/ql/src/test/results/clientpositive/louter_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/outer_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/router_join_ppr.q.out
* /hive/trunk/ql/src/test/results/clientpositive/union22.q.out
* /hive/trunk/ql/src/test/results/compiler/plan/join1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join7.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join8.q.xml

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448563#comment-13448563 ] 

Navis commented on HIVE-3411:
-----------------------------

A boolean tag for filter result is replaced with a byte tag, limiting number of aliases for single join to max 8. 
I'm not yet decided whether to extended the limit or to throw exception at compile time.
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Status: Patch Available  (was: Open)

Addressed comments
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3411:
-----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed. Thanks Navis
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Phabricator (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461657#comment-13461657 ] 

Phabricator commented on HIVE-3411:
-----------------------------------

njain has accepted the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".

REVISION DETAIL
  https://reviews.facebook.net/D5169

BRANCH
  DPAL-1835

To: JIRA, njain, navis
Cc: njain

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: HIVE-3381)
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Attachment: HIVE-3411.1.patch.txt
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Attachment: HIVE-3411.2.patch.txt
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Status: Patch Available  (was: Open)
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Phabricator (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461675#comment-13461675 ] 

Phabricator commented on HIVE-3411:
-----------------------------------

njain has commented on the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".

  can you refresh ?

REVISION DETAIL
  https://reviews.facebook.net/D5169

BRANCH
  DPAL-1835

To: JIRA, njain, navis
Cc: njain

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Status: Patch Available  (was: Open)
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461652#comment-13461652 ] 

Namit Jain commented on HIVE-3411:
----------------------------------

Can you refresh ?
It is not applying cleanly.
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460248#comment-13460248 ] 

Namit Jain commented on HIVE-3411:
----------------------------------

Do you want to abandon this and get https://issues.apache.org/jira/browse/HIVE-3381 instead ?

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3411:
-----------------------------

    Status: Open  (was: Patch Available)

comments on phabricator
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462427#comment-13462427 ] 

Namit Jain commented on HIVE-3411:
----------------------------------

+1
running tests
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3411:
-----------------------------

    Status: Open  (was: Patch Available)

comments
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Phabricator (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3411:
------------------------------

    Attachment: HIVE-3411.D5169.6.patch

navis updated the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
Reviewers: JIRA, njain

  Rebased to trunk


REVISION DETAIL
  https://reviews.facebook.net/D5169

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
  ql/src/test/queries/clientpositive/join_filters_overlap.q
  ql/src/test/results/clientpositive/auto_join29.q.out
  ql/src/test/results/clientpositive/join_filters_overlap.q.out

To: JIRA, njain, navis
Cc: njain

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch, HIVE-3411.D5169.6.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460301#comment-13460301 ] 

Navis commented on HIVE-3411:
-----------------------------

HIVE-3381 depends on this (and contains this) and it seemed better to be processed separately. 
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Phabricator (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3411:
------------------------------

    Attachment: HIVE-3411.D5169.5.patch

navis updated the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".
Reviewers: JIRA

  Fixed comments


REVISION DETAIL
  https://reviews.facebook.net/D5169

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
  ql/src/test/queries/clientpositive/join_filters_overlap.q
  ql/src/test/results/clientpositive/auto_join29.q.out
  ql/src/test/results/clientpositive/join_filters_overlap.q.out

To: JIRA, navis
Cc: njain

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Navis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3411:
------------------------

    Status: Patch Available  (was: Open)

https://reviews.facebook.net/D5169
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Phabricator (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461629#comment-13461629 ] 

Phabricator commented on HIVE-3411:
-----------------------------------

navis has commented on the revision "HIVE-3411 [jira] Filter predicates on outer join overlapped on single alias is not handled properly".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java:56 I mean cadinality of '*'. I'll change it.

REVISION DETAIL
  https://reviews.facebook.net/D5169

To: JIRA, navis
Cc: njain

                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461650#comment-13461650 ] 

Namit Jain commented on HIVE-3411:
----------------------------------

+1
                
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3411:
-----------------------------

    Status: Open  (was: Patch Available)
    
> Filter predicates on outer join overlapped on single alias is not handled properly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-3411
>                 URL: https://issues.apache.org/jira/browse/HIVE-3411
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>         Environment: ubuntu 10.10
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3411.1.patch.txt, HIVE-3411.2.patch.txt, HIVE-3411.D5169.5.patch
>
>
> Currently, join predicates on outer join are evaluated in join operator (or HashSink for MapJoin) and the result value is tagged to end of each values(as a boolean), which is used for joining values. But when predicates are overlapped on single alias, all the predicates are evaluated with AND conjunction, which makes invalid result. 
> For example with table a with values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> Query below has overlapped predicates on alias b, which is making all the values on b are tagged with true(filtered)
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND c.value=60);
> NULL	NULL	100	40	NULL	NULL
> NULL	NULL	100	50	NULL	NULL
> NULL	NULL	100	60	NULL	NULL
> -- Join predicate
> Join Operator
>   condition map:
>        Right Outer Join0 to 1
>        Left Outer Join1 to 2
>   condition expressions:
>     0 {VALUE._col0} {VALUE._col1}
>     1 {VALUE._col0} {VALUE._col1}
>     2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
>     0 
>     1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
>     2 
> {noformat}
> but this should be 
> {noformat}
> NULL	NULL	100	40	NULL	NULL
> 100	50	100	50	NULL	NULL
> NULL	NULL	100	60	100	60
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira