You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Zoltan Haindrich (Jira)" <ji...@apache.org> on 2021/04/01 08:13:00 UTC

[jira] [Commented] (HIVE-24957) Wrong results when subquery has COALESCE in correlation predicate

    [ https://issues.apache.org/jira/browse/HIVE-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312977#comment-17312977 ] 

Zoltan Haindrich commented on HIVE-24957:
-----------------------------------------

I was wondering if changing the semijoin compare conditional to {{<=>}} (is not distinct) would help in this case or not - not sure if we can do I saw some null related comments in the decorrelator...

it seems to me that we could avoid this issue by materializing the coalesced value before the correlated subq:
{code}
with 
    b as (select *,coalesce(b_authorkey, 300) as bb from book)
select b.b_title
from b
where exists
          (select a_authorkey
           from author a
           where bb = a.a_authorkey);
{code}

> Wrong results when subquery has COALESCE in correlation predicate
> -----------------------------------------------------------------
>
>                 Key: HIVE-24957
>                 URL: https://issues.apache.org/jira/browse/HIVE-24957
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> Consider the following example:
> {code:sql}
> create table author (
>     a_authorkey   int,
>     a_name varchar(50));
> create table book (
>     b_bookkey   int,
>     b_title varchar(50),
>     b_authorkey int);
> insert into author values (10, 'Victor Hugo');
> insert into author values (20, 'Alexandre Dumas');
> insert into author values (300, 'UNKNOWN');
> insert into book values (1, 'Les Miserables', 10);
> insert into book values (2, 'The Count of Monte Cristo', 20);
> insert into book values (3, 'Men Without Women', 30);
> insert into book values (4, 'Odyssey', null);
> select b.b_title
> from book b
> where exists
>           (select a_authorkey
>            from author a
>            where coalesce(b.b_authorkey, 300) = a.a_authorkey);
> {code}
> *Expected results*
> ||B_TITLE||
> |Les Miserables|
> |The Count of Monte Cristo|
> |Odyssey|
> *Actual results*
> ||B_TITLE||
> |Les Miserables|
> |The Count of Monte Cristo|
> {{Odyssey}} is missing from the result set and it shouldn't since with the application of COALESCE operator it should match with the UNKNOWN author.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)