Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/01/13 01:09:01 UTC

[jira] [Commented] (IMPALA-9725) LEFT ANTI JOIN produces wrong result when PHJ build spills

    [ https://issues.apache.org/jira/browse/IMPALA-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263808#comment-17263808 ] 

ASF subversion and git services commented on IMPALA-9725:
---------------------------------------------------------

Commit b71187dbf90b2bc7ea5c2aeac255c0a4a8bd0e9e in impala's branch refs/heads/3.x from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b71187d ]

IMPALA-9725: incorrect spilling join results for wide keys

The control flow was broken if the join operator hit
the end of the expression values cache before the end
of the probe batch, immediately after processing a row
for a spilled partition. In NextProbeRow(), the problematic
code path was:
* The last row in the expression values cache was for a
  spilled partition, so skip_row=true and it falls out
  of the loop with 'current_probe_row_' pointing to that
  row.
* probe_batch_iterator->AtEnd() is false, because
  the expression values cache is smaller than the probe batch,
  so 'current_probe_row_' is not nulled out.

Thus we end up in a state where 'current_probe_row_' is
set, but 'hash_table_iterator_' is unset.

In the case of a left anti join, this was interpreted by
ProcessProbeRowLeftSemiJoins() as meaning that there was
no hash table match for 'current_probe_row_', and it
therefore returned the row.

This bug could only occur under specific circumstances:
* The join key takes up > 256 bytes in the expression values
  cache (assuming the default batch size of 1024).
* The join spilled.
* The join operator returns rows that were unmatched in
  the right input, i.e. LEFT OUTER JOIN, LEFT ANTI JOIN,
  FULL OUTER JOIN.
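
As a rough illustration of where the 256-byte threshold comes from, the
arithmetic can be sketched as below. This assumes the expression values cache
has a fixed byte budget of 1024 * 256 bytes, which is inferred from the two
numbers in the list above and not taken from the Impala source. The names
CACHE_BYTES, rows_cached() and cache_ends_before_batch() are hypothetical.

```python
# Toy arithmetic only, not Impala code.
DEFAULT_BATCH_SIZE = 1024
# Assumed cache budget, inferred from "256 bytes per key at batch size 1024".
CACHE_BYTES = DEFAULT_BATCH_SIZE * 256


def rows_cached(key_bytes: int, cache_bytes: int = CACHE_BYTES) -> int:
    """Number of rows whose evaluated join keys fit in the cache at once."""
    return cache_bytes // key_bytes


def cache_ends_before_batch(key_bytes: int,
                            batch_size: int = DEFAULT_BATCH_SIZE) -> bool:
    """True when the cache is exhausted before the probe batch is."""
    return rows_cached(key_bytes) < batch_size


if __name__ == "__main__":
    for width in (128, 256, 257, 512):
        print(width, rows_cached(width), cache_ends_before_batch(width))
```

Under that assumption, a key of exactly 256 bytes still lets the whole batch
fit in the cache, while anything wider forces the operator to work through
the batch in several cache-sized pieces.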

The core of the fix is to null out 'current_probe_row_' when
falling out of the bottom of the loop in NextProbeRow(). Related
DCHECKs were fixed and some control flow was slightly
simplified.
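
The failure and the fix described above can be sketched with a small toy
model. This is not the Impala implementation: ToyProbe, Row, row_done,
iterator_match and deferred are hypothetical names, and the model only
imitates the control flow described in this commit message (a cache window
smaller than the probe batch, a skipped row for a spilled partition, and the
decision whether to null out the current probe row on loop exit).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Row:
    row_id: int
    spilled: bool    # row belongs to a spilled partition
    has_match: bool  # build side contains a matching key


@dataclass
class ToyProbe:
    batch: List[Row]
    cache_rows: int         # rows that fit in the toy expression values cache
    null_out_on_exit: bool  # True models the fix, False the old behaviour
    pos: int = 0
    current_probe_row: Optional[Row] = None
    iterator_match: bool = False  # stands in for 'hash_table_iterator_'
    row_done: bool = False        # hypothetical: current row already handled
    deferred: List[int] = field(default_factory=list)

    def next_probe_row(self, cache_end: int) -> bool:
        """Advance to the next row that can be probed in this cache window."""
        while self.pos < cache_end:
            row = self.batch[self.pos]
            self.pos += 1
            self.current_probe_row = row
            self.row_done = False
            if row.spilled:
                # Set aside for later; the iterator is not touched (skip_row).
                self.deferred.append(row.row_id)
                continue
            self.iterator_match = row.has_match
            return True
        # Fell out of the bottom of the loop: the cache window is exhausted.
        if self.null_out_on_exit or self.pos >= len(self.batch):
            self.current_probe_row = None
        return False

    def left_anti_join(self) -> List[int]:
        """Return ids of rows emitted while probing in-memory partitions."""
        out: List[int] = []
        while self.pos < len(self.batch):
            cache_end = min(self.pos + self.cache_rows, len(self.batch))
            while True:
                row = self.current_probe_row
                if row is not None and not self.row_done:
                    if not self.iterator_match:
                        out.append(row.row_id)  # treated as "no match"
                    self.iterator_match = False  # iterator exhausted after use
                    self.row_done = True
                if not self.next_probe_row(cache_end):
                    break
        return out


if __name__ == "__main__":
    # Every row has a match, so a correct anti join emits nothing.
    rows = [Row(0, False, True), Row(1, True, True),
            Row(2, False, True), Row(3, False, True)]
    print(ToyProbe(rows, 2, null_out_on_exit=False).left_anti_join())
    print(ToyProbe(rows, 2, null_out_on_exit=True).left_anti_join())
```

In this toy, row 1 is the last row of the first cache window and belongs to a
spilled partition. Without nulling out the current row, it is both deferred
and emitted as unmatched; with the fix, or with a cache window that covers
the whole batch, nothing is emitted.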

Testing:
Added a test query on TPC-H that reproduces the problem reliably.

Ran exhaustive tests.

Change-Id: I9d7e5871c35a90e8cf24b8dded04775ee1eae9d8
Reviewed-on: http://gerrit.cloudera.org:8080/15904
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
(cherry picked from commit fcf08d18228dd07c2371a89e3b0788791cb03dfa)


> LEFT ANTI JOIN produces wrong result when PHJ build spills
> ----------------------------------------------------------
>
>                 Key: IMPALA-9725
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9725
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>            Reporter: Xiaomin Zhang
>            Assignee: Tim Armstrong
>            Priority: Blocker
>              Labels: correctness
>             Fix For: Impala 4.0
>
>         Attachments: wide.csv.gz
>
>
> Using the attached data set, the query below produced a non-zero result when a small mem_limit was set. The expected result is 0 because the query simply anti-joins the table with itself.
> set mem_limit=100m;
> select id from wide_test.wide L1 where not exists (
>  select 1 from wide_test.wide L2 where L1.id = L2.id
>  and L1.col2 = L2.col2
>  and L1.col3 = L2.col3
>  and L1.col4 = L2.col4
>  and L1.col5 = L2.col5
>  and L1.col6 = L2.col6
>  and L1.col7 = L2.col7
>  and L1.col8 = L2.col8
>  and L1.col9 = L2.col9
>  and L1.col10 = L2.col10
>  and L1.col11 = L2.col11
>  and L1.col12 = L2.col12
>  and L1.col13 = L2.col13
>  and L1.col14 = L2.col14
>  and L1.col15 = L2.col15
>  and L1.col16 = L2.col16
>  and L1.col17 = L2.col17
>  and L1.col18 = L2.col18
>  and L1.col19 = L2.col19
>  and L1.col20 = L2.col20
>  and L1.col21 = L2.col21
>  and L1.col22 = L2.col22
>  and L1.col23 = L2.col23
> ) order by id;
>  
> With a larger mem_limit (or with mem_limit unset), the query above returns 0, which is correct.
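
The expectation in the report can be sketched in a few lines. This is only an
illustration of NOT EXISTS semantics on in-memory tuples, not of how Impala
executes the query; not_exists_self_join() is a hypothetical helper, and the
sketch assumes the compared columns in the attached data contain no NULLs
(a row with a NULL in a compared column would not match itself under SQL
equality).

```python
from typing import Iterable, List, Optional, Tuple

Key = Tuple[Optional[int], ...]


def not_exists_self_join(rows: Iterable[Key]) -> List[Key]:
    """Rows with no equal row in the same table, using SQL '=' semantics."""
    rows = list(rows)
    # SQL '=' is never true when either side is NULL, so a key that contains
    # None can never be matched by any row, including itself.
    matchable = {r for r in rows if None not in r}
    return [r for r in rows if r not in matchable]


if __name__ == "__main__":
    print(not_exists_self_join([(1, 2), (3, 4)]))
    print(not_exists_self_join([(1, 2), (3, None)]))
```

With NULL-free keys every row matches at least itself, so the anti join is
empty regardless of memory limits or spilling.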


