You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/06/07 20:05:00 UTC

[jira] [Commented] (IMPALA-12051) Propagate analytic tuple predicates of outer-joined InlineView

    [ https://issues.apache.org/jira/browse/IMPALA-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730265#comment-17730265 ] 

ASF subversion and git services commented on IMPALA-12051:
----------------------------------------------------------

Commit 88275f327fb2b52db13d1ffa1db8fafd08875941 in impala's branch refs/heads/master from Minghui Zhu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=88275f327 ]

IMPALA-12051: Propagate analytic tuple predicates of outer-joined InlineView

In some cases, direct pushing down predicates that reference analytic
tuple into inline view leads to incorrect query results. While pushing
down analytic predicates (e.g. row_number() < 10), we should also divide
them into two groups. Some of them can be migrated into the view so are
removed in the current scope. Some of them can be copied into the view
but still need to be evaluated in the current scope as demonstrated with
the following query. The bug is due to we migrate all of them into the
view.

  WITH detail_measure AS (
    SELECT
      *
    FROM
      (
        VALUES
          (
            1 AS `isqbiuar`,
            1 AS `bgsfrbun`,
            1 AS `result_type`,
            1 AS `bjuzzevg`
          ),
          (2, 2, 2, 2)
      ) a
  ),
  order_measure_sql0 AS (
    SELECT
      row_number() OVER (
        ORDER BY
          row_number_0 DESC NULLS LAST,
          isqbiuar ASC NULLS LAST
      ) AS `row_number_0`,
      `isqbiuar`
    FROM
      (
        VALUES
          (1 AS `row_number_0`, 1 AS `isqbiuar`),
          (2, 2)
      ) b
  )
  SELECT
    detail_measure.`isqbiuar` AS `isqbiuar`,
    detail_measure.`bgsfrbun` AS `bgsfrbun`,
    detail_measure.`result_type` AS `result_type`,
    detail_measure.`bjuzzevg` AS `bjuzzevg`,
    `row_number_0` AS `row_number_0`
  FROM
    detail_measure
    LEFT JOIN order_measure_sql0
    ON order_measure_sql0.isqbiuar = detail_measure.isqbiuar
  WHERE
    row_number_0 BETWEEN 1
    AND 1
  ORDER BY
    `row_number_0` ASC NULLS LAST,
    `bgsfrbun` ASC NULLS LAST

The current incorrect result is:
+----------+----------+-------------+----------+--------------+
| isqbiuar | bgsfrbun | result_type | bjuzzevg | row_number_0 |
+----------+----------+-------------+----------+--------------+
| 2        | 2        | 2           | 2        | 1            |
| 1        | 1        | 1           | 1        | NULL         |
+----------+----------+-------------+----------+--------------+

The correct result is:
+----------+----------+-------------+----------+--------------+
| isqbiuar | bgsfrbun | result_type | bjuzzevg | row_number_0 |
+----------+----------+-------------+----------+--------------+
| 2        | 2        | 2           | 2        | 1            |
+----------+----------+-------------+----------+--------------+

In the plan, the analysis predicate is pushed down to the TOP-N node,
but not in the HASH JOIN node, which leads to incorrect results.

  ...

  05:HASH JOIN [RIGHT OUTER JOIN]
  |  hash predicates: isqbiuar = isqbiuar
  |  row-size=14B cardinality=2

  ...

  02:TOP-N [LIMIT=1]
  |  order by: row_number_0 DESC NULLS LAST, isqbiuar ASC NULLS LAST
  |  source expr: row_number() <= CAST(1 AS BIGINT)
  |  row-size=2B cardinality=1

  ...

The HASH JOIN node shoud be:

  05:HASH JOIN [RIGHT OUTER JOIN]
  |  hash predicates: isqbiuar = isqbiuar
  |  other predicates: row_number() <= 1, row_number() >= 1
  |  row-size=14B cardinality=2

Tests:
* Add plan tests in analytic-rank-pushdown.test
* Add e2e tests in analytic-fns.test

Change-Id: If6c209b2a64bad37d893ba8b520342bf1f9a7513
Reviewed-on: http://gerrit.cloudera.org:8080/19768
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Propagate analytic tuple predicates of outer-joined InlineView
> --------------------------------------------------------------
>
>                 Key: IMPALA-12051
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12051
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: ZhuMinghui
>            Assignee: ZhuMinghui
>            Priority: Major
>         Attachments: image-2023-04-07-11-57-13-571.png, image-2023-04-07-11-57-59-883.png
>
>
> In some cases, direct pushing down predicates that reference analytic tuple into inline view leads to incorrect query results. such as sql:
> {code:java}
> WITH detail_measure AS (
>   SELECT
>     *
>   FROM
>     (
>       VALUES
>         (
>           1 AS `isqbiuar`,
>           1 AS `bgsfrbun`,
>           1 AS `result_type`,
>           1 AS `bjuzzevg`
>         ),
>         (2, 2, 2, 2)
>     ) a
> ),
> order_measure_sql0 AS (
>   SELECT
>     row_number() OVER (
>       ORDER BY
>         row_number_0 DESC NULLS LAST,
>         isqbiuar ASC NULLS LAST
>     ) AS `row_number_0`,
>     `isqbiuar`
>   FROM
>     (
>       VALUES
>         (1 AS `row_number_0`, 1 AS `isqbiuar`),
>         (2, 2)
>     ) b
> )
> SELECT
>   detail_measure.`isqbiuar` AS `isqbiuar`,
>   detail_measure.`bgsfrbun` AS `bgsfrbun`,
>   detail_measure.`result_type` AS `result_type`,
>   detail_measure.`bjuzzevg` AS `bjuzzevg`,
>   `row_number_0` AS `row_number_0`
> FROM
>   detail_measure
>   LEFT JOIN order_measure_sql0 ON order_measure_sql0.isqbiuar = detail_measure.isqbiuar
> WHERE
>   row_number_0 BETWEEN 1
>   AND 1
> ORDER BY
>   `row_number_0` ASC NULLS LAST,
>   `bgsfrbun` ASC NULLS LAST{code}
> The current query result is:
> !image-2023-04-07-11-57-13-571.png!
> The correct query result is:
> !image-2023-04-07-11-57-59-883.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org