Posted to issues@phoenix.apache.org by "chenglei (Jira)" <ji...@apache.org> on 2022/10/01 06:33:00 UTC

[jira] [Comment Edited] (PHOENIX-6752) Duplicate expression nodes in extract nodes during WHERE compilation phase leads to poor performance.

    [ https://issues.apache.org/jira/browse/PHOENIX-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611832#comment-17611832 ] 

chenglei edited comment on PHOENIX-6752 at 10/1/22 6:32 AM:
------------------------------------------------------------

[~jisaac], thank you very much.
Is the problem this Jira aims to solve that {{KeyExpressionVisitor.andKeySlots}} adds many repeated {{keyPart.extractNodes}} to {{extractNodes}}, so that {{extractNodes}} explodes in size?
Yes, changing {{extractNodes}} from a {{List}} to a {{Set}} could prevent {{extractNodes}} from exploding, but for the case you described, with a large number of OR clauses, how would the PR reduce the CPU time consumed by {{SlotsIterator}} in enumerating all {{KeyRange}} combinations?
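
To make the second question concrete, here is a minimal sketch in plain Java. It is not Phoenix's {{SlotsIterator}}; the class name, method names, and the use of strings in place of {{KeyRange}} objects are all assumptions for illustration. It only shows that the number of combinations to enumerate is the product of the per-slot range counts, which de-duplicating {{extractNodes}} does not change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not Phoenix code: each inner list models the
// ranges of one row-key slot, with strings in place of KeyRange objects.
public class KeyRangeCombinations {

    // The number of combinations is the product of the per-slot sizes.
    static long countCombinations(List<List<String>> slots) {
        long total = 1;
        for (List<String> slot : slots) {
            total = Math.multiplyExact(total, (long) slot.size());
        }
        return total;
    }

    // Naive enumeration of every combination (one range picked per slot).
    static List<List<String>> enumerate(List<List<String>> slots) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> slot : slots) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String range : slot) {
                    List<String> combo = new ArrayList<>(prefix);
                    combo.add(range);
                    next.add(combo);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> slots = Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1", "b2"),
                Arrays.asList("c1", "c2", "c3", "c4"));
        System.out.println(countCombinations(slots));   // 3 * 2 * 4 = 24
        System.out.println(enumerate(slots).size());    // 24
    }
}
```

Under these assumptions the enumeration cost grows with the number of ranges per slot, whether or not the extracted nodes are kept in a {{Set}}.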


was (Author: comnetwork):
[~jisaac], thank you very much.
The problem of this jira to solve is  {{KeyExpressionVisitor.andKeySlots}} adds many repeated {{keyPart.extractNodes}} to {{extractNodes}} and {{extractNodes}} is exploded ?
Yes, make the {{extractNodes}} from {{List}} to {{Set}} could prevent the {{extractNodes}} exploded, but for the case you described there are large number of OR clauses, how the PR could prevent the cpu time consumed by {{SlotsIterator}} to enumerate all {{KeyRange}} combination?

> Duplicate expression nodes in extract nodes during WHERE compilation phase leads to poor performance.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6752
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6752
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.15.0, 5.1.0, 4.16.1, 5.2.0
>            Reporter: Jacob Isaac
>            Assignee: Jacob Isaac
>            Priority: Critical
>             Fix For: 5.2.0
>
>         Attachments: test-case.txt
>
>
> SQL queries using the OR operator were taking a long time during the WHERE clause compilation phase when a large number of OR clauses (~50k) were used.
> The key observation was that during AND/OR processing, when there is a large number of OR expression nodes, the same set of extracted nodes was being added repeatedly,
> thus bloating the set size and slowing down the processing.
> [code|https://github.com/apache/phoenix/blob/0c2008ddf32566c525df26cb94d60be32acc10da/phoenix-core/src/main/java/org/apache/phoenix/compile/WhereOptimizer.java#L930]
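
The bloating described above can be illustrated with a small, self-contained Java sketch. It does not use Phoenix classes: the class name, the method, and the string node labels are hypothetical, and it assumes the nodes have value-based {{equals}}/{{hashCode}} (true for strings, not verified here for Phoenix expression nodes). It compares repeatedly adding the same nodes to a {{List}} with adding them to a {{Set}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration, not Phoenix code: strings stand in for
// expression nodes, and the loop stands in for repeated AND/OR processing.
public class ExtractNodesDedup {

    // Adds the same nodes 'repeats' times and returns the resulting size.
    static int sizeAfterRepeatedAdds(Collection<String> extractNodes,
                                     List<String> keyPartNodes,
                                     int repeats) {
        for (int i = 0; i < repeats; i++) {
            extractNodes.addAll(keyPartNodes);
        }
        return extractNodes.size();
    }

    public static void main(String[] args) {
        List<String> keyPartNodes = Arrays.asList("n1", "n2", "n3");
        // A List keeps every duplicate: 3 nodes * 1000 repeats = 3000 entries.
        System.out.println(
                sizeAfterRepeatedAdds(new ArrayList<>(), keyPartNodes, 1000));
        // A Set keeps one copy of each node: 3 entries.
        System.out.println(
                sizeAfterRepeatedAdds(new LinkedHashSet<>(), keyPartNodes, 1000));
    }
}
```

A {{LinkedHashSet}} is used in the sketch only to keep insertion order stable; whether ordering matters for the real {{extractNodes}} is not stated in this thread.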


