Posted to issues@phoenix.apache.org by "Lars Hofhansl (Jira)" <ji...@apache.org> on 2022/03/07 18:00:00 UTC

[jira] [Comment Edited] (PHOENIX-6458) Using global indexes for queries with uncovered columns

    [ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502468#comment-17502468 ] 

Lars Hofhansl edited comment on PHOENIX-6458 at 3/7/22, 5:59 PM:
-----------------------------------------------------------------

Trying now. There's something pretty terrible going on:

First time I run the query above it returns the right value after 32s. Second time, there seems to be some memory leak. Getting lots of "responseTooSlow" and GC pauses in the logs.

The query essentially never finishes; the only way out is to kill the region server.


was (Author: lhofhansl):
Trying now. There's something pretty terrible going on:

First time I run the query above it returns the right value.

Second time, there seems to be some memory leak. Getting lots of "responseTooSlow" and GC pauses in the logs.

 

> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
>                 Key: PHOENIX-6458
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6458
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.1.0
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.17.0, 5.2.0, 5.1.3
>
>         Attachments: PHOENIX-6458.master.001.patch, PHOENIX-6458.master.002.patch, PHOENIX-6458.master.addendum.patch
>
>
> The Phoenix query optimizer does not use a global index for a query that references columns not covered by that index unless the query carries an index hint for it. With the index hint, the optimizer rewrites the query so that the index is used within a subquery. The Phoenix client retrieves the row keys of the index rows that satisfy this subquery and pushes them into the Phoenix server caches of the data table regions. Finally, on the server side, data table rows are scanned and joined with the index rows using HashJoin. Depending on the selectivity of the original query, this join operation may still scan a large number of data table rows.
> Eliminating these data table scans would be a significant improvement. To do that, instead of rewriting the query, the Phoenix optimizer simply treats the global index as a covered index for the given query. With this, the Phoenix query optimizer chooses the index table for the query, especially when the index row key prefix length is greater than the data row key prefix length for the query. On the server side, the index table is scanned using the index row key ranges implied by the query, and the index row keys are then mapped to the data table row keys (note that an index row key includes all the data row key columns). Finally, the corresponding data table rows are scanned using server-to-server RPCs. PHOENIX-6458 (this Jira) retrieves the data table rows one by one using the HBase get operation. PHOENIX-6501 replaces this get operation with the scan operation to reduce the number of server-to-server RPC calls.
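
The read path described in the issue text above can be sketched as a toy model. This is only an illustration in plain Python, not Phoenix or HBase code: the tables, the `city` index, and the names `scan_index`, `read_uncovered`, and `batch_size` are all hypothetical. The idea of grouping lookups into batches is an assumption used here to show why fewer calls are needed; it is not a description of how PHOENIX-6501 is actually implemented.

```python
from bisect import bisect_left, bisect_right

# Toy data table (hypothetical schema): data row key -> full row.
DATA_TABLE = {
    "k1": {"city": "Oslo", "name": "Ann", "age": 31},
    "k2": {"city": "Bern", "name": "Bob", "age": 45},
    "k3": {"city": "Oslo", "name": "Cy", "age": 27},
    "k4": {"city": "Rome", "name": "Dee", "age": 52},
}

# Toy global index on `city`. Each index row key is (indexed value, data row key),
# so the index row key contains the data row key. Sorting keeps all entries for
# one indexed value in a contiguous key range.
INDEX_TABLE = sorted((row["city"], key) for key, row in DATA_TABLE.items())


def scan_index(city):
    """Return the index row keys whose leading column equals `city`."""
    lo = bisect_left(INDEX_TABLE, (city, ""))
    hi = bisect_right(INDEX_TABLE, (city, "\uffff"))
    return INDEX_TABLE[lo:hi]


def read_uncovered(city, batch_size=1):
    """Scan the index, map index keys to data keys, then fetch the data rows.

    batch_size=1 models one lookup per data row. A larger value models
    grouping several lookups into a single call (an assumption of this sketch).
    Returns (rows, number_of_simulated_calls).
    """
    data_keys = [index_key[1] for index_key in scan_index(city)]
    rows, calls = [], 0
    for start in range(0, len(data_keys), batch_size):
        batch = data_keys[start:start + batch_size]
        calls += 1  # one simulated server-to-server call per batch
        rows.extend(DATA_TABLE[key] for key in batch)
    return rows, calls
```

In this model, `read_uncovered("Oslo")` returns the uncovered columns (`name`, `age`) for both matching rows using two simulated calls, while `read_uncovered("Oslo", batch_size=2)` returns the same rows with one.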


