Posted to issues@phoenix.apache.org by "Lars Hofhansl (Jira)" <ji...@apache.org> on 2021/03/16 06:00:03 UTC
[jira] [Comment Edited] (PHOENIX-6412) Consider batching uncovered
column merge for local indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302238#comment-17302238 ]
Lars Hofhansl edited comment on PHOENIX-6412 at 3/16/21, 5:59 AM:
------------------------------------------------------------------
Performance-wise, when using FAST_DIFF on the main data CF, I see hardly any improvement, though.
Looks like a RESEEK with FAST_DIFF is hardly any faster than a full SEEK each time. Since the data region is local, there is no RPC overhead. All the time is simply spent in the FAST_DIFF decoder.
I did see an improvement when I switched the block encoding to ROW_INDEX_V1. Overall, though, this does not seem to be worth the effort.
[~kozdemir], FYI. Not what I had expected. But I guess it makes sense.
was (Author: lhofhansl):
Performancewise when using FAST_DIFF on the main data CF, I see hardly any improvement, though.
Looks like RESEEK with FAST_DIFF is hardly any faster than a full SEEK each time. Since the data region is local there is not RPC overhead.
I did see an improvement when I switch the block encoding to ROW_INDEX_V1. Overall, though, this does not seem to be worth the effort.
[~kozdemir], FYI. Not what I had expected. But I guess it makes sense.
> Consider batching uncovered column merge for local indexes
> ----------------------------------------------------------
>
> Key: PHOENIX-6412
> URL: https://issues.apache.org/jira/browse/PHOENIX-6412
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Lars Hofhansl
> Priority: Minor
> Fix For: 5.2.0
>
> Attachments: 6412-hack.txt
>
>
> Currently uncovered columns are merged row-by-row, performing a Get to the data region for each matching row in the index region.
> Each Get needs to seek all the store scanners, and doing this per row is quite expensive.
> Instead we could batch inside the RegionScannerFactory.getWrappedScanner() -> RegionScanner.nextRaw() method. Collect N index rows and then execute a single skip scan on the data region.
> I might be able to get to that, but if there's someone who is interested in taking this up, I would not mind :)
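
For illustration only, here is a minimal sketch of the difference between the row-by-row merge and the batched merge described above. It is not Phoenix or HBase code. The data region is modeled as an in-memory sorted map, the class name BatchedMergeSketch, both method names, and the seeks counter are hypothetical, and a simple forward walk over the sorted map stands in for the skip scan.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class BatchedMergeSketch {
    // Counts how often a scanner is positioned from scratch (stand-in for the expensive seek).
    static int seeks = 0;

    // Row-by-row merge: one point lookup (a "Get") per matching index row.
    static List<String> mergePerRow(List<String> dataKeys, NavigableMap<String, String> dataRegion) {
        List<String> out = new ArrayList<>();
        for (String key : dataKeys) {
            seeks++; // every Get seeks anew
            out.add(dataRegion.get(key));
        }
        return out;
    }

    // Batched merge: collect up to batchSize keys, sort them, and resolve them in one forward pass.
    static List<String> mergeBatched(List<String> dataKeys, NavigableMap<String, String> dataRegion,
                                     int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int start = 0; start < dataKeys.size(); start += batchSize) {
            List<String> batch = dataKeys.subList(start, Math.min(start + batchSize, dataKeys.size()));
            TreeSet<String> sorted = new TreeSet<>(batch); // data-row keys in scan order
            Map<String, String> found = new HashMap<>();

            seeks++; // one scanner is opened per batch
            Iterator<Map.Entry<String, String>> scan =
                dataRegion.tailMap(sorted.first(), true).entrySet().iterator();
            Iterator<String> wanted = sorted.iterator();
            String target = wanted.next();

            while (scan.hasNext() && target != null) {
                Map.Entry<String, String> row = scan.next();
                int cmp = row.getKey().compareTo(target);
                // The scan has moved past the target, so that row does not exist: try the next target.
                while (cmp > 0 && wanted.hasNext()) {
                    target = wanted.next();
                    cmp = row.getKey().compareTo(target);
                }
                if (cmp > 0) {
                    break; // no targets left at or beyond the current row
                }
                if (cmp == 0) {
                    found.put(target, row.getValue());
                    target = wanted.hasNext() ? wanted.next() : null;
                }
            }
            // Emit results in the original index order; a missing data row yields null.
            for (String key : batch) {
                out.add(found.get(key));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> dataRegion = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            dataRegion.put("row" + i, "v" + i);
        }
        List<String> keys = List.of("row7", "row2", "row5", "row0");
        System.out.println(mergePerRow(keys, dataRegion) + " seeks so far: " + seeks);
        System.out.println(mergeBatched(keys, dataRegion, 2) + " seeks so far: " + seeks);
    }
}
```

Both methods return the same values in index order. The only difference is how often the scanner has to be positioned. Whether that saving shows up in practice depends on how cheap moving forward within the data region actually is, which is what the block encoding affects.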
--
This message was sent by Atlassian Jira
(v8.3.4#803005)