Posted to dev@phoenix.apache.org by "Viraj Jasani (Jira)" <ji...@apache.org> on 2021/03/12 07:14:00 UTC
[jira] [Assigned] (PHOENIX-6385) Not to use Scan#setSmall for HBase 2.x versions
[ https://issues.apache.org/jira/browse/PHOENIX-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani reassigned PHOENIX-6385:
-------------------------------------
Assignee: Viraj Jasani (was: Anoop Sam John)
> Not to use Scan#setSmall for HBase 2.x versions
> -----------------------------------------------
>
> Key: PHOENIX-6385
> URL: https://issues.apache.org/jira/browse/PHOENIX-6385
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Anoop Sam John
> Assignee: Viraj Jasani
> Priority: Major
>
> {code:java}
> /**
> * Set whether this scan is a small scan
> * <p>
> * Small scan should use pread and big scan can use seek + read seek + read is fast but can cause
> * two problem (1) resource contention (2) cause too much network io [89-fb] Using pread for
> * non-compaction read request https://issues.apache.org/jira/browse/HBASE-7266 On the other hand,
> * if setting it true, we would do openScanner,next,closeScanner in one RPC call. It means the
> * better performance for small scan. [HBASE-9488]. Generally, if the scan range is within one
> * data block(64KB), it could be considered as a small scan.
> * @param small
> * @deprecated since 2.0.0 and will be removed in 3.0.0. Use {@link #setLimit(int)} and
> * {@link #setReadType(ReadType)} instead. And for the one rpc optimization, now we will also
> * fetch data when openScanner, and if the number of rows reaches the limit then we will close
> * the scanner automatically which means we will fall back to one rpc.
> * @see #setLimit(int)
> * @see #setReadType(ReadType)
> * @see <a href="https://issues.apache.org/jira/browse/HBASE-17045">HBASE-17045</a>
> */
> @Deprecated
> public Scan setSmall(boolean small)
> {code}
> In 1.x there is a separate ClientSmallScanner implementation that performs the scan in a single RPC. In 2.x a single RPC is used by default when the scan returns only a small number of rows. We might have to call setLimit.
> Also, in 2.x the setSmall() API sets the readType to PRead. That is fine for small reads, where only one HFile block gets read. But what I noticed in Phoenix is that even a count(*) query results in the Scan being marked as a small scan and ends up using PRead (even though such a query results in a full scan).
> So we should not set the ReadType at all. Instead, rely on HBase to switch the scan type intelligently (start with PRead and switch to stream read if it scans more data).
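A rough sketch of the proposed direction, for illustration only. It does not use the HBase client: ScanModel and ReadType are stand-ins that model only the behaviour described above (setSmall forcing PRead in 2.x, a limit of -1 meaning no row limit), and the helper names configureWithSetSmall and configureForHBase2 are hypothetical, not Phoenix or HBase APIs.

```java
import java.util.OptionalInt;

// Self-contained model; ScanModel and ReadType are stand-ins, not HBase classes.
class ScanConfigSketch {

    /** Same constant names as the read types discussed in the issue. */
    enum ReadType { DEFAULT, STREAM, PREAD }

    /** Stand-in for the few Scan settings this issue is about. */
    static final class ScanModel {
        boolean small = false;
        ReadType readType = ReadType.DEFAULT;
        int limit = -1; // negative means "no row limit" in this model

        /** Models the 2.x behaviour described above: marking a scan small also forces PREAD. */
        ScanModel setSmall(boolean small) {
            this.small = small;
            this.readType = ReadType.PREAD;
            return this;
        }

        ScanModel setLimit(int limit) {
            this.limit = limit;
            return this;
        }
    }

    /** What the issue argues against on 2.x: the scan is pinned to PREAD regardless of its size. */
    static ScanModel configureWithSetSmall() {
        return new ScanModel().setSmall(true);
    }

    /**
     * What the issue proposes for 2.x: never touch the read type, and set a limit
     * only when the caller actually knows how many rows it wants.
     */
    static ScanModel configureForHBase2(OptionalInt knownRowLimit) {
        ScanModel scan = new ScanModel();
        if (knownRowLimit.isPresent()) {
            scan.setLimit(knownRowLimit.getAsInt());
        }
        return scan;
    }

    public static void main(String[] args) {
        ScanModel pointLookup = configureForHBase2(OptionalInt.of(1));
        ScanModel countStar = configureForHBase2(OptionalInt.empty());
        System.out.println("point lookup: " + pointLookup.readType + " limit=" + pointLookup.limit);
        System.out.println("count(*):     " + countStar.readType + " limit=" + countStar.limit);
        System.out.println("setSmall:     " + configureWithSetSmall().readType);
    }
}
```

In this model a count(*) style scan keeps the DEFAULT read type, so the decision between pread and stream read is left to the server, while a scan with a known row count carries only a limit. How Phoenix would detect the HBase version and where it would derive the limit from are left open here.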
--
This message was sent by Atlassian Jira
(v8.3.4#803005)