Posted to dev@phoenix.apache.org by "Viraj Jasani (Jira)" <ji...@apache.org> on 2021/03/12 07:14:00 UTC
[jira] [Assigned] (PHOENIX-6385) Not to use Scan#setSmall for HBase 2.x versions

     [ https://issues.apache.org/jira/browse/PHOENIX-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned PHOENIX-6385:
-------------------------------------

    Assignee: Viraj Jasani  (was: Anoop Sam John)

> Not to use Scan#setSmall for HBase 2.x versions
> -----------------------------------------------
>
>                 Key: PHOENIX-6385
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6385
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Anoop Sam John
>            Assignee: Viraj Jasani
>            Priority: Major
>
> {code:java}
> /**
>    * Set whether this scan is a small scan
>    * <p>
>    * Small scan should use pread and big scan can use seek + read seek + read is fast but can cause
>    * two problem (1) resource contention (2) cause too much network io [89-fb] Using pread for
>    * non-compaction read request https://issues.apache.org/jira/browse/HBASE-7266 On the other hand,
>    * if setting it true, we would do openScanner,next,closeScanner in one RPC call. It means the
>    * better performance for small scan. [HBASE-9488]. Generally, if the scan range is within one
>    * data block(64KB), it could be considered as a small scan.
>    * @param small
>    * @deprecated since 2.0.0 and will be removed in 3.0.0. Use {@link #setLimit(int)} and
>    *   {@link #setReadType(ReadType)} instead. And for the one rpc optimization, now we will also
>    *   fetch data when openScanner, and if the number of rows reaches the limit then we will close
>    *   the scanner automatically which means we will fall back to one rpc.
>    * @see #setLimit(int)
>    * @see #setReadType(ReadType)
>    * @see <a href="https://issues.apache.org/jira/browse/HBASE-17045">HBASE-17045</a>
>    */
>   @Deprecated
>   public Scan setSmall(boolean small) 
> {code}
> In 1.x there is a separate ClientSmallScanner implementation that performs the scan in a single RPC. In 2.x a scan is handled in a single RPC by default when it returns only a few rows. We might have to call setLimit.
> Also, in 2.x the setSmall() API sets the readType to PRead. That is good for small reads, such as when only one HFile block is read. But what I noticed in Phoenix is that even a count(*) query results in the Scan being set as a small scan and ends up using PRead (this will result in a full scan).
> So we should not set the ReadType at all. Instead, rely on HBase to switch the scan type intelligently (start with PRead and switch to stream read if it scans more data).
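
The proposal above can be sketched roughly as follows. This is only an illustrative model, not Phoenix or HBase code: ScanModel, configure and the ReadType enum here are hypothetical stand-ins so that the example compiles without the HBase client jars. The real calls they mirror are Scan#setLimit(int) and Scan#setReadType(ReadType), as named in the Javadoc quoted above. The setSmall stand-in only models the 2.x behaviour described in the issue (forcing PRead).

```java
// Hypothetical stand-in types; NOT the real org.apache.hadoop.hbase.client.Scan.
final class ScanConfigSketch {

    /** Mirrors the read types discussed above; DEFAULT leaves the choice to HBase. */
    enum ReadType { DEFAULT, STREAM, PREAD }

    /** Minimal model of the Scan settings the issue talks about. */
    static final class ScanModel {
        private ReadType readType = ReadType.DEFAULT;
        private int limit = -1; // in this model, -1 means "no row limit"

        ScanModel setLimit(int limit) {
            this.limit = limit;
            return this;
        }

        ScanModel setReadType(ReadType readType) {
            this.readType = readType;
            return this;
        }

        /** Models the deprecated 2.x behaviour described in the issue: small => PREAD. */
        ScanModel setSmall(boolean small) {
            if (small) {
                this.readType = ReadType.PREAD;
            }
            return this;
        }

        ReadType getReadType() {
            return readType;
        }

        int getLimit() {
            return limit;
        }
    }

    /**
     * Proposed configuration: never force a read type; only pass a row limit
     * when the caller knows one (knownRowLimit > 0), so that a short scan can
     * still complete in one RPC.
     */
    static ScanModel configure(ScanModel scan, int knownRowLimit) {
        if (knownRowLimit > 0) {
            scan.setLimit(knownRowLimit);
        }
        // Deliberately no setSmall(true) and no setReadType(...) here.
        return scan;
    }

    private ScanConfigSketch() { }
}
```

With this shape, a count(*) style scan with no known limit keeps the DEFAULT read type instead of being pinned to PRead, while a scan with a known small limit still carries that limit.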



--
This message was sent by Atlassian Jira
(v8.3.4#803005)