Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/01/24 21:32:39 UTC

[jira] [Resolved] (PHOENIX-2189) Starting from HBase 1.x, phoenix shouldn't probably override the hbase.client.scanner.caching attribute

     [ https://issues.apache.org/jira/browse/PHOENIX-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor resolved PHOENIX-2189.
-----------------------------------
    Resolution: Not A Problem

> Starting from HBase 1.x, phoenix shouldn't probably override the hbase.client.scanner.caching attribute
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2189
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2189
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>
> After PHOENIX-2188 is fixed, we need to think about whether it makes sense for Phoenix to override the scanner cache size on the HBase 1.x branches. For example, in HBase 1.1 the default value of hbase.client.scanner.caching is now Integer.MAX_VALUE.
> {code:xml}
> <property>
>     <name>hbase.client.scanner.caching</name>
>     <value>2147483647</value>
>     <description>Number of rows that we try to fetch when calling next
>     on a scanner if it is not served from (local, client) memory. This configuration
>     works together with hbase.client.scanner.max.result.size to try and use the
>     network efficiently. The default value is Integer.MAX_VALUE by default so that
>     the network will fill the chunk size defined by hbase.client.scanner.max.result.size
>     rather than be limited by a particular number of rows since the size of rows varies
>     table to table. If you know ahead of time that you will not require more than a certain
>     number of rows from a scan, this configuration should be set to that row limit via
>     Scan#setCaching. Higher caching values will enable faster scanners but will eat up more
>     memory and some calls of next may take longer and longer times when the cache is empty.
>     Do not set this value such that the time between invocations is greater than the scanner
>     timeout; i.e. hbase.client.scanner.timeout.period</description>
>   </property>
> {code}
> From the description it sounds like, by default, HBase now bounds the scanner cache by its size in bytes rather than by a number of rows.
> If we override hbase.client.scanner.caching to 1000, then for narrower rows we will likely fetch too few rows per RPC. For wider rows, the byte limit will kick in anyway to make sure we don't cache too much on the client.
> Maybe we shouldn't be using the scanner caching override at all? Thoughts? [~jamestaylor], [~lhofhansl]
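The interplay described above can be sketched with a small calculation. This is an illustration only, assuming a 2 MB hbase.client.scanner.max.result.size (the HBase 1.1 default) and a hypothetical Phoenix override of 1000 rows; the helper name rowsPerRpc is invented for this sketch and models the effective per-RPC row count as the smaller of the row-count cap and the byte-size cap:

```java
// Sketch of how hbase.client.scanner.caching and
// hbase.client.scanner.max.result.size interact per scanner RPC.
// Assumptions (not from the issue text): 2 MB max result size,
// a 1000-row Phoenix override, and uniform row sizes.
public class ScannerCachingSketch {

    /** Effective rows fetched per RPC: whichever cap is hit first. */
    static long rowsPerRpc(long rowSizeBytes, long cachingRows, long maxResultSizeBytes) {
        long rowsByByteLimit = Math.max(1, maxResultSizeBytes / rowSizeBytes);
        return Math.min(cachingRows, rowsByByteLimit);
    }

    public static void main(String[] args) {
        long maxResultSize = 2L * 1024 * 1024; // assumed 2 MB default
        long caching = 1000;                   // hypothetical Phoenix override

        // Narrow rows (100 bytes): the 1000-row override caps the fetch,
        // well below what the byte limit would allow.
        System.out.println(rowsPerRpc(100, caching, maxResultSize));

        // Wide rows (100 KB): the byte limit kicks in first,
        // so the row-count override is irrelevant.
        System.out.println(rowsPerRpc(100_000, caching, maxResultSize));
    }
}
```

Under these assumptions, narrow rows fetch only 1000 rows per RPC even though the 2 MB budget would hold far more, which is the "fetching too few rows" concern; for wide rows the byte limit already protects the client, which is why the override may be unnecessary.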



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)