Posted to dev@phoenix.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2015/02/13 02:52:11 UTC

[jira] [Commented] (PHOENIX-1304) Customize "full scan" behavior

    [ https://issues.apache.org/jira/browse/PHOENIX-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319422#comment-14319422 ] 

Lars Hofhansl commented on PHOENIX-1304:
----------------------------------------

In fact, we now have the information we need. We can use the guideposts to estimate how much data a scan will read, and set a threshold above which we'll scan with block caching disabled.
We'll probably want to scale this threshold with the number of region servers involved.

So we'd say: if the scan will read more than (say) 1 GB per region server, we'll scan with block caching disabled. Thoughts?
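
To make the proposal concrete, here is a minimal sketch of the decision. It assumes the guideposts yield a byte estimate per chunk of the scanned key range. The class, method names, and default threshold are all hypothetical and are not part of Phoenix or HBase. On the HBase side, the result would presumably feed into Scan.setCacheBlocks(false).

```java
import java.util.List;

/**
 * Hypothetical sketch only: none of these names exist in Phoenix or HBase.
 * Decides whether a scan should bypass the block cache based on an
 * estimated scan size and the number of region servers involved.
 */
public final class ScanCachePolicy {

    /** Assumed default taken from the "(say) 1GB" figure: 1 GB per region server. */
    public static final long DEFAULT_THRESHOLD_BYTES_PER_SERVER = 1L << 30;

    private ScanCachePolicy() {}

    /** Sums per-guidepost byte estimates for the key range the scan covers. */
    public static long estimateScanBytes(List<Long> guidePostBytes) {
        long total = 0;
        for (long bytes : guidePostBytes) {
            total += bytes;
        }
        return total;
    }

    /**
     * Returns true when the scan is expected to read more than
     * thresholdPerServer bytes per region server on average.
     */
    public static boolean shouldDisableBlockCache(
            long estimatedBytes, int regionServers, long thresholdPerServer) {
        if (regionServers <= 0) {
            throw new IllegalArgumentException("regionServers must be positive");
        }
        if (thresholdPerServer < 0) {
            throw new IllegalArgumentException("thresholdPerServer must be non-negative");
        }
        // If the total budget would overflow a long, no estimate can exceed it.
        if (thresholdPerServer > Long.MAX_VALUE / regionServers) {
            return false;
        }
        // Scale the budget with the number of region servers involved.
        return estimatedBytes > thresholdPerServer * regionServers;
    }
}
```

With the assumed 1 GB default, a scan estimated at 5 GB across 4 region servers would run with block caching disabled, while one estimated at exactly 4 GB would not. This uses an average per server; a skewed key range could still push a single server past the threshold, so a per-server estimate might be the better basis.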

> Customize "full scan" behavior
> ------------------------------
>
>                 Key: PHOENIX-1304
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1304
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Priority: Minor
>
> Most databases by default avoid filling the block cache during full scans.
> Typically, either stats are consulted to decide whether a full scan should fill the blockcache, or a subset of the blockcache is dedicated to full scans and used like a ring buffer.
> We already have the "NO_CACHE" hint, but we can do better.
> In Phoenix we could detect scans that neither use any parts of the key nor any indexes and then optionally:
> # avoid using the blockcache
> # throw a "slow query" exception (this is especially useful for large data sets, where we'd rather fail than go off into nirvana for an hour)
> (both configurable - either globally or per table or connection or query)
> Skip scans represent an interesting middle ground. If we skip many blocks between rows, we'd definitely benefit from the blockcache; if not, we have a case similar to a full scan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)