Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2017/03/21 23:36:42 UTC

[jira] [Commented] (PHOENIX-3744) Support snapshot scanners for MR-based queries

    [ https://issues.apache.org/jira/browse/PHOENIX-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935548#comment-15935548 ] 

James Taylor commented on PHOENIX-3744:
---------------------------------------

In an offline conversation we determined that the simplest initial path is to support snapshot reads for the MR integration queries (I've updated the JIRA subject to reflect this). We can make it configurable per job whether snapshot reads are used; this would apply when the client is OK with seeing data that is potentially an hour old, which is probably almost all of the time. Also, I believe our coprocessors are not needed in this case, since we only execute simple scans for our MR integration, which simplifies things.
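
To illustrate the per-job toggle described above, here is a minimal sketch of how a job could opt in to snapshot reads. It is only an illustration under stated assumptions: the class name, the two configuration keys, and the use of java.util.Properties as a stand-in for the Hadoop job configuration are all hypothetical and do not come from Phoenix or HBase.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only. None of these names or
// keys are taken from Phoenix or HBase.
final class SnapshotReadSettings {
    // Assumed per-job key names (hypothetical).
    static final String USE_SNAPSHOT_KEY = "example.mr.snapshot.read.enabled";
    static final String SNAPSHOT_NAME_KEY = "example.mr.snapshot.name";

    private SnapshotReadSettings() {}

    // Opt one job in to snapshot reads by recording the flag and the snapshot name.
    static void enableSnapshotReads(Properties jobConf, String snapshotName) {
        if (snapshotName == null || snapshotName.isEmpty()) {
            throw new IllegalArgumentException("snapshotName must be non-empty");
        }
        jobConf.setProperty(USE_SNAPSHOT_KEY, "true");
        jobConf.setProperty(SNAPSHOT_NAME_KEY, snapshotName);
    }

    // Jobs read the live table unless they explicitly opted in and named a snapshot.
    static boolean useSnapshotReads(Properties jobConf) {
        return Boolean.parseBoolean(jobConf.getProperty(USE_SNAPSHOT_KEY, "false"))
                && jobConf.getProperty(SNAPSHOT_NAME_KEY) != null;
    }

    // Describe which input source the job would scan.
    static String describeInput(Properties jobConf, String tableName) {
        return useSnapshotReads(jobConf)
                ? "snapshot:" + jobConf.getProperty(SNAPSHOT_NAME_KEY)
                : "table:" + tableName;
    }
}
```

Defaulting to live-table reads keeps existing jobs unchanged; only a job that accepts potentially stale data would call enableSnapshotReads.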

> Support snapshot scanners for MR-based queries
> ----------------------------------------------
>
>                 Key: PHOENIX-3744
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3744
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Akshita Malhotra
>
> HBase supports scanning over snapshots, with a SnapshotScanner that accesses the region directly in HDFS. We should make sure that Phoenix can support that.
> Not sure how we'd want to decide when to run a query over a snapshot. Some ideas:
> - if there's an SCN set (i.e. the query is running at a point in time in the past)
> - if the memstore is empty
> - if the query is being run at a timestamp earlier than any memstore data
> - as a config option on the table
> - as a query hint
> - based on some kind of optimizer rule (i.e. based on estimated # of bytes that will be scanned)
> Phoenix typically runs a query at the timestamp at which it was compiled. Any data committed after this time should not be seen while a query is running.
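
As a rough illustration of two of the ideas listed in the description (an empty memstore, or a query timestamp earlier than any memstore data), the check could look like the sketch below. The class and method names are hypothetical, and it assumes the caller can somehow obtain the earliest timestamp still held in the memstore; nothing here reflects how Phoenix or HBase actually expose that information.

```java
import java.util.OptionalLong;

// Hypothetical sketch, not Phoenix code. Inputs are assumed to be
// supplied by the caller.
final class SnapshotEligibility {
    private SnapshotEligibility() {}

    /**
     * @param queryTimestamp            timestamp the query runs at (compile time or SCN)
     * @param earliestMemstoreTimestamp oldest timestamp still in the memstore,
     *                                  or empty if the memstore holds no data
     * @return true if none of the data visible to the query can be in the memstore
     */
    static boolean canReadFromFiles(long queryTimestamp,
                                    OptionalLong earliestMemstoreTimestamp) {
        if (!earliestMemstoreTimestamp.isPresent()) {
            return true; // memstore is empty
        }
        // Strictly earlier, as a conservative reading of "earlier than any memstore data".
        return queryTimestamp < earliestMemstoreTimestamp.getAsLong();
    }
}
```

The other ideas in the list (a table-level option, a query hint, an optimizer rule) would sit in front of a check like this as explicit overrides; how they should be ordered is left open in the description.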



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)