Posted to dev@phoenix.apache.org by "Ethan Wang (JIRA)" <ji...@apache.org> on 2017/10/10 20:10:01 UTC

[jira] [Updated] (PHOENIX-153) Implement TABLESAMPLE clause

     [ https://issues.apache.org/jira/browse/PHOENIX-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Wang updated PHOENIX-153:
-------------------------------
    Description: 
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next hint based on the region boundaries of the table to only return n rows per region.

When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats rather than N, the size of the table.
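
The complexity claim above can be illustrated with a minimal sketch. It is not the Phoenix implementation: it assumes the stats are a list of M chunk start keys and that each chunk is kept or dropped by hashing its start key. The function name sample_guideposts, the use of SHA-256, and the row key format are all hypothetical choices made for illustration.

```python
import hashlib


def sample_guideposts(guideposts, percent):
    """Return the chunk start keys selected for a given sampling percentage.

    Assumption (not taken from the Phoenix source): each chunk is kept when a
    deterministic hash of its start key falls into a bucket below `percent`.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    kept = []
    # One decision per stats entry, so the loop runs M times regardless of
    # how many rows (N) the table holds.
    for start_key in guideposts:
        digest = hashlib.sha256(start_key).digest()
        bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
        if bucket < percent:
            kept.append(start_key)
    return kept


# Hypothetical stats: 1000 chunk start keys.
guideposts = [f"row{i:06d}".encode() for i in range(1000)]
sample = sample_guideposts(guideposts, 45)
print(len(sample), "of", len(guideposts), "chunks selected")
```

Because the decision depends only on the key and the percentage, repeating the call returns the same chunks, and a larger percentage always selects a superset of a smaller one. Whether Phoenix offers the same guarantees is not stated in this issue.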

[Update]
Usage:
https://phoenix.apache.org/tablesample.html

Syntax of using table sampling:
select * from PERSON TABLESAMPLE(45);
select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2

Source Code: 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25

  was:
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next hint based on the region boundaries of the table to only return n rows per region.

When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats rather than N, the size of the table.

[Update]
Syntax of using table sampling:
select * from PERSON TABLESAMPLE(45);
select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2

Source Code: 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25


> Implement TABLESAMPLE clause
> ----------------------------
>
>                 Key: PHOENIX-153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-153
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Ethan Wang
>              Labels: enhancement
>             Fix For: 4.12.0
>
>         Attachments: Sampling_Accuracy_Performance.jpg
>
>
> Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next hint based on the region boundaries of the table to only return n rows per region.
> When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats rather than N, the size of the table.
> [Update]
> Usage:
> https://phoenix.apache.org/tablesample.html
> Syntax of using table sampling:
> select * from PERSON TABLESAMPLE(45);
> select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2
> Source Code: 
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)