Posted to dev@phoenix.apache.org by "Ethan Wang (JIRA)" <ji...@apache.org> on 2017/10/10 20:10:01 UTC
[jira] [Updated] (PHOENIX-153) Implement TABLESAMPLE clause
[ https://issues.apache.org/jira/browse/PHOENIX-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Wang updated PHOENIX-153:
-------------------------------
Description:
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip-next hint, based on the region boundaries of the table, to return only n rows per region.
When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats (as opposed to N, the size of the table).
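To illustrate why the cost can scale with the size of the stats rather than the size of the table, here is a minimal sketch in which the keep-or-skip decision is made once per stats chunk, never per row. It is not taken from the Phoenix source: the class name, the method names, the use of chunk start keys as strings, and the choice of an FNV-1a hash are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the Phoenix code base.
final class ChunkSamplerSketch {

    // 32-bit FNV-1a hash; an arbitrary choice for this sketch.
    static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;      // FNV offset basis
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;     // FNV prime
        }
        return hash;
    }

    // Decide from the chunk's start key alone whether the chunk is sampled.
    // The same key and percentage always give the same answer.
    static boolean keepChunk(String chunkStartKey, double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        byte[] keyBytes = chunkStartKey.getBytes(StandardCharsets.UTF_8);
        int bucket = Math.floorMod(fnv1a(keyBytes), 100); // bucket in 0..99
        return bucket < percent;
    }

    // One pass over the M chunk boundaries; no table rows are read here.
    static List<String> sampleChunks(List<String> chunkStartKeys, double percent) {
        List<String> kept = new ArrayList<>();
        for (String key : chunkStartKeys) {
            if (keepChunk(key, percent)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed chunk start keys, standing in for boundaries taken from the stats.
        List<String> chunks = Arrays.asList("a", "f", "k", "p", "u");
        System.out.println(sampleChunks(chunks, 45));
    }
}
```

Because each decision depends only on the chunk key and the percentage, repeating a query with the same percentage selects the same chunks, and raising the percentage only ever adds chunks to the sample. With few chunks, the fraction kept can differ noticeably from the requested percentage.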
[Update]
Usage:
https://phoenix.apache.org/tablesample.html
Syntax for table sampling:
select * from PERSON TABLESAMPLE (45);
select count(*) from PERSON TABLESAMPLE (49) LIMIT 2;
Source Code:
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25
was:
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip-next hint, based on the region boundaries of the table, to return only n rows per region.
When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats (as opposed to N, the size of the table).
[Update]
Syntax for table sampling:
select * from PERSON TABLESAMPLE (45);
select count(*) from PERSON TABLESAMPLE (49) LIMIT 2;
Source Code:
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25
> Implement TABLESAMPLE clause
> ----------------------------
>
> Key: PHOENIX-153
> URL: https://issues.apache.org/jira/browse/PHOENIX-153
> Project: Phoenix
> Issue Type: Task
> Reporter: James Taylor
> Assignee: Ethan Wang
> Labels: enhancement
> Fix For: 4.12.0
>
> Attachments: Sampling_Accuracy_Performance.jpg
>
>
> Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip-next hint, based on the region boundaries of the table, to return only n rows per region.
> When the TABLESAMPLE clause is used, Phoenix samples the requested percentage of the HBase table with only O(M) run-time complexity, where M is the size of the stats (as opposed to N, the size of the table).
> [Update]
> Usage:
> https://phoenix.apache.org/tablesample.html
> Syntax for table sampling:
> select * from PERSON TABLESAMPLE (45);
> select count(*) from PERSON TABLESAMPLE (49) LIMIT 2;
> Source Code:
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)