Posted to issues@impala.apache.org by "Alexander Behm (JIRA)" <ji...@apache.org> on 2017/05/24 06:41:04 UTC

[jira] [Resolved] (IMPALA-5309) Implement TABLESAMPLE for HDFS tables

     [ https://issues.apache.org/jira/browse/IMPALA-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Behm resolved IMPALA-5309.
------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0

commit ee0fc260d1420b34a3d3fb1073fe80b3c63a9ab9
Author: Alex Behm <al...@cloudera.com>
Date:   Tue May 9 22:02:29 2017 -0700

    IMPALA-5309: Adds TABLESAMPLE clause for HDFS table refs.
    
    Syntax:
    <tableref> TABLESAMPLE SYSTEM(<number>) [REPEATABLE(<number>)]
    The first number specifies the percent of table bytes to sample.
    The second number specifies the random seed to use.
    
    The sampling is coarse-grained. Impala keeps randomly adding
    whole files to the sample until at least the desired percentage
    of file bytes has been reached.
    
    Examples:
    SELECT * FROM t TABLESAMPLE SYSTEM(10)
    SELECT * FROM t TABLESAMPLE SYSTEM(50) REPEATABLE(1234)
    
    Testing:
    - Added parser, analyzer, planner, and end-to-end tests
    - Private core/hdfs run passed
    
    Change-Id: Ief112cfb1e4983c5d94c08696dc83da9ccf43f70
    Reviewed-on: http://gerrit.cloudera.org:8080/6868
    Reviewed-by: Alex Behm <al...@cloudera.com>
    Tested-by: Impala Public Jenkins
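
For illustration, the coarse-grained sampling described in the commit message above can be sketched as follows. This is a minimal Python sketch, not Impala's frontend code: the function name sample_files, the (name, size) tuples, and the use of a seeded shuffle are all assumptions made for the example. It only shows the idea of adding whole files in random order until the byte target is met, and how a fixed seed (as with REPEATABLE) makes the selection reproducible.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical representation of a file: (name, size in bytes).
FileEntry = Tuple[str, int]


def sample_files(files: List[FileEntry], percent: float,
                 seed: Optional[int] = None) -> List[FileEntry]:
    """Pick whole files at random until at least `percent` of the
    total bytes is covered. Illustrative only; not Impala's code."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")

    # A seeded generator makes the selection repeatable, which is
    # the role REPEATABLE(<number>) plays in the SQL syntax.
    rng = random.Random(seed)

    total_bytes = sum(size for _, size in files)
    target_bytes = total_bytes * percent / 100.0

    # Shuffle a copy so the caller's list is left untouched.
    shuffled = list(files)
    rng.shuffle(shuffled)

    selected: List[FileEntry] = []
    selected_bytes = 0
    for name, size in shuffled:
        # Stop once the sample already covers the target.
        if selected_bytes >= target_bytes:
            break
        selected.append((name, size))
        selected_bytes += size
    return selected
```

Because whole files are added, the sample can overshoot the requested percentage, sometimes by a lot when files are few or large. That is the sense in which the sampling is coarse-grained.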


> Implement TABLESAMPLE for HDFS tables
> -------------------------------------
>
>                 Key: IMPALA-5309
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5309
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>            Reporter: Alexander Behm
>            Assignee: Alexander Behm
>             Fix For: Impala 2.9.0
>
>



