Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2022/02/08 03:55:00 UTC

[jira] [Updated] (HUDI-3177) CREATE INDEX command

     [ https://issues.apache.org/jira/browse/HUDI-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit updated HUDI-3177:
------------------------------
    Summary: CREATE INDEX command  (was: Support CREATE INDEX statement)

> CREATE INDEX command
> --------------------
>
>                 Key: HUDI-3177
>                 URL: https://issues.apache.org/jira/browse/HUDI-3177
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: index, metadata
>            Reporter: Sagar Sumit
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Users should be able to trigger index creation for one or more partitions using a CREATE INDEX statement.
>  
> {code:java}
> CREATE [BLOOM | COL_STATS | SOME_INDEX_TYPE] INDEX ON TABLE [table_name] FOR COLUMNS (col1, col2, col3) WITH OPTION (<file_group_count>, <some_other_option>);{code}
>  
> This maps to the following Hudi configs:
> {code:java}
> METADATA_PREFIX + ".index.bloom.filter.file.group.count"
> METADATA_PREFIX + ".index.column.stats.file.group.count"
> METADATA_PREFIX + ".index.bloom.filter.for.columns" -> comma-separated column names
> METADATA_PREFIX + ".index.column.stats.for.columns" -> comma-separated column names{code}
> The CLI indexer tool will also map user inputs to the above configs.
> By default, the bloom filter index is built only for the record key, and column stats are built for all columns.
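
The mapping described above, from a CREATE INDEX statement to config keys, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation in Hudi: the class `CreateIndexConfigSketch`, the enum `IndexType`, and the method `toConfigs` are hypothetical names, and the value given to `METADATA_PREFIX` is assumed, since the description only refers to the constant by name. When no columns are given, the column key is left unset so that the defaults described above would apply.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CreateIndexConfigSketch {
    // Assumption: the description only names METADATA_PREFIX; this value is illustrative.
    static final String METADATA_PREFIX = "hoodie.metadata";

    // Hypothetical enum: each index type carries the key segment used in the configs above.
    enum IndexType {
        BLOOM("bloom.filter"),
        COL_STATS("column.stats");

        final String keySegment;

        IndexType(String keySegment) {
            this.keySegment = keySegment;
        }
    }

    // Translate the parts of a CREATE INDEX statement into config key/value pairs.
    static Map<String, String> toConfigs(IndexType type, List<String> columns, int fileGroupCount) {
        Map<String, String> configs = new LinkedHashMap<>();
        String base = METADATA_PREFIX + ".index." + type.keySegment;
        configs.put(base + ".file.group.count", Integer.toString(fileGroupCount));
        // Leave the column key unset when no columns are listed, so defaults can apply.
        if (!columns.isEmpty()) {
            configs.put(base + ".for.columns", String.join(",", columns));
        }
        return configs;
    }

    public static void main(String[] args) {
        Map<String, String> configs =
            toConfigs(IndexType.COL_STATS, List.of("col1", "col2", "col3"), 4);
        configs.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A CLI indexer tool could reuse the same translation step, which would keep the SQL and CLI paths on one set of config keys.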



--
This message was sent by Atlassian Jira
(v8.20.1#820001)