Posted to commits@hudi.apache.org by "Prashant Wason (Jira)" <ji...@apache.org> on 2020/04/01 05:30:00 UTC

[jira] [Commented] (HUDI-757) Add a command to hudi-cli to export commit metadata

    [ https://issues.apache.org/jira/browse/HUDI-757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072394#comment-17072394 ] 

Prashant Wason commented on HUDI-757:
-------------------------------------

Test run on my setup against a Hudi table: the command exported 2761 instants into a local directory.

 

hudi:pwason_db.pwason_test_table->export instants --localFolder /home/pwason/instant_dump

...

Exported 2761 Instants

 

ls -al  /home/pwason/instant_dump/

drwxrwxr-x 2 pwason users 131072 Mar 31 20:55 .
drwxr-xr-x 17 pwason users 4096 Mar 31 19:56 ..
-rw-rw-r-- 1 pwason users 15000711 Mar 31 20:55 20200228063343.commit
-rw-rw-r-- 1 pwason users 12042716 Mar 31 20:55 20200228094924.commit
-rw-rw-r-- 1 pwason users 73320 Mar 31 20:55 20200228094925.clean
-rw-rw-r-- 1 pwason users 11420128 Mar 31 20:55 20200228211516.commit
-rw-rw-r-- 1 pwason users 73320 Mar 31 20:55 20200228211540.clean
-rw-rw-r-- 1 pwason users 7567466 Mar 31 20:55 20200228221520.commit
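
To sanity-check a dump like the one above, a small script can count the exported instants per action and report the time range they cover. The Python sketch below is only an illustration and is not part of hudi-cli. It assumes every exported file is named `<timestamp>.<action>` with a 14-digit `yyyyMMddHHmmss` timestamp, as in the listing. The function names and `SAMPLE_NAMES` are hypothetical.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path

# File names copied from the first entries of the listing above.
SAMPLE_NAMES = [
    "20200228063343.commit",
    "20200228094924.commit",
    "20200228094925.clean",
    "20200228211516.commit",
    "20200228211540.clean",
    "20200228221520.commit",
]


def parse_instant_filename(name):
    """Split '<timestamp>.<action>' into (datetime, action)."""
    timestamp, _, action = name.partition(".")
    # Assumption: the timestamp uses the yyyyMMddHHmmss layout seen in the listing.
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), action


def summarize_instants(names):
    """Return (count per action, earliest instant time, latest instant time)."""
    parsed = [parse_instant_filename(n) for n in names]
    counts = Counter(action for _, action in parsed)
    times = [t for t, _ in parsed]
    return counts, min(times), max(times)


def list_exported(folder):
    """Return the sorted names of regular files in an export folder."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


if __name__ == "__main__":
    counts, first, last = summarize_instants(SAMPLE_NAMES)
    for action, count in sorted(counts.items()):
        print(action, count)
    print("range:", first.isoformat(), "to", last.isoformat())
```

To run it against a real dump, pass `list_exported("/home/pwason/instant_dump")` to `summarize_instants` instead of `SAMPLE_NAMES`. The sum of the per-action counts should then match the "Exported 2761 Instants" line printed by the command.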

> Add a command to hudi-cli to export commit metadata
> ---------------------------------------------------
>
>                 Key: HUDI-757
>                 URL: https://issues.apache.org/jira/browse/HUDI-757
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>            Reporter: Prashant Wason
>            Priority: Minor
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> HUDI stores commit-related information in files within the .hoodie directory. Each commit / deltacommit / rollback / etc. creates one or more files. To keep the number of files manageable, older files are consolidated and moved into a commit archive, which stores multiple such files together in the HUDI log file format.
> While debugging issues or developing new features, it may be necessary to refer to the metadata of older commits / cleanups / rollbacks. There is no simple way to get these from a production setup, especially from the archive files.
> This enhancement provides a hudi cli command which allows exporting metadata from HUDI commit archives.


