Posted to dev@parquet.apache.org by "Xinli Shang (Jira)" <ji...@apache.org> on 2021/02/19 02:44:00 UTC

[jira] [Commented] (PARQUET-1948) TransCompressionCommand Inoperable

    [ https://issues.apache.org/jira/browse/PARQUET-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286825#comment-17286825 ] 

Xinli Shang commented on PARQUET-1948:
--------------------------------------

[~vanhooser], glad to see you are interested in this tool. We have been using it to translate existing Parquet files from GZIP to ZSTD. Let me know if you hit any issues. 

> TransCompressionCommand Inoperable
> ----------------------------------
>
>                 Key: PARQUET-1948
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1948
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.11.1
>         Environment: I am using parquet-tools 1.11.1 on a Mac machine running Catalina, and my parquet-tools jar was downloaded from Maven Central. 
>            Reporter: Shelby Vanhooser
>            Priority: Blocker
>              Labels: parquet-tools
>
> {{TransCompressionCommand}} in parquet-tools is intended to translate the compression type of Parquet files. We intend to use this functionality to debug a corrupted file, but the command currently fails to run at all. 
> Running the following command (on the uncorrupted file):
> {code:java}
> java -jar ./parquet-tools-1.11.1.jar trans-compression ~/Downloads/part-00048-69f65188-94b5-4772-8906-5c78989240b5_00048.c000.snappy.parquet{code}
> This results in 
>  
> {code:java}
> Unknown command: trans-compression{code}
>  
> I believe this is due to the Registry class [silently catching any errors during initialization|https://github.com/apache/parquet-mr/blob/master/parquet-tools/src/main/java/org/apache/parquet/tools/command/Registry.java#L65], which is subsequently [misinterpreted as an unknown command|https://github.com/apache/parquet-mr/blob/master/parquet-tools/src/main/java/org/apache/parquet/tools/Main.java#L200].
> We need to: 
>  # Write a test for the TransCompressionCommand to figure out why it shows up as an unknown command
>  # Probably expand these tests to cover all the other commands
>  
> This will then unblock our debugging work on the suspect file. 
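
The failure mode suspected in the description (an initialization error that is swallowed and then reported as "Unknown command") can be illustrated with a minimal, self-contained sketch. This is not the parquet-tools source: RegistrySketch, Command, lookupSilently, and lookupStrict are hypothetical names, and the initialization failure is simulated. It only shows why a lookup that returns null on any error cannot be told apart from an unregistered name, and one possible way to keep the two cases distinct.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a command registry; not the parquet-tools code.
class RegistrySketch {
    interface Command { String name(); }

    static final Map<String, Supplier<Command>> COMMANDS = new LinkedHashMap<>();
    static {
        COMMANDS.put("cat", () -> () -> "cat");
        // Assumption: simulates a command whose construction fails.
        COMMANDS.put("trans-compression", () -> {
            throw new IllegalStateException("simulated initialization failure");
        });
    }

    // Suspected failure mode: any error is swallowed and reported as null.
    static Command lookupSilently(String name) {
        Supplier<Command> factory = COMMANDS.get(name);
        if (factory == null) {
            return null;
        }
        try {
            return factory.get();
        } catch (RuntimeException e) {
            return null; // caller cannot tell this apart from an unregistered name
        }
    }

    // Alternative: keep "unknown" and "failed to initialize" distinguishable.
    static Command lookupStrict(String name) {
        Supplier<Command> factory = COMMANDS.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
        try {
            return factory.get();
        } catch (RuntimeException e) {
            throw new IllegalStateException("Command failed to initialize: " + name, e);
        }
    }

    public static void main(String[] args) {
        // A registered but broken command looks exactly like an unknown one.
        System.out.println(lookupSilently("trans-compression")); // prints "null"
        try {
            lookupStrict("trans-compression");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A test along the lines proposed in the description could iterate over every registered name and assert that the strict lookup returns an instance, so that an initialization failure surfaces with its cause instead of as an unknown command.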



--
This message was sent by Atlassian Jira
(v8.3.4#803005)