You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2020/06/20 05:40:14 UTC

[GitHub] [parquet-mr] shangxinli commented on pull request #796: Parquet-1872: Add TransCompression command to parquet-tools

shangxinli commented on pull request #796:
URL: https://github.com/apache/parquet-mr/pull/796#issuecomment-646945646


   > There are a lot of code duplications between `cli` and `tools`. I would suggest adding these to `parquet-hadoop` as a utility for this functionality and use it from the two command line tools. The unit test also can be placed in `parquet-hadoop` so you write it once only. (You may keep simple unit tests at the tools side to verify the tool itself but the functionality shall be verified in the module where it is implemented.)
   
   reply:
   I was able to move the core port to parquet-hadoop. However, I found it is hard share the test class because between parquet-hadoop and parquet-cli because by default the test class is not package as this article said https://dzone.com/articles/sharing-test-classes-between-multiple-modules-in-a. So I cannot utilize the test methods in parquet-hadoop test. There are several approaches as below. Let me know your thoughts. 
   1. Keep as is. Parquet-tool will be removed later. 
   2. Move the most parts into parquet-hadoop and implement the unit tests there. In parquet-cli and parquet-tool, we don't add unit test.  Give the code in parquet-cli/tools would be just a simple wrapper, it should be OK.  
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org