Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2021/01/04 00:21:42 UTC

[GitHub] [drill] cgivre opened a new pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

cgivre opened a new pull request #2133:
URL: https://github.com/apache/drill/pull/2133


   # [DRILL-7834](https://issues.apache.org/jira/browse/DRILL-7834): Add Utility Functions for Compressed Files
   
   ## Description
   
   Some format plugins that use third-party parsers throw errors when they receive compressed input streams from Drill. This PR introduces three utility methods on `DrillFileSystem`:
   * `isCompressed(<path>)`: Returns `true` if the input file is compressed, `false` otherwise.
   * `getCodec(<path>)`: Returns the compression codec of the file, if any.
   * `openDecompressedInputStream(<path>)`: Returns an `InputStream` that should be readable by parsers that read raw bytes. This method first reads the original `InputStream` into a `byte[]`, then returns it wrapped in a `ByteArrayInputStream`.
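
As a rough illustration of the idea behind these three methods, here is a minimal, self-contained sketch. It is not the code in this PR: the class name `CompressedFileSketch`, the `gzip` helper, the string return type of `getCodec`, and the suffix-based codec lookup are all assumptions made for the example, and it only knows about gzip. It uses nothing beyond `java.util.zip` so it can run on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the utility methods described above.
class CompressedFileSketch {

  // Assumption: the codec is inferred from the file suffix, gzip only.
  // Returns null when no codec applies.
  static String getCodec(String path) {
    return path.endsWith(".gz") ? "gzip" : null;
  }

  static boolean isCompressed(String path) {
    return getCodec(path) != null;
  }

  // Reads the (possibly compressed) stream fully into a byte[] and returns
  // it as a ByteArrayInputStream, so the parser only ever sees plain bytes.
  static InputStream openDecompressedInputStream(String path, InputStream raw)
      throws IOException {
    try (InputStream in = isCompressed(path) ? new GZIPInputStream(raw) : raw) {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      byte[] chunk = new byte[8192];
      int n;
      while ((n = in.read(chunk)) != -1) {
        buffer.write(chunk, 0, n);
      }
      return new ByteArrayInputStream(buffer.toByteArray());
    }
  }

  // Test helper (not part of the PR): gzip a byte array in memory.
  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(data);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] plain = "host:127.0.0.1\tstatus:200\n".getBytes(StandardCharsets.UTF_8);
    InputStream in = openDecompressedInputStream(
        "access.ltsv.gz", new ByteArrayInputStream(gzip(plain)));
    byte[] out = new byte[plain.length];
    int read = in.read(out);
    System.out.println(read == plain.length
        && new String(out, StandardCharsets.UTF_8).startsWith("host:"));
  }
}
```

One trade-off of this approach is that the whole decompressed file is held in memory, so memory use grows with the decompressed size of the file.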
   
   ## Documentation
   No user-facing changes.
   
   ## Testing
   Tested manually.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

Posted by GitBox <gi...@apache.org>.
cgivre merged pull request #2133:
URL: https://github.com/apache/drill/pull/2133


   





[GitHub] [drill] luocooong commented on pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

Posted by GitBox <gi...@apache.org>.
luocooong commented on pull request #2133:
URL: https://github.com/apache/drill/pull/2133#issuecomment-753734110


   +1





[GitHub] [drill] cgivre commented on pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2133:
URL: https://github.com/apache/drill/pull/2133#issuecomment-753733783


   I can add a comment.  If you're approving this PR, could you please add a `+1` to the comment?





[GitHub] [drill] cgivre commented on pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2133:
URL: https://github.com/apache/drill/pull/2133#issuecomment-753733624


   > @cgivre The `To EVF` work needs these good ideas. Is it possible to add a comment on the `openPossiblyCompressedStream()` function describing how the two differ?
   
   @luocooong 
   Thanks for the review. The `openPossiblyCompressedStream()` function opens an InputStream, but if the file is compressed you can get a ZipCompressedStream or something like that. In most cases that won't matter. However, I was working on a proprietary plugin that reads a byte array, and the Zip stream was breaking the reader; I'm not sure exactly why. The plugin in question also didn't work on S3 for the same reason.
   
   I'm working on refactoring the LTSV plugin and ran into the same issue. Hopefully this will make future development a little easier.
   
   





[GitHub] [drill] cgivre commented on pull request #2133: DRILL-7834: Add Utility Functions for Compressed Files

Posted by GitBox <gi...@apache.org>.
cgivre commented on pull request #2133:
URL: https://github.com/apache/drill/pull/2133#issuecomment-754039606


   @luocooong 
   I added a comment explaining the methods.

