Posted to commits@beam.apache.org by "Florian Scharinger (JIRA)" <ji...@apache.org> on 2016/10/03 01:07:21 UTC

[jira] [Commented] (BEAM-55) Allow users to compress FileBasedSink output files

    [ https://issues.apache.org/jira/browse/BEAM-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15541262#comment-15541262 ] 

Florian Scharinger commented on BEAM-55:
----------------------------------------

Hi Daniel,

When I raised this, I was under the impression that codecs like Snappy were not supported. We have since changed our system design significantly, so we no longer need to read uncompressed Avro files from GCS. That said, we will need to produce compressed text files for another part of our system in the future, so Jeffrey's contribution will be very useful.

Cheers,
Florian


> Allow users to compress FileBasedSink output files
> --------------------------------------------------
>
>                 Key: BEAM-55
>                 URL: https://issues.apache.org/jira/browse/BEAM-55
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Daniel Halperin
>            Priority: Minor
>
> FileBasedSink (also TextIO.Write, AvroIO.Write, etc.) does not have an option for compressing its output.
> In general, we discourage compression because it limits or blocks scalably reading from a file in parallel. However, users may want it -- so we should support the option (with appropriate warnings).


