Posted to user@flink.apache.org by Austin Cawley-Edwards <au...@gmail.com> on 2020/02/18 22:33:42 UTC

CSV StreamingFileSink

Hey all,

Has anyone had success using the StreamingFileSink[1] to write CSV files?
And if so, what about compressed (ideally gzipped) files, and which
libraries did you use?


Best,
Austin


[1]: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html

Re: CSV StreamingFileSink

Posted by Austin Cawley-Edwards <au...@gmail.com>.
Hey Timo,

Thanks for the bucket assignment link! It looks like most of my issues can
be solved by getting better acquainted with the Java file APIs rather than
in Flink-land.
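
For anyone landing here with the same CSV/gzip question, a minimal sketch of
the plain-Java side, assuming only the JDK (java.util.zip, java.nio.file).
The class and method names (GzipCsvSketch, escape, writeGzippedCsv,
readGzippedLines) are hypothetical and not part of Flink, and wiring this
into the StreamingFileSink is not shown:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, JDK only: writes rows as gzip-compressed CSV.
final class GzipCsvSketch {

    // Quote a field only if it contains a comma, a quote, or a line break;
    // embedded quotes are doubled (the usual CSV convention).
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Join the escaped fields of one row with commas.
    static String toCsvLine(List<String> fields) {
        List<String> escaped = new ArrayList<>();
        for (String field : fields) {
            escaped.add(escape(field));
        }
        return String.join(",", escaped);
    }

    // Write all rows to `target`, compressing on the fly with gzip.
    static void writeGzippedCsv(Path target, List<List<String>> rows) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(target)), StandardCharsets.UTF_8))) {
            for (List<String> row : rows) {
                writer.write(toCsvLine(row));
                writer.write("\n");
            }
        }
    }

    // Read the decompressed file back as raw lines (no CSV parsing).
    static List<String> readGzippedLines(Path source) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(source)), StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("records", ".csv.gz");
        writeGzippedCsv(file, Arrays.asList(
                Arrays.asList("id", "note"),
                Arrays.asList("1", "hello, world")));
        System.out.println(readGzippedLines(file));
    }
}
```

The try-with-resources block matters here: closing the writer also finishes
the gzip stream, and a gzip file that is never finished is generally not
readable.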


Best,
Austin

On Wed, Feb 19, 2020 at 6:48 AM Timo Walther <tw...@apache.org> wrote:

> Hi Austin,
>
> the StreamingFileSink allows bucketing the output data.
>
> This should help for your use case:
>
>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
>
> Regards,
> Timo

Re: CSV StreamingFileSink

Posted by Timo Walther <tw...@apache.org>.
Hi Austin,

The StreamingFileSink supports bucketing the output data.

This should help with your use case:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
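
To make the idea concrete, here is a minimal sketch of the per-record logic
behind daily buckets, written against the JDK only so that it runs on its
own. The names DailyBucketSketch and bucketId are hypothetical. In Flink
itself, as far as I know, the DateTimeBucketAssigner described on the linked
page accepts a format string such as "yyyy-MM-dd" and covers this case
without custom code:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, JDK only: derives a daily directory name from a timestamp.
final class DailyBucketSketch {

    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Map an epoch-millisecond timestamp to a bucket name such as "2020-02-18".
    // The time zone decides where one day ends and the next begins.
    static String bucketId(long epochMillis, ZoneId zone) {
        return DAY.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
    }

    public static void main(String[] args) {
        long first = Instant.parse("2020-02-18T22:33:42Z").toEpochMilli();
        long second = Instant.parse("2020-02-19T01:00:00Z").toEpochMilli();
        System.out.println(bucketId(first, ZoneOffset.UTC));  // prints 2020-02-18
        System.out.println(bucketId(second, ZoneOffset.UTC)); // prints 2020-02-19
    }
}
```

Every record that maps to the same bucket name ends up in the same
directory, which gives the 2020-02-18/ and 2020-02-19/ layout from the
question.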

Regards,
Timo


On 19.02.20 01:00, Austin Cawley-Edwards wrote:
> Following up on this -- does anyone know if it's possible to stream 
> individual files to a directory using the StreamingFileSink? For 
> instance, if I want all records that come in during a certain day to be 
> partitioned into daily directories:
> 
> 2020-02-18/
>     large-file-1.txt
>     large-file-2.txt
> 2020-02-19/
>     large-file-3.txt
> 
> Or is there another way to accomplish this?
> 
> Thanks!
> Austin


Re: CSV StreamingFileSink

Posted by Austin Cawley-Edwards <au...@gmail.com>.
Following up on this -- does anyone know if it's possible to stream
individual files to a directory using the StreamingFileSink? For instance,
if I want all records that come in during a certain day to be
partitioned into daily directories:

2020-02-18/
   large-file-1.txt
   large-file-2.txt
2020-02-19/
   large-file-3.txt

Or is there another way to accomplish this?

Thanks!
Austin
