Posted to user@flink.apache.org by Austin Cawley-Edwards <au...@gmail.com> on 2020/02/18 22:33:42 UTC
CSV StreamingFileSink
Hey all,
Has anyone had success using the StreamingFileSink[1] to write CSV files?
And if so, what about compressed (Gzipped, ideally) files? Which libraries
did you use?
Best,
Austin
[1]:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
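For reference, gzip-compressing row output is usually done by plugging a custom BulkWriter into StreamingFileSink.forBulkFormat rather than the row format. The core of such a writer only needs the JDK's GZIPOutputStream; the class below is an illustrative sketch of that core (the class name and row layout are invented for the example, not from this thread), writing to a byte array instead of Flink's output stream:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipCsvSketch {

    // Encode rows as CSV lines and gzip them in one pass -- roughly what a
    // custom BulkWriter implementation would do against the sink's stream.
    public static byte[] writeGzippedCsv(String[][] rows) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(buf), StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                w.write(String.join(",", row));
                w.write('\n');
            }
        } // closing the writer finishes the gzip stream
        return buf.toByteArray();
    }
}
```

A real BulkWriter would flush per element and finish the gzip stream in its finish() method; proper CSV quoting/escaping is also omitted here and would typically come from a CSV library.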
Re: CSV StreamingFileSink
Posted by Austin Cawley-Edwards <au...@gmail.com>.
Hey Timo,
Thanks for the assignment link! Looks like most of my issues can be solved
by getting better acquainted with Java file APIs and not in Flink-land.
Best,
Austin
Re: CSV StreamingFileSink
Posted by Timo Walther <tw...@apache.org>.
Hi Austin,
the StreamingFileSink allows bucketing the output data.
This should help for your use case:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
Regards,
Timo
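For the daily-directory layout asked about in this thread, Flink's DateTimeBucketAssigner can be constructed with a "yyyy-MM-dd" pattern so each record lands in a bucket named after its day. The bucket id it derives is essentially the computation below (a standalone sketch; the class name and UTC timezone choice are illustrative assumptions):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class DailyBucketSketch {

    // Derive a daily bucket id (i.e. the directory name) from a record's
    // epoch-millis timestamp, mirroring what a date-based bucket assigner
    // would return from getBucketId().
    public static String dailyBucketId(long epochMillis) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd")
                .withZone(ZoneOffset.UTC)
                .format(Instant.ofEpochMilli(epochMillis));
    }
}
```

Wired into the sink via withBucketAssigner(...), this kind of assigner should yield directories such as 2020-02-18/ and 2020-02-19/, with the sink's rolling policy deciding when files inside each bucket are split.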
Re: CSV StreamingFileSink
Posted by Austin Cawley-Edwards <au...@gmail.com>.
Following up on this -- does anyone know if it's possible to stream
individual files to a directory using the StreamingFileSink? For instance,
if I want all records that come in during a certain day to be
partitioned into daily directories:
2020-02-18/
    large-file-1.txt
    large-file-2.txt
2020-02-19/
    large-file-3.txt
Or is there another way to accomplish this?
Thanks!
Austin