Posted to user@flink.apache.org by pradeep s <sr...@gmail.com> on 2017/04/08 06:09:19 UTC

Flink to S3 streaming

Hi,
I have a use case to stream messages from Kafka to Amazon S3. I am not
using the S3 file system approach, since I need object tags to be added
to each object written to S3.
So I am planning to use the AWS S3 SDK. But I have a question about how to hold
the data until the message size reaches a few MBs and then write to S3. Also, what
sink should be used in this case if I am using the S3 SDK to write to S3?
Regards
Pradeep S

Re: Flink to S3 streaming

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
You would have to write your own SinkFunction that uses the AWS S3 SDK to write to S3. You might be interested in the work proposed in this Jira: https://issues.apache.org/jira/browse/FLINK-6306

As to buffering elements, I’m afraid you would also have to roll your own solution for now. You could use the Flink state API for that: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html This even has an example of a buffering Sink.
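For illustration, here is a minimal sketch of the buffering logic described above: hold records until they reach a byte-size threshold, then flush the batch as one object. The class name (SizeBufferedWriter) and the flush callback are hypothetical; the callback stands in for the actual AWS S3 SDK upload. In a real Flink job this logic would live inside your custom SinkFunction, with the buffer kept in Flink managed state (see the state docs linked above) so it survives failures.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: buffer records until a size threshold, then flush.
// In a Flink SinkFunction, add() would be the body of invoke(), close()
// would be the sink's close(), and flushAction would call the S3 SDK's
// putObject with the desired object tags attached to the request.
class SizeBufferedWriter {
    private final long flushThresholdBytes;
    private final Consumer<List<String>> flushAction; // stand-in for the S3 upload
    private final List<String> buffer = new ArrayList<>();
    private long bufferedBytes = 0;

    SizeBufferedWriter(long flushThresholdBytes, Consumer<List<String>> flushAction) {
        this.flushThresholdBytes = flushThresholdBytes;
        this.flushAction = flushAction;
    }

    // Called once per incoming record.
    void add(String record) {
        buffer.add(record);
        bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
        if (bufferedBytes >= flushThresholdBytes) {
            flush();
        }
    }

    // Hand the accumulated batch to the upload action, then reset the buffer.
    void flush() {
        if (!buffer.isEmpty()) {
            flushAction.accept(new ArrayList<>(buffer));
            buffer.clear();
            bufferedBytes = 0;
        }
    }

    // Flush any remainder on shutdown so no records are lost.
    void close() {
        flush();
    }
}
```

Note that buffering in plain instance fields like this loses data on failure; that is why the buffer should go into Flink's checkpointed state in the real sink.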

Best,
Aljoscha