Posted to user@flink.apache.org by ShB <sh...@gmail.com> on 2017/10/23 17:50:19 UTC

Adding headers to tuples before writing to S3

Hi,

I'm working with Flink for data analytics and reporting. The use case is
that, when a user requests a report, a Flink cluster does some computations
on the data, generates the final report (a DataSet of tuples) and uploads
the report to S3, after which an email is sent to the user's email address.
So the uploaded file needs to be the final, complete report, since that is
what gets sent to the user.

I'm struggling with adding a header to the final tuple DataSet that I get,
before writing it to S3. The header will be a tuple of the same arity as the
final dataset, but with all Strings, whereas my final report tuple dataset
has Long, Double, etc.

I've been trying to write my own writeToS3 function, which creates a CSV
file with the header followed by the DataSet tuples and then uploads it to
S3, but I'm having trouble scaling this to larger datasets.
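
To make the goal concrete, here is a minimal sketch of the output I'm
after, written in plain Java with no Flink types. The names CsvWithHeader,
toCsvLine and writeReport are made up for illustration, rows are modelled
as Object[] rather than Flink tuples, and fields are not quoted or escaped,
so treat it as an illustration of the layout rather than my actual code.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: writes an all-String header row followed by typed data rows.
class CsvWithHeader {

    // Joins the fields of one row with the delimiter. Fields may be String,
    // Long, Double, etc.; each is rendered with String.valueOf.
    // Assumption: no field contains the delimiter or a newline (no escaping).
    static String toCsvLine(Object[] fields, String delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(String.valueOf(fields[i]));
        }
        return sb.toString();
    }

    // Writes the header once, then streams the rows one at a time so the
    // whole report never has to be held in memory.
    static void writeReport(String[] header, Iterator<Object[]> rows, Writer out)
            throws IOException {
        out.write(toCsvLine(header, ","));
        out.write("\n");
        while (rows.hasNext()) {
            Object[] row = rows.next();
            if (row.length != header.length) {
                throw new IllegalArgumentException(
                        "Row arity " + row.length + " does not match header arity " + header.length);
            }
            out.write(toCsvLine(row, ","));
            out.write("\n");
        }
    }

    public static void main(String[] args) throws IOException {
        String[] header = {"id", "name", "score"};
        List<Object[]> rows = Arrays.asList(
                new Object[] {1L, "alice", 12.5},
                new Object[] {2L, "bob", 7.0});
        StringWriter out = new StringWriter();
        writeReport(header, rows.iterator(), out);
        System.out.print(out);
    }
}
```

The header has the same arity as the data rows but contains only Strings,
which is why it cannot simply be added to the typed tuple DataSet as one
more element.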

Is there a recommended way to do this? Is there any way I can extend Flink's
writeAsCsv method to write a header as well?

Thanks!


