Posted to user@flink.apache.org by Kamil ty <ka...@gmail.com> on 2021/08/17 08:13:18 UTC

PyFlink StreamingFileSink bulk-encoded format (Avro)

Hello,

I'm trying to save my data stream to an Avro file on HDFS. The Flink
documentation only explains how to do this in Java/Scala, and I can't
find a way to do it in PyFlink. Is this currently possible in PyFlink?

Kind Regards
Kamil

Re: PyFlink StreamingFileSink bulk-encoded format (Avro)

Posted by Dian Fu <di...@gmail.com>.
Hi Kamil,

AFAIK, the StreamingFileSink in the Python DataStream API does not yet support the Avro format. However, you could convert the DataStream to a Table[1] and then use any of the connectors supported in the Table & SQL API. In this case, the FileSystem connector[2] with the Avro format[3] should cover your requirements.
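
A minimal sketch of that workaround, assuming PyFlink 1.13 with the Avro format jar (and Hadoop, for hdfs:// paths) available on the classpath. The helper names build_avro_sink_ddl and write_stream_as_avro, the table name avro_sink, the two-column schema, and the sample rows are all hypothetical and only serve as illustration:

```python
def build_avro_sink_ddl(table_name, path, columns):
    """Return a CREATE TABLE statement for a filesystem sink that writes Avro.

    `columns` is a list of (column_name, sql_type) pairs. All names here are
    illustrative, not part of any Flink API.
    """
    column_defs = ",\n".join(f"  `{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table_name}` (\n"
        f"{column_defs}\n"
        ") WITH (\n"
        "  'connector' = 'filesystem',\n"
        f"  'path' = '{path}',\n"
        "  'format' = 'avro'\n"
        ")"
    )


def write_stream_as_avro(path):
    """Convert a small DataStream to a Table and insert it into the Avro sink."""
    # PyFlink is imported lazily so the DDL helper above works without it.
    from pyflink.common.typeinfo import Types
    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.table import StreamTableEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    # For unbounded streams the filesystem sink finalizes part files on
    # checkpoints, so checkpointing is enabled here (interval in milliseconds).
    env.enable_checkpointing(10000)
    t_env = StreamTableEnvironment.create(env)

    # Hypothetical sample data; a real job would use its own source and schema.
    ds = env.from_collection(
        [(1, "alice"), (2, "bob")],
        type_info=Types.ROW_NAMED(["id", "name"], [Types.INT(), Types.STRING()]),
    )
    table = t_env.from_data_stream(ds)

    # Register the sink table, then write the converted stream into it.
    t_env.execute_sql(
        build_avro_sink_ddl("avro_sink", path, [("id", "INT"), ("name", "STRING")])
    )
    table.execute_insert("avro_sink").wait()
```

Calling write_stream_as_avro("hdfs:///some/output/dir") (the path is a placeholder) would submit the job. The sink columns have to match the fields of the converted stream by position and type.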

Regards,
Dian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/python/datastream/intro_to_datastream_api/#emit-results-to-a-table--sql-sink-connector
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/filesystem/
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/formats/avro/

> On Aug 17, 2021, at 4:13 PM, Kamil ty <ka...@gmail.com> wrote:
> 
> Hello,
> 
> I'm trying to save my data stream to an Avro file on HDFS. The Flink documentation only explains how to do this in Java/Scala, and I can't find a way to do it in PyFlink. Is this currently possible in PyFlink?
> 
> Kind Regards
> Kamil