Posted to issues@flink.apache.org by "Benchao Li (Jira)" <ji...@apache.org> on 2020/08/06 15:06:00 UTC

[jira] [Commented] (FLINK-18202) Introduce Protobuf format

    [ https://issues.apache.org/jira/browse/FLINK-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172448#comment-17172448 ] 

Benchao Li commented on FLINK-18202:
------------------------------------

[~FrankZou] 

Regarding performance, I ran some simple tests earlier. Here is the summary (w = 10,000, so 167w OP/S is 1.67 million operations per second):
 * serialize
 ** protobuf java api: 167w OP/S
 ** dynamic message: 150w OP/S
 * deserialize
 ** protobuf java api: 90.3w OP/S
 ** dynamic message: 26.5w OP/S
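
As a rough reading of those numbers, the small Python sketch below computes how much slower dynamic message is than the protobuf java api for each operation. It assumes "w" means 10,000; the dictionary keys and the `slowdown` helper are names made up for illustration and are not part of any Flink or Protobuf API.

```python
# Throughput figures copied from the summary above, in operations per second.
# Assumption: the "w" suffix means 10,000, so 167w = 1,670,000.
W = 10_000

throughput = {
    "serialize": {"java_api": 167 * W, "dynamic": 150 * W},
    "deserialize": {"java_api": 90.3 * W, "dynamic": 26.5 * W},
}


def slowdown(op):
    """Return how many times slower dynamic message is than the java api."""
    return throughput[op]["java_api"] / throughput[op]["dynamic"]


for op in throughput:
    print(f"{op}: dynamic message is {slowdown(op):.2f}x slower")
```

Under that reading, serialization is roughly on par (about 1.1x), while deserialization with dynamic message is about 3.4x slower.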

Regarding the default value for missing fields, I have no strong opinion; using `null` is fine with me.

> Introduce Protobuf format
> -------------------------
>
>                 Key: FLINK-18202
>                 URL: https://issues.apache.org/jira/browse/FLINK-18202
>             Project: Flink
>          Issue Type: New Feature
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / API
>            Reporter: Benchao Li
>            Priority: Major
>         Attachments: image-2020-06-15-17-18-03-182.png
>
>
> PB[1] is a well-known and widely used (de)serialization framework. The ML[2] also has some discussions about this. It's a useful feature.
> This issue may need some design work, or a FLIP.
> [1] [https://developers.google.com/protocol-buffers]
> [2] [http://apache-flink.147419.n8.nabble.com/Flink-SQL-UDF-td3725.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)