Posted to user@flink.apache.org by Maciej Obuchowski <ob...@gmail.com> on 2021/02/24 16:43:54 UTC

Re: Jackson object serialisations

Hey Lasse,
I've had a similar case, albeit with Avro. I was reading from multiple
Kafka topics, which all carried different objects, and did some
metadata-driven operations on them.
I could not go with any concrete predefined types for them, because
there were hundreds of different object types.

My solution was to serialize the object itself manually as byte[] and
deserialize it manually in the operator that needs it.
You can do the same thing with JSON using something like
objectMapper.writeValueAsBytes and transfer the data as Tuple2<String,
byte[]>.
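
A rough sketch of that pattern (the class and method names are just
illustrative, not from my actual pipeline):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;

public class JsonAsBytes {

    // ObjectMapper is thread-safe for read/write calls; a static instance
    // keeps it out of the operator closures.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Before shipping between operators: keep the key, turn the tree into bytes.
    public static DataStream<Tuple2<String, byte[]>> toBytes(
            DataStream<Tuple2<String, JsonNode>> input) {
        return input
                .map(t -> Tuple2.of(t.f0, MAPPER.writeValueAsBytes(t.f1)))
                .returns(Types.TUPLE(Types.STRING, Types.PRIMITIVE_ARRAY(Types.BYTE)));
    }

    // Inside the operator that actually needs the JSON: rebuild the tree there.
    public static DataStream<String> extractField(
            DataStream<Tuple2<String, byte[]>> input, String field) {
        return input
                .map(t -> {
                    JsonNode node = MAPPER.readTree(t.f1);
                    return t.f0 + ":" + node.path(field).asText();
                })
                .returns(Types.STRING);
    }
}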

Overall, Flink does not support "dynamic" data types very well.

Regards,
Maciej

On Wed, 24 Feb 2021 at 17:08, Lasse Nedergaard
<la...@gmail.com> wrote:
>
> Hi
>
> I’m looking for advice on the best and simplest way to handle JSON in Flink.
>
> Our system is data driven and based on JSON. As the structure isn’t static, mapping it to POJOs isn’t an option, so I transfer ObjectNode and/or ArrayNode between operators, either in tuples
> (Tuple2<String, ObjectNode>) or as attributes in POJOs.
>
> Flink doesn’t know about Jackson objects and therefore falls back to Kryo.
>
> I see two options:
> 1. Write Kryo serialisers for all the Jackson types we use and register them.
> 2. Register the Jackson objects as Flink types.
>
> I guess option 2 performs best, but it requires an annotation on the classes, and I can’t do that for third-party objects. One workaround could be to create my own classes that extend the Jackson objects and use them between operators.
>
> I can’t be the first to solve this problem, so I’d like to hear what the community suggests.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>

Re: Jackson object serialisations

Posted by Lasse Nedergaard <la...@gmail.com>.
Thanks for your feedback. 

I’ll go with specific Kryo serialisation, as it makes the code easier to use, and if I encounter performance problems I can change the data format later.
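
Roughly what I have in mind is the sketch below (untested, and the class
name is my own); a similar serialiser would be needed for ArrayNode and
any other Jackson node types we ship between operators.

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.IOException;

// Kryo serialiser that just round-trips the node through Jackson bytes.
public class ObjectNodeKryoSerializer extends Serializer<ObjectNode> {

    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public void write(Kryo kryo, Output output, ObjectNode node) {
        try {
            byte[] bytes = mapper.writeValueAsBytes(node);
            output.writeInt(bytes.length);
            output.writeBytes(bytes);
        } catch (IOException e) {
            throw new RuntimeException("Could not serialise ObjectNode", e);
        }
    }

    @Override
    public ObjectNode read(Kryo kryo, Input input, Class<ObjectNode> type) {
        try {
            byte[] bytes = input.readBytes(input.readInt());
            return (ObjectNode) mapper.readTree(bytes);
        } catch (IOException e) {
            throw new RuntimeException("Could not deserialise ObjectNode", e);
        }
    }
}

// Registered once in the job setup, e.g.:
// env.getConfig().addDefaultKryoSerializer(ObjectNode.class, ObjectNodeKryoSerializer.class);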

Med venlig hilsen / Best regards
Lasse Nedergaard

