Posted to user@flink.apache.org by "Kruse, Sebastian" <Se...@hpi.de> on 2015/04/17 08:54:03 UTC
Collections within POJOs/tuples
Hello everyone,
I was just wondering, which class would be most efficient to store collections of primitive elements, and which one to store objects, within POJOs and tuples from a serialization point of view. And would it make any difference if such a collection is not embedded within a POJO/tuple but is the "top-level element"?
Cheers,
Sebastian
Re: Collections within POJOs/tuples
Posted by Stephan Ewen <se...@apache.org>.
Here are some rough cornerpoints for serialization efficiency in Flink:
- Tuples are a bit more efficient than POJOs, because they do not support
(and encode) possible subclasses and they do not involve any reflection
code at all.
- Arrays are more efficient than collections (in Scala, collections go
through the collection serializer, sometimes through Kryo; arrays are
handled directly)
- The efficiency of a "Tuple1<Type>" is virtually the same as that of
"Type", since the Tuple itself is never serialized; it exists only in the
TypeInformation metadata and not in the serialized data
- Arrays or lists as a top-level element are good. Don't put them into a
POJO or a tuple unless you also need to add other fields.
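
To get a rough feeling for the array-versus-collection point, here is a minimal plain-Java sketch. It is not Flink code and does not use Flink's serializers: the class SerializedSizeSketch and both helper methods are made up for illustration, the array layout (a 4-byte length followed by 4 bytes per element) is an assumed format, and java.io object serialization only stands in for a general-purpose serializer such as Kryo. Actual numbers in Flink will differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of Flink).
class SerializedSizeSketch {

    // Direct path: write the length, then each int as 4 raw bytes.
    // This layout is an assumption, not Flink's actual wire format.
    static byte[] writeIntArray(int[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Generic path: java.io serialization as a stand-in for a
    // general-purpose serializer; every element is a boxed object.
    static byte[] writeIntList(List<Integer> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] array = new int[n];
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            array[i] = i;
            list.add(i);
        }
        System.out.println("int[] written directly: " + writeIntArray(array).length + " bytes");
        System.out.println("List<Integer> generic:  " + writeIntList(list).length + " bytes");
    }
}
```

The direct path writes exactly 4 + 4 * n bytes, while the generic path also has to encode class descriptors and per-object markers, so it comes out larger for the same values.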
Let us know if you have more questions.