Posted to dev@directory.apache.org by "Zheng, Kai" <ka...@intel.com> on 2015/09/19 02:05:59 UTC

A default backend based on Google FlatBuffers?

Hi,

What do you think of Google FlatBuffers? Its performance is impressive, and it eliminates the decoding step entirely! I'm thinking about a built-in, default backend for Kerby based on this format. The existing backends are only suitable in some cases, but a default backend like the kdb used in MIT KDC would be desirable for production deployments of the Kerby KDC in the future. I mention it in case it's also useful in other components.

https://github.com/google/flatbuffers
http://google.github.io/flatbuffers/md__benchmarks.html

Regards,
Kai


RE: A default backend based on Google FlatBuffers?

Posted by "Zheng, Kai" <ka...@intel.com>.
>> What I don't get is how you go from a Java instance to a flatbuffer...
Simply put, it just copies the byte representation of each field of the object into the flat buffer, as we did 'by hand' in ApacheDS. The difference might be that the 'by hand' code is generated from a predefined schema file instead of being written manually. No Java serialization is involved. Note that Java is just one of the supported languages.
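
To make the 'by hand' idea concrete, here is a minimal sketch using only java.nio. It is not the FlatBuffers API and not the real FlatBuffers wire format (which, as far as I understand, adds offset tables for optional fields); the PrincipalRecord class, its fields, and the layout are assumptions made up for illustration. It only shows fields being copied into a buffer at fixed positions and then read back in place, with no reflection and no Java serialization.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical record type, for illustration only (not a Kerby or ApacheDS class).
final class PrincipalRecord {
    final String name;
    final int keyVersion;
    final long expirationTime;

    PrincipalRecord(String name, int keyVersion, long expirationTime) {
        this.name = name;
        this.keyVersion = keyVersion;
        this.expirationTime = expirationTime;
    }

    // Assumed layout, little-endian:
    // [int keyVersion][long expirationTime][int nameLength][name bytes, UTF-8]
    ByteBuffer encode() {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 4 + nameBytes.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(keyVersion).putLong(expirationTime)
           .putInt(nameBytes.length).put(nameBytes);
        buf.flip();
        return buf;
    }

    // Accessors read a single field straight from the buffer, given the
    // offset where the record starts. No object is rebuilt.
    static int keyVersion(ByteBuffer buf, int start) {
        return buf.getInt(start);
    }

    static long expirationTime(ByteBuffer buf, int start) {
        return buf.getLong(start + 4);
    }

    static String name(ByteBuffer buf, int start) {
        int length = buf.getInt(start + 12);
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            bytes[i] = buf.get(start + 16 + i);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With FlatBuffers, code of this kind would be produced by the schema compiler rather than written manually, which is the difference mentioned above.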

For now, one thing I'm not sure about is how large a flat buffer can be. I guess it may not work if the content is too large to be held in memory, unless there is extra paging support like we have in Mavibot/Btree.

Regards,
Kai



Re: A default backend based on Google FlatBuffers?

Posted by Emmanuel Lécharny <el...@gmail.com>.
On 19/09/15 13:55, Zheng, Kai wrote:
> Thanks Emmanuel for the feedback.
>
> Yes, it needs encoding, which means the FlatBuffers format would be used to store entries.
>
> It does avoid decoding data, because it operates directly on the binary data. In detail, one may mmap a file to load its content and get a ByteBuffer. All the entries can then be looked up directly in that ByteBuffer. As you said, an entry has to be found efficiently, so I think a mapping from each key to the start address of the corresponding object would be required. Given the start address of an object, all of its fields can be retrieved directly without any decoding.

What I don't get is how you go from a Java instance to a flatbuffer...
Because, make no mistake, the costly processing is the serialization. In
FlatBuffers, how is this serialization done?

For instance, in ApacheDS, we serialize entries and other data using our
own implementation, not depending on any Java default serialization
(through reflection). The gain is massive. That's what I don't get:
what does FlatBuffers offer that is better than what we do when
serializing 'by hand'?



RE: A default backend based on Google FlatBuffers?

Posted by "Zheng, Kai" <ka...@intel.com>.
Thanks Emmanuel for the feedback.

Yes, it needs encoding, which means the FlatBuffers format would be used to store entries.

It does avoid decoding data, because it operates directly on the binary data. In detail, one may mmap a file to load its content and get a ByteBuffer. All the entries can then be looked up directly in that ByteBuffer. As you said, an entry has to be found efficiently, so I think a mapping from each key to the start address of the corresponding object would be required. Given the start address of an object, all of its fields can be retrieved directly without any decoding.
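
A rough sketch of that lookup path, using only the JDK: the file is memory-mapped, a key-to-offset map is built, and a field is then read at its offset without rebuilding any object. The MappedEntryStore class, the record layout, and the single keyVersion field are assumptions for illustration only; this is not FlatBuffers or Kerby code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical store, for illustration only.
// Assumed record layout: [int keyLength][key bytes, UTF-8][int keyVersion]
final class MappedEntryStore {
    private final ByteBuffer data;
    // Maps each key to the offset of its keyVersion field in the mapped file.
    private final Map<String, Integer> offsets = new HashMap<>();

    // Writes all entries to the file in the assumed layout.
    static void write(Path file, Map<String, Integer> entries) throws IOException {
        int size = 0;
        for (String key : entries.keySet()) {
            size += 4 + key.getBytes(StandardCharsets.UTF_8).length + 4;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            byte[] keyBytes = e.getKey().getBytes(StandardCharsets.UTF_8);
            out.putInt(keyBytes.length).put(keyBytes).putInt(e.getValue());
        }
        Files.write(file, out.array());
    }

    // Maps the file into memory; the mapping stays valid after the channel closes.
    static MappedEntryStore open(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return new MappedEntryStore(
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()));
        }
    }

    private MappedEntryStore(ByteBuffer data) {
        this.data = data;
        // One scan at open time to build the key-to-offset index.
        int pos = 0;
        while (pos < data.limit()) {
            int keyLength = data.getInt(pos);
            byte[] keyBytes = new byte[keyLength];
            for (int i = 0; i < keyLength; i++) {
                keyBytes[i] = data.get(pos + 4 + i);
            }
            offsets.put(new String(keyBytes, StandardCharsets.UTF_8), pos + 4 + keyLength);
            pos += 4 + keyLength + 4;
        }
    }

    // Reads the field straight from the mapped bytes; returns null for unknown keys.
    Integer keyVersion(String key) {
        Integer offset = offsets.get(key);
        return offset == null ? null : data.getInt(offset);
    }
}
```

The index here is rebuilt by scanning the file once when it is opened. A real backend would more likely persist the index as well, so that no full scan is needed.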

Please take your time reading the documentation; I think it's interesting. I hope it helps in our projects.

You're right, it's better to have a prototype. Code talks. I will give it a try.

Regards,
Kai



Re: A default backend based on Google FlatBuffers?

Posted by Emmanuel Lécharny <el...@gmail.com>.
On 19/09/15 02:05, Zheng, Kai wrote:
> Hi,
>
> What do you think of Google FlatBuffers? Its performance is impressive, and it eliminates the decoding step entirely! I'm thinking about a built-in, default backend for Kerby based on this format. The existing backends are only suitable in some cases, but a default backend like the kdb used in MIT KDC would be desirable for production deployments of the Kerby KDC in the future. I mention it in case it's also useful in other components.

I still don't get what it brings. At some point, you still have to
serialize/deserialize data, and if you want to be efficient, you need a
way to find data efficiently (i.e., without parsing the whole file).

Give me a couple of days to look at the documentation. At some point,
some prototype would certainly be useful.