You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ignite.apache.org by Andrey Kornev <an...@hotmail.com> on 2017/11/03 20:07:16 UTC

Deserialization of BinaryObject's byte arrays

Hello,

We store lots of byte arrays (serialized graph-like data structures) as fields of BinaryObject. Later on, while reading such field, BinaryInputStream implementation creates an on-heap array and copies the bytes from the BinaryObject's internal backing array to the new array.

While in general case it works just fine, in our case, a lot of CPU is spent on copying of the bytes and significant amount garbage is generated as well.

I'd like Ignite Community to consider the following proposal: introduce support for serialization/deserialization of the ByteBuffer class.  A BinaryInputStream implementation would special case ByteBuffer fields same way it currently special cases various Java types (collections, dates, timestamps, enums, etc.)

The application developer would simply call BinaryObjectBuilder.setField(...) passing in an instance of ByteBuffer. During serialization, the ByteBuffer's bytes would simply be copied to the backing array (similar to how regular byte arrays are serialized today) and in the binary object's metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.

During deserialization of the field, instead of allocating a new byte array and transferring the bytes from the BinaryObject's backing array, BinaryInputStream would simply wrap the required sub-array into a read-only instance of ByteBuffer using "public static ByteBuffer wrap(byte[] array, int offset, int length)" and return the instance to the application.

This approach doesn't require any array allocation/data copying and is faster and significantly reduces pressure on the garbage collector. The benefits are especially great when application stores significant amount of its data as byte arrays.

Clearly, this approach equally applies to arrays of other primitive types.

A minor complication does arise when dealing with off-heap BinaryObjects, but nothing that can't solved with a little bit of Java reflection wizardry (or by other means).

Thanks,
Andrey

Re: Deserialization of BinaryObject's byte arrays

Posted by zhenya <ar...@mail.ru>.
Looks like this issue would be linked too.
https://issues.apache.org/jira/browse/IGNITE-5602



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: Deserialization of BinaryObject's byte arrays

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Andrey,

Special collection types support was our mistake when we designed binary
protocol in the first place.

As far as BinaryObject idea - makes sense. But I would avoid any kind of
implicit conversions. Special method for byte array should be enough.

вт, 7 нояб. 2017 г. в 22:01, Andrey Kornev <an...@hotmail.com>:

> Vladimir,
>
> I appreciate your taking the time to reply. Thank you for sharing your
> thoughts!
>
> I have to admit I'm still not convinced there is anything inherently wrong
> about having the Binary protocol support ByteBuffers as first class
> citizen. The Binary marshaller already supports specific collection/map
> types (such as linked list, array list, hash set etc.) which all, one
> could argue, are just "thin wrappers" around "fundamental" sequence/map
> abstractions. I don't see why ByteBuffer can't be treated the same way.
>
> Also, your approach, given a relatively simple problem we're trying to
> solve here, feels very much over-engineered. As an Ignite user, I'd like to
> have a *simple* way of deserializing primitive arrays efficiently. Having
> to implement a bunch of interfaces to do just that doesn't fit my (Ignite
> user's) definition of "simple". Contrary to your statement below, the APIs
> you define (or not define) do have a lot to do with user experience.
>
> In any case, here's a new proposal. And this time, it doesn't require
> modification of the Binary protocol.
>
> Essentially what we need is just a way to tell the Binary marshaller
> how array fields are to be deserialized. It can be done globally (for
> example, by introducing a new config property on BinaryConfiguration). Or,
> it can be done on per binary object instance basis. For example, we can add
> a new method to the BinaryObject class (e.g. BinaryObject.withArrayAsBuffer())
> to make subsequent calls to BinaryObject.field() return an instance of
> XxxBuffer class rather than a raw array (if the underlying field is of
> course an array).
>
> (Alternatively, we could enhance the BinaryObject class by adding:
>
> Buffer fieldAsBuffer(String fieldName);
>
> method in addition to the existing:
>
> T field(String fieldName);
>
> This would make it possible to read any field (including scalar
> primitive types and String's) as an instance of XxxBuffer that wraps the
> field's raw bytes in the BinaryObject's backing array.)
>
> Actually I personally like this alternative better -- it's more consistent
> and symmetric, and opens up opportunities for further optimization. For
> example, in some cases (such as equality checks) I don't need an
> actual instance of a String -- its character (or even byte) arrays are
> sufficient.
>
> The good thing is that neither approach requires the field to have
> been originally written (serialized) as XxxBuffer. It could've been just a
> regular array. The actual type (e.g. byte[] or ByteBuffer) only matters
> during deserialization.
>
> Regards
> Andrey
>
> ------------------------------
> *From:* Vladimir Ozerov <vo...@gridgain.com>
> *Sent:* Tuesday, November 7, 2017 12:25 AM
> *To:* dev@ignite.apache.org
>
> *Subject:* Re: Deserialization of BinaryObject's byte arrays
> Regarding "sefl-imposed rules":
> 1) Binary protocol is essentially ignite's type system. It is used in many
> places: Java, .NET, CPP, SQL (native, JDBC, ODBC), thin client. Of those 5
> parties only Java has "ByteBuffer" concept. How other platforms should work
> with this type?
> 2) We compare objects using binary representation
> (BinaryArrayIdentityResolver#equals0). So if one user write data as byte
> array, and another write the same data as ByteBuffer, resulting objects
> will not be equal
> These are clear examples on why type system should be as small as possible
> - we simply cannot afford extending type system for every minor use case.
> Binary protocol is internal thing and has nothing to do with user
> experience.
>
> What is worse, ByteBuffer is terrible abstraction as recognized by many
> millions of Java users - it is stateful. One cannot read data from it
> without changing it's state. And Ignite gives no guarantees that particular
> object will be serialized only once per certain operation. One example is
> BinaryMetadataCollector - Ignite could go through object twice per
> serialization cycle to collect metadata. How are we going to maintain
> idempotence in general case?
>
> What could improve UX is public interfaces. Raw example of what we can do:
> 1) Define generic wrappers over our streams that will give users ability to
> read/write byte arrays as they want. E.g.:
> interface BinaryArrayWriter {
>     write(byte[] src, int offset, int len);
>     write(ByteBuffer src, ...);
> }
>
> interface BinaryArrayReader {
>     int readLength();
>     byte[] read();
>     void read(byte[] dest, int off, int len);
>     void read(ByteBuffer dest);
> }
>
> 2) Then add new methods to reader/writer/builder. E.g.:
> BinaryWriter.writeByteArray(String name, BiConsumer<Object,
> BinaryArrayWriter> consumer); // Consumes original field and array writer;
> BinaryReader.readByteArray(String name, Function<BinaryArrayReader>
> consumer);            // Takes array reader, produces original field
>
> 3) Last, introduce field annotation for user convenience:
> class MyClass {
>     @BinaryArray(writerClass = MyWriterBiConsumer.class, readerClass =
> MyReaderFunction.class)
>     private ByteBuffer buf;
> }
>
> Now all user need to do is to implement write consumer and read function.
>
> Vladimir.
>
>
> On Tue, Nov 7, 2017 at 10:16 AM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > ByteBuffer is not fundamental data type. It is a thin wrapper over array.
> > Protocol should contain only fundamental types.
> >
> > вт, 7 нояб. 2017 г. в 10:05, Andrey Kornev <an...@hotmail.com>:
> >
> >> Vladimir,
> >>
> >> Could you please elaborate as to why adding ByteBuffer to the Binary
> >> protocol is “not a good idea”? What is inherently wrong about it?
> >>
> >> As for your suggestion, while I certainly see your point, it comes with
> >> the inconvenience of implementing Binarylizable for every entity type
> in an
> >> application, which makes me wonder if the self imposed rules such as
> “type
> >> system should be kept as small as possible” could be relaxed in the
> name of
> >> greater good and user friendliness. Ignite API is not easy to grok as
> it is
> >> already.
> >>
> >> Thanks
> >> Andrey
> >> _____________________________
> >> From: Vladimir Ozerov <vozerov@gridgain.com<mailto:vozerov@gridgain.com
> >>
> >> Sent: Monday, November 6, 2017 10:22 PM
> >> Subject: Re: Deserialization of BinaryObject's byte arrays
> >> To: <de...@ignite.apache.org>>
> >> Cc: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
> >> valentin.kulichenko@gmail.com>>
> >>
> >>
> >> Hi Andrey,
> >>
> >> While we deifnitely need to support byte buffers, adding new type to
> >> protocol is not very good idea. Binary protocol is the heart of our
> >> ssytem,
> >> which is used not oly for plain Java, but for other platforms and SQL.
> And
> >> ByteBuffer is essentially the same byte array, but with different API.
> >> What
> >> you describe is a special case of byte array read/write. For this reason
> >> our type system should be kept as small as possible and it doesn't
> require
> >> new type to be introduced. Instead, we'd better to handle everything on
> >> API
> >> level only. For example, we can add new methods to
> reader/writer/builder,
> >> which will handle this. E.g.:
> >>
> >> BinaryWriter.writeByteArray(<some producer here>);
> >> BinaryReader.readByteArray(<some consumer here>);
> >>
> >> Vladimir.
> >>
> >> On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <andrewkornev@hotmail.com
> <
> >> mailto:andrewkornev@hotmail.com <an...@hotmail.com>>>
> >> wrote:
> >>
> >> > Will do! Thanks!
> >> > ________________________________
> >> > From: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
> >> valentin.kulichenko@gmail.com>>
> >> > Sent: Monday, November 6, 2017 9:17 AM
> >> > To: dev@ignite.apache.org<ma...@ignite.apache.org>
> >> > Subject: Re: Deserialization of BinaryObject's byte arrays
> >> >
> >> > This use case was discussed couple of times already, so it seems to be
> >> > pretty important for users. And I like the approach suggested by
> Andrey.
> >> >
> >> > Andrey, can you create a Jira ticket for this change? Would you like
> to
> >> > contribute it?
> >> >
> >> > -Val
> >> >
> >> > On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <
> >> dsetrakyan@apache.org<ma...@apache.org>>
> >> > wrote:
> >> >
> >> > > This makes sense to me. Sounds like a useful feature.
> >> > >
> >> > > Would be nice to hear what others in the community think?
> >> > >
> >> > > D.
> >> > >
> >> > > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <
> >> andrewkornev@hotmail.com<ma...@hotmail.com>>
> >> > > wrote:
> >> > >
> >> > > > Hello,
> >> > > >
> >> > > > We store lots of byte arrays (serialized graph-like data
> >> structures) as
> >> > > > fields of BinaryObject. Later on, while reading such field,
> >> > > > BinaryInputStream implementation creates an on-heap array and
> copies
> >> > the
> >> > > > bytes from the BinaryObject's internal backing array to the new
> >> array.
> >> > > >
> >> > > > While in general case it works just fine, in our case, a lot of
> CPU
> >> is
> >> > > > spent on copying of the bytes and significant amount garbage is
> >> > generated
> >> > > > as well.
> >> > > >
> >> > > > I'd like Ignite Community to consider the following proposal:
> >> introduce
> >> > > > support for serialization/deserialization of the ByteBuffer
> class. A
> >> > > > BinaryInputStream implementation would special case ByteBuffer
> >> fields
> >> > > same
> >> > > > way it currently special cases various Java types (collections,
> >> dates,
> >> > > > timestamps, enums, etc.)
> >> > > >
> >> > > > The application developer would simply call
> >> > > BinaryObjectBuilder.setField(...)
> >> > > > passing in an instance of ByteBuffer. During serialization, the
> >> > > > ByteBuffer's bytes would simply be copied to the backing array
> >> (similar
> >> > > to
> >> > > > how regular byte arrays are serialized today) and in the binary
> >> > object's
> >> > > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_
> >> BUFFER.
> >> > > >
> >> > > > During deserialization of the field, instead of allocating a new
> >> byte
> >> > > > array and transferring the bytes from the BinaryObject's backing
> >> array,
> >> > > > BinaryInputStream would simply wrap the required sub-array into a
> >> > > read-only
> >> > > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
> >> > array,
> >> > > > int offset, int length)" and return the instance to the
> application.
> >> > > >
> >> > > > This approach doesn't require any array allocation/data copying
> and
> >> is
> >> > > > faster and significantly reduces pressure on the garbage
> collector.
> >> The
> >> > > > benefits are especially great when application stores significant
> >> > amount
> >> > > of
> >> > > > its data as byte arrays.
> >> > > >
> >> > > > Clearly, this approach equally applies to arrays of other
> primitive
> >> > > types.
> >> > > >
> >> > > > A minor complication does arise when dealing with off-heap
> >> > BinaryObjects,
> >> > > > but nothing that can't solved with a little bit of Java reflection
> >> > > wizardry
> >> > > > (or by other means).
> >> > > >
> >> > > > Thanks,
> >> > > > Andrey
> >> > > >
> >> > >
> >> >
> >>
> >>
> >>
>

Re: Deserialization of BinaryObject's byte arrays

Posted by Andrey Kornev <an...@hotmail.com>.
Vladimir,

I appreciate your taking the time to reply. Thank you for sharing your thoughts!

I have to admit I'm still not convinced there is anything inherently wrong about having the Binary protocol support ByteBuffers as first class citizen. The Binary marshaller already supports specific collection/map types (such as linked list, array list, hash set etc.) which all, one could argue, are just "thin wrappers" around "fundamental" sequence/map abstractions. I don't see why ByteBuffer can't be treated the same way.

Also, your approach, given a relatively simple problem we're trying to solve here, feels very much over-engineered. As an Ignite user, I'd like to have a *simple* way of deserializing primitive arrays efficiently. Having to implement a bunch of interfaces to do just that doesn't fit my (Ignite user's) definition of "simple". Contrary to your statement below, the APIs you define (or not define) do have a lot to do with user experience.

In any case, here's a new proposal. And this time, it doesn't require modification of the Binary protocol.

Essentially what we need is just a way to tell the Binary marshaller how array fields are to be deserialized. It can be done globally (for example, by introducing a new config property on BinaryConfiguration). Or, it can be done on per binary object instance basis. For example, we can add a new method to the BinaryObject class (e.g. BinaryObject.withArrayAsBuffer()) to make subsequent calls to BinaryObject.field() return an instance of XxxBuffer class rather than a raw array (if the underlying field is of course an array).

(Alternatively, we could enhance the BinaryObject class by adding:

Buffer fieldAsBuffer(String fieldName);

method in addition to the existing:

T field(String fieldName);

This would make it possible to read any field (including scalar primitive types and String's) as an instance of XxxBuffer that wraps the field's raw bytes in the BinaryObject's backing array.)

Actually I personally like this alternative better -- it's more consistent and symmetric, and opens up opportunities for further optimization. For example, in some cases (such as equality checks) I don't need an actual instance of a String -- its character (or even byte) arrays are sufficient.

The good thing is that neither approach requires the field to have been originally written (serialized) as XxxBuffer. It could've been just a regular array. The actual type (e.g. byte[] or ByteBuffer) only matters during deserialization.

Regards
Andrey

________________________________
From: Vladimir Ozerov <vo...@gridgain.com>
Sent: Tuesday, November 7, 2017 12:25 AM
To: dev@ignite.apache.org
Subject: Re: Deserialization of BinaryObject's byte arrays

Regarding "sefl-imposed rules":
1) Binary protocol is essentially ignite's type system. It is used in many
places: Java, .NET, CPP, SQL (native, JDBC, ODBC), thin client. Of those 5
parties only Java has "ByteBuffer" concept. How other platforms should work
with this type?
2) We compare objects using binary representation
(BinaryArrayIdentityResolver#equals0). So if one user write data as byte
array, and another write the same data as ByteBuffer, resulting objects
will not be equal
These are clear examples on why type system should be as small as possible
- we simply cannot afford extending type system for every minor use case.
Binary protocol is internal thing and has nothing to do with user
experience.

What is worse, ByteBuffer is terrible abstraction as recognized by many
millions of Java users - it is stateful. One cannot read data from it
without changing it's state. And Ignite gives no guarantees that particular
object will be serialized only once per certain operation. One example is
BinaryMetadataCollector - Ignite could go through object twice per
serialization cycle to collect metadata. How are we going to maintain
idempotence in general case?

What could improve UX is public interfaces. Raw example of what we can do:
1) Define generic wrappers over our streams that will give users ability to
read/write byte arrays as they want. E.g.:
interface BinaryArrayWriter {
    write(byte[] src, int offset, int len);
    write(ByteBuffer src, ...);
}

interface BinaryArrayReader {
    int readLength();
    byte[] read();
    void read(byte[] dest, int off, int len);
    void read(ByteBuffer dest);
}

2) Then add new methods to reader/writer/builder. E.g.:
BinaryWriter.writeByteArray(String name, BiConsumer<Object,
BinaryArrayWriter> consumer); // Consumes original field and array writer;
BinaryReader.readByteArray(String name, Function<BinaryArrayReader>
consumer);            // Takes array reader, produces original field

3) Last, introduce field annotation for user convenience:
class MyClass {
    @BinaryArray(writerClass = MyWriterBiConsumer.class, readerClass =
MyReaderFunction.class)
    private ByteBuffer buf;
}

Now all user need to do is to implement write consumer and read function.

Vladimir.


On Tue, Nov 7, 2017 at 10:16 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> ByteBuffer is not fundamental data type. It is a thin wrapper over array.
> Protocol should contain only fundamental types.
>
> вт, 7 нояб. 2017 г. в 10:05, Andrey Kornev <an...@hotmail.com>:
>
>> Vladimir,
>>
>> Could you please elaborate as to why adding ByteBuffer to the Binary
>> protocol is “not a good idea”? What is inherently wrong about it?
>>
>> As for your suggestion, while I certainly see your point, it comes with
>> the inconvenience of implementing Binarylizable for every entity type in an
>> application, which makes me wonder if the self imposed rules such as “type
>> system should be kept as small as possible” could be relaxed in the name of
>> greater good and user friendliness. Ignite API is not easy to grok as it is
>> already.
>>
>> Thanks
>> Andrey
>> _____________________________
>> From: Vladimir Ozerov <vo...@gridgain.com>>
>> Sent: Monday, November 6, 2017 10:22 PM
>> Subject: Re: Deserialization of BinaryObject's byte arrays
>> To: <de...@ignite.apache.org>>
>> Cc: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
>> valentin.kulichenko@gmail.com>>
>>
>>
>> Hi Andrey,
>>
>> While we deifnitely need to support byte buffers, adding new type to
>> protocol is not very good idea. Binary protocol is the heart of our
>> ssytem,
>> which is used not oly for plain Java, but for other platforms and SQL. And
>> ByteBuffer is essentially the same byte array, but with different API.
>> What
>> you describe is a special case of byte array read/write. For this reason
>> our type system should be kept as small as possible and it doesn't require
>> new type to be introduced. Instead, we'd better to handle everything on
>> API
>> level only. For example, we can add new methods to reader/writer/builder,
>> which will handle this. E.g.:
>>
>> BinaryWriter.writeByteArray(<some producer here>);
>> BinaryReader.readByteArray(<some consumer here>);
>>
>> Vladimir.
>>
>> On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <andrewkornev@hotmail.com<
>> mailto:andrewkornev@hotmail.com>>
>> wrote:
>>
>> > Will do! Thanks!
>> > ________________________________
>> > From: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
>> valentin.kulichenko@gmail.com>>
>> > Sent: Monday, November 6, 2017 9:17 AM
>> > To: dev@ignite.apache.org<ma...@ignite.apache.org>
>> > Subject: Re: Deserialization of BinaryObject's byte arrays
>> >
>> > This use case was discussed couple of times already, so it seems to be
>> > pretty important for users. And I like the approach suggested by Andrey.
>> >
>> > Andrey, can you create a Jira ticket for this change? Would you like to
>> > contribute it?
>> >
>> > -Val
>> >
>> > On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <
>> dsetrakyan@apache.org<ma...@apache.org>>
>> > wrote:
>> >
>> > > This makes sense to me. Sounds like a useful feature.
>> > >
>> > > Would be nice to hear what others in the community think?
>> > >
>> > > D.
>> > >
>> > > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <
>> andrewkornev@hotmail.com<ma...@hotmail.com>>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > We store lots of byte arrays (serialized graph-like data
>> structures) as
>> > > > fields of BinaryObject. Later on, while reading such field,
>> > > > BinaryInputStream implementation creates an on-heap array and copies
>> > the
>> > > > bytes from the BinaryObject's internal backing array to the new
>> array.
>> > > >
>> > > > While in general case it works just fine, in our case, a lot of CPU
>> is
>> > > > spent on copying of the bytes and significant amount garbage is
>> > generated
>> > > > as well.
>> > > >
>> > > > I'd like Ignite Community to consider the following proposal:
>> introduce
>> > > > support for serialization/deserialization of the ByteBuffer class. A
>> > > > BinaryInputStream implementation would special case ByteBuffer
>> fields
>> > > same
>> > > > way it currently special cases various Java types (collections,
>> dates,
>> > > > timestamps, enums, etc.)
>> > > >
>> > > > The application developer would simply call
>> > > BinaryObjectBuilder.setField(...)
>> > > > passing in an instance of ByteBuffer. During serialization, the
>> > > > ByteBuffer's bytes would simply be copied to the backing array
>> (similar
>> > > to
>> > > > how regular byte arrays are serialized today) and in the binary
>> > object's
>> > > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_
>> BUFFER.
>> > > >
>> > > > During deserialization of the field, instead of allocating a new
>> byte
>> > > > array and transferring the bytes from the BinaryObject's backing
>> array,
>> > > > BinaryInputStream would simply wrap the required sub-array into a
>> > > read-only
>> > > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
>> > array,
>> > > > int offset, int length)" and return the instance to the application.
>> > > >
>> > > > This approach doesn't require any array allocation/data copying and
>> is
>> > > > faster and significantly reduces pressure on the garbage collector.
>> The
>> > > > benefits are especially great when application stores significant
>> > amount
>> > > of
>> > > > its data as byte arrays.
>> > > >
>> > > > Clearly, this approach equally applies to arrays of other primitive
>> > > types.
>> > > >
>> > > > A minor complication does arise when dealing with off-heap
>> > BinaryObjects,
>> > > > but nothing that can't solved with a little bit of Java reflection
>> > > wizardry
>> > > > (or by other means).
>> > > >
>> > > > Thanks,
>> > > > Andrey
>> > > >
>> > >
>> >
>>
>>
>>

Re: Deserialization of BinaryObject's byte arrays

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Regarding "sefl-imposed rules":
1) Binary protocol is essentially ignite's type system. It is used in many
places: Java, .NET, CPP, SQL (native, JDBC, ODBC), thin client. Of those 5
parties only Java has "ByteBuffer" concept. How other platforms should work
with this type?
2) We compare objects using binary representation
(BinaryArrayIdentityResolver#equals0). So if one user write data as byte
array, and another write the same data as ByteBuffer, resulting objects
will not be equal
These are clear examples on why type system should be as small as possible
- we simply cannot afford extending type system for every minor use case.
Binary protocol is internal thing and has nothing to do with user
experience.

What is worse, ByteBuffer is terrible abstraction as recognized by many
millions of Java users - it is stateful. One cannot read data from it
without changing it's state. And Ignite gives no guarantees that particular
object will be serialized only once per certain operation. One example is
BinaryMetadataCollector - Ignite could go through object twice per
serialization cycle to collect metadata. How are we going to maintain
idempotence in general case?

What could improve UX is public interfaces. Raw example of what we can do:
1) Define generic wrappers over our streams that will give users ability to
read/write byte arrays as they want. E.g.:
interface BinaryArrayWriter {
    write(byte[] src, int offset, int len);
    write(ByteBuffer src, ...);
}

interface BinaryArrayReader {
    int readLength();
    byte[] read();
    void read(byte[] dest, int off, int len);
    void read(ByteBuffer dest);
}

2) Then add new methods to reader/writer/builder. E.g.:
BinaryWriter.writeByteArray(String name, BiConsumer<Object,
BinaryArrayWriter> consumer); // Consumes original field and array writer;
BinaryReader.readByteArray(String name, Function<BinaryArrayReader>
consumer);            // Takes array reader, produces original field

3) Last, introduce field annotation for user convenience:
class MyClass {
    @BinaryArray(writerClass = MyWriterBiConsumer.class, readerClass =
MyReaderFunction.class)
    private ByteBuffer buf;
}

Now all user need to do is to implement write consumer and read function.

Vladimir.


On Tue, Nov 7, 2017 at 10:16 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> ByteBuffer is not fundamental data type. It is a thin wrapper over array.
> Protocol should contain only fundamental types.
>
> вт, 7 нояб. 2017 г. в 10:05, Andrey Kornev <an...@hotmail.com>:
>
>> Vladimir,
>>
>> Could you please elaborate as to why adding ByteBuffer to the Binary
>> protocol is “not a good idea”? What is inherently wrong about it?
>>
>> As for your suggestion, while I certainly see your point, it comes with
>> the inconvenience of implementing Binarylizable for every entity type in an
>> application, which makes me wonder if the self imposed rules such as “type
>> system should be kept as small as possible” could be relaxed in the name of
>> greater good and user friendliness. Ignite API is not easy to grok as it is
>> already.
>>
>> Thanks
>> Andrey
>> _____________________________
>> From: Vladimir Ozerov <vo...@gridgain.com>>
>> Sent: Monday, November 6, 2017 10:22 PM
>> Subject: Re: Deserialization of BinaryObject's byte arrays
>> To: <de...@ignite.apache.org>>
>> Cc: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
>> valentin.kulichenko@gmail.com>>
>>
>>
>> Hi Andrey,
>>
>> While we deifnitely need to support byte buffers, adding new type to
>> protocol is not very good idea. Binary protocol is the heart of our
>> ssytem,
>> which is used not oly for plain Java, but for other platforms and SQL. And
>> ByteBuffer is essentially the same byte array, but with different API.
>> What
>> you describe is a special case of byte array read/write. For this reason
>> our type system should be kept as small as possible and it doesn't require
>> new type to be introduced. Instead, we'd better to handle everything on
>> API
>> level only. For example, we can add new methods to reader/writer/builder,
>> which will handle this. E.g.:
>>
>> BinaryWriter.writeByteArray(<some producer here>);
>> BinaryReader.readByteArray(<some consumer here>);
>>
>> Vladimir.
>>
>> On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <andrewkornev@hotmail.com<
>> mailto:andrewkornev@hotmail.com>>
>> wrote:
>>
>> > Will do! Thanks!
>> > ________________________________
>> > From: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
>> valentin.kulichenko@gmail.com>>
>> > Sent: Monday, November 6, 2017 9:17 AM
>> > To: dev@ignite.apache.org<ma...@ignite.apache.org>
>> > Subject: Re: Deserialization of BinaryObject's byte arrays
>> >
>> > This use case was discussed couple of times already, so it seems to be
>> > pretty important for users. And I like the approach suggested by Andrey.
>> >
>> > Andrey, can you create a Jira ticket for this change? Would you like to
>> > contribute it?
>> >
>> > -Val
>> >
>> > On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <
>> dsetrakyan@apache.org<ma...@apache.org>>
>> > wrote:
>> >
>> > > This makes sense to me. Sounds like a useful feature.
>> > >
>> > > Would be nice to hear what others in the community think?
>> > >
>> > > D.
>> > >
>> > > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <
>> andrewkornev@hotmail.com<ma...@hotmail.com>>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > We store lots of byte arrays (serialized graph-like data
>> structures) as
>> > > > fields of BinaryObject. Later on, while reading such field,
>> > > > BinaryInputStream implementation creates an on-heap array and copies
>> > the
>> > > > bytes from the BinaryObject's internal backing array to the new
>> array.
>> > > >
>> > > > While in general case it works just fine, in our case, a lot of CPU
>> is
>> > > > spent on copying of the bytes and significant amount garbage is
>> > generated
>> > > > as well.
>> > > >
>> > > > I'd like Ignite Community to consider the following proposal:
>> introduce
>> > > > support for serialization/deserialization of the ByteBuffer class. A
>> > > > BinaryInputStream implementation would special case ByteBuffer
>> fields
>> > > same
>> > > > way it currently special cases various Java types (collections,
>> dates,
>> > > > timestamps, enums, etc.)
>> > > >
>> > > > The application developer would simply call
>> > > BinaryObjectBuilder.setField(...)
>> > > > passing in an instance of ByteBuffer. During serialization, the
>> > > > ByteBuffer's bytes would simply be copied to the backing array
>> (similar
>> > > to
>> > > > how regular byte arrays are serialized today) and in the binary
>> > object's
>> > > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_
>> BUFFER.
>> > > >
>> > > > During deserialization of the field, instead of allocating a new
>> byte
>> > > > array and transferring the bytes from the BinaryObject's backing
>> array,
>> > > > BinaryInputStream would simply wrap the required sub-array into a
>> > > read-only
>> > > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
>> > array,
>> > > > int offset, int length)" and return the instance to the application.
>> > > >
>> > > > This approach doesn't require any array allocation/data copying and
>> is
>> > > > faster and significantly reduces pressure on the garbage collector.
>> The
>> > > > benefits are especially great when application stores significant
>> > amount
>> > > of
>> > > > its data as byte arrays.
>> > > >
>> > > > Clearly, this approach equally applies to arrays of other primitive
>> > > types.
>> > > >
>> > > > A minor complication does arise when dealing with off-heap
>> > BinaryObjects,
>> > > > but nothing that can't solved with a little bit of Java reflection
>> > > wizardry
>> > > > (or by other means).
>> > > >
>> > > > Thanks,
>> > > > Andrey
>> > > >
>> > >
>> >
>>
>>
>>

Re: Deserialization of BinaryObject's byte arrays

Posted by Vladimir Ozerov <vo...@gridgain.com>.
ByteBuffer is not fundamental data type. It is a thin wrapper over array.
Protocol should contain only fundamental types.

вт, 7 нояб. 2017 г. в 10:05, Andrey Kornev <an...@hotmail.com>:

> Vladimir,
>
> Could you please elaborate as to why adding ByteBuffer to the Binary
> protocol is “not a good idea”? What is inherently wrong about it?
>
> As for your suggestion, while I certainly see your point, it comes with
> the inconvenience of implementing Binarylizable for every entity type in an
> application, which makes me wonder if the self imposed rules such as “type
> system should be kept as small as possible” could be relaxed in the name of
> greater good and user friendliness. Ignite API is not easy to grok as it is
> already.
>
> Thanks
> Andrey
> _____________________________
> From: Vladimir Ozerov <vo...@gridgain.com>>
> Sent: Monday, November 6, 2017 10:22 PM
> Subject: Re: Deserialization of BinaryObject's byte arrays
> To: <de...@ignite.apache.org>>
> Cc: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
> valentin.kulichenko@gmail.com>>
>
>
> Hi Andrey,
>
> While we deifnitely need to support byte buffers, adding new type to
> protocol is not very good idea. Binary protocol is the heart of our ssytem,
> which is used not oly for plain Java, but for other platforms and SQL. And
> ByteBuffer is essentially the same byte array, but with different API. What
> you describe is a special case of byte array read/write. For this reason
> our type system should be kept as small as possible and it doesn't require
> new type to be introduced. Instead, we'd better to handle everything on API
> level only. For example, we can add new methods to reader/writer/builder,
> which will handle this. E.g.:
>
> BinaryWriter.writeByteArray(<some producer here>);
> BinaryReader.readByteArray(<some consumer here>);
>
> Vladimir.
>
> On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <andrewkornev@hotmail.com
> <ma...@hotmail.com>>
> wrote:
>
> > Will do! Thanks!
> > ________________________________
> > From: Valentin Kulichenko <valentin.kulichenko@gmail.com<mailto:
> valentin.kulichenko@gmail.com>>
> > Sent: Monday, November 6, 2017 9:17 AM
> > To: dev@ignite.apache.org<ma...@ignite.apache.org>
> > Subject: Re: Deserialization of BinaryObject's byte arrays
> >
> > This use case was discussed couple of times already, so it seems to be
> > pretty important for users. And I like the approach suggested by Andrey.
> >
> > Andrey, can you create a Jira ticket for this change? Would you like to
> > contribute it?
> >
> > -Val
> >
> > On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <dsetrakyan@apache.org
> <ma...@apache.org>>
> > wrote:
> >
> > > This makes sense to me. Sounds like a useful feature.
> > >
> > > Would be nice to hear what others in the community think?
> > >
> > > D.
> > >
> > > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <
> andrewkornev@hotmail.com<ma...@hotmail.com>>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > We store lots of byte arrays (serialized graph-like data structures)
> as
> > > > fields of BinaryObject. Later on, while reading such field,
> > > > BinaryInputStream implementation creates an on-heap array and copies
> > the
> > > > bytes from the BinaryObject's internal backing array to the new
> array.
> > > >
> > > > While in general case it works just fine, in our case, a lot of CPU
> is
> > > > spent on copying of the bytes and significant amount garbage is
> > generated
> > > > as well.
> > > >
> > > > I'd like Ignite Community to consider the following proposal:
> introduce
> > > > support for serialization/deserialization of the ByteBuffer class. A
> > > > BinaryInputStream implementation would special case ByteBuffer fields
> > > same
> > > > way it currently special cases various Java types (collections,
> dates,
> > > > timestamps, enums, etc.)
> > > >
> > > > The application developer would simply call
> > > BinaryObjectBuilder.setField(...)
> > > > passing in an instance of ByteBuffer. During serialization, the
> > > > ByteBuffer's bytes would simply be copied to the backing array
> (similar
> > > to
> > > > how regular byte arrays are serialized today) and in the binary
> > object's
> > > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
> > > >
> > > > During deserialization of the field, instead of allocating a new byte
> > > > array and transferring the bytes from the BinaryObject's backing
> array,
> > > > BinaryInputStream would simply wrap the required sub-array into a
> > > read-only
> > > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
> > array,
> > > > int offset, int length)" and return the instance to the application.
> > > >
> > > > This approach doesn't require any array allocation/data copying and
> is
> > > > faster and significantly reduces pressure on the garbage collector.
> The
> > > > benefits are especially great when application stores significant
> > amount
> > > of
> > > > its data as byte arrays.
> > > >
> > > > Clearly, this approach equally applies to arrays of other primitive
> > > types.
> > > >
> > > > A minor complication does arise when dealing with off-heap
> > BinaryObjects,
> > > > but nothing that can't solved with a little bit of Java reflection
> > > wizardry
> > > > (or by other means).
> > > >
> > > > Thanks,
> > > > Andrey
> > > >
> > >
> >
>
>
>

Re: Deserialization of BinaryObject's byte arrays

Posted by Andrey Kornev <an...@hotmail.com>.
Vladimir,

Could you please elaborate as to why adding ByteBuffer to the Binary protocol is “not a good idea”? What is inherently wrong about it?

As for your suggestion, while I certainly see your point, it comes with the inconvenience of implementing Binarylizable for every entity type in an application, which makes me wonder if the self imposed rules such as “type system should be kept as small as possible” could be relaxed in the name of greater good and user friendliness. Ignite API is not easy to grok as it is already.

Thanks
Andrey
_____________________________
From: Vladimir Ozerov <vo...@gridgain.com>>
Sent: Monday, November 6, 2017 10:22 PM
Subject: Re: Deserialization of BinaryObject's byte arrays
To: <de...@ignite.apache.org>>
Cc: Valentin Kulichenko <va...@gmail.com>>


Hi Andrey,

While we deifnitely need to support byte buffers, adding new type to
protocol is not very good idea. Binary protocol is the heart of our ssytem,
which is used not oly for plain Java, but for other platforms and SQL. And
ByteBuffer is essentially the same byte array, but with different API. What
you describe is a special case of byte array read/write. For this reason
our type system should be kept as small as possible and it doesn't require
new type to be introduced. Instead, we'd better to handle everything on API
level only. For example, we can add new methods to reader/writer/builder,
which will handle this. E.g.:

BinaryWriter.writeByteArray(<some producer here>);
BinaryReader.readByteArray(<some consumer here>);

Vladimir.

On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <an...@hotmail.com>>
wrote:

> Will do! Thanks!
> ________________________________
> From: Valentin Kulichenko <va...@gmail.com>>
> Sent: Monday, November 6, 2017 9:17 AM
> To: dev@ignite.apache.org<ma...@ignite.apache.org>
> Subject: Re: Deserialization of BinaryObject's byte arrays
>
> This use case was discussed couple of times already, so it seems to be
> pretty important for users. And I like the approach suggested by Andrey.
>
> Andrey, can you create a Jira ticket for this change? Would you like to
> contribute it?
>
> -Val
>
> On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <ds...@apache.org>>
> wrote:
>
> > This makes sense to me. Sounds like a useful feature.
> >
> > Would be nice to hear what others in the community think?
> >
> > D.
> >
> > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <an...@hotmail.com>>
> > wrote:
> >
> > > Hello,
> > >
> > > We store lots of byte arrays (serialized graph-like data structures) as
> > > fields of BinaryObject. Later on, while reading such field,
> > > BinaryInputStream implementation creates an on-heap array and copies
> the
> > > bytes from the BinaryObject's internal backing array to the new array.
> > >
> > > While in general case it works just fine, in our case, a lot of CPU is
> > > spent on copying of the bytes and significant amount garbage is
> generated
> > > as well.
> > >
> > > I'd like Ignite Community to consider the following proposal: introduce
> > > support for serialization/deserialization of the ByteBuffer class. A
> > > BinaryInputStream implementation would special case ByteBuffer fields
> > same
> > > way it currently special cases various Java types (collections, dates,
> > > timestamps, enums, etc.)
> > >
> > > The application developer would simply call
> > BinaryObjectBuilder.setField(...)
> > > passing in an instance of ByteBuffer. During serialization, the
> > > ByteBuffer's bytes would simply be copied to the backing array (similar
> > to
> > > how regular byte arrays are serialized today) and in the binary
> object's
> > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
> > >
> > > During deserialization of the field, instead of allocating a new byte
> > > array and transferring the bytes from the BinaryObject's backing array,
> > > BinaryInputStream would simply wrap the required sub-array into a
> > read-only
> > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
> array,
> > > int offset, int length)" and return the instance to the application.
> > >
> > > This approach doesn't require any array allocation/data copying and is
> > > faster and significantly reduces pressure on the garbage collector. The
> > > benefits are especially great when application stores significant
> amount
> > of
> > > its data as byte arrays.
> > >
> > > Clearly, this approach equally applies to arrays of other primitive
> > types.
> > >
> > > A minor complication does arise when dealing with off-heap
> BinaryObjects,
> > > but nothing that can't solved with a little bit of Java reflection
> > wizardry
> > > (or by other means).
> > >
> > > Thanks,
> > > Andrey
> > >
> >
>



Re: Deserialization of BinaryObject's byte arrays

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi Andrey,

While we deifnitely need to support byte buffers, adding new type to
protocol is not very good idea. Binary protocol is the heart of our ssytem,
which is used not oly for plain Java, but for other platforms and SQL. And
ByteBuffer is essentially the same byte array, but with different API. What
you describe is a special case of byte array read/write. For this reason
our type system should be kept as small as possible and it doesn't require
new type to be introduced. Instead, we'd better to handle everything on API
level only. For example, we can add new methods to reader/writer/builder,
which will handle this. E.g.:

BinaryWriter.writeByteArray(<some producer here>);
BinaryReader.readByteArray(<some consumer here>);

Vladimir.

On Mon, Nov 6, 2017 at 9:25 PM, Andrey Kornev <an...@hotmail.com>
wrote:

> Will do! Thanks!
> ________________________________
> From: Valentin Kulichenko <va...@gmail.com>
> Sent: Monday, November 6, 2017 9:17 AM
> To: dev@ignite.apache.org
> Subject: Re: Deserialization of BinaryObject's byte arrays
>
> This use case was discussed couple of times already, so it seems to be
> pretty important for users. And I like the approach suggested by Andrey.
>
> Andrey, can you create a Jira ticket for this change? Would you like to
> contribute it?
>
> -Val
>
> On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <ds...@apache.org>
> wrote:
>
> > This makes sense to me. Sounds like a useful feature.
> >
> > Would be nice to hear what others in the community think?
> >
> > D.
> >
> > On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <an...@hotmail.com>
> > wrote:
> >
> > > Hello,
> > >
> > > We store lots of byte arrays (serialized graph-like data structures) as
> > > fields of BinaryObject. Later on, while reading such field,
> > > BinaryInputStream implementation creates an on-heap array and copies
> the
> > > bytes from the BinaryObject's internal backing array to the new array.
> > >
> > > While in general case it works just fine, in our case, a lot of CPU is
> > > spent on copying of the bytes and significant amount garbage is
> generated
> > > as well.
> > >
> > > I'd like Ignite Community to consider the following proposal: introduce
> > > support for serialization/deserialization of the ByteBuffer class.  A
> > > BinaryInputStream implementation would special case ByteBuffer fields
> > same
> > > way it currently special cases various Java types (collections, dates,
> > > timestamps, enums, etc.)
> > >
> > > The application developer would simply call
> > BinaryObjectBuilder.setField(...)
> > > passing in an instance of ByteBuffer. During serialization, the
> > > ByteBuffer's bytes would simply be copied to the backing array (similar
> > to
> > > how regular byte arrays are serialized today) and in the binary
> object's
> > > metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
> > >
> > > During deserialization of the field, instead of allocating a new byte
> > > array and transferring the bytes from the BinaryObject's backing array,
> > > BinaryInputStream would simply wrap the required sub-array into a
> > read-only
> > > instance of ByteBuffer using "public static ByteBuffer wrap(byte[]
> array,
> > > int offset, int length)" and return the instance to the application.
> > >
> > > This approach doesn't require any array allocation/data copying and is
> > > faster and significantly reduces pressure on the garbage collector. The
> > > benefits are especially great when application stores significant
> amount
> > of
> > > its data as byte arrays.
> > >
> > > Clearly, this approach equally applies to arrays of other primitive
> > types.
> > >
> > > A minor complication does arise when dealing with off-heap
> BinaryObjects,
> > > but nothing that can't solved with a little bit of Java reflection
> > wizardry
> > > (or by other means).
> > >
> > > Thanks,
> > > Andrey
> > >
> >
>

Re: Deserialization of BinaryObject's byte arrays

Posted by Andrey Kornev <an...@hotmail.com>.
Will do! Thanks!
________________________________
From: Valentin Kulichenko <va...@gmail.com>
Sent: Monday, November 6, 2017 9:17 AM
To: dev@ignite.apache.org
Subject: Re: Deserialization of BinaryObject's byte arrays

This use case was discussed couple of times already, so it seems to be
pretty important for users. And I like the approach suggested by Andrey.

Andrey, can you create a Jira ticket for this change? Would you like to
contribute it?

-Val

On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <ds...@apache.org>
wrote:

> This makes sense to me. Sounds like a useful feature.
>
> Would be nice to hear what others in the community think?
>
> D.
>
> On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <an...@hotmail.com>
> wrote:
>
> > Hello,
> >
> > We store lots of byte arrays (serialized graph-like data structures) as
> > fields of BinaryObject. Later on, while reading such field,
> > BinaryInputStream implementation creates an on-heap array and copies the
> > bytes from the BinaryObject's internal backing array to the new array.
> >
> > While in general case it works just fine, in our case, a lot of CPU is
> > spent on copying of the bytes and significant amount garbage is generated
> > as well.
> >
> > I'd like Ignite Community to consider the following proposal: introduce
> > support for serialization/deserialization of the ByteBuffer class.  A
> > BinaryInputStream implementation would special case ByteBuffer fields
> same
> > way it currently special cases various Java types (collections, dates,
> > timestamps, enums, etc.)
> >
> > The application developer would simply call
> BinaryObjectBuilder.setField(...)
> > passing in an instance of ByteBuffer. During serialization, the
> > ByteBuffer's bytes would simply be copied to the backing array (similar
> to
> > how regular byte arrays are serialized today) and in the binary object's
> > metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
> >
> > During deserialization of the field, instead of allocating a new byte
> > array and transferring the bytes from the BinaryObject's backing array,
> > BinaryInputStream would simply wrap the required sub-array into a
> read-only
> > instance of ByteBuffer using "public static ByteBuffer wrap(byte[] array,
> > int offset, int length)" and return the instance to the application.
> >
> > This approach doesn't require any array allocation/data copying and is
> > faster and significantly reduces pressure on the garbage collector. The
> > benefits are especially great when application stores significant amount
> of
> > its data as byte arrays.
> >
> > Clearly, this approach equally applies to arrays of other primitive
> types.
> >
> > A minor complication does arise when dealing with off-heap BinaryObjects,
> > but nothing that can't solved with a little bit of Java reflection
> wizardry
> > (or by other means).
> >
> > Thanks,
> > Andrey
> >
>

Re: Deserialization of BinaryObject's byte arrays

Posted by Valentin Kulichenko <va...@gmail.com>.
This use case was discussed couple of times already, so it seems to be
pretty important for users. And I like the approach suggested by Andrey.

Andrey, can you create a Jira ticket for this change? Would you like to
contribute it?

-Val

On Fri, Nov 3, 2017 at 4:18 PM, Dmitriy Setrakyan <ds...@apache.org>
wrote:

> This makes sense to me. Sounds like a useful feature.
>
> Would be nice to hear what others in the community think?
>
> D.
>
> On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <an...@hotmail.com>
> wrote:
>
> > Hello,
> >
> > We store lots of byte arrays (serialized graph-like data structures) as
> > fields of BinaryObject. Later on, while reading such field,
> > BinaryInputStream implementation creates an on-heap array and copies the
> > bytes from the BinaryObject's internal backing array to the new array.
> >
> > While in general case it works just fine, in our case, a lot of CPU is
> > spent on copying of the bytes and significant amount garbage is generated
> > as well.
> >
> > I'd like Ignite Community to consider the following proposal: introduce
> > support for serialization/deserialization of the ByteBuffer class.  A
> > BinaryInputStream implementation would special case ByteBuffer fields
> same
> > way it currently special cases various Java types (collections, dates,
> > timestamps, enums, etc.)
> >
> > The application developer would simply call
> BinaryObjectBuilder.setField(...)
> > passing in an instance of ByteBuffer. During serialization, the
> > ByteBuffer's bytes would simply be copied to the backing array (similar
> to
> > how regular byte arrays are serialized today) and in the binary object's
> > metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
> >
> > During deserialization of the field, instead of allocating a new byte
> > array and transferring the bytes from the BinaryObject's backing array,
> > BinaryInputStream would simply wrap the required sub-array into a
> read-only
> > instance of ByteBuffer using "public static ByteBuffer wrap(byte[] array,
> > int offset, int length)" and return the instance to the application.
> >
> > This approach doesn't require any array allocation/data copying and is
> > faster and significantly reduces pressure on the garbage collector. The
> > benefits are especially great when application stores significant amount
> of
> > its data as byte arrays.
> >
> > Clearly, this approach equally applies to arrays of other primitive
> types.
> >
> > A minor complication does arise when dealing with off-heap BinaryObjects,
> > but nothing that can't solved with a little bit of Java reflection
> wizardry
> > (or by other means).
> >
> > Thanks,
> > Andrey
> >
>

Re: Deserialization of BinaryObject's byte arrays

Posted by Dmitriy Setrakyan <ds...@apache.org>.
This makes sense to me. Sounds like a useful feature.

Would be nice to hear what others in the community think?

D.

On Fri, Nov 3, 2017 at 1:07 PM, Andrey Kornev <an...@hotmail.com>
wrote:

> Hello,
>
> We store lots of byte arrays (serialized graph-like data structures) as
> fields of BinaryObject. Later on, while reading such field,
> BinaryInputStream implementation creates an on-heap array and copies the
> bytes from the BinaryObject's internal backing array to the new array.
>
> While in general case it works just fine, in our case, a lot of CPU is
> spent on copying of the bytes and significant amount garbage is generated
> as well.
>
> I'd like Ignite Community to consider the following proposal: introduce
> support for serialization/deserialization of the ByteBuffer class.  A
> BinaryInputStream implementation would special case ByteBuffer fields same
> way it currently special cases various Java types (collections, dates,
> timestamps, enums, etc.)
>
> The application developer would simply call BinaryObjectBuilder.setField(...)
> passing in an instance of ByteBuffer. During serialization, the
> ByteBuffer's bytes would simply be copied to the backing array (similar to
> how regular byte arrays are serialized today) and in the binary object's
> metadata the field marked as, say, GridBinaryMarshaller.BYTE_BUFFER.
>
> During deserialization of the field, instead of allocating a new byte
> array and transferring the bytes from the BinaryObject's backing array,
> BinaryInputStream would simply wrap the required sub-array into a read-only
> instance of ByteBuffer using "public static ByteBuffer wrap(byte[] array,
> int offset, int length)" and return the instance to the application.
>
> This approach doesn't require any array allocation/data copying and is
> faster and significantly reduces pressure on the garbage collector. The
> benefits are especially great when application stores significant amount of
> its data as byte arrays.
>
> Clearly, this approach equally applies to arrays of other primitive types.
>
> A minor complication does arise when dealing with off-heap BinaryObjects,
> but nothing that can't solved with a little bit of Java reflection wizardry
> (or by other means).
>
> Thanks,
> Andrey
>