Posted to user@ignite.apache.org by Pavel Tupitsyn <pt...@apache.org> on 2017/08/01 11:30:33 UTC

Re: Accessing array elements within items cached in Ignite without deserialising the entire item

Hi Raymond,

First of all, BinaryObject is a cross-platform concept, it exists in C#,
C++, Java.
From C# point of view there are some inconsistencies (like nullable Guid,
or non-generic collections),
but these things are dictated by the existing protocol, so we can't change
them.
In most cases you can just use WriteObject<>/ReadObject<> methods to avoid
these inconsistencies.

1. You can implement array pooling yourself using IBinaryRawReader methods.
   For example, a byte array can be written like
   rawWriter.WriteInt(arr.Length); for (...) rawWriter.WriteByte(arr[i]);
   I think an extension method would be easy to write.

2. See above, use WriteObject<>/ReadObject<> to avoid dealing with nullables

3. Random array access is not possible with current API.
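
For point 1, a minimal sketch of what such extension methods might look like.
It assumes the WriteInt/WriteByte and ReadInt/ReadByte methods of
IBinaryRawWriter and IBinaryRawReader, plus ArrayPool<byte> from System.Buffers
(which may need a separate package on older .NET Framework targets). The class
and method names are made up for illustration and are not part of Ignite.

```csharp
using System.Buffers;
using Apache.Ignite.Core.Binary;

// Hypothetical helper, not part of the Ignite.NET API.
public static class PooledByteArrayExtensions
{
    // Writes the first 'count' bytes of 'buffer' as an int length prefix
    // followed by the individual elements.
    public static void WritePooledBytes(this IBinaryRawWriter writer, byte[] buffer, int count)
    {
        writer.WriteInt(count);

        for (int i = 0; i < count; i++)
        {
            writer.WriteByte(buffer[i]);
        }
    }

    // Reads data written by WritePooledBytes into an array rented from the
    // shared pool. The rented array may be longer than 'count'; only the
    // first 'count' elements are valid. The caller is responsible for calling
    // ArrayPool<byte>.Shared.Return(buffer) when done.
    public static byte[] ReadPooledBytes(this IBinaryRawReader reader, out int count)
    {
        count = reader.ReadInt();
        byte[] buffer = ArrayPool<byte>.Shared.Rent(count);

        for (int i = 0; i < count; i++)
        {
            buffer[i] = reader.ReadByte();
        }

        return buffer;
    }
}
```

The two methods have to be used as a pair: this layout is a private convention
and is not interchangeable with the built-in WriteByteArray/ReadByteArray
format. Going element by element trades some speed for not allocating a fresh
array on every read.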

Thanks,
Pavel

On Tue, Aug 1, 2017 at 2:46 AM, Raymond Wilson <ra...@trimble.com>
wrote:

> Hi,
>
> I’ve been looking at IBinarizable and IBinarySerializer with regards to
> controlling object serialization (using the Ignite.Net client).
>
> A couple of questions:
>
> 1. Some of the APIs in IBinarizable allow for factory methods to
> control construction of collection and dictionary elements, but not for
> array elements (which could allow for performance optimization through
> array pooling).
>
> 2. GUID and DateTime elements are nullable (and there is no
> non-nullable variant for these types). Apart from being inconsistent with
> all the other types supported in the API, nullability in .Net carries a
> performance penalty. Curious as to why these types are defined like this?
>
> 3. I see it is possible to read arrays of elements. But I see no
> way to read a particular element within an array without deserialising the
> entire array. Is it possible to do something like
> byte ReadByte(string fieldname, uint index); ?
>
> Thanks,
>
> Raymond.

Re: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Pavel Tupitsyn <pt...@apache.org>.
A copy is made on every public API cache entry access (ICache.Get, etc).

> randomly access items within that structure without deserialising the
> entire cached item
IBinaryObject API is for that:
https://apacheignite-net.readme.io/docs/binary-mode

// Get a copy of serialized cache data (basically memcpy)
IBinaryObject obj = cache.WithKeepBinary<int, IBinaryObject>().Get(1);

// Deserialize a single field (assuming here that "foo" is an int)
int foo = obj.GetField<int>("foo");

// Access other fields, modify fields, etc

This way the copy is made only once.
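
To make the "modify fields" part concrete, here is a hedged sketch of reading
one field and writing back a changed copy in binary mode. It assumes
ICache.WithKeepBinary, IBinaryObject.GetField, IBinaryObject.ToBuilder and
IBinaryObjectBuilder.SetField/Build; the "counter" field, the class name and
the method name are hypothetical.

```csharp
using Apache.Ignite.Core.Binary;
using Apache.Ignite.Core.Cache;

// Hypothetical helper, for illustration only.
public static class BinaryModeSketch
{
    // Reads the (assumed) int field "counter" of the entry stored under 'key'
    // without deserializing the whole value, then stores an incremented copy.
    // Returns the value that was read.
    public static int ReadAndBumpCounter(ICache<int, object> cache, int key)
    {
        // Switch to binary mode: values come back as IBinaryObject.
        ICache<int, IBinaryObject> binCache = cache.WithKeepBinary<int, IBinaryObject>();

        // One copy of the serialized entry is made here.
        IBinaryObject obj = binCache.Get(key);

        // Only this field is deserialized.
        int counter = obj.GetField<int>("counter");

        // Binary objects are immutable; a builder produces a modified copy.
        IBinaryObject updated = obj.ToBuilder()
            .SetField("counter", counter + 1)
            .Build();

        binCache.Put(key, updated);

        return counter;
    }
}
```

Note that the Get followed by Put is not atomic as written; concurrent writers
would need a transaction or an entry processor, which this sketch leaves out.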


RE: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Raymond Wilson <ra...@trimble.com>.
I agree on correctness being the first priority always :)



Is there documentation on when that copy is made? For instance, is a copy
made for every invocation of a BinarySerializer, or are there some
additional caching semantics that mean the copy is made once and stays
around for a while so subsequent invocations don’t have additional overhead
recopying the cached item from the unmanaged context?



The context here would be a cache item with potentially significant
internal structure where you might want to randomly access items within
that structure without deserialising the entire cached item.




Re: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Pavel Tupitsyn <pt...@apache.org>.
>  git refused to clone the repo in GitExtensions
"git clone https://github.com/apache/ignite.git" in console should work

> Is the ‘data’ pointer actually a pointer to the unmanaged memory in the
> off heap cache containing this element
It is just a pointer; it can point to either managed or unmanaged memory.
Yes, in some cases we have a pointer to unmanaged memory that comes from
the Java side, but it always points to a copy of the actual cache data.
Otherwise it would be quite difficult to maintain atomicity and the like.

Generally, I don't think we should introduce pointers and other unsafe
stuff in the public API.
Performance is important, but correctness is always a priority.


RE: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Raymond Wilson <ra...@trimble.com>.
I had not seen that page yet – very useful.



There’s a few moving parts to getting it working, so not sure I will get
time to really dig into, but will have a look for sure.



I did pull a static copy of the source (after git refused to clone the repo
in GitExtensions) and started looking at the code. It does seem relatively
simple to add appropriate methods to the appropriate interface and
implementation classes.



Question: When I see a method like this in BinaryStreamBase.cs:



        /// <summary>
        /// Read byte array.
        /// </summary>
        /// <param name="cnt">Count.</param>
        /// <returns>
        /// Byte array.
        /// </returns>
        public abstract byte[] ReadByteArray(int cnt);

        protected static byte[] ReadByteArray0(int len, byte* data)
        {
            byte[] res = new byte[len];

            fixed (byte* res0 = res)
            {
                CopyMemory(data, res0, len);
            }

            return res;
        }



Is the ‘data’ pointer actually a pointer to the unmanaged memory in the off
heap cache containing this element? If so, would this permit ‘user defined’
operations to be performed, something like this? [Or does Ignite.Net Linq
already support this?]



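        /// <summary>
        /// Perform an action on a range of byte array elements with a delegate.
        /// </summary>
        /// <param name="index">Start at.</param>
        /// <param name="len">Count.</param>
        /// <param name="action">Action to invoke for each byte.</param>
        /// <param name="data">Pointer to the start of the array data.</param>
        protected static void PerformByteArrayOperation(int index, int len,
            Action<byte> action, byte* data)
        {
            // 'data' is already a raw pointer, so no fixed block is needed.
            byte* cur = data + index;

            for (int i = 0; i < len; i++)
            {
                // Dereference before advancing: the delegate takes a byte, not a pointer.
                action(*cur++);
            }
        }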



There’s probably a nice way to genericize this across multiple array types,
but it’s useful as an example.



In this way you can operate on the data without the need to move it around
all the time between unmanaged and managed contexts.



Thanks,

Raymond.




Re: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Pavel Tupitsyn <pt...@apache.org>.
Great!

Here's the .NET development page, in case you haven't seen it yet:
https://cwiki.apache.org/confluence/display/IGNITE/Ignite.NET+Development
Let me know if you need any assistance.

Pavel


RE: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Raymond Wilson <ra...@trimble.com>.
Hi Pavel,



Thanks for putting it on the plan.



I’ve been reading through the ‘how to contribute’ documentation to see
what’s required and have pulled a static download of the Git repository to
start looking at the code. I’ll see… :)



Thanks,

Raymond.




Re: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Pavel Tupitsyn <pt...@apache.org>.
Actually, you are right: we can add this easily, because the internal API
allows random stream access.
I've filed a ticket: https://issues.apache.org/jira/browse/IGNITE-5904

Thank you for a good suggestion!
And, by the way, everyone is welcome to contribute; this ticket can be a
perfect start!

Pavel
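
To illustrate why random stream access makes a single-element read cheap, here is a small sketch. It is not the Ignite API and not what IGNITE-5904 specifies. ArrayElementAccess and ReadByteAt are hypothetical names, and the layout (a 4-byte little-endian length followed by the raw bytes) is an assumption made for the example.

```csharp
using System;

public static class ArrayElementAccess
{
    // Assumed layout: 4-byte little-endian length at arrayOffset, followed by the raw bytes.
    public static byte ReadByteAt(byte[] data, int arrayOffset, int index)
    {
        int length = data[arrayOffset]
                   | (data[arrayOffset + 1] << 8)
                   | (data[arrayOffset + 2] << 16)
                   | (data[arrayOffset + 3] << 24);

        if (index < 0 || index >= length)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        // Skip the length prefix and jump straight to the requested element;
        // no copy of the array is made.
        return data[arrayOffset + sizeof(int) + index];
    }
}
```

For a buffer { 0xFF, 3, 0, 0, 0, 10, 20, 30 }, ReadByteAt(data, 1, 2) returns 30: the array starts at offset 1, its length is 3, and element 2 sits at offset 7.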


RE: Accessing array elements within items cached in Ignite without deserialising the entire item

Posted by Raymond Wilson <ra...@trimble.com>.
Hi Pavel,



Thanks for the clarifications. I certainly appreciate that cross-platform
protocols constrain what can be done…



Thanks for pointing out IBinaryRawReader.



Regarding random access into arrays, is this something that is on the books
for a future version?

Thanks,

Raymond.


