Posted to user@ignite.apache.org by dmreshet <dm...@gmail.com> on 2016/04/20 16:00:28 UTC

BinaryObject performance issue

Hello!
I need to put a <Long, Map<Integer, String>> data structure into a cache.
I have found that BinaryObject handles a dynamic list of fields and
improves the performance of cache queries.
But I have run into a performance issue.

I have a 3-node cluster with 5 GB of RAM. I want to add 5 000 entries to the
cache.
If I put <Long, Map<Integer, String>>, it takes over *6.8 seconds*.
If I put <Long, BinaryObject>, it takes *382 seconds*.

I use an atomic partitioned cache. Here is a code example with BinaryObject:

    Map<Person, List<Integer>> persons = ... // original data structure
    IgniteCache<Long, BinaryObject> personCache =
        Ignition.ignite().cache(PERSON_CACHE);

    IgniteBinary binary = Ignition.ignite().binary();

    persons.forEach((person, integers) -> {
        BinaryObjectBuilder valBuilder = binary.builder("categories");
        integers.stream().forEach((integer -> {
            valBuilder.setField(String.valueOf(integer), integer);
        }));
        personCache.put(person.getId(), valBuilder.build());
    });


Is that expected behaviour? 



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-performance-issue-tp4375.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: BinaryObject performance issue

Posted by vkulichenko <va...@gmail.com>.
Hi,

Null fields are taken into account, so you should see this issue only if a
different set of fields is always provided to the builder for the same
type. Is this your case, or do you have something different?

-Val




RE: BinaryObject performance issue

Posted by lawrencefinn <la...@gmail.com>.
Hi vkulichenko, I am having a similar problem with BinaryObjects. I am
building BinaryObjects from DB results or JSON results where columns may
be null. It seems that for binary objects you just don't set the null
columns. When I build objects this way I constantly see communication among
the nodes. I even tried creating a dummy first object with all the possible
fields before iterating through the result set, but that did not help. The
only thing that helped was making sure each BinaryObject had all of the
fields set to something no matter what.

Any ideas?
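
A minimal sketch of the workaround described above (every object carries the
same fields, set to something), written in plain Java without any Ignite
calls. The column list ALL_FIELDS, the placeholder value and the class name
are hypothetical; the normalized map is what one would then feed, field by
field, to the builder.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FixedFieldSet {
    // Hypothetical column list; in practice it would come from the table or JSON schema.
    static final List<String> ALL_FIELDS = Arrays.asList("id", "name", "email");

    /**
     * Returns a copy of the row that contains every known field, in a fixed order.
     * Columns that are missing or null receive the supplied placeholder.
     */
    static Map<String, Object> normalize(Map<String, Object> row, Object placeholder) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : ALL_FIELDS) {
            Object val = row.get(field);
            out.put(field, val != null ? val : placeholder);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Ann");
        row.put("id", 1L);
        // Every row now exposes the same fields in the same order.
        System.out.println(normalize(row, "")); // prints {id=1, name=Ann, email=}
    }
}
```

Whether a placeholder such as an empty string is acceptable depends on how the
data is queried later; the sketch only shows how to keep the field set stable.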




RE: BinaryObject performance issue

Posted by vkulichenko <va...@gmail.com>.
Hi,

BinaryObject is designed to represent POJOs in binary format. In that case a
schema change is generally a very rare event, so there is no performance
concern about metadata updates. Your use case is more specific and thus
requires a specific solution.

And btw, I'm not sure I understand the difference between
BinaryObject.hasField and Map.containsKey in your scenario. Can you clarify?

-Val




RE: BinaryObject performance issue

Posted by dmreshet <dm...@gmail.com>.
I need BinaryObject to run ScanQueries on data with a dynamic list of
categories. There are 30 different categories in the current case.
My task is to count the persons in each category, so a *map* does not
work for me.

try (QueryCursor cursor = cache.query(
        new ScanQuery<Long, BinaryObject>((k, p) -> p.hasField(category)))) {
    for (Object o : cursor)
        counter += 1;
}

I see that it is possible to use another data structure: <(Long)CategoryId,
List<(Long)PersonId>>. In that case my task would be computed very fast.
But this solution is very specific to this task, and I would not be able to
reuse the data structure. So I want to understand whether I can use <Long,
BinaryObject> to solve this task, because it looks like a more general
solution.
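
For reference, the alternative structure mentioned above can be sketched in
plain Java as an inverted index from category id to person ids. The class and
method names are hypothetical, and the sample data is made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CategoryIndex {
    /** Inverts personId -> categoryIds into categoryId -> personIds. */
    static Map<Long, List<Long>> invert(Map<Long, List<Long>> personCategories) {
        Map<Long, List<Long>> byCategory = new TreeMap<>();
        personCategories.forEach((personId, categories) -> {
            for (Long categoryId : categories)
                byCategory.computeIfAbsent(categoryId, k -> new ArrayList<>()).add(personId);
        });
        return byCategory;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> persons = new LinkedHashMap<>();
        persons.put(1L, Arrays.asList(10L, 20L));
        persons.put(2L, Arrays.asList(20L));
        persons.put(3L, Arrays.asList(20L, 30L));

        // The count per category is just the size of each list.
        invert(persons).forEach((categoryId, ids) ->
            System.out.println(categoryId + " -> " + ids.size()));
    }
}
```

With this layout the count for one category is a single lookup, at the cost of
a structure that serves only this kind of question.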

Since you said that each update may cause a metadata update, I now sort the
categories before pushing them to the cache. It looks like that partially
helps: it now takes *44 seconds* to put 5 000 elements into the cache.
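
A toy model of why sorting might help, in plain Java. It assumes that two
objects share a schema only when they list the same fields in the same order;
that assumption is consistent with the timings above but is not taken from
Ignite's implementation. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SchemaOrderDemo {
    /** Counts distinct ordered field sequences; each one stands in for a "schema". */
    static int distinctSchemas(List<List<Integer>> objects, boolean sortFirst) {
        Set<List<Integer>> seen = new HashSet<>();
        for (List<Integer> fields : objects) {
            List<Integer> copy = new ArrayList<>(fields);
            if (sortFirst)
                Collections.sort(copy); // same field set always yields the same sequence
            seen.add(copy);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<List<Integer>> objects = Arrays.asList(
            Arrays.asList(1, 2, 3),
            Arrays.asList(3, 1, 2),
            Arrays.asList(2, 3, 1),
            Arrays.asList(1, 2));

        System.out.println(distinctSchemas(objects, false)); // prints 4
        System.out.println(distinctSchemas(objects, true));  // prints 2
    }
}
```

Sorting collapses permutations of the same categories, but objects with
different category sets still count as different schemas, which would explain
why the improvement is only partial.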

Is there any way to improve performance?




RE: BinaryObject performance issue

Posted by vkulichenko <va...@gmail.com>.
Agree with Andrey. This use case doesn't look like a good fit for
BinaryObject. Is there any particular reason for using it? Is Map not
working for you?

-Val




RE: BinaryObject performance issue

Posted by Andrey Kornev <an...@hotmail.com>.
Hello,

I guess the problem is the way you build the binary object: by using the integers as field names. This forces Ignite to execute a global transaction to update the "categories" type's metadata every time you invoke valBuilder.build(). The update is required because the schema of the new binary object is not the same as that of the previous one (most likely because the list of integers is not the same).

By design, Ignite Binary expects a binary type (such as "categories" in your case) to have a fairly stable schema, since it reflects a domain class in your application. As such, the schema is expected to be updated relatively infrequently, for example when the domain class changes in the next release of the application.

Long story short, you'll be better off storing the Map as a single field of the binary type "categories".

Hope it helps.
Andrey
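
A rough sketch of what counting looks like once the categories live in one
map per person rather than in per-category fields. It is plain Java with no
Ignite calls; the class name and sample data are hypothetical. In Ignite the
whole map would be the value of one field with a fixed name, so the type's
field set never changes, and the filter would test the map, not the field
list.

```java
import java.util.HashMap;
import java.util.Map;

class SingleMapField {
    /** Counts the persons whose category map contains the given category id. */
    static long countInCategory(Map<Long, Map<Integer, String>> persons, int category) {
        return persons.values().stream()
            .filter(categories -> categories.containsKey(category))
            .count();
    }

    public static void main(String[] args) {
        Map<Integer, String> first = new HashMap<>();
        first.put(10, "books");
        first.put(20, "music");

        Map<Integer, String> second = new HashMap<>();
        second.put(20, "music");

        Map<Long, Map<Integer, String>> persons = new HashMap<>();
        persons.put(1L, first);
        persons.put(2L, second);

        System.out.println(countInCategory(persons, 20)); // prints 2
        System.out.println(countInCategory(persons, 30)); // prints 0
    }
}
```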
