Posted to dev@ignite.apache.org by Pavel Pereslegin <xx...@gmail.com> on 2018/06/25 13:06:58 UTC

Re: IgniteSet implementation: changes required

Hello, Igniters.

I tried to implement IgniteSet data recovery when persistence is enabled
[1] using trivial cache scanning, but I cannot find an optimal way to
do that, for the following reasons:
- Performing operations on IgniteSet requires completion of data
loading (restoring of setDataMap) on all nodes. Doing this during
partition map exchange takes too long.
- Prohibiting operations on IgniteSet until asynchronous cache
scanning has completed on all nodes looks rather complicated,
because it is necessary to support all situations of unstable
topology.

So I see one option to fix data loss on node restart - remove the
entire optimization (setDataMap) and rework the iterator
implementation to perform cache scanning.
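
A minimal sketch of what "iterator by cache scanning" could look like, assuming a toy in-memory map in place of the real cache. The names ScanBackedSet, ItemKey and CACHE are hypothetical and are not Ignite classes; the point is only that no separate on-heap index is kept, so there is nothing to restore after a restart, while iteration and size() pay for a scan.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: one shared "cache" stores the items of many sets; no on-heap index is kept. */
final class ScanBackedSet {
    /** Hypothetical composite key: the owning set's name plus the item. */
    static final class ItemKey {
        final String setName;
        final String item;

        ItemKey(String setName, String item) {
            this.setName = setName;
            this.item = item;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ItemKey))
                return false;
            ItemKey other = (ItemKey) o;
            return setName.equals(other.setName) && item.equals(other.item);
        }

        @Override public int hashCode() {
            return Objects.hash(setName, item);
        }
    }

    /** Stand-in for the backing cache (a plain map here, not an Ignite cache). */
    static final Map<ItemKey, Boolean> CACHE = new ConcurrentHashMap<>();

    /** Returns true if the item was not yet present in the named set. */
    static boolean add(String setName, String item) {
        return CACHE.putIfAbsent(new ItemKey(setName, item), Boolean.TRUE) == null;
    }

    /** Builds the iterator by scanning the whole cache and filtering by set name. */
    static Iterator<String> iterator(String setName) {
        List<String> items = new ArrayList<>();
        for (ItemKey key : CACHE.keySet()) {
            if (key.setName.equals(setName))
                items.add(key.item);
        }
        return items.iterator();
    }

    /** size() also pays for a full scan, the price of dropping the index. */
    static int size(String setName) {
        int n = 0;
        for (Iterator<String> it = iterator(setName); it.hasNext(); it.next())
            n++;
        return n;
    }

    public static void main(String[] args) {
        add("colors", "red");
        add("colors", "blue");
        add("shapes", "circle");
        System.out.println("colors size = " + size("colors"));
    }
}
```

In this model put/remove touch only the cache, so the write path does no extra bookkeeping; the cost moves to iterator creation and size().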

Thoughts?

[1] https://issues.apache.org/jira/browse/IGNITE-5553


2018-03-17 8:20 GMT+03:00 Andrey Kuznetsov <st...@gmail.com>:
> Thanks, Dmitry. I agree ultimately, DS API uniformity is a weighty reason.
>
> 2018-03-17 3:54 GMT+03:00 Dmitriy Setrakyan <ds...@apache.org>:
>
>> On Fri, Mar 16, 2018 at 7:39 AM, Andrey Kuznetsov <st...@gmail.com>
>> wrote:
>>
>> > Dmitry, your way allows to reuse existing {{Ignite.set()}} API to create
>> > both set flavors. We can adopt it unless somebody in the community objects.
>> > Personally, I like {{IgniteCache.asSet()}} approach proposed by Vladimir O.
>> > more, since it emphasizes the difference between sets being created, but
>> > this will require API extension.
>> >
>>
>> Andrey, I am suggesting that Ignite.set(...) in non-collocated mode behaves
>> exactly the same as the proposed IgniteCache.asSet() method. I do not like
>> the IgniteCache.asSet() API because it is inconsistent with Ignite data
>> structure design. All data structures are provided on Ignite API directly
>> and we should not change that.
>>
>> D.
>>
>
>
>
> --
> Best regards,
>   Andrey Kuznetsov.

Re: IgniteSet implementation: changes required

Posted by Pavel Pereslegin <xx...@gmail.com>.
Hello Igniters.

As discussed, the IgniteSet implementation was based on on-heap data
duplication (setDataMap). As a result, the data was not recovered after
a cluster restart, and for large data sets this led to significant heap
growth and GC pressure.

We changed the implementation so that this structure works well without
duplicating the data [1]. To reduce the performance drop and speed up large
data sets, the non-collocated version of IgniteSet now uses a separate cache [2].

[1] https://issues.apache.org/jira/browse/IGNITE-5553
[2] https://issues.apache.org/jira/browse/IGNITE-7823
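
To illustrate the problem described above, here is a small sketch under stated assumptions: a toy "node" where one map stands in for the persisted per-set caches and another for the on-heap duplicate (the role setDataMap played). RestartModel and its methods are hypothetical names, not Ignite code. After a simulated restart the on-heap copy is empty, while reading the per-set cache still returns the data.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a node: persistedCaches survives a restart, onHeapIndex does not. */
final class RestartModel {
    /** Stand-in for one persisted cache per non-collocated set (set name -> items). */
    private final Map<String, Set<String>> persistedCaches = new HashMap<>();

    /** Stand-in for the old on-heap duplicate of the same data. */
    private Map<String, Set<String>> onHeapIndex = new HashMap<>();

    /** The old write path updated both structures on every mutation. */
    void add(String setName, String item) {
        persistedCaches.computeIfAbsent(setName, k -> new HashSet<>()).add(item);
        onHeapIndex.computeIfAbsent(setName, k -> new HashSet<>()).add(item);
    }

    /** A restart keeps persisted data but drops everything held only on heap. */
    void restart() {
        onHeapIndex = new HashMap<>();
    }

    /** Old behaviour: answers come from the on-heap duplicate. */
    int sizeFromIndex(String setName) {
        return onHeapIndex.getOrDefault(setName, Collections.<String>emptySet()).size();
    }

    /** New behaviour: answers come from the set's own cache. */
    int sizeFromCache(String setName) {
        return persistedCaches.getOrDefault(setName, Collections.<String>emptySet()).size();
    }

    public static void main(String[] args) {
        RestartModel node = new RestartModel();
        node.add("s", "a");
        node.add("s", "b");
        node.restart();
        System.out.println("from index: " + node.sizeFromIndex("s"));
        System.out.println("from cache: " + node.sizeFromCache("s"));
    }
}
```

Because each set has its own cache in this model, reading one set never touches entries that belong to another.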




Re: IgniteSet implementation: changes required

Posted by Amir Akhmedov <am...@gmail.com>.
Yes, you are right.

Thanks,
Amir



Re: IgniteSet implementation: changes required

Posted by Denis Magda <dm...@apache.org>.
Got it. If it's about redundant data duplication in the on-heap region, then
I have no concerns.

Anyway, considering that the data structure will interact with page memory
directly, its entries can be stored in Ignite persistence automatically (if
the latter is on). Does it mean that the data structure will be fully
recovered after a restart and its entries can be pulled from disk on demand?

--
Denis



Re: IgniteSet implementation: changes required

Posted by Amir Akhmedov <am...@gmail.com>.
I also think it would be better to remove setDataMap support, because:
1. It puts extra pressure on GC by keeping entries on heap.
2. Its logic is difficult to support and has lots of nuances.
3. To maintain setDataMap, GridCacheMapEntry currently calls
cctx.dataStructures().onEntryUpdated() on each entry mutation. I think that is
unnecessary coupling.
4. For the case of a single Ignite cache for all collocated data structures,
iterator creation will not be much slower than in the current implementation,
since we can run an affinity call on the node where all entries reside. Also,
we can create a better affinity mapper to distribute data structures fairly
across a cluster, rather than mapping by the data structure's name.
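
Point 4 can be sketched as follows, assuming a deliberately simplified partition function (hash modulo partition count). Ignite's real affinity function works differently, and NameAffinity is a hypothetical name. The sketch only shows why a collocated structure can be iterated with a single call: every item shares the structure's name as its affinity key and therefore lands in one partition.

```java
/** Toy affinity mapping: partition = hash(affinity key) mod partition count. */
final class NameAffinity {
    /** Simplified stand-in for an affinity function (not Ignite's algorithm). */
    static int partition(Object affinityKey, int partitions) {
        return Math.floorMod(affinityKey.hashCode(), partitions);
    }

    /** Collocated mode: the set name is the affinity key, so the item is ignored. */
    static int collocated(String setName, String item, int partitions) {
        return partition(setName, partitions);
    }

    /** Non-collocated mode: each item is mapped on its own. */
    static int nonCollocated(String item, int partitions) {
        return partition(item, partitions);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            String item = "item-" + i;
            System.out.println(item
                + " collocated=" + collocated("mySet", item, 16)
                + " nonCollocated=" + nonCollocated(item, 16));
        }
    }
}
```

Mapping by name alone means two structures whose names hash alike share a partition, which is the unfairness a better mapper would address.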

Thanks,
Amir



Re: IgniteSet implementation: changes required

Posted by Anton Vinogradov <av...@apache.org>.
Denis,

I think the better option is to remove the on-heap optimisation/duplication.
This causes no drop for frequently used operations (put/remove) and will even
make them slightly faster.

The only question we have here is "is it possible to restore the on-heap map
in an easy way?".
It seems the answer is no, so I vote for setDataMap removal.

On Tue, Jun 26, 2018 at 15:00 Denis Magda <dm...@apache.org> wrote:

> Anton,
>
> Will it be possible to reuse such a functionality for the rest of data
> structures? I would invest our time in this if all data structures would be
> able to work with Ignite persistence this way.
>
> --
> Denis
>
> On Tue, Jun 26, 2018 at 1:53 AM Anton Vinogradov <av...@apache.org> wrote:
>
> > >> Why don't we read data straight from the persistence layer warming RAM
> > >> up in the background?
> > Because it's not a trivial task to finish such loading on unstable
> > topology.
> > That's possible, of course, but solution and complexity will be almost
> > equal to WAL enable/disable.
> >
> > On Mon, Jun 25, 2018 at 22:13 Denis Magda <dm...@apache.org> wrote:
> >
> > > Folks,
> > >
> > > Why don't we read data straight from the persistence layer warming RAM up
> > > in the background? (like we do for SQL and other APIs). If it's a question
> > > of time, then I would suggest us not to hurry up and do it in a right way.
> > >
> > > --
> > > Denis
> > >
> > > On Mon, Jun 25, 2018 at 6:20 AM Anton Vinogradov <av...@apache.org> wrote:
> > >
> > > > +1 to removal in case there is no easy, fast and consistent way to restore
> > > > setDataMap on node restart.
> > > > I see that we'll gain some performance drop on size() or keys(), but these
> > > > methods are rarely used.
> > > >
> > > > пн, 25 июн. 2018 г. в 16:07, Pavel Pereslegin <xx...@gmail.com>:
> > > >
> > > > > Hello, Igniters.
> > > > >
> > > > > I tried to implement IgniteSet data recovery when persistence
> enabled
> > > > > [1] using trivial cache scanning, however I cannot find optimal way
> > to
> > > > > do that because of the following reasons:
> > > > > - Performing operations on IgniteSet requires completion of data
> > > > > loading (restoring of setDataMap) on all nodes. Do this during
> > > > > partition map exchange is too long.
> > > > > - The prohibition of operations on IgniteSet before the completion
> of
> > > > > asynchronous cache scanning on all nodes looks rather complicated,
> > > > > because It is necessary to support all situations of unstable
> > > > > topology.
> > > > >
> > > > > So I see one option to fix data loss on node restart - remove the
> > > > > entire optimization (setDataMap) and rework the iterator
> > > > > implementation to perform cache scanning.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-5553
> > > > >
> > > > >
> > > > > 2018-03-17 8:20 GMT+03:00 Andrey Kuznetsov <st...@gmail.com>:
> > > > > > Thanks, Dmitry. I agree ultimately, DS API uniformity is a
> weighty
> > > > > reason.
> > > > > >
> > > > > > 2018-03-17 3:54 GMT+03:00 Dmitriy Setrakyan <
> dsetrakyan@apache.org
> > >:
> > > > > >
> > > > > >> On Fri, Mar 16, 2018 at 7:39 AM, Andrey Kuznetsov <
> > > stkuzma@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Dmitry, your way allows to reuse existing {{Ignite.set()}} API
> > to
> > > > > create
> > > > > >> > both set flavors. We can adopt it unless somebody in the
> > community
> > > > > >> objects.
> > > > > >> > Personally, I like {{IgniteCache.asSet()}} approach proposed
> by
> > > > > Vladimir
> > > > > >> O.
> > > > > >> > more, since it emphasizes the difference between sets being
> > > created,
> > > > > but
> > > > > >> > this will require API extension.
> > > > > >> >
> > > > > >>
> > > > > >> Andrey, I am suggesting that Ignite.set(...) in non-collocated
> > mode
> > > > > behaves
> > > > > >> exactly the same as the proposed IgniteCache.asSet() method. I
> do
> > > not
> > > > > like
> > > > > >> the IgniteCache.asSet() API because it is inconsistent with
> Ignite
> > > > data
> > > > > >> structure design. All data structures are provided on Ignite API
> > > > > directly
> > > > > >> and we should not change that.
> > > > > >>
> > > > > >> D.
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > >   Andrey Kuznetsov.
> > > > >
> > > >
> > >
> >
>

Re: IgniteSet implementation: changes required

Posted by Denis Magda <dm...@apache.org>.
Anton,

Will it be possible to reuse such functionality for the rest of the data
structures? I would invest our time in this if all data structures were
able to work with Ignite persistence this way.

--
Denis

On Tue, Jun 26, 2018 at 1:53 AM Anton Vinogradov <av...@apache.org> wrote:

> >> Why don't we read data straight from the persistence layer warming RAM up
> >> in the background?
> Because it's not a trivial task to finish such loading on unstable
> topology.
> That's possible, of course, but the solution and its complexity will be
> almost equal to WAL enable/disable.

Re: IgniteSet implementation: changes required

Posted by Anton Vinogradov <av...@apache.org>.
>> Why don't we read data straight from the persistence layer warming RAM up
>> in the background?
Because it's not a trivial task to finish such loading on unstable
topology.
That's possible, of course, but the solution and its complexity will be
almost equal to WAL enable/disable.
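
To illustrate the difficulty, here is a minimal sketch in plain Java, not
Ignite code. It assumes a hypothetical topology version counter (topVer) that
is bumped on every node join or leave, and a load that is accepted only if
the version did not change while the scan was running. All class and method
names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical model: a load whose result is valid only for a stable topology. */
public class TopologyAwareLoad {
    /** Stand-in for a cluster topology version (an assumption of this sketch). */
    private final AtomicLong topVer = new AtomicLong();

    /** Called whenever a node joins or leaves. */
    public void onTopologyChange() {
        topVer.incrementAndGet();
    }

    /**
     * Runs the scan and accepts its result only if the topology version is the
     * same before and after. Returns null if every attempt was invalidated.
     */
    public Set<String> load(Supplier<Set<String>> scan, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            long before = topVer.get();

            Set<String> res = new HashSet<>(scan.get());

            if (topVer.get() == before)
                return res; // Topology was stable for the whole scan.
        }

        return null; // Topology kept changing; the caller has to decide what to do.
    }

    public static void main(String[] args) {
        TopologyAwareLoad loader = new TopologyAwareLoad();
        int[] scans = {0};

        Set<String> res = loader.load(() -> {
            // Simulate a node leaving in the middle of the first scan only.
            if (scans[0]++ == 0)
                loader.onTopologyChange();

            return new HashSet<>(Arrays.asList("a", "b"));
        }, 3);

        System.out.println(scans[0] + " scans, result size " + res.size());
    }
}
```

Even this toy version has to decide what happens when the topology never
settles (here it simply gives up and returns null), and it says nothing about
coordinating the result across nodes.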

Mon, Jun 25, 2018 at 22:13, Denis Magda <dm...@apache.org>:

> Folks,
>
> Why don't we read data straight from the persistence layer, warming RAM up
> in the background (like we do for SQL and other APIs)? If it's a question
> of time, then I would suggest that we not hurry and do it the right way.
>
> --
> Denis

Re: IgniteSet implementation: changes required

Posted by Denis Magda <dm...@apache.org>.
Folks,

Why don't we read data straight from the persistence layer, warming RAM up
in the background (like we do for SQL and other APIs)? If it's a question
of time, then I would suggest that we not hurry and do it the right way.
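
A rough sketch of the idea in plain Java, not Ignite code. It assumes a
hypothetical "disk" map standing in for the persistence layer and a "ram" set
standing in for the in-memory structure: reads go to disk until a background
task has finished warming RAM. It deliberately ignores concurrent updates and
topology changes.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model: serve reads from persistence while RAM warms up. */
public class WarmUpSketch {
    /** Stand-in for the persistence layer (survives a restart). */
    private final Map<String, Boolean> disk;

    /** Stand-in for the in-memory structure (empty after a restart). */
    private final Set<String> ram = ConcurrentHashMap.newKeySet();

    /** Set to true once the background load has copied everything. */
    private volatile boolean warmed;

    public WarmUpSketch(Map<String, Boolean> disk) {
        this.disk = disk;
    }

    /** Starts the background load and returns a future that completes when it is done. */
    public CompletableFuture<Void> warmUpAsync() {
        return CompletableFuture.runAsync(() -> {
            ram.addAll(disk.keySet());
            warmed = true;
        });
    }

    public boolean isWarmed() {
        return warmed;
    }

    /** Reads fall back to the persistence layer until warm-up is finished. */
    public boolean contains(String key) {
        return warmed ? ram.contains(key) : disk.containsKey(key);
    }

    public int size() {
        return warmed ? ram.size() : disk.size();
    }

    public static void main(String[] args) {
        Map<String, Boolean> disk = new ConcurrentHashMap<>();
        disk.put("a", Boolean.TRUE);
        disk.put("b", Boolean.TRUE);
        disk.put("c", Boolean.TRUE);

        WarmUpSketch set = new WarmUpSketch(disk);

        System.out.println(set.contains("a")); // Answered from "disk".

        set.warmUpAsync().join();

        System.out.println(set.contains("a")); // Answered from "ram".
        System.out.println(set.size());
    }
}
```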

--
Denis

On Mon, Jun 25, 2018 at 6:20 AM Anton Vinogradov <av...@apache.org> wrote:

> +1 to removal in case there is no easy, fast and consistent way to restore
> setDataMap on node restart.
> I see that we'll take a performance hit on size() or keys(), but these
> methods are rarely used.

Re: IgniteSet implementation: changes required

Posted by Anton Vinogradov <av...@apache.org>.
+1 to removal in case there is no easy, fast and consistent way to restore
setDataMap on node restart.
I see that we'll take a performance hit on size() or keys(), but these
methods are rarely used.
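
A minimal sketch of that trade-off in plain Java, not Ignite code. It assumes
a hypothetical backing map standing in for the persisted cache data and keeps
no separate on-heap index: membership checks stay point lookups, size() and
iteration become full scans, and nothing has to be rebuilt after a restart.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model: a set stored only as keys of a backing key-value store. */
public class ScanBackedSet<T> {
    /** Stand-in for cache data that survives a node restart. */
    private final Map<T, Boolean> store;

    public ScanBackedSet(Map<T, Boolean> store) {
        this.store = store;
    }

    /** Returns true if the item was not in the set before. */
    public boolean add(T item) {
        return store.putIfAbsent(item, Boolean.TRUE) == null;
    }

    /** Point lookup; no scan needed. */
    public boolean contains(T item) {
        return store.containsKey(item);
    }

    /** Counts elements by scanning every key, to model the slower path. */
    public int size() {
        int cnt = 0;

        for (T ignored : store.keySet())
            cnt++;

        return cnt;
    }

    /** Iteration is a scan over the backing store. */
    public Iterator<T> iterator() {
        return store.keySet().iterator();
    }

    public static void main(String[] args) {
        Map<String, Boolean> persisted = new ConcurrentHashMap<>();

        ScanBackedSet<String> set = new ScanBackedSet<>(persisted);
        set.add("a");
        set.add("b");
        set.add("a"); // Duplicate, ignored.

        // Simulated restart: only the backing store survives.
        ScanBackedSet<String> restored = new ScanBackedSet<>(persisted);

        System.out.println(restored.size());
        System.out.println(restored.contains("b"));
    }
}
```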

Mon, Jun 25, 2018 at 16:07, Pavel Pereslegin <xx...@gmail.com>:

> Hello, Igniters.
>
> I tried to implement IgniteSet data recovery when persistence is enabled
> [1] using trivial cache scanning; however, I cannot find an optimal way to
> do that, for the following reasons:
> - Performing operations on IgniteSet requires completion of data
> loading (restoring of setDataMap) on all nodes. Doing this during
> partition map exchange takes too long.
> - Prohibiting operations on IgniteSet until asynchronous cache scanning
> has completed on all nodes looks rather complicated, because it is
> necessary to support all situations of unstable topology.
>
> So I see one option to fix data loss on node restart: remove the
> entire optimization (setDataMap) and rework the iterator
> implementation to perform cache scanning.
>
> Thoughts?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5553