Posted to dev@ignite.apache.org by Ilya Kasnacheev <il...@gmail.com> on 2018/07/24 13:03:22 UTC

ConcurrentLinkedHashMap works incorrectly after clear()

Hello igniters!

So I was working on a fix for
https://issues.apache.org/jira/browse/IGNITE-9056
The reason for the test flakiness turned out to be that our ConcurrentLinkedHashMap (and
its tautological cousin GridBoundedConcurrentLinkedHashMap) is broken :(

When you do clear(), its size counter is not updated. So sizex() will
return the old size after clear(), and if maxCnt is set, then after several
clear()s the map will immediately evict entries as soon as they are inserted,
maintaining the map size at 0.
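
To illustrate (a minimal sketch of the failure mode; boundedMap() below is
a hypothetical factory, only the clear()/sizex() behavior comes from the
description above):

    // boundedMap() is a hypothetical factory for a map with maxCnt = 2.
    ConcurrentLinkedHashMap<Integer, String> map = boundedMap(2);

    map.put(1, "a");
    map.put(2, "b");

    map.clear();          // removes the entries...
    int sz = map.sizex(); // ...but still returns 2: the counter was not reset

    map.put(3, "c");      // the stale counter already exceeds maxCnt, so the
                          // new entry is evicted right away; the map stays empty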

This is scary since indexing internals make intense use of
ConcurrentLinkedHashMaps.

My suggestion for this fix is to avoid ever calling clear(): make it
throw UnsupportedOperationException and recreate/replace the map instead of
clear()ing it, unless somebody is going to stand up and fix
ConcurrentLinkedHashMap.clear() properly. Frankly speaking, I'm afraid of
touching this code in any non-trivial way.
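
The call-site replacement would look roughly like this (a sketch of the
pattern only; the volatile field and the createCache() factory are
illustrative, not the actual patch):

    private volatile ConcurrentLinkedHashMap<String, Object> cache = createCache();

    void reset() {
        // Swap in a fresh instance instead of calling clear(); the old map,
        // stale size counter and all, simply becomes garbage.
        cache = createCache();
    }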

-- 
Ilya Kasnacheev

Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello Igniters!

So we have merged the patch. If there are further steps with CLHM, feel
free to contribute.

Regards,

-- 
Ilya Kasnacheev


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Andrey Gura <ag...@apache.org>.
Stas,

A SkipList implementation offers O(log n) get/put/contains operations,
while CLHM offers O(1). So it is suitable for small data sets but will
have a serious performance impact on big ones.

However, it seems it's time for the right choice: correctness or
performance. The answer seems obvious )

Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Dmitriy Setrakyan <ds...@apache.org>.
On Thu, Aug 9, 2018 at 8:16 AM, Alexey Goncharuk <alexey.goncharuk@gmail.com> wrote:

I agree. This sounds like a less risky approach.

D.

Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Alexey Goncharuk <al...@gmail.com>.
Guys, I think we can definitely replace the current implementation of CLHM
with a more stable one, but as a temporary solution I see nothing wrong with
throwing an UnsupportedOperationException from the clear() method. Ilya has
already provided a patch which replaces all clear() calls with new map
creation; semantically it has the same behavior as a correct clear() method.

I suggest merging the provided PR, because IgniteH2Indexing is broken
because of this bug, and then creating a ticket to replace/fix/do whatever
feels right with the current CLHM.

Thoughts?


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Anton Vinogradov <av...@apache.org>.
Is it possible to replace the current implementation with Google's [1] or
some other?
It's a bad idea to have our own CLHM and keep it outdated (this version is
based on the CHM from Java 7!) and broken.

[1]
https://mvnrepository.com/artifact/com.googlecode.concurrentlinkedhashmap/concurrentlinkedhashmap-lru
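
For reference, that library's Builder API would map quite directly onto our
bounded usage (a sketch; needs verifying against the exact artifact version):

    ConcurrentLinkedHashMap<String, Object> map =
        new ConcurrentLinkedHashMap.Builder<String, Object>()
            .maximumWeightedCapacity(1000) // plays the role of our maxCnt
            .build();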


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Most of our code uses CLHM as a fixed-size container. I can surely fix
LogThrottle, but my main concern is the H2 indexing code, which uses the
same CLHM with a cap.

Regards,

-- 
Ilya Kasnacheev

Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Maxim Muzafarov <ma...@gmail.com>.
Ilya,

As for me, the main cause of this problem is not CLHM itself but the fact
that we are using it in GridLogThrottle as a fixed-size container. I suppose
at the moment we have no alternative and should start thinking about
further steps.

Can you create a clear reproducer for the CLHM issue described above?

As a workaround for GridLogThrottle, we can change clear() like this:

    public static void clear() {
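        // Unlike clear() on the map itself, remove() updates the size
        // counter for each entry, so the count stays consistent.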
        errors.forEach((k, v) -> errors.remove(k));
    }

Will it help you fix the test?

Thoughts?

--
Maxim Muzafarov

Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

LT stops throttling input as soon as it is unable to add any entries to the
underlying map, since the map considers itself full with 0 entries.

I have tried to implement both suggestions, and both have a big impact on
the CLHM code. I am super uncomfortable with making non-trivial edits to it.

If the first approach is chosen, we need to iterate over the values instead
of just clearing the table; if the second is chosen, locking becomes
non-trivial, since we're grabbing segment locks outside of the segment code.

Changing LongAdder to AtomicLong also has potential implications which are
not yet well understood.

Note that entryQ is not cleared in clear() either, which can cause further
problems. My suggestion is still to either disallow clear() or throw this
class away in its entirety.

Regards,

-- 
Ilya Kasnacheev


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Maxim Muzafarov <ma...@gmail.com>.
Hello Ilya,

Can you add more info about why and how LT prints the log message twice for
this test case?

From my point of view, making the clear() method throw
UnsupportedOperationException is not the right fix for the flaky test
issues. A brief search through CLHM led me to the thought that we simply
forgot to decrement the LongAdder size while iterating over the HashEntry
array. We increment and decrement this counter on put/remove operations,
so why not in the clear() method? Am I right? Replacing LongAdder with
AtomicLong sounds good to me too, as suggested by Ilya Lantukh. But I may
be mistaken here.

In general, I think this is a good occasion to start thinking about how to
get rid of CLHM. E.g., we could consider this implementation [1].

[1] https://github.com/ben-manes/concurrentlinkedhashmap

--
Maxim Muzafarov

RE: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Stanislav Lukyanov <st...@gmail.com>.
It seems that we currently use the CLHM primarily as a FIFO cache.
I see two ways around that.

First, we could implement such an LRU/FIFO cache on top of another, properly
supported data structure from j.u.c. ConcurrentSkipListMap seems OK. I have
a draft implementation of LruEvictionPolicy based on it that passes
functional tests, but I haven’t benchmarked it yet.

Second, Guava has a Cache API with a lot of configuration options that we could use (license is Apache, should be OK). 
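
For illustration, the Guava option would look something like this (a sketch
against Guava's CacheBuilder API; note its size-based eviction is
approximately LRU rather than strict FIFO, which we'd need to check):

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;

    Cache<String, Object> cache = CacheBuilder.newBuilder()
        .maximumSize(1000) // bounded, like our maxCnt
        .build();

    cache.put("key", "value");
    Object cached = cache.getIfPresent("key");
    cache.invalidateAll(); // a clear() that actually works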

Stan


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Nikolay Izhikov <ni...@apache.org>.
Hello, Ilya.

Maybe we need to proceed with ticket [1], "Get rid of org.jsr166.ConcurrentLinkedHashMap"?

Especially if this class is broken and buggy.

[1] https://issues.apache.org/jira/browse/IGNITE-7516


Re: ConcurrentLinkedHashMap works incorrectly after clear()

Posted by Ilya Lantukh <il...@gridgain.com>.
Thanks for revealing this issue!

I don't understand why we should disallow calling clear().

One way it can be re-implemented is:
1. acquire write locks on all segments;
2. clear them;
3. reset size to 0;
4. release locks.

Another approach is to calculate inside
ConcurrentLinkedHashMap.Segment.clear() how many entries you actually
deleted and then call size.addAndGet(...).

In both cases you'll have to replace LongAdder with AtomicLong.
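
A rough sketch of the second approach (the field names table, HashEntry and
size are guesses at the actual Segment layout, and I assume Segment extends
ReentrantLock as in the Java 7 CHM it derives from):

    void clear() {
        lock();

        try {
            int removed = 0;

            // Count the entries we are about to drop, then null out the table.
            for (int i = 0; i < table.length; i++) {
                for (HashEntry<K, V> e = table[i]; e != null; e = e.next)
                    removed++;

                table[i] = null;
            }

            // Requires size to be an AtomicLong, per the suggestion above.
            size.addAndGet(-removed);
        }
        finally {
            unlock();
        }
    }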


-- 
Best regards,
Ilya