Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2018/05/01 14:41:46 UTC

RE: 7.3 appears to leak

Mạnh, Shalin,

I tried to reproduce it locally but I failed; it is not just a stream of queries and frequent updates/commits. We will temporarily abuse a production machine to run 7.3 and a control machine on 7.2 to rule some things out.
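
For reference, the kind of local load I tried (and which did not trigger it) was roughly this SolrJ sketch; the collection and field names are made up for illustration:

    // Sketch: continuous indexing with frequent commits plus a query
    // stream against a local Solr, collection "main" (hypothetical).
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class LoadLoop {
      public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
            "http://localhost:8983/solr/main").build()) {
          for (int i = 0; ; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("text_t", "document number " + i);
            client.add(doc);
            if (i % 100 == 0) {
              client.commit();                                // frequent commits
              client.query(new SolrQuery("text_t:document")); // query stream
            }
          }
        }
      }
    }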

We have plenty of custom plugins, so when I can reproduce it again, we can rule stuff out and hopefully get back to you guys!

Thanks,
Markus
 
-----Original message-----
> From: Đạt Cao Mạnh <ca...@gmail.com>
> Sent: Monday 30th April 2018 4:07
> To: solr-user@lucene.apache.org
> Subject: Re: 7.3 appears to leak
> 
> Hi Markus,
> 
> I tried indexing documents and querying them with queries and filter
> queries, but could not find any leak problems. Can you give us more
> information about the leak?
> 
> Thanks!
> 
> On Fri, Apr 27, 2018 at 5:11 PM Shalin Shekhar Mangar <
> shalinmangar@gmail.com> wrote:
> 
> > Hi Markus,
> >
> > Can you give an idea of what your filter queries look like? Any custom
> > plugins or things we should be aware of? Simply indexing artificial docs,
> > querying, and committing doesn't seem to reproduce the issue for me.
> >
> > On Thu, Apr 26, 2018 at 10:13 PM, Markus Jelsma <
> > markus.jelsma@openindex.io>
> > wrote:
> >
> > > Hello,
> > >
> > > We just finished upgrading our three separate clusters from 7.2.1 to
> > > 7.3, which went fine, except for our main text search collection: it
> > > appears to leak memory on commit!
> > >
> > > After the initial upgrade we saw the cluster slowly starting to run out
> > > of memory within about an hour and a half. We increased the heap in case
> > > 7.3 just requires more of it, but the heap consumption graph still grows
> > > on each commit. Heap space cannot be reclaimed by forcing the garbage
> > > collector to run; everything just piles up in OldGen. Running with this
> > > slightly larger heap, the first nodes run out of memory about two and a
> > > half hours after a cluster restart.
> > >
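> > > In case it helps others check the same thing, the forced-GC observation
> > > above can be confirmed with standard JDK tooling (<pid> stands for the
> > > Solr process id):
> > >
> > >     jmap -histo:live <pid> | head -n 20
> > >
> > > The :live option forces a full GC before printing the histogram, so the
> > > top entries are objects that genuinely cannot be reclaimed.
> > >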
> > > The heap-eating cluster is a 2-shard/3-replica system on separate nodes.
> > > Each replica is about 50 GB in size and holds about 8.5 million
> > > documents. On 7.2.1 it ran fine with just a 2 GB heap. With 7.3 and a
> > > 2.5 GB heap, it just takes a little longer to run out of memory.
> > >
> > > I inspected reports shown by the sampler of VisualVM and spotted one
> > > peculiarity: the number of instances of SortedIntDocSet kept growing on
> > > each commit by about the same amount as the number of cached filter
> > > queries. This doesn't happen on the logs cluster; SortedIntDocSet
> > > instances are neatly collected there. The number of instances also
> > > matches the number of commits since start-up times the cache sizes.
> > >
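> > > For reference, the cache in question is the filterCache from
> > > solrconfig.xml; ours is an ordinary definition along these lines (the
> > > sizes here are illustrative, not our exact values):
> > >
> > >     <filterCache class="solr.FastLRUCache"
> > >                  size="512"
> > >                  initialSize="512"
> > >                  autowarmCount="128"/>
> > >
> > > Each entry holds a DocSet such as SortedIntDocSet, so if the caches of
> > > old searchers were never released, the instance count would grow by
> > > roughly the cache size on every commit, which is exactly the pattern
> > > above.
> > >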
> > > Our other two clusters don't have this problem. One of them receives
> > > very few commits per day, but the other receives data all the time: it
> > > logs user interactions, so a large amount of data is always coming in. I
> > > cannot reproduce it locally by indexing data and committing all the
> > > time; the peak usage in OldGen stays about the same. But I can reproduce
> > > it locally when I introduce queries and filter queries while indexing
> > > and committing pieces of data.
> > >
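> > > For completeness, the query side of that local reproduction looks
> > > roughly like this (SolrJ sketch; the field names are made up):
> > >
> > >     // Sketch: queries with varying filter queries, so each request adds
> > >     // a fresh filterCache entry (client is an HttpSolrClient as above).
> > >     for (int i = 0; i < 10000; i++) {
> > >       SolrQuery q = new SolrQuery("text_t:document");
> > >       q.addFilterQuery("id:[" + (i % 1000) + " TO *]");  // varying fq
> > >       client.query(q);
> > >     }
> > >
> > > Varying the fq values matters: identical filters would just hit the same
> > > cache entry instead of filling the cache.
> > >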
> > > So, what is the problem? I dug into the CHANGES.txt of both Lucene and
> > > Solr, but nothing really caught my attention. Does anyone here have an
> > > idea where to look?
> > >
> > > Many thanks,
> > > Markus
> > >
> >
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
> 

Re: 7.3 appears to leak

Posted by Đạt Cao Mạnh <ca...@gmail.com>.
Thanks, Markus.

So I will go ahead with the 7.3.1 release.
