Posted to solr-user@lucene.apache.org by Phillip Peleshok <pp...@gmail.com> on 2016/06/01 19:34:07 UTC

Solr off-heap FieldCache & HelioSearch

Hey everyone,

I've been using Solr for some time now and, like many others, I'm running
into GC issues.  I've exhausted all the traditional GC settings
recommended by various individuals (e.g., Shawn Heisey), but none of them
proved sufficient.  The one solution I've seen that did prove useful is
Heliosearch and its off-heap implementation.

My question is this: why wasn't the off-heap FieldCache implementation (
http://yonik.com/hs-solr-off-heap-fieldcache-performance/) ever rolled into
Solr when the other Heliosearch improvements were merged?  Was there a
fundamental design problem, or was it just a matter of the time/testing
that the move would incur?

Thanks,
Phil

Re: Solr off-heap FieldCache & HelioSearch

Posted by Phillip Peleshok <pp...@gmail.com>.
Funny you say that, as that's exactly what happened.  Tried them a couple
of weeks ago and got nothing.  Going at them again; we'll see what happens.

Yeah, we're in the same boat.  We started with profilers (YourKit) to
track down the causes.  We mainly got hit in the field cache and the
ordinal maps (and all the objects allocated just to build them).  Since we
transitioned from classic facets to JSON facets, SOLR-8922 unfortunately
doesn't help us much, but it looks really good.  We were also looking at
cutting out the ordinal cache depending on the cardinality; that's still a
PoC at this point, but it does allow us to cap the memory usage.  Then,
prompted by the discussion at
http://stackoverflow.com/questions/214362/java-very-large-heap-sizes, we
stumbled across the off-heap approach and were giving it a go to see
whether it's a worthwhile avenue.  But after reading about sun.misc.Unsafe
I started getting cold feet, and that's why I was trying to dig up a
little more history.
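For anyone curious what the "sketchy" part looks like, here's a minimal
sketch of the kind of Unsafe-based off-heap access the thread is about
(class and method names are illustrative, not Heliosearch's actual code):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class OffHeapSketch {
    // Grab the Unsafe singleton via reflection; "theUnsafe" is an internal
    // JDK field, not a supported public API -- hence the cold feet.
    static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Allocate a single int off-heap, write it, read it back, free it.
    public static int roundTrip(int value) {
        Unsafe unsafe = getUnsafe();
        long addr = unsafe.allocateMemory(Integer.BYTES); // invisible to GC
        try {
            unsafe.putInt(addr, value); // raw address: no bounds/type checks
            return unsafe.getInt(addr);
        } finally {
            unsafe.freeMemory(addr); // manual lifetime: forget this, leak
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

The upside is that the allocation never touches the GC; the downside is
exactly what the code comments show: manual frees and zero safety checks.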

I was actually thinking about isolating one JVM per shard too.  After
whiteboarding it, I decided against it since it didn't lend itself to our
scenarios, but I'd be interested in how it turns out for you.

Thanks!
Phil

On Fri, Jun 3, 2016 at 8:33 AM, Jeff Wartes <jw...@whitepages.com> wrote:

>
> For what it’s worth, I’d suggest you go into a conversation with Azul with
> a more explicit “I’m looking to buy” approach. I reached out to them with a
> more “I’m exploring my options” attitude, and never even got a trial. I get
> the impression their business model involves a fairly expensive (to them)
> trial process, so they’re looking for more urgency on the part of the
> client than I was expressing.
>
> Instead, I spent a few weeks analyzing how my specific index allocated
> memory. This turned out to be quite worthwhile. Armed with that
> information, I was able to file a few patches (coming in 6.1, perhaps?)
> that reduced allocations by a pretty decent amount on large indexes.
> (SOLR-8922, particularly) It also straight-up ruled out certain things Solr
> supports, because the allocations were just too heavy. (SOLR-9125)
>
> I suppose the next thing I’m considering is using multiple JVMs per host,
> essentially one per shard. This wouldn’t change the allocation rate, but
> does serve to reduce the worst-case GC pause, since each JVM can have a
> smaller heap. I’d be trading a little p50 latency for some p90 latency
> reduction, I’d expect. Of course, that adds a bunch of headache to managing
> replica locations too.
>
>
> On 6/2/16, 6:30 PM, "Phillip Peleshok" <pp...@gmail.com> wrote:
>
> >Fantastic! I'm sorry I couldn't find that JIRA before and for getting you
> >to track it down.
> >
> >Yup, I noticed that for the docvalues with the ordinal map and I'm
> >definitely leveraging all that but I'm hitting the terms limit now and
> that
> >ends up pushing me over.  I'll see about giving Zing/Azul a try.  From all
> >my readings using theUnsafe seemed a little sketchy (
> >http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/) so
> >I'm glad that seemed to be the point of contention bringing it in and not
> >anything else.
> >
> >Thank you very much for the info,
> >Phil
> >
> >On Thu, Jun 2, 2016 at 6:14 PM, Erick Erickson <er...@gmail.com>
> >wrote:
> >
> >> Basically it never reached consensus, see the discussion at:
> >> https://issues.apache.org/jira/browse/SOLR-6638
> >>
> >> If you can afford it I've seen people with very good results
> >> using Zing/Azul, but that can be expensive.
> >>
> >> DocValues can help for fields you facet and sort on,
> >> those essentially move memory into the OS
> >> cache.
> >>
> >> But memory is an ongoing struggle I'm afraid.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Jun 1, 2016 at 12:34 PM, Phillip Peleshok <pp...@gmail.com>
> >> wrote:
> >> > Hey everyone,
> >> >
> >> > I've been using Solr for some time now and running into GC issues as
> most
> >> > others have.  Now I've exhausted all the traditional GC settings
> >> > recommended by various individuals (ie Shawn Heisey, etc) but neither
> >> > proved sufficient.  The one solution that I've seen that proved
> useful is
> >> > Heliosearch and the off-heap implementation.
> >> >
> >> > My question is this, why wasn't the off-heap FieldCache
> implementation (
> >> > http://yonik.com/hs-solr-off-heap-fieldcache-performance/) ever
> rolled
> >> into
> >> > Solr when the other HelioSearch improvement were merged? Was there a
> >> > fundamental design problem or just a matter of time/testing that
> would be
> >> > incurred by the move?
> >> >
> >> > Thanks,
> >> > Phil
> >>
>
>

Re: Solr off-heap FieldCache & HelioSearch

Posted by Jeff Wartes <jw...@whitepages.com>.
For what it’s worth, I’d suggest you go into a conversation with Azul with a more explicit “I’m looking to buy” approach. I reached out to them with a more “I’m exploring my options” attitude, and never even got a trial. I get the impression their business model involves a fairly expensive (to them) trial process, so they’re looking for more urgency on the part of the client than I was expressing.

Instead, I spent a few weeks analyzing how my specific index allocated memory. This turned out to be quite worthwhile. Armed with that information, I was able to file a few patches (coming in 6.1, perhaps?) that reduced allocations by a pretty decent amount on large indexes. (SOLR-8922, particularly) It also straight-up ruled out certain things Solr supports, because the allocations were just too heavy. (SOLR-9125)

I suppose the next thing I’m considering is using multiple JVMs per host, essentially one per shard. This wouldn’t change the allocation rate, but does serve to reduce the worst-case GC pause, since each JVM can have a smaller heap. I’d be trading a little p50 latency for some p90 latency reduction, I’d expect. Of course, that adds a bunch of headache to managing replica locations too.
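A rough sketch of that layout using Solr's own start script (ports, heap
size, and home directories are all hypothetical; the commands are echoed as
a dry run, so drop the `echo` to actually start the nodes):

```shell
# One Solr JVM per shard on a single host: four nodes, each with its own
# port, an 8g heap (instead of one big 32g heap), and its own home dir.
# Smaller heaps bound the worst-case GC pause; allocation rate is unchanged.
for port in 8983 8984 8985 8986; do
  echo bin/solr start -c -p "$port" -m 8g -s "/var/solr/node-$port"
done
```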


On 6/2/16, 6:30 PM, "Phillip Peleshok" <pp...@gmail.com> wrote:

>Fantastic! I'm sorry I couldn't find that JIRA before and for getting you
>to track it down.
>
>Yup, I noticed that for the docvalues with the ordinal map and I'm
>definitely leveraging all that but I'm hitting the terms limit now and that
>ends up pushing me over.  I'll see about giving Zing/Azul a try.  From all
>my readings using theUnsafe seemed a little sketchy (
>http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/) so
>I'm glad that seemed to be the point of contention bringing it in and not
>anything else.
>
>Thank you very much for the info,
>Phil
>
>On Thu, Jun 2, 2016 at 6:14 PM, Erick Erickson <er...@gmail.com>
>wrote:
>
>> Basically it never reached consensus, see the discussion at:
>> https://issues.apache.org/jira/browse/SOLR-6638
>>
>> If you can afford it I've seen people with very good results
>> using Zing/Azul, but that can be expensive.
>>
>> DocValues can help for fields you facet and sort on,
>> those essentially move memory into the OS
>> cache.
>>
>> But memory is an ongoing struggle I'm afraid.
>>
>> Best,
>> Erick
>>
>> On Wed, Jun 1, 2016 at 12:34 PM, Phillip Peleshok <pp...@gmail.com>
>> wrote:
>> > Hey everyone,
>> >
>> > I've been using Solr for some time now and running into GC issues as most
>> > others have.  Now I've exhausted all the traditional GC settings
>> > recommended by various individuals (ie Shawn Heisey, etc) but neither
>> > proved sufficient.  The one solution that I've seen that proved useful is
>> > Heliosearch and the off-heap implementation.
>> >
>> > My question is this, why wasn't the off-heap FieldCache implementation (
>> > http://yonik.com/hs-solr-off-heap-fieldcache-performance/) ever rolled
>> into
>> > Solr when the other HelioSearch improvement were merged? Was there a
>> > fundamental design problem or just a matter of time/testing that would be
>> > incurred by the move?
>> >
>> > Thanks,
>> > Phil
>>


Re: Solr off-heap FieldCache & HelioSearch

Posted by Phillip Peleshok <pp...@gmail.com>.
Fantastic!  I'm sorry I couldn't find that JIRA myself and made you track
it down.

Yup, I noticed that for the docvalues with the ordinal map, and I'm
definitely leveraging all of that, but I'm hitting the terms limit now and
that ends up pushing me over.  I'll see about giving Zing/Azul a try.  From
all my reading, using theUnsafe seemed a little sketchy (
http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/), so
I'm glad that was the point of contention in bringing it in, and not
anything else.

Thank you very much for the info,
Phil

On Thu, Jun 2, 2016 at 6:14 PM, Erick Erickson <er...@gmail.com>
wrote:

> Basically it never reached consensus, see the discussion at:
> https://issues.apache.org/jira/browse/SOLR-6638
>
> If you can afford it I've seen people with very good results
> using Zing/Azul, but that can be expensive.
>
> DocValues can help for fields you facet and sort on,
> those essentially move memory into the OS
> cache.
>
> But memory is an ongoing struggle I'm afraid.
>
> Best,
> Erick
>
> On Wed, Jun 1, 2016 at 12:34 PM, Phillip Peleshok <pp...@gmail.com>
> wrote:
> > Hey everyone,
> >
> > I've been using Solr for some time now and running into GC issues as most
> > others have.  Now I've exhausted all the traditional GC settings
> > recommended by various individuals (ie Shawn Heisey, etc) but neither
> > proved sufficient.  The one solution that I've seen that proved useful is
> > Heliosearch and the off-heap implementation.
> >
> > My question is this, why wasn't the off-heap FieldCache implementation (
> > http://yonik.com/hs-solr-off-heap-fieldcache-performance/) ever rolled
> into
> > Solr when the other HelioSearch improvement were merged? Was there a
> > fundamental design problem or just a matter of time/testing that would be
> > incurred by the move?
> >
> > Thanks,
> > Phil
>

Re: Solr off-heap FieldCache & HelioSearch

Posted by Phillip Peleshok <pp...@gmail.com>.
Thank you for the info on this.  Yeah, I should've raised this on the dev
list; sorry about that.  Funny you mention that, since I was trending in
that direction as well.  Then I saw the off-heap stuff and thought it might
offer an easy way out.  To be honest, I'd like to focus on the re-use
scheme.  We're already looking at that approach for the ordinal maps.

Thanks again,
Phil

On Fri, Jun 3, 2016 at 4:33 AM, Toke Eskildsen <te...@statsbiblioteket.dk>
wrote:

> On Thu, 2016-06-02 at 18:14 -0700, Erick Erickson wrote:
> > But memory is an ongoing struggle I'm afraid.
>
> With fear of going too far into devel-territory...
>
>
> There are several places in Solr where memory usage is far from optimal
> with high-cardinality data and where improvements can be made without
> better GC or off-heap.
>
> Some places it is due to "clean object oriented" programming, for
> example with priority queues filled with objects, which gets very GC
> expensive for 100K+ entries. Some of this can be remedied by less clean
> coding and bit-hacking, but often results in less-manageable code.
>
> https://sbdevel.wordpress.com/2015/11/13/the-ones-that-got-away/
>
>
> Other places it is large arrays that are hard to avoid, for example with
> docID-bitmaps and counter-arrays for String faceting. These put quite a
> strain on GC as they are being allocated and released all the time.
> Unless the index is constantly updated, DocValues does not help much
> with GC as the counters are the same, DocValues or not.
>
> The layout of these structures is well-defined: As long as the Searcher
> has not been re-opened, each new instance of an array is of the exact
> same size as the previous one. When the searcher is re-opened, all the
> sizes change. Putting those structures off-heap is one solution,
> another is to re-use the structures.
>
> Our experiments with re-using faceting counter structures have been very
> promising (far less GC, lower response times). I would think that the
> same would be true for a similar docID-bitmap re-use scheme.
>
>
> So yes, very much an on-going struggle, but one where there are multiple
> known remedies. Not necessarily easy to implement though.
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>

Re: Solr off-heap FieldCache & HelioSearch

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2016-06-02 at 18:14 -0700, Erick Erickson wrote:
> But memory is an ongoing struggle I'm afraid.

At the risk of going too far into devel-territory...


There are several places in Solr where memory usage is far from optimal
with high-cardinality data and where improvements can be made without
better GC or off-heap.

In some places it is due to "clean object oriented" programming, for
example priority queues filled with objects, which get very GC
expensive at 100K+ entries. Some of this can be remedied by less clean
coding and bit-hacking, but that often results in less-manageable code.

https://sbdevel.wordpress.com/2015/11/13/the-ones-that-got-away/
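As a rough illustration of that bit-hacking trade-off (hypothetical code,
not Solr's actual classes): instead of a priority queue of small
(score, docID) objects, both fields can be packed into one primitive long,
so a plain long[] heap holds 100K+ entries with zero per-entry allocation.

```java
// Pack a (score, docID) pair into a single long: score bits in the high
// 32 bits, docID in the low 32 bits. Comparing the packed longs orders
// entries by score first, which is what a top-N heap needs.
public class PackedEntry {
    public static long pack(int scoreBits, int docId) {
        return ((long) scoreBits << 32) | (docId & 0xFFFFFFFFL);
    }

    public static int scoreBits(long packed) {
        return (int) (packed >>> 32);
    }

    public static int docId(long packed) {
        return (int) packed; // truncation keeps the low 32 bits
    }
}
```

The cost is readability: the intent of `(int) (packed >>> 32)` is far less
obvious than a `getScore()` call on an object, which is Toke's point about
less-manageable code.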


In other places it is large arrays that are hard to avoid, for example the
docID-bitmaps and counter-arrays for String faceting. These put quite a
strain on the GC as they are allocated and released all the time.
Unless the index is constantly updated, DocValues does not help much
with GC as the counters are the same, DocValues or not.

The layout of these structures is well-defined: As long as the Searcher
has not been re-opened, each new instance of an array is of the exact
same size as the previous one. When the searcher is re-opened, all the
sizes change. Putting those structures off-heap is one solution,
another is to re-use the structures.

Our experiments with re-using faceting counter structures have been very
promising (far less GC, lower response times). I would think that the
same would be true for a similar docID-bitmap re-use scheme.


So yes, very much an ongoing struggle, but one where there are multiple
known remedies. Not necessarily easy to implement though.
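A minimal sketch of the counter re-use idea (hypothetical, not the actual
implementation): because every facet request against the same Searcher
needs an int[] of the same size, cleared arrays can be handed back out
instead of re-allocated.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical pool of facet counter arrays, sized to the field's term
// cardinality. Valid only for the lifetime of one Searcher; when the
// Searcher re-opens (and sizes change), the pool must be discarded.
public class CounterPool {
    private final int size;
    private final ConcurrentLinkedQueue<int[]> free = new ConcurrentLinkedQueue<>();

    public CounterPool(int size) {
        this.size = size;
    }

    public int[] acquire() {
        int[] counters = free.poll();
        // Allocate only on a pool miss; steady state allocates nothing.
        return counters != null ? counters : new int[size];
    }

    public void release(int[] counters) {
        Arrays.fill(counters, 0); // clearing is O(n) but allocation-free
        free.offer(counters);
    }
}
```

The trade is CPU (clearing on release) for GC pressure: the big arrays
become long-lived instead of being churned per request.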

- Toke Eskildsen, State and University Library, Denmark



Re: Solr off-heap FieldCache & HelioSearch

Posted by Erick Erickson <er...@gmail.com>.
Basically it never reached consensus, see the discussion at:
https://issues.apache.org/jira/browse/SOLR-6638

If you can afford it, I've seen people get very good results
using Zing/Azul, but that can be expensive.

DocValues can help for fields you facet and sort on,
those essentially move memory into the OS
cache.
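For reference, turning that on is a one-attribute change in schema.xml
(the field name here is illustrative):

```xml
<!-- docValues builds a column-oriented, disk-backed structure at index
     time, so faceting/sorting data lives in the OS page cache rather
     than on the Java heap. Re-indexing is required after the change. -->
<field name="category" type="string" indexed="true" stored="false"
       docValues="true"/>
```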

But memory is an ongoing struggle I'm afraid.

Best,
Erick

On Wed, Jun 1, 2016 at 12:34 PM, Phillip Peleshok <pp...@gmail.com> wrote:
> Hey everyone,
>
> I've been using Solr for some time now and running into GC issues as most
> others have.  Now I've exhausted all the traditional GC settings
> recommended by various individuals (ie Shawn Heisey, etc) but neither
> proved sufficient.  The one solution that I've seen that proved useful is
> Heliosearch and the off-heap implementation.
>
> My question is this, why wasn't the off-heap FieldCache implementation (
> http://yonik.com/hs-solr-off-heap-fieldcache-performance/) ever rolled into
> Solr when the other HelioSearch improvement were merged? Was there a
> fundamental design problem or just a matter of time/testing that would be
> incurred by the move?
>
> Thanks,
> Phil