Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2016/06/02 12:47:34 UTC

Small setFacetLimit() terminates Solr

Hello,

I ran across an awkward situation where I collect all ~7,000,000 distinct values for a field via faceting. To keep things optimized and reduce memory consumption, I don't use setFacetLimit(-1) but a reasonable limit of 10,000 or 100,000.

To my surprise, Solr just stops or crashes. So instead of decreasing the limit, I increased it to 1,000,000, and it works! The weird thing is that with a limit of 100,000 or 200,000 and a heap of 3.5 GB, Solr stops, but with a limit of 1,000,000 and a reduced heap of 2.5 GB, it works just fine.
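
For illustration, a minimal SolrJ sketch of a request along these lines (the collection name, field name, and ZooKeeper address are hypothetical, not from the original post):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetAllValues {
  public static void main(String[] args) throws Exception {
    // Connect through ZooKeeper, as usual for a SolrCloud client.
    try (CloudSolrClient client = new CloudSolrClient("localhost:9983")) {
      SolrQuery query = new SolrQuery("*:*");
      query.setRows(0);                 // only the facet counts are needed
      query.addFacetField("myfield");   // hypothetical field with ~7M distinct values
      query.setFacetMinCount(1);
      query.setFacetLimit(100000);      // the "small" limit that triggers the problem
      QueryResponse rsp = client.query("collection1", query);
      rsp.getFacetField("myfield").getValues()
         .forEach(c -> System.out.println(c.getName() + " " + c.getCount()));
    }
  }
}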

When it fails, it sometimes doesn't crash but instead throws:

396882 WARN  (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:9983) [   ] o.a.z.s.NIOServerCnxn caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x155111cc413000d, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)

This is on Solr 6.0.0 in cloud mode, with 3 shards and 2 replicas on my local machine.

What is happening here?

Many thanks,
Markus

RE: Small setFacetLimit() terminates Solr

Posted by Markus Jelsma <ma...@openindex.io>.
I'll have a look at it!
Thanks guys!

Markus


 
-----Original message-----
> From:Toke Eskildsen <te...@statsbiblioteket.dk>
> Sent: Thursday 2nd June 2016 15:49
> To: solr-user@lucene.apache.org
> Subject: Re: Small setFacetLimit() terminates Solr
> 
> On Thu, 2016-06-02 at 09:26 -0400, Yonik Seeley wrote:
> > My guess would be that the smaller limit causes large facet refinement
> > requests to be sent out in the second phase.
> > It's not clear what's happening after that, though (i.e. why that
> > causes things to crash).
> 
> The facet refinement can be a lot heavier than the initial call. For
> some of our queries (with unpatched Solr), we observed that it took 10
> times as long.
> 
> 
> Markus: You are hitting Solr in a way that scales very poorly. Maybe you
> can use export instead?
> https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets
> 
> 
> If you really need the faceting with full counts & everything, consider
> switching to a single-shard (and multiple replicas) setup as that
> removes the need for the refinement phase.
> 
> 
> - Toke Eskildsen, State and University Library, Denmark
> 
> 
> 

Re: Small setFacetLimit() terminates Solr

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2016-06-02 at 09:26 -0400, Yonik Seeley wrote:
> My guess would be that the smaller limit causes large facet refinement
> requests to be sent out in the second phase.
> It's not clear what's happening after that, though (i.e. why that
> causes things to crash).

The facet refinement can be a lot heavier than the initial call. For
some of our queries (with unpatched Solr), we observed that it took 10
times as long.


Markus: You are hitting Solr in a way that scales very poorly. Maybe you
can use export instead?
https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets
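
For reference, a rough sketch of what reading every value of a field through the /export handler could look like with SolrJ's streaming API in 6.x (the collection name, field name, and ZooKeeper address are hypothetical; the exported field must have docValues enabled):

import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.CloudSolrStream;

public class ExportAllValues {
  public static void main(String[] args) throws Exception {
    Map<String, String> props = new HashMap<>();
    props.put("q", "*:*");
    props.put("qt", "/export");       // route the request to the export handler
    props.put("fl", "myfield");       // hypothetical field; needs docValues
    props.put("sort", "myfield asc"); // /export requires an explicit sort

    CloudSolrStream stream = new CloudSolrStream("localhost:9983", "collection1", props);
    try {
      stream.open();
      while (true) {
        Tuple tuple = stream.read();
        if (tuple.EOF) {
          break;
        }
        System.out.println(tuple.getString("myfield"));
      }
    } finally {
      stream.close();
    }
  }
}

Since the tuples arrive sorted by the field, the distinct values can be collected by skipping consecutive repeats while streaming.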


If you really need the faceting with full counts & everything, consider
switching to a single-shard (and multiple replicas) setup as that
removes the need for the refinement phase.
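
A hedged sketch of creating such a single-shard, two-replica collection with the SolrJ collections API as it looked in 6.0 (the collection and config names are made up):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateSingleShard {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient("localhost:9983")) {
      CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
      create.setCollectionName("collection1_single"); // hypothetical name
      create.setConfigName("myconf");                 // hypothetical configset
      create.setNumShards(1);         // single shard: no distributed refinement phase
      create.setReplicationFactor(2); // replicas still spread query load
      create.process(client);
    }
  }
}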


- Toke Eskildsen, State and University Library, Denmark



Re: Small setFacetLimit() terminates Solr

Posted by Yonik Seeley <ys...@gmail.com>.
My guess would be that the smaller limit causes large facet refinement
requests to be sent out in the second phase.
It's not clear what's happening after that, though (i.e. why that
causes things to crash).

-Yonik
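
As a hedged aside (not from the thread): with classic distributed faceting, each shard is asked for roughly facet.limit * facet.overrequest.ratio + facet.overrequest.count terms in the first phase (defaults 1.5 and 10). Raising the overrequest makes it more likely the coordinator already has the counts it needs, which can shrink the refinement phase. A sketch, with a hypothetical field name:

import org.apache.solr.client.solrj.SolrQuery;

public class OverrequestSketch {
  public static void main(String[] args) {
    // Build a facet request that over-asks each shard up front, reducing the
    // number of terms that must be refined in the second phase.
    SolrQuery query = new SolrQuery("*:*");
    query.setRows(0);
    query.addFacetField("myfield");               // hypothetical field
    query.setFacetLimit(100000);
    query.set("facet.overrequest.ratio", "2.0");  // default is 1.5
    query.set("facet.overrequest.count", "100");  // default is 10
    System.out.println(query);
  }
}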


On Thu, Jun 2, 2016 at 8:47 AM, Markus Jelsma
<ma...@openindex.io> wrote:
> Hello,
>
> I ran across an awkward situation where I collect all ~7,000,000 distinct values for a field via faceting. To keep things optimized and reduce memory consumption, I don't use setFacetLimit(-1) but a reasonable limit of 10,000 or 100,000.
>
> To my surprise, Solr just stops or crashes. So instead of decreasing the limit, I increased it to 1,000,000, and it works! The weird thing is that with a limit of 100,000 or 200,000 and a heap of 3.5 GB, Solr stops, but with a limit of 1,000,000 and a reduced heap of 2.5 GB, it works just fine.
>
> When it fails, it sometimes doesn't crash but instead throws:
>
> 396882 WARN  (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:9983) [   ] o.a.z.s.NIOServerCnxn caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid 0x155111cc413000d, likely client has closed socket
>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
>         at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>         at java.lang.Thread.run(Thread.java:745)
>
> This is on Solr 6.0.0 in cloud mode, with 3 shards and 2 replicas on my local machine.
>
> What is happening here?
>
> Many thanks,
> Markus