Posted to solr-user@lucene.apache.org by Bertrand Mahé <bm...@servicepilot.com> on 2018/06/29 16:37:48 UTC

Solr - zoo with more than 1000 collections

Hi,

 

In order to store timeseries data and perform deletions easily, we create
several collections per day and then use aliases.
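For concreteness, a minimal sketch of this pattern, building the Collections API calls that create one collection per day and re-point a query alias at the newest days. The Solr URL, the `ts_` collection prefix, and the `timeseries` alias name are illustrative, not part of any actual setup:

```python
# Sketch: one collection per day, plus an alias covering the newest days.
# CREATEALIAS also *updates* an existing alias, so re-issuing it daily
# keeps the alias pointing at the most recent collections.
from datetime import date, timedelta
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr/admin/collections"  # illustrative host

def daily_collection_name(day: date, prefix: str = "ts") -> str:
    """Collection name for one day, e.g. ts_2018-06-29."""
    return f"{prefix}_{day.isoformat()}"

def create_collection_url(name: str) -> str:
    """Collections API request that creates one day's collection."""
    params = {"action": "CREATE", "name": name, "numShards": 1}
    return f"{SOLR}?{urlencode(params)}"

def update_alias_url(alias: str, days: list) -> str:
    """CREATEALIAS request listing the collections, newest first."""
    names = ",".join(daily_collection_name(d)
                     for d in sorted(days, reverse=True))
    params = {"action": "CREATEALIAS", "name": alias, "collections": names}
    return f"{SOLR}?{urlencode(params)}"

today = date(2018, 6, 29)
last_week = [today - timedelta(days=i) for i in range(7)]
print(create_collection_url(daily_collection_name(today)))
print(update_alias_url("timeseries", last_week))
```

The printed URLs would be issued (e.g. with curl) against a running SolrCloud cluster; a nightly job can create the new collection, re-issue CREATEALIAS, and delete the oldest collection.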

 

We are using Solr 7.3 and we have two questions:

 

Q1: In order to access the latest data quickly, would it be possible to load
cores in descending chronological order rather than alphabetical order?

 

Q2: When we exceed 1200-1300 collections, ZooKeeper's RAM usage suddenly
jumps from 600-700 KB to 3 GB, which makes it very slow or almost unusable.
Is this normal?

 

Thanks in advance,

 

Bertrand


Re: Solr - zoo with more than 1000 collections

Posted by Gus Heck <gu...@gmail.com>.
Hi Bertrand,

Are you by any chance using the new Time Routed Aliases feature? You didn't
mention it, so I suspect not, but you might want to look... It's still
pretty new, but it would be interesting to get your feedback on it if it
looks like it would help. I'm wondering how you got to that many
collections, and whether any of them hold old data that no longer needs to
be queried. If so, TRAs can clean up collections with old data
automatically (see router.autoDeleteAge). That would put an upper bound on
the number of collections you have to handle, and allow performance to stay
stable indefinitely once things are sized correctly (assuming a steady rate
of new data and a steady query rate/complexity).

https://lucene.apache.org/solr/guide/7_3/collections-api.html#createalias
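To make the suggestion concrete, here is a hedged sketch of the CREATEALIAS request that sets up a Time Routed Alias, using the router.* parameters from the Collections API docs linked above. The alias name, routing field, config set, and interval values are illustrative choices, not recommendations:

```python
# Sketch: CREATEALIAS request for a Time Routed Alias that creates one
# underlying collection per day and auto-deletes collections older than
# 30 days (router.autoDeleteAge), bounding the total collection count.
from urllib.parse import urlencode

def tra_createalias_url(solr="http://localhost:8983/solr/admin/collections"):
    params = {
        "action": "CREATEALIAS",
        "name": "timeseries",               # alias that queries/updates use
        "router.name": "time",              # makes this a Time Routed Alias
        "router.field": "timestamp_dt",     # document field used for routing
        "router.start": "NOW/DAY",          # first collection's start time
        "router.interval": "+1DAY",         # one collection per day
        "router.autoDeleteAge": "-30DAYS",  # drop collections older than 30 days
        "create-collection.collection.configName": "timeseriesConfig",
        "create-collection.numShards": 2,
    }
    return f"{solr}?{urlencode(params)}"

print(tra_createalias_url())
```

With this in place, Solr creates and deletes the daily collections itself; documents are simply indexed through the alias, and routing happens on the timestamp field.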

Some more improvements and a dedicated section in the docs were added in
7.4...

https://lucene.apache.org/solr/guide/7_4/collections-api.html#createalias

-Gus

On Fri, Jun 29, 2018 at 12:49 PM, Yago Riveiro <ya...@gmail.com>
wrote:

> Solr doesn’t scale very well with ~2K collections, and yes, the
> bottleneck is ZooKeeper itself.
>
> ZooKeeper doesn’t perform operations as quickly as expected on znodes
> with many children.
>
> In a recovery scenario (a node crash), this limitation hurts a lot: the
> work queue stacks up recovery operations because of the low throughput
> of consuming the queue.
>
> Regards.
>
> --
>
> Yago Riveiro
>
> On 29 Jun 2018 17:38 +0100, Bertrand Mahé <bm...@servicepilot.com>, wrote:
> > [original message quoted above]
>



-- 
http://www.the111shift.com

Re: Solr - zoo with more than 1000 collections

Posted by Yago Riveiro <ya...@gmail.com>.
Solr doesn’t scale very well with ~2K collections, and yes, the bottleneck is ZooKeeper itself.

ZooKeeper doesn’t perform operations as quickly as expected on znodes with many children.

In a recovery scenario (a node crash), this limitation hurts a lot: the work queue stacks up recovery operations because of the low throughput of consuming the queue.
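A back-of-the-envelope check of how fast the several-collections-per-day pattern approaches that range; the per-day counts below are assumptions for illustration, not figures from the original post:

```python
# How many days until the total collection count reaches a given limit,
# if nothing is ever deleted and `per_day` collections are created daily.
def days_until(limit: int, collections_per_day: int) -> int:
    """Days needed to accumulate `limit` collections (ceiling division)."""
    return -(-limit // collections_per_day)

for per_day in (2, 4, 8):
    print(f"{per_day} per day -> {days_until(1200, per_day)} days to reach 1200")
```

At four collections per day, the ~1200-collection range where the problems were observed is reached in well under a year, which is why some form of automatic deletion (an alias-rotation job or router.autoDeleteAge) matters for keeping the cluster stable.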

Regards.

--

Yago Riveiro

On 29 Jun 2018 17:38 +0100, Bertrand Mahé <bm...@servicepilot.com>, wrote:
> [original message quoted above]