Posted to solr-user@lucene.apache.org by Aurélien MAZOYER <au...@francelabs.com> on 2014/07/23 14:55:28 UTC

Passivate core in Solr Cloud

Hello,

We want to set up a Solr Cloud cluster to handle a high volume
of documents with a multi-tenant architecture. The problem is that
application-level isolation of tenants (using a shared index with a
"customer" field) does not meet our requirements. As a result, we
need one collection per customer. There are more than a thousand
customers, and it seems unreasonable to create thousands of collections
in Solr Cloud... But since we know that there is less than one query per
customer per day, we are looking for a way to passivate collections when
they are not in use. Is this a good idea? If so, are there best practices
for implementing it? What side effects can we expect? Do we need to put
some application-level logic on top of the Solr Cloud cluster to choose
which collections to unload? And is there perhaps something smarter (and
quicker) than simply loading and unloading a core when it is not in use?


Thank you for your answer(s),

Aurelien


Re: Passivate core in Solr Cloud

Posted by au...@francelabs.com.
Thank you Erick,

Ok, I will probably perform some tests. It seems to be a good candidate 
for a future blog post...

Regards,

Aurelien

On 27.07.2014 20:20, Erick Erickson wrote:
> "Does not play nice" really means it was designed to run in
> non-distributed mode. No work has been done to verify that it works in
> cloud mode, and I fully expect some "interesting" problems in that mode.
> If/when we get to it, that is.
>
> About replication: I haven't heard of any problems, but I also haven't
> heard of it working in that environment. I expect that it'll only try to
> replicate when it's loaded, so that might be interesting....
>
> Best,
> Erick

Re: Passivate core in Solr Cloud

Posted by Erick Erickson <er...@gmail.com>.
"Does not play nice" really means it was designed to run in
non-distributed mode. No work has been done to verify that it works in
cloud mode, and I fully expect some "interesting" problems in that mode.
If/when we get to it, that is.

About replication: I haven't heard of any problems, but I also haven't
heard of it working in that environment. I expect that it'll only try to
replicate when it's loaded, so that might be interesting....

Best,
Erick


On Thu, Jul 24, 2014 at 6:49 AM, Aurélien MAZOYER <
aurelien.mazoyer@francelabs.com> wrote:

> Thank you Erick and Alex for your answers. The "lots of cores" feature
> seems to meet my requirements, but it is a problem if it does not work
> with Solr Cloud. Is there an open issue for this?
> If I understand correctly, the only solution for me is to run multiple
> standalone Solr instances using transient cores and to distribute the
> cores for my tenants manually (I assume the LRU mechanism will be less
> effective, as it will be applied per Solr instance).
> When you say "does NOT play nice with distributed mode", does that also
> include the standard replication mechanism?
>
> Thanks,
>
> Regards,
>
> Aurelien

Re: Passivate core in Solr Cloud

Posted by Aurélien MAZOYER <au...@francelabs.com>.
Thank you Erick and Alex for your answers. The "lots of cores" feature
seems to meet my requirements, but it is a problem if it does not work
with Solr Cloud. Is there an open issue for this?
If I understand correctly, the only solution for me is to run multiple
standalone Solr instances using transient cores and to distribute the
cores for my tenants manually (I assume the LRU mechanism will be less
effective, as it will be applied per Solr instance).
When you say "does NOT play nice with distributed mode", does that also
include the standard replication mechanism?
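
To make the manual distribution idea concrete, here is a minimal Python
sketch under stated assumptions. It does not call Solr at all: the instance
URLs, the pick_instance helper, and the LoadedCores class are hypothetical
names used only to illustrate a stable tenant-to-instance mapping and a
per-instance LRU of loaded cores.

```python
import hashlib
from collections import OrderedDict

# Hypothetical standalone Solr instances; these URLs are placeholders.
INSTANCES = [
    "http://solr1:8983/solr",
    "http://solr2:8983/solr",
    "http://solr3:8983/solr",
]


def pick_instance(tenant, instances=INSTANCES):
    """Map a tenant to one instance with a stable hash (same tenant, same instance)."""
    digest = hashlib.md5(tenant.encode("utf-8")).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


class LoadedCores:
    """Toy model of one instance's LRU of loaded cores (not Solr's own code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._cores = OrderedDict()

    def touch(self, core):
        """Record a query on `core`; return the core evicted to make room, if any."""
        if core in self._cores:
            # Already loaded: mark it as most recently used.
            self._cores.move_to_end(core)
            return None
        self._cores[core] = True
        if len(self._cores) > self.capacity:
            # Over capacity: drop the least recently used core.
            evicted, _ = self._cores.popitem(last=False)
            return evicted
        return None

    def loaded(self):
        """Return loaded cores, least recently used first."""
        return list(self._cores)
```

Because each instance keeps its own LRU, a tenant that is idle on one
instance is not evicted to make room for a busy tenant on another, which is
the loss of effectiveness mentioned above. Note also that simple modulo
hashing remaps most tenants whenever the number of instances changes.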

Thanks,

Regards,

Aurelien



On 23/07/2014 17:21, Erick Erickson wrote:
> Do note that the lots of cores stuff does NOT play nice with
> distributed mode (yet).
>
> Best,
> Erick


Re: Passivate core in Solr Cloud

Posted by Erick Erickson <er...@gmail.com>.
Do note that the lots of cores stuff does NOT play nice with
distributed mode (yet).

Best,
Erick


On Wed, Jul 23, 2014 at 6:00 AM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> Solr has some support for a large number of cores, including transient
> cores: http://wiki.apache.org/solr/LotsOfCores
>
> Regards,
>    Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

Re: Passivate core in Solr Cloud

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Solr has some support for a large number of cores, including transient
cores: http://wiki.apache.org/solr/LotsOfCores
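
As a rough illustration, a transient core is marked as such through its
core.properties file, using the transient and loadOnStartup properties
described on that page; the number of transient cores kept loaded is capped
by transientCacheSize in solr.xml. The Python helper below is a hypothetical
sketch that only renders such a file as text; the function name and the core
name are assumptions for illustration.

```python
def transient_core_properties(core_name):
    """Render core.properties text for a lazily loaded, unloadable core."""
    props = {
        "name": core_name,
        "loadOnStartup": "false",  # do not open the core when Solr starts
        "transient": "true",       # core may be unloaded when the transient cache is full
    }
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Example: contents for a hypothetical per-customer core.
print(transient_core_properties("customer42"), end="")
```

With settings like these, a core should only be loaded on first use and
becomes a candidate for unloading once more transient cores are open than the
configured cache size allows.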

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Wed, Jul 23, 2014 at 7:55 PM, Aurélien MAZOYER
<au...@francelabs.com> wrote:
> Hello,
>
> We want to set up a Solr Cloud cluster to handle a high volume
> of documents with a multi-tenant architecture. The problem is that
> application-level isolation of tenants (using a shared index with a
> "customer" field) does not meet our requirements. As a result, we
> need one collection per customer. There are more than a thousand
> customers, and it seems unreasonable to create thousands of collections
> in Solr Cloud... But since we know that there is less than one query per
> customer per day, we are looking for a way to passivate collections when
> they are not in use. Is this a good idea? If so, are there best practices
> for implementing it? What side effects can we expect? Do we need to put
> some application-level logic on top of the Solr Cloud cluster to choose
> which collections to unload? And is there perhaps something smarter (and
> quicker) than simply loading and unloading a core when it is not in use?
>
>
> Thank you for your answer(s),
>
> Aurelien
>