Posted to solr-user@lucene.apache.org by adfel70 <ad...@gmail.com> on 2013/11/27 15:42:30 UTC

solr as a service for multiple projects in the same environment

Hi,
I have various Solr-related projects in a single environment.
These projects are not related to one another.

I'm thinking of building a Solr architecture in which all the projects
use different Solr collections in the same cluster, as opposed to having a
separate Solr cluster for each project.

1. As I understand it, I can separate the configs of each collection in
ZooKeeper. Is that correct?
2. Are there any Solr operations that can be performed on collection A and
somehow affect collection B?
3. Is the Solr cache separate for each collection?
4. I assume that I'll encounter a problem with the OS cache when the
different indexes compete for the same memory, right? How severe is this
issue?
5. Any other advice on building such an architecture? Does the maintenance
overhead of maintaining multiple clusters in production really outweigh the
problems and risks of using the same cluster for multiple systems?

Thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/solr-as-a-service-for-multiple-projects-in-the-same-environment-tp4103523.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr as a service for multiple projects in the same environment

Posted by "Ing. Jorge Luis Betancourt Gonzalez" <jl...@uci.cu>.
I think that some experience in this area could be provided by Trey Grainger, author of Solr in Action. I believe that some of his work at CareerBuilder involved creating something somewhat similar to what you're trying to accomplish. I must say that I'm also interested in this topic, but I haven't had the time to really do anything about it.

----- Original Message -----
From: "adfel70" <ad...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Sunday, December 1, 2013 2:41:00
Subject: Re: solr as a service for multiple projects in the same environment

The risk is that if you mess up a cluster by mistake while doing maintenance on
one of the systems, you can affect the other system.
It's a pretty amorphous risk.
Aside from having multiple systems share the same hardware resources, I
don't see any other real risk.

Do your collections share the same topology in terms of shards and
replicas?
Do you manually configure the nodes on which each collection is created so
that you'll still have some level of separation between the systems?




________________________________________________________________________________________________
III International Winter School at UCI, February 17 to 28, 2014. See www.uci.cu

Re: solr as a service for multiple projects in the same environment

Posted by adfel70 <ad...@gmail.com>.
The risk is that if you mess up a cluster by mistake while doing maintenance on
one of the systems, you can affect the other system.
It's a pretty amorphous risk.
Aside from having multiple systems share the same hardware resources, I
don't see any other real risk.

Do your collections share the same topology in terms of shards and
replicas?
Do you manually configure the nodes on which each collection is created so
that you'll still have some level of separation between the systems?









--
View this message in context: http://lucene.472066.n3.nabble.com/solr-as-a-service-for-multiple-projects-in-the-same-environment-tp4103523p4104206.html

Re: solr as a service for multiple projects in the same environment

Posted by "michael.boom" <my...@yahoo.com>.
Hi,

There's nothing unusual in what you are trying to do; this scenario is very
common.

To answer your questions:
> 1. as I understand I can separate the configs of each collection in
> zookeeper. is it correct? 
Yes, that's correct. You'll have to upload your configs to ZK and use the
Collections API to create your collections.
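
As a rough sketch of the step described above: once a config set has been uploaded to ZooKeeper under some name, each collection can be created against its own config through the Collections API CREATE action. The host, collection names, config names, and shard/replica counts below are hypothetical, and the Python code only builds the request URLs; it does not contact a server.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name, num_shards, replication_factor):
    """Build a Collections API CREATE URL that binds a collection
    to a named config set already stored in ZooKeeper."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# Hypothetical projects: (collection name, config name in ZK, shards, replicas).
projects = [
    ("project_a", "project_a_conf", 2, 2),
    ("project_b", "project_b_conf", 1, 2),
]

urls = [create_collection_url("http://localhost:8983/solr", *p) for p in projects]
for url in urls:
    print(url)
```

Each project gets its own collection and its own config name, so a schema or solrconfig.xml change for one project does not touch the other's configuration.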

> 2. are there any solr operations that can be performed on collection A and
> somehow affect collection B?
No, I can't think of any cross-collection operation. Here you can find a
list of collection-related operations:
https://cwiki.apache.org/confluence/display/solr/Collections+API

>3. is the solr cache separated for each collection? 
Yes, separate and configurable in solrconfig.xml for each collection.
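
To illustrate the point about per-collection caches, here is a small sketch that reads the cache sizes out of two solrconfig.xml fragments. The fragments, project names, and size values are made up for illustration; the element names (filterCache, queryResultCache, documentCache) are the standard cache entries in the query section of solrconfig.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical <query> sections from two collections' solrconfig.xml files.
PROJECT_A_CONFIG = """
<config>
  <query>
    <filterCache class="solr.FastLRUCache" size="4096" initialSize="1024" autowarmCount="256"/>
    <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  </query>
</config>
"""

PROJECT_B_CONFIG = """
<config>
  <query>
    <filterCache class="solr.FastLRUCache" size="256" initialSize="256" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache" size="128" initialSize="128" autowarmCount="0"/>
  </query>
</config>
"""

CACHE_TAGS = ("filterCache", "queryResultCache", "documentCache")


def cache_sizes(solrconfig_xml):
    """Return {cache name: configured size} for the caches present in the fragment."""
    root = ET.fromstring(solrconfig_xml.strip())
    sizes = {}
    for tag in CACHE_TAGS:
        element = root.find("./query/" + tag)
        if element is not None:
            sizes[tag] = int(element.get("size"))
    return sizes


for name, xml_text in (("project_a", PROJECT_A_CONFIG), ("project_b", PROJECT_B_CONFIG)):
    print(name, cache_sizes(xml_text))
```

Because each collection has its own solrconfig.xml, a heavily filtered project can be given a large filterCache while a small project keeps modest settings. The caches still live in the JVM heap of whichever nodes host the collections.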

> 4. I assume that I'll encounter a problem with the os cache, when the
> different indices will compete on the same memory, right? how severe is this
> issue?
Hardware can be a bottleneck. If all your collections will face the same load,
you should try to give Solr an amount of RAM equal to the total index size (all
indexes combined).
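
The sizing rule of thumb above can be sketched as simple arithmetic. The index sizes, RAM figure, and heap size below are invented, and subtracting the JVM heap from total RAM is an assumption of this sketch (the OS cache only gets memory that the Solr JVM is not using), not something stated in the thread.

```python
def page_cache_shortfall_gb(index_sizes_gb, total_ram_gb, jvm_heap_gb):
    """Return how many GB of index would not fit in the memory left
    for the OS cache (0.0 if all indexes fit)."""
    total_index_gb = sum(index_sizes_gb.values())
    # Assumption: memory outside the JVM heap is what the OS can use for caching.
    available_for_cache_gb = total_ram_gb - jvm_heap_gb
    return max(0.0, total_index_gb - available_for_cache_gb)


# Hypothetical per-collection index sizes on one node, in GB.
index_sizes_gb = {"project_a": 40.0, "project_b": 15.0, "project_c": 5.0}

shortfall = page_cache_shortfall_gb(index_sizes_gb, total_ram_gb=64.0, jvm_heap_gb=8.0)
print("Total index size: %.1f GB" % sum(index_sizes_gb.values()))
print("Shortfall: %.1f GB" % shortfall)
```

A non-zero shortfall does not mean the cluster will fail, only that the collections will compete for cached index pages; how much that hurts depends on which parts of each index are actually queried.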

> 5. any other advice on building such an architecture? does the maintenance
> overhead of maintaining multiple clusters in production really overwhelm the
> problems and risks of using the same cluster for multiple systems?
I was in the same situation as you, and putting everything in multiple
collections in just one cluster made sense for me: it's easier to manage
and has no obvious downside. As for the "risks of using the same cluster for
multiple systems", they are pretty much the same in both scenarios, except
that with multiple clusters you'll have many more machines to manage.



-----
Thanks,
Michael
--
View this message in context: http://lucene.472066.n3.nabble.com/solr-as-a-service-for-multiple-projects-in-the-same-environment-tp4103523p4103537.html