Posted to solr-user@lucene.apache.org by Tod <li...@gmail.com> on 2011/06/30 18:45:53 UTC
multiple webapps vs multi-core vs distributed

Currently I'm working with a group implementing Solr on an enterprise 
level.  Their initial toe dipping into Solr consists of running multiple 
(two) webapps on Tomcat using identical schemas.

Content is dispersed among a variety of repositories, from CMS, DMS, and WCMS 
to file systems and RDBMSs.  The expectation is that this implementation 
is going to get very popular very quickly.  With that in mind, there is 
also a very large, very diverse set of business groups spanning the 
entire organization all of which want to participate.

Their interest is mostly in marketing their wares, not in making 
sure a unified enterprise taxonomy exists that can ultimately facilitate 
search relevancy at an enterprise level.  A unified taxonomy therefore 
most likely can't be completed within the time frame in which the 
customer wants to have the search up and running.

So it's up to us to figure out how to satisfy the immediate needs of each 
individual business entity, without the benefit of a unified 
enterprise-wide taxonomy, and knowing in advance that each unit's search 
index is likely to be based on a different schema, depending on its 
individual business drivers.

At an enterprise level, users should be able to search the entire set of 
individual indexes and get back a merged result.  The goal is a high 
level of relevancy both for individual business groups and for the 
enterprise audience, internal and external.
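
To make the merged-result requirement concrete, here is a rough sketch of 
what a client-side merge might look like if each group keeps its own 
schema.  Everything in it is an assumption for illustration: the field 
names, the per-index field maps, and especially the score handling. 
Dividing by each index's top score is a crude stand-in, since raw scores 
from different indexes are not directly comparable, and it is not 
something Solr does for you.

```python
from typing import Dict, List


def normalize(hits: List[Dict], field_map: Dict[str, str], source: str) -> List[Dict]:
    """Map one index's hits onto a shared shape and rescale scores to 0..1.

    field_map is hypothetical: common field name -> that index's own field name.
    """
    if not hits:
        return []
    # Divide by the best score in this result set (fall back to 1.0 if it is 0).
    top = max(h["score"] for h in hits) or 1.0
    out = []
    for h in hits:
        doc = {common: h.get(local) for common, local in field_map.items()}
        doc["source"] = source            # remember which index the hit came from
        doc["score"] = h["score"] / top   # crude cross-index normalization
        out.append(doc)
    return out


def merge(result_sets: List[List[Dict]], rows: int = 10) -> List[Dict]:
    """Interleave already-normalized result sets by descending score."""
    merged = [doc for rs in result_sets for doc in rs]
    merged.sort(key=lambda d: d["score"], reverse=True)
    return merged[:rows]
```

Each index's hits would be passed through normalize() with that group's 
own field map and the outputs handed to merge().  Whether relevancy 
survives this kind of rescaling is exactly the part I'm unsure about.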

From what I've been reading, I think the current configuration may not 
stand up to the long-term demand from either a usability or an 
administrative standpoint, but I'm not completely sure.  That leaves 
multi-core and distributed search as possibilities.

I'm leaning towards multi-core.  Part of this decision is based on the 
performance and administrative gains I expect over the current 
configuration.  Distributed search is a possibility, but in the short to 
medium term I don't see the number of indexed documents increasing to a 
size that would require it.  Plus, I think the lack of a unified schema 
might throw a monkey wrench into the mix, limiting the available solutions.
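
For reference, this is roughly how I picture a distributed query across 
several cores, using Solr's `shards` request parameter.  The host, port, 
and core names below are made up.  As I understand it, this only works 
if every core shares the same unique key field and the fields being 
queried and returned exist in all of them, which is where the differing 
schemas could hurt.

```python
from typing import List
from urllib.parse import urlencode


def distributed_query_url(base_core_url: str, shard_cores: List[str],
                          query: str, rows: int = 10) -> str:
    """Build a /select URL that fans one query out over several cores.

    shard_cores entries are host:port/path strings without a scheme,
    which is the form the shards parameter takes.
    """
    params = {
        "q": query,
        "rows": rows,
        # Assumes every core has fields named "id" and "score" is requested;
        # "id" as the shared unique key is an assumption, not a given here.
        "fl": "id,score",
        "shards": ",".join(shard_cores),
    }
    return base_core_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical cores, one per business group, on a single Tomcat instance.
url = distributed_query_url(
    "http://localhost:8080/solr/hr",
    ["localhost:8080/solr/hr", "localhost:8080/solr/sales"],
    "benefits policy",
)
```

The core that receives the request does the merging, so the client sees 
a single result list.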

Does anyone with similar experience want to share it? 
It's early enough in the project life cycle that alternative ideas can be 
considered.  I'd be interested to hear others' opinions.


TIA - Tod