Posted to solr-user@lucene.apache.org by caman <AB...@gmail.com> on 2010/03/03 20:08:59 UTC

SOLR Index or database

Hello All, 

I am trying to decide whether SOLR or a database would be the better option
for me. Here are my requirements.
We index about 600+ news/blog sources into our system. The only information we
store locally is the title, link and article snippet. We are able to index all
these sources into a SOLR index and it works perfectly.
This is where it gets tricky:
We need to store certain meta information as well, e.g.
1. Rating/popularity of each article
2. Sharing of articles between users
3. How many times each article is viewed
4. Comments on each article

So far, we plan to store the meta-information in the database and link
this data with a document in the index. When a user opens the page,
results are combined from the index and the database to render the view.
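
The combination step described above could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the
article_meta table, its columns, and the shape of the hit dictionaries are
hypothetical, and an in-memory SQLite database stands in for whatever
database is actually used. The search hits are hard-coded where a real
system would take them from the SOLR response.

```python
import sqlite3

def fetch_metadata(conn, doc_ids):
    """Return {doc_id: metadata} for the given ids using a single query."""
    if not doc_ids:
        return {}
    placeholders = ",".join("?" for _ in doc_ids)
    rows = conn.execute(
        "SELECT doc_id, rating, views FROM article_meta "
        f"WHERE doc_id IN ({placeholders})",
        list(doc_ids),
    ).fetchall()
    return {doc_id: {"rating": rating, "views": views}
            for doc_id, rating, views in rows}

def merge_results(hits, metadata):
    """Attach metadata to each hit, keeping the search ranking order."""
    default = {"rating": None, "views": 0}  # articles with no metadata yet
    return [{**hit, **metadata.get(hit["id"], default)} for hit in hits]

# Hypothetical metadata store (in-memory SQLite as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article_meta "
    "(doc_id TEXT PRIMARY KEY, rating REAL, views INTEGER)"
)
conn.executemany(
    "INSERT INTO article_meta VALUES (?, ?, ?)",
    [("a1", 4.5, 120), ("a2", 3.0, 7)],
)

# Hypothetical search hits, already in ranked order.
hits = [
    {"id": "a2", "title": "Second"},
    {"id": "a3", "title": "Third"},
    {"id": "a1", "title": "First"},
]
page = merge_results(hits, fetch_metadata(conn, [h["id"] for h in hits]))
```

Fetching the metadata for a whole result page in one IN query keeps the
database round trips at one per page rather than one per article.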

Any reservations about the above architecture?
Is SOLR the right fit in this case? We do need full-text search, so SOLR is
a no-brainer IMHO, but I would love to hear the community's view.

Any feedback appreciated.

thanks






RE: SOLR Index or database

Posted by Dallan Quass <da...@quass.org>.
FWIW, I just implemented a system that stores the index in SOLR but the
records in a partitioned set of MySQL databases.  The only stored field in
SOLR is an ID field, which is the key to a table in the MySQL database.  I
had to modify SOLR a tiny bit and write a "database" search component so
that search results are read from the database instead of the SOLR index
partitions, but it works really well. 

The system indexes around 750M records partitioned across 10 SOLR servers
and 4 MySQL servers.  Storing the records in MySQL kept the indexes small
enough to be cached entirely in memory.  The MySQL databases require one
disk IO for each record displayed in the search results.
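
A rough sketch of the read path described above, where the index returns
only IDs and each record is then fetched from a partitioned database. This
is not the actual search component: the modulo routing rule, the records
table and the function names are assumptions for illustration (the post
does not say how the records are partitioned), and in-memory SQLite
databases stand in for the MySQL servers.

```python
import sqlite3

NUM_SHARDS = 4  # the post mentions 4 MySQL servers

def shard_for(record_id):
    # Assumption: simple modulo routing on an integer id.
    return record_id % NUM_SHARDS

def make_shards():
    """Create one in-memory database per shard (stand-ins for MySQL)."""
    shards = []
    for _ in range(NUM_SHARDS):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT)"
        )
        shards.append(conn)
    return shards

def store(shards, record_id, body):
    shards[shard_for(record_id)].execute(
        "INSERT INTO records VALUES (?, ?)", (record_id, body)
    )

def load_results(shards, ranked_ids):
    """One primary-key lookup per id, returned in the ranked order."""
    results = []
    for record_id in ranked_ids:
        row = shards[shard_for(record_id)].execute(
            "SELECT body FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        if row is not None:
            results.append((record_id, row[0]))
    return results

shards = make_shards()
for rid in range(1, 11):
    store(shards, rid, f"record {rid}")

ranked_ids = [7, 2, 10]  # stand-in for the ids returned by the index
results = load_results(shards, ranked_ids)
```

Each displayed record costs one primary-key lookup, which matches the one
disk IO per record mentioned above.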

--dallan

> -----Original Message-----
> From: Walter Underwood [mailto:wunder@wunderwood.org] 
> Sent: Wednesday, March 03, 2010 1:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR Index or database
> 
> You need two, maybe three things that Solr doesn't do (or 
> doesn't do well):
> 
> * field updating
> * storing content
> * real time search and/or simple transactions
> 
> I would seriously look at Mark Logic for that. It does all of 
> those, plus full-text search, gracefully, plus it scales. 
> There is also a version for Amazon EC2.  www.marklogic.com
> 
> Note: I work at Mark Logic, but I chose Solr for Netflix when 
> I worked there.
> 
> wunder
> 


Re: SOLR Index or database

Posted by Walter Underwood <wu...@wunderwood.org>.
You need two, maybe three things that Solr doesn't do (or doesn't do well):

* field updating
* storing content
* real time search and/or simple transactions

I would seriously look at Mark Logic for that. It does all of those, plus full-text search, gracefully, plus it scales. There is also a version for Amazon EC2.  www.marklogic.com

Note: I work at Mark Logic, but I chose Solr for Netflix when I worked there.

wunder
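
For comparison, the field-updating and simple-transaction items in the list
above are the kind of work a relational database handles with single
statements. The sketch below is only an illustration under assumptions: the
table and function names are hypothetical, and in-memory SQLite stands in
for the real database. It shows a view counter incremented in place and a
comment added, each inside its own transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article_meta "
    "(doc_id TEXT PRIMARY KEY, views INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE comments "
    "(doc_id TEXT NOT NULL, author TEXT NOT NULL, body TEXT NOT NULL)"
)
conn.execute("INSERT INTO article_meta (doc_id) VALUES ('a1')")
conn.commit()

def record_view(conn, doc_id):
    """Increment one field in place; the rest of the row is untouched."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE article_meta SET views = views + 1 WHERE doc_id = ?",
            (doc_id,),
        )

def add_comment(conn, doc_id, author, body):
    """Insert one comment row inside its own transaction."""
    with conn:
        conn.execute(
            "INSERT INTO comments VALUES (?, ?, ?)", (doc_id, author, body)
        )

record_view(conn, "a1")
record_view(conn, "a1")
add_comment(conn, "a1", "reader", "Nice article")

views = conn.execute(
    "SELECT views FROM article_meta WHERE doc_id = 'a1'"
).fetchone()[0]
comment_count = conn.execute(
    "SELECT COUNT(*) FROM comments WHERE doc_id = 'a1'"
).fetchone()[0]
```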
