Posted to solr-user@lucene.apache.org by pistacchio <pi...@gmail.com> on 2012/02/06 08:13:20 UTC

Searching context within a book

I'm very new to Solr and I'm evaluating it. My task is to look for words
within a corpus of books and return them within a small context. So far, I'm
storing the books in a database, split into paragraphs (slicing the books at
line breaks); I run a full-text search and return the matching row.

In Solr, would I have to do the same, or can I add the whole book (in .txt
format) and, whenever a match is found, return something like the match plus
100 words before and 100 words after or something like that? Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-context-within-a-book-tp3718997p3718997.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Searching context within a book

Posted by Robert Stewart <bs...@gmail.com>.
You are probably better off splitting each book into separate Solr documents, one document per paragraph (each document carrying the same book ID, ISBN, etc.). Then you can use field collapsing on the book ID to return a single document per book, and highlighting to show the paragraph that matched the query.
You will need to store the full text in Solr (as a stored field) in order to use the highlighting feature and/or to return the text in the search results.
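
As a rough sketch of that approach (not a tested setup), the Python below splits a book into one document per paragraph and builds the query string for a grouped, highlighted search. The field names `book_id`, `paragraph_no` and `text`, the ID scheme, and the helper names are assumptions made for illustration; `group`, `group.field`, `hl`, `hl.fl` and `hl.fragsize` are standard Solr request parameters. Note that `hl.fragsize` is measured in characters, so "100 words before and after" can only be approximated.

```python
from urllib.parse import urlencode


def book_to_docs(book_id, title, text):
    """Split a book's plain text into one Solr document per paragraph.

    Paragraphs are taken to be non-empty lines, matching the
    "slice by line breaks" approach described in the question.
    """
    paragraphs = [line.strip() for line in text.splitlines() if line.strip()]
    return [
        {
            "id": f"{book_id}-{n}",   # hypothetical unique key per paragraph
            "book_id": book_id,       # shared by all paragraphs of one book
            "title": title,
            "paragraph_no": n,
            "text": paragraph,        # must be a stored field for highlighting
        }
        for n, paragraph in enumerate(paragraphs, start=1)
    ]


def build_query(words, fragsize=600):
    """Build the query string for a grouped, highlighted search."""
    params = {
        "q": words,
        "df": "text",             # default search field (assumed name)
        "group": "true",          # field collapsing / result grouping
        "group.field": "book_id", # one group per book
        "hl": "true",             # enable highlighting
        "hl.fl": "text",          # field to build snippets from
        "hl.fragsize": fragsize,  # snippet size in characters, not words
        "wt": "json",
    }
    return urlencode(params)


docs = book_to_docs("moby-dick", "Moby Dick", "Call me Ishmael.\n\nSome years ago...\n")
query_string = build_query("Ishmael")
```

The `docs` list would be posted to the update handler of your Solr core, and `query_string` appended to its select handler URL; both depend on how the instance is set up.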

