Posted to solr-user@lucene.apache.org by Ycrux <yc...@club-internet.fr> on 2008/01/14 20:45:09 UTC

Text Summarizer

Hi!

I'm looking for a good "text summarizer"
for my personal search engine based on Solr.

Currently, I'm using "ots" (Open Text Summarizer), but the results
are far from perfect.

Here's an example of usage:
$ elinks "http://lucene.apache.org/solr/" -force-html -no-numbering \
-no-references  2>/dev/null | ots -r 40 | less -S

The result is OK for this site, but I would like to obtain something
similar to a Google "text snippet" (a real excerpt from the page).
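
To make concrete what I mean by an excerpt, here is a rough sketch of a
keyword-in-context filter. It is only an illustration under some assumptions:
"snippet" is a hypothetical helper name (not part of ots or elinks), the
search term is treated as a regular expression, and it relies on "grep -o",
which GNU and BSD grep provide but POSIX does not require.

```shell
# snippet TERM [WIDTH]: read text on stdin and print the first excerpt
# that contains TERM, with up to WIDTH characters of context on each side.
# Hypothetical helper for illustration; TERM is used as a regex as-is.
snippet() {
    term=$1
    width=${2:-60}
    # Collapse newlines and runs of whitespace into single spaces so the
    # excerpt can span the original line breaks, then keep only the first
    # case-insensitive match together with its surrounding context.
    tr -s '[:space:]' ' ' \
        | grep -o -i -E ".{0,${width}}${term}.{0,${width}}" \
        | head -n 1
}

# Possible usage with the same elinks pipeline as above (not run here):
# elinks "http://lucene.apache.org/solr/" -force-html -no-numbering \
#     -no-references 2>/dev/null | snippet "search server" 80
```

Unlike ots, this depends on the search term, which is the behaviour I am
after; it does no ranking and simply returns the first match.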

Any advice is welcome.

N.B.: All the HTML pages I'm indexing are converted to text with "elinks"
(the text browser), as in the previous example.

Thanks in advance.

cheers
Younès