Posted to java-user@lucene.apache.org by Doug Cutting <cu...@apache.org> on 2004/07/01 21:24:11 UTC
Re: Making a case for Lucene
> The best example that I've been able to find is the Yahoo research
> lab - as I understand it, this is a Nutch (i.e. Lucene)
> implementation that's providing impressive performance over a
> 100 million document repository.
This demo runs on a handful of boxes. It was originally running on
three dual-processor boxes, but I think Yahoo! subsequently moved it to
six or eight single-processor boxes. Queries are broadcast to all
servers, and the top-scoring matches overall are presented.
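A minimal sketch of that broadcast-and-merge step, assuming each server returns its own best matches and that scores are comparable across servers. This is not the Nutch code: `BroadcastMerge`, `Hit`, and `mergeTopK` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class BroadcastMerge {
    /** One match as reported by a single search server (hypothetical type). */
    public static final class Hit {
        public final String server;
        public final int doc;
        public final float score;

        public Hit(String server, int doc, float score) {
            this.server = server;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Returns the k highest-scoring hits across all per-server lists, best first. */
    public static List<Hit> mergeTopK(List<List<Hit>> perServer, int k) {
        List<Hit> merged = new ArrayList<>();
        if (k <= 0) {
            return merged;
        }
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap on score: the weakest of the current top k sits at the head.
        PriorityQueue<Hit> heap = new PriorityQueue<>(k, byScore);
        for (List<Hit> hits : perServer) {
            for (Hit h : hits) {
                if (heap.size() < k) {
                    heap.add(h);
                } else if (h.score > heap.peek().score) {
                    heap.poll();   // drop the weakest hit
                    heap.add(h);   // keep the stronger one
                }
            }
        }
        merged.addAll(heap);
        merged.sort(byScore.reversed());
        return merged;
    }
}
```

Each server only needs to return its own top k, since no hit outside a server's local top k can appear in the overall top k.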
In Nutch-based benchmarks, we found that a single-processor box with 4GB
of memory and a 2M page Nutch index (i.e., the entire index fits in RAM)
could handle over 20 Nutch searches/second. A box with 1GB of memory
and a 20M page Nutch index (i.e., the entire index does not fit in
memory) could only handle around 1 or 2 Nutch searches/second. These
were done with Lucene 1.3. Lucene 1.4 should be somewhat faster.
Performance will obviously vary with processor speed, disk speed,
average document size, average number of terms per query, etc.
Doug
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Visualization of Lucene search results with a treemap
Posted by Stefan Groschupf <sg...@media-style.com>.
>
>> Do you know:
>> http://websom.hut.fi/websom/comp.ai.neural-nets-new/html/root.html ?
>
> Interesting - is there any code avail to draw the maps?
The algorithm is described here:
http://www.cis.hut.fi/research/som-research/book/
A short summary and some sample code is available here:
http://davis.wpi.edu/~matt/courses/soms/
Some more interesting papers about visualization are available at the
text-mining.org community page:
http://www.text-mining.org/index.jsp?folderPK=793
Happy hacking! :-)
Stefan
---------------------------------------------------------------
enterprise information technology consulting
open technology: http://www.media-style.com
open source: http://www.weta-group.net
open discussion: http://www.text-mining.org
Re: Visualization of Lucene search results with a treemap
Posted by David Spencer <da...@tropo.com>.
Stefan Groschupf wrote:
> Dave,
> cool stuff, think about contributing that to Nutch... ;-)
Well the code is very generic - basically 1 method that takes a
Searcher, a Query, the # of cells to show, and the size of the diagram.
Technically I think it would be a Lucene sandbox contribution - but -
for my site I do want to convert the custom spider/cache to use Nutch...
> Do you know:
> http://websom.hut.fi/websom/comp.ai.neural-nets-new/html/root.html ?
Interesting - is there any code avail to draw the maps?
thx,
Dave
>
> Cheers,
> Stefan
>
> On 01.07.2004 at 23:28, David Spencer wrote:
>
>>
>> Inspired by these guys who put results from Google into a treemap...
>> http://google.hivegroup.com/
>>
>> I did up my own version running against my index of OSS/javadoc trees.
>> This query for "thread pool" shows it off nicely:
>>
>> http://www.searchmorph.com/kat/tsearch.jsp?s=thread%20pool&side=300&goal=500
>>
>> This is the empty search form:
>>
>> http://www.searchmorph.com/kat/tsearch.jsp
>>
>> And the weblog entry has a few more links, esp useful if you don't
>> know what a treemap is:
>>
>> http://searchmorph.com/weblog/index.php?id=18
>>
>> Oh: As a start, a treemap is a visualization technique, not
>> java.util.TreeMap. Bigger boxes show a higher score, and x,y location
>> has no significance.
>>
>> Enjoy,
>> Dave
Re: Visualization of Lucene search results with a treemap
Posted by Stefan Groschupf <sg...@media-style.com>.
Dave,
cool stuff, think about contributing that to Nutch... ;-)
Do you know:
http://websom.hut.fi/websom/comp.ai.neural-nets-new/html/root.html ?
Cheers,
Stefan
On 01.07.2004 at 23:28, David Spencer wrote:
>
> Inspired by these guys who put results from Google into a treemap...
> http://google.hivegroup.com/
>
> I did up my own version running against my index of OSS/javadoc trees.
> This query for "thread pool" shows it off nicely:
>
> http://www.searchmorph.com/kat/tsearch.jsp?s=thread%20pool&side=300&goal=500
>
> This is the empty search form:
>
> http://www.searchmorph.com/kat/tsearch.jsp
>
> And the weblog entry has a few more links, esp useful if you don't
> know what a treemap is:
>
> http://searchmorph.com/weblog/index.php?id=18
>
> Oh: As a start, a treemap is a visualization technique, not
> java.util.TreeMap. Bigger boxes show a higher score, and x,y location
> has no significance.
>
> Enjoy,
> Dave
Visualization of Lucene search results with a treemap
Posted by David Spencer <da...@tropo.com>.
Inspired by these guys who put results from Google into a treemap...
http://google.hivegroup.com/
I did up my own version running against my index of OSS/javadoc trees.
This query for "thread pool" shows it off nicely:
http://www.searchmorph.com/kat/tsearch.jsp?s=thread%20pool&side=300&goal=500
This is the empty search form:
http://www.searchmorph.com/kat/tsearch.jsp
And the weblog entry has a few more links, esp useful if you don't know
what a treemap is:
http://searchmorph.com/weblog/index.php?id=18
Oh: As a start, a treemap is a visualization technique, not
java.util.TreeMap. Bigger boxes show a higher score, and x,y location
has no significance.
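To make the "bigger box, higher score" idea concrete, here is a rough sketch of a simple slice-and-dice layout in which each cell's area is proportional to its score. It is not the code behind the demo above, and it assumes non-negative scores. `TreemapSketch`, `Cell`, and `layout` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class TreemapSketch {
    /** One rectangle of the diagram, tagged with the score it represents. */
    public static final class Cell {
        public final double x, y, w, h;
        public final float score;

        public Cell(double x, double y, double w, double h, float score) {
            this.x = x;
            this.y = y;
            this.w = w;
            this.h = h;
            this.score = score;
        }

        public double area() {
            return w * h;
        }
    }

    /**
     * Splits a width x height box into one cell per score.
     * Each step cuts a strip off the longer side of the remaining space,
     * sized so that the cell's area is proportional to its score.
     */
    public static List<Cell> layout(float[] scores, double width, double height) {
        List<Cell> cells = new ArrayList<>();
        double remaining = 0;
        for (float s : scores) {
            remaining += s;
        }
        double x = 0, y = 0, w = width, h = height;
        for (float s : scores) {
            if (remaining <= 0) {
                break; // nothing left to distribute
            }
            double frac = s / remaining; // share of the space still free
            if (w >= h) {
                double cw = w * frac;    // vertical strip on the left
                cells.add(new Cell(x, y, cw, h, s));
                x += cw;
                w -= cw;
            } else {
                double ch = h * frac;    // horizontal strip on top
                cells.add(new Cell(x, y, w, ch, s));
                y += ch;
                h -= ch;
            }
            remaining -= s;
        }
        return cells;
    }
}
```

Only the areas carry meaning here. Where a cell ends up depends on the order of the scores, which matches the point that x,y location has no significance.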
Enjoy,
Dave