Posted to java-user@lucene.apache.org by "Kevin A. Burton" <bu...@newsmonster.org> on 2004/04/01 04:43:21 UTC

Re: Performance of hit highlighting and finding term positions for

Doug Cutting wrote:

> http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=1413989 
>
>
> According to these, if your documents average 16k, then a 10-hit 
> result page would require just 66ms to generate highlights using 
> SimpleAnalyzer.

The whole search takes only 300ms... this means that if I highlight 5 
docs I've doubled my search time.

Note that Google has a whole subsection of their cluster dedicated to 
keyword in context extraction.

Kevin

-- 

Please reply using PGP.

    http://peerfear.org/pubkey.asc    
    
    NewsMonster - http://www.newsmonster.org/
    
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
       AIM/YIM - sfburtonator,  Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
  IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster


Re: Performance of hit highlighting and finding term positions for

Posted by Doug Cutting <cu...@apache.org>.
Kevin A. Burton wrote:
> Doug Cutting wrote:
> 
>> http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=1413989 
>>
>> According to these, if your documents average 16k, then a 10-hit 
>> result page would require just 66ms to generate highlights using 
>> SimpleAnalyzer.
> 
> The whole search takes only 300ms... this means that if I highlight 5 
> docs I've doubled my search time.

My math was wrong, but yours seems even more so!  I meant 110ms to 
highlight ten docs.  If you only highlight 5, then it's 55ms.  If your 
query is taking 300ms, then this adds less than 20%.
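
The arithmetic above can be checked with a short sketch. It assumes only the
figures quoted in this thread (110ms to highlight ten 16k documents, a 300ms
search) and that highlighting cost scales linearly with the number of
documents. The class and method names are hypothetical, chosen for
illustration; this is not part of Lucene.

```java
public class HighlightCost {
    // Per-document cost implied by the thread: 110 ms for ten 16k documents.
    // Assumption: cost grows linearly with the number of highlighted documents.
    static final double MS_PER_DOC = 110.0 / 10;

    // Estimated time in milliseconds to highlight the given number of documents.
    static double highlightMs(int docs) {
        return docs * MS_PER_DOC;
    }

    // Highlighting time as a percentage of the search time it is added to.
    static double overheadPercent(int docs, double searchMs) {
        return 100.0 * highlightMs(docs) / searchMs;
    }

    public static void main(String[] args) {
        double searchMs = 300.0; // search time quoted in the thread
        for (int docs : new int[] {5, 10}) {
            System.out.println(docs + " docs: " + highlightMs(docs)
                    + " ms highlighting, " + overheadPercent(docs, searchMs)
                    + "% on top of a " + searchMs + " ms search");
        }
    }
}
```

Under these assumptions, five documents cost 55ms, which is a little over 18%
of a 300ms search, so the search time is far from doubled.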

> Note that Google has a whole subsection of their cluster dedicated to 
> keyword in context extraction.

I think that's for i/o reasons, not that it requires a lot of 
computation.

Doug
