Posted to dev@lucene.apache.org by "Richard Marr (JIRA)" <ji...@apache.org> on 2009/07/28 17:02:14 UTC

[jira] Updated: (LUCENE-1690) Morelikethis queries are very slow compared to other search types

     [ https://issues.apache.org/jira/browse/LUCENE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Marr updated LUCENE-1690:
---------------------------------

    Attachment: LruCache.patch

Attached is a draft implementation that uses a WeakHashMap to tie each cache to its IndexReader instance, and a LinkedHashMap to provide LRU eviction.
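
For readers without the patch to hand, the general shape of that combination might look roughly like the sketch below. This is not the contents of LruCache.patch: the class name ReaderBoundLruCache, its methods, and the generic parameter R (standing in for IndexReader so the example compiles without Lucene) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch: one bounded LRU map per reader, held through weak keys.
// R stands in for IndexReader; K and V would be the term and its cached statistic.
class ReaderBoundLruCache<R, K, V> {
    private final int maxEntries;

    // Weak keys: once a reader is no longer strongly referenced elsewhere,
    // its entry (and the LRU map attached to it) becomes eligible for collection.
    // Cached values must not hold a strong reference back to the reader.
    private final Map<R, Map<K, V>> cachesByReader = new WeakHashMap<R, Map<K, V>>();

    ReaderBoundLruCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns the LRU map for this reader, creating it on first use.
    private Map<K, V> cacheFor(R reader) {
        Map<K, V> cache = cachesByReader.get(reader);
        if (cache == null) {
            // accessOrder=true makes iteration order least-recently-accessed first,
            // so removeEldestEntry drops the least recently used entry.
            cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            };
            cachesByReader.put(reader, cache);
        }
        return cache;
    }

    // Neither WeakHashMap nor LinkedHashMap is thread-safe, and get() on an
    // access-ordered LinkedHashMap modifies it, so all access is synchronized.
    synchronized V get(R reader, K key) {
        return cacheFor(reader).get(key);
    }

    synchronized void put(R reader, K key, V value) {
        cacheFor(reader).put(key, value);
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" would evict "b", since "a" was touched more recently. A different reader instance gets its own, initially empty, cache.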

Disclaimer: I'm not fluent in Java or experienced with OSS contribution, so there may be holes or bad style in this implementation. I also still need to check that it meets the project coding standards.

Anybody up for giving me some feedback in the meantime?

> Morelikethis queries are very slow compared to other search types
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1690
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1690
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>    Affects Versions: 2.4.1
>            Reporter: Richard Marr
>            Priority: Minor
>         Attachments: LruCache.patch, LUCENE-1690.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The MoreLikeThis object performs term frequency lookups for every query. In my testing, those lookups appear to account for the majority of the time spent in MoreLikeThis searches.
> For some (I'd venture many) applications it isn't necessary to look up term statistics on every query. A fairly naive opt-in caching mechanism tied to the life of the MoreLikeThis object would let applications cache term statistics for whatever duration suits them.
> I've got this working in my test code and will put together a patch file when I get a minute. In my testing this can improve performance by a factor of around 10.
