Posted to dev@lucene.apache.org by bu...@apache.org on 2004/03/23 12:05:07 UTC
DO NOT REPLY [Bug 27868] New: -
Bad performance in PrefixQuery for large indices.
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=27868>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=27868
Summary: Bad performance in PrefixQuery for large indices.
Product: Lucene
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: Normal
Priority: Other
Component: Search
AssignedTo: lucene-dev@jakarta.apache.org
ReportedBy: jorgen@polopoly.com
[The affected version is lucene-1.3-final, which was not selectable in the Version field above.]
In org.apache.lucene.search.PrefixQuery.rewrite(IndexReader):
1. term.text().startsWith(prefixText) is evaluated before
term.field() == prefixField, although the prefix comparison is much
more expensive. There is no reason to compare the text at all when
the term belongs to the wrong field.
2. If there are many matches in the index, a large number of
potentially identical TermQuery instances are added to the
BooleanQuery. This could be solved in rewrite() by first collecting
the TermQuery objects in a HashSet (so that all entries in the set are
unique) and then iterating over the set to add them to the
BooleanQuery. Alternatively, BooleanQuery's add method could be
modified so that it only adds a clause that is not already contained
in "clauses".
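
The two suggestions above can be illustrated with a minimal sketch. It deliberately does not use the Lucene classes: FieldTerm and expandPrefix are hypothetical stand-ins invented for this example, and the input list stands in for whatever terms the index enumeration yields. The sketch only shows the order of the two checks and the set-based de-duplication, not the actual PrefixQuery.rewrite implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PrefixRewriteSketch {

    /** Hypothetical stand-in for an index term: a (field, text) pair. */
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        // equals/hashCode are required so that a Set treats two terms
        // with the same field and text as one entry.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FieldTerm)) return false;
            FieldTerm other = (FieldTerm) o;
            return field.equals(other.field) && text.equals(other.text);
        }

        @Override
        public int hashCode() {
            return 31 * field.hashCode() + text.hashCode();
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /**
     * Returns the unique terms in prefixField whose text starts with prefixText.
     * Assumption: the caller would turn each returned term into one clause.
     */
    static List<FieldTerm> expandPrefix(List<FieldTerm> terms,
                                        String prefixField,
                                        String prefixText) {
        // LinkedHashSet drops duplicates while keeping first-seen order.
        Set<FieldTerm> unique = new LinkedHashSet<>();
        for (FieldTerm term : terms) {
            // Point 1: the cheap field check comes first; && short-circuits,
            // so startsWith never runs for terms in the wrong field.
            // The report compares with ==; this sketch uses equals() because
            // these stand-in strings are not guaranteed to be interned.
            if (term.field.equals(prefixField) && term.text.startsWith(prefixText)) {
                // Point 2: adding to a set keeps each match only once.
                unique.add(term);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<FieldTerm> terms = Arrays.asList(
                new FieldTerm("body", "lucene"),
                new FieldTerm("title", "lucene"),
                new FieldTerm("title", "lucid"),
                new FieldTerm("title", "lucene"),
                new FieldTerm("title", "solr"));
        System.out.println(expandPrefix(terms, "title", "luc"));
    }
}
```

Collecting into a set in rewrite() keeps the change local to PrefixQuery; the alternative of changing BooleanQuery's add method would affect every caller of that method.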