Posted to java-user@lucene.apache.org by Karl Heinz Marbaise <kh...@gmx.de> on 2009/01/27 20:29:55 UTC
Lucene 2.4 - Searching
Hi there,
I'm trying to do something that, from my point of view, should be simple.
I would like to search while ignoring the case of the information stored
in the index, using the following code:
reader = IndexReader.open(indexDirectory);
Searcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer();
// Created my own query parser to handle ranges like field:[1 TO 6]
QueryParser parser = new CustomQueryParser(FieldNames.CONTENTS, analyzer);
parser.setAllowLeadingWildcard(true);
parser.setLowercaseExpandedTerms(false);
Query query = parser.parse(queryLine);
TopDocs tmp = searcher.search(query, null, 20, sort);
To be more precise...
I have a field called filename that contains a filename, which
can of course be lowercase, uppercase, or a mixture...
I would like to do the following:
+filename:/*scm*.doc
That should result in getting things like
/...SCMtest.doc
/...scmtest.doc
/...scm.doc
etc.
Maybe someone can give me a hint on how to solve this...
kind regards
Karl Heinz Marbaise
--
SoftwareEntwicklung Beratung Schulung Tel.: +49 (0) 2405 / 415 893
Dipl.Ing.(FH) Karl Heinz Marbaise ICQ#: 135949029
Hauptstrasse 177 USt.IdNr: DE191347579
52146 Würselen http://www.soebes.de
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene 2.4 - Searching
Posted by Antony Bowesman <ad...@teamware.com>.
Karl Heinz Marbaise wrote:
>
> I have a field which is called filename and contains a filename which
> can of course be lowercase or uppercase or a mixture...
>
> I would like to do the following:
>
> +filename:/*scm*.doc
>
> That should result in getting things like
>
> /...SCMtest.doc
> /...scmtest.doc
> /...scm.doc
> etc.
>
> Maybe someone can give me a hint on how to solve this...
It's all down to the analyzer you use when you index that field and how you
choose to tokenize it. If you want to always search case insensitively, then
you should lower case the filename when indexing.
If your query parser supports wildcard queries and behaves anything like the
standard QP, it will not put the wildcard query string through the analyzer,
so a search for
+filename:/*SCm*.doc
would then not find anything against a lower-cased index. You would need to
make sure you lower case all the filename field searches at some point.
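A minimal sketch of that idea in plain Java, with no Lucene calls: the same normalisation is applied to the stored value and to the query pattern before a simple `*` wildcard match. The class, the method names and the sample paths are made up for illustration (the original post elides the real paths). In the standard QueryParser, setLowercaseExpandedTerms(true) is the switch that lower-cases wildcard terms; the code in the original post sets it to false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; none of these names are Lucene API.
final class LowercasedFilenameSearch {

    // Hypothetical helper: the normalisation used at index time AND query time.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Case-sensitive glob match supporting only '*' (any run of characters).
    static boolean globMatches(String pattern, String text) {
        int p = 0, t = 0, star = -1, mark = 0;
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;          // remember the star and try matching zero chars
                mark = t;
            } else if (p < pattern.length() && pattern.charAt(p) == text.charAt(t)) {
                p++;
                t++;
            } else if (star >= 0) {
                p = star + 1;        // backtrack: let the last star absorb one more char
                t = ++mark;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    // Matches on the lower-cased form but returns the original spelling.
    static List<String> search(List<String> storedFilenames, String rawPattern) {
        String pattern = normalize(rawPattern);
        List<String> hits = new ArrayList<>();
        for (String original : storedFilenames) {
            if (globMatches(pattern, normalize(original))) {
                hits.add(original);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample paths are invented for the example.
        List<String> files = Arrays.asList(
                "/docs/SCMtest.doc", "/docs/scmtest.doc", "/docs/scm.doc", "/docs/readme.txt");
        System.out.println(search(files, "/*SCm*.doc"));
    }
}
```

Without the normalize() call on the pattern, the mixed-case query above would match nothing, which is the failure described here.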
I use a custom analyzer for filenames, which lower cases and tokenizes by letter
or digit or any custom chars, and my query parser supports custom analyzers for
getFieldQuery().
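The tokenizing rule described above could look roughly like this. It is a plain-Java sketch of the rule only, not the actual analyzer and not Lucene's Analyzer/Tokenizer API; the class name, the extraTokenChars parameter and the sample filename are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of "lower case, split on anything that is not a letter,
// a digit or one of a few extra characters". Not Lucene API.
final class FilenameTokenizerSketch {

    // extraTokenChars: hypothetical set of additional characters kept inside tokens.
    static List<String> tokenize(String filename, String extraTokenChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < filename.length(); i++) {
            char c = filename.charAt(i);
            if (Character.isLetterOrDigit(c) || extraTokenChars.indexOf(c) >= 0) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any other character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Keeps '_' inside tokens; '/', '-' and '.' act as separators.
        System.out.println(tokenize("/docs/SCM_Test-2.doc", "_"));
    }
}
```

Whatever rule is chosen, query text for that field has to go through the same rule, otherwise the query terms will not line up with the indexed ones.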
If you want to keep the original filename, then just store the field as well as
index it, then you can get the original back from the Document.
Antony
Re: Lucene 2.4 - Searching
Posted by Ian Lea <ia...@gmail.com>.
Hi
Sounds like a job for RegexQuery. If you can't figure out how to use
it, Google will turn up some examples. You can downcase everything
yourself, use an analyzer that does it, or maybe use a
case-insensitive regexp.
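To show what the case-insensitive regexp option means, here is a small sketch using only java.util.regex from the JDK. It does not use RegexQuery itself, and the class name, the pattern and the sample paths are assumptions for illustration; how a given RegexQuery setup handles flags would need checking against its own documentation.

```java
import java.util.regex.Pattern;

// JDK-only illustration of a case-insensitive match for "/*scm*.doc".
final class CaseInsensitiveRegexSketch {

    // ".*" plays the role of the '*' wildcard; the dot before "doc" is escaped.
    static final Pattern SCM_DOC =
            Pattern.compile("/.*scm.*\\.doc", Pattern.CASE_INSENSITIVE);

    static boolean matches(String filename) {
        // matches() requires the whole filename to fit the pattern.
        return SCM_DOC.matcher(filename).matches();
    }

    public static void main(String[] args) {
        // Sample paths are invented for the example.
        System.out.println(matches("/docs/SCMtest.doc"));
        System.out.println(matches("/docs/readme.doc"));
    }
}
```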
Depending on your file names you might want to avoid StandardAnalyzer.
It is likely to split them. KeywordAnalyzer might be what you want.
--
Ian.