Posted to java-user@lucene.apache.org by "G.Long" <jd...@gmail.com> on 2012/11/07 16:50:56 UTC

case-insensitive index and queries

Hi :)

I would like the "text" field of my index to be case-insensitive.
I'm using a PerFieldAnalyzerWrapper with a StandardAnalyzer for this
field for both indexing and querying. I read that StandardAnalyzer uses
LowerCaseFilter to lowercase the value of the field, but when I run a
query, it doesn't work.

Here is my query:

IndexSearcher isearcher = new IndexSearcher(directory);
BooleanQuery query = new BooleanQuery();
// Use the same per-field analyzers at query time as at index time.
PerFieldAnalyzerWrapper pfaWrapper = getPerFieldAnalyzer();

// key is the field name, value is the user's query string.
QueryParser parser = new QueryParser(Version.LUCENE_31, key, pfaWrapper);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query param = parser.parse(value);
query.add(param, BooleanClause.Occur.MUST);

// Collect up to 200000 hits, sorted by document order.
TopFieldCollector collector = TopFieldCollector.create(
        new Sort(SortField.FIELD_DOC), 200000, true, false, false, false);
isearcher.search(query, collector);


The getPerFieldAnalyzer() method looks like this:

if (perFieldAnalyzerWrapper == null) {
    // Fields without an explicit analyzer fall back to KeywordAnalyzer.
    perFieldAnalyzerWrapper = new PerFieldAnalyzerWrapper(new KeywordAnalyzer());
    // StandardAnalyzer lowercases the tokens of these two fields.
    perFieldAnalyzerWrapper.addAnalyzer(FIELD_TEXT, new StandardAnalyzer(Version.LUCENE_31));
    perFieldAnalyzerWrapper.addAnalyzer(FIELD_TITLE, new StandardAnalyzer(Version.LUCENE_31));
}
return perFieldAnalyzerWrapper;

Is there something wrong with this code?

Thank you :)


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: case-insensitive index and queries

Posted by "G.Long" <jd...@gmail.com>.

Thank you for the tips. I looked at the index and the query, and nothing
seemed to be wrong. Then I realized that someone had put a condition in
the code that runs after the query results are retrieved. This condition
removed docs that did not contain the exact words of the query, and it
was case-sensitive u_u.

Problem solved :)
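
For anyone hitting the same thing: the post-query condition itself was not posted, so the following is only a minimal sketch of what a case-insensitive version of such a check could look like in plain Java. The class name PostFilterSketch, the method names, and the sample strings are all made up for illustration. Note that it uses simple substring matching, which may be looser than the original condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the post-query condition described above.
// The real code was not shown in the thread; everything here is an assumption.
final class PostFilterSketch {

    // True if every whitespace-separated word of the query occurs in the
    // text as a substring, with both sides lowercased before comparing.
    static boolean containsAllWords(String text, String query) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String word : query.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty query yields one empty token
            }
            if (!haystack.contains(word.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the texts that pass the case-insensitive check.
    static List<String> keepMatching(List<String> texts, String query) {
        List<String> kept = new ArrayList<>();
        for (String text : texts) {
            if (containsAllWords(text, query)) {
                kept.add(text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("Lucene In Action", "lucene faq", "Solr guide");
        // prints [Lucene In Action, lucene faq]
        System.out.println(keepMatching(hits, "LUCENE"));
    }
}
```

A check that compares the raw strings directly would drop "Lucene In Action" for the query "LUCENE", which matches the symptom described above even though the index and the parsed query were fine.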



On 07/11/2012 17:09, Ian Lea wrote:
> From a glance the code looks OK, but there's lots you're not showing
> that could cause it not to work - whatever you mean by that. Fails to
> get hits on docs you think are in the index?
>
> Look at the index with Luke to see what actually has been indexed.
>
> Look at Query.toString() to see how the query has been parsed.
>
> Read the bit of the FAQ titled something like "Why are my searches not
> working?".
>
>
> --
> Ian.

Re: case-insensitive index and queries

Posted by Ian Lea <ia...@gmail.com>.

From a glance the code looks OK, but there's lots you're not showing
that could cause it not to work - whatever you mean by that. Fails to
get hits on docs you think are in the index?

Look at the index with Luke to see what actually has been indexed.

Look at Query.toString() to see how the query has been parsed.

Read the bit of the FAQ titled something like "Why are my searches not
working?".
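
Both checks come down to comparing the terms stored in the index with the terms the parsed query asks for. As a rough illustration, here is a toy stand-in in plain Java. It is deliberately not Lucene and not how Lucene is implemented; ToyIndex, analyze, lookupRaw and the sample strings are invented for this sketch. It shows why lowercasing on both the indexing and the query side gives case-insensitive matching, and why a lookup that skips the lowercasing finds nothing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index, for illustration only (an assumption, not Lucene code).
final class ToyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Crude analyzer: split on anything that is not a letter or digit,
    // then lowercase each token.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Index time: store the analyzed (lowercased) terms.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Exact term lookup with no analysis applied to the input.
    Set<Integer> lookupRaw(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    // Query time: analyze the query the same way, then AND the terms.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : analyze(query)) {
            Set<Integer> docs = lookupRaw(term);
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? Collections.<Integer>emptySet() : result;
    }

    // What is actually stored, sorted for easy reading.
    Set<String> terms() {
        return new TreeSet<>(postings.keySet());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(0, "Case-Insensitive Index");
        index.add(1, "case sensitive queries");
        System.out.println(index.terms());              // stored terms are all lowercase
        System.out.println(index.search("CASE Index")); // prints [0]
        System.out.println(index.lookupRaw("Index"));   // prints []
    }
}
```

If the stored terms and the query terms both come out lowercase, as in this sketch, the cause of missing hits is probably somewhere other than the analyzers.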


--
Ian.

