You are viewing a plain text version of this content. The canonical link for it is here.

Posted to java-user@lucene.apache.org by "Sharma, Siddharth" <Si...@Staples.com> on 2005/10/27 22:35:23 UTC

Search problems

Hi 

My index has 4 keyword fields and one unindexed field.
I want to search by the 4 keyword fields and return the one unindexed field.

I can iterate over the documents via Luke.
But when I search for the same values that I see via Luke, it does not find
the document.
Out of the 4 fields, 2 are alphanumeric and searching on just these two
fields does succeed and I can find the document in question.

The other 2 fields can have numeric values. When I include these two fields
in the search, the same document cannot be found.

I thought that the fact that these fields had numeric values might be the
reason for the search to be unsuccessful. So I browsed for another document
via Luke where these fields had alphanumeric values, but again could not
find the document? Returns no result.

What could the problem be? Any ideas?
I have added all the 4 fields with 'Field.Keyword'.

Thanks
Sid


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Search problems

Posted by Robert Watkins <rw...@foo-bar.org>.

One approach for matching your queries with Luke would be to write a
custom Analyzer that does absolutely nothing to the terms. Then, if you
put this Analyzer in your classpath when running Luke you can select it
as the Analyzer you want Luke to use to tokenize your query. This is
not, of course, the approach you would want to take in your application,
but at least it will allow you to use Luke as you have specified.

-- Robert Watkins

On Tue, 1 Nov 2005, Miles Barr wrote:

> On Thu, 2005-10-27 at 16:35 -0400, Sharma, Siddharth wrote:
>> My index has 4 keyword fields and one unindexed field.
>> I want to search by the 4 keyword fields and return the one unindexed field.
>>
>> I can iterate over the documents via Luke.
>> But when I search for the same values that I see via Luke, it does not find
>> the document.
>>
>> [ snipped ]
>>
>> What could the problem be? Any ideas?
>> I have added all the 4 fields with 'Field.Keyword'.
>
> Field.Keyword requires an exact match, i.e. you should manually create a
> TermQuery. Luke will analyze your query and hence tokenise it. Almost
> certainly the tokens it creates won't match the values in your field,
> because they have to be an exact match.
>
> [ snipped ]

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Search problems

Posted by Steven Rowe <sa...@syr.edu>.

Such an analyzer already exists, in Lucene's Subversion repository, 
under contrib/analyzers/: KeywordAnalyzer.

Robert Watkins wrote:
> One approach for matching your queries with Luke would be to write a
> custom Analyzer that does absolutely nothing to the terms. Then, if you
> put this Analyzer in your classpath when running Luke you can select it
> as the Analyzer you want Luke to use to tokenize your query. This is
> not, of course, the approach you would want to take in your application,
> but at least it will allow you to use Luke as you have specified.
> 
> -- Robert Watkins
> 
> On Tue, 1 Nov 2005, Miles Barr wrote:
> 
>> On Thu, 2005-10-27 at 16:35 -0400, Sharma, Siddharth wrote:
>>
>>> My index has 4 keyword fields and one unindexed field.
>>> I want to search by the 4 keyword fields and return the one unindexed 
>>> field.
>>>
>>> I can iterate over the documents via Luke.
>>> But when I search for the same values that I see via Luke, it does 
>>> not find
>>> the document.
>>>
>>> [ snipped ]
>>>
>>> What could the problem be? Any ideas?
>>> I have added all the 4 fields with 'Field.Keyword'.
>>
>>
>> Field.Keyword requires an exact match, i.e. you should manually create a
>> TermQuery. Luke will analyze your query and hence tokenise it. Almost
>> certainly the tokens it creates won't match the values in your field,
>> because they have to be an exact match.
>>
>> [ snipped ]


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Search problems

Posted by Robert Watkins <rw...@foo-bar.org>.

One approach for matching your queries with Luke would be to write a
custom Analyzer that does absolutely nothing to the terms. Then, if you
put this Analyzer in your classpath when running Luke you can select it
as the Analyzer you want Luke to use to tokenize your query. This is
not, of course, the approach you would want to take in your application,
but at least it will allow you to use Luke as you have specified.

-- Robert Watkins

On Tue, 1 Nov 2005, Miles Barr wrote:

> On Thu, 2005-10-27 at 16:35 -0400, Sharma, Siddharth wrote:
>> My index has 4 keyword fields and one unindexed field.
>> I want to search by the 4 keyword fields and return the one unindexed field.
>>
>> I can iterate over the documents via Luke.
>> But when I search for the same values that I see via Luke, it does not find
>> the document.
>>
>> [ snipped ]
>>
>> What could the problem be? Any ideas?
>> I have added all the 4 fields with 'Field.Keyword'.
>
> Field.Keyword requires an exact match, i.e. you should manually create a
> TermQuery. Luke will analyze your query and hence tokenise it. Almost
> certainly the tokens it creates won't match the values in your field,
> because they have to be an exact match.
>
> [ snipped ]

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Search problems

Posted by Miles Barr <mi...@runtime-collective.com>.

On Thu, 2005-10-27 at 16:35 -0400, Sharma, Siddharth wrote:
> My index has 4 keyword fields and one unindexed field.
> I want to search by the 4 keyword fields and return the one unindexed field.
> 
> I can iterate over the documents via Luke.
> But when I search for the same values that I see via Luke, it does not find
> the document.
> Out of the 4 fields, 2 are alphanumeric and searching on just these two
> fields does succeed and I can find the document in question.
> 
> The other 2 fields can have numeric values. When I include these two fields
> in the search, the same document cannot be found.
> 
> I thought that the fact that these fields had numeric values might be the
> reason for the search to be unsuccessful. So I browsed for another document
> via Luke where these fields had alphanumeric values, but again could not
> find the document? Returns no result.
> 
> What could the problem be? Any ideas?
> I have added all the 4 fields with 'Field.Keyword'.

Field.Keyword requires an exact match, i.e. you should manually create a
TermQuery. Luke will analyze your query and hence tokenise it. Almost
certainly the tokens it creates won't match the values in your field,
because they have to be an exact match.

The StandardAnalyzer is the analyzer Luke uses by default. It will make
the search terms lower case, and AFAIK it almost removes numbers from
the query.


--
Miles Barr


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org