You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Mufaddal Khumri <mk...@allegromedical.com> on 2006/02/23 22:16:42 UTC

Getting no hits ...

I have been trying to figure out why my query below would not return any 
hits.

I use two custom analyzers for indexing and searching. The one I use for 
indexing uses this:

    public TokenStream tokenStream(String fieldName, Reader reader)
    {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopSet);
        result = new SynonymFilter(result, new MySynonymEngine());
        result = new PorterStemFilter(result);
        return result;
    }

The one I use for searching uses this:

    public TokenStream tokenStream(String fieldName, Reader reader)
    {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopSet);
        result = new PorterStemFilter(result);
        return result;
    }

(Basically while searching I do not use the SynonymFilter.)

I have quite a few products that I index that have the text on which I 
am querying on.

I do a search for this: ES-20D

This is the final query that I run:
+(+content:es\-20d) +entity:product +(title:"es\-20d"~2^40.0 
((title:es\-20d)^10.0) content:"es\-20d"~2^20.0 (content:es\-20d) 
categoryName:"es\-20d"^80.0)

(The content and title fields are Indexed, Tokenized and Stored. The 
categoryName field is Indexed and Stored.)

I get no hits?

Where am i going wrong with this? Any pointers?

-Thanks.





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Getting no hits ...

Posted by Mufaddal Khumri <mk...@allegromedical.com>.
Follow up on my previous email ...

When I execute this query using luke using the standard analyzer on the 
same index, i get 8 hits.
 +(+content:eos\-20d) +entity:product +(title:"eos\-20d"~2^40.0 
((title:eos\-20d)^10.0) content:"eos\-20d"~2^20.0 (content:eos\-20d) 
categoryName:"eos\-20d"^80.0)

I modified my searching code to use the standard analyzer, but i did not 
get any hits back. I am still trying to figure out the problem out. Any 
ideas?

Mufaddal Khumri wrote:

> In my earlier email i put in the wrong query that I am searching on. 
> The correct query is: EOS-20D
>
> And this is the query under question that is producing no hits still:
>
> +(+content:eos\-20d) +entity:product +(title:"eos\-20d"~2^40.0 
> ((title:eos\-20d)^10.0) content:"eos\-20d"~2^20.0 (content:eos\-20d) 
> categoryName:"eos\-20d"^80.0)
>
> I have used the AnalyzerUtils.displayTokensWithFullDetails(analyzer, 
> string); (AnalyzerUtils from the LIA book).
>
> This is part of the log output from using the 
> AnalyzerUtils.displayTokensWithFullDetails(analyzer, string) when this 
> product gets indexed:
> ....
> ....
> 119: [013803044430:857->869:<ALPHANUM>]
> 120: [eos-20d:870->877:<NUM>]
> 121: [011-eos-20d:878->889:<NUM>]
>
> This is part of the log output from using the 
> AnalyzerUtils.displayTokensWithFullDetails(analyzer, string) when I do 
> the search:
> 1: [eos-20d:0->6:<NUM>]
>
> From what I understand I see that the analyzer is producing the same 
> tokens while indexing and during searching.
>
> Chris Hostetter wrote:
>
>> 1) Have you looked at what tokens your indexing analyzer produces 
>> when you
>>   tokenize "ES-20D" ?
>> 2) Have you looked at what tokens your query analyser products when you
>>   tokenize "ES-20D" ?
>> 3) Have you tried a simpler query (ie: just "content:es\-20d" ) ?
>> 4) When giving QueryParser a (quoted) phrase search, i don't think you
>>   really want to escape that "-" character.
>>
>>
>>
>> : Date: Thu, 23 Feb 2006 14:16:42 -0700
>> : From: Mufaddal Khumri <mk...@allegromedical.com>
>> : Reply-To: java-user@lucene.apache.org
>> : To: java-user@lucene.apache.org
>> : Subject: Getting no hits ...
>> :
>> : I have been trying to figure out why my query below would not 
>> return any
>> : hits.
>> :
>> : I use two custom analyzers for indexing and searching. The one I 
>> use for
>> : indexing uses this:
>> :
>> :     public TokenStream tokenStream(String fieldName, Reader reader)
>> :     {
>> :         TokenStream result = new StandardTokenizer(reader);
>> :         result = new StandardFilter(result);
>> :         result = new LowerCaseFilter(result);
>> :         result = new StopFilter(result, stopSet);
>> :         result = new SynonymFilter(result, new MySynonymEngine());
>> :         result = new PorterStemFilter(result);
>> :         return result;
>> :     }
>> :
>> : The one I use for searching uses this:
>> :
>> :     public TokenStream tokenStream(String fieldName, Reader reader)
>> :     {
>> :         TokenStream result = new StandardTokenizer(reader);
>> :         result = new StandardFilter(result);
>> :         result = new LowerCaseFilter(result);
>> :         result = new StopFilter(result, stopSet);
>> :         result = new PorterStemFilter(result);
>> :         return result;
>> :     }
>> :
>> : (Basically while searching I do not use the SynonymFilter.)
>> :
>> : I have quite a few products that I index that have the text on which I
>> : am querying on.
>> :
>> : I do a search for this: ES-20D
>> :
>> : This is the final query that I run:
>> : +(+content:es\-20d) +entity:product +(title:"es\-20d"~2^40.0
>> : ((title:es\-20d)^10.0) content:"es\-20d"~2^20.0 (content:es\-20d)
>> : categoryName:"es\-20d"^80.0)
>> :
>> : (The content and title fields are Indexed, Tokenized and Stored. The
>> : categoryName field is Indexed and Stored.)
>> :
>> : I get no hits?
>> :
>> : Where am i going wrong with this? Any pointers?
>> :
>> : -Thanks.
>> :
>> :
>> :
>> :
>> :
>> : ---------------------------------------------------------------------
>> : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> : For additional commands, e-mail: java-user-help@lucene.apache.org
>> :
>>
>>
>>
>> -Hoss
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>  
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Getting no hits ...

Posted by Mufaddal Khumri <mk...@allegromedical.com>.
In my earlier email i put in the wrong query that I am searching on. The 
correct query is: EOS-20D

And this is the query under question that is producing no hits still:

+(+content:eos\-20d) +entity:product +(title:"eos\-20d"~2^40.0 
((title:eos\-20d)^10.0) content:"eos\-20d"~2^20.0 (content:eos\-20d) 
categoryName:"eos\-20d"^80.0)

I have used the AnalyzerUtils.displayTokensWithFullDetails(analyzer, 
string); (AnalyzerUtils from the LIA book).

This is part of the log output from using the 
AnalyzerUtils.displayTokensWithFullDetails(analyzer, string) when this 
product gets indexed:
....
....
119: [013803044430:857->869:<ALPHANUM>]
120: [eos-20d:870->877:<NUM>]
121: [011-eos-20d:878->889:<NUM>]

This is part of the log output from using the 
AnalyzerUtils.displayTokensWithFullDetails(analyzer, string) when I do 
the search:
1: [eos-20d:0->6:<NUM>]

 From what I understand I see that the analyzer is producing the same 
tokens while indexing and during searching.

Chris Hostetter wrote:

>1) Have you looked at what tokens your indexing analyzer produces when you
>   tokenize "ES-20D" ?
>2) Have you looked at what tokens your query analyser products when you
>   tokenize "ES-20D" ?
>3) Have you tried a simpler query (ie: just "content:es\-20d" ) ?
>4) When giving QueryParser a (quoted) phrase search, i don't think you
>   really want to escape that "-" character.
>
>
>
>: Date: Thu, 23 Feb 2006 14:16:42 -0700
>: From: Mufaddal Khumri <mk...@allegromedical.com>
>: Reply-To: java-user@lucene.apache.org
>: To: java-user@lucene.apache.org
>: Subject: Getting no hits ...
>:
>: I have been trying to figure out why my query below would not return any
>: hits.
>:
>: I use two custom analyzers for indexing and searching. The one I use for
>: indexing uses this:
>:
>:     public TokenStream tokenStream(String fieldName, Reader reader)
>:     {
>:         TokenStream result = new StandardTokenizer(reader);
>:         result = new StandardFilter(result);
>:         result = new LowerCaseFilter(result);
>:         result = new StopFilter(result, stopSet);
>:         result = new SynonymFilter(result, new MySynonymEngine());
>:         result = new PorterStemFilter(result);
>:         return result;
>:     }
>:
>: The one I use for searching uses this:
>:
>:     public TokenStream tokenStream(String fieldName, Reader reader)
>:     {
>:         TokenStream result = new StandardTokenizer(reader);
>:         result = new StandardFilter(result);
>:         result = new LowerCaseFilter(result);
>:         result = new StopFilter(result, stopSet);
>:         result = new PorterStemFilter(result);
>:         return result;
>:     }
>:
>: (Basically while searching I do not use the SynonymFilter.)
>:
>: I have quite a few products that I index that have the text on which I
>: am querying on.
>:
>: I do a search for this: ES-20D
>:
>: This is the final query that I run:
>: +(+content:es\-20d) +entity:product +(title:"es\-20d"~2^40.0
>: ((title:es\-20d)^10.0) content:"es\-20d"~2^20.0 (content:es\-20d)
>: categoryName:"es\-20d"^80.0)
>:
>: (The content and title fields are Indexed, Tokenized and Stored. The
>: categoryName field is Indexed and Stored.)
>:
>: I get no hits?
>:
>: Where am i going wrong with this? Any pointers?
>:
>: -Thanks.
>:
>:
>:
>:
>:
>: ---------------------------------------------------------------------
>: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>: For additional commands, e-mail: java-user-help@lucene.apache.org
>:
>
>
>
>-Hoss
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Getting no hits ...

Posted by Chris Hostetter <ho...@fucit.org>.
1) Have you looked at what tokens your indexing analyzer produces when you
   tokenize "ES-20D" ?
2) Have you looked at what tokens your query analyser products when you
   tokenize "ES-20D" ?
3) Have you tried a simpler query (ie: just "content:es\-20d" ) ?
4) When giving QueryParser a (quoted) phrase search, i don't think you
   really want to escape that "-" character.



: Date: Thu, 23 Feb 2006 14:16:42 -0700
: From: Mufaddal Khumri <mk...@allegromedical.com>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Getting no hits ...
:
: I have been trying to figure out why my query below would not return any
: hits.
:
: I use two custom analyzers for indexing and searching. The one I use for
: indexing uses this:
:
:     public TokenStream tokenStream(String fieldName, Reader reader)
:     {
:         TokenStream result = new StandardTokenizer(reader);
:         result = new StandardFilter(result);
:         result = new LowerCaseFilter(result);
:         result = new StopFilter(result, stopSet);
:         result = new SynonymFilter(result, new MySynonymEngine());
:         result = new PorterStemFilter(result);
:         return result;
:     }
:
: The one I use for searching uses this:
:
:     public TokenStream tokenStream(String fieldName, Reader reader)
:     {
:         TokenStream result = new StandardTokenizer(reader);
:         result = new StandardFilter(result);
:         result = new LowerCaseFilter(result);
:         result = new StopFilter(result, stopSet);
:         result = new PorterStemFilter(result);
:         return result;
:     }
:
: (Basically while searching I do not use the SynonymFilter.)
:
: I have quite a few products that I index that have the text on which I
: am querying on.
:
: I do a search for this: ES-20D
:
: This is the final query that I run:
: +(+content:es\-20d) +entity:product +(title:"es\-20d"~2^40.0
: ((title:es\-20d)^10.0) content:"es\-20d"~2^20.0 (content:es\-20d)
: categoryName:"es\-20d"^80.0)
:
: (The content and title fields are Indexed, Tokenized and Stored. The
: categoryName field is Indexed and Stored.)
:
: I get no hits?
:
: Where am i going wrong with this? Any pointers?
:
: -Thanks.
:
:
:
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-user-help@lucene.apache.org
:



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org