You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by pashupathinath <pa...@yahoo.com> on 2005/03/31 18:44:55 UTC

using different analyzer for searching

  is it possible to index using a predefined analyzer
and search using a custom analyzer ??
  i'm searching using the built in whitespace
analyser. the problem is when i'm searching for a part
of a string the search results are zero.
  i'm using white space analyzer. for example if the
statement is "my name is abc123" the search for abc or
123 doesnt return any hits. 
  anyinsight into this ?? 

thanks,
pashupathinath.k

Send instant messages to your online friends http://uk.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: using different analyzer for searching

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Mar 31, 2005, at 11:44 AM, pashupathinath wrote:

>   is it possible to index using a predefined analyzer
> and search using a custom analyzer ??

Yes, its perfectly fine to do so with the caveat that you end up 
searching for the terms exactly as they were indexed.

I end up doing this in most applications, actually, primarily because 
untokenized fields need to use the KeywordAnalyzer during searching.

>   i'm searching using the built in whitespace
> analyser. the problem is when i'm searching for a part
> of a string the search results are zero.
>   i'm using white space analyzer. for example if the
> statement is "my name is abc123" the search for abc or
> 123 doesnt return any hits.
>   anyinsight into this ??

The exact terms indexed using WhitespaceAnalyzer are like this (using 
the Lucene in Action AnalyzerDemo - "ant AnalyzerDemo"):

     [input] String to analyze: [This string will be analyzed.]
my name is abc123
      [echo] Running lia.analysis.AnalyzerDemo...
      [java] Analyzing "my name is abc123"
      [java]   WhitespaceAnalyzer:
      [java]     [my] [name] [is] [abc123]

      [java]   SimpleAnalyzer:
      [java]     [my] [name] [is] [abc]

      [java]   StopAnalyzer:
      [java]     [my] [name] [abc]

      [java]   StandardAnalyzer:
      [java]     [my] [name] [abc123]

So you indexed "abc123" and searches must search for that term 
*exactly*.  You can search for "abc*" as a PrefixQuery or WildcardQuery 
and find "abc123".  "*123" will also find it though QueryParser does 
not support leading wildcard characters (but the API does).  Wildcard 
queries are not ideally what you want as it tends to be much slower for 
large indexes.

You may need to do specialized analysis.  Perhaps you could share you 
real needs with the list and we could offer recommendations.  It is 
possible to index "abc123", "abc", and "123" all within the same 
position in the index if you do some clever analysis and that meshes 
with what you're after.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org