Posted to java-user@lucene.apache.org by Сергій Карпенко <be...@ukr.net> on 2008/08/19 17:10:53 UTC

Re[2]: How I can find wildcard symbol with WildcardQuery?

Yes, you are correct - NO_NORMS has nothing to do with tokenization, that means no analyzer is used. The string goes into the index as a single term.
But what about our wildcard symbols?
    
Re: How I can find wildcard symbol with WildcardQuery?    
    
    Before going down this path I'd really recommend you get a copy of Luke    
and look at your index. Depending upon the analyzer you're using, you    
may or may not have w*orld indexed. You may have the tokens:    
w    
orld    
    
with the * dropped completely.    
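
To make the analyzer point concrete, here is a minimal sketch in plain Java. It is not Lucene code: it is a hand-rolled splitter that keeps only runs of letters and lowercases them, roughly what a letter-based analyzer might do. The class and method names are hypothetical, and the analyzer actually used on your index may behave differently, which is why checking with Luke is still the safest route.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a letter-based analyzer; this is NOT Lucene's implementation.
public class LetterSplitSketch {

    // Splits text into lowercase runs of letters, dropping every other character.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter (space, '*', ...) ends the current token and is discarded.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Tokenized case: the '*' acts as a separator and never reaches the index.
        System.out.println(tokenize(" Hello w*orld")); // prints [hello, w, orld]
    }
}
```

With an untokenized field, by contrast, the whole string " Hello w*orld" would be stored as one term, '*' included.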
    
As far as I know, NO_NORMS has nothing to do with tokenization, the    
critical question is what *analyzer* you're using to index.    
    
And you could always sidestep the issue entirely by pre-processing    
your text and query to replace * with something else.    
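
The pre-processing idea could look something like the sketch below, in plain Java with no Lucene dependency. The class name and the choice of private-use placeholder characters are assumptions made for illustration, not a Lucene convention. It also maps '?', the other character that WildcardQuery treats specially.

```java
// Hypothetical helper; the placeholder characters are an assumption, not a Lucene convention.
public class WildcardPlaceholderSketch {

    // Private-use code points that are unlikely to occur in real text.
    private static final char STAR_PLACEHOLDER = '\uE000';
    private static final char QUESTION_PLACEHOLDER = '\uE001';

    // Apply to field values before indexing and to literal query text before searching.
    public static String encode(String raw) {
        return raw.replace('*', STAR_PLACEHOLDER).replace('?', QUESTION_PLACEHOLDER);
    }

    // Apply to stored values before showing them to the user.
    public static String decode(String encoded) {
        return encoded.replace(STAR_PLACEHOLDER, '*').replace(QUESTION_PLACEHOLDER, '?');
    }

    public static void main(String[] args) {
        String original = " Hello w*orld";
        String indexed = encode(original);
        System.out.println(indexed.indexOf('*'));             // prints -1
        System.out.println(decode(indexed).equals(original)); // prints true
    }
}
```

The same encode step has to run on both sides: once on the text going into the index and once on the literal part of the query. Any '*' that is meant as a real wildcard would be appended to the query string after encoding.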
    
But for escaping, see:    
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html    
    
Best    
Erick    
    
2008/8/19 Сергій Карпенко <be...@ukr.net>    
    
>    
> Hello    
>    
> For example, we have a text:    
>    
> " Hello w*orld"    
> It is indexed as NO_NORMS, so this phrase is a single term.
>
> And I have this code:
>
> Query query = new WildcardQuery(new Term("field", " Hello w*orld"));
>
> It works.
>
> But I need the symbol '*' to be treated as an ordinary character, not as a wildcard.
>
> The analogue of the QueryParser escape, '\\*':
> Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
> doesn't work.
>    
> Thanks    

Re: How I can find wildcard symbol with WildcardQuery?

Posted by Erick Erickson <er...@gmail.com>.
Thanks for correcting me on this, I had no idea...... Just goes to show
what happens when an amateur gets in the mix <G>.

Best
Erick

On Tue, Aug 19, 2008 at 8:09 PM, Daniel Noll <da...@nuix.com> wrote:

>  Сергій Карпенко wrote:
>
>> Yes, you are correct - NO_NORMS has nothing to do with tokenization,
>> that means no analyzer is used.
>>
>
> Just to avoid this ambiguous, semi-contradicting wording confusing the hell
> out of anyone...
>
> NO_NORMS *does* have something to do with tokenisation -- it implies
> UN_TOKENIZED.
>
> Source code QFT:
>
>    } else if (index == Index.NO_NORMS) {
>      this.isIndexed = true;
>      this.isTokenized = false;
>      this.omitNorms = true;
>    } ...
>
> Daniel
>
> --
> Daniel Noll
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: How I can find wildcard symbol with WildcardQuery?

Posted by Daniel Noll <da...@nuix.com>.
  Сергій Карпенко wrote:
> Yes, you are correct - NO_NORMS has nothing to do with tokenization,
> that means no analyzer is used.

Just to avoid this ambiguous, semi-contradicting wording confusing the 
hell out of anyone...

NO_NORMS *does* have something to do with tokenisation -- it implies 
UN_TOKENIZED.

Source code QFT:

     } else if (index == Index.NO_NORMS) {
       this.isIndexed = true;
       this.isTokenized = false;
       this.omitNorms = true;
     } ...

Daniel

-- 
Daniel Noll
