Posted to java-user@lucene.apache.org by Сергій Карпенко <be...@ukr.net> on 2008/08/19 17:10:53 UTC
Re[2]: How I can find wildcard symbol with WildcardQuery?
Yes, you are correct - NO_NORMS has nothing to do with tokenization, that means no analyzers are used. The string goes into the index as a single term.
But what about our wildcard symbols?
Re: How I can find wildcard symbol with WildcardQuery?
Before going down this path I'd really recommend you get a copy of Luke
and look at your index. Depending upon the analyzer you're using, you
may or may not have w*orld indexed. You may have the tokens:
w
orld
with the * dropped completely.
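
To illustrate the point about the analyzer, here is a minimal sketch in plain Java (not Lucene code) of how a letter-based tokenizer might treat the text. The class name LetterSplitSketch and its tokenize method are hypothetical, and real analyzers differ in detail, so looking at the index in Luke remains the way to see what was actually stored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: splits text on every non-letter
// character and lowercases the pieces, similar in spirit to a simple
// letter-based analyzer.
final class LetterSplitSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any non-letter (space, '*', '?') ends the token and is discarded.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(" Hello w*orld")); // prints [hello, w, orld]
    }
}
```

With tokens like these in the index, no wildcard pattern containing a literal * could match, because the character was never stored.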
As far as I know, NO_NORMS has nothing to do with tokenization; the
critical question is what *analyzer* you're using to index.
And you could always sidestep the issue entirely by pre-processing
your text and query to replace * with something else.
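
A minimal sketch of that pre-processing idea, in plain Java. The class WildcardLiteralSketch, its method names, and the two placeholder characters are all assumptions made for illustration; nothing here is part of Lucene, and the placeholders only work if they never occur in the real data.

```java
// Hypothetical helper, not part of Lucene: maps the wildcard characters to
// placeholder characters that are assumed never to occur in real data.
final class WildcardLiteralSketch {

    // Assumed placeholders taken from the Unicode private use area.
    static final char STAR_PLACEHOLDER = '\uE000';
    static final char QUESTION_PLACEHOLDER = '\uE001';

    // Apply to field values before indexing and to the literal parts of a
    // query pattern, so that '*' and '?' in them are no longer wildcards.
    static String encode(String text) {
        return text.replace('*', STAR_PLACEHOLDER).replace('?', QUESTION_PLACEHOLDER);
    }

    // Apply to stored values before showing them to a user.
    static String decode(String text) {
        return text.replace(STAR_PLACEHOLDER, '*').replace(QUESTION_PLACEHOLDER, '?');
    }
}
```

Under these assumptions, the field value would be indexed as encode(" Hello w*orld"), and a pattern such as encode(" Hello w*") + "*" could be passed to new WildcardQuery(new Term("field", pattern)): the first star is matched literally through its placeholder, and only the appended one acts as a wildcard.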
But for escaping, see:
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html
Best
Erick
2008/8/19 Сергій Карпенко <be...@ukr.net>
>
> Hello
>
> For example, we have a text:
>
> " Hello w*orld"
> It is indexed as NO_NORMS, so this phrase is a single term.
>
> And I have this code, which works:
>
> Query query = new WildcardQuery(new Term("field", " Hello w*orld"));
>
> But I need the symbol '*' to be treated as an ordinary character, not as a wildcard.
>
> The analogue of the QueryParser escape, '\\*', doesn't work:
> Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
>
> Thanks
Re: How I can find wildcard symbol with WildcardQuery?
Posted by Erick Erickson <er...@gmail.com>.
Thanks for correcting me on this, I had no idea...... Just goes to show
what happens when an amateur gets in the mix <G>.
Best
Erick
On Tue, Aug 19, 2008 at 8:09 PM, Daniel Noll <da...@nuix.com> wrote:
> Сергій Карпенко wrote:
>
>> Yes, you are correct - NO_NORMS has nothing to do with tokenization,
>> that means no analyzers are used.
>>
>
> Just to avoid this ambiguous, semi-contradicting wording confusing the hell
> out of anyone...
>
> NO_NORMS *does* have something to do with tokenisation -- it implies
> UN_TOKENIZED.
>
> Source code QFT:
>
>     } else if (index == Index.NO_NORMS) {
>       this.isIndexed = true;
>       this.isTokenized = false;
>       this.omitNorms = true;
>     } ...
>
> Daniel
>
> --
> Daniel Noll
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
Re: How I can find wildcard symbol with WildcardQuery?
Posted by Daniel Noll <da...@nuix.com>.
Сергій Карпенко wrote:
> Yes, you are correct - NO_NORMS has nothing to do with tokenization,
> that means no analyzers are used.
Just to avoid this ambiguous, semi-contradicting wording confusing the
hell out of anyone...
NO_NORMS *does* have something to do with tokenisation -- it implies
UN_TOKENIZED.
Source code QFT:
    } else if (index == Index.NO_NORMS) {
      this.isIndexed = true;
      this.isTokenized = false;
      this.omitNorms = true;
    } ...
Daniel
--
Daniel Noll