Posted to dev@lucene.apache.org by Doug Cutting <DC...@grandcentral.com> on 2001/10/31 00:59:43 UTC
RE: Re(2): Re: [Lucene-dev] Katakana characters in queries (a bug?)
> From: Halácsy Péter [mailto:halacsy.peter@axelero.com]
>
> I think IDENTIFIER_CHAR doesn't need to be the first char so my
> proposal is:
> <TERM: ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*", "?",
> "~", "{", "}", "[", "]" ] )+ >
That looks like the right approach to me.
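
As a rough illustration of what the proposed TERM definition accepts, the negated character set can be approximated with a Java regular expression. This is only a sketch: the class name TermCharSketch and the method looksLikeTerm are hypothetical, and a plain regex ignores how the generated tokenizer arbitrates between TERM and the other tokens (AND, OR, NUMBER and so on).

```java
import java.util.regex.Pattern;

public class TermCharSketch {
    // Regex approximation of the proposed TERM token: one or more characters,
    // none of which is in the excluded set
    // " space tab ( ) : & | ^ * ? ~ { } [ ]
    private static final Pattern TERM =
        Pattern.compile("[^\" \\t():&|^*?~{}\\[\\]]+");

    // Hypothetical helper: true if the whole string consists of term characters.
    static boolean looksLikeTerm(String s) {
        return TERM.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeTerm("lucene"));                   // true
        System.out.println(looksLikeTerm("\u30AB\u30BF\u30AB\u30CA")); // Katakana, true
        System.out.println(looksLikeTerm("luc*ne"));                   // false, contains *
        System.out.println(looksLikeTerm("a b"));                      // false, contains a space
    }
}
```

Because the set is defined by exclusion, Katakana (and any other non-ASCII script) is accepted without having to be listed explicitly.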
> On the other hand IDENTIFIER, ALPHA_CHAR, ALPHANUM_CHAR tokens are
> defined but are not used.
So let's remove them!
> ps: I don't understand the definition of WILD_TERM. It states that a
> wild term must end with identifier_char, so cannot end with
> *. Is it the right definition?
Yes. The code for handling a final asterisk (PrefixQuery) is different from
the general term wildcarding code (WildCardQuery).
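
The distinction can be sketched in plain Java as follows. This is a simplified illustration, not the parser's actual code: the class WildcardKindSketch, the enum Kind and the method classify are hypothetical names, and inputs that the grammar itself would tokenize differently (for example a term ending in "?") are not modelled.

```java
public class WildcardKindSketch {
    // Hypothetical labels for the kind of query a piece of query text calls for.
    enum Kind { TERM, PREFIX, WILDCARD }

    // Simplified classification: a single trailing '*' with no other wildcard
    // is the prefix case; any other use of '*' or '?' is the general
    // wildcard case; text without wildcards is a plain term.
    static Kind classify(String text) {
        int star = text.indexOf('*');
        int qmark = text.indexOf('?');
        if (star < 0 && qmark < 0) {
            return Kind.TERM;
        }
        if (qmark < 0 && star > 0 && star == text.length() - 1) {
            return Kind.PREFIX;
        }
        return Kind.WILDCARD;
    }

    public static void main(String[] args) {
        System.out.println(classify("test"));   // TERM
        System.out.println(classify("test*"));  // PREFIX
        System.out.println(classify("te*t"));   // WILDCARD
        System.out.println(classify("te?t"));   // WILDCARD
    }
}
```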
These changes yield the following token definitions in QueryParser.jj:
<*> TOKEN : {
  <#_NUM_CHAR:   ["0"-"9"] >
| <#_TERM_CHAR:  ~["\"", " ", "\t", "(", ")", ":", "&", "|",
                   "^", "*", "?", "~", "{", "}", "[", "]" ] >
| <#_NEWLINE:    ( "\r\n" | "\r" | "\n" ) >
| <#_WHITESPACE: ( " " | "\t" ) >
| <#_QCHAR:      ( "\\" (<_NEWLINE> | ~["a"-"z", "A"-"Z", "0"-"9"] ) ) >
| <#_RESTOFLINE: (~["\r", "\n"])* >
}

<DEFAULT> TOKEN : {
  <AND:      ("AND" | "&&") >
| <OR:       ("OR" | "||") >
| <NOT:      ("NOT" | "!") >
| <PLUS:     "+" >
| <MINUS:    "-" >
| <LPAREN:   "(" >
| <RPAREN:   ")" >
| <COLON:    ":" >
| <CARAT:    "^" >
| <STAR:     "*" >
| <QUOTED:   "\"" (~["\""])+ "\"">
| <NUMBER:   (["+","-"])? (<_NUM_CHAR>)+ "." (<_NUM_CHAR>)+ >
| <TERM:     (<_TERM_CHAR>)+ >
| <FUZZY:    "~" >
| <WILDTERM: <_TERM_CHAR>
             ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "~", "{",
                 "}", "[", "]" ] )+ <_TERM_CHAR>>
| <RANGEIN:  "[" (~["]"])+ "]">
| <RANGEEX:  "{" (~["}"])+ "}">
}

<DEFAULT> SKIP : {
  <<_WHITESPACE>>
}
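
To make the shape of WILDTERM concrete, here is a regex approximation in Java: a term character, then one or more characters from the wider set that also admits "*" and "?", then a closing term character. The class name WildTermSketch and the method isWildTerm are hypothetical, and the regex only mirrors the token's pattern. It does not model how the generated tokenizer chooses between competing tokens.

```java
import java.util.regex.Pattern;

public class WildTermSketch {
    // Same excluded set as _TERM_CHAR in the grammar above.
    private static final String TERM_CHAR = "[^\" \\t():&|^*?~{}\\[\\]]";
    // Middle part of WILDTERM: same set, except that '*' and '?' are allowed.
    private static final String MIDDLE_CHAR = "[^\" \\t():&|^~{}\\[\\]]";

    private static final Pattern WILDTERM =
        Pattern.compile(TERM_CHAR + MIDDLE_CHAR + "+" + TERM_CHAR);

    // Hypothetical helper: true if the whole string has the WILDTERM shape.
    static boolean isWildTerm(String s) {
        return WILDTERM.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWildTerm("te*t"));   // true
        System.out.println(isWildTerm("te?t"));   // true
        System.out.println(isWildTerm("test*"));  // false, ends with *
        System.out.println(isWildTerm("*est"));   // false, starts with *
        System.out.println(isWildTerm("ab"));     // false, needs at least three characters
        System.out.println(isWildTerm("abc"));    // true, no wildcard is required
    }
}
```

The last case shows that the pattern also matches plain words of three or more characters. JavaCC prefers the longest match and, on a tie, the token defined first, so such input should come out as TERM because TERM is listed before WILDTERM.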
Can folks try these and tell me if they solve the problem?
Ideally we should add some cases for this to the junit tests, but I can't
get junit to work at all right now... Have the junit tests ever run
correctly from ant since the move to Jakarta? Can someone more familiar
with junit have a look at this?
Doug