Posted to general@lucene.apache.org by maahi333 <ma...@gmail.com> on 2016/09/12 10:46:15 UTC

Lexical error showing wrong column number.

Hi,

I am parsing a query and calculating the offset at which parsing failed.
I am having a problem with queries that contain double quotes.

E.g. for the query "news \" obama" I get the error below:
: Lexical error at line 1, column 13. Encountered: <EOF> after : "\" OBAMA"
. The offset I calculated is 5.

So when a query contains a double quote, does the parser read on to the end
of the string and report the error at the column equal to the length of the
query string?

Another example: for the query "test123456\"99999900002493524" it reports a
lexical error at column 29, but we can see that the error offset is 11.




--
View this message in context: http://lucene.472066.n3.nabble.com/Lexical-error-showing-wrong-column-number-tp4295716.html
Sent from the Lucene - General mailing list archive at Nabble.com.

Re: Lexical error showing wrong column number.

Posted by maahi333 <ma...@gmail.com>.
Thanks Hoss. 



--
View this message in context: http://lucene.472066.n3.nabble.com/Lexical-error-showing-wrong-column-number-tp4295716p4300835.html

Re: Lexical error showing wrong column number.

Posted by Chris Hostetter <ho...@fucit.org>.
: I am parsing a query and calculating the offset at which parsing failed.
: I am having a problem with queries that contain double quotes.
: 
: E.g. for the query "news \" obama" I get the error below:
: : Lexical error at line 1, column 13. Encountered: <EOF> after : "\" OBAMA"
: . The offset I calculated is 5.
: 
: So when a query contains a double quote, does the parser read on to the end
: of the string and report the error at the column equal to the length of the
: query string?

Correct.

The parsing error happens at the end of the string, when the parser detects 
that the quoted phrase is unterminated.  The parser has no way of guessing 
whether the double-quote character at position 5 is intended to be an "end 
quoted phrase" -- as it parses the string left to right, it must assume that 
the string is valid and that the quote is intended as a "beginning quoted 
phrase", and it encounters an error when there is no "end quoted phrase" 
before the EOF.

As far as the parser knows, everything up to column 13 is 100% valid 
syntax, and the easiest "fix" to the string is to add '\"' at the end of 
the string.

If you instead had "xxx)yyy" the parser would be able to tell that the 
')' character was out of place, and would report the error at column 3.
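
If what you want is the offset of the quote that opens the unterminated 
phrase, rather than the EOF column the lexer reports, you can scan the raw 
query string yourself.  Below is a minimal sketch, assuming a backslash 
escapes the next character and that every unescaped double quote toggles a 
quoted phrase.  The class and method names are hypothetical and are not part 
of Lucene.

```java
// Hypothetical helper, not a Lucene API: locates the double quote that
// opens a quoted phrase which is never closed.
final class UnterminatedQuoteFinder {

    private UnterminatedQuoteFinder() {}

    /**
     * Returns the 0-based offset of the quote that opens an unterminated
     * phrase, or -1 if every opening quote has a matching closing quote.
     * Assumption: a backslash escapes the character that follows it.
     */
    static int findUnterminatedQuote(String query) {
        int open = -1; // offset of the currently open quote, -1 if none
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '"') {
                // An unescaped quote either opens a phrase or closes the open one.
                open = (open == -1) ? i : -1;
            }
        }
        return open;
    }

    public static void main(String[] args) {
        // The Java literals below contain a single raw double quote each.
        System.out.println(findUnterminatedQuote("news \" obama"));                  // 5
        System.out.println(findUnterminatedQuote("test123456\"99999900002493524"));  // 10
        System.out.println(findUnterminatedQuote("\"news obama\""));                 // -1
    }
}
```

Offsets here are 0-based, so the second query from the original post yields 
10, i.e. the 11th character.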



-Hoss
http://www.lucidworks.com/