Posted to java-user@lucene.apache.org by Matt Magoffin <ap...@msqr.us> on 2005/08/12 16:30:19 UTC

QueryParser exception on escaped backslash preceding ) character

When I try to parse a query with an escaped backslash character like this
(using Lucene 1.4.3):

-id:20677 +(addr:Street143 AND zip:\\)

the QueryParser throws an Exception:

Encountered "<EOF>" at line 1, column 289.
			Was expecting one of: <AND> ... <OR> ...
			<NOT> ... "+" ... "-" ... "(" ... ")" ... "^" ...
			<QUOTED> ... <TERM> ...
			<PREFIXTERM> ... <WILDTERM> ... "[" ...
			"{" ... <NUMBER> ...

However, if I insert a space between the backslash and the parenthesis:

-id:20677 +(addr:Street143 AND zip:\\ )

it works. Is this expected behavior or perhaps a bug in the QueryParser?
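
The space workaround described above can be applied mechanically. The following is only a sketch: the class and method names are hypothetical, it uses the JDK alone, and it assumes that padding with a space is acceptable for the field being queried (it does not distinguish quoted phrases). It inserts a space before a ')' only when the preceding run of backslashes has even length, meaning the ')' itself is not escaped.

```java
class EscapedBackslashWorkaround {

    // Inserts a space between a fully paired run of backslashes and a following ')'.
    // Hypothetical helper, not part of Lucene.
    static String padBeforeCloseParen(String query) {
        StringBuilder out = new StringBuilder();
        int run = 0; // length of the current run of consecutive backslashes
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == ')' && run > 0 && run % 2 == 0) {
                // Even run: every backslash is paired, so this ')' is not escaped.
                out.append(' ');
            }
            out.append(c);
            run = (c == '\\') ? run + 1 : 0;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below holds two backslashes at runtime.
        System.out.println(padBeforeCloseParen("+(addr:Street143 AND zip:\\\\)"));
    }
}
```

A ')' preceded by an odd number of backslashes is left alone, since in that case the parenthesis itself is the escaped character.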

-- m@

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: QueryParser exception on escaped backslash preceding ) character

Posted by Matt Magoffin <ap...@msqr.us>.
Sure:

import junit.framework.TestCase;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class TestLuceneBackslashBug extends TestCase {

	public void testLuceneBackslashBug() throws Exception {
		// Fails on Lucene 1.4.3: the Java literal below is the query
		// -id:123 +(addr:Street143 AND zip:\\) and parse() throws.
		Query q = new QueryParser("foo",new StandardAnalyzer()).parse(
			"-id:123 +(addr:Street143 AND zip:\\\\)");
	}

	public void testLuceneBackslashBugWorkaround() throws Exception {
		// Passes: same query with a space between the escaped backslash and ')'.
		Query q = new QueryParser("foo",new StandardAnalyzer()).parse(
			"-id:123 +(addr:Street143 AND zip:\\\\ )");
	}

}



Re: QueryParser exception on escaped backslash preceding ) character

Posted by Chris Hostetter <ho...@fucit.org>.
the mailing list isn't fond of attachments ... can you inline it in your
email?


-Hoss




Re: QueryParser exception on escaped backslash preceding ) character

Posted by Matt Magoffin <ap...@msqr.us>.
Sure, here's a test case with the bug and the workaround.

-- m@



Re: QueryParser exception on escaped backslash preceding ) character

Posted by Chris Hostetter <ho...@fucit.org>.
can you provide a JUnit test that generates the exception ... if it's
coming from the parse call it should only require a 1 line test.


-Hoss




Re: QueryParser exception on escaped backslash preceding ) character

Posted by Yonik Seeley <ys...@gmail.com>.
I can verify that bad things are going on with backslashes and the
query parser in Lucene 1.4.3:

  foo:hi\\ ==> foo:hi\
  (foo:hi\\) ==> exception
  foo:"hi\\" ==> foo:hi\\
  foo:hi\\^3 ==> foo:hi\^3
  foo:"hi \\ there" ==> foo:"hi \\ there"
  foo:'hi there' ==> foo:'hi
  foo:"\"" ==> exception
  foo:hi\" ==> foo:hi"

So there appears to be no way to query for something with a quote and
a space in the same string...

IMHO, backslash should work as an escape inside quoted values.

-Yonik




Re: QueryParser exception on escaped backslash preceding ) character

Posted by Matt Magoffin <ap...@msqr.us>.
The strings are not coming from Java literals, actually, so I didn't think
that was the problem.

Any other thoughts?

-- m@



Re: QueryParser exception on escaped backslash preceding ) character

Posted by Chris Hostetter <ho...@fucit.org>.
I think you are encountering a "double escape" problem in java literals.
QP is seeing a backslash in front of the ) and waiting for you to finish
the paren grouping.

how are you passing that string to the QP, is it embedded in your java
code?  if so the java compiler is interpreting your \\ and your java  app
is never seeing it.
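
To make the "double escape" point concrete, here is a small JDK-only sketch (the class and method names are made up for illustration) that counts how many backslashes actually survive from a Java string literal into the runtime string that a parser would receive.

```java
class BackslashLiteralDemo {

    // Counts the backslash characters actually present in a runtime string.
    static int countBackslashes(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\\') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Two backslashes in source: the compiler collapses them to ONE
        // backslash, so a query parser would see an escaped ')'.
        String oneAtRuntime = "zip:\\)";

        // Four backslashes in source: TWO reach the parser, i.e. an
        // escaped backslash followed by an unescaped ')'.
        String twoAtRuntime = "zip:\\\\)";

        System.out.println(oneAtRuntime + " -> " + countBackslashes(oneAtRuntime));
        System.out.println(twoAtRuntime + " -> " + countBackslashes(twoAtRuntime));
    }
}
```

If the query text comes from user input or a file rather than from a literal, no compiler unescaping takes place and the string reaches the parser exactly as typed.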




-Hoss




Re: QueryParser exception on escaped backslash preceding ) character

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Aug 15, 2005, at 3:05 PM, Monsur Hossain wrote:
> We've actually been running into this sort of issue a lot, since we take a
> user-generated query from a web page and then push it into a QueryParser.
> In general we've learned that escaping special characters is not enough to
> create a well-formed query.  Since our users aren't running complicated
> queries, we decided instead to completely parse out any non-alphanumerics.
> But we still have issues when, for example, someone will search for:
>
> Portland, OR
>
> And Lucene will interpret that "OR" as a special word, rather than "Oregon".
> I'm wondering how others are dealing with this type of scenario.  If it'll
> help, I can provide more queries that will cause errors similar to the one
> in the original post.

If you're ripping out non-alphanumerics, you may as well simply
create the Query through the API instead of using QueryParser.  I do
this for all the structured search options on the Rossetti Archive
search page: http://www.rossettiarchive.org/rose/

I analyze (or string tokenize) the text for a field, and then build a
Query from it rather than allowing QueryParser expression syntax.
QueryParser is probably overkill for most applications' needs anyway,
and can end up getting in the way in the cases described in this thread.

     Erik



RE: QueryParser exception on escaped backslash preceding ) character

Posted by Monsur Hossain <mo...@monsur.com>.
We've actually been running into this sort of issue a lot, since we take a
user-generated query from a web page and then push it into a QueryParser.
In general we've learned that escaping special characters is not enough to
create a well-formed query.  Since our users aren't running complicated
queries, we decided instead to completely parse out any non-alphanumerics.
But we still have issues when, for example, someone will search for:

Portland, OR

And Lucene will interpret that "OR" as a special word, rather than "Oregon".
I'm wondering how others are dealing with this type of scenario.  If it'll
help, I can provide more queries that will cause errors similar to the one
in the original post.
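
One possible way to handle the "Portland, OR" case while still using QueryParser is sketched below. It is a JDK-only helper with hypothetical names, and it assumes that the parser recognizes AND, OR and NOT as operators only when they are written in upper case, so lowercasing them turns them into ordinary words. Note that, depending on the analyzer, a lowercased "or" may then be dropped as a stop word; it stops acting as an operator but does not become "Oregon".

```java
import java.util.Locale;

class SimpleQuerySanitizer {

    // Returns a query string with punctuation removed and the operator words
    // AND, OR and NOT lowercased so they are treated as plain words.
    static String sanitize(String userInput) {
        // Replace every run of characters that are not letters or digits with one space.
        String cleaned = userInput.replaceAll("[^\\p{L}\\p{Nd}]+", " ").trim();
        if (cleaned.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (String word : cleaned.split(" ")) {
            if (word.equals("AND") || word.equals("OR") || word.equals("NOT")) {
                word = word.toLowerCase(Locale.ROOT);
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("Portland, OR"));
    }
}
```

This keeps the user's words in their original order and only neutralizes the parts that the query syntax would otherwise interpret.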

Thanks,
Monsur

 
