Posted to solr-user@lucene.apache.org by Shankar Sundararaju <sh...@ebrary.com> on 2013/05/22 18:16:07 UTC

Can anyone explain this Solr query behavior?

This query returns 0 documents: *q=(+Title:() +Classification:()
+Contributors:() +text:())*

This returns 1 document: *q=doc-id:3000*

And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
AND (+Title:() +Classification:() +Contributors:() +text:())*

Am I missing something here? Can someone please explain? I am using Solr
4.2.1

Thanks
-Shankar

Re: Can anyone explain this Solr query behavior?

Posted by Shankar Sundararaju <sh...@ebrary.com>.
Hi Upayavira,

Thank you for your analysis. I thought 'AND' and groupings were supported,
as per the documentation:

http://docs.lucidworks.com/display/solr/The+Extended+DisMax+Query+Parser
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html#Grouping

But yes, q=doc-id:3000 AND (-text:[* TO *]) works as expected.

Thanks
-Shankar



On Thu, May 23, 2013 at 5:31 PM, Upayavira <uv...@odoko.co.uk> wrote:

> (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
> Classification:and^2.0 | Contributors:and^2.0 |
> Title:and^3.0))))/no_coord
>
> You're using edismax, not lucene. So AND is being considered as a search
> term, not an operator, and the word 'and' probably exists in 631580
> documents.
>
> Why is it triggering dismax? Probably because field:() is not valid
> syntax, so edismax is dropping to dismax because it isn't a valid lucene
> query.
>
> What do you expect text:() to do?
>
> If you want to match any docs that have a value in the text field, use
> q=text:[* TO *]
>
> To match docs that *don't* have a value in the text field:
> q=-text:[* TO *]
>
> Upayavira



-- 
Regards,
*Shankar Sundararaju
*Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
Shankar@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)

Re: Can anyone explain this Solr query behavior?

Posted by Upayavira <uv...@odoko.co.uk>.
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 |
Title:and^3.0))))/no_coord

You're using edismax, not lucene. So AND is being considered as a search
term, not an operator, and the word 'and' probably exists in 631580
documents.

Why is it triggering dismax? Probably because field:() is not valid
syntax, so edismax is dropping to dismax because it isn't a valid lucene
query.

What do you expect text:() to do?

If you want to match any docs that have a value in the text field, use
q=text:[* TO *]

To match docs that *don't* have a value in the text field:
q=-text:[* TO *]

Upayavira
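
As a rough illustration of the two range queries above, here is a minimal Python sketch that only builds the request URLs. The base URL and the helper name solr_url are assumptions for illustration, not taken from the thread; defType=lucene and debug=query are the parameters mentioned elsewhere in this thread.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint; adjust host, port, and core path for your install.
BASE_URL = "http://localhost:8983/solr/select"

def solr_url(q, def_type="lucene"):
    """Build a select URL with query debugging enabled."""
    params = {"q": q, "defType": def_type, "debug": "query"}
    return BASE_URL + "?" + urlencode(params)

# Documents that have a value in the text field.
has_text = solr_url("text:[* TO *]")
# Documents that do not have a value in the text field.
no_text = solr_url("-text:[* TO *]")

print(has_text)
print(no_text)
```

Building the URL through urlencode avoids hand-escaping the brackets, spaces, and asterisks in the range syntax.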


Re: Can anyone explain this Solr query behavior?

Posted by Jack Krupansky <ja...@basetechnology.com>.
Oh, I simply changed the query parser type to lucene, with &defType=lucene 
and then I see essentially the same error that edismax does when it 
internally tries to parse the query.

But it might be nice if DEBUG-level logging for edismax displayed the 
error as well and then told you what remediation it was performing.

-- Jack Krupansky
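
To see the parser's complaint directly, as described above, the request can be sent with defType=lucene and the error element read from the response. Below is a minimal Python sketch of the second half only, run against a pasted sample instead of a live server. The msg and code elements are copied from the error quoted elsewhere in this thread; the surrounding response and lst name="error" wrapper, and the function name parse_error, are assumptions.

```python
import xml.etree.ElementTree as ET

# Sample error body. The <str name="msg"> and <int name="code"> elements are
# from the thread (expected-token list omitted); the wrapper is assumed.
SAMPLE = """<response>
<lst name="error">
<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND text:()': Encountered " ")" ") "" at line 1, column 15.</str>
<int name="code">400</int>
</lst>
</response>"""

def parse_error(xml_text):
    """Return (code, msg), or None if the response carries no error fields."""
    root = ET.fromstring(xml_text)
    msg = root.find(".//str[@name='msg']")
    code = root.find(".//int[@name='code']")
    if msg is None or code is None:
        return None
    return int(code.text), msg.text

code, msg = parse_error(SAMPLE)
print(code, msg.split(":")[0])
```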

-----Original Message----- 
From: Shankar Sundararaju
Sent: Friday, May 24, 2013 1:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Can anyone explain this Solr query behavior?

Hi Jack Krupansky,

Thank you for your reply. I would like to know how you got the error
logging? Is there any special flag I have to turn on? Because I don't see
it in my solr.log even after switching the log level to DEBUG.

<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
AND text:()': Encountered " ")" ") "" at line 1, column 15.

Thanks
-Shankar




Re: Can anyone explain this Solr query behavior?

Posted by Shankar Sundararaju <sh...@ebrary.com>.
Hi Jack Krupansky,

Thank you for your reply. I would like to know how you got the error
logging? Is there any special flag I have to turn on? Because I don't see
it in my solr.log even after switching the log level to DEBUG.

<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
AND text:()': Encountered " ")" ") "" at line 1, column 15.

Thanks
-Shankar


On Thu, May 23, 2013 at 5:41 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> Okay... sorry I wasn't paying close enough attention. What is happening is
> that the empty parentheses are illegal in Lucene query syntax:
>
>  <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:*
> AND text:()': Encountered " ")" ") "" at line 1, column 15.
> Was expecting one of:
>    &lt;NOT&gt; ...
>    "+" ...
>    "-" ...
>    &lt;BAREOPER&gt; ...
>    "(" ...
>    "*" ...
>    &lt;QUOTED&gt; ...
>    &lt;TERM&gt; ...
>    &lt;PREFIXTERM&gt; ...
>    &lt;WILDTERM&gt; ...
>    &lt;REGEXPTERM&gt; ...
>    "[" ...
>    "{" ...
>    &lt;LPARAMS&gt; ...
>    &lt;NUMBER&gt; ...
>    &lt;TERM&gt; ...
>    "*" ...
>    </str>
>  <int name="code">400</int>
>
> Edismax traps such errors and then "escapes" the query so that Lucene will
> no longer throw an error. In this case, it puts quotes around the "AND"
> operator, which is why you see "and" included in the parsed query as if it
> were a term. And I believe it turns "text:()" into "text:"()"", which makes
> the original Lucene error go away, but the "()" analyzes to nothing and
> generates no term in the query.
>
> So, fix your syntax error and the anomaly should go away.
>
> -- Jack Krupansky

Re: Can anyone explain this Solr query behavior?

Posted by Jack Krupansky <ja...@basetechnology.com>.
Okay... sorry I wasn't paying close enough attention. What is happening is 
that the empty parentheses are illegal in Lucene query syntax:

  <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND 
text:()': Encountered " ")" ") "" at line 1, column 15.
Was expecting one of:
    &lt;NOT&gt; ...
    "+" ...
    "-" ...
    &lt;BAREOPER&gt; ...
    "(" ...
    "*" ...
    &lt;QUOTED&gt; ...
    &lt;TERM&gt; ...
    &lt;PREFIXTERM&gt; ...
    &lt;WILDTERM&gt; ...
    &lt;REGEXPTERM&gt; ...
    "[" ...
    "{" ...
    &lt;LPARAMS&gt; ...
    &lt;NUMBER&gt; ...
    &lt;TERM&gt; ...
    "*" ...
    </str>
  <int name="code">400</int>

Edismax traps such errors and then "escapes" the query so that Lucene will 
no longer throw an error. In this case, it puts quotes around the "AND" 
operator, which is why you see "and" included in the parsed query as if it 
were a term. And I believe it turns "text:()" into "text:"()"", which makes 
the original Lucene error go away, but the "()" analyzes to nothing and 
generates no term in the query.

So, fix your syntax error and the anomaly should go away.

-- Jack Krupansky
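
One way to avoid ever sending an empty field:() clause is to skip blank values when the query string is composed. The Python sketch below assumes the clauses are assembled from user input where some fields may be left blank; the thread does not say how the query is built, so that is a guess. The function name build_query and its arguments are hypothetical; the field names are the ones from the thread.

```python
def build_query(doc_id, fields):
    """Build a Lucene-syntax query string, omitting fields with no value.

    doc_id and fields are hypothetical inputs. Values are inserted as-is;
    escaping of special query characters is not handled in this sketch.
    """
    clauses = []
    for name, value in fields.items():
        value = value.strip()
        if value:  # skip blanks so an empty "field:()" is never emitted
            clauses.append("+{}:({})".format(name, value))
    query = "doc-id:{}".format(doc_id)
    if clauses:
        query += " AND (" + " ".join(clauses) + ")"
    return query

blank = {"Title": "", "Classification": " ", "Contributors": "", "text": ""}
print(build_query(3000, blank))
print(build_query(3000, {"Title": "solr", "text": ""}))
```

With every field blank, the group is dropped entirely, so the parser never sees empty parentheses.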

-----Original Message----- 
From: Shankar Sundararaju
Sent: Thursday, May 23, 2013 7:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Can anyone explain this Solr query behavior?

Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()&debug=query*

    yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"></result>
<lst name="debug">
<str name="rawquerystring">text:()</str>
<str name="querystring">text:()</str>
<str name="parsedquery">(+())/no_coord</str>
<str name="parsedquery_toString">+()</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000&debug=query*

    yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="q">doc-id:3000</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="11.682044">
<doc>
  :
  :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000</str>
<str name="querystring">doc-id:3000</str>
<str name="parsedquery">(+doc-id:3000)/no_coord</str>
<str name="parsedquery_toString">+doc-id:`#8;#0;#0;#23;8</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000 AND text:()&debug=query*

  yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">23</int>
<lst name="params">
<str name="q">doc-id:3000 AND text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="631647" start="0" maxScore="8.056607">
<doc>
:
</doc>
:
</doc>
<doc>
:
</doc>
<doc>
:
</doc>
<doc>
:
</doc>
<doc>
:
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000 AND text:()</str>
<str name="querystring">doc-id:3000 AND text:()</str>
<str name="parsedquery">
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))))/no_coord
</str>
<str name="parsedquery_toString">
+(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*solrconfig.xml:*
<requestHandler name="/select" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="df">text</str>
       <str name="defType">edismax</str>
       <str name="qf">text^1.0 Title^3.0 Classification^2.0
Contributors^2.0 Publisher^2.0</str>
     </lst>

*schema.xml:*
<field name="text" type="my_text" indexed="true" stored="false" required="false"/>
<dynamicField name="*" type="my_text" indexed="true" stored="true"
multiValued="false"/>
<fieldType name="my_text" class="solr.TextField">
  <analyzer type="index" class="MyAnalyzer"/>
  <analyzer type="query" class="MyAnalyzer"/>
  <analyzer type="multiterm" class="MyAnalyzer"/>
</fieldType>

*Note:* MyAnalyzer, among a few other customizations, uses
WhitespaceTokenizer and LowerCaseFilter.

Thanks a lot.

-Shankar


On Thu, May 23, 2013 at 4:34 AM, Erick Erickson <er...@gmail.com> wrote:

> Please post the results of adding &debug=query to the URL.
> That'll tell us what the query parser spits out which is much
> easier to analyze.
>
> Best
> Erick
>
> On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
> <sh...@ebrary.com> wrote:
> > This query returns 0 documents: *q=(+Title:() +Classification:()
> > +Contributors:() +text:())*
> >
> > This returns 1 document: *q=doc-id:3000*
> >
> > And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
> > AND (+Title:() +Classification:() +Contributors:() +text:())*
> >
> > Am I missing something here? Can someone please explain? I am using Solr
> > 4.2.1
> >
> > Thanks
> > -Shankar
>



-- 
Regards,
Shankar Sundararaju
Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
Shankar@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c) 


Re: Can anyone explain this Solr query behavior?

Posted by Shankar Sundararaju <sh...@ebrary.com>.
Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()&debug=query*

    yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"></result>
<lst name="debug">
<str name="rawquerystring">text:()</str>
<str name="querystring">text:()</str>
<str name="parsedquery">(+())/no_coord</str>
<str name="parsedquery_toString">+()</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000&debug=query*

    yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="q">doc-id:3000</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="11.682044">
<doc>
  :
  :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000</str>
<str name="querystring">doc-id:3000</str>
<str name="parsedquery">(+doc-id:3000)/no_coord</str>
<str name="parsedquery_toString">+doc-id:`#8;#0;#0;#23;8</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000 AND text:()&debug=query*

  yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">23</int>
<lst name="params">
<str name="q">doc-id:3000 AND text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="631647" start="0" maxScore="8.056607">
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000 AND text:()</str>
<str name="querystring">doc-id:3000 AND text:()</str>
<str name="parsedquery">
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))))/no_coord
</str>
<str name="parsedquery_toString">
+(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*solrconfig.xml:*
<requestHandler name="/select" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="df">text</str>
       <str name="defType">edismax</str>
       <str name="qf">text^1.0 Title^3.0 Classification^2.0
Contributors^2.0 Publisher^2.0</str>
     </lst>

*schema.xml:*
<field name="text" type="my_text" indexed="true" stored="false" required="false"/>
<dynamicField name="*" type="my_text" indexed="true" stored="true" multiValued="false"/>
<fieldType name="my_text" class="solr.TextField">
  <analyzer type="index" class="MyAnalyzer"/>
  <analyzer type="query" class="MyAnalyzer"/>
  <analyzer type="multiterm" class="MyAnalyzer"/>
</fieldType>

*Note:* MyAnalyzer, among a few other customizations, uses WhitespaceTokenizer
and LowerCaseFilter.
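
As a rough illustration of what such an analyzer chain does to the query
text: splitting on whitespace and then lowercasing turns the literal token
AND into the term and, which is the term that appears in the parsed query
(text:and). The sketch below is a plain-Python stand-in, not the Lucene
classes; the function name analyze is hypothetical, and MyAnalyzer's other
customizations are not modeled.

```python
def analyze(text):
    """Approximate a whitespace tokenizer followed by a lowercase filter.

    Assumption: this only mimics the two steps named in the note; it is
    not the real MyAnalyzer and ignores its other customizations.
    """
    # Step 1: split on runs of whitespace (whitespace tokenizer stand-in).
    tokens = text.split()
    # Step 2: lowercase every token (lowercase filter stand-in).
    return [token.lower() for token in tokens]


if __name__ == "__main__":
    # If the parser hands the word AND to the analyzer as ordinary text,
    # the indexed-term form it is matched against is "and".
    print(analyze("AND"))
```

This only shows the term form; whether AND reaches the analyzer at all
depends on how the query parser treats it.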

Thanks a lot.

-Shankar


On Thu, May 23, 2013 at 4:34 AM, Erick Erickson <er...@gmail.com> wrote:

> Please post the results of adding &debug=query to the URL.
> That'll tell us what the query parser spits out which is much
> easier to analyze.
>
> Best
> Erick
>
> On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
> <sh...@ebrary.com> wrote:
> > This query returns 0 documents: *q=(+Title:() +Classification:()
> > +Contributors:() +text:())*
> >
> > This returns 1 document: *q=doc-id:3000*
> >
> > And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
> > AND (+Title:() +Classification:() +Contributors:() +text:())*
> >
> > Am I missing something here? Can someone please explain? I am using Solr
> > 4.2.1
> >
> > Thanks
> > -Shankar
>



-- 
Regards,
Shankar Sundararaju
Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
Shankar@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)

Re: Can anyone explain this Solr query behavior?

Posted by Erick Erickson <er...@gmail.com>.
Please post the results of adding &debug=query to the URL.
That'll tell us what the query parser spits out, which is much
easier to analyze.
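
A minimal sketch of building such a URL, assuming a Solr instance at
localhost:8983 and a core named collection1 (both hypothetical; substitute
your own). It only constructs the request URL and does not contact Solr.

```python
from urllib.parse import urlencode

# Assumption: host, port, and core name are placeholders, not values
# taken from this thread.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def debug_url(query):
    """Return a /select URL for `query` with debug=query appended."""
    # urlencode percent-escapes characters such as ':' and spaces.
    params = urlencode({"q": query, "debug": "query"})
    return BASE_URL + "?" + params


if __name__ == "__main__":
    print(debug_url("doc-id:3000"))
```

The debug section of the response then includes parsedquery and QParser,
which show how the query string was actually interpreted.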

Best
Erick

On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
<sh...@ebrary.com> wrote:
> This query returns 0 documents: *q=(+Title:() +Classification:()
> +Contributors:() +text:())*
>
> This returns 1 document: *q=doc-id:3000*
>
> And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
> AND (+Title:() +Classification:() +Contributors:() +text:())*
>
> Am I missing something here? Can someone please explain? I am using Solr
> 4.2.1
>
> Thanks
> -Shankar