Posted to dev@lucene.apache.org by "Hoss Man (Created) (JIRA)" <ji...@apache.org> on 2011/12/30 23:42:35 UTC

[jira] [Created] (SOLR-2996) make "q=*" not suck in the lucene and edismax parsers

make "q=*" not suck in the lucene and edismax parsers
-----------------------------------------------------

                 Key: SOLR-2996
                 URL: https://issues.apache.org/jira/browse/SOLR-2996
             Project: Solr
          Issue Type: Improvement
            Reporter: Hoss Man


More than a few users have gotten burned by thinking that "*" is the appropriate syntax for "match all docs" when what it really does (unless i'm mistaken) is create a prefix query on the default search field using a blank string as the prefix.

since it seems very unlikely that anyone has a genuine use case for making a prefix query with a blank prefix, we should change the default behavior of the LuceneQParser and EDismaxQParser (and any other QParsers that respect *:* if i'm forgetting them) to treat this situation the same as *:*.  we can offer a (local)param to force the old behavior if someone really wants it.
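
The proposed behavior can be sketched as a small pre-parse step. This is only an illustration, not Solr code: the function name normalize_query and the opt-out flag legacy_star_prefix are hypothetical, and the real change would live inside the query parsers themselves.

```python
MATCH_ALL = "*:*"


def normalize_query(q: str, legacy_star_prefix: bool = False) -> str:
    """Treat a bare "*" as "*:*" unless the caller opts out.

    Both the function name and the legacy_star_prefix flag are hypothetical;
    they only illustrate the behavior proposed in this issue.
    """
    if legacy_star_prefix:
        # Old behavior: leave "*" alone so it stays an empty-prefix query.
        return q
    if q.strip() == "*":
        # New default: a bare "*" means "match all docs".
        return MATCH_ALL
    # Anything else (including real prefix queries like "foo*") is untouched.
    return q
```

With this sketch, normalize_query("*") gives "*:*", while normalize_query("foo*") and normalize_query("*", legacy_star_prefix=True) come back unchanged.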


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2996) make "q=*" not suck in the lucene and edismax parsers

Posted by "Hoss Man (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177799#comment-13177799 ] 

Hoss Man commented on SOLR-2996:
--------------------------------

Recent example of this type of confusion and the problems it can cause from the mailing list...

https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3Calpine.DEB.2.00.1112131115550.16571@bester%3E

Another recent discussion about this type of problem from IRC...

{noformat}
13:25 < mikeliss:#solr> Hi, I'm running into an error with maxbooleanclauses when I try to do a range query with 
                        highlighting...is there any workaround for this? Would really appreciate some direction, if 
                        anybody knows.
13:26 < mikeliss:#solr> This is the query that dies: 
http://localhost:8983/solr/select/?q=*&version=2.2&start=0&rows=20&indent=on&hl=true&hl.fl=text,caseName,westCite,docketNumber,lexisCite,court_citation_string&hl.snippets=5&f.text.hl.alternateField=text&f.text.hl.maxAlternateFieldLength=500

13:28 < hoss:#solr> that query doesn't make sense ... for a couple of reasons ... what are you *trying* to do?
13:29 < hoss:#solr> i mean ... for starters ... there is no range query there.  second, q=* is a big red flag: it's 
                    a prefix query on the default field using the prefix "" (ie: the empty string)

14:23 < mikeliss:#solr> hoss, yeah, I assumed that highlighting would just do nothing if a prefix query were given 
                        on an empty string.
14:24 < mikeliss:#solr> hoss, I added a check in my code that will only enable highlighting if the query isn't '*'.
14:24 < mikeliss:#solr> hoss, Seems naive, but it's working at least for the moment.

14:27 < hoss:#solr> i think you're missing my point: q=* is a fairly nonsensical query ... you shouldn't just 
                    prevent highlighting on that query, you should stop doing that query in the first place
14:28 < hoss:#solr> as a query solr can handle it, and optimize it to be efficient
14:28 < hoss:#solr> (even though it's silly)
14:28 < mikeliss:#solr> hoss, I'm using that query on my homepage to show the latest documents in the index. It 
                        should just return everything, right?
14:28 < hoss:#solr> but for highlighting, the highlighter actually needs to know all the terms it matches
14:28 < hoss:#solr> and to know all the terms it matches, it needs to look at *ALL* the terms in the default field
14:29 < hoss:#solr> mikeliss: no, no, NO ... i'm not sure where people started getting the misconception that 
                    "q=*" matches all docs, but that is *NOT* what it does
14:29 < hoss:#solr> one second...
14:30 < hoss:#solr> mikeliss: 
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3CCAL69qOn1XeMNz6JYdWj_o7rH_=O3i-NiqdO6rorvN48bywU+nA@mail.gmail.com%3E
14:30 < hoss:#solr> ...and...
14:30 < hoss:#solr> https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3Calpine.DEB.2.00.1112131115550.16571@bester%3E

14:32 < mikeliss:#solr> hoss, ah, that makes sense. I guess * is just too tempting, since it is something users can 
                        easily remember.
14:34 < mikeliss:#solr> hoss, back to my original issue, now I'm confused why hl fails on a search for *. Shouldn't 
                        it just highlight nothing, and return results? I wasn't able to get debugging to work for 
                        the query, so I'm a bit confused..

14:35 < hoss:#solr> see my other comment above: the highlighter is trying to find all the terms used in the query 
                    to highlight them -- a query for "*" matches all terms in the default field, which is way more 
                    than the highlighter can handle (hence the exception)

14:38 < hoss:#solr> i'm filing a bug to change the behavior of "q=*" ... do you mind if i cut/paste this dialog 
                    into the jira issue as an example of user confusion?
14:39 < mikeliss:#solr> Not at all. I was wondering if that was potentially a bug...figured I'd leave it to the 
                        experts.
{noformat}
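
The failure discussed in the log above can be reproduced with a toy model. This is a sketch, not Lucene's or Solr's code: the inverted index is a plain dict, MAX_CLAUSES stands in for the maxBooleanClauses setting mentioned in the log (the value 4 is arbitrary), and every function name is made up for illustration. It shows two things: an empty prefix matches every term in the field, so any step that has to enumerate the matched terms runs into the clause limit, and documents with no value in the default field are not matched at all, which is where "*" differs from "*:*".

```python
# Toy inverted index for one "default" field: term -> ids of docs containing it.
# Doc 4 has no value in this field, so it appears under no term.
INDEX = {
    "apple": {1},
    "apricot": {2},
    "banana": {1, 3},
    "cherry": {3},
    "citrus": {2},
}
ALL_DOCS = {1, 2, 3, 4}
MAX_CLAUSES = 4  # arbitrary stand-in for maxBooleanClauses


class TooManyClauses(Exception):
    """Raised when a query would expand to more terms than MAX_CLAUSES."""


def expand_prefix(prefix):
    """Return every indexed term starting with prefix ("" matches all terms)."""
    return sorted(t for t in INDEX if t.startswith(prefix))


def prefix_docs(prefix):
    """Docs matched by a prefix query: union of postings of the matching terms."""
    docs = set()
    for term in expand_prefix(prefix):
        docs |= INDEX[term]
    return docs


def terms_for_highlighting(prefix):
    """A highlighter-like step that needs one clause per matched term."""
    terms = expand_prefix(prefix)
    if len(terms) > MAX_CLAUSES:
        raise TooManyClauses(f"{len(terms)} terms > {MAX_CLAUSES}")
    return terms
```

In this model, prefix_docs("") returns docs 1, 2 and 3 but not doc 4, so it is not the same as ALL_DOCS. terms_for_highlighting("ap") returns the two matching terms, while terms_for_highlighting("") raises TooManyClauses because all five terms in the field match the empty prefix.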
                



[jira] [Updated] (SOLR-2996) make "q=*" not suck in the lucene and edismax parsers

Posted by "Hoss Man (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-2996:
---------------------------

    Description: 
More than a few users have gotten burned by thinking that "{{\*}}" is the appropriate syntax for "match all docs" when what it really does (unless i'm mistaken) is create a prefix query on the default search field using a blank string as the prefix.

since it seems very unlikely that anyone has a genuine use case for making a prefix query with a blank prefix, we should change the default behavior of the LuceneQParser and EDismaxQParser (and any other QParsers that respect {{\*:\*}} if i'm forgetting them) to treat this situation the same as {{\*:\*}}.  we can offer a (local)param to force the old behavior if someone really wants it.


  was:
More than a few users have gotten burned by thinking that "*" is the appropriate syntax for "match all docs" when what it really does (unless i'm mistaken) is create a prefix query on the default search field using a blank string as the prefix.

since it seems very unlikely that anyone has a genuine use case for making a prefix query with a blank prefix, we should change the default behavior of the LuceneQParser and EDismaxQParser (and any other QParsers that respect *:* if i'm forgetting them) to treat this situation the same as *:*.  we can offer a (local)param to force the old behavior if someone really wants it.



fix jira markup in description
                
