Posted to dev@lucene.apache.org by "Jan Høydahl (Created JIRA)" <ji...@apache.org> on 2012/02/20 11:59:35 UTC

[jira] [Created] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Velocity /browse GUI should stick to AND as defaultOperator
-----------------------------------------------------------

                 Key: SOLR-3145
                 URL: https://issues.apache.org/jira/browse/SOLR-3145
             Project: Solr
          Issue Type: Improvement
          Components: web gui
    Affects Versions: 4.0
            Reporter: Jan Høydahl
             Fix For: 4.0


After SOLR-1889 was committed, the DisMax "mm" parameter defaults to whatever is set in q.op. Since defaultOperator in schema.xml is OR, this means that DisMax now defaults to OR (mm=0) instead of the old default (mm=100%). It should stick to AND as before.
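
The fallback described above can be sketched in a few lines of Python. This is only an illustration of the behaviour as stated in this issue, not the actual Solr code; the function name effective_mm and its arguments are hypothetical.

```python
from typing import Optional


def effective_mm(mm: Optional[str],
                 q_op: Optional[str],
                 schema_default_operator: str = "OR") -> str:
    """Return the "mm" value DisMax would end up using (illustrative only).

    Assumption, taken from the issue description: an explicit mm always
    wins; otherwise the operator (q.op, else the schema defaultOperator)
    decides between "100%" for AND and "0%" for OR.
    """
    if mm is not None:
        return mm
    operator = (q_op or schema_default_operator).upper()
    return "100%" if operator == "AND" else "0%"


# No mm and no q.op, schema default OR: the case this issue is about.
print(effective_mm(None, None))      # 0%
# q.op=AND gives the old 3.x-style behaviour.
print(effective_mm(None, "AND"))     # 100%
# An explicit mm is not affected by the operator.
print(effective_mm("100%", None))    # 100%
```

With the example schema (defaultOperator=OR) and no "mm" on the handler, the first case applies, which is why "/browse" stopped requiring all terms.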

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228382#comment-13228382 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

{quote}
Another thing - ALL applications that want to do sorting should care about the precision of their search.
{quote}

That's not searching, that's matching. I think we should default to good behavior for search.
                


[jira] [Updated] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Updated JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated SOLR-3145:
------------------------------

    Attachment: SOLR-3145.patch
    


[jira] [Updated] (SOLR-3145) Velocity "/browse" config should set mm=100% to behave as in 3.x

Posted by "Jan Høydahl (Updated JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated SOLR-3145:
------------------------------

    Description: After SOLR-1889 was committed, the default for the DisMax "mm" parameter changes depending on q.op. Since defaultOperator=OR in the example schema.xml, and no "mm" parameter is specified in the "/browse" request handler, DisMax will fall back to mm=0%. To be consistent with 3.x behavior, we should add mm=100% to the "/browse" config.  (was: After SOLR-1889 was committed, the DisMax "mm" parameter defaults to whatever set in q.op. Since defaultOperator in schema.xml is OR, this means that DisMax now defaults to OR (mm=0) instead of the old default (mm=100%). It should stick to AND as before.)
        Summary: Velocity "/browse" config should set mm=100% to behave as in 3.x  (was: Velocity /browse GUI should stick to AND as defaultOperator)

Clarified title and description.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228354#comment-13228354 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

I think defaulting to AND is very dangerous: especially with more minimal stopword lists
like Lucene's. Then shorter documents that happen to be missing some useless pronoun
don't show up in results at all.

Any problems that this would "Fix" are really problems with Lucene's Similarity: the term
frequency normalization function grows too fast, etc.

Why not fix the real problem instead: either default to a Similarity with a stronger coord()
implementation, or a stronger ranking algorithm altogether.

                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228557#comment-13228557 ] 

Yonik Seeley commented on SOLR-3145:
------------------------------------

bq. > I believe this is how google does it?
bq. This is false.

Rather strong/blanket statement.  It seems roughly true that adding non-trivial words to a google search lowers the number of matches.

I guess we'll continue to disagree with a "lowest common denominator" approach to languages.
It's too bad that our example has no stopwords or stemming any more because of this philosophy.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228370#comment-13228370 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

Robert, we don't disagree that search is more difficult than a simple OR or AND. People need to invest in designing a good search experience, taking these factors as well as many others into consideration. There is no silver bullet for recall or relevancy, and advice to use "OR" is not one either. I have been involved in more than 100 enterprise search installations worldwide, and in perhaps two or three of them we chose "OR" as the default. Most often it's a matter of "AND" as the default plus a lot of careful design in order to increase recall without sacrificing too much precision. Another key point is that people expect AND-ish behavior from the large public search engines, and are puzzled if they keep getting more results the more words they enter in the search box.

Feel free to open new JIRAs for the other shortcomings you mentioned, like better Similarity defaults - I'm a big fan of that as well!
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228522#comment-13228522 ] 

Yonik Seeley commented on SOLR-3145:
------------------------------------

bq.  SOLR-1889 was the correct change [...] Changing the queryparser default to AND is very bad

+1 (but probably for different reasons than you ;-)

But Jan is talking about changing the default for just an example GUI (/browse), and not any query parsers.  That's pretty minor - not a big deal either way, but I do think that from a "finished product" perspective, more people expect all of their query terms to appear in matching documents (and I believe this is how Google does it?)
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228343#comment-13228343 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

Other opinions? If not, I'll prepare a patch changing "/browse" to default to mm=100%.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228616#comment-13228616 ] 

Yonik Seeley commented on SOLR-3145:
------------------------------------

bq. explicit sorting is a no-go for most use cases

Heh.  Seems we live in very different worlds.  Perhaps Lucene is only about full-text search (at least it was in the past), but Solr has always been about much more than that.  Sorting by other things than "full-text relevance" is extremely common.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228361#comment-13228361 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

{quote}
Robert, I don't get your comment - what does this have to do with stopwords or Similarity? It sounds more like a general opinion that you like OR better than AND, the more hits the better...
{quote}

It's not a general opinion. I don't care how many 'totalHits' are returned. I care about the relevance of the top N.

And when good results are discarded simply because the query contained a useless word like 'his', that's bad news.

People are too quick to jump to AND without debugging the real problem. The problem is that they see results that don't contain all of their query terms
ranked above results that do. This is a direct result of Lucene's sqrt() tf normalization function (which it tries to make up for with coord), as opposed
to other alternatives that are less aggressive and are known to perform better.

Forcing everything to AND makes the ranking system extremely fragile in cases like stopwords. It is applying a hammer;
it's not the right default.
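
The ranking effect described above can be reproduced with a toy model. The sketch below keeps only two factors of a classic Lucene-style TF-IDF score, tf = sqrt(freq) and coord = matched terms / query terms, and deliberately ignores idf, length normalization and query normalization, so it is an illustration under those assumptions rather than Lucene's actual scoring. The function toy_score and the sample documents are hypothetical.

```python
import math


def tf(freq: int) -> float:
    # Term-frequency factor: square root of the raw in-document count.
    return math.sqrt(freq)


def coord(overlap: int, max_overlap: int) -> float:
    # Fraction of the query terms that the document matched.
    return overlap / max_overlap


def toy_score(query_terms, doc_term_freqs) -> float:
    # Sum the tf factor over matched terms, then scale by coord.
    # idf and length normalization are left out on purpose (assumption).
    matched = [t for t in query_terms if doc_term_freqs.get(t, 0) > 0]
    if not matched:
        return 0.0
    raw = sum(tf(doc_term_freqs[t]) for t in matched)
    return coord(len(matched), len(query_terms)) * raw


query = ["solr", "velocity"]
partial = {"solr": 25}               # repeats one term, lacks the other
full = {"solr": 1, "velocity": 1}    # contains both terms once

print(toy_score(query, partial))  # 2.5
print(toy_score(query, full))     # 2.0
```

In this simplified setting the document matching only one term still scores higher, even after coord halves its score, which is the symptom that tends to push people towards AND.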

                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228404#comment-13228404 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

{quote}
bq. Another thing - ALL applications that want to do sorting should care about the precision of their search.

That's not searching, that's matching. I think we should default to good behavior for search.
{quote}

Come again? Are you saying people don't build *search driven* applications these days? If so, you're just missing out on a big trend in the market... Our customers tend to request a seamless mix of advanced full-text search, navigation and metadata filtering/sorting. Forcing people into either strict metadata matching OR free-text search is artificial.

Anyway, this is a side track. This issue is about NOT changing the "/browse" behaviour from 3.x to 4.x
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228357#comment-13228357 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

Robert, I don't get your comment - what does this have to do with stopwords or Similarity? It sounds more like a general opinion that you like OR better than AND, the more hits the better...

What this is about is letting the example "/browse" GUI stick to its previous mm=100% behavior so 3.x "/browse" users will have a consistent experience. If people want "OR" they can change it. Personally I'd prefer changing defaultOperator in example schema to "AND", but I'm fine with OR there if /browse gets fixed.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228495#comment-13228495 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

It is no surprise that you get better recall with OR - and thus find certain documents related to one of the terms which do not contain all terms. That's ABC and you don't need to prove that. But that is not the same as assuming that most Solr users prefer OR over AND. People seem to have been happy with "/browse" being AND for the past years, so why change now?
                


[jira] [Assigned] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Assigned JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl reassigned SOLR-3145:
---------------------------------

    Assignee: Jan Høydahl
    


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211776#comment-13211776 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

Two options

A) Change schema.xml defaultOperator from OR to AND. That's what the majority of people want anyway, isn't it?

B) Add mm=100% to the requestHandler config of "/browse"

What do you prefer?
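
Option B can also be tried per request before touching any config, since "mm" is an ordinary request parameter. A minimal Python sketch that only builds the URL; the host, port and path in BASE_URL are assumptions about a local example install, and browse_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed location of the example "/browse" handler on a local install.
BASE_URL = "http://localhost:8983/solr/browse"


def browse_url(query: str, mm: str = "100%") -> str:
    # urlencode percent-escapes the "%" in the mm value ("100%" -> "100%25")
    # and turns spaces in the query into "+".
    params = {"q": query, "mm": mm}
    return BASE_URL + "?" + urlencode(params)


print(browse_url("ipod video"))
print(browse_url("ipod video", mm="0%"))
```

Comparing the result counts of the two URLs for a multi-word query shows the difference between requiring all terms and requiring none of them.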
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228405#comment-13228405 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

I guess I'm not very trendy.

I can run tests comparing AND and OR for you on standard test collections if you want; I already know the answers.
For defaults, we should take the conservative approach. Trendy people can change the defaults.
                


[jira] [Resolved] (SOLR-3145) Velocity "/browse" config should set mm=100% to behave as in 3.x

Posted by "Jan Høydahl (Resolved JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl resolved SOLR-3145.
-------------------------------

    Resolution: Fixed
    
> Velocity "/browse" config should set mm=100% to behave as in 3.x
> ----------------------------------------------------------------
>
>                 Key: SOLR-3145
>                 URL: https://issues.apache.org/jira/browse/SOLR-3145
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 4.0
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>             Fix For: 4.0
>
>         Attachments: SOLR-3145.patch
>
>
> After SOLR-1889 was committed, the default for DisMax "mm" parameter changes depending on q.op. Since defaultOperator=OR in example schema.xml, and no "mm" parameter is specified in the "/browse" request handler, DisMax will fallback to mm=0%. To be consistent with 3.x behavior, we should add mm=100% for "/browse" config.



[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228589#comment-13228589 ] 

Uwe Schindler commented on SOLR-3145:
-------------------------------------

I agree with Jan that "AND" is good for some use cases, but only when the user wants to override scoring and just sort, e.g. by date, which is bogus for full-text search engines altogether. The first thing a full-text search consultant should do is explain to the company representatives that explicit sorting is a no-go for most use cases. If the user wants to influence scoring, he can do that e.g. by adding per-document boost factors as a DocValues field or by multiplying in other score factors like geo distance, but never ever by simply sorting by distance in geo search. A simple example: for users who entered "cocktail bar" into their search engine, a "cocktail bar" 2 miles away might be a better result than a bar called "cocktail stripper" 100 yards away.
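
The cocktail bar example can be sketched in a few lines. All numbers here are invented for illustration: the text scores, the distance_factor function and its scale parameter are assumptions, not anything Solr computes by default.

```python
# Toy data: text scores are made up; distances follow the example (2 miles vs. 100 yards).
places = [
    {"name": "cocktail bar", "score": 2.0, "miles": 2.0},
    {"name": "cocktail stripper", "score": 1.0, "miles": 100 / 1760},
]


def distance_factor(miles, scale=5.0):
    # Hypothetical decay: 1.0 at distance zero, shrinking slowly as distance grows.
    return 1.0 / (1.0 + miles / scale)


# Explicit sort by distance ignores how well the text matched.
by_distance = sorted(places, key=lambda p: p["miles"])

# Multiplying the distance factor into the score keeps text relevance in play.
by_combined = sorted(
    places,
    key=lambda p: p["score"] * distance_factor(p["miles"]),
    reverse=True,
)

print([p["name"] for p in by_distance])
print([p["name"] for p in by_combined])
```

With these toy values the pure distance sort puts "cocktail stripper" first, while the combined score ranks "cocktail bar" first. A steeper decay would flip that, which is exactly the tuning knob a sort cannot offer.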
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211807#comment-13211807 ] 

Yonik Seeley commented on SOLR-3145:
------------------------------------

bq. B) Add a mm=100% to the requestHandler config of "browse"

+1
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228533#comment-13228533 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

{quote}
But Jan is talking about just changing the default for just an example GUI (/browse), and not any query parsers. 
{quote}

I think it's pretty important. The problem is that in some languages, someone enters a search query with some useless particle
or something and misses documents completely, only because of grammatical structure.

Also, for a lot of languages (e.g. Chinese), tokenization into 'query terms' is not even close to completely accurate!

{quote}
That's pretty minor - not a big deal either way, but I do think that from a "finished product" perspective, more people expect all of their query terms to appear in matching documents (and I believe this is how google does it?
{quote}

This is false. Search for 'lucid in imagination' and look at the first result: it does not contain the word 'in'.
This is just an illustration of my point (it's hard to come up with examples for English), but other examples
would be simple things like searching for U.S.A-China relations and missing documents that have U.S.-China relations.
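
A rough sketch of the U.S.A-China case, assuming a toy tokenizer (lowercase, split on whitespace and hyphens) and a simplified minimum-should-match rule where the percentage is rounded down and at least one clause must match. The names tokenize, required_clauses and matches are hypothetical; real analyzers and query parsers behave differently in detail.

```python
import math
import re


def tokenize(text):
    # Toy tokenizer: lowercase, split on whitespace and hyphens.
    return [t for t in re.split(r"[\s\-]+", text.lower()) if t]


def required_clauses(num_terms, mm_percent):
    # Percentage of query terms, rounded down; at least one term must still match.
    return max(1, math.floor(num_terms * mm_percent / 100))


def matches(query, doc, mm_percent):
    query_terms = tokenize(query)
    doc_terms = set(tokenize(doc))
    hits = sum(1 for t in query_terms if t in doc_terms)
    return hits >= required_clauses(len(query_terms), mm_percent)


query = "U.S.A-China relations"
doc = "U.S.-China relations"
print(matches(query, doc, 100))  # AND-like: every query term is required
print(matches(query, doc, 0))    # OR-like: any single term is enough
```

Under these assumptions the document is missed at 100% because "u.s.a" and "u.s." are different tokens, even though two of the three query terms match.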

In general, most of the stopword lists we have are very incomplete and minimal: I think this is good. But if you choose
to use AND as a default, you need to be much more aggressive about these things.

Also, I'm completely failing to mention that use cases that do more natural-language searches (e.g. longer queries) would
suffer even more here.

Again, I think: don't wire the queryparser to force 100% query-term importance; lean on the ranking system to do this.
As I mentioned, in my opinion there are serious problems with Lucene's sqrt() tf normalization (it grows too fast and does
not represent the information gain of additional term occurrences well), causing additional occurrences of only a few terms
to blow up the score versus documents that actually do contain all terms. But we shouldn't solve that with a hammer like this.
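
A small numeric sketch of that tf effect. It compares sqrt(tf) with a BM25-style saturating tf, ignoring idf, length normalization and everything else in a real Similarity. The k1 value of 1.2 and the two toy documents are assumptions chosen only to illustrate the shape of the curves.

```python
import math


def sqrt_tf(freq):
    # tf component as described above: square root of the term frequency.
    return math.sqrt(freq)


def saturating_tf(freq, k1=1.2):
    # BM25-style tf without length normalization; approaches k1 + 1 as freq grows.
    return freq * (k1 + 1) / (freq + k1)


def score_one_term_repeated(tf_fn, freq=9):
    # Two-term query, document contains only one of the terms, 9 times.
    return tf_fn(freq)


def score_all_terms_once(tf_fn):
    # Two-term query, document contains both terms once each.
    return tf_fn(1) + tf_fn(1)


for name, tf_fn in [("sqrt", sqrt_tf), ("saturating", saturating_tf)]:
    print(name, score_one_term_repeated(tf_fn), score_all_terms_once(tf_fn))
```

With sqrt the document repeating a single term outscores the one containing both terms; with the saturating variant the order flips. That is a ranking-side fix rather than a matching-side one.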

So from a 'finished product' perspective, I think it should work reasonably well for as many languages and use cases as possible out of the box:
it should be generic. This kind of tuning, which is specific to only certain use cases/languages/configurations, is well documented
(it's easy to change the default operator) and not tricky to do.

                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228385#comment-13228385 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

{quote}
Feel free to open new JIRAs for the other shortcomings you mentioned, like better Similarity defaults - I'm a big fan of that as well!
{quote}

The problem is that 4.x must still be able to read 3.x indexes and return good results, but 3.x indexes don't have the statistics we need
to e.g. default to BM25 or something else. So I was hoping to bring this up for 5.0. For 4.0, it seems we should take the conservative
approach and keep what we have, so that migrating users don't get bad performance (yes, all those Sims will work in degraded mode
for preflex indexes, but I don't like that).
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228584#comment-13228584 ] 

Uwe Schindler commented on SOLR-3145:
-------------------------------------

bq. Rather strong/blanket statement. It seems roughly true that adding non-trivial words to a google search lowers the number of matches.

This "seems" to be true in lots of cases. But if you search Google for "google number of results" you will see pages from all over the internet discussing this topic. Even Google states in its FAQ that the number of results is just a guess and depends on various factors that appear quite random; there is no reliable relationship between the counts and adding/removing terms. Even the same search returns largely different counts when you change pages (e.g. going from page 1 to 2 completely changes the count). The reason for this is of course query preprocessing, different search clusters and user-specific preferences. To get a more Solr-like result, use "wortwörtlich" (German) / "verbatim" (English) on the left sidebar.

A lot of people simply say that the Google count is arbitrary and useless for any metrics.
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Jan Høydahl (Commented JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228371#comment-13228371 ] 

Jan Høydahl commented on SOLR-3145:
-----------------------------------

Another thing - ALL applications that want to do sorting should care about the precision of their search. If there are 100 relevant docs for a given query, say {{q=sports car}}, and your result set returns 1000 docs since you use q.op=OR, then you may very well get the best sports cars on top, but try sorting by date, price, popularity or anything other than "score" and your results are crap because you only paid attention to search recall, not to precision. It's like a scale - gain one and you lose the other.
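
A toy illustration of that trade-off, with a five-document corpus invented for the purpose. The relevance labels, prices and the whitespace-based search helper are all assumptions, not Solr behaviour.

```python
# Invented corpus: "relevant" marks what a user searching for a sports car wants.
docs = [
    {"id": 1, "text": "fast sports car", "price": 90000, "relevant": True},
    {"id": 2, "text": "vintage sports car", "price": 60000, "relevant": True},
    {"id": 3, "text": "sports shoes", "price": 80, "relevant": False},
    {"id": 4, "text": "car wash coupon", "price": 15, "relevant": False},
    {"id": 5, "text": "family car", "price": 20000, "relevant": False},
]


def search(query, op):
    # AND requires every query term in the document, OR requires any one of them.
    terms = query.split()
    combine = all if op == "AND" else any
    return [d for d in docs if combine(t in d["text"].split() for t in terms)]


def precision(hits):
    # Fraction of returned documents that are actually relevant.
    return sum(d["relevant"] for d in hits) / len(hits)


# Sorting by price discards the score, so every matched document competes equally.
or_hits = sorted(search("sports car", "OR"), key=lambda d: d["price"])
and_hits = sorted(search("sports car", "AND"), key=lambda d: d["price"])

print("OR :", precision(or_hits), [d["text"] for d in or_hits])
print("AND:", precision(and_hits), [d["text"] for d in and_hits])
```

In this toy corpus OR returns all five documents and the cheapest hit is a car wash coupon, while AND returns only the two sports cars. On a larger corpus AND would also drop relevant documents that lack one of the terms, which is the recall side of the scale.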
                


[jira] [Commented] (SOLR-3145) Velocity /browse GUI should stick to AND as defaultOperator

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228508#comment-13228508 ] 

Robert Muir commented on SOLR-3145:
-----------------------------------

Because I think SOLR-1889 was the correct change: the default operator of the Lucene queryparser is OR, and there are many good
reasons for this.

Changing the queryparser default to AND is very bad for isolating languages. I strongly disagree with doing this.
                