Posted to dev@lucene.apache.org by "Bill Bell (Created) (JIRA)" <ji...@apache.org> on 2012/02/12 00:22:59 UTC

[jira] [Created] (SOLR-3124) explain output looks unreadable when using boost and edismax - #0; ?

explain output looks unreadable when using boost and edismax - #0; ?
--------------------------------------------------------------------

                 Key: SOLR-3124
                 URL: https://issues.apache.org/jira/browse/SOLR-3124
             Project: Solr
          Issue Type: Bug
    Affects Versions: 3.5
            Reporter: Bill Bell


defType=edismax&boost=query($param)&param=specialties_ids:32&debugQuery=true

<str name="2H7DF">
6.351252 = (MATCH) boost(*:*,query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)), product of:
  1.0 = (MATCH) MatchAllDocsQuery, product of:
    1.0 = queryNorm
  6.351252 = query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)=6.351252
</str><str name="X5PJW">
6.351252 = (MATCH) boost(*:*,query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)), product of:
  1.0 = (MATCH) MatchAllDocsQuery, product of:
    1.0 = queryNorm
  6.351252 = query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)=6.351252
</str>




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-3124) explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226159#comment-13226159 ] 

Robert Muir commented on SOLR-3124:
-----------------------------------

In trunk most of the explanation logic is now in the Sim itself: very little is done by the queries
themselves anymore: just the minimal basics like... toString'ing terms.

It seems the real problem here is how to toString() a Term, right?
We currently have the confusing situation that Terms are generally toString()'ed with .utf8ToString():

{code}
  public final String toString() { return field + ":" + bytes.utf8ToString(); }
{code}

but this gives completely unhelpful output when they are actually binary, like this case. This goes for 
more than just explanations really.

Maybe since all the interning etc is removed and Term is rather simple in 4.0, we should make it non-final, this way subclasses
could override toString... or maybe it should really be a different method name in general (doing fancy stuff in toString is scary?)

Anyway I'm not sure it's the best approach forward, but I just wanted to put the idea out there. I think it sucks if explanations
aren't useful... but having subclasses of Term is scary in its own right too, especially if it's just for debugging but breaks
search code.

For example, this exact case is interesting because for a TermQuery.toString(), Term's toString() is actually not even used:

{code}
if (!term.field().equals(field)) {
  buffer.append(term.field());
  buffer.append(":");
}
buffer.append(term.text());
buffer.append(ToStringUtils.boost(getBoost()));
{code}

So the problem isn't very simple: especially since I'm sure there are places using Term.text() in other ways than
debugging...
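
A minimal sketch of the "different method name" idea above, written without any Lucene dependency. The class TermDebugString and the method toDebugString are hypothetical names used only for illustration; they are not part of Lucene or Solr. The sketch assumes that a term containing control characters or invalid UTF-8 is binary, and falls back to a hex dump of the bytes in that case:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Lucene or Solr class.
class TermDebugString {

    // Renders field:text when the bytes decode to readable UTF-8,
    // and field:[hex bytes] otherwise. The readability test is an
    // assumption made for this sketch.
    static String toDebugString(String field, byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        boolean readable = true;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // U+FFFD is what the decoder substitutes for invalid UTF-8.
            if (Character.isISOControl(c) || c == '\uFFFD') {
                readable = false;
                break;
            }
        }
        if (readable) {
            return field + ":" + text;
        }
        StringBuilder sb = new StringBuilder(field).append(":[");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(bytes[i] & 0xff));
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        // The six bytes shown for foo_ti:42 in the debug output of this issue.
        byte[] trie42 = {0x60, 0x08, 0, 0, 0, 0x2a};
        System.out.println(toDebugString("foo_ti", trie42));
        System.out.println(toDebugString("id", "HOSS".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Keeping this in a separate method rather than in toString() would leave Term.text() and existing toString() callers untouched, which is the concern raised above about code that uses them for more than debugging.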


                
> explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-3124
>                 URL: https://issues.apache.org/jira/browse/SOLR-3124
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Bill Bell
>
> using the trunk example schema containing...
> {noformat}
> <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
> <dynamicField name="*_ti" type="tint"    indexed="true"  stored="true"/>
> {noformat}
> ...and indexing the doc...
> {noformat}
> $ java -Ddata=args -jar post.jar '<add><doc><field name="id">HOSS</field><field name="foo_ti">42</field></doc></add>'
> {noformat}
> ...results in a query for [foo_ti:42|http://localhost:8983/solr/select?q=foo_ti:42&start=0&rows=10&wt=json&debug.explain.structured=true&debugQuery=true&indent=true] producing the following debug output...
> {noformat}
>   "debug":{
>     "rawquerystring":"foo_ti:42",
>     "querystring":"foo_ti:42",
>     "parsedquery":"foo_ti:42",
>     "parsedquery_toString":"foo_ti:`\b\u0000\u0000\u0000*",
>     "explain":{
>       "HOSS":{
>         "match":true,
>         "value":3.6741486,
>         "description":"weight(foo_ti:`\b\u0000\u0000\u0000* in 0) [DefaultSimilarity], result of:",
>         "details":[{
>             "match":true,
>             "value":3.6741486,
>             "description":"fieldWeight in 0, product of:",
>             "details":[{
>                 "match":true,
>                 "value":1.0,
>                 "description":"tf(freq=1.0), with freq of:",
>                 "details":[{
>                     "match":true,
>                     "value":1.0,
>                     "description":"termFreq=1.0"}]},
>               {
>                 "match":true,
>                 "value":3.6741486,
>                 "description":"idf(docFreq=1, maxDocs=29)"},
>               {
>                 "match":true,
>                 "value":1.0,
>                 "description":"fieldNorm(doc=0)"}]}]}},
> ...
> {noformat}



[jira] [Updated] (SOLR-3124) explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)

Posted by "Hoss Man (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-3124:
---------------------------

    Description: 
using the trunk example schema containing...

{noformat}
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<dynamicField name="*_ti" type="tint"    indexed="true"  stored="true"/>
{noformat}

...and indexing the doc...

{noformat}
$ java -Ddata=args -jar post.jar '<add><doc><field name="id">HOSS</field><field name="foo_ti">42</field></doc></add>'
{noformat}

...results in a query for [foo_ti:42|http://localhost:8983/solr/select?q=foo_ti:42&start=0&rows=10&wt=json&debug.explain.structured=true&debugQuery=true&indent=true] producing the following debug output...

{noformat}
  "debug":{
    "rawquerystring":"foo_ti:42",
    "querystring":"foo_ti:42",
    "parsedquery":"foo_ti:42",
    "parsedquery_toString":"foo_ti:`\b\u0000\u0000\u0000*",
    "explain":{
      "HOSS":{
        "match":true,
        "value":3.6741486,
        "description":"weight(foo_ti:`\b\u0000\u0000\u0000* in 0) [DefaultSimilarity], result of:",
        "details":[{
            "match":true,
            "value":3.6741486,
            "description":"fieldWeight in 0, product of:",
            "details":[{
                "match":true,
                "value":1.0,
                "description":"tf(freq=1.0), with freq of:",
                "details":[{
                    "match":true,
                    "value":1.0,
                    "description":"termFreq=1.0"}]},
              {
                "match":true,
                "value":3.6741486,
                "description":"idf(docFreq=1, maxDocs=29)"},
              {
                "match":true,
                "value":1.0,
                "description":"fieldNorm(doc=0)"}]}]}},
...
{noformat}

  was:
defType=edismax&boost=query($param)&param=specialties_ids:32&debugQuery=true

<str name="2H7DF">
6.351252 = (MATCH) boost(*:*,query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)), product of:
  1.0 = (MATCH) MatchAllDocsQuery, product of:
    1.0 = queryNorm
  6.351252 = query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)=6.351252
</str><str name="X5PJW">
6.351252 = (MATCH) boost(*:*,query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)), product of:
  1.0 = (MATCH) MatchAllDocsQuery, product of:
    1.0 = queryNorm
  6.351252 = query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)=6.351252
</str>




        Summary: explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)  (was: explain output looks unreadable when using boost and edismax - #0; ?)

generalizing summary & description since the issue actually has nothing to do with "boosting" and clarifying exactly how to reproduce (the field types used matter)

Bill: the fundamental problem is that the code for generating explain information works with the indexed terms in the queries, which for some field types are not human readable.  The Solr FieldType classes know how to format those indexed terms as readable strings, but the code for generating Explanation objects is at a lower level in Lucene and doesn't know about the schema at all.
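
To illustrate what "formatting an indexed term as a readable string" means for the example above, here is a minimal, self-contained sketch that decodes the six bytes shown in the debug output back to 42. It assumes the layout those bytes suggest: a first byte of 0x60 plus the shift, followed by the sign-flipped int in 7-bit groups, most significant first. The names TrieIntDecode, decode and readable are hypothetical, and this is not the Solr FieldType code:

```java
// Hypothetical decoder, assuming the byte layout described in the lead-in.
class TrieIntDecode {

    // Assumed marker value for the first byte of an int term with shift 0.
    static final int SHIFT_START_INT = 0x60;

    static int decode(byte[] term) {
        int shift = term[0] - SHIFT_START_INT;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("unexpected shift byte: " + term[0]);
        }
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            byte b = term[i];
            if (b < 0) {
                throw new NumberFormatException("byte " + i + " uses more than 7 bits");
            }
            // Each payload byte contributes 7 bits.
            sortableBits = (sortableBits << 7) | b;
        }
        // Undo the sign-bit flip that makes the encoded form sort correctly.
        return (sortableBits << shift) ^ 0x80000000;
    }

    static String readable(String field, byte[] term) {
        return field + ":" + decode(term);
    }

    public static void main(String[] args) {
        // Bytes of the term printed as `\b\u0000\u0000\u0000* in the explain output.
        byte[] indexed = {0x60, 0x08, 0, 0, 0, 0x2a};
        System.out.println(readable("foo_ti", indexed)); // prints foo_ti:42
    }
}
```

The point of the sketch is that the decoding needs to know the field's type, which is exactly the schema knowledge the lower-level explain code lacks.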


                
