Posted to solr-dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2009/04/08 00:56:12 UTC

[jira] Created: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

LukeRequestHandler histogram excludes freq of 1
-----------------------------------------------

                 Key: SOLR-1103
                 URL: https://issues.apache.org/jira/browse/SOLR-1103
             Project: Solr
          Issue Type: Bug
            Reporter: Hoss Man
            Priority: Minor


the TermHistogram class in the LukeRequestHandler seems to properly count the occurrences of terms with a freq of "1", but then when converting to a NamedList it begins iterating at bucket "2", so the counts for freq of "1" don't appear in the result.

this may have been a conscious choice to eliminate superfluously high values for terms with a freq of one ... or it may have been a mistake assuming freq values of 1 would fall in the "2" bucket.
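
A minimal Java sketch of the behavior described above. It is an illustrative reconstruction, not the Solr source: the class name TermHistogramSketch, the firstBucket parameter, and the use of a plain Map in place of Solr's NamedList are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for TermHistogram; not the actual Solr class.
class TermHistogramSketch {
    private final Map<Integer, Integer> hist = new HashMap<>();
    private int maxBucket = -1;

    // Smallest power of two >= freq, so a freq of 1 lands in bucket 1.
    // Sketch only: frequencies above 2^30 would overflow int.
    static int getPowerOfTwoBucket(int freq) {
        if (freq <= 1) {
            return 1;
        }
        int high = Integer.highestOneBit(freq);
        return high == freq ? high : high << 1;
    }

    // Counts one term with the given frequency.
    void add(int freq) {
        int bucket = getPowerOfTwoBucket(freq);
        maxBucket = Math.max(maxBucket, bucket);
        hist.merge(bucket, 1, Integer::sum);
    }

    // Reports buckets firstBucket, 2*firstBucket, ... up to maxBucket.
    // Assumes firstBucket is a power of two >= 1.
    Map<String, Integer> toMap(int firstBucket) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (int bucket = firstBucket; bucket > 0 && bucket <= maxBucket; bucket *= 2) {
            out.put(String.valueOf(bucket), hist.getOrDefault(bucket, 0));
        }
        return out;
    }
}
```

With frequencies 1, 1, 2, 3, 5 added, toMap(2) omits the two terms counted in bucket "1" even though add() recorded them, while toMap(1) reports them.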

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745131#action_12745131 ] 

Yonik Seeley commented on SOLR-1103:
------------------------------------

Should this be fixed for 1.4?


[jira] Updated: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-1103:
---------------------------

    Fix Version/s: 1.4

yeah ... fixing should be trivial, i just wasn't sure where the bug was (the iteration, or the bucket assignment)


[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748044#action_12748044 ] 

Hoss Man commented on SOLR-1103:
--------------------------------

There were three possible fixes depending on what people thought the correct behavior should be. 

i don't have the code in front of me, but as i recall they were all trivial...

1) add a comment
2) change a for loop to start at 1 instead of 2
3) change getPowerOfTwoBucket to have something like...

{code}
return result < 2 ? 2 : result;
{code}
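
To make the difference between options 2 and 3 concrete, here is a small self-contained Java sketch. It is not the Solr code: the class HistogramFixOptions, its method names, and the Map return type are hypothetical, chosen only to compare what each option would report for the same input.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical comparison of options 2 and 3; names are not from the Solr source.
class HistogramFixOptions {
    // Smallest power of two >= freq (a freq of 1 maps to bucket 1).
    // Sketch only: assumes freq <= 2^30.
    static int powerOfTwoBucket(int freq) {
        int bucket = 1;
        while (bucket < freq) {
            bucket *= 2;
        }
        return bucket;
    }

    // Option 3: never return a bucket below 2, so freq 1 is folded into bucket 2.
    static int clampedBucket(int freq) {
        int result = powerOfTwoBucket(freq);
        return result < 2 ? 2 : result;
    }

    // Builds the reported histogram. 'clamp' selects option 3's bucket function;
    // 'firstBucket' is where reporting starts (1 for option 2, 2 otherwise).
    static Map<String, Integer> histogram(int[] freqs, boolean clamp, int firstBucket) {
        Map<Integer, Integer> counts = new HashMap<>();
        int maxBucket = -1;
        for (int freq : freqs) {
            int bucket = clamp ? clampedBucket(freq) : powerOfTwoBucket(freq);
            maxBucket = Math.max(maxBucket, bucket);
            counts.merge(bucket, 1, Integer::sum);
        }
        Map<String, Integer> out = new LinkedHashMap<>();
        for (int bucket = firstBucket; bucket > 0 && bucket <= maxBucket; bucket *= 2) {
            out.put(String.valueOf(bucket), counts.getOrDefault(bucket, 0));
        }
        return out;
    }
}
```

For frequencies 1, 1, 2, 3, 5, option 2 (no clamp, start at 1) keeps the freq-1 terms in their own "1" bucket, whereas option 3 (clamp, start at 2) merges them into the "2" bucket, so the two fixes produce differently shaped output for the same index.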

I think Ryan wrote this code originally: Ryan, do you have any recollection as to what the original intent was with the first bucket?


[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696804#action_12696804 ] 

Hoss Man commented on SOLR-1103:
--------------------------------

possible ways to resolve this...
1) add a comment clarifying that the current behavior is desired
2) make iteration start at "1"
3) change getPowerOfTwoBucket so "2" is the lowest bucket value it ever returns.


[jira] Resolved: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-1103.
----------------------------

    Resolution: Fixed
      Assignee: Hoss Man

Committed revision 810324.

i went ahead and fixed this using the "display the '1' bucket" approach.


[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747935#action_12747935 ] 

Grant Ingersoll commented on SOLR-1103:
---------------------------------------

Hoss, 

Do you have a fix for this?
