Posted to solr-dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2009/04/08 00:56:12 UTC
[jira] Created: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
LukeRequestHandler histogram excludes freq of 1
-----------------------------------------------
Key: SOLR-1103
URL: https://issues.apache.org/jira/browse/SOLR-1103
Project: Solr
Issue Type: Bug
Reporter: Hoss Man
Priority: Minor
The TermHistogram class in the LukeRequestHandler seems to properly count the occurrences of terms with a freq of "1", but then when converting to a NamedList it begins iterating at bucket "2", so the counts for freq of "1" don't appear in the result.
This may have been a conscious choice to eliminate superfluously high values for terms with a freq of one ... or it may have been a mistake, assuming freq values of 1 would fall in the "2" bucket.
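To make the report concrete, here is a minimal sketch of the behavior described above. It is not the Solr source: the class name `TermHistogramSketch`, the `toOrderedMap` method, the `firstBucket` parameter, and the round-up-to-a-power-of-two rule are all assumptions for illustration, and a `LinkedHashMap` stands in for Solr's `NamedList` so the example compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the histogram described in the report; not the Solr source.
class TermHistogramSketch {
    private final Map<Integer, Integer> hist = new TreeMap<>();
    private int maxBucket = -1;

    // Assumed bucketing rule: smallest power of two >= num (1 -> 1, 2 -> 2, 3 -> 4, 5 -> 8).
    // Only valid for 1 <= num <= 2^30; larger values would overflow int.
    static int getPowerOfTwoBucket(int num) {
        int bucket = 1;
        while (bucket < num) {
            bucket *= 2;
        }
        return bucket;
    }

    // Counts one term with the given frequency.
    void add(int freq) {
        int bucket = getPowerOfTwoBucket(freq);
        maxBucket = Math.max(maxBucket, bucket);
        hist.merge(bucket, 1, Integer::sum);
    }

    // Emits buckets firstBucket, 2*firstBucket, ... up to maxBucket.
    // firstBucket must be a positive power of two (0 would never terminate).
    Map<String, Integer> toOrderedMap(int firstBucket) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (int bucket = firstBucket; bucket <= maxBucket; bucket *= 2) {
            out.put(String.valueOf(bucket), hist.getOrDefault(bucket, 0));
        }
        return out;
    }

    public static void main(String[] args) {
        TermHistogramSketch h = new TermHistogramSketch();
        for (int freq : new int[] {1, 1, 1, 2, 3, 5}) {
            h.add(freq);
        }
        // Starting at bucket 2 silently drops the three terms with freq 1.
        System.out.println(h.toOrderedMap(2)); // prints {2=1, 4=1, 8=1}
        // Starting at bucket 1 reports them.
        System.out.println(h.toOrderedMap(1)); // prints {1=3, 2=1, 4=1, 8=1}
    }
}
```

In this sketch the counts for freq 1 are stored correctly in either case; only the starting point of the output loop decides whether they are reported.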
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745131#action_12745131 ]
Yonik Seeley commented on SOLR-1103:
------------------------------------
Should this be fixed for 1.4?
[jira] Updated: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-1103:
---------------------------
Fix Version/s: 1.4
yeah ... fixing should be trivial, I just wasn't sure where the bug was (the iteration, or the bucket assignment).
[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748044#action_12748044 ]
Hoss Man commented on SOLR-1103:
--------------------------------
There were three possible fixes, depending on what people thought the correct behavior should be.
I don't have the code in front of me, but as I recall they were all trivial...
1) add a comment
2) change a for loop to start at 1 instead of 2
3) change getPowerOfTwoBucket to have something like...
{code}
return result < 2 ? 2 : result;
{code}
I think Ryan wrote this code originally. Ryan, do you have any recollection as to what the original intent was with the first bucket?
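For comparison, the third option could look roughly like the sketch below. The class name `BucketClampSketch`, the method `getClampedBucket`, and the round-up-to-a-power-of-two rule inside `getPowerOfTwoBucket` are assumptions for illustration, not the actual Solr code.

```java
// Hypothetical sketch of option 3: never return a bucket lower than 2.
class BucketClampSketch {
    // Assumed bucketing rule: smallest power of two >= num.
    // Only valid for 1 <= num <= 2^30; larger values would overflow int.
    static int getPowerOfTwoBucket(int num) {
        int bucket = 1;
        while (bucket < num) {
            bucket *= 2;
        }
        return bucket;
    }

    // Same rule, but freq 1 is folded into the "2" bucket.
    static int getClampedBucket(int num) {
        int result = getPowerOfTwoBucket(num);
        return result < 2 ? 2 : result;
    }

    public static void main(String[] args) {
        for (int freq = 1; freq <= 5; freq++) {
            System.out.println(freq + " -> " + getClampedBucket(freq));
        }
        // prints:
        // 1 -> 2
        // 2 -> 2
        // 3 -> 4
        // 4 -> 4
        // 5 -> 8
    }
}
```

The trade-off between the options: with this clamp, a loop that starts at bucket "2" no longer loses anything, but terms with freq 1 and freq 2 become indistinguishable in the output. Starting the loop at 1 instead (option 2) keeps them in separate buckets.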
[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696804#action_12696804 ]
Hoss Man commented on SOLR-1103:
--------------------------------
Possible ways to resolve this...
1) add a comment clarifying that the current behavior is desired
2) make iteration start at "1"
3) change getPowerOfTwoBucket so "2" is the lowest bucket value it ever returns.
[jira] Resolved: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man resolved SOLR-1103.
----------------------------
Resolution: Fixed
Assignee: Hoss Man
Committed revision 810324.
I went ahead and fixed this using the "display the '1' bucket" approach.
[jira] Commented: (SOLR-1103) LukeRequestHandler histogram excludes freq of 1
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747935#action_12747935 ]
Grant Ingersoll commented on SOLR-1103:
---------------------------------------
Hoss,
Do you have a fix for this?