Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2013/01/23 09:28:12 UTC
[jira] [Commented] (LUCENE-4709) Nuke FacetResultNode.residue
[ https://issues.apache.org/jira/browse/LUCENE-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560476#comment-13560476 ]
Shai Erera commented on LUCENE-4709:
------------------------------------
BTW, some supporting evidence that we should nuke it comes from the following benchmark results (thanks Mike!). Base is trunk; comp is trunk with the residue computation removed:
{noformat}
Task              QPS base  StdDev  QPS comp  StdDev  Pct diff
Respell             111.64  (3.2%)    110.49  (3.2%)  -1.0% ( -7% -  5%)
OrHighHigh            4.33  (2.8%)      4.30  (3.0%)  -0.7% ( -6% -  5%)
HighSpanNear          2.98  (2.3%)      2.97  (2.0%)  -0.4% ( -4% -  3%)
HighSloppyPhrase      0.89  (8.9%)      0.89  (8.2%)  -0.3% (-15% - 18%)
HighTerm              7.95  (2.3%)      7.93  (2.4%)  -0.2% ( -4% -  4%)
OrHighLow             7.57  (2.2%)      7.55  (2.3%)  -0.2% ( -4% -  4%)
OrHighMed             7.51  (2.7%)      7.51  (2.8%)   0.1% ( -5% -  5%)
Wildcard             74.46  (3.6%)     74.54  (2.0%)   0.1% ( -5% -  5%)
PKLookup            247.56  (2.1%)    247.85  (2.8%)   0.1% ( -4% -  5%)
LowSpanNear           7.54  (4.6%)      7.59  (3.6%)   0.7% ( -7% -  9%)
AndHighHigh          12.56  (0.9%)     12.68  (1.0%)   0.9% ( -1% -  2%)
MedSpanNear          19.88  (1.5%)     20.08  (2.2%)   1.0% ( -2% -  4%)
MedSloppyPhrase      18.45  (2.1%)     18.64  (2.1%)   1.0% ( -3% -  5%)
LowSloppyPhrase      17.52  (3.7%)     17.71  (3.8%)   1.1% ( -6% -  8%)
Prefix3              45.70  (5.6%)     46.25  (2.7%)   1.2% ( -6% - 10%)
LowPhrase            16.86  (3.4%)     17.07  (3.1%)   1.2% ( -5% -  8%)
MedTerm              23.00  (1.4%)     23.33  (1.8%)   1.4% ( -1% -  4%)
IntNRQ               17.97  (7.8%)     18.26  (4.7%)   1.6% (-10% - 15%)
HighPhrase           15.71  (7.0%)     15.98  (5.2%)   1.7% ( -9% - 15%)
Fuzzy1               33.30  (1.8%)     33.90  (1.3%)   1.8% ( -1% -  5%)
Fuzzy2               41.46  (2.2%)     42.26  (2.0%)   1.9% ( -2% -  6%)
LowTerm              40.47  (1.1%)     41.45  (1.7%)   2.4% (  0% -  5%)
AndHighMed           49.38  (0.9%)     51.08  (1.3%)   3.4% (  1% -  5%)
MedPhrase            55.65  (2.7%)     57.79  (2.5%)   3.8% ( -1% -  9%)
AndHighLow           98.02  (1.5%)    104.36  (2.9%)   6.5% (  2% - 10%)
{noformat}
> Nuke FacetResultNode.residue
> ----------------------------
>
> Key: LUCENE-4709
> URL: https://issues.apache.org/jira/browse/LUCENE-4709
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Shai Erera
> Assignee: Shai Erera
>
> The residue is the count of all categories that did not make it to the top K. But, this is a senseless statistic. Take for example the following case: two documents with categories [A/1, A/2, A/3] and [A/1, A/4, A/5]. If you ask for top-1 category of "A", you'll get A (count=2), A/1 (count=2), but A's residue will be 4!
> As a user, that number doesn't tell you anything, except maybe when you index only one category per document for a given dimension. But in that case, the residue is {{root.value - sum(topK.value)}}, which the application can compute if it needs to.
> In short, we're just wasting CPU cycles for that statistic, so I'm going to remove it.
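
To make the arithmetic in the description concrete, here is a minimal standalone sketch in plain Java. It does not use the Lucene facet API; the class name ResidueDemo and its methods are hypothetical and exist only for illustration. It counts child categories per document, sums the counts that fall outside the top K, and compares that sum with root.value - sum(topK.value).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Lucene.
public class ResidueDemo {

    /** Counts, for each child category, how many documents carry it. */
    public static Map<String, Integer> countChildren(List<List<String>> docs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> doc : docs) {
            for (String category : doc) {
                counts.merge(category, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the child counts sorted from highest to lowest. */
    private static List<Integer> sortedDescending(Map<String, Integer> counts) {
        List<Integer> sorted = new ArrayList<>(counts.values());
        sorted.sort(Collections.reverseOrder());
        return sorted;
    }

    /** Residue as described in the issue: sum of counts outside the top K. */
    public static int residue(Map<String, Integer> counts, int topK) {
        List<Integer> sorted = sortedDescending(counts);
        int sum = 0;
        for (int i = topK; i < sorted.size(); i++) {
            sum += sorted.get(i);
        }
        return sum;
    }

    /** What an application can compute itself: root.value - sum(topK.value). */
    public static int rootMinusTopK(int rootValue, Map<String, Integer> counts, int topK) {
        List<Integer> sorted = sortedDescending(counts);
        int sum = 0;
        for (int i = 0; i < Math.min(topK, sorted.size()); i++) {
            sum += sorted.get(i);
        }
        return rootValue - sum;
    }

    public static void main(String[] args) {
        // The two documents from the issue description; both belong to "A".
        List<List<String>> multi = Arrays.asList(
                Arrays.asList("A/1", "A/2", "A/3"),
                Arrays.asList("A/1", "A/4", "A/5"));
        Map<String, Integer> multiCounts = countChildren(multi);
        System.out.println(residue(multiCounts, 1));                      // 4
        System.out.println(rootMinusTopK(multi.size(), multiCounts, 1));  // 0

        // One category per document: the two numbers agree.
        List<List<String>> single = Arrays.asList(
                Arrays.asList("A/1"), Arrays.asList("A/1"), Arrays.asList("A/2"));
        Map<String, Integer> singleCounts = countChildren(single);
        System.out.println(residue(singleCounts, 1));                       // 1
        System.out.println(rootMinusTopK(single.size(), singleCounts, 1));  // 1
    }
}
```

In the multi-category case the residue (4) exceeds the number of documents under "A" (2), which is why it carries no useful meaning. In the single-category case the application can recover the same number from values it already has.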