Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2014/08/09 03:20:13 UTC

[jira] [Commented] (SOLR-6349) LocalParams for enabling/disabling individual stats

    [ https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091534#comment-14091534 ] 

Hoss Man commented on SOLR-6349:
--------------------------------


Proposed implementation...

* Change StatsValuesFactory.createStatsValues (and the constructors for the various StatsValues impls) to take in the local params from the {{stats.field}}
* each StatsValues impl should validate the stat params it's asked to compute
* all stats should default to disabled, but we need a special backcompat case: if *no* stats are specified in the local params, all current default stats are computed
** we can't be lazy and just check 0==localparams.size() - we need to check the actual stat params, because local params like "ex" and "key" can be present without any stat being requested
* for the distributed logic, where things get a bit more complex (i.e. distrib mean needs sum+count from each shard; distrib stddev needs sum+count+sumOfSquares from each shard), we can go two possible routes:
** A) StatsValues needs a new method so it can be *asked* by StatsComponent what local params it needs when sending shard requests
*** in this case the shard requests carry different localparams than the original request, and the processing of those shard stats requests can be ignorant of the fact that they are distributed
** B) StatsValues (via the factory method) needs to be informed when it's computing stats for an "isShard" request, so it can internally decide what per-shard values to return based on the input
*** in this case, the localparams are not modified per shard, but since "isShard=true" the StatsValues may return different metrics than the ones requested so that the coordinator gets what it needs to aggregate.
** I think I'm leaning towards option "B" - particularly because it simplifies how to deal with situations like "percentiles", where the per-shard info isn't really a stat that should have its own localparam folks might ask for.
* deprecate {{stats.calcdistinct}} but use it as a default for the new corresponding localparam(s)
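
To make the backcompat default and option "B" a bit more concrete, here is a rough Java sketch. Nothing in it is existing Solr API: the class {{StatSelectionSketch}}, the {{Stat}} enum, and the methods {{requested}} and {{perShardStats}} are all hypothetical names, and the set of "current default" stats is assumed to be every value in the enum. It also assumes that an explicit {{min=false}} counts as "a stat was specified" (so the backcompat default does not kick in), and that derived stats are dropped from the shard response in favour of their inputs. The stddev shown is the sample standard deviation built from sum, count and sumOfSquares.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; none of these names exist in Solr. */
final class StatSelectionSketch {

  /** Assumed stat names; each one doubles as its local param name. */
  enum Stat { min, max, sum, count, missing, sumOfSquares, mean, stddev }

  /** Returns the stat for a local param name, or null for non-stat params such as "ex" or "key". */
  static Stat parse(String name) {
    for (Stat s : Stat.values()) {
      if (s.name().equals(name)) return s;
    }
    return null;
  }

  /** Stats the caller asked for; falls back to all (assumed) defaults if no stat param is present. */
  static Set<Stat> requested(Map<String, String> localParams) {
    EnumSet<Stat> enabled = EnumSet.noneOf(Stat.class);
    boolean sawStatParam = false;
    for (Map.Entry<String, String> e : localParams.entrySet()) {
      Stat s = parse(e.getKey());
      if (s == null) continue;            // "ex", "key", etc. do not count as stats
      sawStatParam = true;
      if (Boolean.parseBoolean(e.getValue())) enabled.add(s);
    }
    return sawStatParam ? enabled : EnumSet.allOf(Stat.class);
  }

  /** Option "B": what a shard returns when isShard=true, given the unmodified request. */
  static Set<Stat> perShardStats(Set<Stat> requested) {
    EnumSet<Stat> out = EnumSet.noneOf(Stat.class);
    for (Stat s : requested) {
      switch (s) {
        case mean:                        // coordinator needs sum + count
          out.add(Stat.sum);
          out.add(Stat.count);
          break;
        case stddev:                      // coordinator needs sum + count + sumOfSquares
          out.add(Stat.sum);
          out.add(Stat.count);
          out.add(Stat.sumOfSquares);
          break;
        default:                          // min, max, sum, ... merge directly
          out.add(s);
      }
    }
    return out;
  }

  /** Coordinator side: mean from the merged (summed) per-shard values. */
  static double mean(double sum, long count) {
    return count == 0 ? 0.0 : sum / count;
  }

  /** Coordinator side: sample standard deviation from the merged per-shard values. */
  static double stddev(double sum, long count, double sumOfSquares) {
    if (count <= 1) return 0.0;
    return Math.sqrt((count * sumOfSquares - sum * sum) / (count * (count - 1.0)));
  }

  public static void main(String[] args) {
    Map<String, String> localParams = new LinkedHashMap<>();
    localParams.put("key", "avg_weight");
    localParams.put("mean", "true");
    Set<Stat> wanted = requested(localParams);
    System.out.println(wanted + " -> per shard: " + perShardStats(wanted));
  }
}
```

With this shape the shard request keeps the same localparams as the original request, and only the isShard code path decides to swap "mean" for "sum" and "count". Something like percentiles would get its own case in that switch and return whatever per-shard structure the coordinator needs, without that structure needing a user-facing localparam.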

> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
>                 Key: SOLR-6349
>                 URL: https://issues.apache.org/jira/browse/SOLR-6349
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>
> Stats component currently computes all stats (except for one) every time because they are relatively cheap, and in some cases dependent on each other for distrib computation -- but if we start layering stats on other things it becomes unnecessarily expensive to compute all the stats when the client just wants the "sum" (and it will definitely become excessively verbose in the responses).
> The plan here is to use local params to make this configurable.  All of the existing stat options could be modeled as a simple boolean param, but future params (like percentiles) might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
