Posted to dev@lucene.apache.org by "Dennis Gove (JIRA)" <ji...@apache.org> on 2015/11/13 02:50:11 UTC

[jira] [Commented] (SOLR-8185) Add operations support to streaming metrics

    [ https://issues.apache.org/jira/browse/SOLR-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003390#comment-15003390 ] 

Dennis Gove commented on SOLR-8185:
-----------------------------------

I'm running into some issues turning the expression into something on which .equals() behaves as expected.

{code}
 avg(a_f, replace(10, withValue=0))
{code}
In this example, what type is 10? Is it a long, a float, or a double? The field is a float (as noted by the _f), so one would expect 10 to be a float as well. However, when converting 10 to some Object that we can call .equals(...) on, we cannot tell from the expression alone what the type should be. This has been a persistent problem with this patch.

But I think I've come up with something that puts some of the decision making in the hands of the expression writer.

{code}
 avg(a_f, replace(10f, withValue=0f))
{code}
In this case the value can only be converted to a float so it will be created as a float object.
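
To make the idea concrete, here is a minimal sketch of how a suffix could pin down the type of a literal. The class and method names (LiteralParser, parseLiteral) are hypothetical and are not taken from the patch, and the handling of the 'd' suffix and of unsuffixed values is an assumption made only for illustration.

```java
public class LiteralParser {

    // Hypothetical helper, NOT part of the SOLR-8185 patch.
    // Turns the raw text of a literal into a boxed value whose type
    // is chosen by the writer's suffix where one is present.
    static Object parseLiteral(String raw) {
        String s = raw.trim();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty literal");
        }
        if (s.equals("null")) {
            return null; // replace(null, ...) style literal
        }
        char last = Character.toLowerCase(s.charAt(s.length() - 1));
        String body = s.substring(0, s.length() - 1);
        if (last == 'f') {
            return Float.valueOf(body);  // 10f -> Float
        }
        if (last == 'd') {
            return Double.valueOf(body); // 10d -> Double (assumed suffix)
        }
        // No suffix: the type is ambiguous, so this sketch has to guess.
        if (s.contains(".")) {
            return Double.valueOf(s);
        }
        return Long.valueOf(s);
    }

    public static void main(String[] args) {
        System.out.println(parseLiteral("10f").getClass().getSimpleName()); // Float
        System.out.println(parseLiteral("10").getClass().getSimpleName());  // Long
    }
}
```

The unsuffixed branch is exactly where the ambiguity lives: whatever default is picked there will be wrong for some field types.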

However, before adding this new requirement for the expression writer, I want to take a deeper look at what it might impact and make sure the documentation is very clear. If a user doesn't do the correct thing (gives us 10 instead of 10f) and the value in the tuple is a float, then float.equals(long) == false every single time.
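
The mismatch is easy to reproduce with plain boxed Java values. This is only an illustration: the matches helper is a hypothetical stand-in for whatever comparison the replace operation ends up doing, not code from the patch.

```java
import java.util.Objects;

public class BoxedEqualsDemo {

    // Hypothetical stand-in for the comparison inside a replace(...) operation.
    static boolean matches(Object tupleValue, Object literal) {
        return Objects.equals(tupleValue, literal);
    }

    public static void main(String[] args) {
        Object tupleValue = Float.valueOf(10f); // value read from an a_f field

        // 10 parsed as a long: same number, different boxed type.
        System.out.println(matches(tupleValue, Long.valueOf(10L)));    // false

        // 10f parsed as a float: types agree, so equals() succeeds.
        System.out.println(matches(tupleValue, Float.valueOf(10f)));   // true

        // Even a double does not match a Float.
        System.out.println(matches(tupleValue, Double.valueOf(10d)));  // false
    }
}
```

Float.equals(Object) returns true only when the argument is itself a Float representing the same value, which is why the type of the literal has to be settled before the comparison.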

Anyway, this note is somewhat of a rant. 

> Add operations support to streaming metrics
> -------------------------------------------
>
>                 Key: SOLR-8185
>                 URL: https://issues.apache.org/jira/browse/SOLR-8185
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Assignee: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8185.patch
>
>
> Adds support for operations on stream metrics.
> With this feature one can modify tuple values before applying to the computed metric. There are a lot of use-cases I can see with this - I'll describe one here.
> Imagine you have a RollupStream which is computing the average over some field, but you cannot be sure that all documents have a value for that field, i.e., the value may be null. When the value is null you want to treat it as 0. With this feature you can accomplish that like this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   count(*)
> )
> {code}
> The operations are applied to the tuple for each metric in the stream, which means you can perform different operations on different metrics without being impacted by operations on other metrics.
> Adding to our previous example, imagine you also want to get the min of a field, but without treating null values as 0.
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   min(a_i),
>   count(*)
> )
> {code}
> Also, the tuple is not modified for streams that might wrap this one. I.e., the only thing that sees the applied operation is that particular metric. If you want to apply operations for wrapping streams you can still achieve that with the SelectStream (SOLR-7669).
> One feature I'm investigating, but which this patch DOES NOT add, is the ability to assign names to the resulting metric value. For example, to allow for something like this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0), as="avg_a_i_null_as_0"),
>   avg(a_i),
>   count(*, as="totalCount")
> )
> {code}
> Right now that isn't possible because the identifier for both avg metrics would be the same, "avg_a_i", and as such both couldn't be returned. It's relatively easy to add but I have to investigate its impact on the SQL and FacetStream areas.
> Depends on SOLR-7669 (SelectStream)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
