Posted to solr-user@lucene.apache.org by "Lewin Joy (TMS)" <le...@toyota.com> on 2016/10/06 06:04:21 UTC

Average of Averages in Solr


Hi,

I have a large collection with around 100 million records.
There is a requirement to compute the average of the "Amount" field for each value of the "code" field,
and then to calculate the average of those averages.
Since my "code" field has very high cardinality, around 200,000 or possibly in the millions, calculating the average of averages in Java becomes very complex.
Even Solr takes a long time to list the averages, and the JSON response becomes huge.
Is there some way we can tackle this? Is there any way we can run stats on stats?

Thanks,
Lewin
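
[Editor's note] To make the requested computation concrete, here is a minimal Python sketch of an "average of averages" over (code, amount) pairs. It is only an illustration of the arithmetic, not something from the thread: the function name and the sample data are hypothetical, and it assumes the records can be iterated once on the client side. It keeps one running sum and one count per distinct code, so memory grows with the number of codes rather than the number of records.

```python
from collections import defaultdict


def average_of_averages(records):
    """Return the mean of the per-code means of `amount`.

    `records` is any iterable of (code, amount) pairs; this shape is an
    assumption made for the example, not a Solr API.
    """
    sums = defaultdict(float)   # running total of amounts per code
    counts = defaultdict(int)   # number of records seen per code
    for code, amount in records:
        sums[code] += amount
        counts[code] += 1
    if not sums:
        return None             # no records, so no average is defined
    per_code_means = (sums[c] / counts[c] for c in sums)
    return sum(per_code_means) / len(sums)


# Hypothetical sample: code "A" averages 15.0, code "B" averages 30.0.
sample = [("A", 10.0), ("A", 20.0), ("B", 30.0)]
print(average_of_averages(sample))  # 22.5
```

Note that the result (22.5) differs from the plain average over all three records (20.0), because every code carries equal weight regardless of how many records it has.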

Re: Average of Averages in Solr

Posted by Susheel Kumar <su...@gmail.com>.
Please look into streaming expressions.  I think that is what you are
looking for.
https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions

Thanks,
Susheel




Re: Average of Averages in Solr

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
This may help?  Note the "Bloomberg Analytics" at the bottom of the post...

https://dzone.com/articles/solr-not-just-for-text-anymore

Quote from article:


   - *Bloomberg Analytics Component for Solr*: Bloomberg Financial Services
   uses Solr extensively, and found the existing statistical packages woefully
   lacking. So, they developed a high-performance framework that can perform
   complex calculations and aggregations on time-series data, and then
   released it to OpenSource.



Re: Average of Averages in Solr

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/6/2016 12:04 AM, Lewin Joy (TMS) wrote:
> There is a requirement to take an average on "Amount" field against
> each "code" field. And then calculate the averages on this averages.
> Since my "code" field has a very huge cardinality, which could be
> around 200,000 or even in millions ; It gets highly complex to
> calculate the average of averages through Java. Even Solr takes a huge
> time listing the averages. And the JSON response size becomes huge. Is
> there some way we can tackle this? Any way we stats on stats? 

I wasn't sure what you meant with the first sentence I quoted above, but
in order to get statistics from your index that are relevant for the
results of a query, you probably want the stats component.

https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

Thanks,
Shawn
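
[Editor's note] As a rough illustration of the stats component suggestion, the Python sketch below only builds a request URL; it does not contact a server. The host, the collection name "mycollection", and the helper name are hypothetical. The parameters `stats`, `stats.field`, and `stats.facet` are stats component request parameters, but check the reference guide for your Solr version, since `stats.facet` has been deprecated in favor of pivot faceting. A request like this would still return one stats entry per distinct code, so on its own it probably does not solve the response-size problem raised in the question.

```python
from urllib.parse import urlencode


def build_stats_url(base_url, collection, value_field, group_field):
    """Build a /select URL asking for stats on value_field per group_field.

    base_url and collection are placeholders for the example.
    """
    params = [
        ("q", "*:*"),                  # match all documents
        ("rows", "0"),                 # only the stats are wanted, no documents
        ("wt", "json"),
        ("stats", "true"),             # enable the stats component
        ("stats.field", value_field),  # field to compute statistics on
        ("stats.facet", group_field),  # break the statistics down per value
    ]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_stats_url("http://localhost:8983/solr", "mycollection", "Amount", "code")
print(url)
```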