Posted to solr-user@lucene.apache.org by "Lewin Joy (TMS)" <le...@toyota.com> on 2016/10/06 06:04:21 UTC
Average of Averages in Solr
Hi,
I have a large collection with around 100 million records.
There is a requirement to take the average of the "Amount" field for each value of the "code" field,
and then to calculate the average of these averages.
Since my "code" field has very high cardinality (around 200,000 distinct values, possibly millions), calculating the average of averages in Java gets highly complex.
Even Solr takes a long time to list the averages, and the JSON response becomes huge.
Is there some way we can tackle this? Is there any way to compute stats on stats?
Thanks,
Lewin
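
[Editor's sketch] The calculation described above can be done in one pass with constant memory if the records arrive grouped by "code". The Java below is a minimal sketch under that assumption and is not taken from the thread: the class name, the method name, and the use of Map.Entry pairs for (code, Amount) are illustrative only. How the sorted records are fetched from Solr (for example a sorted export) is left out.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class AverageOfAverages {

    /**
     * Returns the mean of the per-code means of Amount.
     * ASSUMPTION: records are sorted (grouped) by code, so only the
     * current group's running sum and count are kept in memory.
     */
    public static double averageOfAverages(Iterator<Map.Entry<String, Double>> sortedRecords) {
        String currentCode = null;
        double groupSum = 0.0;        // sum of Amount within the current code
        long groupCount = 0;          // number of records within the current code
        double sumOfGroupMeans = 0.0; // sum of the per-code means seen so far
        long groupsSeen = 0;          // number of distinct codes seen so far

        while (sortedRecords.hasNext()) {
            Map.Entry<String, Double> rec = sortedRecords.next();
            // A change of code closes the previous group.
            if (currentCode != null && !currentCode.equals(rec.getKey())) {
                sumOfGroupMeans += groupSum / groupCount;
                groupsSeen++;
                groupSum = 0.0;
                groupCount = 0;
            }
            currentCode = rec.getKey();
            groupSum += rec.getValue();
            groupCount++;
        }
        // Close the last group, if there was any input at all.
        if (groupCount > 0) {
            sumOfGroupMeans += groupSum / groupCount;
            groupsSeen++;
        }
        return groupsSeen == 0 ? Double.NaN : sumOfGroupMeans / groupsSeen;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Double>> records = new ArrayList<>();
        records.add(new SimpleEntry<>("A", 10.0));
        records.add(new SimpleEntry<>("A", 20.0));
        records.add(new SimpleEntry<>("B", 30.0));
        // Per-code means are 15.0 (A) and 30.0 (B), so this prints 22.5.
        // The plain mean over all three records would be 20.0 instead.
        System.out.println(averageOfAverages(records.iterator()));
    }
}
```

Because only one group is held at a time, memory use does not grow with the number of distinct codes; the cost is that the input must be sorted by code first.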
Re: Average of Averages in Solr
Posted by Susheel Kumar <su...@gmail.com>.
Please look into streaming expressions. I think that is what you are
looking for.
https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
Thanks,
Susheel
On Thu, Oct 6, 2016 at 11:56 AM, John Bickerstaff <jo...@johnbickerstaff.com>
wrote:
> This may help? Note the "Bloomberg Analytics" at the bottom of the post...
>
> https://dzone.com/articles/solr-not-just-for-text-anymore
>
> Quote from article:
>
>
> - *Bloomberg Analytics Component for Solr*: Bloomberg Financial Services
>   uses Solr extensively, and found the existing statistical packages
>   woefully lacking. So, they developed a high-performance framework that
>   can perform complex calculations and aggregations on time-series data,
>   and then released it to OpenSource.
Re: Average of Averages in Solr
Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
This may help? Note the "Bloomberg Analytics" at the bottom of the post...
https://dzone.com/articles/solr-not-just-for-text-anymore
Quote from article:
- *Bloomberg Analytics Component for Solr*: Bloomberg Financial Services
uses Solr extensively, and found the existing statistical packages woefully
lacking. So, they developed a high-performance framework that can perform
complex calculations and aggregations on time-series data, and then
released it to OpenSource.
On Thu, Oct 6, 2016 at 8:53 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> On 10/6/2016 12:04 AM, Lewin Joy (TMS) wrote:
> > There is a requirement to take an average on "Amount" field against
> > each "code" field. And then calculate the averages on this averages.
> > Since my "code" field has a very huge cardinality, which could be
> > around 200,000 or even in millions ; It gets highly complex to
> > calculate the average of averages through Java. Even Solr takes a huge
> > time listing the averages. And the JSON response size becomes huge. Is
> > there some way we can tackle this? Any way we stats on stats?
>
> I wasn't sure what you meant with the first sentence I quoted above, but
> in order to get statistics from your index that are relevant for the
> results of a query, you probably want the stats component.
>
> https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
>
> Thanks,
> Shawn
>
>
Re: Average of Averages in Solr
Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/6/2016 12:04 AM, Lewin Joy (TMS) wrote:
> There is a requirement to take an average on "Amount" field against
> each "code" field. And then calculate the averages on this averages.
> Since my "code" field has a very huge cardinality, which could be
> around 200,000 or even in millions ; It gets highly complex to
> calculate the average of averages through Java. Even Solr takes a huge
> time listing the averages. And the JSON response size becomes huge. Is
> there some way we can tackle this? Any way we stats on stats?
I wasn't sure what you meant with the first sentence I quoted above, but
in order to get statistics from your index that are relevant for the
results of a query, you probably want the stats component.
https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
Thanks,
Shawn
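
[Editor's sketch] One way to get a per-"code" mean out of the stats component is to tie a stats.field to a pivot facet through a tag. The Java below only builds the request parameters; it is a sketch, not something proposed in the thread. The collection name "mycollection", the tag name "piv1", and the class and method names are hypothetical. This request still returns one entry per code, so it does not shrink the response; the final average over those means would have to be computed by the client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class StatsPerCodeRequest {

    /** Parameters asking Solr for the mean of Amount under each value of code. */
    public static Map<String, String> buildParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("rows", "0");                                  // no documents, only facets and stats
        p.put("stats", "true");
        p.put("stats.field", "{!tag=piv1 mean=true}Amount"); // compute only the mean, tagged piv1
        p.put("facet", "true");
        p.put("facet.pivot", "{!stats=piv1}code");           // attach the tagged stats to each code bucket
        p.put("facet.limit", "-1");                          // all codes; this is what makes the response large
        return p;
    }

    /** URL-encodes the parameters into a query string. */
    public static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "mycollection" is a placeholder collection name.
        System.out.println("/solr/mycollection/select?" + toQueryString(buildParams()));
    }
}
```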