You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by ld <lu...@gmail.com> on 2013/05/08 16:46:37 UTC

Facet which takes sum of a field into account for result values

Within MySQL it is possible to get the Top N results while summing a
particular column in the database.  For example:
SELECT ip_address, SUM(ip_count) AS count FROM table GROUP BY ip_address
ORDER BY count DESC LIMIT 5

This will return the top 5 ip_address based on the sum of ip_count.

Is there a way to have a Facet query within Solr do the same?  In other
words, count an entry as if there were 'ip_count entries', not just one?

I have used the Stats component and faceting but this gives me all the
records, there is no way to limit to the top 10 sums.  My data set may have
millions of records with much variation on IP address so this wouldn’t work.

I have also considered adding ip_count number of entries when writing to
solr but this causes some issues with the unique ID shared with legacy code
that still uses MySQL.

Any help is appreciated.  




--
View this message in context: http://lucene.472066.n3.nabble.com/Facet-which-takes-sum-of-a-field-into-account-for-result-values-tp4061588.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Facet which takes sum of a field into account for result values

Posted by ld <lu...@gmail.com>.
Unfortunately, terms do not help solve my issue.

To elaborate - say i have 5 entries:
uuid - ipaddress - ipcount
1       1.1.1.1       80
2       2.2.2.2       1
3       3.3.3.3       20
4       3.3.3.3       20

When i run a facet query on the ipaddress, i get the following results:

http://localhost:8983/solr/alerts/select?q=*:*&facet=true&facet.mincount=1&facet.limit=10&facet.field=ipaddress

<lst name="facet_fields">
<lst name="ipaddress">
<int name="3.3.3.3">2</int>
<int name="1.1.1.1">1</int>
<int name="2.2.2.2">1</int>
</lst>
</lst>

BUT what i would like is to force the facet query to use the ipcount as the
sum, like this:

<lst name="facet_fields">
<lst name="ipaddress">
<int name="3.3.3.3">40</int>
<int name="1.1.1.1">80</int>
<int name="2.2.2.2">1</int>
</lst>
</lst>

Using the stats component with faceting gives me what i want but due to the
fact that i cannot limit this, i worry processing the data after the query
will take a long time.

Thanks





--
View this message in context: http://lucene.472066.n3.nabble.com/Facet-which-takes-sum-of-a-field-into-account-for-result-values-tp4061588p4061636.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Facet which takes sum of a field into account for result values

Posted by Carlos Bonilla <ca...@gmail.com>.
Hi,
have a look at http://wiki.apache.org/solr/TermsComponent.

Regards,
Carlos.


2013/5/8 ld <lu...@gmail.com>

> Within MySQL it is possible to get the Top N results while summing a
> particular column in the database.  For example:
> SELECT ip_address, SUM(ip_count) AS count FROM table GROUP BY ip_address
> ORDER BY count DESC LIMIT 5
>
> This will return the top 5 ip_address based on the sum of ip_count.
>
> Is there a way to have a Facet query within Solr do the same?  In other
> words, count an entry as if there were 'ip_count entries', not just one?
>
> I have used the Stats component and faceting but this gives me all the
> records, there is no way to limit to the top 10 sums.  My data set may have
> millions of records with much variation on IP address so this wouldn’t
> work.
>
> I have also considered adding ip_count number of entries when writing to
> solr but this causes some issues with the unique ID shared with legacy code
> that still uses MySQL.
>
> Any help is appreciated.
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Facet-which-takes-sum-of-a-field-into-account-for-result-values-tp4061588.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>