You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by dan sutton <da...@gmail.com> on 2010/12/22 10:29:23 UTC

solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Hi,

Is there a way with faceting or field collapsing to do the SQL equivalent of

SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND
other_criteria

i.e. I'm only interested in the total count not the individual records
and counts.

Cheers,
Dan

Re: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Well, that's true -- you can get the total number of facet values if you 
ALSO are willing to get back every facet value in the response.

If you've got a hundred thousand or so unique facet values, and what you 
really want is just the _count_ without ALSO getting back a very large 
response (and waiting for Solr to construct the very large response), 
then you're out of luck.

But if you're willing to get back all the values in the response too, 
that'll work, true.

On 12/22/2010 11:23 AM, Erik Hatcher wrote:
> On Dec 22, 2010, at 09:21 , Jonathan Rochkind wrote:
>
>> This won't actually give you the number of distinct facet values, but will give you the number of documents matching your conditions. It's more equivalent to SQL without the "distinct".
>>
>> There is no way in Solr 1.4 to get the number of distinct facet values.
> That's not true - the total number of facet values is the distinct number of values in that field.   You need to be sure you have facet.limit=-1 (default is 100) to see all values in the response rather than just a page of them though.
>
> 	Erik
>
>

Re: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Posted by Erik Hatcher <er...@gmail.com>.
On Dec 22, 2010, at 09:21 , Jonathan Rochkind wrote:

> This won't actually give you the number of distinct facet values, but will give you the number of documents matching your conditions. It's more equivalent to SQL without the "distinct". 
> 
> There is no way in Solr 1.4 to get the number of distinct facet values. 

That's not true - the total number of facet values is the distinct number of values in that field.   You need to be sure you have facet.limit=-1 (default is 100) to see all values in the response rather than just a page of them though. 

	Erik


RE: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Posted by Jonathan Rochkind <ro...@jhu.edu>.
This won't actually give you the number of distinct facet values, but will give you the number of documents matching your conditions. It's more equivalent to SQL without the "distinct". 

There is no way in Solr 1.4 to get the number of distinct facet values. 

I am not sure about the new features in trunk.  
________________________________________
From: Peter Karich [peathal@yahoo.de]
Sent: Wednesday, December 22, 2010 6:10 AM
To: solr-user@lucene.apache.org
Subject: Re: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

 facets=true&facet.field=field // SELECT count(distinct(field))
&fq=field:[* TO *]  // WHERE length(field) > 0
&q=other_criteriaA&fq=other_criteriaB// AND other_criteria

advantage: you can look into several fields at one time when adding
another facet.field
disadvantage: you get the counts splitted by the values of that field

fix this via field collapsing / results grouping
http://wiki.apache.org/solr/FieldCollapsing
or use deduplication: http://wiki.apache.org/solr/Deduplication

Regards,
Peter.

> Hi,
>
> Is there a way with faceting or field collapsing to do the SQL equivalent of
>
> SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND
> other_criteria
>
> i.e. I'm only interested in the total count not the individual records
> and counts.
>
> Cheers,
> Dan


--
http://jetwick.com open twitter search


Re: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Posted by Peter Karich <pe...@yahoo.de>.
 facets=true&facet.field=field // SELECT count(distinct(field))
&fq=field:[* TO *]  // WHERE length(field) > 0
&q=other_criteriaA&fq=other_criteriaB// AND other_criteria

advantage: you can look into several fields at one time when adding
another facet.field
disadvantage: you get the counts splitted by the values of that field

fix this via field collapsing / results grouping
http://wiki.apache.org/solr/FieldCollapsing
or use deduplication: http://wiki.apache.org/solr/Deduplication

Regards,
Peter.

> Hi,
>
> Is there a way with faceting or field collapsing to do the SQL equivalent of
>
> SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND
> other_criteria
>
> i.e. I'm only interested in the total count not the individual records
> and counts.
>
> Cheers,
> Dan


-- 
http://jetwick.com open twitter search


Re: solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

Posted by Dennis Gearon <ge...@sbcglobal.net>.
Have you investigated 'field collapsing'? I believe that it is a least the 
'DISTINCT' part.


 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



----- Original Message ----
From: dan sutton <da...@gmail.com>
To: solr-user <so...@lucene.apache.org>
Sent: Wed, December 22, 2010 1:29:23 AM
Subject: solr equiv of : SELECT count(distinct(field)) FROM index WHERE 
length(field) > 0 AND other_criteria

Hi,

Is there a way with faceting or field collapsing to do the SQL equivalent of

SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND
other_criteria

i.e. I'm only interested in the total count not the individual records
and counts.

Cheers,
Dan