Posted to solr-user@lucene.apache.org by sdanzig <sd...@gmail.com> on 2012/11/29 21:38:38 UTC

Grouping by a date field

I'm trying to create a SOLR query that groups/field collapses by date.  I
have a field in YYYY-MM-dd'T'HH:mm:ss'Z' format, "datetime", and I'm looking
to group by day only.  When grouping on this field using
group.field=datetime in the query, SOLR responds with a group for every
second.  I'm able to easily use this field to create day-based facets, but
not groups.  Advice please?

- Scott



--
View this message in context: http://lucene.472066.n3.nabble.com/Grouping-by-a-date-field-tp4023318.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Grouping by a date field

Posted by Jack Krupansky <ja...@basetechnology.com>.
A proof of concept (POC) will tell you - the impact is 90% driven by your 
particular environment, your particular schema, your particular data, and 
your particular queries (e.g., how many documents they match and how many 
days they match).

But please do share your results with us after conducting your POC - which 
is an absolute requirement for any change to how an app uses Solr.

-- Jack Krupansky

-----Original Message----- 
From: Amit Nithian
Sent: Friday, November 30, 2012 12:04 AM
To: solr-user@lucene.apache.org
Subject: Re: Grouping by a date field

What's the performance impact of doing this?


On Thu, Nov 29, 2012 at 7:54 PM, Jack Krupansky 
<ja...@basetechnology.com> wrote:

> Or group by a function query which is the date field converted to
> milliseconds divided by the number of milliseconds in a day.
>
> Such as:
>
>  q=*:*&group=true&group.func=rint(div(ms(date_dt),mul(24,mul(60,mul(60,1000)))))
>
> -- Jack Krupansky


Re: Grouping by a date field

Posted by sdanzig <sd...@gmail.com>.
Hey, great advice Amit, Jack, and Chris.  It's been a while since I got such
a nice array of options!  My response... yes, Amit, I thought of your way
before posting... I was just thinking, eh, there must be a way in SOLR,
since it was so easy to do the facets.  So I wanted an alternative first
before something that seems fairly ugly.. storing redundant data like that.

Jack, I was thinking of a group.func query, but I was looking for some way
to manipulate a string.  I'm a bit surprised SOLR doesn't
provide this yet either.  I'm sure there's a reason.  The solution you
provided is pretty impressive, and I think that I, along with a bunch of
other subscribers, at the very least appreciate having such a nice working
example of how to do a group.func that way.

Chris, yeah, it's for all of time, but good insight that I appreciate..
facets are a little different, yes.

I think the way I'm going to go for now, since I realize I'm doing a query
sorted by this date anyway, is just detecting day changes in the result set
while iterating through them, and grouping them that way.  Might've wanted
to have SOLR group them if there was a clean n' easy solution, but there
apparently isn't one.
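
A minimal sketch of that client-side approach, assuming the results arrive already sorted by the "datetime" field and that each document is a plain dict holding the timestamp as a YYYY-MM-dd'T'HH:mm:ss'Z' string. The function name group_by_day and the sample documents are hypothetical, for illustration only:

```python
from itertools import groupby


def group_by_day(docs, field="datetime"):
    """Group date-sorted docs by the day part of their timestamp.

    The first 10 characters of 'YYYY-MM-ddTHH:mm:ssZ' are the day, so a
    new group starts whenever that prefix changes between neighbours.
    """
    return [
        (day, list(items))
        for day, items in groupby(docs, key=lambda doc: doc[field][:10])
    ]


# Hypothetical, already-sorted result set.
results = [
    {"id": "a", "datetime": "2012-11-28T09:15:00Z"},
    {"id": "b", "datetime": "2012-11-28T23:59:59Z"},
    {"id": "c", "datetime": "2012-11-29T00:00:00Z"},
]
grouped = group_by_day(results)
```

Because groupby only merges adjacent items, this relies on the sort order; unsorted input would yield the same day more than once.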

- Scott



--
View this message in context: http://lucene.472066.n3.nabble.com/Grouping-by-a-date-field-tp4023318p4023563.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Grouping by a date field

Posted by Chris Hostetter <ho...@fucit.org>.
: What's the performance impact of doing this?

the function approach should have slower query times compared to the "new 
field containing day" approach because it has to do the computation for 
every doc at query time, but the new-field approach is less flexible 
because you have to know in advance you want to use it that way -- the 
classic "precompute work when indexing to save time during query" trade off.

A key thing to keep in mind though with both of these suggestions is 
whether you really want a single group for every day, for all of time....

: >> second.  I'm able to easily use this field to create day-based facets, but
: >> not groups.  Advice please?

if you compare this to range/date faceting then there is a start/end date, 
and documents which fall outside of those dates get included in 
before/after ranges -- this wouldn't happen if you used either of the two 
suggestions above.  

The closest analog available in grouping would be to 
use a series of "group.query" params, specifying the ranges of values you 
wanted -- this is essentially what range faceting is doing under the 
covers.

if the goal is something like "group by day for the last 7 days, and 
everything else should be in a single 'older' bucket", then sending 8 
group.query params containing range queries should work fine.
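
A rough sketch of how those 8 group.query params could be built, assuming the field is named "datetime" as in the original question and using Solr date math (NOW/DAY rounds down to the start of the current UTC day). The helper names are hypothetical, and the half-open range syntax "[a TO b}" is an assumption about the Solr version in use; on a version without mixed brackets the upper bounds would need adjusting:

```python
from urllib.parse import urlencode


def day_bound(offset):
    """Solr date-math expression for the start of the day `offset` days from today."""
    if offset == 0:
        return "NOW/DAY"
    sign = "+" if offset > 0 else "-"
    return f"NOW/DAY{sign}{abs(offset)}DAYS"


def day_bucket_params(field="datetime", days=7):
    """One group.query per day for the last `days` days, plus one 'older' bucket."""
    params = [("q", "*:*"), ("group", "true")]
    for i in range(days):
        # Half-open range: start of the day inclusive, start of the next day exclusive.
        params.append(
            ("group.query", f"{field}:[{day_bound(-i)} TO {day_bound(1 - i)}}}")
        )
    # Everything before the oldest per-day bucket.
    params.append(("group.query", f"{field}:[* TO {day_bound(-(days - 1))}}}"))
    return params


query_string = urlencode(day_bucket_params())
```

Each group.query comes back as its own group in the response, so the client sees 7 day buckets and one catch-all bucket for older documents.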


-Hoss

Re: Grouping by a date field

Posted by Amit Nithian <an...@gmail.com>.
What's the performance impact of doing this?


On Thu, Nov 29, 2012 at 7:54 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> Or group by a function query which is the date field converted to
> milliseconds divided by the number of milliseconds in a day.
>
> Such as:
>
>  q=*:*&group=true&group.func=rint(div(ms(date_dt),mul(24,mul(60,mul(60,1000)))))
>
> -- Jack Krupansky

Re: Grouping by a date field

Posted by Jack Krupansky <ja...@basetechnology.com>.
Or group by a function query which is the date field converted to 
milliseconds divided by the number of milliseconds in a day.

Such as:

  q=*:*&group=true&group.func=rint(div(ms(date_dt),mul(24,mul(60,mul(60,1000)))))
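
To make the arithmetic in that function concrete, here is a small sketch that reproduces it outside Solr. It assumes rint rounds to the nearest integer, as its name suggests, and that ms() yields milliseconds since the epoch in UTC; the helper names are hypothetical. Under those assumptions, rounding places the group boundary at noon UTC rather than midnight, whereas flooring the same quotient would split at midnight:

```python
from datetime import datetime, timezone

# Same value as mul(24,mul(60,mul(60,1000))) in the function query.
MS_PER_DAY = 24 * 60 * 60 * 1000


def epoch_ms(iso_utc):
    """Milliseconds since the epoch for a 'YYYY-MM-ddTHH:mm:ssZ' timestamp."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)


def day_key_rint(iso_utc):
    """Day number rounded to the nearest integer, mirroring rint(div(ms, day))."""
    return round(epoch_ms(iso_utc) / MS_PER_DAY)


def day_key_floor(iso_utc):
    """Day number rounded down, i.e. whole days elapsed since the epoch."""
    return epoch_ms(iso_utc) // MS_PER_DAY


morning = "2012-11-29T11:00:00Z"
afternoon = "2012-11-29T13:00:00Z"
next_morning = "2012-11-30T11:00:00Z"

# Rounding splits the 29th at noon and joins its afternoon to the next morning.
same_day_rint = day_key_rint(morning) == day_key_rint(afternoon)
# Flooring keeps both timestamps from the 29th in one group.
same_day_floor = day_key_floor(morning) == day_key_floor(afternoon)
```

Whether noon-to-noon groups matter depends on the use case, so this is worth checking against real data during a POC.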

-- Jack Krupansky

-----Original Message----- 
From: Amit Nithian
Sent: Thursday, November 29, 2012 10:29 PM
To: solr-user@lucene.apache.org
Subject: Re: Grouping by a date field

Why not create a new field that just contains the day component? Then you
can group by this field.



Re: Grouping by a date field

Posted by Amit Nithian <an...@gmail.com>.
Why not create a new field that just contains the day component? Then you
can group by this field.
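
A minimal sketch of that idea, done on the client before documents are sent to Solr. The target field name "day_s" and the helper names are hypothetical, and the schema would need a matching field definition; the query would then use group.field on the new field instead of on "datetime":

```python
from datetime import datetime


def day_component(iso_utc):
    """Return the 'YYYY-MM-dd' part of a 'YYYY-MM-ddTHH:mm:ssZ' timestamp.

    Parsing first (rather than slicing) rejects malformed values early.
    """
    parsed = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%d")


def add_day_field(doc, source="datetime", target="day_s"):
    """Copy the doc and add a day-only field derived from its timestamp."""
    enriched = dict(doc)
    enriched[target] = day_component(doc[source])
    return enriched


# Hypothetical document as it might look just before indexing.
indexed = add_day_field({"id": "1", "datetime": "2012-11-29T21:38:38Z"})
```

The cost is one extra stored or indexed value per document, paid once at index time instead of on every query.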


On Thu, Nov 29, 2012 at 12:38 PM, sdanzig <sd...@gmail.com> wrote:

> I'm trying to create a SOLR query that groups/field collapses by date.  I
> have a field in YYYY-MM-dd'T'HH:mm:ss'Z' format, "datetime", and I'm
> looking
> to group by just per day.  When grouping on this field using
> group.field=datetime in the query, SOLR responds with a group for every
> second.  I'm able to easily use this field to create day-based facets, but
> not groups.  Advice please?
>
> - Scott
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Grouping-by-a-date-field-tp4023318.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>