Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2012/02/15 14:58:43 UTC

Facet on TrieDateField field without including date

I have a TrieDateField field in my index in which I currently store
information about when something has happened (say a particular item
was purchased).  I would like to be able to facet based on the time of
day items are purchased across a date span.  I was hoping that I could
do a query of something like date:[NOW-1WEEK TO NOW] and then specify
I wanted facet broken into hourly bins.  Is this possible?  Do I
instead need to create a separate field to facet on which is just the
time portion perhaps with some fixed date to make the faceting behave
properly?  Is there a separate time field which I can use to do this?

Re: Facet on TrieDateField field without including date

Posted by Chantal Ackermann <ch...@btelligent.de>.
I've done something like that by calculating the hours during indexing
time (in the script part of the DIH config using java.util.Calendar
which gives you all those field values without effort). I've also
extracted information on which weekday it is (using the integer
constants of Calendar).
If you need this only for one time zone it is straightforward, but if the
queries come from different time zones you'll have to shift
appropriately.
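
A minimal sketch of that pre-calculation, written in Python rather than as a
DIH script, so it only illustrates the idea and is not the configuration
described above. The function name `derive_time_fields`, the field names
`hour_of_day` and `day_of_week`, and the fixed-offset shift are assumptions
made for illustration. Note that the weekday numbering here (1 = Monday) is
Python's ISO numbering, not the integer constants of java.util.Calendar.

```python
from datetime import datetime, timedelta, timezone


def derive_time_fields(purchased_at_utc, utc_offset_hours=0):
    """Return hour-of-day and weekday for an aware UTC timestamp.

    The timestamp is shifted by a fixed offset before the fields are read.
    A fixed offset ignores daylight saving time; a real time zone database
    would be needed to handle that.
    """
    target = timezone(timedelta(hours=utc_offset_hours))
    local = purchased_at_utc.astimezone(target)
    return {
        "hour_of_day": local.hour,          # 0..23
        "day_of_week": local.isoweekday(),  # 1 = Monday .. 7 = Sunday
    }


# Hypothetical purchase time, expressed in UTC.
purchase = datetime(2012, 2, 15, 14, 58, 43, tzinfo=timezone.utc)
print(derive_time_fields(purchase))     # fields as seen in UTC
print(derive_time_fields(purchase, 1))  # same instant, shifted to UTC+1
```

The resulting integers would be indexed next to the original date field, so
the date can still be used to restrict the time span.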

I found that pre-calculating has the advantage that you end up with very
simple data: simple integers. And it makes it quite easy to build more
complex queries on that. For example I have created a grid (built from
facets) where the columns are the weekdays and the rows are the hours of
day. The facets are created using a field containing the combination of
weekday and hour of day.
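
As a rough sketch of how such a grid could be assembled on the client side:
the encoding "<weekday>_<hour>" for the combined field and the function name
`build_grid` are assumptions for illustration, and the facet counts are
assumed to have already been read into a dict of value to count.

```python
def build_grid(facet_counts):
    """Turn {"<weekday>_<hour>": count} into 24 rows (hours) x 7 columns (weekdays)."""
    grid = [[0] * 7 for _ in range(24)]
    for value, count in facet_counts.items():
        weekday, hour = (int(part) for part in value.split("_"))
        # weekday is assumed to be 1..7 and hour 0..23
        grid[hour][weekday - 1] = count
    return grid


# Made-up facet counts, only to show the shape of the result.
counts = {"3_14": 10, "3_15": 20, "7_0": 5}
grid = build_grid(counts)
print(grid[14][2], grid[15][2], grid[0][6])
```

Cells with no matching facet value simply stay at zero.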


Chantal



On Wed, 2012-02-15 at 15:49 +0100, Yonik Seeley wrote:
> On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson <je...@gmail.com> wrote:
> > I think it would if I indexed the time information separately.  Which
> > was my original thought, but I was hoping to store this in one field
> > instead of 2.  So my idea was I'd store the time portion as a
> > number (an int might suffice from 0 to 24 since I only need this to
> > have that level of granularity) then do range queries over that.  I
> > couldn't think of a way to do this using the date field though because
> > it would give me bins broken up by hours in a particular day,
> > something like
> >
> > 2012-01-01-00:00:00 - 2012-01-01-01:00:00 10
> > 2012-01-01-01:00:00 - 2012-01-01-02:00:00 20
> > 2012-01-01-02:00:00 - 2012-01-01-03:00:00 5
> >
> > But what I really want is just the time portion across all days
> >
> > 00:00:00 - 01:00:00 10
> > 01:00:00 - 02:00:00 20
> > 02:00:00 - 03:00:00 5
> >
> > I would then use the date field to limit the time range in which the
> > facet was operating.  Does that make sense?  Is there a more efficient
> > way of doing this?
> 
> Hmm, no there's no way to do this.
> Even if you were to write a custom faceting component, it seems like
> it would still be very expensive to derive the hour of the day from ms
> for every doc.
> 
> -Yonik
> lucidimagination.com
> 
> 
> 
> 
> > On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley
> > <yo...@lucidimagination.com> wrote:
> >> On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com> wrote:
> >>> I would like to be able to facet based on the time of
> >>> day items are purchased across a date span.  I was hoping that I could
> >>> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
> >>> I wanted facet broken into hourly bins.  Is this possible?  Do I
> >>
> >> Will range faceting do everything you need?
> >> http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
> >>
> >> -Yonik
> >> lucidimagination.com


Re: Facet on TrieDateField field without including date

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson <je...@gmail.com> wrote:
> I think it would if I indexed the time information separately.  Which
> was my original thought, but I was hoping to store this in one field
> instead of 2.  So my idea was I'd store the time portion as a
> number (an int might suffice from 0 to 24 since I only need this to
> have that level of granularity) then do range queries over that.  I
> couldn't think of a way to do this using the date field though because
> it would give me bins broken up by hours in a particular day,
> something like
>
> 2012-01-01-00:00:00 - 2012-01-01-01:00:00 10
> 2012-01-01-01:00:00 - 2012-01-01-02:00:00 20
> 2012-01-01-02:00:00 - 2012-01-01-03:00:00 5
>
> But what I really want is just the time portion across all days
>
> 00:00:00 - 01:00:00 10
> 01:00:00 - 02:00:00 20
> 02:00:00 - 03:00:00 5
>
> I would then use the date field to limit the time range in which the
> facet was operating.  Does that make sense?  Is there a more efficient
> way of doing this?

Hmm, no there's no way to do this.
Even if you were to write a custom faceting component, it seems like
it would still be very expensive to derive the hour of the day from ms
for every doc.

-Yonik
lucidimagination.com
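
For illustration only, here is roughly what deriving the hour from a
millisecond timestamp involves, assuming UTC and ignoring time zones. The
arithmetic per value is small, but it would have to be repeated for every
matching document on every request, which is presumably the cost being
referred to. The function names are made up for this sketch and this is not
Solr code.

```python
MS_PER_HOUR = 3_600_000


def hour_of_day_utc(epoch_ms):
    """Hour of day (0..23) in UTC for a timestamp in milliseconds since the epoch."""
    return (epoch_ms // MS_PER_HOUR) % 24


def count_by_hour(epoch_ms_values):
    """Bucket every timestamp by hour of day; this is the per-document work."""
    counts = [0] * 24
    for ms in epoch_ms_values:
        counts[hour_of_day_utc(ms)] += 1
    return counts
```

Storing the hour as its own field at index time moves this work out of the
query path.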




> On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley
> <yo...@lucidimagination.com> wrote:
>> On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com> wrote:
>>> I would like to be able to facet based on the time of
>>> day items are purchased across a date span.  I was hoping that I could
>>> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
>>> I wanted facet broken into hourly bins.  Is this possible?  Do I
>>
>> Will range faceting do everything you need?
>> http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
>>
>> -Yonik
>> lucidimagination.com

Re: Facet on TrieDateField field without including date

Posted by Jamie Johnson <je...@gmail.com>.
Thanks guys, that's what I figured. I just wanted to make sure I was
going down the right path.

On Wed, Feb 15, 2012 at 9:55 AM, Ted Dunning <te...@gmail.com> wrote:
> Use multiple fields and you get what you want.  The extra fields are going
> to cost very little and will have a big positive impact.
>
> On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson <je...@gmail.com> wrote:
>
>> I think it would if I indexed the time information separately.  Which
>> was my original thought, but I was hoping to store this in one field
>> instead of 2.  So my idea was I'd store the time portion as a
>> number (an int might suffice from 0 to 24 since I only need this to
>> have that level of granularity) then do range queries over that.  I
>> couldn't think of a way to do this using the date field though because
>> it would give me bins broken up by hours in a particular day,
>> something like
>>
>> 2012-01-01-00:00:00 - 2012-01-01-01:00:00 10
>> 2012-01-01-01:00:00 - 2012-01-01-02:00:00 20
>> 2012-01-01-02:00:00 - 2012-01-01-03:00:00 5
>>
>> But what I really want is just the time portion across all days
>>
>> 00:00:00 - 01:00:00 10
>> 01:00:00 - 02:00:00 20
>> 02:00:00 - 03:00:00 5
>>
>> I would then use the date field to limit the time range in which the
>> facet was operating.  Does that make sense?  Is there a more efficient
>> way of doing this?
>>
>> On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley
>> <yo...@lucidimagination.com> wrote:
>> > On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com>
>> wrote:
>> >> I would like to be able to facet based on the time of
>> >> day items are purchased across a date span.  I was hoping that I could
>> >> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
>> >> I wanted facet broken into hourly bins.  Is this possible?  Do I
>> >
>> > Will range faceting do everything you need?
>> > http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
>> >
>> > -Yonik
>> > lucidimagination.com
>>

Re: Facet on TrieDateField field without including date

Posted by Ted Dunning <te...@gmail.com>.
Use multiple fields and you get what you want.  The extra fields are going
to cost very little and will have a big positive impact.

On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson <je...@gmail.com> wrote:

> I think it would if I indexed the time information separately.  Which
> was my original thought, but I was hoping to store this in one field
> instead of 2.  So my idea was I'd store the time portion as a
> number (an int might suffice from 0 to 24 since I only need this to
> have that level of granularity) then do range queries over that.  I
> couldn't think of a way to do this using the date field though because
> it would give me bins broken up by hours in a particular day,
> something like
>
> 2012-01-01-00:00:00 - 2012-01-01-01:00:00 10
> 2012-01-01-01:00:00 - 2012-01-01-02:00:00 20
> 2012-01-01-02:00:00 - 2012-01-01-03:00:00 5
>
> But what I really want is just the time portion across all days
>
> 00:00:00 - 01:00:00 10
> 01:00:00 - 02:00:00 20
> 02:00:00 - 03:00:00 5
>
> I would then use the date field to limit the time range in which the
> facet was operating.  Does that make sense?  Is there a more efficient
> way of doing this?
>
> On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley
> <yo...@lucidimagination.com> wrote:
> > On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com>
> wrote:
> >> I would like to be able to facet based on the time of
> >> day items are purchased across a date span.  I was hoping that I could
> >> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
> >> I wanted facet broken into hourly bins.  Is this possible?  Do I
> >
> > Will range faceting do everything you need?
> > http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
> >
> > -Yonik
> > lucidimagination.com
>

Re: Facet on TrieDateField field without including date

Posted by Jamie Johnson <je...@gmail.com>.
I think it would if I indexed the time information separately.  Which
was my original thought, but I was hoping to store this in one field
instead of 2.  So my idea was I'd store the time portion as a
number (an int might suffice from 0 to 24 since I only need this to
have that level of granularity) then do range queries over that.  I
couldn't think of a way to do this using the date field though because
it would give me bins broken up by hours in a particular day,
something like

2012-01-01-00:00:00 - 2012-01-01-01:00:00 10
2012-01-01-01:00:00 - 2012-01-01-02:00:00 20
2012-01-01-02:00:00 - 2012-01-01-03:00:00 5

But what I really want is just the time portion across all days

00:00:00 - 01:00:00 10
01:00:00 - 02:00:00 20
02:00:00 - 03:00:00 5

I would then use the date field to limit the time range in which the
facet was operating.  Does that make sense?  Is there a more efficient
way of doing this?
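
A sketch of what such a request might look like with a separate integer
field, built here as a query string in Python. The field name `hour_of_day`
and the function name are hypothetical, and `date` is the date field from the
query above. The filter uses NOW-7DAYS instead of NOW-1WEEK because, as far
as I know, Solr date math has no WEEK unit; treat that as an assumption to
check against your Solr version.

```python
from urllib.parse import urlencode


def hour_facet_params(date_field="date", hour_field="hour_of_day"):
    """Request parameters: restrict by date, then facet on the hour field."""
    return [
        ("q", "*:*"),
        ("fq", f"{date_field}:[NOW-7DAYS TO NOW]"),  # limit the time span
        ("rows", "0"),                               # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", hour_field),                 # one count per stored hour value
    ]


query_string = urlencode(hour_facet_params())
print(query_string)
```

Each facet value then corresponds to one hour of the day, counted across all
days that pass the date filter.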

On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley
<yo...@lucidimagination.com> wrote:
> On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com> wrote:
>> I would like to be able to facet based on the time of
>> day items are purchased across a date span.  I was hoping that I could
>> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
>> I wanted facet broken into hourly bins.  Is this possible?  Do I
>
> Will range faceting do everything you need?
> http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
>
> -Yonik
> lucidimagination.com

Re: Facet on TrieDateField field without including date

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson <je...@gmail.com> wrote:
> I would like to be able to facet based on the time of
> day items are purchased across a date span.  I was hoping that I could
> do a query of something like date:[NOW-1WEEK TO NOW] and then specify
> I wanted facet broken into hourly bins.  Is this possible?  Do I

Will range faceting do everything you need?
http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range

-Yonik
lucidimagination.com
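
For reference, a sketch of the range faceting parameters for hourly bins over
the last seven days, again built as a query string in Python. The function
name is made up and the field name `date` is taken from the question. Note
that this yields one bin per hour per day (7 x 24 = 168 bins), not 24 bins
aggregated across days.

```python
from urllib.parse import urlencode


def range_facet_params(date_field="date"):
    """Request parameters for hourly range facets over the last seven days."""
    return [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.range", date_field),
        ("facet.range.start", "NOW/HOUR-7DAYS"),  # round down to the hour, then go back 7 days
        ("facet.range.end", "NOW/HOUR"),
        ("facet.range.gap", "+1HOUR"),            # one bin per hour
    ]


range_query_string = urlencode(range_facet_params())
print(range_query_string)
```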