Posted to solr-user@lucene.apache.org by Avlesh Singh <av...@gmail.com> on 2009/09/01 03:32:17 UTC

Re: Date Faceting and Double Counting

I don't think this behavior needs to be fixed. It is justified for the data
you have indexed.
"date minus 1 second" should definitely work for you.
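
A rough sketch of why the shift works, assuming (as described in the quoted
message) that each facet range is matched inclusively at both ends. The
helper names and the field name "pub_date" are made up for illustration;
only the facet.date, facet.date.start, facet.date.end and facet.date.gap
parameter names are Solr's. The counting here is a local simulation, not a
call into Solr.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%SZ"  # Solr-style UTC timestamp


def inclusive_buckets(start, end, gap):
    """Yield (lower, upper) bounds for each range; both ends are inclusive."""
    lower = start
    while lower < end:
        upper = min(lower + gap, end)
        yield lower, upper
        lower = upper


def facet_counts(docs, start, end, gap):
    """Count documents per range the way the thread describes the behavior."""
    return {
        lower.strftime(FMT): sum(lower <= d <= upper for d in docs)
        for lower, upper in inclusive_buckets(start, end, gap)
    }


def shifted_params(field, start, end, gap="+1DAY"):
    """Build date-facet request parameters with start and end moved back 1 second."""
    offset = timedelta(seconds=1)
    return {
        "facet": "true",
        "facet.date": field,  # hypothetical field name supplied by the caller
        "facet.date.start": (start - offset).strftime(FMT),
        "facet.date.end": (end - offset).strftime(FMT),
        "facet.date.gap": gap,
    }


# Day-level data: every document sits exactly at midnight.
docs = [datetime(2009, 1, 1), datetime(2009, 1, 2), datetime(2009, 1, 2)]
start, end, day = datetime(2009, 1, 1), datetime(2009, 1, 3), timedelta(days=1)

unshifted = facet_counts(docs, start, end, day)
shifted = facet_counts(docs, start - timedelta(seconds=1),
                       end - timedelta(seconds=1), day)
params = shifted_params("pub_date", start, end)
```

With the unshifted boundaries the two documents at 2009-01-02T00:00:00Z land
in both ranges; with the shifted ones no document sits on a boundary, so each
is counted once.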

Cheers
Avlesh

On Mon, Aug 31, 2009 at 11:37 PM, Stephen Duncan Jr <
stephen.duncan@gmail.com> wrote:

> If we do date faceting and start at 2009-01-01T00:00:00Z, end at
> 2009-01-03T00:00:00Z, with a gap of +1DAY, then documents that occur at
> exactly 2009-01-02T00:00:00Z will be included in both the returned counts
> (2009-01-01T00:00:00Z and 2009-01-02T00:00:00Z).  At the moment, this is
> quite bad for us, as we only index the day-level, so all of our documents
> are exactly on the line between each facet-range.
>
> Because we know our data is indexed as being exactly at midnight each day, I
> think we can simply always start from 1 second prior and get the results we
> want (start=2008-12-31T23:59:59Z, end=2009-01-02T23:59:59Z), but I think
> this problem would affect everyone, even if usually more subtly (instead of
> all documents being counted twice, only a few on the fencepost between
> ranges).
>
> Is this a known behavior people are happy with, or should I file an issue
> asking for ranges in date-facets to be constructed to subtract one second
> from the end of each range (so that the effective range queries for my case
> would be: [2009-01-01T00:00:00Z TO 2009-01-01T23:59:59Z] &
> [2009-01-02T00:00:00Z TO 2009-01-02T23:59:59Z])?
>
> Alternatively, is there some other suggested way of using the date faceting
> to avoid this problem?
>
> --
> Stephen Duncan Jr
> www.stephenduncanjr.com
>