You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@couchdb.apache.org by Stephan Bardubitzki <st...@bardubitzki.com> on 2012/04/27 20:19:21 UTC

Need some advice

I have a database of events and since events can be multi-day events all 
documents have a start_date and end_date. Now I want to get all events 
that happen today and those that are starting in the coming days.

How should I determine if today is in the range of start_date and end_date?

To make things more complex an event can be assigned to one or more 
categories such as sports, music etc. So the next task would be to get 
all events of a certain category that happen today and in the coming days.

Since a db of events might be a common task I hope someone has already 
come up with a solution.

Any advice would be greatly appreciated.

Stephan


Re: Need some advice

Posted by Ryan Ramage <ry...@gmail.com>.
Stephan,

Taking Dave's advice one step further, if you want to reduce the load
on the mobile client you can use a list function on top of the query
to the view with the statkey/endkey combo.

You can pass in query params to the list function, like topic=music,
and then emit only the filtered set. As long as you dont have a lot of
events (less than 1000?) in the date range of a week, then it should
be reasonably performant. I abuse list functions like this all the
time. :) The main thing to consider with a list function is if your
view query has shrunk the set of docs down to a reasonable size.

The next level would be to use something like couch-lucene,
https://github.com/rnewson/couchdb-lucene/ which would allow you to do
more advanced queries.

Ryan



On Fri, Apr 27, 2012 at 4:56 PM, Stephan Bardubitzki
<st...@bardubitzki.com> wrote:
> Thanks Jim and Dave. Your suggestions give me something to think about over
> the weekend.
>
> But for now:  startkey=burgundy
>
> Stephan
>
>
> On 12-04-27 03:38 PM, Dave Cottlehuber wrote:
>>
>> On 28 April 2012 00:02, Jim Klo<ji...@sri.com>  wrote:
>>>
>>> Okay, sorry, I missed the following days part. I'm not sure you can do
>>> exactly what you want with a view only, but you probably want a list
>>> function as well. the GeoCouch plugin would have been your best bet for
>>> that... too bad...
>>>
>>> You can still use the same type of view with keys like: [ date, eventid
>>> ],
>>> then create a list function that uniques the eventid's; just do this any
>>> time the eventid changes in the 'stream' you send the details once, - and
>>> if
>>> you include with_docs; you could add the event details you want as well.
>>>
>>> I do something very similar to this
>>> here:
>>> https://github.com/LearningRegistry/LearningRegistry/blob/master/data_services/standards-alignment-dct-conformsTo/lists/to-json.js#L18
>>>
>>> Alternatively which is probably simpler:
>>>
>>> If you can place  some kind maximum fixed range requirement that's
>>> reasonable, like I always want to show events within 5 days of the each
>>> event...  you could artificially emit extra date keys in your map...  5
>>> days
>>> before and 5 days after the event...
>>>
>>> Say your event is for May 5, then you emit that event for every day in
>>> the
>>> map from Apr 30 - May 10.  You might have another event on the May 2,
>>> which
>>> you emit dates from April 27 - May 7.  If I search for May 2, event..
>>> i'll
>>> also get the may 5 event.
>>>
>>> Then when you query the view with a simple reduce for a specific day...
>>> you
>>> should get a list of all events within 5 days (10 day range)... of the
>>> date
>>> specified.  If you put some kind of larger limit on that like 14 days
>>> before
>>> and after.. then post filter the result set using a list function into a
>>> smaller range, to dates before or after, and etc.
>>>
>>>
>>> - Jim
>>>
>>> Jim Klo
>>> Senior Software Engineer
>>> Center for Software Engineering
>>> SRI International
>>>
>>> On Apr 27, 2012, at 2:06 PM, Stephan Bardubitzki wrote:
>>>
>>> Thanks for your suggestions Jim.
>>>
>>> Unfortunately, I have to count out GeoCouch for now and your first
>>> suggestion doesn't satisfy my requirements. It would yield the events for
>>> a
>>> particular day, but not for all following days, requiring more
>>> connections
>>> to the db which I want to avoid for a mobile app.
>>>
>>> Before posting I have tried a couple days to find a clean solution, but
>>> came
>>> up with the start_date as key. It gives me all events, including past
>>> events, and the hard work has to be done on client side.
>>>
>>> My finding works, just have the feeling there is a more efficient way.
>>>
>>> Stephan
>>>
>>>
>>> On 12-04-27 11:38 AM, Jim Klo wrote:
>>>
>>> There's a whole bunch of ways to do that... but here are 2:
>>>
>>>
>>> 1. create a map function that emits all the days of the event as keys -
>>> the
>>> searching for any particular day would yield those events
>>>
>>> 2. use geocouch, store the start&  end dates as a box, then you can use
>>> the
>>>
>>> current date as a bbox to find all events that overlap. Using this
>>> method,
>>> you'd want to store time as an integer value and not a string.
>>>
>>>
>>> IMHO #2 is probably a cleaner approach, but requires GeoCouch.
>>>
>>>
>>> - Jim
>>>
>>>
>>> *
>>>
>>> *
>>>
>>> *
>>>
>>> *Jim Klo*
>>>
>>> Senior Software Engineer
>>>
>>> Center for Software Engineering
>>>
>>> SRI International
>>>
>>> *
>>>
>>> *
>>>
>>> *
>>>
>>>
>>> On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:
>>>
>>>
>>> I have a database of events and since events can be multi-day events all
>>> documents have a start_date and end_date. Now I want to get all events
>>> that
>>> happen today and those that are starting in the coming days.
>>>
>>>
>>> How should I determine if today is in the range of start_date and
>>> end_date?
>>>
>>>
>>> To make things more complex an event can be assigned to one or more
>>> categories such as sports, music etc. So the next task would be to get
>>> all
>>> events of a certain category that happen today and in the coming days.
>>>
>>>
>>> Since a db of events might be a common task I hope someone has already
>>> come
>>> up with a solution.
>>>
>>>
>>> Any advice would be greatly appreciated.
>>>
>>>
>>> Stephan
>>>
>>>
>>>
>>>
>> Something like this maybe? Excuse egregrious errors in my pseudo-code:
>>
>> doc: { "_id": "huge_partay_at_mah_place", "start": some_unix_date,
>> "duration": "3+days"}
>>
>> map:
>> if (doc.start&&  doc.duration)
>>
>> {
>>   for (i = doc.some_unix_date,  i<= doc.duration*seconds_in_day, i+=
>> seconds_in_day)
>>   {
>>     emit(i, doc.event_name); // _id of doc is emitted "for free"
>> anyway in the view.
>>   }
>> }
>>
>> Basically you could then range query whatever dates you needed from
>> the client as startkey/endkey
>> using today and today + 5.
>>
>> Very low load client side.
>>
>> And as Jim says, some fancy include_docs would give you all the event
>> information. Although in this
>> case you'd get an identical copy per event, so perhaps some view
>> post-processing with a list would
>> clean it up.
>>
>> Disclaimer: numerous grapes were harmed in the making of this email.
>> Some forward-looking statements may have been obscured through
>> burgundy-tinted lenses. YMMV. Caveat Bibendum.
>>
>> A+
>> Dave
>>
>> --------------------------------
>> Spam/Virus scanning by CanIt Pro
>>
>> For more information see
>> http://www.kgbinternet.com/SpamFilter.htm
>>
>> To control your spam filter, log in at
>> http://filter.kgbinternet.com
>>
>

Re: Need some advice

Posted by Stephan Bardubitzki <st...@bardubitzki.com>.
Thanks Jim and Dave. Your suggestions give me something to think about 
over the weekend.

But for now:  startkey=burgundy

Stephan

On 12-04-27 03:38 PM, Dave Cottlehuber wrote:
> On 28 April 2012 00:02, Jim Klo<ji...@sri.com>  wrote:
>> Okay, sorry, I missed the following days part. I'm not sure you can do
>> exactly what you want with a view only, but you probably want a list
>> function as well. the GeoCouch plugin would have been your best bet for
>> that... too bad...
>>
>> You can still use the same type of view with keys like: [ date, eventid ],
>> then create a list function that uniques the eventid's; just do this any
>> time the eventid changes in the 'stream' you send the details once, - and if
>> you include with_docs; you could add the event details you want as well.
>>
>> I do something very similar to this
>> here: https://github.com/LearningRegistry/LearningRegistry/blob/master/data_services/standards-alignment-dct-conformsTo/lists/to-json.js#L18
>>
>> Alternatively which is probably simpler:
>>
>> If you can place  some kind maximum fixed range requirement that's
>> reasonable, like I always want to show events within 5 days of the each
>> event...  you could artificially emit extra date keys in your map...  5 days
>> before and 5 days after the event...
>>
>> Say your event is for May 5, then you emit that event for every day in the
>> map from Apr 30 - May 10.  You might have another event on the May 2, which
>> you emit dates from April 27 - May 7.  If I search for May 2, event.. i'll
>> also get the may 5 event.
>>
>> Then when you query the view with a simple reduce for a specific day... you
>> should get a list of all events within 5 days (10 day range)... of the date
>> specified.  If you put some kind of larger limit on that like 14 days before
>> and after.. then post filter the result set using a list function into a
>> smaller range, to dates before or after, and etc.
>>
>>
>> - Jim
>>
>> Jim Klo
>> Senior Software Engineer
>> Center for Software Engineering
>> SRI International
>>
>> On Apr 27, 2012, at 2:06 PM, Stephan Bardubitzki wrote:
>>
>> Thanks for your suggestions Jim.
>>
>> Unfortunately, I have to count out GeoCouch for now and your first
>> suggestion doesn't satisfy my requirements. It would yield the events for a
>> particular day, but not for all following days, requiring more connections
>> to the db which I want to avoid for a mobile app.
>>
>> Before posting I have tried a couple days to find a clean solution, but came
>> up with the start_date as key. It gives me all events, including past
>> events, and the hard work has to be done on client side.
>>
>> My finding works, just have the feeling there is a more efficient way.
>>
>> Stephan
>>
>>
>> On 12-04-27 11:38 AM, Jim Klo wrote:
>>
>> There's a whole bunch of ways to do that... but here are 2:
>>
>>
>> 1. create a map function that emits all the days of the event as keys - the
>> searching for any particular day would yield those events
>>
>> 2. use geocouch, store the start&  end dates as a box, then you can use the
>> current date as a bbox to find all events that overlap. Using this method,
>> you'd want to store time as an integer value and not a string.
>>
>>
>> IMHO #2 is probably a cleaner approach, but requires GeoCouch.
>>
>>
>> - Jim
>>
>>
>> *
>>
>> *
>>
>> *
>>
>> *Jim Klo*
>>
>> Senior Software Engineer
>>
>> Center for Software Engineering
>>
>> SRI International
>>
>> *
>>
>> *
>>
>> *
>>
>>
>> On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:
>>
>>
>> I have a database of events and since events can be multi-day events all
>> documents have a start_date and end_date. Now I want to get all events that
>> happen today and those that are starting in the coming days.
>>
>>
>> How should I determine if today is in the range of start_date and end_date?
>>
>>
>> To make things more complex an event can be assigned to one or more
>> categories such as sports, music etc. So the next task would be to get all
>> events of a certain category that happen today and in the coming days.
>>
>>
>> Since a db of events might be a common task I hope someone has already come
>> up with a solution.
>>
>>
>> Any advice would be greatly appreciated.
>>
>>
>> Stephan
>>
>>
>>
>>
> Something like this maybe? Excuse egregrious errors in my pseudo-code:
>
> doc: { "_id": "huge_partay_at_mah_place", "start": some_unix_date,
> "duration": "3+days"}
>
> map:
> if (doc.start&&  doc.duration)
> {
>    for (i = doc.some_unix_date,  i<= doc.duration*seconds_in_day, i+=
> seconds_in_day)
>    {
>      emit(i, doc.event_name); // _id of doc is emitted "for free"
> anyway in the view.
>    }
> }
>
> Basically you could then range query whatever dates you needed from
> the client as startkey/endkey
> using today and today + 5.
>
> Very low load client side.
>
> And as Jim says, some fancy include_docs would give you all the event
> information. Although in this
> case you'd get an identical copy per event, so perhaps some view
> post-processing with a list would
> clean it up.
>
> Disclaimer: numerous grapes were harmed in the making of this email.
> Some forward-looking statements may have been obscured through
> burgundy-tinted lenses. YMMV. Caveat Bibendum.
>
> A+
> Dave
>
> --------------------------------
> Spam/Virus scanning by CanIt Pro
>
> For more information see
> http://www.kgbinternet.com/SpamFilter.htm
>
> To control your spam filter, log in at
> http://filter.kgbinternet.com
>

Re: Need some advice

Posted by Dave Cottlehuber <da...@muse.net.nz>.
On 28 April 2012 00:02, Jim Klo <ji...@sri.com> wrote:
> Okay, sorry, I missed the following days part. I'm not sure you can do
> exactly what you want with a view only, but you probably want a list
> function as well. the GeoCouch plugin would have been your best bet for
> that... too bad...
>
> You can still use the same type of view with keys like: [ date, eventid ],
> then create a list function that uniques the eventid's; just do this any
> time the eventid changes in the 'stream' you send the details once, - and if
> you include with_docs; you could add the event details you want as well.
>
> I do something very similar to this
> here: https://github.com/LearningRegistry/LearningRegistry/blob/master/data_services/standards-alignment-dct-conformsTo/lists/to-json.js#L18
>
> Alternatively which is probably simpler:
>
> If you can place  some kind maximum fixed range requirement that's
> reasonable, like I always want to show events within 5 days of the each
> event...  you could artificially emit extra date keys in your map...  5 days
> before and 5 days after the event...
>
> Say your event is for May 5, then you emit that event for every day in the
> map from Apr 30 - May 10.  You might have another event on the May 2, which
> you emit dates from April 27 - May 7.  If I search for May 2, event.. i'll
> also get the may 5 event.
>
> Then when you query the view with a simple reduce for a specific day... you
> should get a list of all events within 5 days (10 day range)... of the date
> specified.  If you put some kind of larger limit on that like 14 days before
> and after.. then post filter the result set using a list function into a
> smaller range, to dates before or after, and etc.
>
>
> - Jim
>
> Jim Klo
> Senior Software Engineer
> Center for Software Engineering
> SRI International
>
> On Apr 27, 2012, at 2:06 PM, Stephan Bardubitzki wrote:
>
> Thanks for your suggestions Jim.
>
> Unfortunately, I have to count out GeoCouch for now and your first
> suggestion doesn't satisfy my requirements. It would yield the events for a
> particular day, but not for all following days, requiring more connections
> to the db which I want to avoid for a mobile app.
>
> Before posting I have tried a couple days to find a clean solution, but came
> up with the start_date as key. It gives me all events, including past
> events, and the hard work has to be done on client side.
>
> My finding works, just have the feeling there is a more efficient way.
>
> Stephan
>
>
> On 12-04-27 11:38 AM, Jim Klo wrote:
>
> There's a whole bunch of ways to do that... but here are 2:
>
>
> 1. create a map function that emits all the days of the event as keys - the
> searching for any particular day would yield those events
>
> 2. use geocouch, store the start & end dates as a box, then you can use the
> current date as a bbox to find all events that overlap. Using this method,
> you'd want to store time as an integer value and not a string.
>
>
> IMHO #2 is probably a cleaner approach, but requires GeoCouch.
>
>
> - Jim
>
>
> *
>
> *
>
> *
>
> *Jim Klo*
>
> Senior Software Engineer
>
> Center for Software Engineering
>
> SRI International
>
> *
>
> *
>
> *
>
>
> On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:
>
>
> I have a database of events and since events can be multi-day events all
> documents have a start_date and end_date. Now I want to get all events that
> happen today and those that are starting in the coming days.
>
>
> How should I determine if today is in the range of start_date and end_date?
>
>
> To make things more complex an event can be assigned to one or more
> categories such as sports, music etc. So the next task would be to get all
> events of a certain category that happen today and in the coming days.
>
>
> Since a db of events might be a common task I hope someone has already come
> up with a solution.
>
>
> Any advice would be greatly appreciated.
>
>
> Stephan
>
>
>
>

Something like this maybe? Excuse egregrious errors in my pseudo-code:

doc: { "_id": "huge_partay_at_mah_place", "start": some_unix_date,
"duration": "3+days"}

map:
if (doc.start && doc.duration)
{
  for (i = doc.some_unix_date,  i <= doc.duration*seconds_in_day, i+=
seconds_in_day)
  {
    emit(i, doc.event_name); // _id of doc is emitted "for free"
anyway in the view.
  }
}

Basically you could then range query whatever dates you needed from
the client as startkey/endkey
using today and today + 5.

Very low load client side.

And as Jim says, some fancy include_docs would give you all the event
information. Although in this
case you'd get an identical copy per event, so perhaps some view
post-processing with a list would
clean it up.

Disclaimer: numerous grapes were harmed in the making of this email.
Some forward-looking statements may have been obscured through
burgundy-tinted lenses. YMMV. Caveat Bibendum.

A+
Dave

Re: Need some advice

Posted by Jim Klo <ji...@sri.com>.
Okay, sorry, I missed the following days part. I'm not sure you can do exactly what you want with a view only, but you probably want a list function as well. the GeoCouch plugin would have been your best bet for that... too bad... 

You can still use the same type of view with keys like: [ date, eventid ], then create a list function that uniques the eventid's; just do this any time the eventid changes in the 'stream' you send the details once, - and if you include with_docs; you could add the event details you want as well. 

I do something very similar to this here: https://github.com/LearningRegistry/LearningRegistry/blob/master/data_services/standards-alignment-dct-conformsTo/lists/to-json.js#L18

Alternatively which is probably simpler:

If you can place  some kind maximum fixed range requirement that's reasonable, like I always want to show events within 5 days of the each event...  you could artificially emit extra date keys in your map...  5 days before and 5 days after the event...

Say your event is for May 5, then you emit that event for every day in the map from Apr 30 - May 10.  You might have another event on the May 2, which you emit dates from April 27 - May 7.  If I search for May 2, event.. i'll also get the may 5 event.

Then when you query the view with a simple reduce for a specific day... you should get a list of all events within 5 days (10 day range)... of the date specified.  If you put some kind of larger limit on that like 14 days before and after.. then post filter the result set using a list function into a smaller range, to dates before or after, and etc.


- Jim

Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International

On Apr 27, 2012, at 2:06 PM, Stephan Bardubitzki wrote:

> Thanks for your suggestions Jim.
> 
> Unfortunately, I have to count out GeoCouch for now and your first suggestion doesn't satisfy my requirements. It would yield the events for a particular day, but not for all following days, requiring more connections to the db which I want to avoid for a mobile app.
> 
> Before posting I have tried a couple days to find a clean solution, but came up with the start_date as key. It gives me all events, including past events, and the hard work has to be done on client side.
> 
> My finding works, just have the feeling there is a more efficient way.
> 
> Stephan
> 
> 
> On 12-04-27 11:38 AM, Jim Klo wrote:
>> There's a whole bunch of ways to do that... but here are 2:
>> 
>> 1. create a map function that emits all the days of the event as keys - the searching for any particular day would yield those events
>> 2. use geocouch, store the start & end dates as a box, then you can use the current date as a bbox to find all events that overlap. Using this method, you'd want to store time as an integer value and not a string.
>> 
>> IMHO #2 is probably a cleaner approach, but requires GeoCouch.
>> 
>> - Jim
>> 
>> *
>> *
>> *
>> *Jim Klo*
>> Senior Software Engineer
>> Center for Software Engineering
>> SRI International
>> *
>> *
>> *
>> 
>> On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:
>> 
>>> I have a database of events and since events can be multi-day events all documents have a start_date and end_date. Now I want to get all events that happen today and those that are starting in the coming days.
>>> 
>>> How should I determine if today is in the range of start_date and end_date?
>>> 
>>> To make things more complex an event can be assigned to one or more categories such as sports, music etc. So the next task would be to get all events of a certain category that happen today and in the coming days.
>>> 
>>> Since a db of events might be a common task I hope someone has already come up with a solution.
>>> 
>>> Any advice would be greatly appreciated.
>>> 
>>> Stephan
>>> 
>> 


Re: Need some advice

Posted by Stephan Bardubitzki <st...@bardubitzki.com>.
Thanks for your suggestions Jim.

Unfortunately, I have to count out GeoCouch for now and your first 
suggestion doesn't satisfy my requirements. It would yield the events 
for a particular day, but not for all following days, requiring more 
connections to the db which I want to avoid for a mobile app.

Before posting I have tried a couple days to find a clean solution, but 
came up with the start_date as key. It gives me all events, including 
past events, and the hard work has to be done on client side.

My finding works, just have the feeling there is a more efficient way.

Stephan


On 12-04-27 11:38 AM, Jim Klo wrote:
> There's a whole bunch of ways to do that... but here are 2:
>
> 1. create a map function that emits all the days of the event as keys 
> - the searching for any particular day would yield those events
> 2. use geocouch, store the start & end dates as a box, then you can 
> use the current date as a bbox to find all events that overlap. Using 
> this method, you'd want to store time as an integer value and not a 
> string.
>
> IMHO #2 is probably a cleaner approach, but requires GeoCouch.
>
> - Jim
>
> *
> *
> *
> *Jim Klo*
> Senior Software Engineer
> Center for Software Engineering
> SRI International
> *
> *
> *
>
> On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:
>
>> I have a database of events and since events can be multi-day events 
>> all documents have a start_date and end_date. Now I want to get all 
>> events that happen today and those that are starting in the coming days.
>>
>> How should I determine if today is in the range of start_date and 
>> end_date?
>>
>> To make things more complex an event can be assigned to one or more 
>> categories such as sports, music etc. So the next task would be to 
>> get all events of a certain category that happen today and in the 
>> coming days.
>>
>> Since a db of events might be a common task I hope someone has 
>> already come up with a solution.
>>
>> Any advice would be greatly appreciated.
>>
>> Stephan
>>
>

Re: Need some advice

Posted by Jim Klo <ji...@sri.com>.
There's a whole bunch of ways to do that... but here are 2:

1. create a map function that emits all the days of the event as keys - the searching for any particular day would yield those events
2. use geocouch, store the start & end dates as a box, then you can use the current date as a bbox to find all events that overlap. Using this method, you'd want to store time as an integer value and not a string.

IMHO #2 is probably a cleaner approach, but requires GeoCouch.

- Jim

Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International

On Apr 27, 2012, at 11:19 AM, Stephan Bardubitzki wrote:

> I have a database of events and since events can be multi-day events all documents have a start_date and end_date. Now I want to get all events that happen today and those that are starting in the coming days.
> 
> How should I determine if today is in the range of start_date and end_date?
> 
> To make things more complex an event can be assigned to one or more categories such as sports, music etc. So the next task would be to get all events of a certain category that happen today and in the coming days.
> 
> Since a db of events might be a common task I hope someone has already come up with a solution.
> 
> Any advice would be greatly appreciated.
> 
> Stephan
>