Posted to user@couchdb.apache.org by Paul Davis <pa...@gmail.com> on 2011/02/01 17:08:57 UTC

Re: Reduce parameters?

I'm not entirely certain what you're wanting for output. Could you
give an example of what you'd hoped to achieve?

On Mon, Jan 31, 2011 at 3:58 PM, Joe Freeman <jo...@bitroot.com> wrote:
> Thanks for your feedback Paul.
>
> On 27 January 2011 23:57, Paul Davis <pa...@gmail.com> wrote:
>> Your best bet would be to make another view that you can use to get
>> what you want. For instance, if you want the revisions that existed
>> for a given document before a certain time just create a second view
>> that doesn't include the part in the key and then you can grab per
>> document all revisions for a given document before (or after) a given
>> timestamp.
>
> So are you suggesting I make a view that outputted something like this?
>
> ["document1",1294696806874] -> {"content": "part 1.1, revision 3"}
> ["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
> ["document1",1294696769516] -> {"content": "part 1.1, revision 1"}
> ["document1",1294696816974] -> {"content": "part 1.2, revision 2"}
> ["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
> ["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
> ["document2",1294696812168] -> {"content": "part 2.1, revision 3"}
> ["document2",1294696802362] -> {"content": "part 2.1, revision 2"}
> ["document2",1294696743154] -> {"content": "part 2.1, revision 1"}
> ["document2",1294696819313] -> {"content": "part 2.2, revision 1"}
>
> I can see how from this I can get all of the parts for a specified
> document before a specified date (without the need of a reduce
> function), which seems like a good start, but I can't find a way to
> group over the document 'part'.
>
> For example, I can specify a startkey and endkey of '["document1",0]'
> and '["document1",1294696793572]' to get all revisions for 'document1'
> before the specified timestamp:
>
> ["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
> ["document1",1294696769516] -> {"content": "part 1.1, revision 1"}
> ["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
> ["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
>
> But I'm still stuck when it comes to finding a way to remove all the
> revisions before the last one (in this case, the document for "part
> 1.1, revision 1" needs to be removed since this is superseded by
> "revision 2").
>
> I hope that makes sense.
>
> Any further suggestions?
>
> I'm wondering if perhaps a document-oriented database isn't quite
> suited to my case :(
>

Re: Reduce parameters?

Posted by Joe Freeman <jo...@bitroot.com>.
On 1 February 2011 18:04, Paul Davis <pa...@gmail.com> wrote:
> I don't see anything direct with a view for this either. You could do
> it for a given part at a time, but not all parts in one request. One
> thing you might try is to use a _list function with your original set
> up to discard results you don't need.

It looks like lists are able to do what I want :) Thank you Paul.

Re: Reduce parameters?

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Feb 1, 2011 at 12:32 PM, Joe Freeman <jo...@bitroot.com> wrote:
> On 1 February 2011 16:08, Paul Davis <pa...@gmail.com> wrote:
>> I'm not entirely certain what you're wanting for output. Could you
>> give an example of what you'd hoped to achieve?
>
> Sorry, let me try and give a better explanation...
>
> A document (in my application; not a CouchDB document) is made up of a
> number of parts. Each part is stored in a CouchDB document, and
> contains four properties: 'document_id', 'timestamp', 'content' and
> 'revisions'. The 'revisions' property is an array of previous
> revisions. So a document might look like this:
>
> {
>  "_id": "part1.1",
>  "document_id": "document1",
>  "timestamp": 1294696806874,
>  "content": "part 1.1, revision 3",
>  "revisions": [
>    {
>      "updated": 1294696793572,
>      "content": "part 1.1, revision 2"
>    },{
>      "updated": 1294696769516,
>      "content": "part 1.1, revision 1"
>    }
>  ]
> }
>
> I can query the view to get 'all the latest part revisions for a
> document', but I don't seem to be able to 'get all the latest part
> revisions for a document as it was at a specified point in history'.
>
> So, to go back to my original example, and re-introduce a couple more
> parts, I might have:
>
> ["document1",1294696806874] -> {"content": "part 1.1, revision 3"}
> ["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
> ["document1",1294696769516] -> {"content": "part 1.1, revision 1"}
> ["document1",1294696816974] -> {"content": "part 1.2, revision 2"}
> ["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
> ["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
>
> From this, I might want to say, give me the latest revisions at
> 1294696800000 (10th Jan 2011, 22:00:00 GMT), and I'd like the view to
> return:
>
> ["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
> ["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
> ["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
>
> Note that "part 1.1, revision 3" and "part 1.2, revision 2" are not
> included because they have timestamps later than the specified time,
> and "part 1.1, revision 1" is not included because it has a
> timestamp earlier than the latest revision.
>
> The closest I have got is to specify startkey and endkey of
> '["document1",0]' and '["document1",1294696800000]', which gives me:
>
> ["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
> ["document1",1294696769516] -> {"content": "part 1.1, revision 1"} *
> ["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
> ["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
>
> The problem being that "part 1.1, revision 1" is still there, but I
> don't want it to be, because a later revision is present ("content":
> "part 1.1, revision 2").
>
> Essentially what I think I need to do at this point is 'GROUP BY' the
> part's ID. But I don't think I can do this because the part ID isn't
> in the emitted key.
>
> Does that make sense..?
>

Oh, gotchya.

I don't see anything direct with a view for this either. You could do
it for a given part at a time, but not all parts in one request. One
thing you might try is to use a _list function with your original set
up to discard results you don't need.
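
As a rough illustration of the _list idea, here is a minimal sketch of the
filtering step. It assumes the original view emits keys of the form
[document_id, part_id, timestamp], which is a guess since that view is not
shown in this thread. The names latestPerPart and listLatest and the "asof"
query parameter are hypothetical. getRow() and send() are the functions
CouchDB supplies to list functions at run time.

```javascript
// Assumed row shape: key = [documentId, partId, timestamp], value = {content}.
// This key layout is an assumption about the "original set up".
function latestPerPart(rows, asOf) {
  var latest = {};  // partId -> newest row seen so far at or before asOf
  var order = [];   // partIds in first-seen order, to keep the output stable
  rows.forEach(function (row) {
    var partId = row.key[1];
    var timestamp = row.key[2];
    if (timestamp > asOf) return;  // this revision did not exist yet
    var seen = Object.prototype.hasOwnProperty.call(latest, partId);
    if (!seen) order.push(partId);
    if (!seen || timestamp > latest[partId].key[2]) {
      latest[partId] = row;  // keep only the newest qualifying revision
    }
  });
  return order.map(function (partId) { return latest[partId]; });
}

// Wrapper in the (head, req) shape of a CouchDB list function.
// "asof" is a hypothetical query parameter, not a built-in CouchDB option.
function listLatest(head, req) {
  var asOf = Number(req.query.asof);
  var rows = [];
  var row;
  while ((row = getRow())) rows.push(row);  // drain the view rows
  send(JSON.stringify(latestPerPart(rows, asOf)));
}
```

Buffering every row is only reasonable for a small number of parts per
document; for larger results the same comparison could be done while
streaming, since rows for one part arrive next to each other in key order.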

Re: Reduce parameters?

Posted by Joe Freeman <jo...@bitroot.com>.
On 1 February 2011 16:08, Paul Davis <pa...@gmail.com> wrote:
> I'm not entirely certain what you're wanting for output. Could you
> give an example of what you'd hoped to achieve?

Sorry, let me try and give a better explanation...

A document (in my application; not a CouchDB document) is made up of a
number of parts. Each part is stored in a CouchDB document, and
contains four properties: 'document_id', 'timestamp', 'content' and
'revisions'. The 'revisions' property is an array of previous
revisions. So a document might look like this:

{
  "_id": "part1.1",
  "document_id": "document1",
  "timestamp": 1294696806874,
  "content": "part 1.1, revision 3",
  "revisions": [
    {
      "updated": 1294696793572,
      "content": "part 1.1, revision 2"
    },{
      "updated": 1294696769516,
      "content": "part 1.1, revision 1"
    }
  ]
}

I can query the view to get 'all the latest part revisions for a
document', but I don't seem to be able to 'get all the latest part
revisions for a document as it was at a specified point in history'.

So, to go back to my original example, and re-introduce a couple more
parts, I might have:

["document1",1294696806874] -> {"content": "part 1.1, revision 3"}
["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
["document1",1294696769516] -> {"content": "part 1.1, revision 1"}
["document1",1294696816974] -> {"content": "part 1.2, revision 2"}
["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
["document1",1294696709610] -> {"content": "part 1.3, revision 1"}
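
For reference, a map function roughly like the following could produce rows
of that shape from the part documents described above. This is only a sketch:
the name mapPartRevisions is hypothetical, and the emit() defined here is a
local stand-in so the snippet can be run outside CouchDB. Inside a design
document only the function itself would be used, since CouchDB provides
emit().

```javascript
// Local stand-in for CouchDB's emit(), for running this sketch outside CouchDB.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Emits one row per revision of a part, keyed by [document_id, timestamp].
function mapPartRevisions(doc) {
  if (!doc.document_id) return;  // skip documents that are not parts
  // Current revision of the part.
  emit([doc.document_id, doc.timestamp], { content: doc.content });
  // Earlier revisions stored inline in the 'revisions' array.
  (doc.revisions || []).forEach(function (rev) {
    emit([doc.document_id, rev.updated], { content: rev.content });
  });
}
```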

From this, I might want to say, give me the latest revisions at
1294696800000 (10th Jan 2011, 22:00:00 GMT), and I'd like the view to
return:

["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
["document1",1294696709610] -> {"content": "part 1.3, revision 1"}

Note that "part 1.1, revision 3" and "part 1.2, revision 2" are not
included because they have timestamps later than the specified time,
and "part 1.1, revision 1" is not included because it has a
timestamp earlier than the latest revision.

The closest I have got is to specify startkey and endkey of
'["document1",0]' and '["document1",1294696800000]', which gives me:

["document1",1294696793572] -> {"content": "part 1.1, revision 2"}
["document1",1294696769516] -> {"content": "part 1.1, revision 1"} *
["document1",1294696761684] -> {"content": "part 1.2, revision 1"}
["document1",1294696709610] -> {"content": "part 1.3, revision 1"}

The problem being that "part 1.1, revision 1" is still there, but I
don't want it to be, because a later revision is present ("content":
"part 1.1, revision 2").
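
For anyone following along, the range query described above could be built
roughly as follows. This is a sketch only: the helper name revisionsQueryUrl
is hypothetical, and the view URL passed to it (database, design document
and view names) is a placeholder rather than anything from my actual setup.

```javascript
// Builds a view query URL for one document's revisions up to a timestamp.
// viewUrl is a placeholder; the database and view names are hypothetical.
function revisionsQueryUrl(viewUrl, documentId, asOf) {
  // View keys are JSON values, so they are serialised and then URL-encoded.
  var startkey = JSON.stringify([documentId, 0]);
  var endkey = JSON.stringify([documentId, asOf]);
  return viewUrl +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey);
}

// Example call with a made-up view path:
var exampleUrl = revisionsQueryUrl(
  "http://localhost:5984/mydb/_design/parts/_view/by_document_time",
  "document1",
  1294696800000
);
```

By default CouchDB returns rows in ascending key order. To get the newest
revision first, as in the listings here, descending=true can be added, in
which case startkey and endkey have to be swapped.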

Essentially what I think I need to do at this point is 'GROUP BY' the
part's ID. But I don't think I can do this because the part ID isn't
in the emitted key.

Does that make sense..?