You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@couchdb.apache.org by Doug Reeder <re...@gmail.com> on 2010/09/27 01:46:54 UTC

Subselect Possible?

I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.

I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".   

The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0

Can this be done in CouchDB, using a limited, small number of queries?

My closest sally is the map function
function(doc) {
  if (doc.work < 0) {
    var key = [];
    var i;
    for (i=0; i<doc._id.length; ++i)
      key.push(doc._id.charCodeAt(i));
    emit(key, doc.work);
  }
}

and the reduce function 
function(keys, values, rereduce) {
  var shortestKeyLength = Infinity;
  for (var i=0; i<keys.length; ++i) {
    if (keys[i][0].length < shortestKeyLength)
	shortestKeyLength = keys[i][0].length;
  }

  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
}

A query such as
curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
will return the data necessary for items at the second level, but the desired items could be at any level.



Doug Reeder
reeder.29@gmail.com
http://reeder29.livejournal.com/
https://twitter.com/reeder29

https://twitter.com/hominidsoftware
http://outlinetracker.com









Re: Subselect Possible?

Posted by Chris Anderson <jc...@apache.org>.
On Tue, Sep 28, 2010 at 9:36 AM, Doug Reeder <re...@gmail.com> wrote:
> The cmlenz post on CouchDB "Joins" is useful; are there other articles on CouchDB design patterns?
>

banking transactions:

http://guide.couchdb.org/draft/recipes.html#banking

sql-ish:

http://guide.couchdb.org/draft/cookbook.html

>
> On Sep 28, 2010, at 11:18 AM, Paul Davis wrote:
>
>> On Tue, Sep 28, 2010 at 9:04 AM, Doug Reeder <re...@gmail.com> wrote:
>>> Sorry, my earlier response seems to have gone astray.
>>>
>>> Thank you, but this map and reduce function produce exactly the same results as mine.
>>>
>>> Is there a general method of structuring data and map-reduce functions to do the same thing that a subselect does?
>>>
>>
>> No, there's nothing in general. SQL is a query language. Views are
>> range matches. The question is if you can rephrase your subselect to
>> give you the same data using collation. It may seem a bit limiting at
>> first, but there are a few different methods for arranging data.
>>
>> For instance, things like this:
>> http://www.cmlenz.net/archives/2007/10/couchdb-joins
>> Or this might be helpful:
>> http://blog.couchone.com/post/1167966323/a-gentle-introduction-to-couchdb-for-relational
>>
>> If you're still not able to get what you want, you might want to try
>> couchdb-lucene.
>>
>> HTH,
>> Paul Davis
>>
>>>
>>> On Sep 26, 2010, at 8:16 PM, Paul Davis wrote:
>>>
>>>> Hrm, I haven't thought this through to the end, but something that
>>>> might get you started:
>>>>
>>>> Change your emit to emit(key, 1);
>>>>
>>>> Change your reduce to:
>>>>
>>>> function(keys, vals) {
>>>>    return sum(vals);
>>>> }
>>>>
>>>> Then query like such:
>>>>
>>>> ?key=query_key&limit=1
>>>> ?group_level=length(query_key)&key=query_key
>>>>
>>>> It seems like there should be a fancy way to get the reduce call to
>>>> tell you if the original node had < 0 work but I'm distracted by other
>>>> code right now.
>>>>
>>>> HTH,
>>>> Paul Davis
>>>>
>>>> On Sun, Sep 26, 2010 at 7:46 PM, Doug Reeder <re...@gmail.com> wrote:
>>>>> I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.
>>>>>
>>>>> I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".
>>>>>
>>>>> The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
>>>>> SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0
>>>>>
>>>>> Can this be done in CouchDB, using a limited, small number of queries?
>>>>>
>>>>> My closest sally is the map function
>>>>> function(doc) {
>>>>>  if (doc.work < 0) {
>>>>>    var key = [];
>>>>>    var i;
>>>>>    for (i=0; i<doc._id.length; ++i)
>>>>>      key.push(doc._id.charCodeAt(i));
>>>>>    emit(key, doc.work);
>>>>>  }
>>>>> }
>>>>>
>>>>> and the reduce function
>>>>> function(keys, values, rereduce) {
>>>>>  var shortestKeyLength = Infinity;
>>>>>  for (var i=0; i<keys.length; ++i) {
>>>>>    if (keys[i][0].length < shortestKeyLength)
>>>>>        shortestKeyLength = keys[i][0].length;
>>>>>  }
>>>>>
>>>>>  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
>>>>> }
>>>>>
>>>>> A query such as
>>>>> curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
>>>>> will return the data necessary for items at the second level, but the desired items could be at any level.
>>>
>>> Doug Reeder
>>> reeder.29@gmail.com
>>> http://reeder29.livejournal.com/
>>> https://twitter.com/reeder29
>>>
>>> https://twitter.com/hominidsoftware
>>> http://outlinetracker.com
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>
> Doug Reeder
> reeder.29@gmail.com
> http://reeder29.livejournal.com/
> https://twitter.com/reeder29
>
> https://twitter.com/hominidsoftware
> http://outlinetracker.com
>
>
>
>
>
>
>
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: Subselect Possible?

Posted by Doug Reeder <re...@gmail.com>.
The cmlenz post on CouchDB "Joins" is useful; are there other articles on CouchDB design patterns?


On Sep 28, 2010, at 11:18 AM, Paul Davis wrote:

> On Tue, Sep 28, 2010 at 9:04 AM, Doug Reeder <re...@gmail.com> wrote:
>> Sorry, my earlier response seems to have gone astray.
>> 
>> Thank you, but this map and reduce function produce exactly the same results as mine.
>> 
>> Is there a general method of structuring data and map-reduce functions to do the same thing that a subselect does?
>> 
> 
> No, there's nothing in general. SQL is a query language. Views are
> range matches. The question is if you can rephrase your subselect to
> give you the same data using collation. It may seem a bit limiting at
> first, but there are a few different methods for arranging data.
> 
> For instance, things like this:
> http://www.cmlenz.net/archives/2007/10/couchdb-joins
> Or this might be helpful:
> http://blog.couchone.com/post/1167966323/a-gentle-introduction-to-couchdb-for-relational
> 
> If you're still not able to get what you want, you might want to try
> couchdb-lucene.
> 
> HTH,
> Paul Davis
> 
>> 
>> On Sep 26, 2010, at 8:16 PM, Paul Davis wrote:
>> 
>>> Hrm, I haven't thought this through to the end, but something that
>>> might get you started:
>>> 
>>> Change your emit to emit(key, 1);
>>> 
>>> Change your reduce to:
>>> 
>>> function(keys, vals) {
>>>    return sum(vals);
>>> }
>>> 
>>> Then query like such:
>>> 
>>> ?key=query_key&limit=1
>>> ?group_level=length(query_key)&key=query_key
>>> 
>>> It seems like there should be a fancy way to get the reduce call to
>>> tell you if the original node had < 0 work but I'm distracted by other
>>> code right now.
>>> 
>>> HTH,
>>> Paul Davis
>>> 
>>> On Sun, Sep 26, 2010 at 7:46 PM, Doug Reeder <re...@gmail.com> wrote:
>>>> I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.
>>>> 
>>>> I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".
>>>> 
>>>> The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
>>>> SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0
>>>> 
>>>> Can this be done in CouchDB, using a limited, small number of queries?
>>>> 
>>>> My closest sally is the map function
>>>> function(doc) {
>>>>  if (doc.work < 0) {
>>>>    var key = [];
>>>>    var i;
>>>>    for (i=0; i<doc._id.length; ++i)
>>>>      key.push(doc._id.charCodeAt(i));
>>>>    emit(key, doc.work);
>>>>  }
>>>> }
>>>> 
>>>> and the reduce function
>>>> function(keys, values, rereduce) {
>>>>  var shortestKeyLength = Infinity;
>>>>  for (var i=0; i<keys.length; ++i) {
>>>>    if (keys[i][0].length < shortestKeyLength)
>>>>        shortestKeyLength = keys[i][0].length;
>>>>  }
>>>> 
>>>>  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
>>>> }
>>>> 
>>>> A query such as
>>>> curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
>>>> will return the data necessary for items at the second level, but the desired items could be at any level.
>> 
>> Doug Reeder
>> reeder.29@gmail.com
>> http://reeder29.livejournal.com/
>> https://twitter.com/reeder29
>> 
>> https://twitter.com/hominidsoftware
>> http://outlinetracker.com
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 

Doug Reeder
reeder.29@gmail.com
http://reeder29.livejournal.com/
https://twitter.com/reeder29

https://twitter.com/hominidsoftware
http://outlinetracker.com









Re: Subselect Possible?

Posted by Doug Reeder <re...@gmail.com>.
Thanks!   I extended it as you suggested to emit keys for all ancestors (and I changed from storing the materialized path in the id to storing it in a separate path field), so my map function is

function(doc) {
	var i;
	if (doc.work < 0 && doc.path.length >= 2) {
		emit(doc.path, false);
		for( i = 2; i < doc.path.length; ++i )
			emit(doc.path.slice(0, i), true);
	}
}

and my reduce function is

function (keys, values, rereduce) {
	var hasUndoneDescendants = false;
	for (var i=0; i<keys.length; ++i) {
		if (values[i])
			hasUndoneDescendants = true;
	}

	return hasUndoneDescendants;
}

Now it works as intended!

On Sep 28, 2010, at 1:03 PM, Dennis Clark wrote:

> function(doc) {
> if (doc.work < 0) {
>   emit(doc._id,[doc.work,0]);
>   emit(doc.parent._id,[doc.work,1]);
> }
> }

Doug Reeder
reeder.29@gmail.com
http://reeder29.livejournal.com/
https://twitter.com/reeder29

https://twitter.com/hominidsoftware
http://outlinetracker.com









Re: Subselect Possible?

Posted by Dennis Clark <db...@gmail.com>.
I believe that you should be able to get something like this from the
following map function (pardon the pseudocode; I don't know your data id's)

function(doc) {
 if (doc.work < 0) {
   emit(doc._id,[doc.work,0]);
   emit(doc.parent._id,[doc.work,1]);
 }
}

where doc.parent._id is whatever the ID of the parent is. Now querying with
startkey= a_given_id and endkey=a_given_id  gives you the node with
a_given_id if it has zero work and any of its zero-work children. This can
be extended pretty straightforwardly to descendents of arbitrary depth. You
can then use a reduce (or better a list if nodes with <0 work are not
sparse) which looks at the second element of the value to chop this down to
just the boolean.

Re: Subselect Possible?

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Sep 28, 2010 at 9:04 AM, Doug Reeder <re...@gmail.com> wrote:
> Sorry, my earlier response seems to have gone astray.
>
> Thank you, but this map and reduce function produce exactly the same results as mine.
>
> Is there a general method of structuring data and map-reduce functions to do the same thing that a subselect does?
>

No, there's nothing in general. SQL is a query language. Views are
range matches. The question is if you can rephrase your subselect to
give you the same data using collation. It may seem a bit limiting at
first, but there are a few different methods for arranging data.

For instance, things like this:
http://www.cmlenz.net/archives/2007/10/couchdb-joins
Or this might be helpful:
http://blog.couchone.com/post/1167966323/a-gentle-introduction-to-couchdb-for-relational

If you're still not able to get what you want, you might want to try
couchdb-lucene.

HTH,
Paul Davis

>
> On Sep 26, 2010, at 8:16 PM, Paul Davis wrote:
>
>> Hrm, I haven't thought this through to the end, but something that
>> might get you started:
>>
>> Change your emit to emit(key, 1);
>>
>> Change your reduce to:
>>
>> function(keys, vals) {
>>    return sum(vals);
>> }
>>
>> Then query like such:
>>
>> ?key=query_key&limit=1
>> ?group_level=length(query_key)&key=query_key
>>
>> It seems like there should be a fancy way to get the reduce call to
>> tell you if the original node had < 0 work but I'm distracted by other
>> code right now.
>>
>> HTH,
>> Paul Davis
>>
>> On Sun, Sep 26, 2010 at 7:46 PM, Doug Reeder <re...@gmail.com> wrote:
>>> I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.
>>>
>>> I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".
>>>
>>> The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
>>> SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0
>>>
>>> Can this be done in CouchDB, using a limited, small number of queries?
>>>
>>> My closest sally is the map function
>>> function(doc) {
>>>  if (doc.work < 0) {
>>>    var key = [];
>>>    var i;
>>>    for (i=0; i<doc._id.length; ++i)
>>>      key.push(doc._id.charCodeAt(i));
>>>    emit(key, doc.work);
>>>  }
>>> }
>>>
>>> and the reduce function
>>> function(keys, values, rereduce) {
>>>  var shortestKeyLength = Infinity;
>>>  for (var i=0; i<keys.length; ++i) {
>>>    if (keys[i][0].length < shortestKeyLength)
>>>        shortestKeyLength = keys[i][0].length;
>>>  }
>>>
>>>  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
>>> }
>>>
>>> A query such as
>>> curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
>>> will return the data necessary for items at the second level, but the desired items could be at any level.
>
> Doug Reeder
> reeder.29@gmail.com
> http://reeder29.livejournal.com/
> https://twitter.com/reeder29
>
> https://twitter.com/hominidsoftware
> http://outlinetracker.com
>
>
>
>
>
>
>
>
>

Re: Subselect Possible?

Posted by Doug Reeder <re...@gmail.com>.
Sorry, my earlier response seems to have gone astray.

Thank you, but this map and reduce function produce exactly the same results as mine.

Is there a general method of structuring data and map-reduce functions to do the same thing that a subselect does?


On Sep 26, 2010, at 8:16 PM, Paul Davis wrote:

> Hrm, I haven't thought this through to the end, but something that
> might get you started:
> 
> Change your emit to emit(key, 1);
> 
> Change your reduce to:
> 
> function(keys, vals) {
>    return sum(vals);
> }
> 
> Then query like such:
> 
> ?key=query_key&limit=1
> ?group_level=length(query_key)&key=query_key
> 
> It seems like there should be a fancy way to get the reduce call to
> tell you if the original node had < 0 work but I'm distracted by other
> code right now.
> 
> HTH,
> Paul Davis
> 
> On Sun, Sep 26, 2010 at 7:46 PM, Doug Reeder <re...@gmail.com> wrote:
>> I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.
>> 
>> I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".
>> 
>> The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
>> SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0
>> 
>> Can this be done in CouchDB, using a limited, small number of queries?
>> 
>> My closest sally is the map function
>> function(doc) {
>>  if (doc.work < 0) {
>>    var key = [];
>>    var i;
>>    for (i=0; i<doc._id.length; ++i)
>>      key.push(doc._id.charCodeAt(i));
>>    emit(key, doc.work);
>>  }
>> }
>> 
>> and the reduce function
>> function(keys, values, rereduce) {
>>  var shortestKeyLength = Infinity;
>>  for (var i=0; i<keys.length; ++i) {
>>    if (keys[i][0].length < shortestKeyLength)
>>        shortestKeyLength = keys[i][0].length;
>>  }
>> 
>>  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
>> }
>> 
>> A query such as
>> curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
>> will return the data necessary for items at the second level, but the desired items could be at any level.

Doug Reeder
reeder.29@gmail.com
http://reeder29.livejournal.com/
https://twitter.com/reeder29

https://twitter.com/hominidsoftware
http://outlinetracker.com









Re: Subselect Possible?

Posted by Paul Davis <pa...@gmail.com>.
Hrm, I haven't thought this through to the end, but something that
might get you started:

Change your emit to emit(key, 1);

Change your reduce to:

function(keys, vals) {
    return sum(vals);
}

Then query like such:

?key=query_key&limit=1
?group_level=length(query_key)&key=query_key

It seems like there should be a fancy way to get the reduce call to
tell you if the original node had < 0 work but I'm distracted by other
code right now.

HTH,
Paul Davis

On Sun, Sep 26, 2010 at 7:46 PM, Doug Reeder <re...@gmail.com> wrote:
> I've searched the wiki and googled for "subselect" and "subquery", and not found anything relevelant.
>
> I'm evaluating whether CouchDB can handle an app which currently stores a tree structure in a relational database using the "materialized path" represenation.  For example, the item with path "QG" is a child of the item with path "Q" and a parent of the item with path "QGC".
>
> The trickiest query is, in English, "find all items with work less than zero and whether they have any children with work less than zero".  The SQL for this is
> SELECT item.*, EXISTS (SELECT work FROM item AS d WHERE d.path > item.path AND d.path <= item.path || '\ufffd' AND d.work < 0) AS undonePrereq, FROM item WHERE work < 0
>
> Can this be done in CouchDB, using a limited, small number of queries?
>
> My closest sally is the map function
> function(doc) {
>  if (doc.work < 0) {
>    var key = [];
>    var i;
>    for (i=0; i<doc._id.length; ++i)
>      key.push(doc._id.charCodeAt(i));
>    emit(key, doc.work);
>  }
> }
>
> and the reduce function
> function(keys, values, rereduce) {
>  var shortestKeyLength = Infinity;
>  for (var i=0; i<keys.length; ++i) {
>    if (keys[i][0].length < shortestKeyLength)
>        shortestKeyLength = keys[i][0].length;
>  }
>
>  return [keys.length, shortestKeyLength, keys[0][0].slice(0,shortestKeyLength), values.length];
> }
>
> A query such as
> curl -X GET 'http://127.0.0.1:5984/tracker1/_design/trackerA/_view/undoneArrayNum?startkey=\[45,65,65\]&endkey=\[45,65,9999\]&group_level=2'
> will return the data necessary for items at the second level, but the desired items could be at any level.
>
>
>
> Doug Reeder
> reeder.29@gmail.com
> http://reeder29.livejournal.com/
> https://twitter.com/reeder29
>
> https://twitter.com/hominidsoftware
> http://outlinetracker.com
>
>
>
>
>
>
>
>
>