Posted to user@couchdb.apache.org by Nicolas Clairon <cl...@gmail.com> on 2009/03/28 11:48:08 UTC

Random issue with reduce

Hi!

I have a problem with reduce. For the following map result:

key => 1
key => 1
key => 1
key => 1
key => 1
key => 1
key => 1
key => 1
...

I sometimes get this reduce result:

key => [[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1,1],...]

while the expected result is of course:

key => [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,...]

I can't figure out why I sometimes get nested lists in the reduce result.

I'm using the latest trunk (0.9) of CouchDB on a Linux box (Ubuntu
8.04) with Erlang 12b.3,
but I also got this problem with version 0.8.

Has anyone run into this issue already?

Nicolas

PS: Sorry for my bad English...

Re: Random issue with reduce

Posted by Nicolas Clairon <cl...@gmail.com>.
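I found a workaround by handling rereduce like this:

    function(key, values, rereduce) {
        var results = [];
        var i, j;
        if (!rereduce) {
            // values come straight from the map function:
            // keep each distinct value once
            for (i = 0; i < values.length; i++) {
                if (results.indexOf(values[i]) == -1) {
                    results.push(values[i]);
                }
            }
        } else {
            // values are arrays returned by earlier reduce calls:
            // flatten them (duplicates across arrays are not removed)
            for (i = 0; i < values.length; i++) {
                for (j = 0; j < values[i].length; j++) {
                    results.push(values[i][j]);
                }
            }
        }
        return results;
    }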

But I don't understand why I sometimes need to rereduce and sometimes
(for the same expected result) it is not necessary.

This is the first version of the reduce function, which produces a result
with nested lists:

    function(key, values, rereduce) {
        var results = [];
        // keep each distinct value once; the rereduce flag is ignored here
        for (var i = 0; i < values.length; i++) {
            if (results.indexOf(values[i]) == -1) {
                results.push(values[i]);
            }
        }
        return results;
    }

If anyone has an answer...

On Sat, Mar 28, 2009 at 12:25 PM, Nicolas Clairon <cl...@gmail.com> wrote:
> I use this form of reduce to group by key the results.
> In the real life, I use this reduce to get all tag group by document id.
>
> Something like that :
>
> function(doc){
>   if(doc.doc_type == "MyDoc"){
>     for( var t in doc.tags){
>       emit(doc.name, doc.tags[t]);
>     }
>   }
> }
>
> and the reduce :
>
> function(key, values){
>  return values;
> }
>
> I get the following result:
>
> name1 = > ["tag1", "tag2", "tag3"]
> name2 => ["tag2", "tag3"]
>
> but sometime I get this result
>
> name1 = [["tag1","tag2","tag3"],["tag4","tag5","tag6","tag7"],["tag8","tag9"]]
>
> Note that if the view work well the first time, it will always work fine.
>
> It is the correct way to go ?
>
> On Sat, Mar 28, 2009 at 12:10 PM, Sven Helmberger
> <sv...@gmx.de> wrote:
>> Nicolas Clairon wrote:
>>>
>>> Nop !
>>>
>>> Here is the test  map function :
>>>
>>> function(doc){
>>>  emit("key", 1);
>>> }
>>>
>>> and the reduce function:
>>>
>>> function(key, values){
>>>  return values;
>>> }
>>>
>>> That work well for a large usecase but sometime, I get this strange
>>> behavior...
>>>
>>
>> What purpose does the reduce function have? it doesn't really reduce or
>> combine anything so it seems like it could just as well be left out.
>>
>> Regards,
>> Sven Helmberger
>>
>

Re: Random issue with reduce

Posted by Nicolas Clairon <cl...@gmail.com>.
Thank you Adam for this explanation. It's now clear that I forgot to handle
the rereduce case, which I didn't understand well (much better now).

I will also try to avoid using a reduce function when I can. My views will be
faster and my application will thank you :-)

Nicolas

On Sat, Mar 28, 2009 at 1:57 PM, Adam Kocoloski <ko...@apache.org> wrote:
> Hi Nicolas, your view code is not the best way to go.  The recommendation
> around here would be to drop the reduce function and query the map part of
> the view using your doc.name.  For big datasets you'll want to add
> key=docname to the GET request.  For smaller views or cases where you're
> interested in the entire output of the view you could just slurp the whole
> thing down and pick out the slice of the rows that you want client-side.
>
> If you really want one row per key in your results you can use a reduce
> function, but Sven alluded to the need to account for rereduce.  Your reduce
> function can operate on a list of values from the map, or on a list of
> intermediate reductions.  In the latter case you'd need to concatenate the
> arrays that you output in the earlier reductions.  For more details see the
> section on reduce in
>
> http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
>
> Reduce functions work best if their output grows logarithmically with the
> size of the input data.  Yours is linear.  If you store a lot of data in
> that DB you'll find that the view takes up a ton of space and response time
> will start to lag.  Hope that helps,
>
> Adam
>
> On Mar 28, 2009, at 7:25 AM, Nicolas Clairon wrote:
>
>> I use this form of reduce to group by key the results.
>> In the real life, I use this reduce to get all tag group by document id.
>>
>> Something like that :
>>
>> function(doc){
>>  if(doc.doc_type == "MyDoc"){
>>    for( var t in doc.tags){
>>      emit(doc.name, doc.tags[t]);
>>    }
>>  }
>> }
>>
>> and the reduce :
>>
>> function(key, values){
>>  return values;
>> }
>>
>> I get the following result:
>>
>> name1 = > ["tag1", "tag2", "tag3"]
>> name2 => ["tag2", "tag3"]
>>
>> but sometime I get this result
>>
>> name1 =
>> [["tag1","tag2","tag3"],["tag4","tag5","tag6","tag7"],["tag8","tag9"]]
>>
>> Note that if the view work well the first time, it will always work fine.
>>
>> It is the correct way to go ?
>>
>> On Sat, Mar 28, 2009 at 12:10 PM, Sven Helmberger
>> <sv...@gmx.de> wrote:
>>>
>>> Nicolas Clairon wrote:
>>>>
>>>> Nop !
>>>>
>>>> Here is the test  map function :
>>>>
>>>> function(doc){
>>>>  emit("key", 1);
>>>> }
>>>>
>>>> and the reduce function:
>>>>
>>>> function(key, values){
>>>>  return values;
>>>> }
>>>>
>>>> That work well for a large usecase but sometime, I get this strange
>>>> behavior...
>>>>
>>>
>>> What purpose does the reduce function have? it doesn't really reduce or
>>> combine anything so it seems like it could just as well be left out.
>>>
>>> Regards,
>>> Sven Helmberger
>>>
>
>

Re: Random issue with reduce

Posted by Adam Kocoloski <ko...@apache.org>.
Hi Nicolas, your view code is not the best way to go.  The  
recommendation around here would be to drop the reduce function and  
query the map part of the view using your doc.name.  For big datasets  
you'll want to add key=docname to the GET request.  For smaller views  
or cases where you're interested in the entire output of the view you  
could just slurp the whole thing down and pick out the slice of the  
rows that you want client-side.
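
A minimal sketch of this suggestion, not taken from the thread itself. The first function builds a query URL restricted to one key, and the second groups the rows of a map-only view client-side. The database name "mydb", design document "tags", view name "by_name" and both helper names are made up for illustration. The path layout assumed here is the `_design/.../_view/...` form used from 0.9 on; 0.8 used a different path, so check your version.

```javascript
// Hypothetical names: database "mydb", design doc "tags", view "by_name".
function buildViewUrl(base, db, designDoc, view, key) {
    // The key has to be JSON-encoded first, then URL-encoded.
    return base + "/" + db + "/_design/" + designDoc + "/_view/" + view +
        "?key=" + encodeURIComponent(JSON.stringify(key));
}

// Group the rows of a map-only view response by key, client-side.
function groupRows(rows) {
    var grouped = {};
    for (var i = 0; i < rows.length; i++) {
        var row = rows[i];
        if (!Object.prototype.hasOwnProperty.call(grouped, row.key)) {
            grouped[row.key] = [];
        }
        grouped[row.key].push(row.value);
    }
    return grouped;
}

// Sample rows shaped like the "rows" array of a view response.
var sampleRows = [
    {id: "a", key: "name1", value: "tag1"},
    {id: "a", key: "name1", value: "tag2"},
    {id: "b", key: "name2", value: "tag2"}
];
var url = buildViewUrl("http://localhost:5984", "mydb", "tags", "by_name", "name1");
var grouped = groupRows(sampleRows);
```

With this approach the view needs no reduce function at all: the grouping that the reduce was meant to do happens after the rows arrive.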

If you really want one row per key in your results you can use a  
reduce function, but Sven alluded to the need to account for  
rereduce.  Your reduce function can operate on a list of values from  
the map, or on a list of intermediate reductions.  In the latter case  
you'd need to concatenate the arrays that you output in the earlier  
reductions.  For more details see the section on reduce in

http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
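
The concatenation step described above could look roughly like the sketch below. The function is given a name (`collect`) only so it can be called here; in a design document it would be an anonymous function. The two "partial" calls are a hand-made stand-in for what the view engine does, not its real call sequence.

```javascript
// Reduce that collects values into one flat array per key.
// The keys argument is not used, so the calls below pass null.
function collect(keys, values, rereduce) {
    if (!rereduce) {
        // First pass: values come straight from the map function.
        return values;
    }
    // Rereduce: each entry in values is an array returned by an earlier call,
    // so the arrays are concatenated instead of being returned nested.
    var flat = [];
    for (var i = 0; i < values.length; i++) {
        flat = flat.concat(values[i]);
    }
    return flat;
}

// Two partial reductions followed by one rereduce over their results.
var partA = collect(null, ["tag1", "tag2", "tag3"], false);
var partB = collect(null, ["tag4", "tag5"], false);
var combined = collect(null, [partA, partB], true);
```

Here `combined` is a single flat list of the five tags, whereas returning `values` unchanged in the rereduce branch would give a list of two lists.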

Reduce functions work best if their output grows logarithmically with  
the size of the input data.  Yours is linear.  If you store a lot of  
data in that DB you'll find that the view takes up a ton of space and  
response time will start to lag.  Hope that helps,
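
For contrast, a sketch of a reduce whose output stays small no matter how many rows go in: it counts the tags per key instead of collecting them. This answers a different question than the view in this thread (how many tags, not which ones), so it only illustrates the size argument. The name `countTags` is made up for the example.

```javascript
// Reduce whose output is a single number per key.
function countTags(keys, values, rereduce) {
    if (!rereduce) {
        // First pass: one map row per tag, so count the rows.
        return values.length;
    }
    // Rereduce: values are partial counts, so add them up.
    var sum = 0;
    for (var i = 0; i < values.length; i++) {
        sum += values[i];
    }
    return sum;
}

var partial1 = countTags(null, ["tag1", "tag2", "tag3"], false);
var partial2 = countTags(null, ["tag4", "tag5"], false);
var grandTotal = countTags(null, [partial1, partial2], true);
```

The intermediate results here are the numbers 3 and 2 and the final result is 5, so the stored reduction does not grow with the number of tags.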

Adam

On Mar 28, 2009, at 7:25 AM, Nicolas Clairon wrote:

> I use this form of reduce to group by key the results.
> In the real life, I use this reduce to get all tag group by document  
> id.
>
> Something like that :
>
> function(doc){
>   if(doc.doc_type == "MyDoc"){
>     for( var t in doc.tags){
>       emit(doc.name, doc.tags[t]);
>     }
>   }
> }
>
> and the reduce :
>
> function(key, values){
>  return values;
> }
>
> I get the following result:
>
> name1 = > ["tag1", "tag2", "tag3"]
> name2 => ["tag2", "tag3"]
>
> but sometime I get this result
>
> name1 = [["tag1","tag2","tag3"],["tag4","tag5","tag6","tag7"], 
> ["tag8","tag9"]]
>
> Note that if the view work well the first time, it will always work  
> fine.
>
> It is the correct way to go ?
>
> On Sat, Mar 28, 2009 at 12:10 PM, Sven Helmberger
> <sv...@gmx.de> wrote:
>> Nicolas Clairon wrote:
>>>
>>> Nop !
>>>
>>> Here is the test  map function :
>>>
>>> function(doc){
>>>  emit("key", 1);
>>> }
>>>
>>> and the reduce function:
>>>
>>> function(key, values){
>>>  return values;
>>> }
>>>
>>> That work well for a large usecase but sometime, I get this strange
>>> behavior...
>>>
>>
>> What purpose does the reduce function have? it doesn't really  
>> reduce or
>> combine anything so it seems like it could just as well be left out.
>>
>> Regards,
>> Sven Helmberger
>>


Re: Random issue with reduce

Posted by Nicolas Clairon <cl...@gmail.com>.
I use this form of reduce to group the results by key.
In real life, I use this reduce to get all tags grouped by document id.

Something like this:

function(doc){
  if (doc.doc_type == "MyDoc" && doc.tags) {
    // emit one row per tag, keyed by the document name
    for (var t = 0; t < doc.tags.length; t++) {
      emit(doc.name, doc.tags[t]);
    }
  }
}

and the reduce:

function(key, values){
  return values;
}

I get the following result:

name1 => ["tag1", "tag2", "tag3"]
name2 => ["tag2", "tag3"]

but sometimes I get this result:

name1 => [["tag1","tag2","tag3"],["tag4","tag5","tag6","tag7"],["tag8","tag9"]]

Note that if the view works well the first time, it will always work fine.

Is this the correct way to go?

On Sat, Mar 28, 2009 at 12:10 PM, Sven Helmberger
<sv...@gmx.de> wrote:
> Nicolas Clairon wrote:
>>
>> Nop !
>>
>> Here is the test  map function :
>>
>> function(doc){
>>  emit("key", 1);
>> }
>>
>> and the reduce function:
>>
>> function(key, values){
>>  return values;
>> }
>>
>> That work well for a large usecase but sometime, I get this strange
>> behavior...
>>
>
> What purpose does the reduce function have? it doesn't really reduce or
> combine anything so it seems like it could just as well be left out.
>
> Regards,
> Sven Helmberger
>

Re: Random issue with reduce

Posted by Sven Helmberger <sv...@gmx.de>.
Nicolas Clairon wrote:
> Nop !
> 
> Here is the test  map function :
> 
> function(doc){
>   emit("key", 1);
> }
> 
> and the reduce function:
> 
> function(key, values){
>   return values;
> }
> 
> That work well for a large usecase but sometime, I get this strange behavior...
> 

What purpose does the reduce function have? It doesn't really reduce or
combine anything, so it seems like it could just as well be left out.

Regards,
Sven Helmberger

Re: Random issue with reduce

Posted by Nicolas Clairon <cl...@gmail.com>.
Nope!

Here is the test map function:

function(doc){
  emit("key", 1);
}

and the reduce function:

function(key, values){
  return values;
}

That works well in many cases, but sometimes I get this strange behavior...

On Sat, Mar 28, 2009 at 11:56 AM, Sven Helmberger
<sv...@gmx.de> wrote:
> Nicolas Clairon wrote:
>>
>> Hi !
>>
>> I get problem with reduce. For the following map result :
>>
>> key => 1
>> key => 1
>> key => 1
>> key => 1
>> key => 1
>> key => 1
>> key => 1
>> key => 1
>> ...
>>
>> I sometime get this reduce result:
>>
>> key =>
>> [[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1,1,],...]
>>
>> while the expected result is of course :
>>
>> key => [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,...]
>>
>> I can't figure out why I get sometime nested lists in reduce result.
>>
>> I'm using the latest trunk (0.9) of couchdb on a Linux box (Ubuntu
>> 8.04) with erlang 12b.3
>> but I also got this problem with 0.8 version.
>>
>> Does anyone went through this issues already ?
>>
>> Nicolas
>>
>> PS : I'm so sorry for my bad english...
>
> Do you account for rereduce in your reduce function?
>
> http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
>
> Regards,
> Sven Helmberger
>

Re: Random issue with reduce

Posted by Sven Helmberger <sv...@gmx.de>.
Nicolas Clairon wrote:
> Hi !
> 
> I get problem with reduce. For the following map result :
> 
> key => 1
> key => 1
> key => 1
> key => 1
> key => 1
> key => 1
> key => 1
> key => 1
> ...
> 
> I sometime get this reduce result:
> 
> key => [[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1,1,],...]
> 
> while the expected result is of course :
> 
> key => [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,...]
> 
> I can't figure out why I get sometime nested lists in reduce result.
> 
> I'm using the latest trunk (0.9) of couchdb on a Linux box (Ubuntu
> 8.04) with erlang 12b.3
> but I also got this problem with 0.8 version.
> 
> Does anyone went through this issues already ?
> 
> Nicolas
> 
> PS : I'm so sorry for my bad english...

Do you account for rereduce in your reduce function?

http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
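
To make the question concrete, here is a toy simulation of why a reduce that ignores rereduce only sometimes returns nested lists. The `simulateView` helper and its fixed chunk size are invented for illustration. The real view engine decides on its own how rows are split up, which is presumably why the nesting looks random from the outside.

```javascript
// The reduce function from the original post: it returns its input as-is.
function naiveReduce(keys, values, rereduce) {
    return values;
}

// Toy stand-in for the view engine: reduce fixed-size chunks, then
// rereduce the partial results if there was more than one chunk.
function simulateView(reduce, values, chunkSize) {
    var partials = [];
    for (var i = 0; i < values.length; i += chunkSize) {
        partials.push(reduce(null, values.slice(i, i + chunkSize), false));
    }
    if (partials.length === 1) {
        return partials[0];
    }
    return reduce(null, partials, true);
}

// Three rows fit in one chunk, six rows need two chunks.
var small = simulateView(naiveReduce, [1, 1, 1], 4);
var large = simulateView(naiveReduce, [1, 1, 1, 1, 1, 1], 4);
```

In this simulation `small` comes back flat because no rereduce step was needed, while `large` comes back as a list of two lists because the second call received the partial results as its values.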

Regards,
Sven Helmberger