You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@couchdb.apache.org by "James Cohen (JIRA)" <ji...@apache.org> on 2011/08/23 13:14:29 UTC

[jira] [Created] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

eheap_alloc OOM errors when attempting to build selected views
--------------------------------------------------------------

                 Key: COUCHDB-1258
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
             Project: CouchDB
          Issue Type: Bug
         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build

Couch database exhibiting this behaviour:
{"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
            Reporter: James Cohen
            Priority: Critical


We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap

eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").

By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:

Runs out of memory v.quickly
{
   "_id": "_design/cleanup",
   "_rev": "5-e004fbab278355e9d08763877e5a8295",
   "views": {
       "byDate": {
           "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
       }
   }
}

Runs fine with minimal memory usage (returns 88128 docs in the view)
{
   "_id": "_design/cleanup",
   "_rev": "6-3823be6b72ca2441e235addfece6900c",
   "views": {
       "byDate": {
           "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
       }
   }
}

The only difference between the two is the negation of the if conditional.

memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "James Cohen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089632#comment-13089632 ] 

James Cohen commented on COUCHDB-1258:
--------------------------------------

I have just restored the same database to a completely different platform/server. This server is running Ubuntu 11.04, erlang R13B3 and CouchDB 1.0.1

I can get it to exhibit exactly the same behaviour which surprises me.

One characteristic of the database which may be unusual is the high number of deleted docs in the database. The use case for this is that documents only have a limited life-span after which we delete them from the system.

As you can see from the doc stats only a small number of the original docs are still there: "doc_count":593274,"doc_del_count":4743352,"update_seq":10559287

I wonder if someone could perform a test using a similar database (one that has a high number of deletions)

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "James Cohen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179562#comment-13179562 ] 

James Cohen commented on COUCHDB-1258:
--------------------------------------

Thanks - we've patched our installs and this works perfectly for us.
                
> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>             Fix For: 1.2, 1.1.2
>
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090410#comment-13090410 ] 

Filipe Manana commented on COUCHDB-1258:
----------------------------------------

I have a similar case. Database with about 2.6 million deleted documents.
The map function is not emitting any key/value and memory consumption goes to about 4Gb.
After the indexer finishes, and the system goes idle, memory usage stays at about 2Gb forever.
Inspecting the erlang processes, a couch_view_group process is using about 2Gb of memory:

7> process_info(list_to_pid("<0.276.0>")).
[{current_function,{gen_server,loop,6}},
 {initial_call,{proc_lib,init_p,5}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.278.0>,<0.132.0>]},
 {dictionary,[{'$ancestors',[<0.275.0>]},
              {'$initial_call',{couch_view_group,init,1}}]},
 {trap_exit,true},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.37.0>},
 {total_heap_size,323345205},
 {heap_size,38263080},
 {stack_size,9},
 {reductions,1041902893},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,538}]},
 {suspending,[]}]
8> 

Manually invoking the garbage collector against this process releases memory and after that the process is using only about 7Kb of memory:

8> erlang:garbage_collect(list_to_pid("<0.276.0>")).
true
9> process_info(list_to_pid("<0.276.0>")).          
[{current_function,{gen_server,loop,6}},
 {initial_call,{proc_lib,init_p,5}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.278.0>,<0.132.0>]},
 {dictionary,[{'$ancestors',[<0.275.0>]},
              {'$initial_call',{couch_view_group,init,1}}]},
 {trap_exit,true},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.37.0>},
 {total_heap_size,987},
 {heap_size,987},
 {stack_size,9},
 {reductions,1041902893},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,0}]},
 {suspending,[]}]
10> 

I have no idea why the view group is using so much memory.
My best call is that we need to call couch_util:should_flush/1 everytime we receive a partial_update message in the view group.
This was with OT R14B03.

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090466#comment-13090466 ] 

Paul Joseph Davis commented on COUCHDB-1258:
--------------------------------------------

Sounds reasonable. A map function that emits no keys coupled with a lot of deleted documents means that the partial_update frequency will be quite large. couch_work_queue's behave quite poorly when the dequeue side is faster than the queue side and end up basically not bulking updates together. A quick test would be to guard the partial_update message to only fire at most every 1000 writes. A quick hack to store calls in the process dict in do_writes should give a quick idea if it's the partial updates.

A should_flush for every partial update seems a bit extreme. Although I can't think of a better solution that doesn't just make an equivalent to should_flush that's more specific to this case.

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090592#comment-13090592 ] 

Filipe Manana commented on COUCHDB-1258:
----------------------------------------

Robert,
That solution works perfectly fine for my case. Memory consumption doesn't go beyond ~40Mb (for the whole Erlang VM). Perhaps a lower limit (to match the work queue max items) could be generally better, I haven't tested it however.

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Randall Leeds (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Leeds updated COUCHDB-1258:
-----------------------------------

    Affects Version/s: 1.1

Surely one of these two views emits many more documents than the other and there is something about that which is exhausting heap space. The negation in JavaScript cannot affect the memory usage of BEAM because the view server runs in a separate process with its own heap and the errors you see are from Erlang.

Someone will have to take a deep look into the view code and audit it for any sloppy memory usage.

As an experiment could you try emitting null as the value rather than doc? If the docs are very big perhaps we need to flush them in smaller batches or something.

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "James Cohen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177682#comment-13177682 ] 

James Cohen commented on COUCHDB-1258:
--------------------------------------

I'm afraid I've still not had a chance to re-compile with this patch in place. If you're happy with it working fine would it be OK to commit/merge it?

As a cleaner workaround to the dump/load solution I previously used it's possible to use filtered replication with a filter checking for !doc._deleted, this creates a fresh replica database with a doc_del_count=0
                
> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090478#comment-13090478 ] 

Robert Newson commented on COUCHDB-1258:
----------------------------------------

you could try this to see if it helps with memory usage. Verified locally that it radically reduces the number of checkpoints.

diff --git a/src/couchdb/couch_view_group.erl b/src/couchdb/couch_view_group.erl
index cde949b..7002d07 100644
--- a/src/couchdb/couch_view_group.erl
+++ b/src/couchdb/couch_view_group.erl
@@ -243,7 +243,7 @@ handle_cast({partial_update, Pid, NewGroup}, #group_state{updater_pid=Pid}
         group = Group
     } = State,
     NewSeq = NewGroup#group.current_seq,
-    case NewSeq > Group#group.current_seq of
+    case NewSeq > 1000 + Group#group.current_seq of
     true ->
         ?LOG_INFO("checkpointing view update at seq ~p for ~s ~s", [NewSeq,
             DbName, NewGroup#group.name]),

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "James Cohen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090083#comment-13090083 ] 

James Cohen commented on COUCHDB-1258:
--------------------------------------

Overnight I ran a dump/load of the database with the Python tools couchdb-dump and couchdb-load, I now have "doc_count":593274,"doc_del_count":0,"update_seq":593274

The views are generated with minimal memory usage (far better than in either case before) and the problem doesn't seem to be present.

emitting null instead of the doc results in high memory usage by beam.smp and ultimately the same OOM error

> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (COUCHDB-1258) eheap_alloc OOM errors when attempting to build selected views

Posted by "Robert Newson (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson resolved COUCHDB-1258.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.1.2
                   1.2

Resolved on 1.1.x and 1.2.x. Alleged not to be a problem on master with its fancy new indexer.
                
> eheap_alloc OOM errors when attempting to build selected views
> --------------------------------------------------------------
>
>                 Key: COUCHDB-1258
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1258
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.1
>         Environment: CentOS 5.6, 1GB RAM (approx 800MB free)
> CouchDb 1.1.0, Erlang R14B-03, js lib 1.7.0-8 EPEL RPM build
> Couch database exhibiting this behaviour:
> {"db_name":"activity_new","doc_count":593274,"doc_del_count":4743352,"update_seq":10559287,"purge_seq":0,"compact_running":false,"disk_size":3366396013,"instance_start_time":"1314097423985726","disk_format_version":5,"committed_update_seq":10559287}
>            Reporter: James Cohen
>            Priority: Critical
>             Fix For: 1.2, 1.1.2
>
>
> We spotted OOM errors crashing our CouchDb instance when attempting to rebuild selected views. CouchDb was dying with the following messages (worth noting that the type reported varies between heap/old_heap
> eheap_alloc: Cannot allocate 478288480 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "heap").
> eheap_alloc: Cannot allocate 747325720 bytes of memory (of type "old_heap").
> eheap_alloc: Cannot allocate 597860600 bytes of memory (of type "old_heap").
> By modifying the view I was able to find a view that could consistently crash the server and another that ran fine. They are as follows:
> Runs out of memory v.quickly
> {
>    "_id": "_design/cleanup",
>    "_rev": "5-e004fbab278355e9d08763877e5a8295",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (! doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> Runs fine with minimal memory usage (returns 88128 docs in the view)
> {
>    "_id": "_design/cleanup",
>    "_rev": "6-3823be6b72ca2441e235addfece6900c",
>    "views": {
>        "byDate": {
>            "map": "function(doc) { if (doc.action) emit([doc.date], doc); }"
>        }
>    }
> }
> The only difference between the two is the negation of the if conditional.
> memory usage was monitored with top on the machine while the view was being built. Under correct behaviour I could see beam.smp using just 3 or 4% of the server's memory. With the view that causes problems that memory usage increased until the RAM/swap on the server was exhausted (as you can see from the error messages around 500/700MB)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira