Posted to dev@couchdb.apache.org by "David Orrell (JIRA)" <ji...@apache.org> on 2010/11/25 11:59:14 UTC

[jira] Created: (COUCHDB-964) Large memory usage downloading attachements

Large memory usage downloading attachements
-------------------------------------------

                 Key: COUCHDB-964
                 URL: https://issues.apache.org/jira/browse/COUCHDB-964
             Project: CouchDB
          Issue Type: Bug
          Components: HTTP Interface
    Affects Versions: 1.0.1
         Environment: Linux, Erlang R14B
            Reporter: David Orrell


When downloading a large attachment, the CouchDB process appears to load the entire attachment into memory before data is sent to the client. I have a 1.5 GB attachment, and the CouchDB process grows by approximately that amount per client connection.

For example (as reported by Bram Neijt):

dd if=/dev/urandom of=/tmp/test.bin count=50000 bs=10240

Put test.bin as an attachment in a couchdb database, then run:

for i in {0..50}; do curl http://localhost:5984/[test database]/[doc_id]/test.bin > /dev/null 2>&1 & done

This will create 51 curl processes which download from your couchdb concurrently. Looking at the memory consumption of couchdb while they run, it seems to be loading large parts of the file into memory.
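
(For completeness, here is what the full reproduction might look like as one script. This is a minimal sketch, assuming CouchDB is listening on localhost:5984 with no admin credentials; the database name "testdb" and document id "doc1" are arbitrary placeholders.)

#!/bin/bash
# 1. Generate a ~0.5 GB file of random data.
dd if=/dev/urandom of=/tmp/test.bin count=50000 bs=10240

# 2. Create a database and an empty document to attach the file to,
#    capturing the document revision from the response.
curl -X PUT http://localhost:5984/testdb
REV=$(curl -s -X PUT http://localhost:5984/testdb/doc1 -d '{}' \
      | sed 's/.*"rev":"\([^"]*\)".*/\1/')

# 3. Upload test.bin as a standalone attachment.
curl -X PUT "http://localhost:5984/testdb/doc1/test.bin?rev=$REV" \
     -H 'Content-Type: application/octet-stream' \
     --data-binary @/tmp/test.bin

# 4. Start 51 concurrent downloads and wait for them to finish.
for i in {0..50}; do
  curl http://localhost:5984/testdb/doc1/test.bin > /dev/null 2>&1 &
done
wait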


[jira] Commented: (COUCHDB-964) Large memory usage downloading attachments

Posted by "David Orrell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935799#action_12935799 ] 

David Orrell commented on COUCHDB-964:
--------------------------------------

Robert, thanks for looking into this. I'm running this on Red Hat EL5 on a box with a 3.2 GHz Xeon and 4 GB of memory.

For me the test clearly shows that, when downloading a 0.5 GB file, the CouchDB process grows by almost exactly that amount for each concurrent connection, then drops back down by the same amount once the data starts being transferred to the client.

I'm monitoring this by looking at the RES memory in top.
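
(If it helps to capture that over time: below is a rough sketch that samples the resident set size of the Erlang VM once per second, assuming the process shows up as "beam.smp" as it does on most Linux installs.)

# Sample the RSS (in KB) of the Erlang VM once per second.
while true; do
  ps -o rss= -C beam.smp | awk -v t="$(date +%T)" '{print t, $1, "KB"}'
  sleep 1
done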



[jira] Updated: (COUCHDB-964) Large memory usage downloading attachments

Posted by "David Orrell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Orrell updated COUCHDB-964:
---------------------------------

    Summary: Large memory usage downloading attachments  (was: Large memory usage downloading attachements)



[jira] Commented: (COUCHDB-964) Large memory usage downloading attachments

Posted by "A. Bram Neijt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964694#action_12964694 ] 

A. Bram Neijt commented on COUCHDB-964:
---------------------------------------

I've not been able to reproduce the full report. I think it is probably an effect of the garbage collector not getting time to free memory during request handling.

I've done the following:

dd if=/dev/urandom of=blob bs=1024 count=102400

Then I created a database called "test" with a document with _id "large" and attached "blob" to it. I edited /etc/init.d/couchdb and added "ulimit -v 204800;" to the su line (line 137), so that it reads:

137         if ! su $COUCHDB_USER -c "ulimit -v 204800; $command" > /dev/null; then

After that I restarted my couchdb and started the downloads:

for i in {0..50}; do curl http://admin:admin@localhost:5984/test/large/blob > /dev/null & done

(I have a user "admin" with password "admin" at the moment; remove the admin:admin@ part if you are in admin party mode.)

About 4 of the 51 requests survive this; the rest are closed before the end of the transfer or fail to connect to the host at all. (A way to count the survivors precisely is sketched below.)
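
(Counting sketch: record each curl exit status instead of discarding it; this assumes the same test/large/blob URL and admin:admin credentials as above.)

# Run the 51 downloads, recording each curl exit status,
# then count how many completed successfully (exit status 0).
rm -f /tmp/results
for i in {0..50}; do
  ( curl -sf http://admin:admin@localhost:5984/test/large/blob > /dev/null
    echo "request $i exited $?" >> /tmp/results ) &
done
wait
grep -c 'exited 0$' /tmp/results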

From here on I can only speculate, because going further would require more knowledge of the couchdb server code. It seems that couchdb simply uses as much memory as it is allowed to for the file transfer; larger files will probably mean less garbage collection? The only worrying thing is that if you have 1 GB attachments and also 1 GB of memory, two users downloading that attachment will get your couchdb to refuse connections until the downloads complete, which may not be desired.
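
(One way to probe the garbage-collection hypothesis, again assuming the VM process is named beam.smp: log the VM's RSS for the duration of a single download. Roughly linear growth for the whole transfer would point at buffering; growth that stops well below the attachment size would point at the GC merely lagging behind.)

# Log beam.smp RSS once per second while a single download runs.
curl -s http://admin:admin@localhost:5984/test/large/blob > /dev/null &
CURL_PID=$!
while kill -0 "$CURL_PID" 2>/dev/null; do
  ps -o rss= -C beam.smp >> /tmp/rss.log
  sleep 1
done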





[jira] Commented: (COUCHDB-964) Large memory usage downloading attachments

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935752#action_12935752 ] 

Robert Newson commented on COUCHDB-964:
---------------------------------------


I've run this test on OS X (10.6.5) and cannot reproduce the problem. The RSIZE of beam.smp remains at 180 MB throughout the test and drops to 40 MB on completion.

CouchDB does not buffer the whole attachment into memory. It could be that Erlang's GC is unable to keep up on your hardware and therefore the unreferenced chunks of the attachment are still retained in memory.



[jira] Commented: (COUCHDB-964) Large memory usage downloading attachments

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935762#action_12935762 ] 

Robert Newson commented on COUCHDB-964:
---------------------------------------

I've failed to reproduce this locally by following your instructions.
My memory usage was stable (OS X). Another user has tried the test on
Linux with R13 and reports stable memory usage also.

Can you provide more details of the OS, hardware and the manner in
which you are monitoring the memory usage itself? I'd like to
eliminate as many distracting factors as possible.
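
(For reference, a quick sketch for gathering those details in one pass on Linux, assuming the VM shows up as beam.smp:)

# OS, CPU, memory, and current CouchDB VM usage in one pass (Linux).
uname -a
grep -m1 'model name' /proc/cpuinfo
free -m
ps -o pid,rss,vsz,comm -C beam.smp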

