You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Erik Frey (JIRA)" <ji...@apache.org> on 2009/01/15 12:59:59 UTC

[jira] Created: (THRIFT-265) Fix buffer bloat in TNonblockingServer

Fix buffer bloat in TNonblockingServer
--------------------------------------

                 Key: THRIFT-265
                 URL: https://issues.apache.org/jira/browse/THRIFT-265
             Project: Thrift
          Issue Type: Improvement
          Components: Library (C++)
            Reporter: Erik Frey
            Assignee: Erik Frey
            Priority: Minor


TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Erik Frey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Frey updated THRIFT-265:
-----------------------------

    Attachment: buffer_reset.patch

Makes TNonBlockingServer::TConnection reset its read/write buffers to default sizes every 512 read/writes.

There are still scenarios that could cause memory bloat, but they are less likely: a brief period of massive concurrent requests with large payloads cause many TConnections to be created, their buffers grown, then never used again.

But probably the most common scenario is cyclic activity, with daily high loads ensuring that TConnections will be reused often enough that their buffers will eventually be reset.  We're using this patch already on our high-load servers and it's addressing this scenario admirably.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678544#action_12678544 ] 

Kevin Clark commented on THRIFT-265:
------------------------------------

I'm going to push tomorrow unless we get some opposition in the meantime.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678636#action_12678636 ] 

Alexander Shigin commented on THRIFT-265:
-----------------------------------------

+1 to Anthony Giardullo

I can misunderstand something, but the next snippet seems wrong to me.
{code}
+      void * new_buffer = std::realloc(readBuffer_, readBufferSize_);
+      if (new_buffer == NULL) {
+        GlobalOutput("TConnection::transition() realloc");
+        close();
+        return;
+      }
+      readBuffer_ = (uint8_t*) new_buffer;
+      readBufferSize_ = 1024;
{code}

I believe that the right version should set readBufferSize_ before realloc.
{code}
+      readBufferSize_ = 1024;
+      void * new_buffer = std::realloc(readBuffer_, readBufferSize_);
+      if (new_buffer == NULL) {
+        GlobalOutput("TConnection::transition() realloc");
+        close();
+        return;
+      }
+      readBuffer_ = (uint8_t*) new_buffer;
{code}

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667402#action_12667402 ] 

David Reiss commented on THRIFT-265:
------------------------------------

We recently did something similar inside Facebook.  Sorry I've been a little slow about merging it.  I'll try to reconcile the two versions in the next few days.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678865#action_12678865 ] 

shigin edited comment on THRIFT-265 at 3/4/09 11:44 AM:
------------------------------------------------------------------

I've created THRIFT-354 for tunable param. I belive you could push the second version of Erik's patch.

      was (Author: shigin):
    I've created THRIFT-354 for tunable param. I belive you could push it.
  
> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Erik Frey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Frey updated THRIFT-265:
-----------------------------

    Patch Info: [Patch Available]

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Dan Sully (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678515#action_12678515 ] 

Dan Sully commented on THRIFT-265:
----------------------------------

+1 here as well. I could definitely use this patch.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Erik Frey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Frey updated THRIFT-265:
-----------------------------

    Attachment: buffer_reset_v2.patch

Anthony you are absolutely right about the 1024 value.  Here's an update.

I considered making configurable the number of reads/writes before reallocing, but avoided doing so because above a certain number, it didn't seem to have any performance impact.  I didn't see any difference on my machines between reallocing every 512 reads/writes, and never reallocing.

I'm not inclined to make something tunable that has no effect on performance, but I didn't test it scientifically.  For what it's worth, I tried this on some servers that have ~1000 concurrent connections with at peak ~10,000 requests per second.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678829#action_12678829 ] 

Kevin Clark commented on THRIFT-265:
------------------------------------

Ok, seeing as everyone agrees this version of the patch is a good thing, but might still like to discuss the tunable parameter, could I commit this and if you feel strongly another ticket could be opened about the tunable param? I'd like to get this in, and it feels like maybe the good is being sacrificed to debate the perfect.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678513#action_12678513 ] 

Kevin Clark commented on THRIFT-265:
------------------------------------

Looks good to me. +1

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Clark closed THRIFT-265.
------------------------------

    Resolution: Fixed

Ok, this has been pushed in 750153. Thanks a lot!

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Anthony Giardullo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678566#action_12678566 ] 

Anthony Giardullo commented on THRIFT-265:
------------------------------------------

I believe the realloc call in TNonBlockingServer should be using 1024 instead of readBufferSize_.

Also, how about instead of hard coding the limit to 512, we make this value configurable? (And disabling this feature if the value is set to 0).

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Erik Frey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Frey updated THRIFT-265:
-----------------------------

    Summary: Buffer bloat in TNonblockingServer  (was: Fix buffer bloat in TNonblockingServer)

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678865#action_12678865 ] 

Alexander Shigin commented on THRIFT-265:
-----------------------------------------

I've created THRIFT-354 for tunable param. I belive you could push it.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Erik Frey (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Frey updated THRIFT-265:
-----------------------------

    Attachment: graph.png

But don't take my word for it!  Here's a ganglia graph of one of our high-concurrency servers - the patch was applied Thu 15.  We're running a number of machines with this patch, all with stable/good results.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Anthony Giardullo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678566#action_12678566 ] 

agiardullo edited comment on THRIFT-265 at 3/3/09 9:33 PM:
------------------------------------------------------------------

I believe the realloc call in TNonBlockingServer should be using 1024 instead of readBufferSize_.

Also, how about instead of hard coding the limit to 512, we make this value configurable? (This feature can then be disabled by setting this value to MAX_INT).

      was (Author: agiardullo):
    I believe the realloc call in TNonBlockingServer should be using 1024 instead of readBufferSize_.

Also, how about instead of hard coding the limit to 512, we make this value configurable? (And disabling this feature if the value is set to 0).
  
> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-265) Buffer bloat in TNonblockingServer

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678742#action_12678742 ] 

Alexander Shigin commented on THRIFT-265:
-----------------------------------------

Look good to me, but I prefer to make 512 to a tunable parameter.

> Buffer bloat in TNonblockingServer
> ----------------------------------
>
>                 Key: THRIFT-265
>                 URL: https://issues.apache.org/jira/browse/THRIFT-265
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (C++)
>            Reporter: Erik Frey
>            Assignee: Erik Frey
>            Priority: Minor
>         Attachments: buffer_reset.patch, buffer_reset_v2.patch, graph.png
>
>
> TNonBlockingServer never resets the lengths of the buffers it maintains for reading and writing.  Servers with a long life and many concurrent connections eventually generate an overhead that can reach into the gigabytes, particularly in services that have varied message sizes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.