Posted to dev@thrift.apache.org by "Bryan Duxbury (JIRA)" <ji...@apache.org> on 2010/10/15 23:14:32 UTC

[jira] Created: (THRIFT-959) TSocket seems to do its own buffering inefficiently

TSocket seems to do its own buffering inefficiently
---------------------------------------------------

                 Key: THRIFT-959
                 URL: https://issues.apache.org/jira/browse/THRIFT-959
             Project: Thrift
          Issue Type: Improvement
          Components: Java - Library
    Affects Versions: 0.5, 0.4, 0.3, 0.2
            Reporter: Bryan Duxbury
            Assignee: Bryan Duxbury
             Fix For: 0.6


I was looking through TSocket today while reviewing THRIFT-106, and I noticed that when we open the socket/stream, we wrap the input/output streams in Buffered(Input|Output)Stream objects and use those for reading and writing.

Two things stand out about this. First, for some reason we set the buffer size to exactly 1KB, which is 1/8 of the JDK default of 8KB. I think that number should be *at least* 8KB, and something like 32KB would likely be better. Does anyone know why we chose this size? Second, we probably shouldn't be doing buffering here at all. The general pattern is to open a TSocket and wrap it in a TFramedTransport, which means that today, even though we're fully buffering in the framed transport, we're wastefully buffering again in the TSocket. That wastes time and memory, and I wouldn't be surprised if it artificially reduces throughput, especially for multi-KB requests and responses.

If we remove the buffering from TSocket, we will probably need to add a TBufferedTransport to support users who talk to non-framed servers but still need buffering for performance.
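
For context, here's a minimal sketch of the client stacking described above (the host, port, and generated client are placeholders, not part of this issue):

    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;
    import org.apache.thrift.transport.TTransportException;

    public class FramedClientSketch {
        public static void main(String[] args) throws TTransportException {
            // Typical client stack: TSocket wrapped in TFramedTransport.
            // When the socket is opened, TSocket also wraps the raw socket
            // streams in Buffered(Input|Output)Stream with 1KB buffers, so
            // bytes get copied into the frame buffer and then again into
            // the stream buffer before ever reaching the socket.
            TTransport transport = new TFramedTransport(new TSocket("localhost", 9090));
            transport.open();
            TBinaryProtocol protocol = new TBinaryProtocol(transport);
            // ... calls on a generated client using `protocol` would go here ...
            transport.close();
        }
    }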

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (THRIFT-959) TSocket seems to do its own buffering inefficiently

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury closed THRIFT-959.
--------------------------------

    Resolution: Fixed

I just committed a tiny fix for this. I found that removing the buffer improved the distribution of latencies by a small but notable percentage.



[jira] Commented: (THRIFT-959) TSocket seems to do its own buffering inefficiently

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921569#action_12921569 ] 

Bryan Duxbury commented on THRIFT-959:
--------------------------------------

Some investigation shows that large writes actually won't have any performance problem, since Java's buffered stream implementations offer a fast path that bypasses the internal buffer when the write or read is big enough to justify it. There is still the question of throughput on smaller writes, though - for instance, if all your method calls are 100-byte requests, you pay for the extra copy only to go straight into a flush.
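
For reference, the fast path in question looks roughly like this - a simplified sketch in the spirit of java.io.BufferedOutputStream, not the actual JDK source:

    import java.io.IOException;
    import java.io.OutputStream;

    // Simplified sketch of a buffered output stream's write path, to show
    // why large writes skip the buffer while small ones pay an extra copy.
    class SketchBufferedOutput extends OutputStream {
        private final OutputStream out;
        private final byte[] buf;
        private int count;

        SketchBufferedOutput(OutputStream out, int size) {
            this.out = out;
            this.buf = new byte[size];
        }

        @Override
        public void write(int b) throws IOException {
            if (count >= buf.length) {
                flushBuffer();
            }
            buf[count++] = (byte) b;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            if (len >= buf.length) {
                // Large write: flush anything pending and hand the caller's
                // array straight to the underlying stream -- no extra copy.
                flushBuffer();
                out.write(b, off, len);
                return;
            }
            if (len > buf.length - count) {
                flushBuffer();
            }
            // Small write (e.g. a 100-byte request against a 1KB buffer):
            // the bytes take an extra trip through the internal buffer
            // before the eventual flush pushes them to the socket.
            System.arraycopy(b, off, buf, count, len);
            count += len;
        }

        private void flushBuffer() throws IOException {
            if (count > 0) {
                out.write(buf, 0, count);
                count = 0;
            }
        }

        @Override
        public void flush() throws IOException {
            flushBuffer();
            out.flush();
        }
    }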
