Posted to user@cassandra.apache.org by Dave Gardner <da...@imagini.net> on 2010/08/03 11:18:02 UTC

Very slow reads - connected to TBufferedTransport buffer sizes

Hi all

I'm working on a PHP/Cassandra application. Yesterday we hit a strange
situation while testing random reads. As background: we had inserted 10,000
rows with simple row keys, with the number of columns per row varying
randomly between about 5 and 40. Insert speed via the PHP Thrift library
was very fast.

With our random read script, we are simply carrying out a get_slice on a
specific row key to fetch all the columns in the row (with a limit of 100,
which always covers everything).
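
Roughly, the read looks like this. This is only a sketch assuming the
Cassandra 0.6-era Thrift API, and the keyspace and column family names are
placeholders rather than our real ones:

    <?php
    // Fetch every column of one row: an empty start/finish slice range
    // with a generous count returns the whole row.
    $range = new cassandra_SliceRange();
    $range->start    = '';
    $range->finish   = '';
    $range->reversed = false;
    $range->count    = 100;   // higher than any row's column count

    $predicate = new cassandra_SlicePredicate();
    $predicate->slice_range = $range;

    $parent = new cassandra_ColumnParent();
    $parent->column_family = 'Standard1';   // placeholder CF name

    $columns = $client->get_slice('Keyspace1', $rowKey, $parent,
                                  $predicate,
                                  cassandra_ConsistencyLevel::ONE);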

The random read script either completes in a few ms or takes around 5s.
The significance of 5s is that this is (temporarily) what we've set the
socket read timeout to (TSocket::setRecvTimeout). There are a few
interesting things to note:

1. Even when the socket seemingly times out, the operation still
succeeds, i.e. we do end up with a row returned.
2. The slow responses are *always* for rows with a larger number of
columns; the threshold seems to be around 9 columns or so.
3. If we raise the timeout to 10s, the slow reads take 10s instead of 5s,
but still return successfully.
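
My working theory (purely a guess at this point; the pseudo-logic below
is illustrative, not taken from the Thrift source) is that some read path
asks the socket for more bytes than the server will ever send. A loop
that insists on an exact byte count then sits in fread() until the recv
timeout fires, and only afterwards hands back what it has collected,
which happens to be the complete response:

    <?php
    // Illustrative guess only; NOT the actual Thrift code.
    function read_exactly($stream, $len) {
        $data = '';
        while (strlen($data) < $len) {
            // If fewer than $len bytes will ever arrive, this blocks
            // for the full recv timeout (our 5s, or 10s) before giving up.
            $chunk = fread($stream, $len - strlen($data));
            if ($chunk === false || $chunk === '') {
                break;   // timeout or EOF: return the partial read
            }
            $data .= $chunk;
        }
        return $data;   // the response is intact, so the caller succeeds
    }

That would explain both the success-after-timeout behaviour (1) and the
latency tracking the timeout exactly (3); whether the demand overshoots
presumably depends on the size of the response, which fits (2).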

This is an example of data from a row that is read quickly:

9bc905c5-fc62-58de-87f5-48eb1ebb4f03
    col0: e0ce34010211030d91331feefc946c8c
    col1: 6d97b892ee7773d40b7bbff27ec5b34d
    col2: 6e2394dd48d5ca2df47eeceb72ca9de0
    col3: 43fca4716b865f24e30de67b2e10c1a8
    col4: c2e8de1541550e78829d312609acd237
    col5: e458447d8a2987bf05d65bee0a103be8
    col6: 0a8a86de4247b690e765aeca6615aef8
    col7: dc48b5e996da86b94d40d85292351c61
    col8: 3b95f9fc7c64d021ecc2c7a013f2e132
    key: 9bc905c5-fc62-58de-87f5-48eb1ebb4f03

This is an example of data from a row that is read slowly:

7318f337-529d-5408-a7c8-1283b750164d
    col0: 113b9cfe8eea8bf7eca71ce1ca1b0913
    col1: 428fe0bfadf687ef3b5c532e98e487ef
    col10: f1e80507626223358414130b1c7ecacd
    col11: cf7ada7ab098d2aeb9e5553808c89044
    col12: a93237313167c313d36d39779dcf23cd
    col13: 609595bbbb2b7058ad3f97f1ea0b7ebd
    col2: 27eca7dbff849eac82dc32e92b3fe977
    col3: 294dbf3107c351783a69450fedbefc61
    col4: 7fbd8f20d52731a10029e6f92874fae5
    col5: 3d06fc491c8f1669b144b798155578d4
    col6: 60d8f358cf07924912c8e19c60f45aac
    col7: 03297fd9576c1c96586bbbaaeaa1aa64
    col8: 6d6383fba84a6ec6811d96aee2b39102
    col9: c937171c3ad5d30b671d72741a777dc7
    key: 7318f337-529d-5408-a7c8-1283b750164d

Other points:

- we are reading with CL:One
- we are using the PHP Thrift library directly
- we are using the following stack (sketched just below):
   TSocket
   TBufferedTransport                  [with read and write buffer sizes of 1024]
   TBinaryProtocolAccelerated
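
For concreteness, the stack is built roughly like this (a sketch only:
include paths differ between Thrift checkouts, and the host, port, and
$THRIFT_ROOT / $GEN_DIR locations are placeholders):

    <?php
    require_once $THRIFT_ROOT.'/transport/TSocket.php';
    require_once $THRIFT_ROOT.'/transport/TBufferedTransport.php';
    require_once $THRIFT_ROOT.'/protocol/TBinaryProtocol.php';
    require_once $GEN_DIR.'/Cassandra.php';   // generated Cassandra client

    $socket = new TSocket('127.0.0.1', 9160);
    $socket->setRecvTimeout(5000);   // our (temporary) 5s read timeout, in ms

    // 1024-byte read and write buffers
    $transport = new TBufferedTransport($socket, 1024, 1024);

    // needs the thrift_protocol C extension
    $protocol = new TBinaryProtocolAccelerated($transport);

    $client = new CassandraClient($protocol);
    $transport->open();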

I have just this second discovered that changing the buffer sizes affects
this issue: reducing the buffer size makes every request take 5s, while
increasing it makes every request execute quickly.
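
In other words, the workaround for now seems to be simply enlarging the
buffers (the sizes below are illustrative, not the exact values we tried):

    // Large enough that no response outgrows the read buffer
    $transport = new TBufferedTransport($socket, 8192, 8192);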

I will continue to debug this today, but thought that someone may be able
to shed some light on the issue.

Thanks!

Dave

Re: Very slow reads - connected to TBufferedTransport buffer sizes

Posted by Dave Gardner <da...@imagini.net>.
Yes, this is the issue. Thanks.

Dave

On Tuesday, August 3, 2010, Jonathan Ellis <jb...@gmail.com> wrote:
> Sounds like https://issues.apache.org/jira/browse/THRIFT-638, where
> Arya Goudarzi posted a patch.

Re: Very slow reads - connected to TBufferedTransport buffer sizes

Posted by Jonathan Ellis <jb...@gmail.com>.
Sounds like https://issues.apache.org/jira/browse/THRIFT-638, where
Arya Goudarzi posted a patch.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com