Posted to user@cassandra.apache.org by Edward Capriolo <ed...@gmail.com> on 2012/10/25 18:04:38 UTC

Large results and network round trips

Hello all,

Currently we implement wide rows for most of our entities. For example:

user {
 event1=>x
 event2=>y
 event3=>z
 ...
}

Normally the entries are bounded to fewer than 256 columns, and most
columns are small, say 30 bytes. Because of the blind-write nature
of Cassandra, it is possible the column family can get much larger. We
have very low latency requirements, for example less than 5ms.

Considering network round trips and all other factors, I am wondering what
is the largest column that is possible in a 5ms window on a gigabit
network. First, we have our Thrift limit of 15MB. Is it possible, even in
the best-case scenario, to deliver a 15MB response in under 5ms on
gigabit Ethernet, for example? Does anyone have any real-world numbers
with reference to packet sizes and standard performance?

Thanks all,
Edward
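
The best-case question above can be bounded with simple arithmetic. What
follows is a minimal sketch, not a measurement. It assumes an idealized
link carrying exactly 1e9 bits/s with no Ethernet/IP/TCP/Thrift overhead
and no propagation or processing delay, and it takes "15MB" to mean
15 MiB. The function names are made up for illustration.

```python
# Back-of-envelope bound on what fits through a link in a time window.
# Assumptions (not from the thread): usable line rate is exactly 1e9 bits/s,
# no protocol overhead, no propagation delay, no server-side time.

LINK_BITS_PER_SEC = 1_000_000_000  # gigabit Ethernet, idealized
WINDOW_SEC = 0.005                 # the 5 ms latency budget from the post


def max_bytes_in_window(link_bits_per_sec, window_sec):
    """Upper bound on bytes that can cross the link within window_sec."""
    return link_bits_per_sec * window_sec / 8


def transfer_time_ms(payload_bytes, link_bits_per_sec):
    """Minimum time, in milliseconds, to put payload_bytes on the wire."""
    return payload_bytes * 8 / link_bits_per_sec * 1000


thrift_limit_bytes = 15 * 1024 * 1024  # "15MB", taken here as 15 MiB

print("max bytes in 5 ms:", max_bytes_in_window(LINK_BITS_PER_SEC, WINDOW_SEC))
print("ms to send 15 MiB:", transfer_time_ms(thrift_limit_bytes, LINK_BITS_PER_SEC))
```

Under these assumptions the link carries at most about 625 KB in 5ms, and
a 15 MiB response needs roughly 126ms on the wire alone, so a full-size
Thrift response cannot meet a 5ms budget on gigabit Ethernet even in the
best case. Real numbers will be worse once overhead and server time are
included.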

Re: Large results and network round trips

Posted by Edward Capriolo <ed...@gmail.com>.
For this scenario, remove disk speed from the equation. Assume the row
is completely in the row cache. Also, let's assume reads at consistency
level ONE. With this information I would be looking to determine the
relationship between response size, maximum requests per second, and
maximum latency.

I would use this to be able to say, for example: "You want to do 5,000
reads/sec on gigabit Ethernet, with each row 10K, in under 5ms latency?
Sorry, that is impossible."
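
The kind of check described above (response size against requests per
second against latency) can be sketched at the wire level as follows.
This is only an illustration under stated assumptions: an idealized
1e9 bits/s link with no protocol overhead, "10K" taken as 10,000 bytes,
and a hypothetical helper named wire_level_check. It says nothing about
server-side time, queuing, or client overhead, which would have to be
measured.

```python
# Wire-level feasibility check for a read workload on a single link.
# Assumptions (not from the thread): idealized 1e9 bits/s link, no protocol
# overhead, responses sent one at a time, "10K" taken as 10,000 bytes.


def wire_level_check(reads_per_sec, row_bytes, link_bits_per_sec, budget_ms):
    """Return link utilization and per-response wire time for a workload."""
    offered_bits_per_sec = reads_per_sec * row_bytes * 8
    utilization = offered_bits_per_sec / link_bits_per_sec
    wire_ms_per_row = row_bytes * 8 / link_bits_per_sec * 1000
    return {
        "utilization": utilization,                    # fraction of link used
        "wire_ms_per_row": wire_ms_per_row,            # serialization time only
        "fits_bandwidth": utilization < 1.0,
        "fits_budget_on_wire": wire_ms_per_row < budget_ms,
    }


result = wire_level_check(
    reads_per_sec=5_000,
    row_bytes=10_000,
    link_bits_per_sec=1_000_000_000,
    budget_ms=5.0,
)
for key, value in result.items():
    print(key, value)
```

For the example figures, this idealized model puts the link at about 40%
utilization with about 0.08ms of wire time per row, so raw bandwidth by
itself would not rule the workload out. Whether the 5ms target holds then
depends on everything the model leaves out.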




On Thu, Oct 25, 2012 at 2:58 PM, sankalp kohli <ko...@gmail.com> wrote:
> I don't have any sample data on this, but read latency will depend on:
> 1) Consistency level of the read
> 2) Disk speed
>
> Also, you can look at the Netflix client, as it makes the coordinator node
> the same as the node which holds the data. This will save one hop.
>

Re: Large results and network round trips

Posted by sankalp kohli <ko...@gmail.com>.
I don't have any sample data on this, but read latency will depend on:
1) Consistency level of the read
2) Disk speed

Also, you can look at the Netflix client, as it makes the coordinator node
the same as the node which holds the data. This will save one hop.
