Posted to user@cassandra.apache.org by envio user <en...@gmail.com> on 2010/01/20 16:57:41 UTC

issue with get_slice() performance.

All,
I downloaded 0.5.0 final and tried the following test:

Machine: quad core (64-bit), 8 GB RAM, single node.

Data model: A single SCF with CompareWith="BytesType" and
CompareSubcolumnsWith="TimeUUIDType"

Sample data stored:
FF = {
        '1' : {
                    'ALL' => {
                                  'TimeUUID1' => '16bytesstring',
                                  ......
                                  'TimeUUID25' => '16bytesstring',
                                }
                      }
         },
....................
     '100K': {
..............
        }
}


Client: PHP

I loaded 100K keys with 25 columns per key; the value of each
sub-column is a 16-byte string.
I populated the data using batch_insert() and had no issues with writes.
For my requirement, I need to fetch all 25 columns, and for this
purpose I am using the get_slice() method.
The performance of get_slice() is well below our expectations. We were
expecting at least 1000 get_slices/second. "ROW-READ-STAGE" tasks are
piling up.
vmstat and iostat on the machine look normal.

 nodeprobe  -host localhost -port 8080 info

69914105239252561041379405595605110341
Load             : 200.9 MB
Generation No    : 1263997349
Uptime (seconds) : 2627
Heap Memory (MB) : 117.18 / 1023.06

I tried to fetch 500 get_slices() per second:
===============================

tpstats:
--------
ROW-READ-STAGE                   16       250          94305
.............
ROW-READ-STAGE                   16       244         112190

Load test stats:

Name    | Highest 10 sec mean | Lowest 10 sec mean | Highest rate | Mean     | Count
====================================================================================
request | 48.23 sec           | 6.94 msec          | 332.6 / sec  | 2.70 sec | 61825
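
As a rough sanity check on these figures, Little's law (items in system = throughput x time in system) can be applied to the read stage. The sketch below is only an estimate under stated assumptions: it reads the tpstats line as 16 active and 250 pending reads, treats the 332.6/sec peak as the sustained rate, and uses hypothetical helper names that are not part of any Cassandra API.

```python
# Little's law: items_in_system = throughput * time_in_system.
# Helper names are hypothetical; the inputs come from the figures in this post.

def required_service_time(concurrency: int, target_rate: float) -> float:
    """Largest average seconds per read that lets `concurrency`
    parallel readers sustain `target_rate` reads per second."""
    return concurrency / target_rate


def time_in_stage(in_flight: int, throughput: float) -> float:
    """Average seconds a request spends in the stage (queued plus
    executing), given `in_flight` requests and `throughput` per second."""
    return in_flight / throughput


if __name__ == "__main__":
    # Assumption: 16 read threads, target of 1000 get_slices per second.
    budget = required_service_time(16, 1000.0)
    # Assumption: 16 active + 250 pending, 332.6/sec treated as sustained.
    wait = time_in_stage(16 + 250, 332.6)
    print(f"per-read budget at 1000/sec: {budget * 1000:.1f} ms")
    print(f"estimated time in read stage: {wait:.2f} s")
```

Under those assumptions each read would have to finish in about 16 ms on average to reach 1000/sec with 16 threads, while the observed backlog corresponds to roughly 0.8 s spent in the read stage per request.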

Please let me know how I can improve the performance of get_slice().

thanks,
-Aita

Re: issue with get_slice() performance.

Posted by Jonathan Ellis <jb...@gmail.com>.

what does iostat -x show for %util and await?

what does top say cpu is?

does tpstats show non-zero compaction activity?

if cpu is not close to 100% try increasing ConcurrentReads

since you have plenty of ram, increase the KeysCachedFraction of your
CFs to 0.1, 0.2, or even more (until it stops improving performance)
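
To get a feel for what those fractions mean for the 100K-key test, here is a minimal sketch. It assumes the fraction applies to the total number of keys and that reads are spread uniformly over all keys; real access patterns and per-sstable behavior will differ, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of key cache coverage.
# Assumptions (not taken from Cassandra itself): the fraction applies to
# the total key count, and every key is equally likely to be read.

def cached_key_estimate(total_keys: int, fraction: float) -> int:
    """Approximate number of key locations held in the cache."""
    return min(round(total_keys * fraction), total_keys)


def uniform_hit_rate(total_keys: int, fraction: float) -> float:
    """Expected share of reads whose key location is already cached,
    assuming uniformly random key access."""
    return cached_key_estimate(total_keys, fraction) / total_keys


if __name__ == "__main__":
    for fraction in (0.01, 0.1, 0.2, 0.5):
        keys = cached_key_estimate(100_000, fraction)
        rate = uniform_hit_rate(100_000, fraction)
        print(f"fraction={fraction}: ~{keys} keys cached, hit rate ~{rate:.0%}")
```

With uniform access the expected hit rate simply tracks the fraction, which is why the setting is worth raising step by step and re-measuring rather than picking a single value up front.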

-Jonathan
