Posted to user@cassandra.apache.org by aaron morton <aa...@thelastpickle.com> on 2012/09/03 07:00:21 UTC

Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

The whole test run is taking longer ? So it could be slower queries or slower test setup / tear down?

If you are creating and truncating the KS for each of the 500 tests, is that taking longer ? (Schema code has changed a lot 0.7 > 1.0)
Can you log the execution time for each test and find the ones that are taking longer ?
 
There are full request metrics available on the StorageProxy JMX object. 
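
For what it's worth, here is a minimal sketch of how those could be read programmatically (an illustration only, not code from the Cassandra tree; it assumes the default JMX port 7199, no JMX authentication, and the attribute names as exposed by the 1.0.x StorageProxy MBean):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ProxyStats {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            // Default Cassandra JMX port; adjust if cassandra-env.sh overrides it.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName proxy = new ObjectName("org.apache.cassandra.db:type=StorageProxy");
            // Dump the request counts and latency totals. Run this before and
            // after the test suite and diff the values to get per-run numbers.
            for (String attr : new String[] {
                    "ReadOperations", "TotalReadLatencyMicros",
                    "WriteOperations", "TotalWriteLatencyMicros",
                    "RangeOperations", "TotalRangeLatencyMicros" }) {
                System.out.println(attr + ": " + mbs.getAttribute(proxy, attr));
            }
            jmxc.close();
        }
    }

Dividing the latency totals by the operation counts gives a rough average per request for the run.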

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/08/2012, at 4:45 PM, Илья Шипицин <ch...@gmail.com> wrote:

> we are using functional tests (~500 tests per run).
> it is hard to tell which query is slower; it is "slower in general".
> 
> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
> since these are functional tests, we recreate the KS just before the tests are run.
> 
> I do not know how to record the queries (there are a lot of them); if you are interested, I can set up a dedicated test environment for you.
> 
> 2012/8/31 aaron morton <aa...@thelastpickle.com>
>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
> We'll need some more details…
> 
> How much data ?
> How many machines ?
> What is the machine spec ?
> How many clients ?
> Is there an example of a slow request ? 
> How are you measuring that it's slow ? 
> Is there anything unusual in the log ? 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 31/08/2012, at 3:30 AM, Edward Capriolo <ed...@gmail.com> wrote:
> 
>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as
>> soon as possible. If you have large bloom filters you can hit a bug
>> where the bloom filters will not work properly.
>> 
>> 
>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>> I was looking for a way to capture queries from a live Cassandra installation
>>> (so they could be scripted and replayed), but I didn't find any.
>>> 
>>> is there something like a thrift proxy or some other query logging/scripting
>>> engine?
>>> 
>>> 2012/8/30 aaron morton <aa...@thelastpickle.com>
>>>> 
>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>> times slower than cassandra-0.7.8
>>>> 
>>>> We've not had any reports of a performance drop-off. All tests so far have
>>>> shown improvements in both read and write performance.
>>>> 
>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>> in terms of CPU and disk IO.
>>>> 
>>>> The sha1 is created so we can diagnose corruptions in the -Data component
>>>> of the SSTables. They are not used to save network IO.
>>>> It is calculated while streaming the Memtable to disk so it has no impact on
>>>> disk IO. While not the fastest algorithm, I would assume its CPU overhead in
>>>> this case is minimal.
>>>> 
>>>> there's already a relatively small Bloom filter file, which could be used to
>>>> save network traffic instead of the sha1 digest.
>>>> 
>>>> Bloom filters are used to test if a row key may exist in an SSTable.
>>>> 
>>>> any explanation ?
>>>> 
>>>> If you can provide some more information on your use case we may be able
>>>> to help.
>>>> 
>>>> Cheers
>>>> 
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 30/08/2012, at 5:18 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>> 
>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>> times slower than cassandra-0.7.8
>>>> after some investigation I noticed files with a "sha1" extension
>>>> (which are missing for Cassandra-0.7.8)
>>>> 
>>>> in the maybeWriteDigest() function I see no option for switching sha1 digests
>>>> off.
>>>> 
>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>> in terms of CPU and disk IO.
>>>> why use one more digest (which has to be calculated)? there's already a
>>>> relatively small Bloom filter file, which could be used to save network
>>>> traffic instead of the sha1 digest.
>>>> 
>>>> any explanation ?
>>>> 
>>>> Ilya Shipitsin
>>>> 
>>>> 
>>> 
> 
> 


Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

Posted by aaron morton <aa...@thelastpickle.com>.
Sorry but you will need to provide details of a specific query or workload that goes slower in 1.0.11.

As I said, tests have shown improvements in performance in every new release. If you are seeing a significant decrease in performance it may be a workload that has not been considered or a known edge case. Whatever the cause, we would need more details to help you.
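
One cheap way to narrow it down on the client side (a hypothetical helper for the test harness, nothing Cassandra-specific, just timing around whatever client calls the tests make) is to wrap each call and log the slow ones:

    import java.util.concurrent.Callable;

    public final class Timed {
        // Run the given call and print a warning if it takes longer than
        // thresholdMillis. "label" is whatever identifies the query in the tests.
        public static <T> T call(String label, long thresholdMillis, Callable<T> c) throws Exception {
            long start = System.nanoTime();
            try {
                return c.call();
            } finally {
                long elapsedMs = (System.nanoTime() - start) / 1000000L;
                if (elapsedMs > thresholdMillis) {
                    System.err.println("SLOW (" + elapsedMs + " ms): " + label);
                }
            }
        }
    }

Wrapping each client call in the tests this way should surface which specific requests regressed between the two versions.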

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/09/2012, at 4:19 PM, Илья Шипицин <ch...@gmail.com> wrote:

> all tests use similar data access patterns, so every test on 1.0.11 is slower than on 0.7.8
> the recent latency micros confirm that.
> 
> 2012/9/5 aaron morton <aa...@thelastpickle.com>
> That's slower.
> 
> the Recent* metrics are the best to look at. They reset each time you look at them. So read them, then run the test, then read them again.
> 
> You'll need to narrow it down still. e.g. Is there a single test taking a very long time or are all tests running slower ?  The Histogram stats can help with that as they provide a spread of latencies. 
> 
> Cheers
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 5/09/2012, at 12:27 AM, Илья Шипицин <ch...@gmail.com> wrote:
> 
>> it was a good idea to have a look at StorageProxy :-)
>>  
>>  
>> 1.0.10 Performance Tests
>> StorageProxy
>> 
>> RangeOperations: 546
>> ReadOperations: 694563
>> TotalHints: 0
>> TotalRangeLatencyMicros: 4469484
>> TotalReadLatencyMicros: 245669679
>> TotalWriteLatencyMicros: 57819722
>> WriteOperations: 208741
>> 
>> 
>> 0.7.10 Performance Tests
>> StorageProxy
>> 
>> RangeOperations: 520
>> ReadOperations: 671476
>> TotalRangeLatencyMicros: 2208902
>> TotalReadLatencyMicros: 162186009
>> TotalWriteLatencyMicros: 33911222
>> WriteOperations: 204806
>> 
>> 
>> 2012/9/3 aaron morton <aa...@thelastpickle.com>
>> The whole test run is taking longer ? So it could be slower queries or slower test setup / tear down?
>> 
>> If you are creating and truncating the KS for each of the 500 tests, is that taking longer ? (Schema code has changed a lot 0.7 > 1.0)
>> Can you log the execution time for each test and find the ones that are taking longer ?
>>  
>> There are full request metrics available on the StorageProxy JMX object. 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 31/08/2012, at 4:45 PM, Илья Шипицин <ch...@gmail.com> wrote:
>> 
>>> we are using functional tests (~500 tests per run).
>>> it is hard to tell which query is slower; it is "slower in general".
>>> 
>>> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
>>> since these are functional tests, we recreate the KS just before the tests are run.
>>> 
>>> I do not know how to record the queries (there are a lot of them); if you are interested, I can set up a dedicated test environment for you.
>>> 
>>> 2012/8/31 aaron morton <aa...@thelastpickle.com>
>>>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>> We'll need some more details…
>>> 
>>> How much data ?
>>> How many machines ?
>>> What is the machine spec ?
>>> How many clients ?
>>> Is there an example of a slow request ? 
>>> How are you measuring that it's slow ? 
>>> Is there anything unusual in the log ? 
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 31/08/2012, at 3:30 AM, Edward Capriolo <ed...@gmail.com> wrote:
>>> 
>>>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as
>>>> soon as possible. If you have large bloom filters you can hit a bug
>>>> where the bloom filters will not work properly.
>>>> 
>>>> 
>>>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>>>> I was looking for a way to capture queries from a live Cassandra installation
>>>>> (so they could be scripted and replayed), but I didn't find any.
>>>>> 
>>>>> is there something like a thrift proxy or some other query logging/scripting
>>>>> engine?
>>>>> 
>>>>> 2012/8/30 aaron morton <aa...@thelastpickle.com>
>>>>>> 
>>>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>>>> times slower than cassandra-0.7.8
>>>>>> 
>>>>>> We've not had any reports of a performance drop-off. All tests so far have
>>>>>> shown improvements in both read and write performance.
>>>>>> 
>>>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>>>> in terms of CPU and disk IO.
>>>>>> 
>>>>>> The sha1 is created so we can diagnose corruptions in the -Data component
>>>>>> of the SSTables. They are not used to save network IO.
>>>>>> It is calculated while streaming the Memtable to disk so it has no impact on
>>>>>> disk IO. While not the fastest algorithm, I would assume its CPU overhead in
>>>>>> this case is minimal.
>>>>>> 
>>>>>> there's already a relatively small Bloom filter file, which could be used to
>>>>>> save network traffic instead of the sha1 digest.
>>>>>> 
>>>>>> Bloom filters are used to test if a row key may exist in an SSTable.
>>>>>> 
>>>>>> any explanation ?
>>>>>> 
>>>>>> If you can provide some more information on your use case we may be able
>>>>>> to help.
>>>>>> 
>>>>>> Cheers
>>>>>> 
>>>>>> 
>>>>>> -----------------
>>>>>> Aaron Morton
>>>>>> Freelance Developer
>>>>>> @aaronmorton
>>>>>> http://www.thelastpickle.com
>>>>>> 
>>>>>> On 30/08/2012, at 5:18 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>>>> 
>>>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>>>> times slower than cassandra-0.7.8
>>>>>> after some investigation I noticed files with a "sha1" extension
>>>>>> (which are missing for Cassandra-0.7.8)
>>>>>> 
>>>>>> in the maybeWriteDigest() function I see no option for switching sha1 digests
>>>>>> off.
>>>>>> 
>>>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>>>> in terms of CPU and disk IO.
>>>>>> why use one more digest (which has to be calculated)? there's already a
>>>>>> relatively small Bloom filter file, which could be used to save network
>>>>>> traffic instead of the sha1 digest.
>>>>>> 
>>>>>> any explanation ?
>>>>>> 
>>>>>> Ilya Shipitsin
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>>> 
>> 
>> 
> 
> 


Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

Posted by Илья Шипицин <ch...@gmail.com>.
all tests use similar data access patterns, so every test on 1.0.11 is
slower than on 0.7.8
the recent latency micros confirm that.

2012/9/5 aaron morton <aa...@thelastpickle.com>

> That's slower.
>
> the Recent* metrics are the best to look at. They reset each time you
> look at them. So read them, then run the test, then read them again.
>
> You'll need to narrow it down still. e.g. Is there a single test taking a
> very long time or are all tests running slower ?  The Histogram stats can
> help with that as they provide a spread of latencies.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/09/2012, at 12:27 AM, Илья Шипицин <ch...@gmail.com> wrote:
>
> it was a good idea to have a look at StorageProxy :-)
>
>
> 1.0.10 Performance Tests
> StorageProxy
>
> RangeOperations: 546
> ReadOperations: 694563
> TotalHints: 0
> TotalRangeLatencyMicros: 4469484
> TotalReadLatencyMicros: 245669679
> TotalWriteLatencyMicros: 57819722
> WriteOperations: 204806
>
>
> 0.7.10 Performance Tests
> StorageProxy
>
> RangeOperations: 520
> ReadOperations: 671476
> TotalRangeLatencyMicros: 2208902
> TotalReadLatencyMicros: 162186009
> TotalWriteLatencyMicros: 33911222
> WriteOperations: 204806
>
>
> 2012/9/3 aaron morton <aa...@thelastpickle.com>
>
>> The whole test run is taking longer ? So it could be slower queries or
>> slower test setup / tear down?
>>
>> If you are creating and truncating the KS for each of the 500 tests, is that
>> taking longer ? (Schema code has changed a lot 0.7 > 1.0)
>> Can you log the execution time for each test and find the ones that are
>> taking longer ?
>>
>> There are full request metrics available on the StorageProxy JMX object.
>>
>> Cheers
>>
>>  -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 31/08/2012, at 4:45 PM, Илья Шипицин <ch...@gmail.com> wrote:
>>
>> we are using functional tests (~500 tests per run).
>> it is hard to tell which query is slower; it is "slower in general".
>>
>> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
>> since these are functional tests, we recreate the KS just before
>> the tests are run.
>>
>> I do not know how to record the queries (there are a lot of them); if you are
>> interested, I can set up a dedicated test environment for you.
>>
>> 2012/8/31 aaron morton <aa...@thelastpickle.com>
>>
>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>>
>>> We'll need some more details...
>>>
>>> How much data ?
>>> How many machines ?
>>> What is the machine spec ?
>>> How many clients ?
>>> Is there an example of a slow request ?
>>> How are you measuring that it's slow ?
>>> Is there anything unusual in the log ?
>>>
>>> Cheers
>>>
>>>  -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 31/08/2012, at 3:30 AM, Edward Capriolo <ed...@gmail.com>
>>> wrote:
>>>
>>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as
>>> soon as possible. If you have large bloom filters you can hit a bug
>>> where the bloom filters will not work properly.
>>>
>>>
>>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин <ch...@gmail.com>
>>> wrote:
>>>
>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>> I was looking for a way to capture queries from a live Cassandra installation
>>> (so they could be scripted and replayed), but I didn't find any.
>>>
>>> is there something like a thrift proxy or some other query logging/scripting
>>> engine?
>>>
>>> 2012/8/30 aaron morton <aa...@thelastpickle.com>
>>>
>>>
>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>> times slower than cassandra-0.7.8
>>>
>>> We've not had any reports of a performance drop-off. All tests so far
>>> have shown improvements in both read and write performance.
>>>
>>> I agree, such digests save some network IO, but they seem to be very bad
>>> in terms of CPU and disk IO.
>>>
>>> The sha1 is created so we can diagnose corruptions in the -Data component
>>> of the SSTables. They are not used to save network IO.
>>> It is calculated while streaming the Memtable to disk so it has no impact on
>>> disk IO. While not the fastest algorithm, I would assume its CPU overhead
>>> in this case is minimal.
>>>
>>> there's already a relatively small Bloom filter file, which could be used to
>>> save network traffic instead of the sha1 digest.
>>>
>>> Bloom filters are used to test if a row key may exist in an SSTable.
>>>
>>> any explanation ?
>>>
>>> If you can provide some more information on your use case we may be able
>>> to help.
>>>
>>> Cheers
>>>
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 30/08/2012, at 5:18 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>
>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>> times slower than cassandra-0.7.8
>>> after some investigation I noticed files with a "sha1" extension
>>> (which are missing for Cassandra-0.7.8)
>>>
>>> in the maybeWriteDigest() function I see no option for switching sha1 digests
>>> off.
>>>
>>> I agree, such digests save some network IO, but they seem to be very bad
>>> in terms of CPU and disk IO.
>>> why use one more digest (which has to be calculated)? there's already a
>>> relatively small Bloom filter file, which could be used to save network
>>> traffic instead of the sha1 digest.
>>>
>>> any explanation ?
>>>
>>> Ilya Shipitsin
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>

Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

Posted by aaron morton <aa...@thelastpickle.com>.
That's slower.

the Recent* metrics are the best to look at. They reset each time you look at them. So read them, then run the test, then read them again.

You'll need to narrow it down still. e.g. Is there a single test taking a very long time or are all tests running slower ?  The Histogram stats can help with that as they provide a spread of latencies. 
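
To make the read-then-run-then-read pattern concrete, a sketch (it assumes the same JMX connection setup as the earlier example, with "mbs" and "proxy" already pointing at org.apache.cassandra.db:type=StorageProxy; runTestSuite() is a hypothetical hook into the test run):

    // Assumes the JMX connection setup from the earlier sketch.
    static void measureRun(MBeanServerConnection mbs, ObjectName proxy) throws Exception {
        mbs.getAttribute(proxy, "RecentReadLatencyMicros");   // read once to reset
        mbs.getAttribute(proxy, "RecentWriteLatencyMicros");
        runTestSuite();                                        // hypothetical hook: run the functional tests
        System.out.println("avg read latency (us):  "
                + mbs.getAttribute(proxy, "RecentReadLatencyMicros"));
        System.out.println("avg write latency (us): "
                + mbs.getAttribute(proxy, "RecentWriteLatencyMicros"));
    }

nodetool cfhistograms <keyspace> <column family> should also give you the spread of latencies per column family, which helps tell a few slow tests apart from a general slowdown.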

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/09/2012, at 12:27 AM, Илья Шипицин <ch...@gmail.com> wrote:

> it was a good idea to have a look at StorageProxy :-)
>  
>  
> 1.0.10 Performance Tests
> StorageProxy
> 
> RangeOperations: 546
> ReadOperations: 694563
> TotalHints: 0
> TotalRangeLatencyMicros: 4469484
> TotalReadLatencyMicros: 245669679
> TotalWriteLatencyMicros: 57819722
> WriteOperations: 208741
> 
> 
> 0.7.10 Performance Tests
> StorageProxy
> 
> RangeOperations: 520
> ReadOperations: 671476
> TotalRangeLatencyMicros: 2208902
> TotalReadLatencyMicros: 162186009
> TotalWriteLatencyMicros: 33911222
> WriteOperations: 204806
> 
> 
> 2012/9/3 aaron morton <aa...@thelastpickle.com>
> The whole test run is taking longer ? So it could be slower queries or slower test setup / tear down?
> 
> If you are creating and truncating the KS for each of the 500 tests, is that taking longer ? (Schema code has changed a lot 0.7 > 1.0)
> Can you log the execution time for each test and find the ones that are taking longer ?
>  
> There are full request metrics available on the StorageProxy JMX object. 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 31/08/2012, at 4:45 PM, Илья Шипицин <ch...@gmail.com> wrote:
> 
>> we are using functional tests (~500 tests per run).
>> it is hard to tell which query is slower; it is "slower in general".
>> 
>> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
>> since these are functional tests, we recreate the KS just before the tests are run.
>> 
>> I do not know how to record the queries (there are a lot of them); if you are interested, I can set up a dedicated test environment for you.
>> 
>> 2012/8/31 aaron morton <aa...@thelastpickle.com>
>>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>> We'll need some more details…
>> 
>> How much data ?
>> How many machines ?
>> What is the machine spec ?
>> How many clients ?
>> Is there an example of a slow request ? 
>> How are you measuring that it's slow ? 
>> Is there anything unusual in the log ? 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 31/08/2012, at 3:30 AM, Edward Capriolo <ed...@gmail.com> wrote:
>> 
>>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as
>>> soon as possible. If you have large bloom filters you can hit a bug
>>> where the bloom filters will not work properly.
>>> 
>>> 
>>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>>> I was looking for a way to capture queries from a live Cassandra installation
>>>> (so they could be scripted and replayed), but I didn't find any.
>>>> 
>>>> is there something like a thrift proxy or some other query logging/scripting
>>>> engine?
>>>> 
>>>> 2012/8/30 aaron morton <aa...@thelastpickle.com>
>>>>> 
>>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>>> times slower than cassandra-0.7.8
>>>>> 
>>>>> We've not had any reports of a performance drop-off. All tests so far have
>>>>> shown improvements in both read and write performance.
>>>>> 
>>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>>> in terms of CPU and disk IO.
>>>>> 
>>>>> The sha1 is created so we can diagnose corruptions in the -Data component
>>>>> of the SSTables. They are not used to save network IO.
>>>>> It is calculated while streaming the Memtable to disk so it has no impact on
>>>>> disk IO. While not the fastest algorithm, I would assume its CPU overhead in
>>>>> this case is minimal.
>>>>> 
>>>>> there's already a relatively small Bloom filter file, which could be used to
>>>>> save network traffic instead of the sha1 digest.
>>>>> 
>>>>> Bloom filters are used to test if a row key may exist in an SSTable.
>>>>> 
>>>>> any explanation ?
>>>>> 
>>>>> If you can provide some more information on your use case we may be able
>>>>> to help.
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> 
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Developer
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>> 
>>>>> On 30/08/2012, at 5:18 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>>>> 
>>>>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>>>>> times slower than cassandra-0.7.8
>>>>> after some investigation I noticed files with a "sha1" extension
>>>>> (which are missing for Cassandra-0.7.8)
>>>>> 
>>>>> in the maybeWriteDigest() function I see no option for switching sha1 digests
>>>>> off.
>>>>> 
>>>>> I agree, such digests save some network IO, but they seem to be very bad
>>>>> in terms of CPU and disk IO.
>>>>> why use one more digest (which has to be calculated)? there's already a
>>>>> relatively small Bloom filter file, which could be used to save network
>>>>> traffic instead of the sha1 digest.
>>>>> 
>>>>> any explanation ?
>>>>> 
>>>>> Ilya Shipitsin
>>>>> 
>>>>> 
>>>> 
>> 
>> 
> 
> 


Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

Posted by Илья Шипицин <ch...@gmail.com>.
it was a good idea to have a look at StorageProxy :-)


1.0.10 Performance Tests
StorageProxy

RangeOperations: 546
ReadOperations: 694563
TotalHints: 0
TotalRangeLatencyMicros: 4469484
TotalReadLatencyMicros: 245669679
TotalWriteLatencyMicros: 57819722
WriteOperations: 208741


0.7.10 Performance Tests
StorageProxy

RangeOperations: 520
ReadOperations: 671476
TotalRangeLatencyMicros: 2208902
TotalReadLatencyMicros: 162186009
TotalWriteLatencyMicros: 33911222
WriteOperations: 204806
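
(For what it's worth, dividing those totals by the operation counts gives rough per-request averages:

    1.0.10:  reads 245669679 / 694563 ~= 354 us,  writes 57819722 / 208741 ~= 277 us
    0.7.10:  reads 162186009 / 671476 ~= 242 us,  writes 33911222 / 204806 ~= 166 us

i.e. roughly 1.5x slower reads and 1.7x slower writes on average.)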


2012/9/3 aaron morton <aa...@thelastpickle.com>

> The whole test run is taking longer ? So it could be slower queries or
> slower test setup / tear down?
>
> If you are creating and truncating the KS for each of the 500 tests, is that
> taking longer ? (Schema code has changed a lot 0.7 > 1.0)
> Can you log the execution time for each test and find the ones that are
> taking longer ?
>
> There are full request metrics available on the StorageProxy JMX object.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 31/08/2012, at 4:45 PM, Илья Шипицин <ch...@gmail.com> wrote:
>
> we are using functional tests (~500 tests per run).
> it is hard to tell which query is slower; it is "slower in general".
>
> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
> since these are functional tests, we recreate the KS just before
> the tests are run.
>
> I do not know how to record the queries (there are a lot of them); if you are
> interested, I can set up a dedicated test environment for you.
>
> 2012/8/31 aaron morton <aa...@thelastpickle.com>
>
>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>
>> We'll need some more details...
>>
>> How much data ?
>> How many machines ?
>> What is the machine spec ?
>> How many clients ?
>> Is there an example of a slow request ?
>> How are you measuring that it's slow ?
>> Is there anything unusual in the log ?
>>
>> Cheers
>>
>>  -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 31/08/2012, at 3:30 AM, Edward Capriolo <ed...@gmail.com> wrote:
>>
>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as
>> soon as possible. If you have large bloom filters you can hit a bug
>> where the bloom filters will not work properly.
>>
>>
>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин <ch...@gmail.com>
>> wrote:
>>
>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>> I was looking for a way to capture queries from a live Cassandra installation
>> (so they could be scripted and replayed), but I didn't find any.
>>
>> is there something like a thrift proxy or some other query logging/scripting
>> engine?
>>
>> 2012/8/30 aaron morton <aa...@thelastpickle.com>
>>
>>
>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>> times slower than cassandra-0.7.8
>>
>> We've not had any reports of a performance drop-off. All tests so far have
>> shown improvements in both read and write performance.
>>
>> I agree, such digests save some network IO, but they seem to be very bad
>> in terms of CPU and disk IO.
>>
>> The sha1 is created so we can diagnose corruptions in the -Data component
>> of the SSTables. They are not used to save network IO.
>> It is calculated while streaming the Memtable to disk so it has no impact on
>> disk IO. While not the fastest algorithm, I would assume its CPU overhead
>> in this case is minimal.
>>
>> there's already a relatively small Bloom filter file, which could be used to
>> save network traffic instead of the sha1 digest.
>>
>> Bloom filters are used to test if a row key may exist in an SSTable.
>>
>> any explanation ?
>>
>> If you can provide some more information on your use case we may be able
>> to help.
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 30/08/2012, at 5:18 AM, Илья Шипицин <ch...@gmail.com> wrote:
>>
>> in terms of our high-rate write load cassandra 1.0.11 is about 3 (three!!)
>> times slower than cassandra-0.7.8
>> after some investigation I noticed files with a "sha1" extension
>> (which are missing for Cassandra-0.7.8)
>>
>> in the maybeWriteDigest() function I see no option for switching sha1 digests
>> off.
>>
>> I agree, such digests save some network IO, but they seem to be very bad
>> in terms of CPU and disk IO.
>> why use one more digest (which has to be calculated)? there's already a
>> relatively small Bloom filter file, which could be used to save network
>> traffic instead of the sha1 digest.
>>
>> any explanation ?
>>
>> Ilya Shipitsin
>>
>>
>>
>>
>>
>
>