Posted to user@cassandra.apache.org by Ивaн Cобoлeв <so...@gmail.com> on 2013/02/06 11:42:13 UTC

Estimating write throughput with LeveledCompactionStrategy

Dear Community,

Could anyone please give me a hand with understanding what I am
missing while trying to model how LeveledCompactionStrategy works:
https://docs.google.com/spreadsheet/ccc?key=0AvNacZ0w52BydDQ3N2ZPSks2OHR1dlFmMVV4d1E2eEE#gid=0

Logs mostly contain something like this:
 INFO [CompactionExecutor:2235] 2013-02-06 02:32:29,758
CompactionTask.java (line 221) Compacted to
[chunks-hf-285962-Data.db,chunks-hf-285963-Data.db,chunks-hf-285964-Data.db,chunks-hf-285965-Data.db,chunks-hf-285966-Data.db,chunks-hf-285967-Data.db,chunks-hf-285968-Data.db,chunks-hf-285969-Data.db,chunks-hf-285970-Data.db,chunks-hf-285971-Data.db,chunks-hf-285972-Data.db,chunks-hf-285973-Data.db,chunks-hf-285974-Data.db,chunks-hf-285975-Data.db,chunks-hf-285976-Data.db,chunks-hf-285977-Data.db,chunks-hf-285978-Data.db,chunks-hf-285979-Data.db,chunks-hf-285980-Data.db,].
 2,255,863,073 to 1,908,460,931 (~84% of original) bytes for 36,868
keys at 14.965795MB/s.  Time: 121,614ms.
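
As a quick sanity check of where the reported rate comes from (my assumption,
not something stated in the log format itself: the figure looks like the
compacted output bytes divided by the elapsed time, expressed in MiB/s):

# Quick check of the rate reported in the log line above. Assumption (mine,
# not from the docs): the reported figure is output bytes / elapsed time in MiB/s.
output_bytes = 1908460931     # "Compacted to ... 1,908,460,931 ... bytes"
elapsed_ms = 121614           # "Time: 121,614ms"

rate = (output_bytes / (1024.0 * 1024.0)) / (elapsed_ms / 1000.0)
print("%.6f MiB/s" % rate)    # ~14.9658, matching the logged 14.965795MB/s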

The spreadsheet is therefore parameterized with a compaction throughput
of 15 MB/s and a survivor ratio of 0.9.
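
For anyone following along, this is roughly the shape of the model the
spreadsheet tries to capture - a minimal sketch only, with the level fanout
of 10 and the ~10-SSTable overlap per promotion assumed from the LCS design
rather than measured, and the 15 MB/s and 0.9 taken from the log above:

# Sketch of a back-of-the-envelope LCS write-amplification model.
# Assumptions (not measurements): a level fanout of 10, and each promotion
# into level N+1 merging with ~10 overlapping SSTables there.
FANOUT = 10
SURVIVOR_RATIO = 0.9       # fraction of bytes surviving a compaction (above)
COMPACTION_MBPS = 15.0     # compaction throughput observed in the log

def sustainable_ingest_mbps(levels):
    # Each ingested byte is rewritten roughly once per level, and each
    # rewrite drags along ~FANOUT overlapping SSTables in the target level,
    # so compaction bytes per ingested byte ~= levels * (1 + FANOUT),
    # reduced by the survivor ratio.
    amplification = levels * (1 + FANOUT) * SURVIVOR_RATIO
    return COMPACTION_MBPS / amplification

for levels in range(1, 6):
    print("%d levels -> ~%.2f MB/s sustained ingest" %
          (levels, sustainable_ingest_mbps(levels)))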

1) The projected result differs from what I actually observe - what am I missing?
2) Does anyone have per-node write throughput metrics with LCS that they
could share?

Thank you very much in advance,
Ivan

Re: Estimating write throughput with LeveledCompactionStrategy

Posted by Ивaн Cобoлeв <so...@gmail.com>.
Yup, we set it to 100 MB. Currently we have around 1 TB of data per
node (getting to level 5 now), and the data pieces are rather large
(smaller tables would flush more often).
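
(For context, the level sizes with 100 MB SSTables work out as below - a
quick sketch assuming the usual LCS sizing where level N holds roughly 10^N
SSTables:)

# Nominal level capacities with sstable_size_in_mb = 100, assuming the usual
# LCS sizing where level N holds about 10**N SSTables (L0 ignored).
sstable_mb = 100
for level in range(1, 6):
    capacity_gb = sstable_mb * 10 ** level / 1024.0
    print("L%d: ~%4.0f GB (%d SSTables)" % (level, capacity_gb, 10 ** level))
# -> L1 ~1 GB, L2 ~10 GB, L3 ~98 GB, L4 ~977 GB, L5 ~9766 GB;
#    ~1 TB per node fills L4 and starts promoting into L5.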

Yes, you're right, it's slower, so building a mental model is more
time-effective than experimenting :)

Ivan

2013/2/6 Wei Zhu <wz...@yahoo.com>:
> I have been struggling with LCS myself. I observed that the higher-level
> compactions (from level 4 to 5) involve many more SSTables than
> compactions at lower levels. One compaction can take an hour or more. By
> the way, did you set your SSTable size to 100 MB?
>
> Thanks.
> -Wei

Re: Estimating write throughput with LeveledCompactionStrategy

Posted by Wei Zhu <wz...@yahoo.com>.
I have been struggling with LCS myself. I observed that the higher-level compactions (from level 4 to 5) involve many more SSTables than compactions at lower levels. One compaction can take an hour or more. By the way, did you set your SSTable size to 100 MB?

Thanks.
-Wei 
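
(A rough arithmetic check of what an hour-long compaction implies at the
~15 MB/s rate from the log earlier in the thread - a sketch only, assuming
compaction time is simply bytes processed divided by that rate:)

# How much data does an hour-long compaction process at ~15 MB/s?
# (Assumes compaction time ~= bytes processed / throughput.)
rate_mb_s = 15.0
sstable_mb = 100
hour_s = 3600

data_mb = rate_mb_s * hour_s          # ~54,000 MB
sstables = data_mb / sstable_mb       # ~540 SSTables of 100 MB
print("~%.0f MB, i.e. ~%.0f SSTables" % (data_mb, sstables))
# far more than the ~11 SSTables (1 source + ~10 overlapping) that a single
# ideal promotion into the next level would touch.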

