Posted to user@cassandra.apache.org by Praveen Baratam <pr...@gmail.com> on 2011/12/25 07:48:10 UTC

Row or Supercolumn with approximately n columns

Hello Everybody,

Happy Christmas.

I know that this topic has come up quite a few times on the Dev and User lists
but did not culminate in a solution.

http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html

The above discussion on the User list talks about AbstractCompactionStrategy,
but I could not find any relevant documentation as it's a fairly new feature
in Cassandra.

Let me state the need and the use-case again.

I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to
approximately limit the number of columns to "n". "n" can vary a lot, and
the intention is to throw away stale data, not to maintain any hard
limit on the CF or SC. It's very useful for storing time-series data where
stale data is not needed. The goal is to achieve this with minimum
overhead, and since compaction happens all the time it would be clever to
implement it as part of compaction.
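
As a rough illustration of the use case (the keyspace, column family and key
names below are hypothetical, not something from this thread), a time-series
row in Pycassa might look like the sketch that follows; today the newest n
samples can only be selected at read time with a reversed slice, while
everything older keeps accumulating in the row, which is exactly what the
proposed compaction option would discard.

    import time
    import pycassa

    # Hypothetical cluster/keyspace/CF names; the CF is assumed to use a
    # LongType comparator so columns sort by timestamp.
    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    series = pycassa.ColumnFamily(pool, 'TimeSeries')

    # One column per sample; the column name is the epoch timestamp in ms.
    series.insert('sensor-42', {int(time.time() * 1000): '23.5'})

    # Reading the newest n samples is easy with a reversed slice, but the
    # stale columns stay on disk until something deletes them.
    n = 1000
    latest = series.get('sensor-42', column_count=n, column_reversed=True)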

Thanks in advance.

Praveen

Re: Should I throttle deletes?

Posted by Maxim Potekhin <po...@bnl.gov>.
Thanks, that's quite helpful. I'm wondering though if multiplying the
number of clients will end up doing the same thing.

On 1/5/2012 3:29 PM, Philippe wrote:
>
>     Then I do have a question, what do people generally use as the
>     batch size?
>
> I used to do batches from 500 to 2000 like you do.
> After investigating issues such as the one you've encountered I've 
> moved to batches of 20 for writes and 256 for reads. Everything is a 
> lot smoother : no more timeouts.
>
> The downside though is that I have to run more client threads in 
> parallel to maximize throughput.
>
> Cheers


Re: Should I throttle deletes?

Posted by Maxim Potekhin <po...@bnl.gov>.
Thanks, this makes sense. I'll try that.

Maxim

On 1/6/2012 10:51 AM, Vitalii Tymchyshyn wrote:
> Do you mean on writes? Yes, your timeouts must be set so that your write
> batch can complete before the timeout elapses. But this will lower the write
> load, so reads should not time out.
>
> Best regards, Vitalii Tymchyshyn
>
> 06.01.12 17:37, Philippe wrote:
>>
>> But you will then get timeouts.
>>
>> On 6 Jan 2012 at 15:17, "Vitalii Tymchyshyn" <tivv00@gmail.com
>> <ma...@gmail.com>> wrote:
>>
>>     05.01.12 22:29, Philippe wrote:
>>>
>>>         Then I do have a question, what do people generally use as
>>>         the batch size?
>>>
>>>     I used to do batches from 500 to 2000 like you do.
>>>     After investigating issues such as the one you've encountered
>>>     I've moved to batches of 20 for writes and 256 for reads.
>>>     Everything is a lot smoother : no more timeouts.
>>>
>>     I'd rather reduce the mutation thread pool with the concurrent_writes
>>     setting. This will lower server load no matter how many clients
>>     are sending batches, and at the same time you still have good batching.
>>
>>     Best regards, Vitalii Tymchyshyn
>>
>


Re: Should I throttle deletes?

Posted by Vitalii Tymchyshyn <ti...@gmail.com>.
Do you mean on writes? Yes, your timeouts must be set so that your write
batch can complete before the timeout elapses. But this will lower the write
load, so reads should not time out.
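
For reference, the timeout in question is the coordinator-side RPC timeout; a
minimal cassandra.yaml excerpt from that era might look like the following
(the value is only a placeholder, to be sized so the largest write batch can
complete in time):

    # cassandra.yaml (excerpt) -- illustrative value, not a recommendation
    rpc_timeout_in_ms: 10000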

Best regards, Vitalii Tymchyshyn

06.01.12 17:37, Philippe wrote:
>
> But you will then get timeouts.
>
> On 6 Jan 2012 at 15:17, "Vitalii Tymchyshyn" <tivv00@gmail.com
> <ma...@gmail.com>> wrote:
>
>     05.01.12 22:29, Philippe wrote:
>>
>>         Then I do have a question, what do people generally use as
>>         the batch size?
>>
>>     I used to do batches from 500 to 2000 like you do.
>>     After investigating issues such as the one you've encountered
>>     I've moved to batches of 20 for writes and 256 for reads.
>>     Everything is a lot smoother : no more timeouts.
>>
>     I'd rather reduce the mutation thread pool with the concurrent_writes
>     setting. This will lower server load no matter how many clients
>     are sending batches, and at the same time you still have good batching.
>
>     Best regards, Vitalii Tymchyshyn
>


Re: Should I throttle deletes?

Posted by Philippe <wa...@gmail.com>.
But you will then get timeouts.
On 6 Jan 2012 at 15:17, "Vitalii Tymchyshyn" <ti...@gmail.com> wrote:

> 05.01.12 22:29, Philippe wrote:
>
>  Then I do have a question, what do people generally use as the batch
>> size?
>>
>  I used to do batches from 500 to 2000 like you do.
> After investigating issues such as the one you've encountered I've moved
> to batches of 20 for writes and 256 for reads. Everything is a lot smoother
> : no more timeouts.
>
>  I'd rather reduce the mutation thread pool with the concurrent_writes setting.
> This will lower server load no matter how many clients are sending
> batches, and at the same time you still have good batching.
>
> Best regards, Vitalii Tymchyshyn
>

Re: Should I throttle deletes?

Posted by Vitalii Tymchyshyn <ti...@gmail.com>.
05.01.12 22:29, Philippe wrote:
>
>     Then I do have a question, what do people generally use as the
>     batch size?
>
> I used to do batches from 500 to 2000 like you do.
> After investigating issues such as the one you've encountered I've 
> moved to batches of 20 for writes and 256 for reads. Everything is a 
> lot smoother : no more timeouts.
>
I'd rather reduce the mutation thread pool with the concurrent_writes setting.
This will lower server load no matter how many clients are sending
batches, and at the same time you still have good batching.
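
The setting being referred to lives in cassandra.yaml; a minimal sketch (the
value here is purely illustrative and workload dependent):

    # cassandra.yaml (excerpt) -- illustrative value, not a recommendation
    # Fewer concurrent mutation threads smooths the load caused by large
    # client batches, whatever the number of clients.
    concurrent_writes: 16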

Best regards, Vitalii Tymchyshyn

Re: Should I throttle deletes?

Posted by Philippe <wa...@gmail.com>.
>
> Then I do have a question, what do people generally use as the batch size?
>
I used to do batches from 500 to 2000 like you do.
After investigating issues such as the one you've encountered I've moved to
batches of 20 for writes and 256 for reads. Everything is a lot smoother:
no more timeouts.

The downside though is that I have to run more client threads in parallel
to maximize throughput.
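
For what it's worth, a small write batch in Pycassa looks roughly like the
sketch below (keyspace, CF and key names are hypothetical); with
queue_size=20 the mutator flushes every 20 queued mutations, and several
client threads, each with its own mutator, make up the throughput:

    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'MyCF')

    # Queue at most 20 mutations per Thrift call; send() flushes whatever
    # is still sitting in the queue.
    batch = cf.batch(queue_size=20)
    for i in range(1000):
        batch.insert('key-%d' % i, {'col': 'value'})
    batch.send()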

Cheers

Re: Should I throttle deletes?

Posted by Maxim Potekhin <po...@bnl.gov>.
Hello Aaron,

On 1/5/2012 4:25 AM, aaron morton wrote:
>> I use a batch mutator in Pycassa to delete ~1M rows based on
>> a longish list of keys I'm extracting from an auxiliary CF (with no
>> problem of any sort).
> What is the size of the deletion batches ?

2000 mutations.


>
>> Now, it appears that such a head-on delete puts a temporary
>> but large load on the cluster. I have SSDs and they go to 100%
>> utilization, and the CPU spikes to significant loads.
> Does the load spike during the deletion or after it ?

During.


> Do any of the thread pools back up in nodetool tpstats during the load?

Haven't checked, thank you for the lead.

> I can think of a few general issues you may want to avoid:
>
> * Each row in a batch mutation is handled by a task in a thread pool 
> on the nodes. So if you send a batch to delete 1,000 rows it will put 
> 1,000 tasks in the Mutation stage. This will reduce the query throughput.

Aah. I didn't know that. I was under the impression that batching saves 
the communication overhead, and that's it.

Then I do have a question: what do people generally use as the batch size?

Thanks

Maxim



Re: Should I throttle deletes?

Posted by aaron morton <aa...@thelastpickle.com>.
> I use a batch mutator in Pycassa to delete ~1M rows based on
> a longish list of keys I'm extracting from an auxiliary CF (with no
> problem of any sort).
What is the size of the deletion batches?

> Now, it appears that such a head-on delete puts a temporary
> but large load on the cluster. I have SSDs and they go to 100%
> utilization, and the CPU spikes to significant loads.
Does the load spike during the deletion or after it?
Do any of the thread pools back up in nodetool tpstats during the load?

I can think of a few general issues you may want to avoid:

* Each row in a batch mutation is handled by a task in a thread pool on the nodes. So if you send a batch to delete 1,000 rows it will put 1,000 tasks in the Mutation stage. This will reduce the query throughput.
* Lots of deletes in a row will add overhead to reads on the row. 

You may want to check for excessive memtable flushing, but if you have default automatic memory management running, lots of deletes should not result in extra flushing.
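
A minimal sketch of what a throttled version of such a delete could look like
in Pycassa, assuming the row keys have already been collected from the
auxiliary CF; all names, the chunk size and the pause are placeholders, not
recommendations:

    import time
    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'DataCF')

    def throttled_delete(keys, chunk_size=100, pause=0.1):
        """Delete rows in small chunks, pausing between chunks so the
        Mutation stage never sees thousands of tasks at once."""
        batch = cf.batch(queue_size=chunk_size)
        for i, key in enumerate(keys, 1):
            batch.remove(key)          # whole-row tombstone
            if i % chunk_size == 0:
                batch.send()           # no-op if the mutator auto-flushed
                time.sleep(pause)      # give the cluster a breather
        batch.send()                   # flush any remainder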

Hope that helps
Aaron

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/01/2012, at 10:13 AM, Maxim Potekhin wrote:

> Now that my cluster appears to run smoothly and after a few successful
> repairs and compacts, I'm back in the business of deletion of portions
> of data based on its date of insertion. For reasons too lengthy to be
> explained here, I don't want to use TTL.
> 
> I use a batch mutator in Pycassa to delete ~1M rows based on
> a longish list of keys I'm extracting from an auxiliary CF (with no
> problem of any sort).
> 
> Now, it appears that such a head-on delete puts a temporary
> but large load on the cluster. I have SSDs and they go to 100%
> utilization, and the CPU spikes to significant loads.
> 
> Does anyone do throttling on such a mass-delete procedure?
> 
> Thanks in advance,
> 
> Maxim
> 


Should I throttle deletes?

Posted by Maxim Potekhin <po...@bnl.gov>.
Now that my cluster appears to run smoothly, and after a few successful
repairs and compactions, I'm back in the business of deleting portions
of data based on their date of insertion. For reasons too lengthy to be
explained here, I don't want to use TTL.

I use a batch mutator in Pycassa to delete ~1M rows based on
a longish list of keys I'm extracting from an auxiliary CF (with no
problem of any sort).
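
For context, the pattern described above looks roughly like this in Pycassa
(the keyspace/CF names, the index row key and the batch size of 2000
mentioned elsewhere in the thread are assumptions for the sketch):

    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    index_cf = pycassa.ColumnFamily(pool, 'ByInsertDate')   # auxiliary CF
    data_cf = pycassa.ColumnFamily(pool, 'Data')

    # Column names of the index row are the row keys to delete.
    keys = [col for col, _ in index_cf.xget('2011-12-01')]

    batch = data_cf.batch(queue_size=2000)    # 2000 mutations per flush
    for key in keys:
        batch.remove(key)                     # row-level delete
    batch.send()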

Now, it appears that such a head-on delete puts a temporary
but large load on the cluster. I have SSDs and they go to 100%
utilization, and the CPU spikes to significant loads.

Does anyone do throttling on such a mass-delete procedure?

Thanks in advance,

Maxim


Re: Row or Supercolumn with approximately n columns

Posted by Praveen Baratam <pr...@gmail.com>.
I understand that there will be contention regarding which *n* columns are
the current *n* columns, but as mentioned previously the goal is to limit
the accumulation of data, as in our use-case some row keys can receive
fairly heavy inserts. For people requiring a precise set of current columns,
that feature can be implemented by having a buffer of *m* columns above the
*n* columns so that they can filter in the client.

I believe this approach will not tax Cassandra in terms of performance.

Coming to TTL-based columns, it's difficult to store the last *n* samples with
this approach. If the inserts are happening at a constant/predictable rate
then we can achieve the desired functionality using TTL, but if inserts are
event driven, then there is no way we can see the last *n* samples after the
TTL expires. This may not be desirable in many use-cases, including mine.
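
For comparison, the TTL approach is a one-liner in Pycassa (hypothetical
names; 7 days is just a placeholder): each column disappears ttl seconds
after it was written, regardless of whether newer samples ever arrive, which
is why an idle, event-driven series can end up with nothing left to read.

    import time
    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    series = pycassa.ColumnFamily(pool, 'TimeSeries')

    # The sample expires 7 days after insertion, independent of later writes.
    series.insert('sensor-42', {int(time.time() * 1000): '23.5'},
                  ttl=7 * 24 * 3600)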

Another approach could be a cron job that reads all the rows and slices
every row down to its first *n* columns using batch_mutate, as sketched below.
For this to be efficient we need an efficient way to query for rows with more
than *n* columns. This could act as a quick, externally managed compaction if
the performance penalty can be minimized by some internal API provisions.
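
A rough Pycassa sketch of such an external trim job (hypothetical names; this
variant assumes time-ordered, oldest-first columns and keeps the newest N per
row); since there is currently no cheap way to find only the rows that exceed
N, it simply scans them all:

    import pycassa

    N = 1000  # keep roughly the newest N columns per row
    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'TimeSeries')

    batch = cf.batch(queue_size=500)
    for key, _ in cf.get_range(column_count=1):    # walk all row keys
        total = cf.get_count(key)
        if total <= N:
            continue
        # Columns sort oldest-first, so the stale surplus sits at the start
        # of the row. (A real job would page through it with xget instead of
        # fetching it in one get.)
        stale = cf.get(key, column_count=total - N)
        batch.remove(key, columns=list(stale.keys()))
    batch.send()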

https://issues.apache.org/jira/browse/CASSANDRA-3678?page=com.atlassian.streams.streams-jira-plugin:activity-stream-issue-tab#issue-tabs

I have also opened the above ticket to collect ideas for solving this problem.
Sadly, there has been no activity on it yet.

Coming to a custom compaction for this purpose, a leveled compaction with
only 2 levels, or just one, could be enough, as rows are not meant to grow
huge and most rows have a similar number and size of columns.

Regards.


On Tue, Jan 3, 2012 at 4:29 AM, aaron morton <aa...@thelastpickle.com> wrote:

> During compaction, both automatic / minor and manual / major.
>
> The performance drop comes from having a lot of expired columns that have not
> been purged by compaction, as they must be read and discarded during reads.
>
> Cheers
>
>   -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/01/2012, at 10:38 AM, R. Verlangen wrote:
>
> @Aaron: Small side question: when do columns with a past TTL get removed?
> On a repair, a (minor) compaction, or ...? Does it cause a performance drop
> when that happens?
>
> 2012/1/2 aaron morton <aa...@thelastpickle.com>
>
>> Even if you had compaction enforcing a limit on the number of columns in
>> a row, there would still be issues with concurrent writes at the same time
>> and with read-repair, i.e. node a says this is the first n columns but
>> node b says something else; you only know which is correct at read time.
>>
>> Have you considered using a TTL on the columns?
>>
>> Depending on the use case you could also consider having writes
>> periodically or randomly trim the data size, or trimming on reads.
>>
>> It will also make sense to partition the time series data into different
>> rows, and Viva la Standard Column Families!
>>
>> Hope that helps.
>>
>>   -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:
>>
>> Hello Everybody,
>>
>> Happy Christmas.
>>
>> I know that this topic has come up quite a few times on Dev and User
>> lists but did not culminate into a solution.
>>
>> http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html
>>
>> The above discussion on User list talks about AbstractCompactionStrategy
>> but I could not find any relevant documentation as it's a fairly new feature
>> in Cassandra.
>>
>> Let me state this necessity and use-case again.
>>
>> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to
>> approximately limit the number of columns to "n". "n" can vary a lot and
>> the intention is to throw away stale data and not to maintain any hard
>> limit on the CF or SC. It's very useful for storing time-series data where
>> stale data is not necessary. The goal is to achieve this with minimum
>> overhead and since compaction happens all the time it would be clever to
>> implement it as part of compaction.
>>
>> Thanks in advance.
>>
>> Praveen
>>
>>
>>
>
>

Re: Row or Supercolumn with approximately n columns

Posted by aaron morton <aa...@thelastpickle.com>.
During compaction, both automatic / minor and manual / major. 

The performance drop comes from having a lot of expired columns that have not been purged by compaction, as they must be read and discarded during reads.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/01/2012, at 10:38 AM, R. Verlangen wrote:

> @Aaron: Small side question: when do columns with a past TTL get removed? On a repair, a (minor) compaction, or ...? Does it cause a performance drop when that happens?
> 
> 2012/1/2 aaron morton <aa...@thelastpickle.com>
> Even if you had compaction enforcing a limit on the number of columns in a row, there would still be issues with concurrent writes at the same time and with read-repair, i.e. node a says this is the first n columns but node b says something else; you only know which is correct at read time.
> 
> Have you considered using a TTL on the columns?
> 
> Depending on the use case you could also consider having writes periodically or randomly trim the data size, or trimming on reads.
> 
> It will also make sense to partition the time series data into different rows, and Viva la Standard Column Families!
> 
> Hope that helps. 
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:
> 
>> Hello Everybody,
>> 
>> Happy Christmas.
>> 
>> I know that this topic has come up quite a few times on Dev and User lists but did not culminate in a solution.
>> 
>> http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html
>> 
>> The above discussion on User list talks about AbstractCompactionStrategy but I could not find any relevant documentation as it's a fairly new feature in Cassandra.
>> 
>> Let me state this necessity and use-case again.
>> 
>> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to approximately limit the number of columns to "n". "n" can vary a lot and the intention is to throw away stale data and not to maintain any hard limit on the CF or SC. It's very useful for storing time-series data where stale data is not necessary. The goal is to achieve this with minimum overhead and since compaction happens all the time it would be clever to implement it as part of compaction.
>> 
>> Thanks in advance.
>> 
>> Praveen
> 
> 


Re: Row or Supercolumn with approximately n columns

Posted by "R. Verlangen" <ro...@us2.nl>.
@Aaron: Small side question: when do columns with a past TTL get removed?
On a repair, a (minor) compaction, or ...? Does it cause a performance drop
when that happens?

2012/1/2 aaron morton <aa...@thelastpickle.com>

> Even if you had compaction enforcing a limit on the number of columns in a
> row, there would still be issues with concurrent writes at the same time
> and with read-repair, i.e. node a says this is the first n columns but
> node b says something else; you only know which is correct at read time.
>
> Have you considered using a TTL on the columns?
>
> Depending on the use case you could also consider having writes periodically
> or randomly trim the data size, or trimming on reads.
>
> It will also make sense to partition the time series data into different
> rows, and Viva la Standard Column Families!
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:
>
> Hello Everybody,
>
> Happy Christmas.
>
> I know that this topic has come up quite a few times on Dev and User lists
> but did not culminate in a solution.
>
> http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html
>
> The above discussion on User list talks about AbstractCompactionStrategy
> but I could not find any relevant documentation as it's a fairly new feature
> in Cassandra.
>
> Let me state this necessity and use-case again.
>
> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to
> approximately limit the number of columns to "n". "n" can vary a lot and
> the intention is to throw away stale data and not to maintain any hard
> limit on the CF or SC. It's very useful for storing time-series data where
> stale data is not necessary. The goal is to achieve this with minimum
> overhead and since compaction happens all the time it would be clever to
> implement it as part of compaction.
>
> Thanks in advance.
>
> Praveen
>
>
>

Re: Row or Supercolumn with approximately n columns

Posted by aaron morton <aa...@thelastpickle.com>.
Even if you had compaction enforcing a limit on the number of columns in a row, there would still be issues with concurrent writes at the same time and with read-repair, i.e. node a says this is the first n columns but node b says something else; you only know which is correct at read time.

Have you considered using a TTL on the columns?

Depending on the use case you could also consider having writes periodically or randomly trim the data size, or trimming on reads.

It will also make sense to partition the time series data into different rows, and Viva la Standard Column Families!
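
A small sketch of the row-partitioning idea (all names are hypothetical): the
row key carries a coarse time bucket, so no single row grows without bound
and a whole bucket of stale data can be dropped with one row delete.

    import time
    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    series = pycassa.ColumnFamily(pool, 'TimeSeries')

    def bucketed_key(source_id, ts):
        # One row per source per day, e.g. "sensor-42:2012-01-06".
        return '%s:%s' % (source_id, time.strftime('%Y-%m-%d', time.gmtime(ts)))

    # Column names are epoch-ms timestamps, assuming a LongType comparator.
    now = time.time()
    series.insert(bucketed_key('sensor-42', now), {int(now * 1000): '23.5'})

    # Dropping a whole day of stale data is then a single row delete.
    series.remove(bucketed_key('sensor-42', now - 30 * 24 * 3600))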

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:

> Hello Everybody,
> 
> Happy Christmas.
> 
> I know that this topic has come up quite a few times on Dev and User lists but did not culminate in a solution.
> 
> http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html
> 
> The above discussion on User list talks about AbstractCompactionStrategy but I could not find any relevant documentation as it's a fairly new feature in Cassandra.
> 
> Let me state this necessity and use-case again.
> 
> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to approximately limit the number of columns to "n". "n" can vary a lot and the intention is to throw away stale data and not to maintain any hard limit on the CF or SC. It's very useful for storing time-series data where stale data is not necessary. The goal is to achieve this with minimum overhead and since compaction happens all the time it would be clever to implement it as part of compaction.
> 
> Thanks in advance.
> 
> Praveen