Posted to user@cassandra.apache.org by Jay Svc <ja...@gmail.com> on 2013/01/03 00:55:01 UTC

Multi threads updating single row

Happy New Year Everyone..!

In our situation, there are multiple client threads updating or adding new
columns in the same row. These new columns are time series data.

My question is: would multiple threads be able to add columns to the same
row? Do you see any performance issues, since there is a comparator and it has
to sort columns as they get added? Do you see any loss of data or conflicts
due to multiple threads updating data? I am expecting a high volume of data
coming in on each one of those threads.

Thanks in advance.

Jay

Re: Multi threads updating single row

Posted by "Hiller, Dean" <De...@nrel.gov>.
My bad, I meant exactly that: there could be performance issues if you have multiple nodes hitting the same row, since the work items for that row are serialised.

Dean

On Thursday, January 3, 2013 12:28 PM, aaron morton <aa...@thelastpickle.com> wrote:

> > Multiple nodes could be a problem
> Not sure what you mean here Dean.




Re: Multi threads updating single row

Posted by aaron morton <aa...@thelastpickle.com>.
> Multiple nodes could be a problem 
Not sure what you mean here Dean.

There are no issues with multiple clients, from multiple threads, processes or nodes, inserting into or updating the same row.

There are potential performance issues, though, due to the row-level isolation used in the write path. The first (internal) writing thread "wins" when updating a row, so any concurrent (internal) write threads updating the same row must (again internally) restart their processing.

You will probably only see issues when you have tens of clients continually updating the same row. Even then, the size of the issue depends somewhat on the size of the inserts you are running, i.e. a smaller insert size means less re-work.
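
One way to picture the restart behaviour described above is an optimistic copy-and-swap loop. The sketch below is only an illustration of that general pattern, not Cassandra's code: `RowRef`, `apply_update` and `run_writers` are hypothetical names, and a plain lock stands in for an atomic compare-and-set.

```python
import threading


class RowRef:
    """Holds the current snapshot of one row (hypothetical helper)."""

    def __init__(self):
        self._snapshot = {}              # column name -> value
        self._lock = threading.Lock()    # only emulates an atomic compare-and-set

    def get(self):
        return self._snapshot

    def compare_and_set(self, expected, new):
        # Succeeds only if no other writer swapped in a snapshot since `expected` was read.
        with self._lock:
            if self._snapshot is expected:
                self._snapshot = new
                return True
            return False


def apply_update(row, columns):
    """Merge `columns` into the row; return how many times the merge was redone."""
    retries = 0
    while True:
        current = row.get()
        updated = dict(current)          # work on a private copy of the snapshot
        updated.update(columns)          # merge this writer's columns into the copy
        if row.compare_and_set(current, updated):
            return retries               # this writer "won"
        retries += 1                     # another writer won first: restart the merge


def run_writers(n_threads=8, batches=50, batch_size=4):
    row = RowRef()
    retry_counts = [0] * n_threads       # each thread only touches its own slot

    def writer(tid):
        for b in range(batches):
            cols = {(tid, b, i): b"x" for i in range(batch_size)}
            retry_counts[tid] += apply_update(row, cols)

    threads = [threading.Thread(target=writer, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return row.get(), sum(retry_counts)


if __name__ == "__main__":
    final_row, total_retries = run_writers()
    print(len(final_row), "columns written,", total_retries, "merges redone")
```

In this sketch no columns are lost whatever the interleaving; contention only shows up as redone merges. Each redone merge repeats the work of merging the writer's own batch, which fits the point that a smaller insert size means less re-work.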

There are also potential performance issues if you have secondary indexes on the row. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/01/2013, at 2:32 AM, "Hiller, Dean" <De...@nrel.gov> wrote:

> Multiple nodes could be a problem but multiple threads is probably just fine.  If you have two threads write to the same column, the last one wins though so I hope your timestamps are unique even across threads so you don't lose data ;).
>
> Dean


Re: Multi threads updating single row

Posted by "Hiller, Dean" <De...@nrel.gov>.
Multiple nodes could be a problem, but multiple threads are probably just fine. If two threads write to the same column, the last one wins, though, so I hope your timestamps are unique even across threads so you don't lose data ;).
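
To make the last-one-wins point concrete, here is a small sketch of a toy row in which each column keeps only the write with the highest timestamp. It is an illustration under stated assumptions, not Cassandra code: `Row`, `run`, `bare_timestamp` and `timestamp_plus_writer` are hypothetical names, the tie handling is an assumption of the sketch, and the column-name scheme (timestamp, writer id, per-writer sequence) is just one possible way to keep names unique across threads.

```python
import threading

BASE_MICROS = 1_357_171_200_000_000  # arbitrary fixed start time, in microseconds


class Row:
    """Toy row: column name -> (write timestamp, value)."""

    def __init__(self):
        self.columns = {}
        self._lock = threading.Lock()

    def write(self, name, value, timestamp):
        with self._lock:
            existing = self.columns.get(name)
            # Last write wins: keep the cell with the higher timestamp.
            # Letting the later call win on a tie is an assumption of this sketch.
            if existing is None or timestamp >= existing[0]:
                self.columns[name] = (timestamp, value)


def bare_timestamp(event_micros, tid, seq):
    # Column name is just the event time, so threads can collide.
    return event_micros


def timestamp_plus_writer(event_micros, tid, seq):
    # Still sorts by time first, but is unique per writer and per event.
    return (event_micros, tid, seq)


def run(n_threads, events, make_name):
    row = Row()

    def writer(tid):
        for seq in range(events):
            event_micros = BASE_MICROS + seq  # every thread sees the same event times
            name = make_name(event_micros, tid, seq)
            row.write(name, f"t{tid}-e{seq}", timestamp=event_micros)

    threads = [threading.Thread(target=writer, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return row


if __name__ == "__main__":
    print(len(run(4, 100, bare_timestamp).columns))         # colliding names
    print(len(run(4, 100, timestamp_plus_writer).columns))  # unique names
```

With four writers and 100 events each, the bare-timestamp naming leaves only 100 columns because writers overwrite one another, while the composite naming keeps all 400.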

Dean
