Posted to user@cassandra.apache.org by Marcus Bointon <ma...@synchromedia.co.uk> on 2011/06/01 03:36:40 UTC

Re: Appending to fields

On 31 May 2011, at 23:03, Dan Kuebrich wrote:

> I think perhaps OP meant O(N * M), where N is number of rows and M is total bytes.

That's probably more accurate. 

This is what it was doing: say I repeatedly append 100 bytes to the same 1000 records. The first time around that's 100,000 bytes to transfer. The second time it has to read the first 100k, then write it back plus the next 100k, making 300k. The next time it reads 200k and writes 200k+100k, making 500k. By the 100th iteration you're transferring about 20MB for a single 100k update. MySQL really does do this; it's very painful. In my real case a 50-byte update to 500k records went from taking a few seconds to several hours, with no prospect of ever improving.

Marcus
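
The arithmetic in the message above can be checked with a short Python sketch. The function name and its parameters are made up for illustration, and it assumes that every append reads the whole existing value back and rewrites it in full, as described for the MySQL case.

```python
def rewrite_append_cost(records, chunk_bytes, iteration):
    """Bytes moved on a given 1-based iteration when each append reads
    the existing value back and rewrites it with the new chunk added."""
    existing = records * chunk_bytes * (iteration - 1)  # bytes read back
    written = existing + records * chunk_bytes          # old data plus new chunk
    return existing + written


# Per-iteration cost for 100-byte appends to 1000 records.
for i in (1, 2, 3, 100):
    print(i, rewrite_append_cost(1000, 100, i))

# Cumulative cost of all 100 iterations.
total = sum(rewrite_append_cost(1000, 100, i) for i in range(1, 101))
print("total", total)
```

Under those assumptions the 100th iteration alone moves 19,900,000 bytes, and the 100 iterations together move 1,000,000,000 bytes to store 10,000,000 bytes of data.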

Re: Appending to fields

Posted by "Nair, Rajesh" <ra...@blackrock.com>.
D

----- Original Message -----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Tuesday, May 31, 2011 09:57 PM
To: user@cassandra.apache.org <us...@cassandra.apache.org>
Subject: Re: Appending to fields

Sounds like Ed is right and you should be doing the append as
add-a-new-column instead of overwrite-existing-column.


Re: Appending to fields

Posted by Jonathan Ellis <jb...@gmail.com>.
Sounds like Ed is right and you should be doing the append as
add-a-new-column instead of overwrite-existing-column.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
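
To illustrate the add-a-new-column suggestion, here is a minimal Python sketch that models a row as a plain dict of column name to value. It is not a Cassandra client API: the function names, the "data" column, and the "chunk-NNNNNNNN" naming scheme are all hypothetical, and the sketch assumes columns can be read back in name order.

```python
def append_by_overwrite(row, chunk):
    """Read the single 'data' column, extend it, and write it all back.
    Returns the number of bytes read plus written."""
    old = row.get("data", b"")
    new = old + chunk
    row["data"] = new
    return len(old) + len(new)


def append_as_new_column(row, seq, chunk):
    """Write the chunk under its own column name; nothing is read back.
    Returns the number of bytes written."""
    row[f"chunk-{seq:08d}"] = chunk  # zero-padded so names sort in append order
    return len(chunk)


def read_all(row):
    """Reassemble the full value by concatenating columns in name order."""
    return b"".join(row[name] for name in sorted(row))


overwrite_row, column_row = {}, {}
chunk = b"x" * 100
overwrite_costs = [append_by_overwrite(overwrite_row, chunk) for _ in range(100)]
column_costs = [append_as_new_column(column_row, i, chunk) for i in range(100)]

print(overwrite_costs[-1], column_costs[-1])
print(read_all(column_row) == overwrite_row["data"])
```

In this model the cost of an overwrite-style append grows with the size of the existing value, while the cost of a new-column append depends only on the size of the chunk. The trade-off is that a reader has to fetch and concatenate the columns to get the full value.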