Posted to user@cassandra.apache.org by Lee Parker <le...@socialagency.com> on 2010/04/16 19:50:55 UTC

cassandra instability

I am having major issues with stability on my cassandra nodes.  Here is the
setup:
Cassandra Cluster - 2 EC2 small instances (1.7G RAM, single 32-bit core)
with an EBS volume for the cassandra sstables
Cassandra 0.6.0 with a 1G heap and memtable thresholds of 128MB / 1 million
operations
Clients are also small EC2 webservers running PHP 5.3.2 and using the Thrift
PHP bindings to access the cluster
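
(i.e., roughly the following in conf/storage-conf.xml -- element names are
from the 0.6 config as I understand it, so check against your own copy --
plus -Xmx1G in bin/cassandra.in.sh:)

<MemtableThroughputInMB>128</MemtableThroughputInMB>
<MemtableOperationsInMillions>1</MemtableOperationsInMillions>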

I am trying to migrate data from mysql into the cluster using the following
methodology:
1. get 500 rows (12 columns each) from mysql
2. build a batch_mutate to insert these rows into one CF (1 row = 1 row )
3. build a second batch_mutate to insert an index of those rows into a
second CF ( 1 row = 1 column )
4. loop around and do it again until all data has migrated.
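
In code, the loop above looks roughly like this (a simplified sketch against
the Thrift bindings generated from Cassandra 0.6's cassandra.thrift; $client
is a CassandraClient on an open transport, and get_mysql_chunk(), the
keyspace/CF names, and the key fields are illustrative, not our actual code):

function make_mutation($name, $value) {
    // Cassandra treats timestamps as opaque longs; microseconds is the
    // usual convention.  Beware: the (int) cast can overflow on 32-bit PHP.
    $col = new cassandra_Column(array(
        'name'      => $name,
        'value'     => (string) $value,
        'timestamp' => (int) (microtime(true) * 1000000)));
    return new cassandra_Mutation(array(
        'column_or_supercolumn' => new cassandra_ColumnOrSuperColumn(
            array('column' => $col))));
}

while ($rows = get_mysql_chunk($mysql, 500)) {                 // step 1
    $data_map  = array();
    $index_map = array();
    foreach ($rows as $row) {
        foreach ($row['columns'] as $name => $value) {
            $data_map[$row['key']]['Entries'][] = make_mutation($name, $value);
        }
        // one mysql row becomes one column in the index CF
        $index_map[$row['index_key']]['EntryIndex'][] =
            make_mutation($row['key'], '');
    }
    // step 2: the data CF, then step 3: the index CF
    $client->batch_mutate('Keyspace1', $data_map, cassandra_ConsistencyLevel::ONE);
    $client->batch_mutate('Keyspace1', $index_map, cassandra_ConsistencyLevel::ONE);
}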

This process is running on two clients each working on a separate part of
the mysql data which totals about 70G.  Each time I start it up, it will
work fine for about 1 hour and then it will crash the servers.  The error
message on the servers is usually an out of memory error.  I will get
several timeout errors on the clients and occasionally get an error telling
me that I was missing the timestamp.  The timestamp error is accompanied by
a server crashing if I use framed transport instead of buffered.  I wasn't
having the out of memory errors with 0.5.0, but had lots of timeouts and
some "unknown result" errors.  So we upgraded to 0.6.0 when it became the
stable release.

I have a development VM which has all the same code on one machine and have
had stability issues there as well.  I upgraded to the newest build of Java
(b20) on the VM and it hasn't helped.  If anything, it is more unstable.

I am at the point of abandoning cassandra as a solution for us, but wanted
to see if you all could offer me some advice first.  One of the reasons we
were trying cassandra was to scale out with smaller nodes rather than having
to run larger instances for mysql.

Lee Parker

Re: cassandra instability

Posted by banks <ba...@gmail.com>.
Is crashing really how it should deal with restricted memory?  If that's the
case, it seems like either a minimum required memory needs to be defined, or
it should adjust how it uses memory when less is available...

On Fri, Apr 16, 2010 at 11:07 AM, Avinash Lakshman <
avinash.lakshman@gmail.com> wrote:

> Those memtable thresholds also need looking into. You are using a really
> poor hardware configuration - 1.7 GB RAM is not a configuration worth
> experimenting with IMO. Typical production deployments are running 16 GB RAM
> and quad-core 64-bit machines. It's hard, I would presume, to make any
> recommendations with this kind of configuration.
>
> Avinash
>
>
> On Fri, Apr 16, 2010 at 10:58 AM, Paul Brown <pa...@gmail.com> wrote:
>
>>
>> On Apr 16, 2010, at 10:50 AM, Lee Parker wrote:
>> > [...]
>> > I am trying to migrate data from mysql into the cluster using the
>> following methodology:
>> > 1. get 500 rows (12 columns each) from mysql
>> > 2. build a batch_mutate to insert these rows into one CF (1 row = 1 row
>> )
>> > 3. build a second batch_mutate to insert an index of those rows into a
>> second CF ( 1 row = 1 column )
>> > 4. loop around and do it again until all data has migrated.
>>
>> If you have row caching turned on and are putting a lot of data in each
>> row, you might be causing the memory issues.  Maybe turn row caching off?
>>
>> -- Paul
>>
>>
>

Re: cassandra instability

Posted by Avinash Lakshman <av...@gmail.com>.
Those memtable thresholds also need looking into. You are using a really
poor hardware configuration - 1.7 GB RAM is not a configuration worth
experimenting with IMO. Typical production deployments are running 16 GB RAM
and quad-core 64-bit machines. It's hard, I would presume, to make any
recommendations with this kind of configuration.

Avinash

On Fri, Apr 16, 2010 at 10:58 AM, Paul Brown <pa...@gmail.com> wrote:

>
> On Apr 16, 2010, at 10:50 AM, Lee Parker wrote:
> > [...]
> > I am trying to migrate data from mysql into the cluster using the
> following methodology:
> > 1. get 500 rows (12 columns each) from mysql
> > 2. build a batch_mutate to insert these rows into one CF (1 row = 1 row )
> > 3. build a second batch_mutate to insert an index of those rows into a
> second CF ( 1 row = 1 column )
> > 4. loop around and do it again until all data has migrated.
>
> If you have row caching turned on and are putting a lot of data in each
> row, you might be causing the memory issues.  Maybe turn row caching off?
>
> -- Paul
>
>

Re: cassandra instability

Posted by Paul Brown <pa...@gmail.com>.
Two more things you can do:

1) If you're running the updaters in the JVM (sounded like you were doing PHP?), then be sure that you're cleaning up the database sessions properly.  Hibernate, in particular, will keep a lot of bookkeeping data around otherwise, and that can easily overflow your heap.

2) Use jmap to get some heap snapshots and see what the problem is ($PID is the process ID of your Cassandra process):

jmap -histo $PID > histo-`date +%s`

With several of those and a little bash-fu, you ought to be able to see what the leak is.
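
For example, take two snapshots a few minutes apart and compare the top of
each (illustrative; adjust names and intervals to taste):

jmap -histo $PID > histo-1
sleep 300
jmap -histo $PID > histo-2
# classes whose instance counts keep climbing are the likely leak
diff <(head -n 40 histo-1) <(head -n 40 histo-2)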

-- Paul

On Apr 16, 2010, at 11:06 AM, Lee Parker wrote:

> Row caching is not turned on.
> Lee Parker
> 
> On Fri, Apr 16, 2010 at 12:58 PM, Paul Brown <pa...@gmail.com> wrote:
> 
> On Apr 16, 2010, at 10:50 AM, Lee Parker wrote:
> > [...]
> > I am trying to migrate data from mysql into the cluster using the following methodology:
> > 1. get 500 rows (12 columns each) from mysql
> > 2. build a batch_mutate to insert these rows into one CF (1 row = 1 row )
> > 3. build a second batch_mutate to insert an index of those rows into a second CF ( 1 row = 1 column )
> > 4. loop around and do it again until all data has migrated.
> 
> If you have row caching turned on and are putting a lot of data in each row, you might be causing the memory issues.  Maybe turn row caching off?
> 
> -- Paul
> 
> 


Re: cassandra instability

Posted by Lee Parker <le...@socialagency.com>.
Row caching is not turned on.

Lee Parker
On Fri, Apr 16, 2010 at 12:58 PM, Paul Brown <pa...@gmail.com> wrote:

>
> On Apr 16, 2010, at 10:50 AM, Lee Parker wrote:
> > [...]
> > I am trying to migrate data from mysql into the cluster using the
> following methodology:
> > 1. get 500 rows (12 columns each) from mysql
> > 2. build a batch_mutate to insert these rows into one CF (1 row = 1 row )
> > 3. build a second batch_mutate to insert an index of those rows into a
> second CF ( 1 row = 1 column )
> > 4. loop around and do it again until all data has migrated.
>
> If you have row caching turned on and are putting a lot of data in each
> row, you might be causing the memory issues.  Maybe turn row caching off?
>
> -- Paul
>
>

Re: cassandra instability

Posted by Paul Brown <pa...@gmail.com>.
On Apr 16, 2010, at 10:50 AM, Lee Parker wrote:
> [...]
> I am trying to migrate data from mysql into the cluster using the following methodology:
> 1. get 500 rows (12 columns each) from mysql
> 2. build a batch_mutate to insert these rows into one CF (1 row = 1 row )
> 3. build a second batch_mutate to insert an index of those rows into a second CF ( 1 row = 1 column )
> 4. loop around and do it again until all data has migrated.

If you have row caching turned on and are putting a lot of data in each row, you might be causing the memory issues.  Maybe turn row caching off?

-- Paul


Re: cassandra instability

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Apr 16, 2010 at 2:30 PM, Lee Parker <le...@socialagency.com> wrote:
> As for the Memtable thresholds, when I ran with lower thresholds, the server
> would be thrashing with compaction runs due to the dramatically increased
> number of sstable files.  That was when I was running 0.5.0.  Has 0.6.0
> improved compaction performance such that this shouldn't be an issue?

Nope.  Hence the conclusion that you probably need to add capacity or
throttle your writes.

-Jonathan

Re: cassandra instability

Posted by Lee Parker <le...@socialagency.com>.
I don't think it is a hardware issue.  This is happening on multiple servers
and clients on EC2 instances and on my local development VM.  I think you are
right that the timestamp errors are likely being caused by the Thrift PHP
bindings.  The frustrating part is that I can't get the error to
reproduce consistently when I have debugging systems in place.

As for the Memtable thresholds, when I ran with lower thresholds, the server
would be thrashing with compaction runs due to the dramatically increased
number of sstable files.  That was when I was running 0.5.0.  Has 0.6.0
improved compaction performance such that this shouldn't be an issue?

Lee Parker
On Fri, Apr 16, 2010 at 1:13 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
>
> Sounds like
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> to me.
>
> > I will get
> > several timeout errors on the clients
>
> Symptomatic of running out of memory.
>
> > and occasionally get an error telling
> > me that I was missing the timestamp.
>
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
>
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
>
> Thrift is fragile when the client sends it garbage.
> (https://issues.apache.org/jira/browse/THRIFT-601)
>
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than
> having
> > to run larger instances for mysql.
>
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
>
> -Jonathan
>

Re: cassandra instability

Posted by Chris Goffinet <go...@digg.com>.
We don't use PHP to talk to Cassandra directly, but we do have the front-end communicate with our backend services, which are over Thrift. We've used both Framed and Buffered; both required some tweaks. We use the PHP C-extension from the Thrift repo. I have to admit it's pretty crappy; we had to make some modifications in Thrift.

I opened this ticket; I need to submit some of my patches so we can close it out (resolving the timeout issues):
https://issues.apache.org/jira/browse/THRIFT-638

-Chris

On Apr 22, 2010, at 9:03 AM, S Ahmed wrote:

> If Digg uses PHP with Cassandra, can the library really be that old?
> 
> Or are they using their own custom PHP Cassandra client? (Probably, but just making sure.)
> 
> On Fri, Apr 16, 2010 at 2:13 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
> 
> Sounds like http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> to me.
> 
> > I will get
> > several timeout errors on the clients
> 
> Symptomatic of running out of memory.
> 
> > and occasionally get an error telling
> > me that I was missing the timestamp.
> 
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
> 
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
> 
> Thrift is fragile when the client sends it garbage.
> (https://issues.apache.org/jira/browse/THRIFT-601)
> 
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than having
> > to run larger instances for mysql.
> 
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
> 
> -Jonathan
> 


Re: cassandra instability

Posted by S Ahmed <sa...@gmail.com>.
If Digg uses PHP with Cassandra, can the library really be that old?

Or are they using their own custom PHP Cassandra client? (Probably, but just
making sure.)

On Fri, Apr 16, 2010 at 2:13 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
>
> Sounds like
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> to me.
>
> > I will get
> > several timeout errors on the clients
>
> Symptomatic of running out of memory.
>
> > and occasionally get an error telling
> > me that I was missing the timestamp.
>
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
>
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
>
> Thrift is fragile when the client sends it garbage.
> (https://issues.apache.org/jira/browse/THRIFT-601)
>
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than
> having
> > to run larger instances for mysql.
>
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
>
> -Jonathan
>

Re: cassandra instability

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:
> Each time I start it up, it will
> work fine for about 1 hour and then it will crash the servers.  The error
> message on the servers is usually an out of memory error.

Sounds like http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
to me.

> I will get
> several timeout errors on the clients

Symptomatic of running out of memory.

> and occasionally get an error telling
> me that I was missing the timestamp.

This is an entirely different problem.  Your client is sending
garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
binding is virtually unmaintained, so it could be a bug there, but
Digg uses PHP against Cassandra extensively and hasn't hit this to my
knowledge.  As I said in another thread, I wouldn't rule out bad
hardware.

> The timestamp error is accompanied by
> a server crashing if I use framed transport instead of buffered.

Thrift is fragile when the client sends it garbage.
(https://issues.apache.org/jira/browse/THRIFT-601)

> One of the reasons we
> were trying cassandra was to scale out with smaller nodes rather than having
> to run larger instances for mysql.

2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
throttle your clients to fix the OOM completely.
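
For example, something as crude as a sleep between batches on the client
side would do it (illustrative PHP, not anything from this thread):

$client->batch_mutate($keyspace, $mutation_map, cassandra_ConsistencyLevel::ONE);
usleep(500000); // pause 0.5s per 500-row chunk so the nodes can drain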

-Jonathan

Re: cassandra instability

Posted by Lee Parker <le...@socialagency.com>.
I did regenerate the thrift bindings.  What I have found in testing is that
the batch_mutate command occasionally sends bad data to thrift when I try to
insert a set of items with too many columns.  I don't know if this is a
problem with PHP or with the thrift PHP library.  I have found that a limit
of 1000 columns per batch is fast enough for my needs and is stable.
Previously, I was regularly sending 6000 columns (500 rows with about 12
columns each).  Most of the columns in each row were fairly small, but some
of the rows had a rather large block of text.  When this was happening, the
output of the TBinaryProtocol would actually be incorrect at seemingly
random times.  This would then cause an error from cassandra saying that I
was missing my timestamp.  Enough of these errors and cassandra would crash
with an out of memory error.  If enough data was on the servers when this
happened, cassandra couldn't recover from the error because I didn't have
enough memory on the machines.  I have now upgraded to larger machines and
that has cleared up the real memory issues.
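
In case it helps anyone else, the chunking is trivial -- something like this
(an illustrative sketch; $client and the make_mutation()-style helper are as
in my first mail, and the keyspace name is a placeholder):

$pending = array();   // rowkey => CF => list of cassandra_Mutation
$count   = 0;

foreach ($all_mutations as $key => $cfs) {
    foreach ($cfs as $cf => $muts) {
        $pending[$key][$cf] = $muts;
        $count += count($muts);
    }
    if ($count >= 1000) {   // flush at ~1000 columns per batch_mutate
        $client->batch_mutate('Keyspace1', $pending, cassandra_ConsistencyLevel::ONE);
        $pending = array();
        $count   = 0;
    }
}
if ($count > 0) {           // flush the remainder
    $client->batch_mutate('Keyspace1', $pending, cassandra_ConsistencyLevel::ONE);
}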

Lee Parker
On Sun, Apr 18, 2010 at 6:51 PM, Brandon Williams <dr...@gmail.com> wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:
>
>> This process is running on two clients each working on a separate part of
>> the mysql data which totals about 70G.  Each time I start it up, it will
>> work fine for about 1 hour and then it will crash the servers.  The error
>> message on the servers is usually an out of memory error.  I will get
>> several timeout errors on the clients and occasionally get an error telling
>> me that I was missing the timestamp.  The timestamp error is accompanied by
>> a server crashing if I use framed transport instead of buffered.  I wasn't
>> having the out of memory errors with 0.5.0, but had lots of timeouts and
>> some "unknown result" errors.  So we upgraded to 0.6.0 when it became the
>> stable release.
>>
>
> Did you regenerate the php thrift bindings between 0.5 and 0.6?  There's a
> decent chance that thrift made some kind of backwards incompatible change
> between those revisions (look in the lib dir of each cassandra version to
> determine the thrift svn revision you need.)  If that happened, then it's
> possible the old bindings are sending something the newer version does not
> understand, and causing you to run into THRIFT-601, crashing the server.
>
> -Brandon
>

Re: cassandra instability

Posted by Brandon Williams <dr...@gmail.com>.
On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <le...@socialagency.com> wrote:

> This process is running on two clients each working on a separate part of
> the mysql data which totals about 70G.  Each time I start it up, it will
> work fine for about 1 hour and then it will crash the servers.  The error
> message on the servers is usually an out of memory error.  I will get
> several timeout errors on the clients and occasionally get an error telling
> me that I was missing the timestamp.  The timestamp error is accompanied by
> a server crashing if I use framed transport instead of buffered.  I wasn't
> having the out of memory errors with 0.5.0, but had lots of timeouts and
> some "unknown result" errors.  So we upgraded to 0.6.0 when it became the
> stable release.
>

Did you regenerate the php thrift bindings between 0.5 and 0.6?  There's a
decent chance that thrift made some kind of backwards incompatible change
between those revisions (look in the lib dir of each cassandra version to
determine the thrift svn revision you need.)  If that happened, then it's
possible the old bindings are sending something the newer version does not
understand, and causing you to run into THRIFT-601, crashing the server.
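
For example (illustrative paths; the rNNNNNN suffix on the libthrift jar is
the svn revision the bindings were generated from):

ls apache-cassandra-0.5.1/lib apache-cassandra-0.6.0/lib | grep -i thrift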

-Brandon