Posted to user@cassandra.apache.org by Aram Ayazyan <ay...@gmail.com> on 2010/12/02 01:28:02 UTC

OutOfMemory exceptions w/ Cassandra 0.6.8

Hi,

We have a small cluster of 3 Cassandra servers running w/ full
replication. Every once in a while we get an OutOfMemory exception and
have to restart servers. Sometimes just restarting doesn’t do it and
we have to clean the commitlog or data directory.

We are running Cassandra 0.6.8. There is only 1 keyspace and 3 column
families. There are fewer than 1000 keys across all column families.
There is roughly 1 write request and 1 read request per second. Each
server is allocated 1GB. Size of all files in data directory of the
only column family is ~300MB. MemtableThroughputInMB is throttled way
down to 2 and BinaryMemtableThroughputInMB to 8 (w/ higher values we
were running out of memory extremely fast; this way it works for a
couple of days w/o crashing).

Last time this issue happened, I left the commitlog/data folders in
place, enabled GC logging and restarted Cassandra. It crashes really
fast, but what is really strange is that it seems to still have plenty
of memory when the error happens. The last 3 lines from the GC log:
21.408: [GC 437098K->436592K(1046464K), 0.0986800 secs]
21.520: [GC 453616K->453117K(1046464K), 0.0967770 secs]
21.629: [GC 470141K->469436K(1046464K), 0.0383520 secs]
The full log is here: http://pastebin.com/XGRSRcBd
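
The three GC lines above can be checked mechanically. The following is a minimal Python sketch, not part of the original post, that parses lines in the simple `before->after(total)` GC log format shown here and reports heap occupancy after each collection. The function name `heap_occupancy` and the regular expression are illustrative assumptions; other GC logging flags produce different line formats that this pattern will not match.

```python
import re

# Matches simple GC log lines such as:
#   21.629: [GC 470141K->469436K(1046464K), 0.0383520 secs]
# (hypothetical helper pattern; only this one-line format is handled)
GC_LINE = re.compile(
    r"^(?P<ts>\d+\.\d+): \[(?:Full )?GC "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<secs>\d+\.\d+) secs\]$"
)


def heap_occupancy(line):
    """Return heap occupancy after the collection as a percentage, or None."""
    m = GC_LINE.match(line.strip())
    if m is None:
        return None
    return 100.0 * int(m.group("after")) / int(m.group("total"))


if __name__ == "__main__":
    sample = [
        "21.408: [GC 437098K->436592K(1046464K), 0.0986800 secs]",
        "21.520: [GC 453616K->453117K(1046464K), 0.0967770 secs]",
        "21.629: [GC 470141K->469436K(1046464K), 0.0383520 secs]",
    ]
    for line in sample:
        print("%.1f%% of heap in use after GC" % heap_occupancy(line))
```

On these lines the heap is less than half full after each collection, which is consistent with the observation that memory does not appear to be exhausted when the error occurs.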

I’ve tried increasing the memory up to 1.5GB, but it still doesn’t start.

Any ideas what might be the problem here?

Thank you,
Aram

Re: OutOfMemory exceptions w/ Cassandra 0.6.8

Posted by Aram Ayazyan <ay...@gmail.com>.
Thanks a lot Jonathan! That seems to be it, since the exact same
configuration w/ the same data starts up and works fine on a different
server.

-Aram

On Wed, Dec 1, 2010 at 5:24 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> Stack trace looks like an OS-level thread limit causing problems, not
> actually memory.

Re: OutOfMemory exceptions w/ Cassandra 0.6.8

Posted by Jonathan Ellis <jb...@gmail.com>.
Stack trace looks like an OS-level thread limit causing problems, not
actually memory.
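
One way to check this on a Linux host is to compare the per-user process/thread limit with the number of threads the JVM currently has. The sketch below is an illustration under assumptions, not something posted in the thread: it uses Python's standard `resource` module to read `RLIMIT_NPROC` and parses the `Threads:` field of `/proc/<pid>/status`. The helper names `nproc_limit` and `thread_count` are hypothetical, and the script inspects its own process; the PID of the Cassandra JVM would need to be substituted. Other causes, such as a shortage of native memory for thread stacks, can produce the same error message and are not covered here.

```python
import os
import resource


def nproc_limit():
    """Return the (soft, hard) per-user limit on processes/threads.

    A value equal to resource.RLIM_INFINITY means no limit is set.
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


def thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    soft, hard = nproc_limit()
    print("nproc limit (soft, hard):", soft, hard)

    # Assumption: replace os.getpid() with the PID of the Cassandra JVM.
    status_path = "/proc/%d/status" % os.getpid()
    if os.path.exists(status_path):
        with open(status_path) as f:
            print("threads in process:", thread_count(f.read()))
```

If the soft limit is low and the user running Cassandra already owns close to that many threads across all of its processes, thread creation can fail even though the Java heap has plenty of free space.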

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: OutOfMemory exceptions w/ Cassandra 0.6.8

Posted by Aram Ayazyan <ay...@gmail.com>.
Regarding caches, I haven't explicitly enabled them and the
"saved_caches" directory is empty.

-Aram

Re: OutOfMemory exceptions w/ Cassandra 0.6.8

Posted by Aram Ayazyan <ay...@gmail.com>.
Hi Aaron,

OOM is happening both after the system has been running for a while and
when I restart it afterwards. The only way to make it run after it has
crashed is to remove everything from the data and commitlog
directories. Unfortunately I don't have the original log from when
Cassandra crashed earlier, but I might have some soon if another node
crashes.

This particular exception happened during start-up:
ERROR [main] 2010-12-01 14:58:37,795 CassandraDaemon.java (line 242)
Exception encountered during startup.
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:597)
        at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService.<init>(PeriodicCommitLogExecutorService.java:57)
        at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService.<init>(PeriodicCommitLogExecutorService.java:40)
        at org.apache.cassandra.db.commitlog.CommitLog.<init>(CommitLog.java:117)
        at org.apache.cassandra.db.commitlog.CommitLog.<init>(CommitLog.java:71)
        at org.apache.cassandra.db.commitlog.CommitLog$CLHandle.<clinit>(CommitLog.java:85)
        at org.apache.cassandra.db.commitlog.CommitLog.instance(CommitLog.java:80)
        at org.apache.cassandra.db.ColumnFamilyStore.maybeSwitchMemtable(ColumnFamilyStore.java:469)
        at org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:517)
        at org.apache.cassandra.db.Table.flush(Table.java:431)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:291)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:172)
        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:115)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)
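
The detail message after `java.lang.OutOfMemoryError:` is worth reading before tuning the heap, because HotSpot uses the same exception class for several different conditions. As a rough sketch (the table and the function name `classify_oom` are illustrative, and the hints are rules of thumb rather than a diagnosis), a log line can be mapped to its likely cause like this:

```python
# Common HotSpot OutOfMemoryError detail messages and what they usually
# point to. The hint texts are rules of thumb, not a definitive diagnosis.
OOM_HINTS = {
    "Java heap space": "heap exhausted; check -Xmx and live data size",
    "GC overhead limit exceeded": "heap nearly full; GC reclaims very little",
    "PermGen space": "permanent generation full (JVMs before Java 8)",
    "unable to create new native thread":
        "OS thread/process limit or native memory, not the Java heap",
}


def classify_oom(log_line):
    """Return a hint for an OutOfMemoryError log line, or None if not an OOM."""
    prefix = "java.lang.OutOfMemoryError:"
    if prefix not in log_line:
        return None
    detail = log_line.split(prefix, 1)[1].strip()
    return OOM_HINTS.get(detail, "unrecognized detail: " + detail)


if __name__ == "__main__":
    print(classify_oom(
        "java.lang.OutOfMemoryError: unable to create new native thread"))
```

For the trace above, the detail message is the "unable to create new native thread" variant, so raising the heap size would not be expected to help.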

And here is the full GC log: http://pastebin.com/XGRSRcBd (all 21
seconds of it).

Thank you,
Aram

Re: OutOfMemory exceptions w/ Cassandra 0.6.8

Posted by Aaron Morton <aa...@thelastpickle.com>.
Do you have a log message for the OOM? And some GC messages around it? Have you tried watching the server with jconsole?

Is the OOM happening on system start, after it's been running, or both?

Do you have any row/key caches? I cannot remember whether 0.6.* has this, but have you enabled the saved cache feature?

Aaron
 