You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Thibaut (JIRA)" <ji...@apache.org> on 2011/02/23 16:52:38 UTC
[jira] Created: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Race conditions when reinitialisating nodes (OOM + Nullpointer)
---------------------------------------------------------------
Key: CASSANDRA-2228
URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.7.2
Reporter: Thibaut
Fix For: 0.7.3
I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
1)
After a few seconds/minutes, I get a OOM error:
INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
Killing and restarting the node multiple times will eventually "fix" these errors.
Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Thibaut (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998422#comment-12998422 ]
Thibaut commented on CASSANDRA-2228:
------------------------------------
The normal heap usage of the node is about half of the allocated heap:
Gossip active : true
Load : 30.99 GB
Generation No : 1298476238
Uptime (seconds) : 308
Heap Memory (MB) : 1544.41 / 2493.38
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Fix For: 0.7.3
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002197#comment-13002197 ]
Gary Dusbabek commented on CASSANDRA-2228:
------------------------------------------
aha. I was looking in the trunk code (no chance of null). I see the problem in the 0.7 branch.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002268#comment-13002268 ]
Jonathan Ellis commented on CASSANDRA-2228:
-------------------------------------------
+1
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
> Attachments: v1-0001-initialize-localEndpoint_.txt
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002323#comment-13002323 ]
Hudson commented on CASSANDRA-2228:
-----------------------------------
Integrated in Cassandra-0.7 #344 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/344/])
initialize localendpoint in gossiper earlier. patch by gdusbabek, reviewed by jbellis. CASSANDRA-2228
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
> Attachments: v1-0001-initialize-localEndpoint_.txt
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Assigned: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis reassigned CASSANDRA-2228:
-----------------------------------------
Assignee: Gary Dusbabek
The OOM is because flushing allocates a buffer the size of in_memory_compaction_limit. Sounds like you need to lower that.
The NPE looks like a bug.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.3
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002080#comment-13002080 ]
Gary Dusbabek commented on CASSANDRA-2228:
------------------------------------------
CHM throws NPE when the key is null. To me, then, the bug is that FBU.getLocalAddress() is somehow returning null.
I checked yesterday: even if local_address is unspecified, it ends up returning localhost.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary Dusbabek updated CASSANDRA-2228:
-------------------------------------
Attachment: v1-0001-initialize-localEndpoint_.txt
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
> Attachments: v1-0001-initialize-localEndpoint_.txt
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001630#comment-13001630 ]
Gary Dusbabek commented on CASSANDRA-2228:
------------------------------------------
What is listen_address in cassandra.yaml? I'm having a hard time conceiving of a way for FBUtilities.getLocalAddress() to return null.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Thibaut Britz (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001633#comment-13001633 ]
Thibaut Britz commented on CASSANDRA-2228:
------------------------------------------
Hi,
I'm on vacation until March 14th.
For urgent matters, please contact:
Christophe Folschette (christophe.folschette@trendiction.com)
+352 20 33 35 32
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002037#comment-13002037 ]
Jonathan Ellis commented on CASSANDRA-2228:
-------------------------------------------
It looks like we're doing Migration stuff before Gossiper.start is called.
One possible solution: initialize Gossiper.localEndpoint in constructor instead of in start.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (CASSANDRA-2228) Race conditions when
reinitialisating nodes (OOM + Nullpointer)
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002084#comment-13002084 ]
Jonathan Ellis commented on CASSANDRA-2228:
-------------------------------------------
My point was that we can call put before initializing localEndpoint_ to FBU.gLA, so it's naturally going to be null.
> Race conditions when reinitialisating nodes (OOM + Nullpointer)
> ---------------------------------------------------------------
>
> Key: CASSANDRA-2228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2228
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.2
> Reporter: Thibaut
> Assignee: Gary Dusbabek
> Fix For: 0.7.4
>
>
> I had a corrupt system table which wouldn't compact anymore and I deleted the files and restarted cassandra and let it take the same token/ip address.
> I experienced the same errors when I'm adding a newly installed node under the same token/ip address before calling repair.
> 1)
> After a few seconds/minutes, I get a OOM error:
> INFO [FlushWriter:1] 2011-02-23 16:40:28,958 Memtable.java (line 164) Completed flushing /cassandra/data/system/Schema-f-15-Data.db (8037 bytes)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 Migration.java (line 133) Applying migration 3e30e76b-1e3f-11e0-8369-5a9c1faed4ae Add keyspace: table_userentriesrep factor:3rep strategy:SimpleStrategy{org.apache.cassandra.config.CFMetaData@58925d9[cfId=1024,tableName=table_userentries,cfName=table_userentries,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}], org.apache.cassandra.config.CFMetaData@11ab7246[cfId=1025,tableName=table_userentries,cfName=table_userentries_meta,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType@b44dff0,subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=0.0,gcGraceSeconds=86400,defaultValidator=org.apache.cassandra.db.marshal.BytesType@b44dff0,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=3600,memtableFlushAfterMins=60,memtableThroughputInMb=64,memtableOperationsInMillions=10.0,column_metadata={}]}
> INFO [MigrationStage:1] 2011-02-23 16:40:28,965 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Migrations at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [FlushWriter:1] 2011-02-23 16:40:28,966 Memtable.java (line 157) Writing Memtable-Migrations@2121008793(12529 bytes, 1 operations)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,966 ColumnFamilyStore.java (line 666) switching in a fresh Memtable for Schema at CommitLogContext(file='/cassandra/commitlog/CommitLog-1298475572022.log', position=226075)
> INFO [MigrationStage:1] 2011-02-23 16:40:28,967 ColumnFamilyStore.java (line 977) Enqueuing flush of Memtable-Schema@139610466(8370 bytes, 15 operations)
> INFO [ScheduledTasks:1] 2011-02-23 16:40:28,972 StatusLogger.java (line 89) table_sourcedetection.table_sourcedetection 0,0 0/0 0/200000
> ERROR [FlushWriter:1] 2011-02-23 16:41:01,240 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[FlushWriter:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
> at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:126)
> at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:75)
> at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:158)
> at org.apache.cassandra.db.Memtable.access$000(Memtable.java:51)
> at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:176)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2) If I restart then, I'm getting an Nullpointer exception. The OOM error will only appear once.
> ERROR [main] 2011-02-23 16:42:32,782 AbstractCassandraDaemon.java (line 333) Exception encountered during startup.
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
> at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:925)
> at org.apache.cassandra.service.MigrationManager.passiveAnnounce(MigrationManager.java:105)
> at org.apache.cassandra.service.MigrationManager.applyMigrations(MigrationManager.java:161)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:185)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> Killing and restarting the node multiple times will eventually "fix" these errors.
> Steps to reproduce. Remove complete data directory and restart node with same token/ip.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira