Posted to user@cassandra.apache.org by Ondřej Černoš <ce...@gmail.com> on 2014/02/06 14:38:53 UTC

exceptions all around in clean cluster

Hi,

I am running a small 2 DC cluster of 3 nodes (each DC). I use 3 replicas in
both DCs (all 6 nodes have everything) on Cassandra 1.2.11. I populated the
cluster via cqlsh pipelined with a series of inserts. I use the cluster for
tests, the dataset is pretty small (hundreds of thousands of records max).
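
For context, the schema and the load were along these lines - keyspace, table and column names here are just placeholders, not the real schema, and generate_inserts.sh stands for whatever produces the INSERT statements:

# placeholders only; replication factor 3 in both DCs as described above
cqlsh node1.dc-xxx <<'EOF'
CREATE KEYSPACE testks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'xxx': 3, 'yyy': 3};
USE testks;
CREATE TABLE records (
  id text PRIMARY KEY,
  name text,
  payload text
);
-- one of the columns carries a secondary index
CREATE INDEX records_name_idx ON records (name);
EOF

# the data itself was piped through cqlsh on one node, one INSERT per statement
./generate_inserts.sh | cqlsh node1.dc-xxx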

The cluster was completely up during inserts. Inserts were done serially on
one of the nodes.

The resulting load is uneven:

Datacenter: xxx
==================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  ip               1.63 GB    256     100.0%            83ecd32a-3f2b-4cf6-b3c7-b316cb1986cc  default-rack
UN  ip               1.5 GB     256     100.0%            091ca530-2e95-4954-92c4-76f51fab0b66  default-rack
UN  ip               1.44 GB    256     100.0%            d94d335e-08bf-4a30-ad58-4c5acdc2ef45  default-rack
Datacenter: yyy
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  ip               2.27 GB    256     100.0%            e2584981-71f7-45b0-82f4-e08942c47585  1c
UN  ip               2.27 GB    256     100.0%            e5c6de9a-819e-4757-a420-55ec3ffaf131  1c
UN  ip               2.27 GB    256     100.0%            fa53f391-2dd3-4ec8-885d-8db6d453a708  1c


And 4 out of 6 nodes report corrupted sstables:


java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: mmap segment underflow; remaining is 239882945 but 1349280116 requested
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1618)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: mmap segment underflow; remaining is 239882945 but 1349280116 requested
        at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:119)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:68)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:44)
        at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:104)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:272)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
        at org.apache.cassandra.db.Table.getRow(Table.java:347)
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1062)
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1614)
        ... 3 more
Caused by: java.io.IOException: mmap segment underflow; remaining is 239882945 but 1349280116 requested
        at org.apache.cassandra.io.util.MappedFileDataInput.readBytes(MappedFileDataInput.java:135)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader$SimpleBlockFetcher.<init>(IndexedSliceReader.java:477)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:94)


nodetool repair -pr hangs, and nodetool rebuild from the less corrupted DC hangs as well.
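
For the record, the commands I have been running look like this (keyspace and DC names are placeholders):

nodetool repair -pr testks              # hangs
nodetool rebuild <less-corrupted-dc>    # run on nodes in the other DC; hangs as well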

The only interesting exception (besides the java.io.EOFException during
repair) is the following:

org.apache.cassandra.db.marshal.MarshalException: invalid UTF8 bytes 52f2665b
	at org.apache.cassandra.db.marshal.UTF8Type.getString(UTF8Type.java:54)
	at org.apache.cassandra.db.index.AbstractSimplePerColumnSecondaryIndex.insert(AbstractSimplePerColumnSecondaryIndex.java:102)
	at org.apache.cassandra.db.index.SecondaryIndexManager.indexRow(SecondaryIndexManager.java:448)
	at org.apache.cassandra.db.Table.indexRow(Table.java:431)
	at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
	at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:803)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)


It shows up during rebuild and during some repair -pr sessions.

Does anyone know what might have caused this?

regards,

ondrej cernos

Re: exceptions all around in clean cluster

Posted by Tupshin Harper <tu...@tupshin.com>.
This is a known issue, not fixed until Cassandra 2.1:

https://issues.apache.org/jira/browse/CASSANDRA-5202

-Tupshin
On Feb 6, 2014 10:05 PM, "Robert Coli" <rc...@eventbrite.com> wrote:

> On Thu, Feb 6, 2014 at 8:39 AM, Ondřej Černoš <ce...@gmail.com> wrote:
>
>> Update: I dropped the keyspace and the system keyspace, deleted all the data
>> and started from a fresh state. Now it behaves correctly. The previously
>> reported state is therefore the result of the keyspace having been dropped
>> earlier and recreated with no compression on sstables - maybe some sstables
>> in the system keyspace were still considered live even though the keyspace
>> had been completely dropped?
>>
>
> If you have a reproduction path, I recommend filing a JIRA in the Apache
> Cassandra JIRA.
>
> It's possible the response will be that dropping and recreating things
> (CFs, Keyspaces) is currently problematic and will be fixed soon, but your
> case seems particularly unusual/severe...
>
> =Rob
>
>

Re: exceptions all around in clean cluster

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Feb 6, 2014 at 8:39 AM, Ondřej Černoš <ce...@gmail.com> wrote:

> Update: I dropped the keyspace and the system keyspace, deleted all the data
> and started from a fresh state. Now it behaves correctly. The previously
> reported state is therefore the result of the keyspace having been dropped
> earlier and recreated with no compression on sstables - maybe some sstables
> in the system keyspace were still considered live even though the keyspace
> had been completely dropped?
>

If you have a reproduction path, I recommend filing a JIRA in the Apache
Cassandra JIRA.

It's possible the response will be that dropping and recreating things
(CFs, Keyspaces) is currently problematic and will be fixed soon, but your
case seems particularly unusual/severe...

=Rob

Re: exceptions all around in clean cluster

Posted by Ondřej Černoš <ce...@gmail.com>.
Update: I dropped the keyspace and the system keyspace, deleted all the data
and started from a fresh state. Now it behaves correctly. The previously
reported state is therefore the result of the keyspace having been dropped
earlier and recreated with no compression on sstables - maybe some sstables
in the system keyspace were still considered live even though the keyspace
had been completely dropped?
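
For completeness, the clean-up was roughly this on every node (paths and service commands are the defaults on our boxes, adjust to taste):

nodetool drain                     # flush memtables and stop accepting writes
sudo service cassandra stop
# wipe everything on disk, system keyspace included
sudo rm -rf /var/lib/cassandra/data/* \
            /var/lib/cassandra/commitlog/* \
            /var/lib/cassandra/saved_caches/*
sudo service cassandra start
# then the schema was recreated and the data reloaded via cqlsh as before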

ondrej cernos



Re: exceptions all around in clean cluster

Posted by Ondřej Černoš <ce...@gmail.com>.
I ran nodetool scrub on nodes in the less corrupted datacenter and tried
nodetool rebuild from this datacenter.
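
Concretely, something like this (keyspace and DC names are placeholders):

# on each node in the less corrupted DC
nodetool scrub testks

# on each node in the other DC, streaming from the scrubbed DC
nodetool rebuild <less-corrupted-dc>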

This is the result:

2014-02-06 15:04:24.645+0100 [Thread-83] [ERROR] CassandraDaemon.java(191) org.apache.cassandra.service.CassandraDaemon: Exception in thread Thread[Thread-83,5,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException
        at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:152)
        at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:187)
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:138)
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:243)
        at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:183)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:79)
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:144)
        ... 5 more
Caused by: java.lang.IllegalArgumentException
        at java.nio.Buffer.limit(Buffer.java:267)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
        at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:132)
        at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:115)
        at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:165)
        at org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:45)
        at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:61)
        at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
        at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:291)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
        at org.apache.cassandra.db.SliceQueryPager.next(SliceQueryPager.java:57)
        at org.apache.cassandra.db.Table.indexRow(Table.java:424)
        at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
        at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:803)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
2014-02-06 15:04:24.646+0100 [CompactionExecutor:10] [ERROR] CassandraDaemon.java(191) org.apache.cassandra.service.CassandraDaemon: component=c4 Exception in thread Thread[CompactionExecutor:10,1,main]
java.lang.IllegalArgumentException
        at java.nio.Buffer.limit(Buffer.java:267)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
        at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:132)
        at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:115)
        at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:165)
        at org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:45)
        at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:61)
        at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
        at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:291)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1123)
        at org.apache.cassandra.db.SliceQueryPager.next(SliceQueryPager.java:57)
        at org.apache.cassandra.db.Table.indexRow(Table.java:424)
        at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
        at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:803)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

Should I file a bug report with all this?

regards,

ondrej cernos

