Posted to user@cassandra.apache.org by Erik Onnen <eo...@gmail.com> on 2011/03/07 20:23:07 UTC

CompactionExecutor EOF During Bootstrap

During a recent upgrade of our Cassandra ring from 0.6.8 to 0.7.3, and
prior to a drain on the 0.6.8 nodes, we lost a node for reasons
unrelated to Cassandra. We decided to push forward with the drain on
the remaining healthy nodes. The upgrade completed successfully for
the remaining nodes and the ring was healthy. However, we're unable to
bootstrap a new node. The bootstrap process starts and we can see
streaming activity in the logs for the node giving up tokens, but the
bootstrapping node encounters the following:


INFO [main] 2011-03-07 10:37:32,671 StorageService.java (line 505)
Joining: sleeping 30000 ms for pending range setup
 INFO [main] 2011-03-07 10:38:02,679 StorageService.java (line 505)
Bootstrapping
 INFO [HintedHandoff:1] 2011-03-07 10:38:02,899
HintedHandOffManager.java (line 304) Started hinted handoff for
endpoint /10.211.14.200
 INFO [HintedHandoff:1] 2011-03-07 10:38:02,900
HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows
to endpoint /10.211.14.200
 INFO [CompactionExecutor:1] 2011-03-07 10:38:04,924
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuff-f-1
 INFO [CompactionExecutor:1] 2011-03-07 10:38:05,390
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuff-f-2
 INFO [CompactionExecutor:1] 2011-03-07 10:38:05,768
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-1
 INFO [CompactionExecutor:1] 2011-03-07 10:38:06,389
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-2
 INFO [CompactionExecutor:1] 2011-03-07 10:38:06,581
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-3
ERROR [CompactionExecutor:1] 2011-03-07 10:38:07,056
AbstractCassandraDaemon.java (line 114) Fatal exception in thread
Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,480
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-5
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,582
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid_reg_idx-f-1
ERROR [CompactionExecutor:1] 2011-03-07 10:38:08,635
AbstractCassandraDaemon.java (line 114) Fatal exception in thread
Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
ERROR [CompactionExecutor:1] 2011-03-07 10:38:08,666
AbstractCassandraDaemon.java (line 114) Fatal exception in thread
Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,855
SSTableReader.java (line 154) Opening
/mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid_reg_idx-f-4


The same behavior occurred on both attempts. Logs from the node
giving up tokens show activity by the StreamStage thread, but after the
failure on the bootstrapping node they show little else related to the
stream.

Lastly, in both cases the problem seems to involve the third data
file: files f-1, f-2 and f-4 are present, but f-3 is not.
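
A check like the one described above (which generations are present
and which are not) can be scripted. The following is only a minimal
sketch: it assumes the file-name layout
<columnfamily>-<version>-<generation>-<Component>.db suggested by the
paths in the log, and the function name and the "-tmp-" marker for
in-progress files are assumptions for illustration. Note that gaps in
generation numbers can also be left behind by ordinary compaction, so
a gap is a hint rather than proof of a failed stream.

```python
import re
from collections import defaultdict

# Assumed layout: <columnfamily>-<version>-<generation>-<Component>.db
SSTABLE_NAME = re.compile(
    r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>\w+)\.db$"
)


def missing_generations(filenames):
    """Return {column_family: [absent generation numbers]} for the given names."""
    seen_by_cf = defaultdict(set)
    for name in filenames:
        # Skip files that look like in-progress temporaries (assumed marker).
        if "-tmp-" in name:
            continue
        match = SSTABLE_NAME.match(name)
        if match:
            seen_by_cf[match.group("cf")].add(int(match.group("gen")))

    gaps = {}
    for cf, seen in seen_by_cf.items():
        # Every generation from 1 up to the highest one seen, minus those present.
        absent = sorted(set(range(1, max(seen) + 1)) - seen)
        if absent:
            gaps[cf] = absent
    return gaps


# Hypothetical directory listing, not taken from the actual node.
example = [
    "stuffid-f-1-Data.db",
    "stuffid-f-2-Data.db",
    "stuffid-f-4-Data.db",
    "stuffid-f-4-Index.db",
]
print(missing_generations(example))
```

On a real node the list of names would come from something like
os.listdir() on the keyspace data directory.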

Any help would be appreciated.

-erik

Re: CompactionExecutor EOF During Bootstrap

Posted by Erik Onnen <eo...@gmail.com>.
Thanks Jonathan.

Filed: https://issues.apache.org/jira/browse/CASSANDRA-2283

We'll start the scrub during our normal compaction cycle and update
this thread and the bug with the results.

-erik

On Mon, Mar 7, 2011 at 11:27 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> It sounds like it doesn't realize the data it's streaming over is
> older-version data.  Can you create a ticket?
>
> In the meantime nodetool scrub (on the existing nodes) will rewrite
> the data in the new format, which should work around the problem.

Re: CompactionExecutor EOF During Bootstrap

Posted by Jonathan Ellis <jb...@gmail.com>.
It sounds like it doesn't realize the data it's streaming over is
older-version data.  Can you create a ticket?

In the meantime nodetool scrub (on the existing nodes) will rewrite
the data in the new format, which should work around the problem.
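
As a rough sketch of running that workaround across the existing
nodes, the script below builds one "nodetool -h <host> scrub" command
per node. The host addresses are placeholders, the keyspace name
"Stuff" is only inferred from the data directory in the log, and the
helper names are made up for illustration. It defaults to a dry run
that just prints the commands; check the nodetool usage output on your
version before running anything for real.

```python
import subprocess


def scrub_command(host, keyspace=None, column_families=()):
    """Build the nodetool argument list for scrubbing one node."""
    cmd = ["nodetool", "-h", host, "scrub"]
    if keyspace:
        # Column family names only make sense after a keyspace.
        cmd.append(keyspace)
        cmd.extend(column_families)
    return cmd


def scrub_nodes(hosts, keyspace=None, column_families=(), dry_run=True):
    """Scrub each host in turn; with dry_run=True only print the commands."""
    commands = [scrub_command(h, keyspace, column_families) for h in hosts]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # One node at a time; raises CalledProcessError on a non-zero exit.
            subprocess.run(cmd, check=True)
    return commands


# Placeholder addresses for the existing (already upgraded) nodes.
existing_nodes = ["10.0.0.1", "10.0.0.2"]
scrub_nodes(existing_nodes, keyspace="Stuff")
```

Running the nodes sequentially is a deliberate choice here: scrub
rewrites SSTables, so doing one node at a time limits the extra I/O
load on the ring.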

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com