Posted to user@cassandra.apache.org by Brian Frank Cooper <co...@yahoo-inc.com> on 2009/08/20 04:12:43 UTC

Server cannot startup after shutdown

Hi folks,

I'm using 0.4 beta1 and had six servers, each loaded with 20 GB of data. (In this test, records are 10 KB each and the JVM has a 2 GB heap.) I stopped the servers using the kill command, which I think is the recommended method. When I tried to restart them, some servers threw a UTFDataFormatException, while others threw an OutOfMemoryError. None of them started.

Is this a known issue?

ERROR - Fatal exception in thread Thread[main,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.db.CommitLog.recover(CommitLog.java:274)
        at org.apache.cassandra.db.RecoveryManager.doRecovery(RecoveryManager.java:63)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:96)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:171)

ERROR - Exception encountered during startup.
java.io.UTFDataFormatException: malformed input around byte 5497
        at java.io.DataInputStream.readUTF(DataInputStream.java:639)
        at java.io.DataInputStream.readUTF(DataInputStream.java:547)
        at org.apache.cassandra.db.RowSerializer.deserialize(Row.java:218)
        at org.apache.cassandra.db.CommitLog.recover(CommitLog.java:285)
        at org.apache.cassandra.db.RecoveryManager.doRecovery(RecoveryManager.java:63)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:96)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:171)

Thanks for the help!

Brian

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
On Wed, Aug 19, 2009 at 9:46 PM, Jonathan Ellis<jb...@gmail.com> wrote:
> The OOM puzzles me a little; I'm not sure how it could be unable to
> replay a mutation that it was able to write to the commitlog in the
> first place.

Ah, I think I know: if a compaction starts during recovery, that could
suck up a bunch of memory.

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
On Wed, Aug 26, 2009 at 12:26 PM, Brian Frank
Cooper<co...@yahoo-inc.com> wrote:
>> Is the commitlog small enough that you can gzip it and attach to JIRA
>> (10 MB limit)?
>
> /var/cassandra/commitlog has 215 files totaling about 28 GB. Most are 134 MB; the last one is 6 MB. Which one would be useful to you?

Can you update to trunk and re-run recovery with the log level set to
DEBUG?  It will log the file it is replaying, like this:

DEBUG - Replaying /var/lib/cassandra/commitlog/CommitLog-1251137387800.log starting at 117

Then, just before it errors out, it will log the last entry it read, like this:

DEBUG - Reading mutation at 666
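
With 215 commitlog files, picking the relevant one out of the DEBUG output by eye is tedious. As a rough sketch only, the helper below scans captured log lines for the two message shapes quoted above and reports the file and offset of the last mutation read before the failure. The class and method names are hypothetical, and it assumes each message sits on one line in the real log (the wrapping above comes from the mail client).

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; the message formats are taken from the DEBUG lines quoted in this thread. */
class ReplayLogScanner {
    private static final Pattern REPLAYING =
            Pattern.compile("DEBUG - Replaying (\\S+) starting at (\\d+)");
    private static final Pattern READING =
            Pattern.compile("DEBUG - Reading mutation at (\\d+)");

    /** Returns "file@offset" for the last mutation read, or null if no such line was seen. */
    static String lastMutation(List<String> lines) {
        String currentFile = null;
        String last = null;
        for (String line : lines) {
            Matcher replaying = REPLAYING.matcher(line);
            if (replaying.find()) {
                currentFile = replaying.group(1); // remember which commitlog segment is being replayed
                continue;
            }
            Matcher reading = READING.matcher(line);
            if (reading.find() && currentFile != null) {
                last = currentFile + "@" + reading.group(1);
            }
        }
        return last;
    }
}
```

The file named in the result is the one worth attaching to the JIRA ticket, provided it compresses to under the 10 MB limit.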

> That's the curious thing; there were no writes in progress. In fact, my experiment had finished about 24 hours earlier and there was no load in between; I then shut down and still couldn't restart. I figured all the writes would have committed by then.

Must be a different bug, then.  Thanks for finding it for us! :)

-Jonathan

RE: Server cannot startup after shutdown

Posted by Brian Frank Cooper <co...@yahoo-inc.com>.
> Is the commitlog small enough that you can gzip it and attach to JIRA
> (10 MB limit)?

/var/cassandra/commitlog has 215 files totaling about 28 GB. Most are 134 MB; the last one is 6 MB. Which one would be useful to you?

> Right, but you can still have an incomplete write in progress if you
> shut down while writes are still happening.

That's the curious thing; there were no writes in progress. In fact, my experiment had finished about 24 hours earlier and there was no load in between; I then shut down and still couldn't restart. I figured all the writes would have committed by then.

thanks...

brian

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
On Wed, Aug 26, 2009 at 1:03 AM, Brian Frank
Cooper<co...@yahoo-inc.com> wrote:
> Hi, Jonathan,
>
> I have been trying to shut down and restart Cassandra again this morning. I still get the malformed entry bug (which, as you say below, your patch fixes). I also get:
>
> ERROR - Exception encountered during startup.
> java.lang.NegativeArraySizeException

That could be a related problem.  Or it might be a different bug. :)
Is the commitlog small enough that you can gzip it and attach to JIRA
(10 MB limit)?

> No out of memory error this time, though.
>
> I'm also curious about your comment "I introduced a regression where it couldn't handle the last entry in the commitlog being incomplete." Does an incomplete last entry in the commit log mean that the last update or set of updates was not fully committed to the log, and is therefore lost? I thought that since I had set "<CommitLogSync>true</CommitLogSync>", all updates would be fully flushed before returning to the caller.

Right, but you can still have an incomplete write in progress if you
shut down while writes are still happening.
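
For illustration, here is a minimal Java sketch of a replay loop that tolerates an incomplete final entry. It assumes a hypothetical framing of a 4-byte length followed by the payload, which is not necessarily how Cassandra lays out its commitlog, and the class and method names are invented for this example. It also shows why a length read from a half-written entry is worth validating: both the OutOfMemoryError and the NegativeArraySizeException traces in this thread point at CommitLog.java:274, which would be consistent with (though it does not prove) an array being sized from a bogus length.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical replay loop; the [int length][payload] framing is an assumption for this sketch. */
class CommitLogReplaySketch {

    /** Returns every complete entry and stops quietly at a truncated or garbled final entry. */
    static List<byte[]> replay(byte[] log) {
        List<byte[]> entries = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (in.available() > 0) {
                try {
                    int length = in.readInt(); // throws EOFException if the header itself is cut off
                    // A negative length would make `new byte[length]` throw
                    // NegativeArraySizeException; a huge one could exhaust the heap.
                    if (length < 0 || length > in.available()) {
                        break; // treat as an incomplete final entry
                    }
                    byte[] payload = new byte[length];
                    in.readFully(payload);
                    entries.add(payload);
                } catch (EOFException truncatedHeader) {
                    break; // fewer than 4 bytes left: the last write never finished
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

Calling CommitLogReplaySketch.replay(bytes) on a buffer whose last entry is cut short returns only the complete entries. A version reading from a real file would compare the length against the remaining file size rather than relying on available().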

-Jonathan

RE: Server cannot startup after shutdown

Posted by Brian Frank Cooper <co...@yahoo-inc.com>.
Hi, Jonathan,

I have been trying to shut down and restart Cassandra again this morning. I still get the malformed entry bug (which, as you say below, your patch fixes). I also get:

ERROR - Exception encountered during startup.
java.lang.NegativeArraySizeException
        at org.apache.cassandra.db.CommitLog.recover(CommitLog.java:274)
        at org.apache.cassandra.db.RecoveryManager.doRecovery(RecoveryManager.java:63)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:96)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:171)

No out of memory error this time, though.

I'm also curious about your comment "I introduced a regression where it couldn't handle the last entry in the commitlog being incomplete." Does an incomplete last entry in the commit log mean that the last update or set of updates was not fully committed to the log, and is therefore lost? I thought that since I had set "<CommitLogSync>true</CommitLogSync>", all updates would be fully flushed before returning to the caller.

(BTW thanks for all the help with setting up Cassandra, it really made it easier to run experiments...)

brian

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
Oops, my bad -- that patch has been sitting unreviewed in
CASSANDRA-370.  I thought it was in trunk by now.  I'll try to get
someone to review that today.

-Jonathan

RE: Server cannot startup after shutdown

Posted by Brian Frank Cooper <co...@yahoo-inc.com>.
I haven't had a chance to play with that yet. I'm just trying to get a bunch of data loaded so I can run some tests. Once the tests are done I will look at starting and stopping servers again.

The tokens thing helped out quite a lot.

brian

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
I saw that you're testing with different tokens now -- how did the
replay OOM work out?

RE: Server cannot startup after shutdown

Posted by Brian Frank Cooper <co...@yahoo-inc.com>.
Thanks for the reply. I'll try playing with the memory settings.

brian

Re: Server cannot startup after shutdown

Posted by Jonathan Ellis <jb...@gmail.com>.
The malformed input bug was fixed after beta1 and should be in a
nightly build by now.  (I introduced a regression where it couldn't
handle the last entry in the commitlog being incomplete.)  So after
upgrading, you should be able to restart on the existing commitlogs.

The OOM puzzles me a little; I'm not sure how it could be unable to
replay a mutation that it was able to write to the commitlog in the
first place.  You could try setting the memtable object and memory
thresholds lower temporarily and see if that leaves enough free
memory to do the replay.

-Jonathan
