Posted to users@kafka.apache.org by Todd Bilsborrow <tb...@rhythmnewmedia.com> on 2013/05/07 20:17:27 UTC

recover from corrupt log file?

Are there any recommended steps to take to try to recover a corrupt log file?

I'm running Kafka 0.7.0, using the Java APIs for both production and consumption. If I attempt to read a message from a certain offset using the SimpleConsumer, I get the following on the client:

java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
at kafka.utils.Utils$.read(Utils.scala:486)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
at kafka.network.Receive$class.readCompletely(Transmission.scala:57)
at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.consumer.SimpleConsumer.getResponse(SimpleConsumer.scala:184)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:98)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:88)
at kafka.javaapi.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:43)

and the following on the broker:

ERROR Closing socket for /xx.xx.xx.xx because of error (kafka.network.Processor)
java.io.IOException: Input/output error
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:405)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:506)
        at kafka.message.FileMessageSet.writeTo(FileMessageSet.scala:107)
        at kafka.server.MessageSetSend.writeTo(MessageSetSend.scala:51)
        at kafka.network.Processor.write(SocketServer.scala:332)
        at kafka.network.Processor.run(SocketServer.scala:209)
        at java.lang.Thread.run(Thread.java:662)

When I run DumpLogSegments on the file, it prints all messages up to the seemingly corrupt offset, then pauses for several seconds, then exits with the message "tail of the log is at offset: 152722143050" - which is the offset that appears to be the start of the corruption. My next log file starts at offset 153008674335, so there are a couple hundred MB (~couple million messages) that I can't access.

Just curious if there are any next "best practice" steps.

Re: recover from corrupt log file?

Posted by Jun Rao <ju...@gmail.com>.
If you hard-kill (kill -9) a broker, it will do log validation and recovery
(by truncating the segment off from the first invalid message), but only on
the last segment. If you have corruption in earlier segments, the simplest
way is to skip that segment by manually setting the consumer offset to the
offset of the next segment.
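
The skip Jun describes can be sketched in code. In 0.7, each segment file is named after the byte offset of its first message, so the offset to resume from can be read straight out of the log directory listing. This is only an illustrative sketch, not something from the thread: the file names below and the findNextSegmentOffset helper are hypothetical, and the ".kafka" extension is an assumption about the 0.7 on-disk layout.

```java
import java.util.Arrays;
import java.util.List;

public class SkipCorruptSegment {
    // Given the segment file names in a topic-partition's log directory
    // (each named after its starting byte offset, e.g.
    // "00000000153008674335.kafka") and the offset where corruption starts,
    // return the first offset of the next segment -- the offset to reset
    // the consumer to.
    static long findNextSegmentOffset(List<String> segmentFiles, long corruptOffset) {
        return segmentFiles.stream()
                .map(name -> name.replaceAll("\\.kafka$", ""))
                .mapToLong(Long::parseLong)
                .filter(start -> start > corruptOffset)
                .min()
                .orElseThrow(() -> new IllegalStateException(
                        "corruption is in the last segment; "
                        + "restart the broker and let recovery truncate it"));
    }

    public static void main(String[] args) {
        // Offsets taken from the thread; file names are made up for the example.
        List<String> files = Arrays.asList(
                "00000000152000000000.kafka",
                "00000000153008674335.kafka");
        long corruptOffset = 152722143050L;
        System.out.println(findNextSegmentOffset(files, corruptOffset)); // prints 153008674335
    }
}
```

With the offsets from this thread, the helper would return 153008674335, which you would then pass as the fetch offset in your SimpleConsumer's next request to step past the corrupt segment.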

Thanks,

Jun

