You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2012/08/07 04:10:03 UTC

[jira] [Commented] (ACCUMULO-623) Data lost with hdfs write ahead log

    [ https://issues.apache.org/jira/browse/ACCUMULO-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429666#comment-13429666 ] 

Josh Elser commented on ACCUMULO-623:
-------------------------------------

Interesting, I added some messages in TabletServer to dump those variables on my instance: 'dfs.durable.sync'=false and 'dfs.support.append'=true. I have neither set in my config files.

Cloudera's site (https://ccp.cloudera.com/display/CDHDOC/Incompatible+Changes) states that starting in CDH3u4 dfs.support.append is true by default. Given that I'm on CDH3u3, apparently that's a little off or I stumbled upon tlipcon's magic pixie dust ;)
                
> Data lost with hdfs write ahead log
> -----------------------------------
>
>                 Key: ACCUMULO-623
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-623
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: MacOSX, Hadoop 1.0.3, zookeeper 3.3.3
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> I shut my machine down with Accumulo, Zookeeper, and HDFS running.  When I restarted it, Accumulo failed to recover its write ahead log because it was zero length.  I wondered if this was because I shutdown HDFS so I tried the following on my single node Accumulo instance.
>  * start HDFS and zookeeper
>  * init & start Accumulo
>  * created a table and insert some data
>  * pkill -f java
>  * restart everything
>  * Accumulo fails to start because walog is zero length
> Saw excpetions like the following
> {noformat}
> 06 18:58:44,581 [log.SortedLogRecovery] INFO : Looking at mutations from /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7 for !0;!0<<
> 06 18:58:44,590 [tabletserver.TabletServer] WARN : exception trying to assign tablet !0;!0<< /root_tablet
> java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1458)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1295)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1134)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1121)
>         at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2477)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:428)
>         at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3206)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1426)
>         ... 6 more
> Caused by: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:125)
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:89)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:426)
>         ... 8 more
> {noformat}
> When trying to run LogReader on the files, it prints nothing.  
> {noformat}
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7
> 06 19:04:37,147 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/wal/127.0.0.1+40200/def72721-5c64-4755-87cc-2e8cfc3002b7
> $ 
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira