You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@accumulo.apache.org by "Keith Turner (JIRA)" <ji...@apache.org> on 2012/07/16 19:40:33 UTC

[jira] [Commented] (ACCUMULO-623) Data lost with hdfs write ahead log

    [ https://issues.apache.org/jira/browse/ACCUMULO-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415434#comment-13415434 ] 

Keith Turner commented on ACCUMULO-623:
---------------------------------------

John V and I were discussing this, one possibility is that a tserver will only start if dfs.durable.sync OR dfs.support.append is set to true.  This is kinda screwy because at some point the property dfs.support.append (which defaults to false) will go away and the property dfs.durable.sync will appear (which defaults to true).  However, I do not think there is a way to determine what a property defaults to in HAdoop, because this is just hardcoded into code that uses the prop.  So a user would need to explicitly set dfs.durable.sync to true in their config even though this is the default.  See HADOOP-8365.
                
> Data lost with hdfs write ahead log
> -----------------------------------
>
>                 Key: ACCUMULO-623
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-623
>             Project: Accumulo
>          Issue Type: Bug
>         Environment: MacOSX, Hadoop 1.0.3, zookeeper 3.3.3
>            Reporter: Keith Turner
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> I shut my machine down with Accumulo, Zookeeper, and HDFS running.  When I restarted it, Accumulo failed to recover its write ahead log because it was zero length.  I wondered if this was because I shutdown HDFS so I tried the following on my single node Accumulo instance.
>  * start HDFS and zookeeper
>  * init & start Accumulo
>  * created a table and insert some data
>  * pkill -f java
>  * restart everything
>  * Accumulo fails to start because walog is zero length
> Saw excpetions like the following
> {noformat}
> 06 18:58:44,581 [log.SortedLogRecovery] INFO : Looking at mutations from /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7 for !0;!0<<
> 06 18:58:44,590 [tabletserver.TabletServer] WARN : exception trying to assign tablet !0;!0<< /root_tablet
> java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1458)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1295)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1134)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1121)
>         at org.apache.accumulo.server.tabletserver.TabletServer$AssignmentHandler.run(TabletServer.java:2477)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:428)
>         at org.apache.accumulo.server.tabletserver.TabletServer.recover(TabletServer.java:3206)
>         at org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1426)
>         ... 6 more
> Caused by: java.lang.RuntimeException: Unable to read log entries
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.findLastStartToFinish(SortedLogRecovery.java:125)
>         at org.apache.accumulo.server.tabletserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:89)
>         at org.apache.accumulo.server.tabletserver.log.TabletServerLogger.recover(TabletServerLogger.java:426)
>         ... 8 more
> {noformat}
> When trying to run LogReader on the files, it prints nothing.  
> {noformat}
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/recovery/def72721-5c64-4755-87cc-2e8cfc3002b7
> 06 19:04:37,147 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> $ ./bin/accumulo org.apache.accumulo.server.logger.LogReader /accumulo/wal/127.0.0.1+40200/def72721-5c64-4755-87cc-2e8cfc3002b7
> $ 
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira