You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Yoram Arnon (JIRA)" <ji...@apache.org> on 2006/03/31 01:07:47 UTC

[jira] Commented: (HADOOP-96) name server should log decisions that affect data: block creation, removal, replication

    [ http://issues.apache.org/jira/browse/HADOOP-96?page=comments#action_12372589 ] 

Yoram Arnon commented on HADOOP-96:
-----------------------------------

the plan is to add a log line for each change in the name space and each change in block placement or replication. What we get is effectively a trace of program execution for DFS changes.
the log will go to a new log object, to enable switching this (extensive) logging on or off.
name space changes will be logged at level fine, block commit changes at finer, and block pending changes at finest.
In order to facilitate tracing of multiple concurrent operations, each line will include the thread id of the name server's thread. For that we derive a logging class, that places the thread id right after the date/time.

we log in the following methods of class name node, and in methods of class nameSystem called by them:
create (startFile)
abandonFileInProgress (abandonFileInProgress )
AbandonBlock (AbandonBlock )
reportWrittenBlock (blockReceived)
addBlock (getAdditionalBlock)
Complete (completeFile)
rename (renameTo)
delete (delete)
Mkdirs (Mkdirs)
sendHeartbeat (getHeartbeat)
blockReport (processReoprt)
blockReceived (blockReceived)
errorReport
getBlockWork (pendingTransfer, blocksToInvalidate)


> name server should log decisions that affect data: block creation, removal, replication
> ---------------------------------------------------------------------------------------
>
>          Key: HADOOP-96
>          URL: http://issues.apache.org/jira/browse/HADOOP-96
>      Project: Hadoop
>         Type: Improvement
>   Components: dfs
>     Versions: 0.1
>     Reporter: Yoram Arnon
>     Priority: Critical

>
> currently, there's no way to analyze and debug DFS errors where blocks disapear.
> name server should log its decisions that affect data, including block creation, removal, replication:
> - block <b> created, assigned to datanodes A, B, ...
> - datanode A dead, block <b> underreplicated(1), replicating to datanode C
> - datanode B dead, block <b> underreplicated(2), replicating to datanode D
> - datanode A alive, block <b> overreplicated, removing from datanode D
> - block <removed> from datanodes C, D, ...
> that will enable me to track down, two weeks later, a block that's missing from a file, and to debug the name server.
> extra credit:
> - rotate log file, as it might grow large
> - make this behaviour optional/configurable

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira