Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/03/27 23:14:30 UTC

[jira] Updated: (HADOOP-107) Namenode errors "Failed to complete filename.crc because dir.getFile()==null and null"

     [ http://issues.apache.org/jira/browse/HADOOP-107?page=all ]

Doug Cutting updated HADOOP-107:
--------------------------------

    Attachment: writeLocal.patch

When a file is opened, a connection to a datanode is also opened for its checksum file.  Lots of data is then written to the main file but only a little to the parallel checksum file, so the checksum connection might not be touched for up to a minute.

The last block of every file (checksum & main) is tee'd to a temporary local file, so that if the network connection dies the block can be re-transmitted to another datanode.
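
To make the tee'ing idea concrete, here is a minimal sketch (not the actual DFSClient code; the class name and structure are hypothetical): each write goes to a local backup copy of the current block as well as to the datanode stream, so the block survives a dropped connection.

    // Hypothetical sketch: tee block writes to a local temp file and to the datanode.
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    class TeeBlockOutputStream extends OutputStream {
      private final OutputStream datanode;   // stream to the remote datanode
      private final FileOutputStream backup; // local temporary copy of the block
      private final File backupFile;

      TeeBlockOutputStream(OutputStream datanode, File backupFile) throws IOException {
        this.datanode = datanode;
        this.backupFile = backupFile;
        this.backup = new FileOutputStream(backupFile);
      }

      @Override
      public void write(int b) throws IOException {
        backup.write(b);     // keep the local copy first
        datanode.write(b);   // then forward to the datanode
      }

      @Override
      public void write(byte[] buf, int off, int len) throws IOException {
        backup.write(buf, off, len);
        datanode.write(buf, off, len);
      }

      @Override
      public void close() throws IOException {
        backup.close();
        datanode.close();
        backupFile.delete(); // block fully transmitted; local copy no longer needed
      }
    }

If the datanode connection fails mid-block, the backup file still holds everything written so far and can be replayed to another datanode.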

This patch changes things so that a connection to a datanode is not initiated until a block is complete.  All writes initially go to the local temporary file and are copied to a datanode only once the block is finished.
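
Roughly, the patched behaviour looks like the following sketch (again hypothetical names, not the patch itself): writes are buffered in a local temp file, and the datanode connection is opened only when the block is complete.

    // Hypothetical sketch: buffer a block locally, connect to a datanode only when it is complete.
    import java.io.*;
    import java.net.Socket;

    class BufferedBlockWriter {
      private final File blockFile;
      private final OutputStream local;

      BufferedBlockWriter(File tmpDir) throws IOException {
        this.blockFile = File.createTempFile("block", ".tmp", tmpDir);
        this.local = new BufferedOutputStream(new FileOutputStream(blockFile));
      }

      void write(byte[] buf, int off, int len) throws IOException {
        local.write(buf, off, len);            // no datanode involved yet
      }

      // Called when the block is full or the file is closed.
      void flushToDatanode(String host, int port) throws IOException {
        local.close();
        try (Socket s = new Socket(host, port);
             OutputStream out = s.getOutputStream();
             InputStream in = new FileInputStream(blockFile)) {
          byte[] buf = new byte[4096];
          int n;
          while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);              // ship the completed block in one pass
          }
        } finally {
          blockFile.delete();                  // local copy no longer needed
        }
      }
    }

With this arrangement the rarely-written checksum file never leaves a datanode connection sitting idle, since the connection only exists for the duration of the block transfer.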

> Namenode errors "Failed to complete filename.crc  because dir.getFile()==null and null"
> ---------------------------------------------------------------------------------------
>
>          Key: HADOOP-107
>          URL: http://issues.apache.org/jira/browse/HADOOP-107
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>  Environment: Linux
>     Reporter: Igor Bolotin
>  Attachments: writeLocal.patch
>
> We're getting lot of these errors and here is what I see in namenode log: 
> 060327 002016 Removing lease [Lease.  Holder: DFSClient_1897466025, heldlocks: 0, pendingcreates: 0], leases remaining: 1
> 060327 002523 Block report from member2.local:50010: 91895 blocks.
> 060327 003238 Block report from member1.local:50010: 91895 blocks.
> 060327 005830 Failed to complete /feedback/.feedback_10.1.10.102-33877.log.crc  because dir.getFile()==null and null
> 060327 005830 Server handler 1 on 50000 call error: java.io.IOException: Could not complete write to file /feedback/.feedback_10.1.10.102-33877.log.crc by DFSClient_1897466025
> java.io.IOException: Could not complete write to file /feedback/.feedback_10.1.10.102-33877.log.crc by DFSClient_1897466025
>         at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:205)
>         at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
> I can't be 100% sure, but it looks like these errors happen with checksum files for very small data files. 
