Posted to common-user@hadoop.apache.org by Brian Bockelman <bb...@math.unl.edu> on 2008/12/16 19:22:31 UTC

Hit a roadbump in solving truncated block issue

Hey,

I hit a bit of a roadbump in solving the "truncated block issue" at
our site: namely, some of the blocks appear perfectly valid to the
datanode.  The block verifies, but it is still the wrong size (it
appears the metadata file is too small as well).

What's the best way to proceed?  It appears that either (a) the block
scanner needs to report to the datanode the size of the block it just
verified, which is possibly a scaling issue, or (b) the metadata file
needs to save the correct block size, which is a pretty major
modification, as it requires a change to the on-disk format.
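
To make (a) concrete, I was picturing something roughly like the
following inside the scanner's verification loop (the names here are
made up for illustration and are not the actual DataBlockScanner /
FSDataset API):

    // after the checksums verify, push the actual on-disk length up to
    // the datanode (and ultimately the namenode), so a short replica
    // can be caught even though it verifies cleanly
    long verifiedLen = dataset.getLength(block);       // on-disk length
    datanode.notifyBlockVerified(block, verifiedLen);  // hypothetical call

Reporting every verification upward is the part that worries me from a
scaling point of view.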

Ideas?

Brian

Re: Hit a roadbump in solving truncated block issue

Posted by Brian Bockelman <bb...@cse.unl.edu>.
On Dec 16, 2008, at 4:10 PM, Raghu Angadi wrote:

> Brian Bockelman wrote:
>> Hey,
>> I hit a bit of a roadbump in solving the "truncated block issue" at  
>> our site: namely, some of the blocks appear perfectly valid to the  
>> datanode.  The block verifies, but it is still the wrong size (it  
>> appears that the metadata is too small too).
>> What's the best way to proceed?  It appears that either (a) the  
>> block scanner needs to report to the datanode the size of the block  
>> it just verified, which is possibly a scaling issue or (b) the  
>> metadata file needs to save the correct block size, which is a  
>> pretty major modification, as it requires a change of the on-disk  
>> format.
>
> This should be detected by the NameNode, i.e. it should detect that
> this replica is shorter (either compared to other replicas or to the
> expected size). There are various fixes (recent or being worked on)
> in this area of the NameNode, and it is mostly covered by one of
> those, or should be soon.

Do you know which JIRA tickets I can ask my admins to follow?  We'd  
like to test these out as soon as the fixes are reasonably stable.   
Right now, we've increased # of replicas to 3, but can't maintain this  
level of replication forever.
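
(For the record, the bump was done with something along the lines of
"hadoop fs -setrep -R 3 /", plus dfs.replication=3 in the config so
that newly written files pick up the higher default.)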

I forgot to give numbers: out of ~200 remaining blocks with this
"truncated" issue, having the block scanner verify blocks after a
failed transfer due to "inconsistent size" solved about 150 of the
issues.  The remaining 50 appear to be due to the issue described above.
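
(If I remember the URL right, each datanode's scanner status can also
be spot-checked via its web UI at
http://<datanode>:50075/blockScannerReport, which makes it easier to
see which blocks were recently verified.)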

Thank you very much Raghu; the Hadoop team has been quick in solving  
these issues.

Brian


Re: Hit a roadbump in solving truncated block issue

Posted by Garhan Attebury <at...@cse.unl.edu>.
Actually we do have the namenode logs for the period Brian mentioned.  
In Brian's email, he shows the log entries on node191 corresponding to  
it storing the third (new) replica of the block in question. The  
namenode log from that period shows:


2008-12-12 08:53:02,637 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 172.16.1.121:50010 to replicate blk_7345861444716855534_7201 to datanode(s) 172.16.1.191:50010
2008-12-12 08:53:17,127 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 172.16.1.191:50010 is added to blk_7345861444716855534_7201 size 134217728

As you can see, node191 correctly reported that it received the block
with size 0, yet the namenode recorded node191 as holding the block
with size 134217728.


Looking at more of the namenode logs, I found another instance where
the namenode again replicated the block because a node went down, and,
just as we see here with node191, the new datanode stored the block
with size 0 while the namenode seemed to think everything was correct.
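
(For anyone who wants to pull the same history out of their own logs,
something along the lines of

  grep blk_7345861444716855534 hadoop-*-namenode-*.log

will show every state change the namenode logged for a given block;
the exact log file names will of course depend on your setup.)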



- Garhan Attebury


On Dec 19, 2008, at 11:57 AM, Brian Bockelman wrote:

> Hey Raghu,
>
> I never heard back from you about whether any of these fixes are  
> ready to try out.  Things are getting kind of bad here.
>
> Even at three replicas, I found one block which has all three  
> replicas of length=0.  Grepping through the logs, I get things like  
> this:
>
> 2008-12-18 22:45:04,680 WARN  
> org.apache.hadoop.hdfs.server.datanode.DataNode:  
> DatanodeRegistration(172.16.1.121:50010,  
> storageID=DS-1732140560-172.16.1.121-50010-1228236234012,  
> infoPort=50075, ipcPort=50020):Got exception while serving  
> blk_7345861444716855534_7201 to /172.16.1.1:
> java.io.IOException:  Offset 35307520 and length 10485760 don't  
> match block blk_7345861444716855534_7201 ( blockLen 0 )
> java.io.IOException:  Offset 35307520 and length 10485760 don't  
> match block blk_7345861444716855534_7201 ( blockLen 0 )
>
> On the other hand, if I look for the block scanner activity:
>
> 2008-12-08 13:59:15,616 INFO  
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:  
> Verification succeeded for blk_7345861444716855534_7201
>
> There is indeed a zero-sized file on disk and matching *correct*  
> metadata:
>
> [root@node121 ~]# find /hadoop-data/ -name *7345861444716855534* -exec ls -lh {} \;
> -rw-r--r--  1 root root 7 Dec  3 15:44 /hadoop-data/dfs/data/current/subdir9/subdir6/blk_7345861444716855534_7201.meta
> -rw-r--r--  1 root root 0 Dec  3 15:44 /hadoop-data/dfs/data/current/subdir9/subdir6/blk_7345861444716855534
>
> The metadata matches the 0-sized block, not the full one, of course.
>
> We recently went from 2 replicas to 3 replicas on Dec 11.  On Dec  
> 12, a replica was created on node191:
>
> [root@node191 ~]# find /hadoop-data/ -name *7345861444716855534* -exec ls -lh {} \;
> -rw-r--r--  1 root root 7 Dec 12 08:53 /hadoop-data/dfs/data/current/subdir40/subdir37/subdir42/blk_7345861444716855534_7201.meta
> -rw-r--r--  1 root root 0 Dec 12 08:53 /hadoop-data/dfs/data/current/subdir40/subdir37/subdir42/blk_7345861444716855534
>
> The corresponding log entries are here:
>
> 2008-12-12 08:53:09,014 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /172.16.1.191:50010
> 2008-12-12 08:53:17,134 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /172.16.1.191:50010 of size 0
>
> So, the incorrectly-sized block had a new copy created, the datanode  
> reported the incorrect size (!), and the namenode never deleted it  
> afterward.  I unfortunately don't have the namenode logs from this  
> period.
>
> Brian


Re: Hit a roadbump in solving truncated block issue

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Raghu,

I never heard back from you about whether any of these fixes are ready  
to try out.  Things are getting kind of bad here.

Even at three replicas, I found one block which has all three replicas  
of length=0.  Grepping through the logs, I get things like this:

2008-12-18 22:45:04,680 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(172.16.1.121:50010, storageID=DS-1732140560-172.16.1.121-50010-1228236234012, infoPort=50075, ipcPort=50020):Got exception while serving blk_7345861444716855534_7201 to /172.16.1.1:
java.io.IOException:  Offset 35307520 and length 10485760 don't match block blk_7345861444716855534_7201 ( blockLen 0 )
java.io.IOException:  Offset 35307520 and length 10485760 don't match block blk_7345861444716855534_7201 ( blockLen 0 )

On the other hand, if I look for the block scanner activity:

2008-12-08 13:59:15,616 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_7345861444716855534_7201

There is indeed a zero-sized file on disk and matching *correct*  
metadata:

[root@node121 ~]# find /hadoop-data/ -name *7345861444716855534* -exec ls -lh {} \;
-rw-r--r--  1 root root 7 Dec  3 15:44 /hadoop-data/dfs/data/current/subdir9/subdir6/blk_7345861444716855534_7201.meta
-rw-r--r--  1 root root 0 Dec  3 15:44 /hadoop-data/dfs/data/current/subdir9/subdir6/blk_7345861444716855534

The metadata matches the 0-sized block, not the full one, of course.
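
(As a sanity check on the sizes: if I understand the on-disk format
correctly, a .meta file is a 7-byte header plus one 4-byte CRC-32 per
checksum chunk (io.bytes.per.checksum, 512 by default), so a
zero-length block gets exactly the 7-byte .meta we see here, while a
full 134217728-byte block should have 7 + (134217728 / 512) * 4 =
1048583 bytes of metadata.)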

We recently went from 2 replicas to 3 replicas on Dec 11.  On Dec 12,  
a replica was created on node191:

[root@node191 ~]# find /hadoop-data/ -name *7345861444716855534* -exec ls -lh {} \;
-rw-r--r--  1 root root 7 Dec 12 08:53 /hadoop-data/dfs/data/current/subdir40/subdir37/subdir42/blk_7345861444716855534_7201.meta
-rw-r--r--  1 root root 0 Dec 12 08:53 /hadoop-data/dfs/data/current/subdir40/subdir37/subdir42/blk_7345861444716855534

The corresponding log entries are here:

2008-12-12 08:53:09,014 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /172.16.1.191:50010
2008-12-12 08:53:17,134 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_7345861444716855534_7201 src: /172.16.1.121:47799 dest: /172.16.1.191:50010 of size 0

So, the incorrectly-sized block had a new copy created, the datanode  
reported the incorrect size (!), and the namenode never deleted it  
afterward.  I unfortunately don't have the namenode logs from this  
period.
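
What I naively expected the namenode to do with that "block received"
report is something along these lines (pseudo-code only; I am guessing
at the logic, and these are not the real FSNamesystem names):

    // when a datanode reports a newly stored block, hypothetically:
    long reported = reportedBlock.getNumBytes();   // 0 in our case
    long expected = storedBlock.getNumBytes();     // 134217728
    if (expected > 0 && reported < expected) {
        // treat the new copy as a short/corrupt replica rather than
        // counting it toward the replication factor
        markReplicaAsCorrupt(storedBlock, reportingNode);  // hypothetical
    }

Instead, the zero-length copy on node191 seems to have been kept as if
it were a healthy replica.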

Brian

On Dec 16, 2008, at 4:10 PM, Raghu Angadi wrote:

> Brian Bockelman wrote:
>> Hey,
>> I hit a bit of a roadbump in solving the "truncated block issue" at  
>> our site: namely, some of the blocks appear perfectly valid to the  
>> datanode.  The block verifies, but it is still the wrong size (it  
>> appears that the metadata is too small too).
>> What's the best way to proceed?  It appears that either (a) the  
>> block scanner needs to report to the datanode the size of the block  
>> it just verified, which is possibly a scaling issue or (b) the  
>> metadata file needs to save the correct block size, which is a  
>> pretty major modification, as it requires a change of the on-disk  
>> format.
>
> This should be detected by the NameNode, i.e. it should detect that
> this replica is shorter (either compared to other replicas or to the
> expected size). There are various fixes (recent or being worked on)
> in this area of the NameNode, and it is mostly covered by one of
> those, or should be soon.
>
> Raghu.
>
>> Ideas?
>> Brian


Re: Hit a roadbump in solving truncated block issue

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Brian Bockelman wrote:
> Hey,
> 
> I hit a bit of a roadbump in solving the "truncated block issue" at our 
> site: namely, some of the blocks appear perfectly valid to the 
> datanode.  The block verifies, but it is still the wrong size (it 
> appears that the metadata is too small too).
> 
> What's the best way to proceed?  It appears that either (a) the block 
> scanner needs to report to the datanode the size of the block it just 
> verified, which is possibly a scaling issue or (b) the metadata file 
> needs to save the correct block size, which is a pretty major 
> modification, as it requires a change of the on-disk format.

This should be detected by the NameNode, i.e. it should detect that this 
replica is shorter (either compared to other replicas or to the expected 
size). There are various fixes (recent or being worked on) in this area 
of the NameNode, and it is mostly covered by one of those, or should be 
soon.

Raghu.

> Ideas?
> 
> Brian