Posted to common-dev@hadoop.apache.org by "sam rash (JIRA)" <ji...@apache.org> on 2008/09/16 03:47:44 UTC

[jira] Created: (HADOOP-4180) DataNode exceptions reading local disk

DataNode exceptions reading local disk
--------------------------------------

                 Key: HADOOP-4180
                 URL: https://issues.apache.org/jira/browse/HADOOP-4180
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.16.4
            Reporter: sam rash


We see hundreds of WARN-level exceptions per day indicating errors while trying to read local blocks.  When this occurs, I've checked the affected box's dfs.data.dir, and the block is not present.  Below is a relevant log snippet for the missing block.  It *looks* like the DataNode deletes the block and then tries to read it again later.

NOTE: the block appears to belong to the job jar file, since up to 8 hosts log this exception for a single block and our data replication factor is only 3.
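A quick way to see the delete-then-read ordering is to pull every line mentioning the suspect block out of the DataNode log; since the log lines are written in timestamp order, the extracted lines form the block's lifecycle.  A minimal sketch (the helper name is mine; the sample lines are copied from the snippet below, and in practice you would read the real DataNode log file):

```python
# Minimal sketch: extract one block's lifecycle from a DataNode log.
# SAMPLE_LOG holds lines copied from the log snippet in this report.
SAMPLE_LOG = """\
2008-09-16 00:27:05,232 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010
2008-09-16 00:27:06,987 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010 of size 36431585
2008-09-16 00:27:11,844 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4689026510986275010 file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-4689026510986275010
2008-09-16 00:28:10,537 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.33:
"""

def block_lifecycle(log_text, block_id):
    """Return all log lines mentioning block_id, in timestamp order."""
    return [line for line in log_text.splitlines() if block_id in line]

for line in block_lifecycle(SAMPLE_LOG, "blk_-4689026510986275010"):
    print(line)
```

On the sample above this prints the Receiving/Received lines, then the Deleting line, then the "Got exception while serving" line, which is the ordering described in the report.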


2008-09-16 00:27:05,232 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010
2008-09-16 00:27:06,987 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010 of size 36431585
2008-09-16 00:27:08,440 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4002794066491997463 src: /10.16.160.69:35567 dest: /10.16.160.69:50010
2008-09-16 00:27:08,830 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete block blk_7614043800226105600. Block not found in blockMap. 
2008-09-16 00:27:08,838 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:27:10,251 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4002794066491997463 src: /10.16.160.69:35567 dest: /10.16.160.69:50010 of size 36431585
2008-09-16 00:27:11,844 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4689026510986275010 file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-4689026510986275010
2008-09-16 00:27:15,819 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-5880042835256853020 src: /10.16.160.108:50565 dest: /10.16.160.108:50010
2008-09-16 00:27:17,623 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-5880042835256853020 src: /10.16.160.108:50565 dest: /10.16.160.108:50010 of size 36431585
2008-09-16 00:27:17,846 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4002794066491997463 file /local/data/hadoop/d1/dfs/data/current/subdir17/subdir50/blk_-4002794066491997463
2008-09-16 00:27:23,848 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-5880042835256853020 file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-5880042835256853020
2008-09-16 00:27:43,661 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-2865745407489352337 to /10.16.160.49
2008-09-16 00:27:49,435 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_6570557978733593293 to /10.16.160.49
2008-09-16 00:27:53,676 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_8435126910829255466 to /10.16.160.49
2008-09-16 00:27:54,541 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-4299511117827744453 to /10.16.160.49
2008-09-16 00:28:00,441 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_3186947853142559620 to /10.16.160.49
2008-09-16 00:28:05,229 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-6450608467879552280 to /10.16.160.49
2008-09-16 00:28:09,301 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_6030697901457602946 to /10.16.160.113
2008-09-16 00:28:10,470 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_4663928918404305287 to /10.16.160.36
2008-09-16 00:28:10,537 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.33:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:10,538 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.108:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.49:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4002794066491997463 to /10.16.160.79:
java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:50,873 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete block blk_-4689026510986275010. Block not found in blockMap. 

2008-09-16 00:28:50,880 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4180) DataNode exceptions reading local disk

Posted by "sam rash (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash updated HADOOP-4180:
-----------------------------

    Description: 
We see hundreds of WARN-level exceptions per day indicating errors while trying to read local blocks.  When this occurs, I've checked the affected box's dfs.data.dir, and the block is not present.  Below is a relevant log snippet for the missing block.  It *looks* like the DataNode deletes the block and then tries to read it again later.

NOTE: the block appears to belong to the job jar file, since up to 8 hosts log this exception for a single block and our data replication factor is only 3.


  was:
We get 100s of exceptions at WARN level per day indicating errors while trying to read local blocks.  When this occurs, I've checked on the local box's dfs.data.dir and the block is not present.  Here is a relevant snippet from the logs regarding the missing block.  It *looks* like the DataNode deletes the block and then tries to read it again later.

NOTE: this is for the jar file as up to 8 hosts have this exception for one block and our data repl factor is only 3.








[jira] Commented: (HADOOP-4180) DataNode exceptions reading local disk

Posted by "sam rash (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631508#action_12631508 ] 

sam rash commented on HADOOP-4180:
----------------------------------

I correlated this with an error at job submission.  This is another exception we see client-side; it now occurs more consistently and correlates with a higher frequency of the "Got exception while serving blk_..." errors on the DataNodes:


WARN 2008-09-16 18:52:41,977 DataStreamer for file /mapred/system/job_200809052004_4597/job.jar block blk_-7411787837747554790 org.apache.hadoop.dfs.DFSClient DataStreamer Exception: java.net.SocketException: Broken pipe
WARN 2008-09-16 18:52:41,977 DataStreamer for file /mapred/system/job_200809052004_4597/job.jar block blk_-7411787837747554790 org.apache.hadoop.dfs.DFSClient Error Recovery for block blk_-7411787837747554790 bad datanode[0] 10.16.160.81:50010
WARN 2008-09-16 18:52:41,978 DataStreamer for file /mapred/system/job_200809052004_4597/job.jar block blk_-7411787837747554790 org.apache.hadoop.dfs.DFSClient Error Recovery for block blk_-7411787837747554790 in pipeline 10.16.160.81:50010, 10.16.160.47:50010, 10.16.160.48:50010: bad datanode 10.16.160.81:50010

I then looked in the server logs and in fact the exception existed:

2008-09-16 18:54:07,310 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-7411787837747554790 src: /10.16.160.102:36057 dest: /10.16.160.102:50010
2008-09-16 18:54:09,062 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-7411787837747554790 src: /10.16.160.102:36057 dest: /10.16.160.102:50010 of size 36564037
2008-09-16 18:54:14,388 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-7411787837747554790 file /local/data/hadoop/d1/dfs/data/current/subdir36/subdir14/blk_-7411787837747554790
...
2008-09-16 18:54:49,955 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.98:50010:Got exception while serving blk_-7411787837747554790 to /10.16.160.104:
java.io.IOException: Block blk_-7411787837747554790 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)
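
To make this client/server correlation systematic rather than eyeballing it, one could extract the block IDs from the client-side DFSClient warnings and check each against the server's "Deleting block" lines.  A rough sketch, assuming only that block IDs match the blk_<number> pattern (the sample lines are copied from the excerpts in this comment):

```python
import re

# Rough sketch: flag client-visible blocks that the server log shows as deleted.
CLIENT_LOG = """\
WARN 2008-09-16 18:52:41,977 DataStreamer for file /mapred/system/job_200809052004_4597/job.jar block blk_-7411787837747554790 org.apache.hadoop.dfs.DFSClient DataStreamer Exception: java.net.SocketException: Broken pipe
"""
SERVER_LOG = """\
2008-09-16 18:54:07,310 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-7411787837747554790 src: /10.16.160.102:36057 dest: /10.16.160.102:50010
2008-09-16 18:54:14,388 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-7411787837747554790 file /local/data/hadoop/d1/dfs/data/current/subdir36/subdir14/blk_-7411787837747554790
"""

BLOCK_ID = re.compile(r"blk_-?\d+")

# Blocks mentioned in client-side failures.
client_blocks = set(BLOCK_ID.findall(CLIENT_LOG))

# Blocks the DataNode reports deleting (the set dedupes the ID that
# appears twice per line, once in the message and once in the file path).
deleted_blocks = {m for line in SERVER_LOG.splitlines()
                  if "Deleting block" in line
                  for m in BLOCK_ID.findall(line)}

for blk in sorted(client_blocks & deleted_blocks):
    print(blk, "failed client-side and was deleted server-side")
```

Run over the full logs rather than these excerpts, the intersection would list every block exhibiting the pattern in this issue.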




[jira] Commented: (HADOOP-4180) DataNode exceptions reading local disk

Posted by "sam rash (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631516#action_12631516 ] 

sam rash commented on HADOOP-4180:
----------------------------------

(moved stack trace from server into comment)

2008-09-16 00:27:05,232 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010
2008-09-16 00:27:06,987 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4689026510986275010 src: /10.16.160.78:40905 dest: /10.16.160.78:50010 of size 36431585
2008-09-16 00:27:08,440 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-4002794066491997463 src: /10.16.160.69:35567 dest: /10.16.160.69:50010
2008-09-16 00:27:08,830 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete block blk_7614043800226105600. Block not found in blockMap.
2008-09-16 00:27:08,838 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:27:10,251 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-4002794066491997463 src: /10.16.160.69:35567 dest: /10.16.160.69:50010 of size 36431585
2008-09-16 00:27:11,844 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4689026510986275010 file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-4689026510986275010
2008-09-16 00:27:15,819 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-5880042835256853020 src: /10.16.160.108:50565 dest: /10.16.160.108:50010
2008-09-16 00:27:17,623 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-5880042835256853020 src: /10.16.160.108:50565 dest: /10.16.160.108:50010 of size 36431585
2008-09-16 00:27:17,846 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-4002794066491997463 file /local/data/hadoop/d1/dfs/data/current/subdir17/subdir50/blk_-4002794066491997463
2008-09-16 00:27:23,848 INFO org.apache.hadoop.dfs.DataNode: Deleting block blk_-5880042835256853020 file /local/data/hadoop/d0/dfs/data/current/subdir33/subdir38/blk_-5880042835256853020
2008-09-16 00:27:43,661 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-2865745407489352337 to /10.16.160.49
2008-09-16 00:27:49,435 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_6570557978733593293 to /10.16.160.49
2008-09-16 00:27:53,676 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_8435126910829255466 to /10.16.160.49
2008-09-16 00:27:54,541 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-4299511117827744453 to /10.16.160.49
2008-09-16 00:28:00,441 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_3186947853142559620 to /10.16.160.49
2008-09-16 00:28:05,229 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_-6450608467879552280 to /10.16.160.49
2008-09-16 00:28:09,301 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_6030697901457602946 to /10.16.160.113
2008-09-16 00:28:10,470 INFO org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010 Served block blk_4663928918404305287 to /10.16.160.36
2008-09-16 00:28:10,537 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.33:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:10,538 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.108:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:11,255 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4689026510986275010 to /10.16.160.49:
java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:12,973 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4689026510986275010 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 WARN org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:Got exception while serving blk_-4002794066491997463 to /10.16.160.79:
java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:14,647 ERROR org.apache.hadoop.dfs.DataNode: 10.16.160.49:50010:DataXceiver: java.io.IOException: Block blk_-4002794066491997463 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:567)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1466)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:992)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:930)
        at java.lang.Thread.run(Thread.java:619)

2008-09-16 00:28:50,873 WARN org.apache.hadoop.dfs.DataNode: Unexpected error trying to delete block blk_-4689026510986275010. Block not found in blockMap.

2008-09-16 00:28:50,880 WARN org.apache.hadoop.dfs.DataNode: java.io.IOException: Error in deleting blocks.
        at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:859)
        at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:718)
        at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:597)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2431)
        at java.lang.Thread.run(Thread.java:619)
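The timeline in this log reduces to a few operations on the DataNode's block map: the block is received, deleted at 00:27:11, read attempts fail with "is not valid" around 00:28:10, and a second delete command at 00:28:50 fails with "Block not found in blockMap". The toy model below is hypothetical (it is not the actual FSDataset code, just a stand-in map keyed by block id) but reproduces both failure messages in that order:

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BlockMapRace {
    // Toy stand-in for FSDataset's blockMap: block id -> block length.
    private final Map<Long, Long> blockMap = new ConcurrentHashMap<>();

    void received(long blockId, long length) {
        blockMap.put(blockId, length);
    }

    // Models FSDataset.invalidate(): fails if the block is already gone.
    void invalidate(long blockId) throws IOException {
        if (blockMap.remove(blockId) == null) {
            throw new IOException("Unexpected error trying to delete block blk_"
                    + blockId + ". Block not found in blockMap.");
        }
    }

    // Models FSDataset.getLength(): fails for a block already deleted.
    long getLength(long blockId) throws IOException {
        Long len = blockMap.get(blockId);
        if (len == null) {
            throw new IOException("Block blk_" + blockId + " is not valid.");
        }
        return len;
    }

    public static void main(String[] args) throws IOException {
        BlockMapRace ds = new BlockMapRace();
        long blk = -4689026510986275010L;
        ds.received(blk, 36431585L);   // 00:27:06 Received block
        ds.invalidate(blk);            // 00:27:11 Deleting block
        try {
            ds.getLength(blk);         // 00:28:10 read attempt after deletion
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        try {
            ds.invalidate(blk);        // 00:28:50 second delete of the same block
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the namenode schedules both a read and a repeat invalidation after the block has already been removed, this is exactly the WARN/ERROR pattern seen above.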



[jira] Commented: (HADOOP-4180) DataNode exceptions reading local disk

Posted by "sam rash (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631229#action_12631229 ] 

sam rash commented on HADOOP-4180:
----------------------------------

sorry, the block of interest is

blk_-4689026510986275010

