Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2008/03/17 21:12:24 UTC

[jira] Created: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Datanode fails write to DFS file with exception message "Trying to change block file offset"
--------------------------------------------------------------------------------------------

                 Key: HADOOP-3033
                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.16.1
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur


A write to a DFS block failed with the last datanode in the pipeline reporting this error:

Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
PacketResponder 0 for block blk_-7279084187433655573 terminating
writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579735#action_12579735 ] 

Raghu Angadi commented on HADOOP-3033:
--------------------------------------

OK, I still don't see how it explains the error. I was trying to see a sequence of events that leads to this particular error... such a sequence would clarify things better, I think.

As I see it, irrespective of the value of 'count', 'out' is flushed completely before setChannelPosition() is called. I will chat with you to see how this happens.

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580327#action_12580327 ] 

Hudson commented on HADOOP-3033:
--------------------------------

Integrated in Hadoop-trunk #433 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/433/])

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.16.2
>
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579726#action_12579726 ] 

dhruba borthakur commented on HADOOP-3033:
------------------------------------------

I think the value of count fetched by the BlockResponder thread is stale, especially because 'count' is not volatile and no locking is enforced while accessing this variable. The value is probably being fetched from the CPU's cache, and since no locking is enforced, the reading thread gets a stale value.
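
As an illustration, here is a minimal sketch (hypothetical class, not the actual DataNode source) of how an unsynchronized, non-volatile counter written by one thread can be read stale by another thread:

{code}
// Sketch only: 'count' is neither volatile nor guarded by a lock, so the Java
// memory model does not guarantee that the reader sees the writer's latest value.
class StaleCountSketch {
  private int count = 0;

  void onPacket(byte[] packet) {   // writer thread: buffers bytes, bumps count
    count += packet.length;
  }

  int snapshotCount() {            // reader thread: may observe a stale, cached value
    return count;
  }
}
{code}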



> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579630#action_12579630 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3033:
------------------------------------------------

- DFSBufferedOutputStream.out and DFSBufferedOutputStream.flush() do not change anything, since DFSBufferedOutputStream.out is never used and DFSBufferedOutputStream.flush() is already synchronized via super.flush().

- Putting the access to DFSBufferedOutputStream.count inside a synchronized method should be useful, since count may be updated concurrently in another synchronized method (see the sketch below).
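
A sketch of that suggestion (illustrative names, not the exact Hadoop 0.16 source): read {{count}} under the same monitor that BufferedOutputStream.write()/flush() already hold, so the reader always sees an up-to-date value.

{code}
import java.io.BufferedOutputStream;
import java.io.OutputStream;

// Sketch only: write(), flush() and this getter all synchronize on 'this'.
class SyncCountBufferedOutputStream extends BufferedOutputStream {
  SyncCountBufferedOutputStream(OutputStream out, int size) {
    super(out, size);
  }

  synchronized int bufferedByteCount() {
    return count;   // protected field inherited from BufferedOutputStream
  }
}
{code}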

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3033:
-------------------------------------

    Status: Patch Available  (was: Open)

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3033:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.16.2
           Status: Resolved  (was: Patch Available)

Incorporated Nicholas' review comments about debug log messages.
I just committed this.

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.16.2
>
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3033:
-------------------------------------

    Attachment: badnode2.patch

It is possible that the output stream created using RandomAccessFile caches data in some underlying layer. Here is a patch that incorporates Raghu's tip that it is safe to make the BlockReceiver thread invoke the flush too. This patch works!
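
Roughly, the idea looks like the following sketch (illustrative method and field names, not the actual patch contents):

{code}
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Sketch: the receiving thread flushes its own buffered stream before
// repositioning the underlying channel, so no bytes are left sitting in a
// buffer when the file size is compared against the requested offset.
class FlushBeforeSeekSketch {
  private final FileOutputStream fileOut;
  private final BufferedOutputStream out;

  FlushBeforeSeekSketch(FileOutputStream fileOut, int bufSize) {
    this.fileOut = fileOut;
    this.out = new BufferedOutputStream(fileOut, bufSize);
  }

  void setBlockPosition(long offsetInBlock) throws IOException {
    out.flush();                                  // push any buffered bytes to disk first
    FileChannel channel = fileOut.getChannel();
    if (channel.size() < offsetInBlock) {         // mirrors the "actual size of file" check
      throw new IOException("Trying to change block file offset to " + offsetInBlock
          + " but actual size of file is " + channel.size());
    }
    channel.position(offsetInBlock);              // now safe to reposition
  }
}
{code}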

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579725#action_12579725 ] 

Raghu Angadi commented on HADOOP-3033:
--------------------------------------

I don't think it explains the logs above. There is an {{out.flush()}} right before setChannelPosition(), so there is no buffered data and count is zero before setChannelPosition(). setChannelPosition() and out.write() are called by the same thread, right?

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579714#action_12579714 ] 

Raghu Angadi commented on HADOOP-3033:
--------------------------------------


Earlier in the same log file for another block :

{noformat}
2008-03-17 20:03:06,945 INFO org.apache.hadoop.dfs.DataNode: Changing block file offset of block blk_-8611433653592759099 from 84869120 to 84934656 meta file offset to 663559
2008-03-17 20:03:08,128 INFO org.apache.hadoop.dfs.DataNode: Changing block file offset of block blk_-8611433653592759099 from 114294784 to 114360320 meta file offset to 893447
2008-03-17 20:03:09,057 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-8611433653592759099 of size 134217728 from /a.b.c.79
{noformat}

This is strange too. The BlockReceiver thread calls {{out.write()}}, then the above 'Changing ...' message is logged, but the check inside setChannelPosition() did not result in an exception. So something changed the channel's position in the meantime (during this, there were no writes to 'out').

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579723#action_12579723 ] 

dhruba borthakur commented on HADOOP-3033:
------------------------------------------

I think I have a possible explanation of this behaviour... it is a race between a flush and setChannelPosition. BufferedOutputStream.flush() and BufferedOutputStream.write() are mutually exclusive, but there is no guarantee of any mutual exclusion between a BufferedOutputStream.flush() and setChannelPosition().

The BlockReceiver thread must be executing a getChannelPosition/setChannelPosition while processing the current packet. Meanwhile, the previous packet that was sitting in the buffer of the BufferedOutputStream is being concurrently flushed by the PacketResponder thread.
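
A hypothetical interleaving consistent with the logged sizes (the exact timings are invented for illustration):

{noformat}
BlockReceiver thread                          PacketResponder thread
--------------------                          ----------------------
                                              starts out.flush() of the previous packet
calls setChannelPosition(112001024)
  channel.size() == 111935488 (the flush is
  not yet reflected in the file length)
  -> IOException: Trying to change block
     file offset ... but actual size of
     file is 111935488
                                              out.flush() completes, file grows to 112001024
{noformat}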


> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579693#action_12579693 ] 

Raghu Angadi commented on HADOOP-3033:
--------------------------------------

I think we should still explain how that can lead to what we saw:

One particular case I looked at shows one datanode not writing 64k of data (or overwriting the last 64k):

The last (third) datanode in the pipeline failed with: {noformat}
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Changing block file offset of block blk_7114623733442731588 from 85983232 to 86048768 meta file offset to 672263
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Exception in receiveBlock for block blk_7114623733442731588 java.io.IOException: Trying to change block file offset of block blk_7114623733442731588 to 86048768 but actual size of file is 85983232
{noformat}
The client retried with the remaining DNs and succeeded.

Say 'x' == 85983232.

Block file in tmp dir on bad datanode is x bytes long and meta data file is 672263 bytes long.

Data from the failed datanode and a good datanode for this block shows that the data up to x-64k matches on both. The 64k at x-64k on the bad datanode matches the 64k at x on the good datanode. The meta file data matches on both sides. So this shows that the bad datanode either somehow did not write the last-but-one packet or overwrote it with the last packet. Each packet has 64k of real data.
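
Schematically (P and P+1 label the two 64k packets in question; 86048768 - 85983232 = 65536 = 64k):

{noformat}
                 offset:  0 ........... x-64k ......... x ........... x+64k
good datanode block file: [ identical   ][ packet P    ][ packet P+1 ] ...
 bad datanode block file: [ identical   ][ packet P+1  ]               (file ends at x)
{noformat}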


> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579693#action_12579693 ] 

rangadi edited comment on HADOOP-3033 at 3/17/08 6:08 PM:
---------------------------------------------------------------

I think we should still explain how that can lead to what we saw:

One particular case I looked at shows one datanode not writing 64k of data (or overwriting the last 64k):

The last (third) datanode in the pipeline failed with: {noformat}
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Changing block file offset of block blk_7114623733442731588 from 85983232 to 86048768 meta file offset to 672263
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Exception in receiveBlock for block blk_7114623733442731588 java.io.IOException: Trying to change block file offset of block blk_7114623733442731588 to 86048768 but actual size of file is 85983232
{noformat}
The client retried with the remaining DNs and succeeded.

Say 'x' == 85983232.

Block file in tmp dir on bad datanode is x bytes long and meta data file is 672263 bytes long.

Data from the failed datanode and a good datanode for this block shows that the data up to x-64k matches on both. The 64k at x-64k on the bad datanode matches the 64k at x on the good datanode. The meta file data matches on both sides. So this shows that the bad datanode either somehow did not write the last-but-one packet or overwrote it with the last packet. Each packet has 64k of real data.


      was (Author: rangadi):
    I think we should still explain how that can lead to what we saw :

One particular case I looked at shows one datanode does not write 64k of data (or overwrites last 64k):

The last (third) data node in the pipeline failed with : {noformat}
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Changing block file offset of block blk_7114623733442731588 from 85983232 to 86048768 meta file offset to 672263
2008-03-17 20:38:01,928 INFO org.apache.hadoop.dfs.DataNode: Exception in receiveBlock for block blk_7114623733442731588 java.io.IOException: Trying to change block file offset of block blk_7114623733442731588 to 86048768 but actual size of file is 85983232
{noformat}
The client retried with remaining DNs and succeded.

Say 'x' == 85983232.

Block file in tmp dir on bad datanode is x bytes long and meta data file is 672263 bytes long.

Data from failed datanode and a good datanode for this block show that data till x-64k matches on both. 64k at  at x-64k on bad datanode matches 64k at x at the good datanode. The meta data file data matches on both side. So this show the bad datanode either some did not write the last packet or overwrote last but one packet with the last packet. Each packet has 64k of real data.

  
> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3033:
-------------------------------------

    Attachment: badnode.patch

Here is a patch that changes this portion of the datanode code to match what is in trunk. Investigation is still ongoing into whether this patch actually solves the problem.

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580221#action_12580221 ] 

Hadoop QA commented on HADOOP-3033:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12378104/badnode2.patch
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1990/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1990/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1990/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1990/console

This message is automatically generated.

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3033) Datanode fails write to DFS file with exception message "Trying to change block file offset"

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580191#action_12580191 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3033:
------------------------------------------------

+1, code looks good.

Some minor comments:
- It might be better to use the ackQueue lock instead of PacketResponder.this for ackQueue-related synchronization.

- Should check LOG.isDebugEnabled() before printing debug messages. Then the message string is not constructed (i.e., better performance) when debug logging is disabled; see the sketch below.
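
For example (class and variable names are illustrative, not from the patch):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative guard: the message string is only built when debug logging is on.
class DebugGuardSketch {
  private static final Log LOG = LogFactory.getLog(DebugGuardSketch.class);

  void logOffsetChange(String block, long oldOffset, long newOffset) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Changing block file offset of block " + block
          + " from " + oldOffset + " to " + newOffset);
    }
  }
}
{code}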

> Datanode fails write to DFS file with exception message "Trying to change block file offset"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3033
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3033
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.1
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: badnode.patch, badnode2.patch
>
>
> A write to a DFS block failed with the lastdatanode in the pipeline reporting this error:
> Receiving block blk_-7279084187433655573 src: /xx.xx.xx.xx:xx dest: /xx.xx.xx.xx:50010
> Changing block file offset of block blk_-7279084187433655573 from 9043968 to 9043968 meta file offset to 70663
> Changing block file offset of block blk_-7279084187433655573 from 111935488 to 112001024 meta file offset to 875015
> Exception in receiveBlock for block blk_-7279084187433655573 java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> PacketResponder 0 for block blk_-7279084187433655573 Interrupted.
> PacketResponder 0 for block blk_-7279084187433655573 terminating
> writeBlock blk_-7279084187433655573 received exception java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488
> DataXceiver: java.io.IOException: Trying to change block file offset of block blk_-7279084187433655573 to 112001024 but actual size of file is 111935488

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.