Posted to hdfs-issues@hadoop.apache.org by "farmmamba (Jira)" <ji...@apache.org> on 2023/05/08 13:29:00 UTC

[jira] [Updated] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

     [ https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

farmmamba updated HDFS-17003:
-----------------------------
    Description: 
After receiving a reportBadBlocks RPC from a datanode, the NameNode computes the wrong block to invalidate. This is dangerous behaviour and may cause data loss. Some logs from our production cluster are shown below:

 

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627-xxxx-1680179358678:blk_-9223372036846808880_1669008 on datanode: datanode1:50010 {code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627-xxxx-1680179358678:blk_-9223372036846808880_1669008 on /data1/hadoop/hdfs/datanode

2023-05-08 14:39:47,338 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036846808879_1669008: ReplicaInfo not found. {code}
 

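Note that the block reported as bad is blk_-9223372036846808880, but the replica that datanode1 then fails to delete is blk_-9223372036846808879. This behaviour can be reproduced.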
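
To make the relationship between the two block IDs in the logs concrete, here is a small stand-alone sketch that decodes them. It assumes the usual HDFS striped-block ID layout: erasure-coded blocks have negative IDs, and the low 4 bits hold the index of the internal block within its block group. The class name, method names, and the mask constant are hypothetical and only for illustration; this is not the NameNode code path that contains the bug.

```java
// Hypothetical illustration class; not part of Hadoop.
public class StripedBlockIdSketch {
    // Assumption: the low 4 bits of a striped block ID are the internal block index.
    static final long BLOCK_GROUP_INDEX_MASK = 15L;

    // Assumption: striped (erasure-coded) block IDs are negative.
    static boolean isStriped(long blockId) {
        return blockId < 0;
    }

    // Clear the index bits to get the ID of the enclosing block group.
    static long blockGroupId(long blockId) {
        return blockId & ~BLOCK_GROUP_INDEX_MASK;
    }

    // Extract the index of the internal block within its block group.
    static int blockIndex(long blockId) {
        return (int) (blockId & BLOCK_GROUP_INDEX_MASK);
    }

    public static void main(String[] args) {
        long reported = -9223372036846808880L; // block named in reportBadBlocks
        long deleted = -9223372036846808879L;  // replica the datanode failed to delete
        for (long id : new long[] {reported, deleted}) {
            System.out.printf("blk_%d striped=%b group=%d index=%d%n",
                    id, isStriped(id), blockGroupId(id), blockIndex(id));
        }
    }
}
```

Under these assumptions, both IDs belong to the same block group: the reported block has index 0, while the failed delete targets index 1. That would be consistent with the delete being sent for an internal block that datanode1 does not hold.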



> Erasure coding: invalidate wrong block after reporting bad blocks from datanode
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-17003
>                 URL: https://issues.apache.org/jira/browse/HDFS-17003
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: farmmamba
>            Priority: Critical
>


