Posted to common-user@hadoop.apache.org by 권병창 <ma...@navercorp.com> on 2016/01/12 11:20:06 UTC

mismatch corrupt blocks from fsck and dfsadmin report.

Hi, Hadoopers!

I use hadoop-2.7.1, and my cluster has 130 nodes.

Recently I ran into a problem.

Nagios reported a corrupt block.

Nagios requests http://namenode01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem.

The result is below. Note that CorruptBlocks is 1.

{
  "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=FSNamesystem",
    "modelerType" : "FSNamesystem",
    "tag.Context" : "dfs",
    "tag.HAState" : "active",
    "tag.Hostname" : "css0700.nhnsystem.com",
    "MissingBlocks" : 0,
    "MissingReplOneBlocks" : 0,
    "ExpiredHeartbeats" : 10,
    "TransactionsSinceLastCheckpoint" : 820630,
    "TransactionsSinceLastLogRoll" : 1916,
    "LastWrittenTransactionId" : 376685578,
    "LastCheckpointTime" : 1452583950883,
    "CapacityTotal" : 1650893130075660,
    "CapacityTotalGB" : 1537514.0,
    "CapacityUsed" : 1237079990848257,
    "CapacityUsedGB" : 1152121.0,
    "CapacityRemaining" : 410189981364473,
    "CapacityRemainingGB" : 382019.0,
    "CapacityUsedNonDFS" : 3623157862930,
    "TotalLoad" : 6717,
    "SnapshottableDirectories" : 0,
    "Snapshots" : 0,
    "BlocksTotal" : 4034155,
    "FilesTotal" : 2866690,
    "PendingReplicationBlocks" : 0,
    "UnderReplicatedBlocks" : 0,
    "CorruptBlocks" : 1,
    "ScheduledReplicationBlocks" : 0,
    "PendingDeletionBlocks" : 0,
    "ExcessBlocks" : 87,
    "PostponedMisreplicatedBlocks" : 0,
    "PendingDataNodeMessageCount" : 0,
    "MillisSinceLastLoadedEdits" : 0,
    "BlockCapacity" : 67108864,
    "StaleDataNodes" : 0,
    "TotalFiles" : 2866690
  } ]
}
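
For reference, the Nagios check above amounts to reading one field from this JMX response. The following is only a minimal Python sketch, assuming the response has the shape shown above. The function names corrupt_blocks_from_jmx and fetch_jmx are hypothetical helpers for illustration, and the URL is the one quoted in this message.

```python
import json
from urllib.request import urlopen

# URL as quoted in this message; adjust host and port for your NameNode.
JMX_URL = ("http://namenode01:50070/jmx"
           "?qry=Hadoop:service=NameNode,name=FSNamesystem")

BEAN_NAME = "Hadoop:service=NameNode,name=FSNamesystem"


def corrupt_blocks_from_jmx(payload: str) -> int:
    """Return the CorruptBlocks value from an FSNamesystem JMX response."""
    beans = json.loads(payload)["beans"]
    for bean in beans:
        if bean.get("name") == BEAN_NAME:
            return int(bean["CorruptBlocks"])
    raise KeyError("FSNamesystem bean not found in JMX response")


def fetch_jmx(url: str = JMX_URL, timeout: float = 10.0) -> str:
    """Fetch the raw JSON text from the NameNode JMX servlet (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Usage against a live NameNode (not executed here):
#   print(corrupt_blocks_from_jmx(fetch_jmx()))
```

Parsing is kept separate from fetching so that the extraction can be checked against a saved response without contacting the NameNode.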


 
 
However, 'hdfs fsck / -list-corruptfileblocks' does not find any corrupt blocks.

The fsck result is below.

The filesystem under path '/' has 0 CORRUPT files



 
'hdfs dfsadmin -report' agrees with the first (JMX) result.

Configured Capacity: 1650892318461452 (1.47 PB)
Present Capacity: 1647353181258422 (1.46 PB)
DFS Remaining: 408711410865856 (371.72 TB)
DFS Used: 1238641770392566 (1.10 PB)
DFS Used%: 75.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
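
To watch both numbers side by side, the two command outputs can be reduced to integers. This is a rough Python sketch that assumes the exact line formats shown in this message; other Hadoop versions may print them differently. The function names corrupt_replica_blocks and corrupt_files are hypothetical.

```python
import re


def corrupt_replica_blocks(report: str) -> int:
    """Extract the 'Blocks with corrupt replicas' count from 'hdfs dfsadmin -report' text."""
    m = re.search(r"^\s*Blocks with corrupt replicas:\s*(\d+)", report, re.MULTILINE)
    if m is None:
        raise ValueError("no 'Blocks with corrupt replicas' line found")
    return int(m.group(1))


def corrupt_files(fsck_output: str) -> int:
    """Extract the count from the fsck summary line '... has N CORRUPT files'."""
    m = re.search(r"has (\d+) CORRUPT files", fsck_output)
    if m is None:
        raise ValueError("no 'CORRUPT files' summary line found")
    return int(m.group(1))


def counts_differ(report: str, fsck_output: str) -> bool:
    """True when the dfsadmin count and the fsck count are not the same number."""
    return corrupt_replica_blocks(report) != corrupt_files(fsck_output)
```

With the outputs pasted in this message, the first function yields 1 and the second yields 0, which is the mismatch described here.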



 
My questions are:

1. Why do the results differ?
2. How do I find the name of the corrupt file? I would like to know which file is affected.

Thank you.
 

Re: mismatch corrupt blocks from fsck and dfsadmin report.

Posted by Mungeol Heo <mu...@gmail.com>.
In my case, the "Blocks with corrupt replicas" count reported by the "hdfs dfsadmin
-report" command became 0 after a while.

On Tue, Dec 27, 2016 at 4:15 PM, Mungeol Heo <mu...@gmail.com> wrote:

> +1

Re: mismatch corrupt blocks from fsck and dfsadmin report.

Posted by Mungeol Heo <mu...@gmail.com>.
+1
