Posted to hdfs-dev@hadoop.apache.org by "Wei-Chiu Chuang (JIRA)" <ji...@apache.org> on 2019/08/12 18:14:01 UTC

[jira] [Resolved] (HDFS-12914) Block report leases cause missing blocks until next report

     [ https://issues.apache.org/jira/browse/HDFS-12914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang resolved HDFS-12914.
------------------------------------
    Resolution: Fixed

I've muddied the waters too much in this JIRA. Let's use HDFS-14725 to track the branch-2 backport work.

> Block report leases cause missing blocks until next report
> ----------------------------------------------------------
>
>                 Key: HDFS-12914
>                 URL: https://issues.apache.org/jira/browse/HDFS-12914
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 2.9.2
>            Reporter: Daryn Sharp
>            Assignee: Santosh Marella
>            Priority: Critical
>             Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: HDFS-12914-branch-2.001.patch, HDFS-12914-trunk.00.patch, HDFS-12914-trunk.01.patch, HDFS-12914.005.patch, HDFS-12914.006.patch, HDFS-12914.007.patch, HDFS-12914.008.patch, HDFS-12914.009.patch, HDFS-12914.branch-2.000.patch, HDFS-12914.branch-2.001.patch, HDFS-12914.branch-2.002.patch, HDFS-12914.branch-2.8.001.patch, HDFS-12914.branch-2.8.002.patch, HDFS-12914.branch-2.patch, HDFS-12914.branch-3.0.patch, HDFS-12914.branch-3.1.001.patch, HDFS-12914.branch-3.1.002.patch, HDFS-12914.branch-3.2.patch, HDFS-12914.utfix.patch
>
>
> {{BlockReportLeaseManager#checkLease}} rejects FBRs (full block reports) from DNs for conditions such as "unknown datanode", "not in pending set", "lease has expired", wrong lease id, etc.  Lease rejection does not throw an exception; it returns false, which bubbles up to {{NameNodeRpcServer#blockReport}} and is interpreted as {{noStaleStorages}}.
> A re-registering node whose FBR is rejected because of an invalid lease becomes active with _no blocks_.  A replication storm ensues, possibly causing DNs to temporarily go dead (HDFS-12645), which leads to more FBR lease rejections on re-registration.  The cluster will have many "missing blocks" until each DN's next FBR is sent and/or forced.
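
The failure mode described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual Hadoop code: the classes LeaseRejectionSketch, LeaseManager and NameNodeSketch, the grant method, and the integer block counter are all hypothetical stand-ins. It only shows how a boolean that is meant to carry "no stale storages" cannot also signal "report rejected", so the caller never learns that its blocks were dropped.

```java
import java.util.HashSet;
import java.util.Set;

public class LeaseRejectionSketch {

    // Hypothetical stand-in for a lease manager: tracks which lease ids are valid.
    static class LeaseManager {
        private final Set<Long> validLeases = new HashSet<>();

        void grant(long leaseId) {
            validLeases.add(leaseId);
        }

        // Mirrors the described behavior: rejection returns false, no exception.
        boolean checkLease(long leaseId) {
            return validLeases.contains(leaseId);
        }
    }

    // Hypothetical stand-in for the NameNode side of block report handling.
    static class NameNodeSketch {
        final LeaseManager leases = new LeaseManager();
        int blocksRecorded = 0;

        // The return value is meant to carry "no stale storages".
        // A lease rejection reuses the same boolean, so the caller cannot
        // tell "rejected" apart from "processed, but storages are stale".
        boolean processReport(long leaseId, int blockCount) {
            if (!leases.checkLease(leaseId)) {
                return false; // rejected: nothing recorded, nothing thrown
            }
            blocksRecorded += blockCount;
            return true; // assumption: no stale storages in this sketch
        }
    }

    public static void main(String[] args) {
        NameNodeSketch nn = new NameNodeSketch();
        nn.leases.grant(1L);

        boolean accepted = nn.processReport(1L, 10);  // valid lease
        boolean rejected = nn.processReport(99L, 5);  // unknown lease id

        System.out.println("valid lease result:   " + accepted);
        System.out.println("invalid lease result: " + rejected);
        System.out.println("blocks recorded:      " + nn.blocksRecorded);
    }
}
```

In this sketch the second report returns normally, yet none of its blocks are recorded. That corresponds to the situation in the description where a re-registered node is active but the NameNode holds no blocks for it until a later report succeeds.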



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
