Posted to yarn-issues@hadoop.apache.org by "Surendra Singh Lilhore (Jira)" <ji...@apache.org> on 2020/10/14 18:28:00 UTC

[jira] [Updated] (YARN-10442) RM should make sure node label file highly available

     [ https://issues.apache.org/jira/browse/YARN-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Surendra Singh Lilhore updated YARN-10442:
------------------------------------------
    Attachment: YARN-10442.002.patch

> RM should make sure node label file highly available
> ----------------------------------------------------
>
>                 Key: YARN-10442
>                 URL: https://issues.apache.org/jira/browse/YARN-10442
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.1
>            Reporter: Surendra Singh Lilhore
>            Assignee: Surendra Singh Lilhore
>            Priority: Major
>         Attachments: YARN-10442.001.patch, YARN-10442.002.patch
>
>
> In one of my clusters, the RM failed to transition to Active because blocks of the node label file were missing. I think the RM should make sure important files are highly available.
> {noformat}
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Could not obtain block: BP-2121803626-10.0.0.22-1597301807397:blk_1073832522_91774 file=/yarn/node-labels/nodelabel.mirror
> 	at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:238)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
> 	at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
> 	at org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos$AddToClusterNodeLabelsRequestProto.parseDelimitedFrom(YarnServerResourceManagerServiceProtos.java:7493)
> 	at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.loadFromMirror(FileSystemNodeLabelsStore.java:168)
> 	at org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:205)
> 	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:254)
> 	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:268)
> 	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194){noformat}
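
One way to act on the suggestion above is to raise the replication factor of the node label store files, so that losing a few DataNodes does not leave the mirror file without readable blocks. The sketch below only illustrates that idea and is not the attached patch. `ReplicatedFs`, `InMemoryFs`, `NodeLabelReplication`, and the minimum factor passed in are hypothetical names and values chosen for illustration. On HDFS, the two interface methods would roughly correspond to `FileSystem.getFileStatus(path).getReplication()` and `FileSystem.setReplication(path, replication)`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical abstraction over the store's file system (not a Hadoop class). */
interface ReplicatedFs {
    short getReplication(String path);
    boolean setReplication(String path, short replication);
}

/** In-memory stand-in, used only so the sketch runs without a cluster. */
class InMemoryFs implements ReplicatedFs {
    private final Map<String, Short> replication = new HashMap<>();

    void create(String path, short initialReplication) {
        replication.put(path, initialReplication);
    }

    @Override
    public short getReplication(String path) {
        Short current = replication.get(path);
        if (current == null) {
            throw new IllegalArgumentException("No such file: " + path);
        }
        return current;
    }

    @Override
    public boolean setReplication(String path, short newReplication) {
        if (!replication.containsKey(path)) {
            return false;
        }
        replication.put(path, newReplication);
        return true;
    }
}

final class NodeLabelReplication {
    private NodeLabelReplication() {
    }

    /**
     * Raises the replication factor of path to at least minReplication.
     * An already higher factor is left untouched. Returns the resulting factor.
     */
    static short ensureMinReplication(ReplicatedFs fs, String path, short minReplication) {
        short current = fs.getReplication(path);
        if (current >= minReplication) {
            return current;
        }
        if (!fs.setReplication(path, minReplication)) {
            throw new IllegalStateException("Could not set replication for " + path);
        }
        return minReplication;
    }
}
```

A store could call `ensureMinReplication` on the mirror file right after writing it. The check never lowers an existing factor, so a cluster whose default replication is already high enough is unaffected.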



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
