Posted to yarn-issues@hadoop.apache.org by "Oleksii Dymytrov (JIRA)" <ji...@apache.org> on 2016/12/05 12:58:58 UTC

[jira] [Updated] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException

     [ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksii Dymytrov updated YARN-5924:
-----------------------------------
    Assignee:     (was: Oleksii Dymytrov)

> Resource Manager fails to load state with InvalidProtocolBufferException
> ------------------------------------------------------------------------
>
>                 Key: YARN-5924
>                 URL: https://issues.apache.org/jira/browse/YARN-5924
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Oleksii Dymytrov
>
> An InvalidProtocolBufferException is thrown while recovering an application's state if the application's data under the FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in HDFS has an invalid format (or is corrupted):
> {noformat}
> com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
> 	at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> 	at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> 	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193)
> 	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> 	at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281)
> 	at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232)
> {noformat}
> A possible solution is to catch "InvalidProtocolBufferException", log a warning, and remove the application's folder that contains the invalid data, so that the RM restart does not fail.
> Additionally, I've added a catch for other exceptions that can occur while recovering a specific application, so that the RM does not fail even if only one application's state can't be loaded.
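
The approach described above could look roughly like the following sketch. It is not the actual patch and does not use the real FileSystemRMStateStore code: the class TolerantStateLoader, the StateParser interface, and the assumed layout (one directory per application holding a state file named after the application id) are hypothetical stand-ins, and a local directory plays the role of HDFS. In the real store, the parser would be something like ApplicationStateDataProto.parseFrom, which throws InvalidProtocolBufferException on broken data.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of per-application fault isolation during state recovery. */
public class TolerantStateLoader {

    /** Stand-in for a protobuf parser; any exception means "state is unreadable". */
    public interface StateParser<T> {
        T parse(byte[] data) throws Exception;
    }

    /**
     * Loads the state of every application under appRoot. An application whose
     * state cannot be read or parsed is logged, its directory is removed, and
     * loading continues with the remaining applications.
     */
    public static <T> Map<String, T> loadAppStates(Path appRoot, StateParser<T> parser)
            throws IOException {
        // Snapshot the application directories first so that deleting one
        // does not interfere with the directory listing.
        List<Path> appDirs;
        try (Stream<Path> children = Files.list(appRoot)) {
            appDirs = children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        }

        Map<String, T> loaded = new LinkedHashMap<>();
        for (Path appDir : appDirs) {
            String appId = appDir.getFileName().toString();
            try {
                // Assumed layout: the state file is named after the application id.
                byte[] data = Files.readAllBytes(appDir.resolve(appId));
                loaded.put(appId, parser.parse(data));
            } catch (Exception e) {
                // Covers parse errors as well as any other per-application failure.
                System.err.println("WARN: cannot recover state of " + appId
                        + ", removing " + appDir + ": " + e);
                deleteRecursively(appDir);
            }
        }
        return loaded;
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

The key design choice is that the try/catch sits inside the per-application loop rather than around it, so one broken application cannot abort recovery of the others. Deleting the directory is what the description proposes; moving it aside for later inspection would be a more conservative variant.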



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)