Posted to yarn-issues@hadoop.apache.org by "Oleksii Dymytrov (JIRA)" <ji...@apache.org> on 2016/12/05 12:58:58 UTC
[jira] [Updated] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
[ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleksii Dymytrov updated YARN-5924:
-----------------------------------
Assignee: (was: Oleksii Dymytrov)
> Resource Manager fails to load state with InvalidProtocolBufferException
> ------------------------------------------------------------------------
>
> Key: YARN-5924
> URL: https://issues.apache.org/jira/browse/YARN-5924
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 3.0.0-alpha1
> Reporter: Oleksii Dymytrov
>
> An InvalidProtocolBufferException is thrown while recovering an application's state if the application's data under the FSRMStateRoot/RMAppRoot/application_1477986176766_0134/ directory in HDFS has an invalid format (or is corrupted):
> {noformat}
> com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
> at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
> at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
> at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:143)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:188)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:193)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> at org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$ApplicationStateDataProto.parseFrom(YarnServerResourceManagerRecoveryProtos.java:1028)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore$RMAppStateFileProcessor.processChildNode(FileSystemRMStateStore.java:966)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.processDirectoriesOfFiles(FileSystemRMStateStore.java:317)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMAppState(FileSystemRMStateStore.java:281)
> at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:232)
> {noformat}
> A possible solution is to catch InvalidProtocolBufferException, log a warning, and remove the application's folder that contains the invalid data, so that an RM restart does not fail.
> Additionally, I've added a catch for other exceptions that can occur while recovering a specific application, so that the RM does not fail even if only one application's state can't be loaded.
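
The approach described above could look roughly like the following. This is only a minimal, self-contained sketch and not the actual patch: `RecoverySketch`, `InvalidStateException`, `parseAppState`, and `loadAll` are hypothetical names, the in-memory map stands in for the per-application files in HDFS, and the stand-in exception mimics `InvalidProtocolBufferException` only in that it extends `IOException`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RecoverySketch {
    /** Hypothetical stand-in for InvalidProtocolBufferException (also an IOException). */
    static class InvalidStateException extends IOException {
        InvalidStateException(String message) {
            super(message);
        }
    }

    /** Hypothetical parser: treats an empty payload or a leading 0xFF byte as corrupt. */
    static String parseAppState(byte[] data) throws InvalidStateException {
        if (data.length == 0 || (data[0] & 0xFF) == 0xFF) {
            throw new InvalidStateException("application state is corrupt");
        }
        return new String(data, StandardCharsets.UTF_8);
    }

    /**
     * Loads every stored application. An application whose state cannot be
     * parsed is logged and recorded in 'skipped' instead of aborting the
     * whole recovery; the caller could then remove the skipped entries.
     */
    static Map<String, String> loadAll(Map<String, byte[]> stored, List<String> skipped) {
        Map<String, String> loaded = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : stored.entrySet()) {
            String appId = entry.getKey();
            try {
                loaded.put(appId, parseAppState(entry.getValue()));
            } catch (InvalidStateException e) {
                // Invalid or broken data: warn and continue with the next application.
                System.err.println("WARN: skipping " + appId + ": " + e.getMessage());
                skipped.add(appId);
            } catch (RuntimeException e) {
                // Any other failure for this one application must not stop recovery.
                System.err.println("WARN: unexpected error for " + appId + ": " + e);
                skipped.add(appId);
            }
        }
        return loaded;
    }
}
```

Collecting the skipped application IDs, rather than deleting inside the loop, keeps the removal of the broken folders as a separate step that can be logged or disabled independently.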
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)