Posted to mapreduce-issues@hadoop.apache.org by "rangjiaheng (JIRA)" <ji...@apache.org> on 2017/10/09 17:38:00 UTC

[jira] [Comment Edited] (MAPREDUCE-6978) MR task counters deserialized through RPC throws OutOfBoundsException if Counter enum class version not match

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16197363#comment-16197363 ] 

rangjiaheng edited comment on MAPREDUCE-6978 at 10/9/17 5:37 PM:
-----------------------------------------------------------------

The root cause is that when a Container writes Counter values to RPC, it writes the ordinal of each TaskCounter.class enumeration value. When the AM reads an ordinal from RPC that is greater than or equal to the number of values in its own TaskCounter.class enumeration, the array lookup throws an OutOfBoundsException, and the AM then kills the Container.

{code:java}
  public void readFields(DataInput in) throws IOException {
    clear();
    int len = WritableUtils.readVInt(in);
    T[] enums = enumClass.getEnumConstants();
    for (int i = 0; i < len; ++i) {
      int ord = WritableUtils.readVInt(in);
      Counter counter = newCounter(enums[ord]);        // throws OutOfBoundsException here when ord >= enums.length
      counter.setValue(WritableUtils.readVLong(in));
      counters[ord] = counter;
    }   
  }
{code}
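
The failure can be reproduced without Hadoop. The following is a minimal sketch, not the Hadoop code: OldCounter and NewCounter are hypothetical stand-ins for the two versions of TaskCounter, and plain writeInt/readInt replace WritableUtils.writeVInt/readVInt to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class OrdinalMismatchDemo {
  // Hypothetical stand-ins for the old and new versions of TaskCounter.
  enum OldCounter { A, B }
  enum NewCounter { A, B, C } // C was appended in the new version

  // Container side (new version): writes a count, then (ordinal, value) pairs.
  static byte[] write(NewCounter[] counters) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(counters.length);
    for (NewCounter c : counters) {
      out.writeInt(c.ordinal());
      out.writeLong(1L);
    }
    return bytes.toByteArray();
  }

  // AM side (old version): indexes its own enum constants with the received ordinal.
  // Returns the number of counters read.
  static int read(byte[] data) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    OldCounter[] enums = OldCounter.values();
    int len = in.readInt();
    int read = 0;
    for (int i = 0; i < len; i++) {
      int ord = in.readInt();
      OldCounter key = enums[ord]; // ArrayIndexOutOfBoundsException when ord == 2
      in.readLong();
      read++;
    }
    return read;
  }

  public static void main(String[] args) throws IOException {
    try {
      read(write(NewCounter.values()));
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("AM-side read failed: " + e);
    }
  }
}
```

Counters that exist in both versions still read correctly; only the appended value C, whose ordinal is outside the old enum's range, triggers the exception.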

This problem occurred while we were doing a gray release (upgrading the NMs in stages). I believe it would not happen if we upgraded all the NMs simultaneously; however, we prefer gray releases.
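
One way a reader could tolerate mixed versions during a gray release is to bounds-check the ordinal and skip counters it does not know. The sketch below only illustrates that idea and is not a patch for this issue: OldCounter, TolerantCounterReader and sample() are hypothetical names, and readInt/readLong stand in for the WritableUtils calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.EnumMap;
import java.util.Map;

public class TolerantCounterReader {
  // Hypothetical stand-in for the AM's older counter enum.
  enum OldCounter { A, B }

  // Reads (ordinal, value) pairs and skips ordinals this enum version does not define.
  static Map<OldCounter, Long> read(DataInput in) throws IOException {
    OldCounter[] enums = OldCounter.values();
    Map<OldCounter, Long> counters = new EnumMap<>(OldCounter.class);
    int len = in.readInt();
    for (int i = 0; i < len; i++) {
      int ord = in.readInt();
      long value = in.readLong(); // always consume the value so the stream stays aligned
      if (ord < 0 || ord >= enums.length) {
        continue; // counter written by a newer version: drop it instead of throwing
      }
      counters.put(enums[ord], value);
    }
    return counters;
  }

  // Builds a stream as a newer writer would: three counters with ordinals 0, 1 and 2.
  static byte[] sample() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(3);
    for (int ord = 0; ord < 3; ord++) {
      out.writeInt(ord);
      out.writeLong(10L * (ord + 1));
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    DataInput in = new DataInputStream(new ByteArrayInputStream(sample()));
    System.out.println(read(in)); // prints {A=10, B=20}
  }
}
```

The trade-off is that values for unknown counters are silently discarded on the older side, so this only helps when losing those counters is acceptable while the upgrade is in progress.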




> MR task counters deserialized through RPC throws OutOfBoundsException if Counter enum class version not match
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6978
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6978
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, task
>    Affects Versions: 3.0.0-alpha4
>         Environment: NM1 TaskCounter.class old version; 
> NM2 TaskCounter.class new version (new Enumeration values appended); 
>            Reporter: rangjiaheng
>
> Environment:
> NM1: old version of TaskCounter.class;
> NM2: new version of TaskCounter.class (new enumeration values appended);
> Result:
> When an MR app's AM runs on NM1 and its containers run on NM2, all the containers on NM2 fail because the AM throws an OutOfBoundsException;
> Reason:
> While the app is running, containers report their counters to the AM through RPC. A Container with the new version of TaskCounter.class writes more Counter values to RPC, but the AM with the old version of TaskCounter.class cannot read them correctly from RPC.



