Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2015/04/25 00:33:38 UTC

[jira] [Resolved] (TEZ-2367) Corruption of TezHeartbeatRequest

     [ https://issues.apache.org/jira/browse/TEZ-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha resolved TEZ-2367.
-----------------------------
    Resolution: Duplicate

TEZ-2314

> Corruption of TezHeartbeatRequest
> ---------------------------------
>
>                 Key: TEZ-2367
>                 URL: https://issues.apache.org/jira/browse/TEZ-2367
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>            Priority: Blocker
>
> The following exception is seen in the AM logs while attempting to deserialize a heartbeat request.
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 1382376565
>         at org.apache.tez.runtime.api.impl.EventMetaData.readFields(EventMetaData.java:120)
>         at org.apache.tez.runtime.api.impl.TezEvent.readFields(TezEvent.java:271)
>         at org.apache.tez.runtime.api.impl.TezHeartbeatRequest.readFields(TezHeartbeatRequest.java:110)
>         at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
>         at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:160)
>         at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1869)
>         at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1801)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1559)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:784)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:650)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:621)
> {code}
> TEZ-2234 is the most recent change to the serialization. [~bikassaha] - mind taking a look?
> From a quick glance, this looks to be caused by the way TaskStatistics is serialized: ioStatistics.size is written first, followed by an iteration over ioStatistics.
> ioStatistics can change between those two steps, as different Inputs / Outputs get initialized. Synchronizing should fix this.
> Also, setting the statistics may require synchronization to ensure correct values are written.
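
The size-then-iterate pattern described in the report can be illustrated with a minimal sketch. This is not the Tez code: StatsSketch, addIO, write, read and roundTrip are hypothetical names, and a plain map of longs stands in for the real statistics objects. The assumption is that both the mutation and the serialization take the same lock, so the count written up front always matches the number of entries that follow. Without that, a reader would consume too few or too many entries and misread the rest of the stream, which would be consistent with the out-of-range index in the trace.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a statistics holder; not the actual Tez class.
class StatsSketch {
    private final Map<String, Long> ioStatistics = new LinkedHashMap<>();

    // May be called from other threads as Inputs / Outputs get initialized.
    synchronized void addIO(String name, long value) {
        ioStatistics.put(name, value);
    }

    // Holding the same lock while writing the size and the entries keeps
    // the count consistent with the number of entries that follow it.
    synchronized void write(DataOutputStream out) throws IOException {
        out.writeInt(ioStatistics.size());
        for (Map.Entry<String, Long> e : ioStatistics.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    // Reads back exactly as many entries as the size prefix announces.
    static Map<String, Long> read(DataInputStream in) throws IOException {
        int n = in.readInt();
        Map<String, Long> result = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            String name = in.readUTF();
            long value = in.readLong();
            result.put(name, value);
        }
        return result;
    }

    // Serializes to a byte array and deserializes again.
    static Map<String, Long> roundTrip(StatsSketch stats) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            stats.write(new DataOutputStream(bytes));
            return read(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An alternative under the same assumptions would be to copy the map under the lock and serialize the copy outside it, which shortens the time the lock is held at the cost of an extra allocation per heartbeat.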


