You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Ben La Monica (JIRA)" <ji...@apache.org> on 2018/11/07 12:58:00 UTC

[jira] [Commented] (FLINK-10682) EOFException occurs during deserialization of Avro class

    [ https://issues.apache.org/jira/browse/FLINK-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678193#comment-16678193 ] 

Ben La Monica commented on FLINK-10682:
---------------------------------------

So an update on this. I don't think it's a mismatch in avro classes because this is running in a yarn cluster and it copies all of the code to each node, I've installed more logging and decreased the number of task managers so that it's easier to track. I'm still seeing this problem, and it's not always due to AVRO. Here is another stack trace...
{code:java}
2018-11-07 12:40:55,495 [INFO ] class=o.a.f.r.e.ExecutionGraph thread="flink-akka.actor.default-dispatcher-5202" Combine Price and Fund (7/9) (f0854988c294e4b256718746aff6bd6c) switched from RUNNING to FAILED.
java.io.IOException: Corrupt stream, found tag: 82
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:220)
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:49)
at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:140)
at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:206)
at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:116)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748){code}

I think it may be due to a very large object being serialized. I will continue to try to track it down.

> EOFException occurs during deserialization of Avro class
> --------------------------------------------------------
>
>                 Key: FLINK-10682
>                 URL: https://issues.apache.org/jira/browse/FLINK-10682
>             Project: Flink
>          Issue Type: Bug
>          Components: Type Serialization System
>    Affects Versions: 1.5.4
>         Environment: AWS EMR 5.17 (upgraded to Flink 1.5.4)
> 3 task managers, 1 job manager running in YARN in Hadoop
> Running on Amazon Linux with OpenJDK 1.8
>            Reporter: Ben La Monica
>            Priority: Critical
>
> I'm having trouble (which usually occurs after an hour of processing in a StreamExecutionEnvironment) where I get this failure message. I'm at a loss for what is causing it. I'm running this in AWS on EMR 5.17 with 3 task managers and a job manager running in a YARN cluster and I've upgraded my flink libraries to 1.5.4 to bypass another serialization issue and the kerberos auth issues.
> The avro classes that are being deserialized were generated with avro 1.8.2.
> {code:java}
> 2018-10-22 16:12:10,680 [INFO ] class=o.a.flink.runtime.taskmanager.Task thread="Calculate Estimated NAV -> Split into single messages (3/10)" Calculate Estimated NAV -> Split into single messages (3/10) (de7d8fa77
> 84903a475391d0168d56f2e) switched from RUNNING to FAILED.
> java.io.EOFException: null
> at org.apache.flink.core.memory.DataInputDeserializer.readLong(DataInputDeserializer.java:219)
> at org.apache.flink.core.memory.DataInputDeserializer.readDouble(DataInputDeserializer.java:138)
> at org.apache.flink.formats.avro.utils.DataInputDecoder.readDouble(DataInputDecoder.java:70)
> at org.apache.avro.io.ResolvingDecoder.readDouble(ResolvingDecoder.java:190)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:266)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
> at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
> at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
> at org.apache.flink.formats.avro.typeutils.AvroSerializer.deserialize(AvroSerializer.java:172)
> at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:208)
> at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:49)
> at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
> at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:140)
> at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:208)
> at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:116)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> at java.lang.Thread.run(Thread.java:748){code}
> Do you have any ideas on how I could further troubleshoot this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)