Posted to user@flink.apache.org by Peter Westermann <no...@genesys.com> on 2020/07/22 18:03:15 UTC

NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

I just started testing Flink 1.11.1 and noticed that the Task Managers section in the UI doesn’t load.
The exception in the log is:
j.i.NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo
    at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
    at j.i.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at j.i.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at j.i.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at j.i.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at java.util.ArrayList.writeObject(ArrayList.java:766)
    at s.r.GeneratedMethodAccessor22.invoke(Unknown Source)
    at s.r.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at j.l.reflect.Method.invoke(Method.java:498)
    at j.i.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1140)
    at j.i.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
    at j.i.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at j.i.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at o.a.f.u.InstantiationUtil.serializeObject(InstantiationUtil.java:586)
    at o.a.f.u.SerializedValue.<init>(SerializedValue.java:52)
    at o.a.f.r.r.a.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:357)
    ... 29 common frames omitted
Wrapped by: o.a.f.r.r.a.e.AkkaRpcException: Failed to serialize the result for RPC call : requestTaskManagerInfo.
    at o.a.f.r.r.a.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:368)
    at o.a.f.r.r.a.AkkaRpcActor.lambda$sendAsyncResponse$0(AkkaRpcActor.java:335)
    at j.u.c.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at j.u.c.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)
    at j.u.c.CompletableFuture.whenComplete(CompletableFuture.java:2140)
    at o.a.f.r.r.a.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:329)
    at o.a.f.r.r.a.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:298)
    at o.a.f.r.r.a.AkkaRpcActo...


Peter
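
A trace of this shape can be reproduced outside Flink with plain Java serialization. The sketch below is a minimal illustration, not Flink code: `ProfileInfo` and `TaskManagerEntry` are hypothetical stand-ins chosen for this example. It serializes an `ArrayList` whose elements are `Serializable` but hold a field of a type that is not, which is the pattern the frames above suggest (`ArrayList.writeObject`, then `defaultWriteFields`, then the failure).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class NotSerializableRepro {

    // Hypothetical stand-in for a message type that does NOT implement Serializable.
    static class ProfileInfo {
        final double cpuCores;

        ProfileInfo(double cpuCores) {
            this.cpuCores = cpuCores;
        }
    }

    // Hypothetical RPC result element: Serializable itself, but with a non-serializable field.
    static class TaskManagerEntry implements Serializable {
        private static final long serialVersionUID = 1L;
        final ProfileInfo totalResource;

        TaskManagerEntry(ProfileInfo totalResource) {
            this.totalResource = totalResource;
        }
    }

    // Serializes a value with standard Java serialization and returns the bytes.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<TaskManagerEntry> result = new ArrayList<>();
        result.add(new TaskManagerEntry(new ProfileInfo(1.0)));
        try {
            serialize(result);
            System.out.println("Serialized without error");
        } catch (NotSerializableException e) {
            // The exception message is the name of the offending class.
            System.out.println("Failed to serialize: " + e.getMessage());
        }
    }
}
```

The failure only appears once something actually tries to serialize the list; simply passing the same list around inside one JVM never touches this code path.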

Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Till Rohrmann <tr...@apache.org>.
The fix has been merged into master and the release-1.11 branch. It should
be shipped with the next bug fix release 1.11.2.

Cheers,
Till


Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Peter Westermann <no...@genesys.com>.
Thank you Till!

From: Till Rohrmann <tr...@apache.org>
Date: Friday, July 24, 2020 at 2:49 PM
To: Peter Westermann <no...@genesys.com>
Cc: Robert Metzger <rm...@apache.org>, Xintong Song <to...@gmail.com>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

The problem is that `ResourceProfileInfo` is not serializable. When the information is requested from the leading web server, no serialization is required, since the leading RM is most likely co-located in the same process. A request through a non-leading web server has to reach the leading RM remotely, so the result must be serialized, and that fails. I've opened an issue [1] and PR [2] for it.

[1] https://issues.apache.org/jira/browse/FLINK-18710
[2] https://github.com/apache/flink/pull/12991

Cheers,
Till

On Fri, Jul 24, 2020 at 5:43 PM Peter Westermann <no...@genesys.com> wrote:
Hi Robert,

I think this may have something to do with the HA setup: it looks like the exceptions only show up when not on the leader.
I just spun up a new cluster to provide logs and didn’t get any errors when looking at task managers on the current leader, but as soon as I look at the UI on the standby JobManager I get these exceptions. I attached the log for the standby JobManager.

Thanks for your help,

Peter

From: Robert Metzger <rm...@apache.org>
Date: Friday, July 24, 2020 at 10:42 AM
To: Peter Westermann <no...@genesys.com>
Cc: Xintong Song <to...@gmail.com>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Thanks for your response. I was able to start Flink 1.11.1 locally (1 JM, 5 TMs) with SSL enabled, but I didn't have this problem (it was also unlikely :) )

I'm running JDK 1.8, Scala 2.12 build, vanilla Flink:

2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Starting StandaloneSessionClusterEntrypoint (Version: 1.11.1, Scala: 2.12, Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)
2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - OS current user: robert
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 1.8/25.252-b09
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Maximum heap size: 981 MiBytes
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - No Hadoop Dependency available
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM Options:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xmx1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xms1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -XX:MaxMetaspaceSize=268435456
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog.file=/private/tmp/flink/flink-1.11.1/log/flink-robert-standalonesession-0-MacBook-Pro-2.localdomain.log
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configuration=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlogback.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/logback.xml
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Program Arguments:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --configDir
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - /private/tmp/flink/flink-1.11.1/conf
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --executionMode
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - cluster
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Classpath: /private/tmp/flink/flink-1.11.1/lib/flink-csv-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-json-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-shaded-zookeeper-3.4.14.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table-blink_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-1.2-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-core-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-slf4j-impl-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-dist_2.12-1.11.1.jar:::


Your setup also sounds pretty vanilla, and the error seems to occur even before you submit any job (so the S3 / RocksDB stuff is not loaded / used yet).
Are there any clues in the JobManager log? Can you share the full log here (or with me privately)?
Did you make any other modifications?


On Fri, Jul 24, 2020 at 3:52 PM Peter Westermann <no...@genesys.com> wrote:
Hi Robert,

Jobmanagers and taskmanagers are both running on 1.11.1. Jobmanagers are started with jobmanager.sh start and taskmanagers are started with taskmanager.sh start – to be clear those run on separate instances. Jars and config are distributed when creating AMIs for these instances – every build starts from scratch so there are no lingering jars from older Flink versions.
The only code change is using Flink 1.11.1 instead of 1.10.1.
FWIW: This is with security.ssl.rest.enabled: true if that makes a difference.

Thanks,

Peter


From: Robert Metzger <rm...@apache.org>
Date: Friday, July 24, 2020 at 8:54 AM
To: Peter Westermann <no...@genesys.com>
Cc: Xintong Song <to...@gmail.com>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Hi Peter,
How are you deploying Flink on the EC2 machines? Did you manually distribute the files to the machines and then use the start-cluster.sh script?
Can you make sure that the TaskManagers are also running Flink 1.11.1?

On Thu, Jul 23, 2020 at 1:05 PM Peter Westermann <no...@genesys.com> wrote:
Hi Xintong Song,

This is the UI for a newly started Flink cluster:

[screenshot]
As soon as I click on Task Managers, this happens (the same error message pops up on each UI refresh):
[screenshot]

I got the actual error message from the logs.
This is for a Flink cluster on Amazon EC2 with RocksDB as a state backend, state in S3, and zookeeper for HA.


Peter

From: Xintong Song <to...@gmail.com>
Date: Wednesday, July 22, 2020 at 10:10 PM
To: Peter Westermann <no...@genesys.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Hi Peter,

Thanks for reporting this issue.

From the exception stack, it seems there is indeed a problem. However, I'm not able to reproduce this issue on my machine, and I guess that's why it was not discovered before the release. Could you share some more details (and maybe screenshots) on how this issue is triggered?


Thank you~

Xintong Song



Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Till Rohrmann <tr...@apache.org>.
The problem is that `ResourceProfileInfo` is not serializable. When the
information is requested from the leading web server, no serialization is
required, since the leading RM is most likely co-located in the same
process. A request through a non-leading web server has to reach the
leading RM remotely, so the result must be serialized, and that fails.
I've opened an issue [1] and PR [2] for it.

[1] https://issues.apache.org/jira/browse/FLINK-18710
[2] https://github.com/apache/flink/pull/12991

Cheers,
Till
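
The difference between the co-located and the remote case can be sketched in plain Java. This is an illustration under assumptions, not the code from the PR: `ProfileInfo`, `localCall`, and `remoteCall` are hypothetical names invented for this example. A local call hands the object over by reference, so serializability never matters. A remote call has to round-trip the result through Java serialization, which only works once the class implements `java.io.Serializable`. Whether the actual fix makes exactly this change is not confirmed by the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableFixSketch {

    // Hypothetical message type; implementing Serializable is what lets the remote path work.
    static class ProfileInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final double cpuCores;
        final int taskHeapMb;

        ProfileInfo(double cpuCores, int taskHeapMb) {
            this.cpuCores = cpuCores;
            this.taskHeapMb = taskHeapMb;
        }
    }

    // Co-located caller and callee: the result is passed by reference, nothing is serialized.
    static <T> T localCall(T result) {
        return result;
    }

    // Remote caller: the result is written to bytes and read back, as it would be over the wire.
    @SuppressWarnings("unchecked")
    static <T> T remoteCall(T result) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(result);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ProfileInfo info = new ProfileInfo(2.0, 512);

        ProfileInfo local = localCall(info);
        System.out.println("local call returns same instance: " + (local == info));

        ProfileInfo remote = remoteCall(info);
        System.out.println("remote call returns an equal copy: "
                + (remote != info
                        && remote.cpuCores == info.cpuCores
                        && remote.taskHeapMb == info.taskHeapMb));
    }
}
```

Removing `implements Serializable` from `ProfileInfo` leaves `localCall` working and makes `remoteCall` throw `NotSerializableException`, which matches the observation that only the standby showed the error.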

On Fri, Jul 24, 2020 at 5:43 PM Peter Westermann <no...@genesys.com>
wrote:

> Hi Robert,
>
>
>
> I think this may have something to do with the HA setup: looks like the
> exceptions only show up when not on the leader.
>
> I just spun up a new cluster to provide logs and didn’t get any errors
> when looking at task managers on the current leader but as soon as I look
> at the UI on the standby backup I get these exceptions. I attached the log
> for the standby jobmanager.
>
>
>
> Thanks for your help,
>
>
>
> Peter
>
>
>
> *From: *Robert Metzger <rm...@apache.org>
> *Date: *Friday, July 24, 2020 at 10:42 AM
> *To: *Peter Westermann <no...@genesys.com>
> *Cc: *Xintong Song <to...@gmail.com>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: NotSerializableException:
> org.apache.flink.runtime.rest.messages.ResourceProfileInfo
>
>
>
> Thanks for your response. I was able to start Flink 1.11.1 locally (1 JM,
> 5 TMs) with SSL enabled, but I didn't have this problem (it was also
> unlikely :) )
>
>
>
> I'm running JDK 1.8, Scala 2.12 build, vanilla Flink:
>
>
>
> 2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - Starting StandaloneSessionClusterEntrypoint (
> Version: 1.11.1, Scala: 2.12, Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)
>
> 2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - OS current user: robert
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - Current Hadoop/Kerberos user: <no hadoop
> dependency found>
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 1.8/
> 25.252-b09
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - Maximum heap size: 981 MiBytes
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - JAVA_HOME: /Library/Java/JavaVirtualMachines
> /adoptopenjdk-8.jdk/Contents/Home
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - No Hadoop Dependency available
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - JVM Options:
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Xmx1073741824
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Xms1073741824
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -XX:MaxMetaspaceSize=268435456
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Dlog.file=/private/tmp/flink/flink-1.11.1
> /log/flink-robert-standalonesession-0-MacBook-Pro-2.localdomain.log
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Dlog4j.configuration=file:/private
> /tmp/flink/flink-1.11.1/conf/log4j.properties
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Dlog4j.configurationFile=file:/private
> /tmp/flink/flink-1.11.1/conf/log4j.properties
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - -Dlogback.configurationFile=file:/private
> /tmp/flink/flink-1.11.1/conf/logback.xml
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - Program Arguments:
>
> 2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - --configDir
>
> 2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - /private/tmp/flink/flink-1.11.1/conf
>
> 2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - --executionMode
>
> 2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - cluster
>
> 2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.
> ClusterEntrypoint [] - Classpath: /private/tmp/flink/flink-1.11.1
> /lib/flink-csv-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-json-
> 1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-shaded-zookeeper-3.4.
> 14.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table-blink_2.12-1.11.1
> .jar:/private/tmp/flink/flink-1.11.1/lib/flink-table_2.12-1.11.1.jar:/
> private/tmp/flink/flink-1.11.1/lib/log4j-1.2-api-2.12.1.jar:/private
> /tmp/flink/flink-1.11.1/lib/log4j-api-2.12.1.jar:/private/tmp/flink/flink-
> 1.11.1/lib/log4j-core-2.12.1.jar:/private/tmp/flink/flink-1.11.1
> /lib/log4j-slf4j-impl-2.12.1.jar:/private/tmp/flink/flink-1.11.1
> /lib/flink-dist_2.12-1.11.1.jar:::
>
>
>
>
>
> Your setup also sounds pretty vanilla, and the error seems to occur even
> before you submit any job (so the S3 / rocksdb stuff is not loaded / used
> yet).
>
> Are there any clues in the JobManager log? Can you share the full log
> here? (or with me privately?)
>
> Did you do any other modifications?
>
>
>
>
>
> On Fri, Jul 24, 2020 at 3:52 PM Peter Westermann <
> no.Westermann@genesys.com> wrote:
>
> Hi Robert,
>
>
>
> Jobmanagers and taskmanagers are both running on 1.11.1. Jobmanagers are
> started with *jobmanager.sh start *and taskmanagers are started with *taskmanager.sh
> start* – to be clear those run on separate instances. Jars and config are
> distributed when creating AMIs for these instances – every build starts
> from scratch so there are no lingering jars from older Flink versions.
>
> The only code change is using Flink 1.11.1 instead of 1.10.1.
>
> FWIW: This is with *security.ssl.rest.enabled: true *if that makes a
> difference.
>
>
>
> Thanks,
>
> Peter

Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Peter Westermann <no...@genesys.com>.
Hi Robert,

I think this may have something to do with the HA setup: it looks like the exceptions only show up when the UI is served by a JobManager that is not the leader.
I just spun up a new cluster to provide logs and didn’t get any errors when looking at task managers on the current leader, but as soon as I look at the UI on the standby JobManager I get these exceptions. I attached the log for the standby JobManager.
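
That would fit the stack trace: the failing frame is AkkaRpcActor.serializeRemoteResultAndVerifySize, reached via ArrayList.writeObject, so a list containing a ResourceProfileInfo is being pushed through plain Java serialization. As an assumption not confirmed in this thread, a result that stays inside the leader's own process would not need that step, while a standby has to fetch it over RPC. A minimal sketch of the failure mode follows; RemoteResultSketch, ProfileInfo, SerializableProfileInfo and firstNonSerializableClass are hypothetical names, not Flink classes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class RemoteResultSketch {

    // Hypothetical stand-in for a message class that does NOT implement Serializable.
    static class ProfileInfo {
        final double cpuCores;

        ProfileInfo(double cpuCores) {
            this.cpuCores = cpuCores;
        }
    }

    // Same payload, but marked Serializable.
    static class SerializableProfileInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final double cpuCores;

        SerializableProfileInfo(double cpuCores) {
            this.cpuCores = cpuCores;
        }
    }

    // Writes the object with plain Java serialization. Returns the name of the
    // class that could not be serialized, or null if serialization succeeded.
    static String firstNonSerializableClass(Object result) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(result);
            return null;
        } catch (NotSerializableException e) {
            // The exception message carries the offending class name.
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        List<Object> broken = new ArrayList<>();
        broken.add(new ProfileInfo(1.0));

        List<Object> fine = new ArrayList<>();
        fine.add(new SerializableProfileInfo(1.0));

        System.out.println("list with ProfileInfo: " + firstNonSerializableClass(broken));
        System.out.println("list with SerializableProfileInfo: " + firstNonSerializableClass(fine));
    }
}
```

Running main should report the ProfileInfo class for the first list and null for the second. ArrayList itself is serializable, but one non-serializable element is enough to fail the whole list, which is the same pattern as the exception in the log.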

Thanks for your help,

Peter

From: Robert Metzger <rm...@apache.org>
Date: Friday, July 24, 2020 at 10:42 AM
To: Peter Westermann <no...@genesys.com>
Cc: Xintong Song <to...@gmail.com>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Thanks for your response. I was able to start Flink 1.11.1 locally (1 JM, 5 TMs) with SSL enabled, but I didn't have this problem (it was also unlikely :) )

I'm running JDK 1.8, Scala 2.12 build, vanilla Flink:

2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Starting StandaloneSessionClusterEntrypoint (Version: 1.11.1, Scala: 2.12, Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)
2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - OS current user: robert
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 1.8/25.252-b09
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Maximum heap size: 981 MiBytes
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - No Hadoop Dependency available
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM Options:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xmx1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xms1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -XX:MaxMetaspaceSize=268435456
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog.file=/private/tmp/flink/flink-1.11.1/log/flink-robert-standalonesession-0-MacBook-Pro-2.localdomain.log
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configuration=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlogback.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/logback.xml
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Program Arguments:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --configDir
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - /private/tmp/flink/flink-1.11.1/conf
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --executionMode
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - cluster
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Classpath: /private/tmp/flink/flink-1.11.1/lib/flink-csv-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-json-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-shaded-zookeeper-3.4.14.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table-blink_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-1.2-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-core-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-slf4j-impl-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-dist_2.12-1.11.1.jar:::


Your setup also sounds pretty vanilla, and the error seems to occur even before you submit any job (so the S3 / rocksdb stuff is not loaded / used yet).
Are there any clues in the JobManager log? Can you share the full log here? (or with me privately?)
Did you do any other modifications?



Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Robert Metzger <rm...@apache.org>.
Thanks for your response. I was able to start Flink 1.11.1 locally (1 JM, 5
TMs) with SSL enabled, but I didn't have this problem (it was also unlikely
:) )

I'm running JDK 1.8, Scala 2.12 build, vanilla Flink:

2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Starting StandaloneSessionClusterEntrypoint (Version: 1.11.1, Scala: 2.12, Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)
2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - OS current user: robert
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 1.8/25.252-b09
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Maximum heap size: 981 MiBytes
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - No Hadoop Dependency available
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - JVM Options:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xmx1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Xms1073741824
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -XX:MaxMetaspaceSize=268435456
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog.file=/private/tmp/flink/flink-1.11.1/log/flink-robert-standalonesession-0-MacBook-Pro-2.localdomain.log
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configuration=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlog4j.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/log4j.properties
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - -Dlogback.configurationFile=file:/private/tmp/flink/flink-1.11.1/conf/logback.xml
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Program Arguments:
2020-07-24 16:33:58,417 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --configDir
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - /private/tmp/flink/flink-1.11.1/conf
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - --executionMode
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - cluster
2020-07-24 16:33:58,418 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Classpath: /private/tmp/flink/flink-1.11.1/lib/flink-csv-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-json-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-shaded-zookeeper-3.4.14.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table-blink_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-table_2.12-1.11.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-1.2-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-api-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-core-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/log4j-slf4j-impl-2.12.1.jar:/private/tmp/flink/flink-1.11.1/lib/flink-dist_2.12-1.11.1.jar:::
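
The version banner in the first log line above can also be checked mechanically, for example when comparing the logs of several JobManagers and TaskManagers. A minimal sketch, assuming the other processes print the same "(Version: ..., Scala: ..., Rev:..." banner at startup (only the JobManager line above is confirmed here); VersionBannerSketch and flinkVersion are hypothetical names:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionBannerSketch {

    // Captures the value after "(Version: " up to the next comma,
    // e.g. "1.11.1" in "(Version: 1.11.1, Scala: 2.12, Rev:7eb514a, ...)".
    private static final Pattern BANNER = Pattern.compile("\\(Version: ([^,)]+),");

    // Returns the version from a startup log line, or empty if the line has no banner.
    static Optional<String> flinkVersion(String logLine) {
        Matcher matcher = BANNER.matcher(logLine);
        return matcher.find() ? Optional.of(matcher.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // First line of the JobManager log quoted in this message.
        String line = "2020-07-24 16:33:58,416 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - "
                + "Starting StandaloneSessionClusterEntrypoint (Version: 1.11.1, Scala: 2.12, "
                + "Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)";
        System.out.println(flinkVersion(line).orElse("no version banner found"));
    }
}
```

Applying flinkVersion to the first line of each process log and comparing the results would show whether any instance still runs an older build.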


Your setup also sounds pretty vanilla, and the error seems to occur even
before you submit any job (so the S3 / RocksDB components are not loaded or
used yet).
Are there any clues in the JobManager log? Can you share the full log here
(or with me privately)?
Did you make any other modifications?


On Fri, Jul 24, 2020 at 3:52 PM Peter Westermann <no...@genesys.com>
wrote:

> Hi Robert,
>
>
>
> Jobmanagers and taskmanagers are both running on 1.11.1. Jobmanagers are
> started with *jobmanager.sh start *and taskmanagers are started with *taskmanager.sh
> start* – to be clear those run on separate instances. Jars and config are
> distributed when creating AMIs for these instances – every build starts
> from scratch so there are no lingering jars from older Flink versions.
>
> The only code change is using Flink 1.11.1 instead of 1.10.1.
>
> FWIW: This is with *security.ssl.rest.enabled: true *if that makes a
> difference.
>
>
>
> Thanks,
>
> Peter

Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Peter Westermann <no...@genesys.com>.
Hi Robert,

JobManagers and TaskManagers are both running 1.11.1. JobManagers are started with jobmanager.sh start and TaskManagers with taskmanager.sh start; to be clear, those run on separate instances. Jars and config are distributed when creating AMIs for these instances. Every build starts from scratch, so there are no lingering jars from older Flink versions.
The only code change is using Flink 1.11.1 instead of 1.10.1.
FWIW: This is with security.ssl.rest.enabled: true, if that makes a difference.

Thanks,

Peter


From: Robert Metzger <rm...@apache.org>
Date: Friday, July 24, 2020 at 8:54 AM
To: Peter Westermann <no...@genesys.com>
Cc: Xintong Song <to...@gmail.com>, "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Hi Peter,
how are you deploying Flink on the EC2 machines? Did you manually distribute the files to the machines, and then use the start-cluster.sh script?
Can you make sure that the TaskManagers are also running Flink 1.11.1?


Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Robert Metzger <rm...@apache.org>.
Hi Peter,
How are you deploying Flink on the EC2 machines? Did you manually
distribute the files to the machines, and then use the start-cluster.sh
script?
Can you make sure that the TaskManagers are also running Flink 1.11.1?

On Thu, Jul 23, 2020 at 1:05 PM Peter Westermann <no...@genesys.com>
wrote:

> Hi Xintong Song,
>
>
>
> This is the UI for a newly started Flink cluster:
>
>
>
> [image: A screenshot of a cell phone Description automatically generated]
>
> As soon as I click on Task Managers, this happens (the same error message
> pops up on each UI refresh):
>
> [image: A screenshot of a cell phone Description automatically generated]
>
>
>
> I got the actual error message from the logs.
>
> This is for a Flink cluster on Amazon EC2 with RocksDB as a state backend,
> state in S3, and zookeeper for HA.
>
>
>
>
>
> Peter

Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Peter Westermann <no...@genesys.com>.
Hi Xintong Song,

This is the UI for a newly started Flink cluster:

[screenshot]
As soon as I click on Task Managers, this happens (the same error message pops up on each UI refresh):
[screenshot]

I got the actual error message from the logs.
This is for a Flink cluster on Amazon EC2 with RocksDB as a state backend, state in S3, and ZooKeeper for HA.


Peter

From: Xintong Song <to...@gmail.com>
Date: Wednesday, July 22, 2020 at 10:10 PM
To: Peter Westermann <no...@genesys.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Hi Peter,

Thanks for reporting this issue.

From the exception stack, it seems there's indeed a problem. However, I'm not able to reproduce this issue on my machine, and I guess that's why this is not discovered before the release. Could you help share some more details (and maybe screenshots) on how this issue is triggered?


Thank you~

Xintong Song



Re: NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo

Posted by Xintong Song <to...@gmail.com>.
Hi Peter,

Thanks for reporting this issue.

From the exception stack, it seems there's indeed a problem. However, I'm
not able to reproduce this issue on my machine, and I guess that's why it
was not discovered before the release. Could you share some more
details (and maybe screenshots) on how this issue is triggered?

Thank you~

Xintong Song



On Thu, Jul 23, 2020 at 2:07 AM Peter Westermann <no...@genesys.com>
wrote:

> I just started testing Flink 1.11.1 and noticed that the Task Managers
> section in the UI doesn’t load.
>
> The exception in the log is:
>
> j.i.NotSerializableException: org.apache.flink.runtime.rest.messages.ResourceProfileInfo
>     at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>     at j.i.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at j.i.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>     at j.i.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at j.i.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>     at java.util.ArrayList.writeObject(ArrayList.java:766)
>     at s.r.GeneratedMethodAccessor22.invoke(Unknown Source)
>     at s.r.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at j.l.reflect.Method.invoke(Method.java:498)
>     at j.i.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1140)
>     at j.i.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
>     at j.i.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at j.i.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at j.i.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>     at o.a.f.u.InstantiationUtil.serializeObject(InstantiationUtil.java:586)
>     at o.a.f.u.SerializedValue.<init>(SerializedValue.java:52)
>     at o.a.f.r.r.a.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:357)
>     ... 29 common frames omitted
> Wrapped by: o.a.f.r.r.a.e.AkkaRpcException: Failed to serialize the result for RPC call : requestTaskManagerInfo.
>     at o.a.f.r.r.a.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:368)
>     at o.a.f.r.r.a.AkkaRpcActor.lambda$sendAsyncResponse$0(AkkaRpcActor.java:335)
>     at j.u.c.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>     at j.u.c.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)
>     at j.u.c.CompletableFuture.whenComplete(CompletableFuture.java:2140)
>     at o.a.f.r.r.a.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:329)
>     at o.a.f.r.r.a.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:298)
>     at o.a.f.r.r.a.AkkaRpcActo...
>
>
>
>
>
> Peter
>