Posted to user@flink.apache.org by Flavio Pompermaier <po...@okkam.it> on 2018/11/07 10:48:54 UTC

Error after upgrading to Flink 1.6.2

Hi to all,
we tried to upgrade our jobs to Flink 1.6.2, but now we get the following
error (we saw a similar issue with Spark that was caused by different Java
versions on the cluster servers, so we checked ours and they are all on the
same version, oracle-8-191):

Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (Parquet write: hdfs:/rivela/1/1/0_staging/parquet)': Deserializing the OutputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@54a4c7c8) failed: unread block data
	at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:220)
	at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:100)
	at org.apache.flink.runtime.jobmaster.JobMaster.createExecutionGraph(JobMaster.java:1151)
	at org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1131)
	at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:294)
	at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:157)
	... 10 more
Caused by: java.lang.Exception: Deserializing the OutputFormat (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@54a4c7c8) failed: unread block data
	at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:63)
	at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:216)
	... 15 more
Caused by: java.lang.IllegalStateException: unread block data
	at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2783)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1605)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
	at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:502)
	at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:489)
	at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:477)
	at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:438)
	at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:288)
	at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:60)
	... 16 more
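
For context, the failing sink is wired up roughly as below. This is a
minimal sketch rather than our actual job code: the use of parquet-avro's
AvroParquetOutputFormat, the Avro schema and the output path are assumptions,
only the HadoopOutputFormat wrapper itself comes from the stack trace.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.parquet.avro.AvroParquetOutputFormat;

public class ParquetSinkSketch {

    // Hypothetical Avro schema, only for this sketch.
    private static final String SCHEMA =
        "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}";

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Configure the Hadoop mapreduce Job that backs the Parquet output format.
        Job job = Job.getInstance();
        AvroParquetOutputFormat.setSchema(job, new Schema.Parser().parse(SCHEMA));
        FileOutputFormat.setOutputPath(job, new Path("hdfs:/some/staging/parquet"));

        // Wrap the mapreduce output format in Flink's HadoopOutputFormat --
        // this is the object the JobManager fails to deserialize above.
        HadoopOutputFormat<Void, GenericRecord> parquetSink =
            new HadoopOutputFormat<>(new AvroParquetOutputFormat<GenericRecord>(), job);

        DataSet<Tuple2<Void, GenericRecord>> records = env
            .fromElements("a", "b", "c")
            .map(value -> {
                GenericRecord r = new GenericData.Record(new Schema.Parser().parse(SCHEMA));
                r.put("value", value);
                return Tuple2.of((Void) null, r);
            })
            .returns(new TypeHint<Tuple2<Void, GenericRecord>>() {});

        records.output(parquetSink);   // the 'DataSink (Parquet write: ...)' vertex
        env.execute("Parquet write");
    }
}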


Has anyone faced this problem before? How can we try to solve it?
Best,
Flavio

Re: Error after upgrading to Flink 1.6.2

Posted by Flavio Pompermaier <po...@okkam.it>.
Hi Till,
we are not using HBase at the moment. We managed to run the job
successfully, but it was a pain to find the right combination of
dependencies, library shading and the right HADOOP_CLASSPATH.
The problem was the combination of Parquet, JAX-RS, Hadoop and Jackson.
Moreover, we had to run the cluster with parent-first class loading to
make it work.
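
In case it helps others, the relevant bits of the setup ended up looking
roughly like this (a sketch; the exact classpath obviously depends on the
CDH installation):

# flink-conf.yaml -- resolve classes parent-first instead of the
# child-first default, so the Hadoop/Jackson classes come from the
# Flink classpath rather than from the user jar
classloader.resolve-order: parent-first

# environment of every Flink process, before starting the cluster
export HADOOP_CLASSPATH=`hadoop classpath`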

However, we still have the big problem of submitting jobs via the REST
API (as I wrote in another thread, it seems there is no way to execute
any code after env.execute() when using the REST API).
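
As a trivial illustration of the pattern that breaks (not our real job,
just a sketch): with a CLI submission the last line below runs once the
job has finished, while with a submission through the REST API it
apparently never gets executed:

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.fs.FileSystem;

public class PostExecuteSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3)
           .writeAsText("file:///tmp/post-execute-sketch", FileSystem.WriteMode.OVERWRITE);

        JobExecutionResult result = env.execute("post-execute sketch");

        // post-processing that relies on the finished job
        System.out.println("job finished in " + result.getNetRuntime() + " ms");
    }
}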

Best,
Flavio

Re: Error after upgrading to Flink 1.6.2

Posted by Till Rohrmann <tr...@apache.org>.
Hi Flavio,

I haven't seen this problem before. Are you using Flink's HBase connector?
According to reports of similar problems with Spark, one needs to make sure
that the HBase jars are on the classpath [1, 2]; there is a short sketch of
that after the links below. If not, then it might be a problem with the MR1
version 2.6.0-mr1-cdh5.11.2, which caused problems for CDH 5.2 [2]. It could
also be worthwhile to try it out with the latest CDH version.

[1]
https://stackoverflow.com/questions/34901331/spark-hbase-error-java-lang-illegalstateexception-unread-block-data
[2]
https://mapr.com/community/s/question/0D50L00006BIthGSAT/javalangillegalstateexception-unread-block-data-when-running-spark-with-yarn
[3]
https://issues.apache.org/jira/browse/SPARK-1867?focusedCommentId=14322647&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14322647
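
For the classpath part, on a CDH node something along these lines is
usually enough (just a sketch; the exact mechanism depends on how the
cluster is installed):

export HADOOP_CLASSPATH="$(hadoop classpath):$(hbase classpath)"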

Cheers,
Till

Re: Error after upgrading to Flink 1.6.2

Posted by Flavio Pompermaier <po...@okkam.it>.
I forgot to mention that I'm using Flink 1.6.2 compiled for Cloudera CDH
5.11.2:

/opt/shared/devel/apache-maven-3.3.9/bin/mvn clean install
-Dhadoop.version=2.6.0-cdh5.11.2 -Dhbase.version=1.2.0-cdh5.11.2
-Dhadoop.core.version=2.6.0-mr1-cdh5.11.2 -DskipTests -Pvendor-repos
