You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@zeppelin.apache.org by Neil Derraugh <ne...@intellifylearning.com> on 2016/10/19 23:35:55 UTC

Lost connection to the JobManager

Hello All,

I’m trying to get up and running with Flink on a mac.

Setup:
Flink 1.1.3
Zeppelin 0.6.2
Oracle JDK 1.8 u 111  — I realize this isn’t officially tested.  

I can’t get it working with Java 7 at all.  When I try with 7, the daemon starts, and the page mostly loads at localhost:8080 but then there are ws errors: WebSocket connection to 'ws://localhost:8080/ws' failed: Error during WebSocket handshake: Unexpected response code: 500

Having more success with 8 I switch back.

I’ve got my interpreter configured as localhost:6123.

If I run 
  %flink  // let Zeppelin know what interpretter to use. 
  benv
I get res5: org.apache.flink.api.scala.ExecutionEnvironment = org.apache.flink.api.scala.ExecutionEnvironment@76bde237

But if I run the example from Trevor Grant’s gagillion dollar gist I get the following.

text: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@281e2d1b
counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = org.apache.flink.api.scala.AggregateDataSet@57613737
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:405)
	at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:378)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:365)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:340)
	at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:211)
	at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:188)
	at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:172)
	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
	at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:637)
	at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:547)
	at .<init>(<console>:22)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
	at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
	at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:299)
	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:296)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
	at scala.Console$.withOut(Console.scala:107)
	at scala.Console.withOut(Console.scala)
	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:294)
	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:239)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Lost connection to the JobManager.
	at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:137)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:401)
	... 42 more
Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager.
	at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:252)
	at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:90)
	at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:70)
	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

I need help figuring out what I’m doing wrong.  
Thanks!
Neil

Re: Lost connection to the JobManager

Posted by Neil Derraugh <ne...@intellifylearning.com>.
Finally got some time to spend on this again.

Compiling Flink 1.1.2 with Scala 2.11 and Zeppelin head also with 2.11 has resolved the issue.  Not sure why the Flink 1.1.2 binaries compiled with 2.11 didn’t work.

Neil

> On Oct 20, 2016, at 10:30 AM, Neil Derraugh <ne...@intellifylearning.com> wrote:
> 
> I’ve also tried with Oracle JDK8 against the 0.7 head with the same result: Lost connection to the JobManager.
> 	
>> On Oct 19, 2016, at 7:35 PM, Neil Derraugh <neil.derraugh@intellifylearning.com <ma...@intellifylearning.com>> wrote:
>> 
>> Hello All,
>> 
>> I’m trying to get up and running with Flink on a mac.
>> 
>> Setup:
>> Flink 1.1.3
>> Zeppelin 0.6.2
>> Oracle JDK 1.8 u 111  — I realize this isn’t officially tested.  
>> 
>> I can’t get it working with Java 7 at all.  When I try with 7, the daemon starts, and the page mostly loads at localhost:8080 but then there are ws errors: WebSocket connection to 'ws://localhost:8080/ws <ws://localhost:8080/ws>' failed: Error during WebSocket handshake: Unexpected response code: 500
>> 
>> Having more success with 8 I switch back.
>> 
>> I’ve got my interpreter configured as localhost:6123.
>> 
>> If I run 
>>   %flink  // let Zeppelin know what interpretter to use. 
>>   benv
>> I get res5: org.apache.flink.api.scala.ExecutionEnvironment = org.apache.flink.api.scala.ExecutionEnvironment@76bde237
>> 
>> But if I run the example from Trevor Grant’s gagillion dollar gist I get the following.
>> 
>> text: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@281e2d1b
>> counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = org.apache.flink.api.scala.AggregateDataSet@57613737
>> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.
>> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:405)
>> 	at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
>> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:378)
>> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:365)
>> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:340)
>> 	at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:211)
>> 	at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:188)
>> 	at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:172)
>> 	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
>> 	at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:637)
>> 	at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:547)
>> 	at .<init>(<console>:22)
>> 	at .<clinit>(<console>)
>> 	at .<init>(<console>:7)
>> 	at .<clinit>(<console>)
>> 	at $print(<console>)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.lang.reflect.Method.invoke(Method.java:498)
>> 	at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
>> 	at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
>> 	at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
>> 	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
>> 	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
>> 	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:299)
>> 	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:296)
>> 	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>> 	at scala.Console$.withOut(Console.scala:107)
>> 	at scala.Console.withOut(Console.scala)
>> 	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:294)
>> 	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:239)
>> 	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
>> 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
>> 	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>> 	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> 	at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Lost connection to the JobManager.
>> 	at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:137)
>> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:401)
>> 	... 42 more
>> Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager.
>> 	at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:252)
>> 	at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:90)
>> 	at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:70)
>> 	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
>> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>> 	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
>> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>> 	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>> 	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>> 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> 
>> I need help figuring out what I’m doing wrong.  
>> Thanks!
>> Neil
> 


Re: Lost connection to the JobManager

Posted by Neil Derraugh <ne...@intellifylearning.com>.
I’ve also tried with Oracle JDK8 against the 0.7 head with the same result: Lost connection to the JobManager.
	
> On Oct 19, 2016, at 7:35 PM, Neil Derraugh <ne...@intellifylearning.com> wrote:
> 
> Hello All,
> 
> I’m trying to get up and running with Flink on a mac.
> 
> Setup:
> Flink 1.1.3
> Zeppelin 0.6.2
> Oracle JDK 1.8 u 111  — I realize this isn’t officially tested.  
> 
> I can’t get it working with Java 7 at all.  When I try with 7, the daemon starts, and the page mostly loads at localhost:8080 but then there are ws errors: WebSocket connection to 'ws://localhost:8080/ws <ws://localhost:8080/ws>' failed: Error during WebSocket handshake: Unexpected response code: 500
> 
> Having more success with 8 I switch back.
> 
> I’ve got my interpreter configured as localhost:6123.
> 
> If I run 
>   %flink  // let Zeppelin know what interpretter to use. 
>   benv
> I get res5: org.apache.flink.api.scala.ExecutionEnvironment = org.apache.flink.api.scala.ExecutionEnvironment@76bde237
> 
> But if I run the example from Trevor Grant’s gagillion dollar gist I get the following.
> 
> text: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@281e2d1b
> counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = org.apache.flink.api.scala.AggregateDataSet@57613737
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:405)
> 	at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:378)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:365)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:340)
> 	at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:211)
> 	at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:188)
> 	at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:172)
> 	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
> 	at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:637)
> 	at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:547)
> 	at .<init>(<console>:22)
> 	at .<clinit>(<console>)
> 	at .<init>(<console>:7)
> 	at .<clinit>(<console>)
> 	at $print(<console>)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
> 	at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
> 	at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
> 	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
> 	at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
> 	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:299)
> 	at org.apache.zeppelin.flink.FlinkInterpreter$1.apply(FlinkInterpreter.java:296)
> 	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> 	at scala.Console$.withOut(Console.scala:107)
> 	at scala.Console.withOut(Console.scala)
> 	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:294)
> 	at org.apache.zeppelin.flink.FlinkInterpreter.interpret(FlinkInterpreter.java:239)
> 	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
> 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
> 	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
> 	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Lost connection to the JobManager.
> 	at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:137)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:401)
> 	... 42 more
> Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager.
> 	at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:252)
> 	at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:90)
> 	at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:70)
> 	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> 	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 
> I need help figuring out what I’m doing wrong.  
> Thanks!
> Neil