Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/07/03 11:58:00 UTC

[jira] [Commented] (FLINK-7066) Kafka integration tests failing in "airplane mode"

    [ https://issues.apache.org/jira/browse/FLINK-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072309#comment-16072309 ] 

ASF GitHub Bot commented on FLINK-7066:
---------------------------------------

GitHub user pnowojski opened a pull request:

    https://github.com/apache/flink/pull/4247

    [FLINK-7066] Fix integration tests in airplane mode

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pnowojski/flink airplane

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4247
    
----
commit bc43bda5274bd73bf2daa25b2f9ddbd59ede672b
Author: Piotr Nowojski <pi...@gmail.com>
Date:   2017-07-03T11:55:27Z

    [FLINK-7066] Fix integration tests in airplane mode

----


> Kafka integration tests failing in "airplane mode"
> --------------------------------------------------
>
>                 Key: FLINK-7066
>                 URL: https://issues.apache.org/jira/browse/FLINK-7066
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Piotr Nowojski
>            Assignee: Piotr Nowojski
>
> The KafkaXXXProducerITCase tests are failing on my laptop in airplane mode. It seems to have something to do with some service listening on the wrong interface while the client tries to connect to a different host. Strangely, the tests for Kafka010 and Kafka011 fail with different errors, but the same fix works for both (maybe in Kafka010 the original exception is masked by some other error). The Kafka 0.11 tests fail like this:
> {code}
> 35309 [flink-akka.actor.default-dispatcher-3] INFO  Remoting  - Starting remoting
> 42445 [flink-akka.actor.default-dispatcher-3] INFO  Remoting  - Remoting started; listening on addresses :[akka.tcp://flink@fe80:0:0:0:165d:140b:f597:e019%13:54398]
> 42445 [main] INFO  org.apache.flink.runtime.client.JobClient  - Started JobClient actor system at [fe80::165d:140b:f597:e019]:54398
> 42450 [flink-akka.actor.default-dispatcher-5] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Disconnect from JobManager null.
> 42461 [flink-akka.actor.default-dispatcher-5] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Received SubmitJobAndWait(JobGraph(jobId: 3b11234d116ab1ed3c1279dd73dfaab5)) but there is no connection to a JobManager yet.
> 42462 [flink-akka.actor.default-dispatcher-5] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Received job Exactly once test (3b11234d116ab1ed3c1279dd73dfaab5).
> 52473 [flink-akka.actor.default-dispatcher-5] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Terminate JobClientActor.
> 52473 [flink-akka.actor.default-dispatcher-5] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Disconnect from JobManager null.
> org.apache.flink.runtime.client.JobExecutionException: Couldn't retrieve the JobExecutionResult from the JobManager.
> 	at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:309)
> ...
> Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager.
> 	at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:219)
> ...
> {code}
> I think the issue is that something is listening on fe80:0:0:0:165d:140b:f597:e019 (note that this is an IPv6 address from some virtual utun0 interface on my machine), while JobClient tries to connect to "localhost", which fails. When I enable wifi and connect to any network, the log looks like this:
> {code}
> 32981 [flink-akka.actor.default-dispatcher-2] INFO  Remoting  - Starting remoting
> 32995 [flink-akka.actor.default-dispatcher-3] INFO  Remoting  - Remoting started; listening on addresses :[akka.tcp://flink@192.168.178.125:55576]
> address = akka.tcp://flink@192.168.178.125:55576
> 33000 [main] INFO  org.apache.flink.runtime.client.JobClient  - Started JobClient actor system at 192.168.178.125:55576
> 33005 [flink-akka.actor.default-dispatcher-2] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Disconnect from JobManager null.
> submitJobAndWait config = {restart-strategy.fixed-delay.delay=0 s, local.number-taskmanager=1, taskmanager.network.netty.client.numThreads=1, metrics.reporter.my_reporter.class=org.apache.flink.metrics.jmx.JMXReporter, jobmanager.rpc.address=localhost, taskmanager.numberOfTaskSlots=8, taskmanager.memory.size=16, metrics.reporters=my_reporter, taskmanager.network.netty.server.numThreads=2, jobmanager.rpc.port=55566, query.server.enable=false}
> 33013 [flink-akka.actor.default-dispatcher-2] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Received SubmitJobAndWait(JobGraph(jobId: ac67638ac85a2179a37486d507a1a008)) but there is no connection to a JobManager yet.
> 33014 [flink-akka.actor.default-dispatcher-2] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Received job Exactly once test (ac67638ac85a2179a37486d507a1a008).
> 33024 [flink-akka.actor.default-dispatcher-2] INFO  org.apache.flink.runtime.client.JobSubmissionClientActor  - Connect to JobManager Actor[akka.tcp://flink@localhost:55566/user/jobmanager#-1394172571].
> {code}
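
For illustration only, and not part of the pull request: a minimal Java sketch that classifies the addresses appearing in the two logs above, using only java.net.InetAddress. The names AddressKind and classify are hypothetical. It shows that the address the actor system bound to in airplane mode is an IPv6 link-local address, while the one seen with wifi enabled is a site-local IPv4 address. It does not by itself reproduce or explain the connection timeout.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not taken from Flink or from the pull request.
final class AddressKind {

    /** Classifies an IP literal by the scope in which it is reachable. */
    static String classify(String ipLiteral) {
        try {
            // For a literal IP address, getByName only parses the text;
            // no name service lookup is performed.
            InetAddress address = InetAddress.getByName(ipLiteral);
            if (address.isLoopbackAddress()) {
                return "loopback";
            }
            if (address.isLinkLocalAddress()) {
                return "link-local";
            }
            if (address.isSiteLocalAddress()) {
                return "site-local";
            }
            return "other";
        } catch (UnknownHostException e) {
            return "unresolvable";
        }
    }

    public static void main(String[] args) {
        // Addresses copied from the logs in the issue description,
        // plus the IPv4 loopback address for comparison.
        String[] literals = {
            "fe80::165d:140b:f597:e019",
            "192.168.178.125",
            "127.0.0.1"
        };
        for (String literal : literals) {
            System.out.println(literal + " -> " + classify(literal));
        }
    }
}
```

Running a check like this on the affected machine, with and without a network connection, may help confirm which kind of address the actor system ends up bound to.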



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)