Posted to user@flink.apache.org by Colin Williams <co...@gmail.com> on 2017/12/20 03:29:05 UTC

flink jobmanager HA zookeeper leadership election - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

Hi,

I've been trying to update my flink-docker jobmanager configuration for
Flink 1.4. I think the system is shutting down after a leadership election,
but I'm not sure what the issue is. My jobmanager configuration follows:


jobmanager.rpc.address: 10.16.228.150
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
blob.server.port: 6124
query.server.port: 6125

web.port: 8081
web.history: 10

parallelism.default: 1

state.backend: rocksdb
state.backend.rocksdb.checkpointdir: /tmp/flink/rocksdb
state.backend.fs.checkpointdir: file:///var/lib/data/checkpoints

high-availability: zookeeper
high-availability.cluster-id: /dev
high-availability.zookeeper.quorum: 10.16.228.190:2181
high-availability.zookeeper.path.root: /flink-1.4
high-availability.zookeeper.storageDir: file:///var/lib/data/recovery
high-availability.jobmanager.port: 50010

env.java.opts: -Dlog.file=/opt/flink/log/jobmanager.log

I'm also attaching some debugging output which shows the shutdown. Again,
I'm not entirely sure it's caused by a leadership issue, because that isn't
clear from the debug logs. Can anyone suggest configuration changes that
might fix this? I've tried clearing the ZooKeeper root path in case it held
some old session information, but that didn't seem to help.

Best,

Colin Williams

Re: flink jobmanager HA zookeeper leadership election - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

Posted by Till Rohrmann <tr...@apache.org>.
Hi Colin,

The log looks as if the Flink JobManager receives a SIGTERM signal and
shuts down because of it. Flink's leader election should not trigger
that. Could you check whether the signal was sent by another process in
your environment, or whether the container supervisor terminated the
process?
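
One rough way to start that check is to look at the exit status the
container runtime recorded for the jobmanager (Docker, for example, exposes
it as State.ExitCode in "docker inspect"). The sketch below is only an
illustration and not part of Flink: describe_exit_code is a hypothetical
helper that assumes the common "128 + signal number" convention, under
which a process ended by SIGTERM typically reports 143 and one ended by
SIGKILL reports 137. It cannot say who sent the signal, only which signal
ended the process.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit status using the common 128 + N signal convention.

    Hypothetical helper for illustration; the convention is an assumption
    about how the shell or container runtime reports signal deaths.
    """
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Map the number back to a symbolic name such as SIGTERM.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number is not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error status {code}"


if __name__ == "__main__":
    for status in (0, 1, 143):
        print(status, "->", describe_exit_code(status))
```

A status that decodes to SIGTERM is consistent with an orderly stop
requested from outside the JVM (a supervisor, an orchestrator, or a manual
stop), so the next place to look would be that component's own logs.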

Cheers,
Till


Re: flink jobmanager HA zookeeper leadership election - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

Posted by Colin Williams <co...@gmail.com>.