Posted to users@kafka.apache.org by howard chen <ho...@gmail.com> on 2012/10/22 18:06:03 UTC

Manage Kafka using supervisor, failed to restart

I am using supervisor to manage Kafka, with the config:

[program:kafka]
command=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
directory=/opt/kafka
user=kafka
autostart=true
autorestart=true
startretries=10
stdout_logfile=/opt/kafka/logs/kafka.log

I was able to start Kafka, but I cannot stop it completely: when I start
Kafka again after a stop, it says the port is already in use.

So the previous shutdown was evidently not complete. Does anyone have an
idea what the issue is?

Thanks.

Re: Manage Kafka using supervisor, failed to restart

Posted by David Birdsong <da...@gmail.com>.
That is precisely what we do too.

On Wed, Oct 24, 2012 at 3:28 PM, Erich Nachbar <er...@nachbar.biz> wrote:
> Had similar issues and use now the actual java command used by the scripts.

Re: Manage Kafka using supervisor, failed to restart

Posted by Erich Nachbar <er...@nachbar.biz>.
I had similar issues and now use the actual java command that the scripts run.

[program:kafka]
process_name=kafka
command=java -Xmx1G -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
    -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
    -XX:MaxTenuringThreshold=1 -XX:+HeapDumpOnOutOfMemoryError
    -Dlog4j.configuration=file:/home/prod/kafka/bin/../config/log4j.properties
    -Dcom.sun.management.jmxremote
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false
    -Dcom.sun.management.jmxremote.port=9999 -cp
    :/home/prod/kafka/bin/../project/boot/scala-2.8.0/lib/scala-compiler.jar:/home/prod/kafka/bin/../project/boot/scala-2.8.0/lib/scala-library.jar:/home/prod/kafka/bin/../core/target/scala_2.8.0/kafka-0.7.0.jar:/home/prod/kafka/bin/../core/lib/gelfj-0.9.3-SNAPSHOT.jar:/home/prod/kafka/bin/../core/lib/json-simple-1.1.1.jar:/home/prod/kafka/bin/../core/lib/zkclient-20110412.jar:/home/prod/kafka/bin/../core/lib_managed/scala_2.8.0/compile/jopt-simple-3.2.jar:/home/prod/kafka/bin/../core/lib_managed/scala_2.8.0/compile/log4j-1.2.15.jar:/home/prod/kafka/bin/../core/lib_managed/scala_2.8.0/compile/zookeeper-3.3.3.jar
    kafka.Kafka /home/prod/kafka/config/server.properties
user=prod
directory=/home/prod/kafka
stopasgroup=true
startretries=864000
startsecs=10
autorestart=true

I think it is a straight copy from 'ps -aux'

On Wed, Oct 24, 2012 at 10:06 AM, David Birdsong
<da...@gmail.com> wrote:

> The exec method needs to be plumbed all the way through to the point java
> is called.

Re: Manage Kafka using supervisor, failed to restart

Posted by David Birdsong <da...@gmail.com>.
The exec method needs to be plumbed all the way through to the point java
is called.
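
To make the point above concrete, here is a minimal sketch of a two-level
launcher chain. It is not the real Kafka scripts: `sleep 300` stands in for
java, and start.sh, run_fork.sh and run_exec.sh are hypothetical names made
up for the illustration. The outer script always uses exec; only the inner
script differs. When the inner script does not exec, a TERM sent to the
top-level PID kills the shell and leaves the worker running, which matches
the pstree output reported in this thread.

```sh
#!/bin/sh
# Sketch only: `sleep 300` stands in for java, and start.sh, run_fork.sh and
# run_exec.sh are hypothetical stand-ins for the Kafka launcher scripts.
tmp=$(mktemp -d)

# Outer script: hands over to the inner script with exec.
cat > "$tmp/start.sh" <<'EOF'
#!/bin/sh
exec sh "$1" "$2"
EOF

# Inner script without exec: the worker is a child of this shell.
cat > "$tmp/run_fork.sh" <<'EOF'
#!/bin/sh
sleep 300 &
echo "$!" > "$1"
wait
EOF

# Inner script with exec: the worker replaces this shell and keeps its PID.
cat > "$tmp/run_exec.sh" <<'EOF'
#!/bin/sh
echo "$$" > "$1"
exec sleep 300
EOF

# Start the chain, send TERM to the top-level PID (the one PID the parent
# knows about), then record in $survived whether the worker is still alive.
worker_survives_stop() {
    pidfile="$tmp/worker.pid"
    rm -f "$pidfile"
    sh "$tmp/start.sh" "$1" "$pidfile" >/dev/null 2>&1 &
    top=$!
    # Wait until the inner script has written the worker's PID.
    while [ ! -s "$pidfile" ]; do sleep 1; done
    worker=$(cat "$pidfile")
    kill -TERM "$top"
    wait "$top" 2>/dev/null
    if kill -0 "$worker" 2>/dev/null; then
        survived=yes
        kill -KILL "$worker"    # clean up the orphaned worker
    else
        survived=no
    fi
}

worker_survives_stop "$tmp/run_fork.sh"; fork_result=$survived
worker_survives_stop "$tmp/run_exec.sh"; exec_result=$survived
rm -rf "$tmp"

echo "worker survives stop without exec: $fork_result"
echo "worker survives stop with exec:    $exec_result"
```

The background-plus-wait form in run_fork.sh is used only so the sketch can
record the worker's PID; a script that runs the worker in the foreground
without exec leaves the same orphan behind.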

please excuse my brevity on this mobile device
On Oct 24, 2012 9:25 AM, "howard chen" <ho...@gmail.com> wrote:

> You see, the java is `detached` from the `kafka-run-class` but still
> running in the background and continue using the port...
>

Re: Manage Kafka using supervisor, failed to restart

Posted by howard chen <ho...@gmail.com>.
Hi,


On Tue, Oct 23, 2012 at 1:25 AM, David DeMaagd <dd...@linkedin.com> wrote:
> exec "$(dirname "$0")/kafka-run-class.sh" kafka.Kafka "$@"

Thanks for the tip.

I updated the script as you suggested, but ran into the following issue.

Before stopping from supervisor, the pstree output is:

        ├─supervisord(1696)───kafka-run-class(1736)───java(1739)─┬─{java}(1740)

Looks good, right?

Then I stop it from supervisor (supervisorctl stop kafka), and it becomes:

        ├─java(1739)─┬─{java}(1740)

As you can see, java is detached from `kafka-run-class` but is still
running in the background and still holding the port.

Re: Manage Kafka using supervisor, failed to restart

Posted by David DeMaagd <dd...@linkedin.com>.
Yes, that is likely the case. If the kafka-server-start.sh you're using
is the 'out of the box' version, then you're getting a shell that runs the
java process as a child, and supervisor is likely only seeing the shell.
One way to work around this is to change the last line to:

exec "$(dirname "$0")/kafka-run-class.sh" kafka.Kafka "$@"

At that point the shell process is replaced instead of staying around as
a parent, so, provided kafka-run-class.sh also uses exec to launch java,
signals from supervisor will go directly to the java process.
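
As a small illustration of what exec changes, the sketch below runs two
throwaway wrapper scripts and counts the distinct PIDs each one prints.
The script names (no_exec.sh, with_exec.sh) are hypothetical and the inner
`sh -c` merely stands in for the launched command; none of this is taken
from the Kafka scripts. Without exec the inner command gets its own PID;
with exec it keeps the wrapper's PID, which is the PID supervisor started.

```sh
#!/bin/sh
# Sketch only: no_exec.sh and with_exec.sh are hypothetical stand-ins for a
# start script; the inner `sh -c` stands in for the command being launched.
tmp=$(mktemp -d)

# Without exec: the wrapper forks the inner command, so two PIDs are printed.
cat > "$tmp/no_exec.sh" <<'EOF'
#!/bin/sh
echo "$$"
sh -c 'echo "$$"'
true
EOF

# With exec: the inner command replaces the wrapper and keeps its PID.
cat > "$tmp/with_exec.sh" <<'EOF'
#!/bin/sh
echo "$$"
exec sh -c 'echo "$$"'
EOF

# Count the distinct PIDs printed by each wrapper.
no_exec_pids=$(sh "$tmp/no_exec.sh" | sort -u | wc -l | tr -d ' ')
with_exec_pids=$(sh "$tmp/with_exec.sh" | sort -u | wc -l | tr -d ' ')
rm -rf "$tmp"

echo "distinct PIDs without exec: $no_exec_pids"
echo "distinct PIDs with exec:    $with_exec_pids"
```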

What you may run into at that point is restarting too fast: Kafka can
take a little while to shut down (generally not more than a few seconds),
depending on the workload it is under at the time.

-- 
Dave DeMaagd
ddemaagd@linkedin.com | 818 262 7958

(david.birdsong@gmail.com - Mon, Oct 22, 2012 at 10:17:30AM -0700)
> Using start scripts with supervisor usually doesn't work. Java is probably
> a grandchild of supervisor and therefore not getting the term signal and
> not exiting. Sure the start script exits, but there's probably a java
> process hanging out; check ps.

Re: Manage Kafka using supervisor, failed to restart

Posted by David Birdsong <da...@gmail.com>.
Using start scripts with supervisor usually doesn't work. Java is probably
a grandchild of supervisor and is therefore not getting the TERM signal and
not exiting. Sure, the start script exits, but there's probably a java
process still hanging around; check ps.

please excuse my brevity on this mobile device
On Oct 22, 2012 9:06 AM, "howard chen" <ho...@gmail.com> wrote:

> I was able to start Kafka, but not able to stop Kafka completely, as
> when I start Kafka after the stop, it said the port is being used.

Re: Manage Kafka using supervisor, failed to restart

Posted by Felix GV <fe...@mate1inc.com>.
I think you need to make sure that the ZooKeeper session has enough time
to time out: if you restart a broker too fast, it can't start properly
because of that. If this is the situation you're in, the message in the
Kafka broker's log should tell you so pretty explicitly. AFAIK, the
solution is simply to wait a little bit before starting again.
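
One way to "wait a little bit" without hard-coding a fixed sleep is to poll
for a condition before starting. The following is only a sketch under
assumptions: wait_until is a hypothetical helper, not something shipped with
Kafka or supervisor, and the check you pass to it (for example, whether the
old process or session is gone) is up to your setup.

```sh
#!/bin/sh
# Sketch only: wait_until is a hypothetical helper, not part of Kafka or
# supervisor. It retries a check command once per second until the check
# succeeds or the given number of attempts is used up.
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Hypothetical use in a wrapper script (previous_broker_is_gone is a
# placeholder for whatever check fits your setup):
#   wait_until 30 previous_broker_is_gone && exec /path/to/start-command
```

If the wrapper ends with exec as in the usage comment, the started process
still ends up as the direct child that supervisor signals.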

If your problem is different from the one I'm describing, then you should
post what you see in the logs.

--
Felix



On Mon, Oct 22, 2012 at 12:06 PM, howard chen <ho...@gmail.com> wrote:

> I was able to start Kafka, but not able to stop Kafka completely, as
> when I start Kafka after the stop, it said the port is being used.