Posted to users@kafka.apache.org by Eric Coan <ec...@instructure.com> on 2017/06/28 17:47:09 UTC

Kafka Sometimes Fails to Start on Boot

Hello,

I've recently been doing research into getting our Kafka cluster running
outside of Mesos (for a couple of reasons). However, I'm noticing that about
10% of the time Kafka fails to start on boot (or, more accurately, starts and
immediately exits). I find this weird, since all brokers use the exact same
configuration on the same OS (Ubuntu 16.04).

There's nothing in my log4j directory; however, I did find a single log
line in $KAFKA_DIR/logs/kafkaServer.out that sheds some light on why it's
failing:

```
Error: Exception thrown by the agent : java.rmi.server.ExportException:
Port already in use: 9999; nested exception is:
        java.net.BindException: Address already in use (Bind failed)
```

However, I can verify that nothing is listening on this port right before
invocation, using `netstat -tulpn`, which shows:

```
 upstart.sh[1127]: Active Internet connections (only servers)
 upstart.sh[1127]: Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Pr
 upstart.sh[1127]: tcp        0      0 127.0.0.1:17123         0.0.0.0:*               LISTEN      1419/p
 upstart.sh[1127]: tcp        0      0 127.0.0.1:8400          0.0.0.0:*               LISTEN      1125/c
 upstart.sh[1127]: tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      1125/c
 upstart.sh[1127]: tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1215/d
 upstart.sh[1127]: tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1111/s
 upstart.sh[1127]: tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      1125/c
 upstart.sh[1127]: tcp        0      0 127.0.0.1:8126          0.0.0.0:*               LISTEN      1418/t
 upstart.sh[1127]: tcp6       0      0 :::8301                 :::*                    LISTEN      1125/c
 upstart.sh[1127]: tcp6       0      0 :::53                   :::*                    LISTEN      1215/d
 upstart.sh[1127]: tcp6       0      0 :::22                   :::*                    LISTEN      1111/s
 upstart.sh[1127]: udp        0      0 0.0.0.0:53              0.0.0.0:*                           1215/d
 upstart.sh[1127]: udp        0      0 0.0.0.0:68              0.0.0.0:*                           973/dh
 upstart.sh[1127]: udp        0      0 10.32.104.144:123       0.0.0.0:*                           1341/n
 upstart.sh[1127]: udp        0      0 127.0.0.1:123           0.0.0.0:*                           1341/n
 upstart.sh[1127]: udp        0      0 0.0.0.0:123             0.0.0.0:*                           1341/n
 upstart.sh[1127]: udp        0      0 127.0.0.1:8600          0.0.0.0:*                           1125/c
 upstart.sh[1127]: udp6       0      0 :::54933                :::*                                1441/j
 upstart.sh[1127]: udp6       0      0 127.0.0.1:8125          :::*                                1420/p
 upstart.sh[1127]: udp6       0      0 :::53                   :::*                                1215/d
 upstart.sh[1127]: udp6       0      0 :::8301                 :::*                                1125/c
 upstart.sh[1127]: udp6       0      0 fe80::898:21ff:fec0:123 :::*                                1341/n
 upstart.sh[1127]: udp6       0      0 ::1:123                 :::*                                1341/n
 upstart.sh[1127]: udp6       0      0 :::123                  :::*                                1341/n
```
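
One caveat about a check like this: the `-l` flag restricts `netstat` to listening sockets, so a socket in any other state would not appear in the listing. A minimal sketch of a pre-start check over all TCP sockets follows. The function name `port_in_listing` is hypothetical, and it assumes raw `netstat -tan` output on stdin (without the `upstart.sh[...]` log prefix shown above). It is only a way to gather more evidence, not a diagnosis.

```shell
# Hypothetical helper: succeed (exit 0) if any tcp/udp row read from stdin
# has the given port as its local port, whatever the socket state.
port_in_listing() {
  port="$1"
  awk -v p="$port" '
    # Column 4 of netstat -tan output is the local address, e.g. 127.0.0.1:8400 or :::8301
    $1 ~ /^(tcp|udp)/ {
      n = split($4, parts, ":")
      if (parts[n] == p) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}
```

It could be called just before the broker is launched, for example `netstat -tan | port_in_listing 9999 && echo "port 9999 is present in the socket table"`.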

I can also verify that the network on the box itself is up and working, as
programs like the consul-agent do in fact spawn and connect to their
clusters before Kafka even gets invoked.

For reference, I'm using the built-in `kafka-server-start.sh` script and
invoking it like so (IPs cut out):

```
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=kafka-i-0617a6aaa98f63c21.insops.net \
  -Djava.net.preferIPv4Stack=true" \
JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
$KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon \
  $KAFKA_INSTALL_PATH/config/server.properties \
  --override zookeeper.connect="XX.XX.XX.XX:XX" \
  --override broker.id="the-broker-test" \
  --override listeners="SSL://$LOCAL_IPV4:9092" \
  --override broker.rack="$AZ"
```

I'm not really sure where else to check for problems, as it's only happening
on some boots and only logging the one line mentioned above.
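
Since that single line is the only evidence, a start script could at least extract the port from it and then record what the socket table looks like at the moment of failure. The sketch below does only the extraction step. `ports_from_bind_errors` is a hypothetical helper name, and the message format is assumed to match the `Port already in use: 9999` line quoted above.

```shell
# Hypothetical helper: read log text on stdin and print the port number from
# every "Port already in use: N" line, one port per output line.
ports_from_bind_errors() {
  sed -n 's/.*Port already in use: \([0-9][0-9]*\).*/\1/p'
}
```

For example, `ports_from_bind_errors < "$KAFKA_DIR/logs/kafkaServer.out"` would print the conflicting port, which could then be fed to a socket listing taken at that moment.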

Thanks,


-- 
*Eric Coan*
*E: ecoan@instructure.com*
*O: 801.869.5000*
<http://instructure.com/>

Re: Kafka Sometimes Fails to Start on Boot

Posted by Eric Coan <ec...@instructure.com>.
I have tried changing the port, to no avail. As you can see in the first
email, running `netstat -tulpn` shows that nothing is listening on port
9999. Also, since it works 90% of the time, it really doesn't seem likely
that some other software is sometimes started at boot and sometimes not.

On Thu, Jun 29, 2017 at 1:51 AM, Tom Bentley <t....@gmail.com> wrote:

> Have you tried changing the configured JMX port? After all, it's possible
> the conflict is between kafka and some other software running on the same
> server.
>
> On 28 June 2017 at 21:06, Eric Coan <ec...@instructure.com> wrote:
>
> > Hello,
> >
> >
> > Unfortunately, Kafka does indeed start up and run for a little bit before
> > crashing with the above exception, so doing one simple check wouldn't
> > work. I could theoretically keep this script running forever, constantly
> > checking that it's up. However, that's really a hacky solution, and I'd
> > prefer not to do that if I don't have to.
> >
> > On Wed, Jun 28, 2017 at 1:43 PM, M. Manna <ma...@gmail.com> wrote:
> >
> > > Can you not put a service wrapper around startup? It will attempt a
> > > restart if the executable isn't up and running successfully.
> > >
> > > I am not familiar with the Unix side, but on Windows you can use a
> > > PowerShell script for this. It's a better approach.
> > >
> > > Let me know what you think.
> > >
> > > On 28 Jun 2017 8:34 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > >
> > > > I am using the same configuration for all brokers. However, each
> > > > broker is running on a completely separate host (I'm not running all
> > > > three brokers on the same host). I can get all three running if I
> > > > manually start Kafka again; it's just that occasionally on boot one
> > > > fails to start with this error.
> > > >
> > > > On Wed, Jun 28, 2017 at 1:25 PM, M. Manna <ma...@gmail.com> wrote:
> > > >
> > > > > Aren't you using the same JMX port 9999 for all brokers? I don't
> > > > > think it will work for more than 1 broker.
> > > > >
> > > > >
> > > > >
> > > > > On 28 Jun 2017 8:22 pm, "Eric Coan" <ec...@instructure.com> wrote:
> > > > >
> > > > > > Hey,
> > > > > >
> > > > > > > No worries. Yes, I'm starting the brokers with a script, which ends
> > > > > > > up generating the command I pasted:
> > > > > > >
> > > > > > > ```
> > > > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
> > > > > > >   -Dcom.sun.management.jmxremote.authenticate=false \
> > > > > > >   -Dcom.sun.management.jmxremote.ssl=false \
> > > > > > >   -Djava.rmi.server.hostname=$FQDN \
> > > > > > >   -Djava.net.preferIPv4Stack=true" \
> > > > > > > JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
> > > > > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon \
> > > > > > >   $KAFKA_INSTALL_PATH/config/server.properties \
> > > > > > >   --override zookeeper.connect="XX.XX.XX.XX:XX" \
> > > > > > >   --override broker.id="$broker_id" \
> > > > > > >   --override listeners="SSL://$LOCAL_IPV4:9092" \
> > > > > > >   --override broker.rack="$AZ"
> > > > > > > ```
> > > > > > >
> > > > > > > The script beforehand populates the variables such as the FQDN, the
> > > > > > > broker ID, the Zookeeper IPs to connect to, the Kafka install path,
> > > > > > > etc. The important part of the command really is:
> > > > > > >
> > > > > > > ```
> > > > > > > KAFKA_JMX_OPTS="..." JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
> > > > > > >   $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon ..
> > > > > > > ```
> > > > > >
> > > > > > On Wed, Jun 28, 2017 at 1:08 PM, M. Manna <ma...@gmail.com> wrote:
> > > > > >
> > > > > > > Please forgive my autocorrect options :(
> > > > > > >
> > > > > > > On 28 Jun 2017 8:06 pm, "M. Manna" <ma...@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > OS is not an issue; I have a 3-broker setup and I have experienced
> > > > > > > this too.
> > > > > > >
> > > > > > > How are you starting the brokers? Is this a concurrent start, or have
> > > > > > > you got some startup script to bring up all the brokers?
> > > > > > >
> > > > > > > KR,
> > > > > > >




Re: Kafka Sometimes Fails to Start on Boot

Posted by Tom Bentley <t....@gmail.com>.
Have you tried changing the configured JMX port? After all, it's possible
the conflict is between Kafka and some other software running on the same
server.


Re: Kafka Sometimes Fails to Start on Boot

Posted by Eric Coan <ec...@instructure.com>.
Hello,


Unfortunately, Kafka does indeed start up and run for a little bit before
crashing with the above exception, so doing one simple check wouldn't work.
I could theoretically keep this script running forever, constantly checking
that it's up. However, that's really a hacky solution, and I'd prefer not to
do that if I don't have to.
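
For reference, a bounded version of that kind of wrapper might look like the sketch below. It is not a recommendation over a proper service manager. `retry_start` is a hypothetical helper, and it assumes the broker is run in the foreground (without `-daemon`), since with `-daemon` the start script backgrounds the JVM and returns right away, so the wrapper would never see the later exit.

```shell
# Hypothetical helper: run a command in the foreground and rerun it when it
# exits non-zero, giving up after max_attempts tries.
# Usage: retry_start MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_start() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # A zero exit status is treated as a clean shutdown, so stop retrying.
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed" >&2
    attempt=$((attempt + 1))
    # Pause before the next try, but not after the final failure.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

A call such as `retry_start 5 10 "$KAFKA_INSTALL_PATH/bin/kafka-server-start.sh" "$KAFKA_INSTALL_PATH/config/server.properties"` would restart the broker up to five times, ten seconds apart. It works around the failed start rather than explaining the `BindException`.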

> working
> > as
> > > > > > programs like the consul-agent do in fact spawn, and connect to
> > their
> > > > > > clusters before kafka even gets invoked.
> > > > > >
> > > > > > For reference I'm using the built in `kafka-server-start.sh`
> > script,
> > > > and
> > > > > > invoking it like so (IPs cut out):
> > > > > >
> > > > > > ```
> > > > > > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > -Dcom.sun.management.jmxremote.ssl=false
> > -Djava.rmi.server.hostname=
> > > > > > kafka-i-0617a6aaa98f63c21.insops.net
> > > > > > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999
> > SCALA_VERSION=2.12.2
> > > > > > JAVA_HOME=/usr
> > > > > > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon
> > > > > > $KAFKA_INSTALL_PATH/config/server.properties --override
> > > > > > zookeeper.connect="XX.XX.XX.XX:XX" --override
> > > > > > broker.id="the-broker-test" --override
> > > > > > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > > > > > ```
> > > > > >
> > > > > > I'm not really sure where else to check for problems as it's only
> > > > > happening
> > > > > > on some boots, and only logging the one line mentioned above.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Eric Coan*
> > > > > > *E: ecoan@instructure.com <ec...@instructure.com>*
> > > > > > *O:* *801.869.5000 <//801.869.5000>*
> > > > > > <http://instructure.com/>
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Eric Coan*
> > > > *E: ecoan@instructure.com <ec...@instructure.com>*
> > > > *O:* *801.869.5000 <//801.869.5000>*
> > > > <http://instructure.com/>
> > > >
> > >
> >
> >
> >
> > --
> > *Eric Coan*
> > *E: ecoan@instructure.com <ec...@instructure.com>*
> > *O:* *801.869.5000 <//801.869.5000>*
> > <http://instructure.com/>
> >
>



-- 
*Eric Coan*
*E: ecoan@instructure.com <ec...@instructure.com>*
*O:* *801.869.5000 <//801.869.5000>*
<http://instructure.com/>

Re: Kafka Sometimes Fails to Start on Boot

Posted by "M. Manna" <ma...@gmail.com>.
Could you put a service wrapper around startup? It would attempt a restart if
the process isn't up and running successfully.

I am not familiar with the Unix side, but on Windows you can use a PowerShell
script for this kind of thing. It's a better approach.

Let me know what you think.
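
For illustration only, here is a minimal sketch of what such a retry wrapper might look like on the Unix side. It is not taken from Kafka or from anyone's actual setup: the function name `retry_start`, its arguments, and the idea of passing separate "start" and "check" functions are all assumptions. A separate check is used because a start script run with `-daemon` returns before the broker has finished starting, so its exit status alone may not reveal a failed start.

```shell
# retry_start ATTEMPTS DELAY START_FN CHECK_FN   (hypothetical helper)
# Runs START_FN, waits DELAY seconds, then runs CHECK_FN.
# Repeats up to ATTEMPTS times; succeeds as soon as CHECK_FN succeeds.
retry_start() {
  attempts=$1
  delay=$2
  start_fn=$3
  check_fn=$4
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$start_fn"          # launch the service (may return immediately)
    sleep "$delay"       # give the process time to come up or die
    if "$check_fn"; then # caller-supplied health check
      echo "started on attempt $i"
      return 0
    fi
    echo "attempt $i failed" >&2
    i=$((i + 1))
  done
  return 1
}
```

A caller would define its own start function (for example, one that runs the existing start command) and its own check function, then call something like `retry_start 3 10 start_broker broker_is_up`, where both function names are placeholders. Note that this only works around a failed start; it does not explain why the port was reported as in use.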


Re: Kafka Sometimes Fails to Start on Boot

Posted by Eric Coan <ec...@instructure.com>.
I am using the same configuration for all brokers. However, each broker is
running on a completely separate host (I'm not running all three brokers on
the same host). I can get all three running if I manually start Kafka
again; it's just that occasionally, on boot, one fails to start with this
error.




-- 
*Eric Coan*
*E: ecoan@instructure.com <ec...@instructure.com>*
*O:* *801.869.5000 <//801.869.5000>*
<http://instructure.com/>

Re: Kafka Sometimes Fails to Start on Boot

Posted by "M. Manna" <ma...@gmail.com>.
Aren't you using the same JMX port 9999 for all brokers? I don't think it will
work for more than one broker.




Re: Kafka Sometimes Fails to Start on Boot

Posted by Eric Coan <ec...@instructure.com>.
Hey,

No worries. Yes, I'm starting the brokers with a script, which ends up
generating the command I pasted:

```
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=$FQDN \
  -Djava.net.preferIPv4Stack=true" \
JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
  $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon \
  $KAFKA_INSTALL_PATH/config/server.properties \
  --override zookeeper.connect="XX.XX.XX.XX:XX" \
  --override broker.id="$broker_id" \
  --override listeners="SSL://$LOCAL_IPV4:9092" \
  --override broker.rack="$AZ"
```

Beforehand, the script populates variables such as the FQDN, the broker ID,
the ZooKeeper IPs to connect to, the Kafka install path, etc. The important
part of the command is really:

```
KAFKA_JMX_OPTS="..." JMX_PORT=9999 SCALA_VERSION=2.12.2 JAVA_HOME=/usr \
  $KAFKA_INSTALL_PATH/bin/kafka-server-start.sh -daemon ..
```
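
Since the only symptom is a bind failure on the JMX port, it may help to have the startup script test the `netstat -tulpn` listing for that port immediately before the command above runs, rather than reading the listing by eye. The following is a rough sketch under that assumption; the helper name `port_listed` is hypothetical and not part of Kafka or netstat.

```shell
# port_listed PORT   (hypothetical helper)
# Reads netstat-style lines on stdin and succeeds if any whitespace-separated
# field ends in ":PORT". With a listeners-only listing (netstat -l), the
# foreign-address column ends in "*", so only local addresses can match.
port_listed() {
  awk -v port="$1" '
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ (":" port "$")) { found = 1 }
      }
    }
    END { exit (found ? 0 : 1) }
  '
}
```

It could be used as `netstat -tulpn | port_listed 9999 && echo "port 9999 is already bound"`. A check like this only narrows the window between the listing and the launch; something could still bind the port in between.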

On Wed, Jun 28, 2017 at 1:08 PM, M. Manna <ma...@gmail.com> wrote:

> Please forgive my autocorrect options :(
>
> On 28 Jun 2017 8:06 pm, "M. Manna" <ma...@gmail.com> wrote:
>
> Hi,
>
> OS is not an issue, I have a 3 broker setup and I have experienced this
> too.
>
> How are toy atarting the brokers? Is this a concurrent start or have you
> got some startup scriptto bring up all the brokers?
>
> KR,
>
> On 28 Jun 2017 6:47 pm, "Eric Coan" <ec...@instructure.com> wrote:
>
> > Hello,
> >
> > I've recently been doing research into getting our Kafka cluster running
> > outside of Mesos (for a couple of reasons). However I'm noticing about
> 10%
> > of the time Kafka fails to start on boot (or more accurately starts, and
> > immediately exits). I find it weird since all brokers are using the exact
> > same configuration, on the same OS (Ubuntu 16.04)
> >
> > There's nothing in my LOG4J directory, however I did find a singular log
> > line within $KAFKA_DIR/logs/kafkaServer.out that shed the actual light
> as
> > to why it's failing:
> >
> > ```
> > Error: Exception thrown by the agent : java.rmi.server.ExportException:
> > Port already in use: 9999; nested exception is:
> >         java.net.BindException: Address already in use (Bind failed)
> > ```
> >
> > However, I can verify nothing is running on this port right before
> > invocation using netstat -tulpn which shows:
> >
> > ```
> >  upstart.sh[1127]: Active Internet connections (only servers)
> >  upstart.sh[1127]: Proto Recv-Q Send-Q Local Address           Foreign
> > Address         State       PID/Pr
> >  upstart.sh[1127]: tcp        0      0 127.0.0.1:17123         0.0.0.0:*
> >            LISTEN      1419/p
> >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8400          0.0.0.0:*
> >            LISTEN      1125/c
> >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8500          0.0.0.0:*
> >            LISTEN      1125/c
> >  upstart.sh[1127]: tcp        0      0 0.0.0.0:53              0.0.0.0:*
> >            LISTEN      1215/d
> >  upstart.sh[1127]: tcp        0      0 0.0.0.0:22              0.0.0.0:*
> >            LISTEN      1111/s
> >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8600          0.0.0.0:*
> >            LISTEN      1125/c
> >  upstart.sh[1127]: tcp        0      0 127.0.0.1:8126          0.0.0.0:*
> >            LISTEN      1418/t
> >  upstart.sh[1127]: tcp6       0      0 :::8301                 :::*
> >             LISTEN      1125/c
> >  upstart.sh[1127]: tcp6       0      0 :::53                   :::*
> >             LISTEN      1215/d
> >  upstart.sh[1127]: tcp6       0      0 :::22                   :::*
> >             LISTEN      1111/s
> >  upstart.sh[1127]: udp        0      0 0.0.0.0:53              0.0.0.0:*
> >                        1215/d
> >  upstart.sh[1127]: udp        0      0 0.0.0.0:68              0.0.0.0:*
> >                        973/dh
> >  upstart.sh[1127]: udp        0      0 10.32.104.144:123       0.0.0.0:*
> >                        1341/n
> >  upstart.sh[1127]: udp        0      0 127.0.0.1:123           0.0.0.0:*
> >                        1341/n
> >  upstart.sh[1127]: udp        0      0 0.0.0.0:123             0.0.0.0:*
> >                        1341/n
> >  upstart.sh[1127]: udp        0      0 127.0.0.1:8600          0.0.0.0:*
> >                        1125/c
> >  upstart.sh[1127]: udp6       0      0 :::54933                :::*
> >                         1441/j
> >  upstart.sh[1127]: udp6       0      0 127.0.0.1:8125          :::*
> >                         1420/p
> >  upstart.sh[1127]: udp6       0      0 :::53                   :::*
> >                         1215/d
> >  upstart.sh[1127]: udp6       0      0 :::8301                 :::*
> >                         1125/c
> >  upstart.sh[1127]: udp6       0      0 fe80::898:21ff:fec0:123 :::*
> >                         1341/n
> >  upstart.sh[1127]: udp6       0      0 ::1:123                 :::*
> >                         1341/n
> >  upstart.sh[1127]: udp6       0      0 :::123                  :::*
> >                         1341/n
> > ```
> >
> > I can also verify the network of the box itself is up, and working as
> > programs like the consul-agent do in fact spawn, and connect to their
> > clusters before kafka even gets invoked.
> >
> > For reference I'm using the built in `kafka-server-start.sh` script, and
> > invoking it like so (IPs cut out):
> >
> > ```
> > KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
> > -Dcom.sun.management.jmxremote.authenticate=false
> > -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=
> > kafka-i-0617a6aaa98f63c21.insops.net
> > -Djava.net.preferIPv4Stack=true" JMX_PORT=9999 SCALA_VERSION=2.12.2
> > JAVA_HOME=/usr
> > $KAFKA_INSTALL_PATH//bin/kafka-server-start.sh -daemon
> > $KAFKA_INSTALL_PATH/config/server.properties --override
> > zookeeper.connect="XX.XX.XX.XX:XX" --override
> > broker.id="the-broker-test" --override
> > listeners="SSL://$LOCAL_IPV4:9092" --override broker.rack="$AZ"
> > ```
> >
> > I'm not really sure where else to check for problems as it's only happening
> > on some boots, and only logging the one line mentioned above.
> >
> > Thanks,
> >
> >
> > --
> > *Eric Coan*
> > *E: ecoan@instructure.com <ec...@instructure.com>*
> > *O:* *801.869.5000 <//801.869.5000>*
> > <http://instructure.com/>
> >
>



-- 
*Eric Coan*
*E: ecoan@instructure.com <ec...@instructure.com>*
*O:* *801.869.5000 <//801.869.5000>*
<http://instructure.com/>

Re: Kafka Sometimes Fails to Start on Boot

Posted by "M. Manna" <ma...@gmail.com>.
Please forgive my autocorrect options :(

On 28 Jun 2017 8:06 pm, "M. Manna" <ma...@gmail.com> wrote:

Hi,

The OS is not the issue; I have a 3-broker setup and have experienced this too.

How are you starting the brokers? Is this a concurrent start, or have you
got some startup script to bring up all the brokers?

KR,

Re: Kafka Sometimes Fails to Start on Boot

Posted by "M. Manna" <ma...@gmail.com>.
Hi,

The OS is not the issue; I have a 3-broker setup and have experienced this too.

How are you starting the brokers? Is this a concurrent start, or have you
got some startup script to bring up all the brokers?

KR,

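For anyone chasing the same `Port already in use: 9999` error, one way to
narrow it down is to check the port from the startup script itself, looking
at all TCP sockets rather than only listeners (`netstat -tulpn` uses `-l`,
which omits non-listening sockets; `-a` includes them). The following is only
a sketch under assumptions, not a confirmed fix for this thread: the function
names `port_listed` and `wait_for_free_port` are hypothetical, and it expects
raw `netstat` output without the `upstart.sh[1127]:` log prefix shown in the
original message.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a boot-time wrapper; names are illustrative only.

# Reads netstat-style text on stdin and succeeds if any row's local
# address (4th column) ends in ":<port>".
port_listed() {
  local port="$1"
  awk -v p=":${port}" '$4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Polls once per second until no TCP socket has <port> as its local port,
# giving up after <tries> attempts (default 10).
# Note: if netstat is missing, the port is reported as free.
wait_for_free_port() {
  local port="$1" tries="${2:-10}"
  while (( tries-- > 0 )); do
    if ! netstat -tan 2>/dev/null | port_listed "$port"; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Example use before launching the broker (assumed wrapper, not from the thread):
#   wait_for_free_port 9999 30 || echo "port 9999 still busy" >&2
```

If the wait loop times out, the `netstat -tan` output at that moment shows
which socket holds the port, which the listener-only view cannot reveal.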