Posted to user@flume.apache.org by Jonathan Hsieh <jo...@cloudera.com> on 2011/08/01 19:22:01 UTC

Re: Unable to start zookeeper in distributed master setup

[NOTE: moved to the new flume-user@incubator.apache.org mailing list]

Can you try the IP addresses of the master machines instead of the DNS
names/hostnames?
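
For example, a minimal sketch of the flume-site.xml on the first master,
assuming three masters. The addresses are placeholders from the
documentation range (not real hosts), and the serverid of 0 assumes the id
is the zero-based position of this machine in the servers list; the other
two masters would keep the same servers value and use 1 and 2.

```xml
<!-- flume-site.xml on the first master (sketch; addresses are hypothetical) -->
<property>
  <name>flume.master.servers</name>
  <!-- placeholder IPs; substitute the real addresses of your masters -->
  <value>192.0.2.11,192.0.2.12,192.0.2.13</value>
</property>

<property>
  <name>flume.master.serverid</name>
  <!-- assumed: zero-based index of this machine in the list above -->
  <value>0</value>
</property>
```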

We added some code that tries to reverse-resolve names, but this may be
getting in the way.

https://issues.cloudera.org/browse/FLUME-424

Jon.

On Thu, Jul 28, 2011 at 9:59 AM, David <bo...@gmail.com> wrote:

> Yea, I set it in both places
>
> On Jul 28, 12:45 pm, Justin Workman <ju...@gmail.com> wrote:
> > I have always set my serverids in the global flume-conf.xml and not the
> > local site configuration file.
> >
> > Have you tried setting your master serverids there?
> >
> > Sent from my iPhone
> >
> > On Jul 28, 2011, at 10:21 AM, David <bo...@gmail.com> wrote:
> >
> > > Thanks, I tried that and waited a good 5 minutes with no luck. I
> > > suspect that somehow the serverids are the same even though they're
> > > set to different values in their respective flume-site.xml files.
> >
> > > On Jul 28, 11:14 am, NerdyNick <ne...@gmail.com> wrote:
> > >> Adding the new list to the thread.
> >
> > >> Also, have you tried firing up 1 node at a time and waiting a little
> > >> bit before bringing the next up? If that works I bet the error goes
> > >> away after a short period. It might just be taking 1 or more of the
> > >> nodes a bit to get its ZK instance up and running, so the other(s) are
> > >> reporting they can't talk to it yet.
> >
> > >> On Thu, Jul 28, 2011 at 8:10 AM, David <bo...@gmail.com> wrote:
> > >>> Yea, it's still not working with 3 masters; the 3rd master now displays
> > >>> this error message:
> >
> > >>> [QuorumPeer:/0:0:0:0:0:0:0:0:3181] ERROR quorum.Learner: Unexpected
> > >>> exception
> > >>> java.net.NoRouteToHostException: No route to host
> >
> > >>> Also it doesn't make sense to me why the inferred master server index
> > >>> in this message:
> >
> > >>> [main] INFO master.FlumeMaster: Inferred master server index 0
> >
> > >>> doesn't match the serverid specified in flume-site.xml.
> >
> > >>> Any ideas? I'm stumped.
> >
> > >>> On Jul 27, 10:32 pm, David <bo...@gmail.com> wrote:
> > >>>> Yea sorry that was a typo... the server ids are set as different in
> > >>>> flume-site.xml. Thanks for the tip about the majority though, I'll
> > >>>> have to try it with 3 masters.
> >
> > >>>> On Jul 27, 10:18 pm, NerdyNick <ne...@gmail.com> wrote:
> >
> > >>>>> You need each server's serverid to be different. Also, ZK won't run
> > >>>>> without a majority up, from what I understand.
> > >>>>> On Jul 27, 2011 12:29 PM, "David" <bo...@gmail.com> wrote:
> >
> > >>>>>> Something interesting to note is that masterB with a specified
> > >>>>>> serverid: 1 still displays this log message
> >
> > >>>>>> 2011-07-27 14:59:20,410 [main] INFO master.FlumeMaster: Inferred
> > >>>>>> master server index 0
> >
> > >>>>>> On Jul 27, 2:24 pm, David <bo...@gmail.com> wrote:
> > >>>>>>> Hi, I'm trying to start up a distributed master setup with 2 masters
> > >>>>>>> (yes I know only 2 masters is pointless) and I have the following
> > >>>>>>> configuration:
> >
> > >>>>>>> masterA (flume-site.xml):
> > >>>>>>> <property>
> > >>>>>>> <name>flume.master.servers</name>
> > >>>>>>> <value>masterA,masterB</value>
> > >>>>>>> </property>
> >
> > >>>>>>> <property>
> > >>>>>>> <name>flume.master.serverid</name>
> > >>>>>>> <value>1</value>
> > >>>>>>> </property>
> >
> > >>>>>>> masterB (flume-site.xml):
> > >>>>>>> <property>
> > >>>>>>> <name>flume.master.servers</name>
> > >>>>>>> <value>masterA,masterB</value>
> > >>>>>>> </property>
> >
> > >>>>>>> <property>
> > >>>>>>> <name>flume.master.serverid</name>
> > >>>>>>> <value>1</value>
> > >>>>>>> </property>
> >
> > >>>>>>> The problem is when I start both masters they both hang with the
> > >>>>>>> following error message:
> > >>>>>>> 2011-07-27 14:16:53,056 [main] INFO master.ZKInProcessServer: server
> > >>>>>>> 0.0.0.0:3181 not up yet
> > >>>>>>> 2011-07-27 14:17:09,109 [main] ERROR master.FlumeMaster: IO problem:
> > >>>>>>> ZooKeeper server did not come up within 15 seconds
> >
> > >>>>>>> Anyone have any ideas?
> >
> > >>>>>>> Thanks, David
> >
> > >> --
> > >> Nick Verbeck - NerdyNick
> > >> ----------------------------------------------------
> > >> NerdyNick.com
> > >> Coloco.ubuntu-rocks.org
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com