Posted to user@bookkeeper.apache.org by Ming Chen <mc...@cs.stonybrook.edu> on 2015/01/08 04:19:18 UTC

Hedwig Across-Region Configuration

Hi there,

I am new to Hedwig, and have a question about setting up Hedwig with
two regions. I have successfully set up two separate single-region
instances of Hedwig, but I could not get them to communicate with each
other.

Here is my configuration:

I have two physical machines (A and B), each hosting three VMs
mimicking a cluster in a region. In each region, the three VMs run
ZooKeeper servers, BookKeeper bookie servers, and Hedwig servers at
the same time.

For "reg1" (machine A), the three VMs' IPs are:
X.Y.Z.111
X.Y.Z.112
X.Y.Z.113

I successfully set them up as a single-region Hedwig instance. I could
run "bin/hedwig console" and successfully execute commands including
"show", "sub", "pub", "pubsub", "readtopic", "describe topic", etc.

I did the same for "reg2" (machine B), where the three VMs are:
X.Y.Z.114
X.Y.Z.115
X.Y.Z.116

Again, "reg2" as a single-region Hedwig instance was running well.

Then, I began to set up the cross-region communication between the two
Hedwig instances. I shut down the two Hedwig instances, and set the
"regions" option in hw_server.conf of all 6 VMs to

regions=X.Y.Z.111:4080 X.Y.Z.114:4080

All options in hw_server.conf, other than zk_host, region, and regions,
were left at their defaults. After that, I restarted the two Hedwig
instances. I logged into the console of "reg1" and published some test
messages to a topic, but I could not see the topic or the messages when
I logged into the console of "reg2". The same was true in the other
direction.

It seemed that the two regions could find each other, because I observed
that reg1 contains hub(s) from reg2, and vice versa:

[hedwig: (reg1) 24] show hubs
Available Hub Servers:
        X.Y.Z.111:4080:9876 :     info : [hostname:
"X.Y.Z.111:4080:9876", czxid: 12884902049], load : [numTopics: 0]
        X.Y.Z.116:4080:9876 :     info : [hostname:
"X.Y.Z.116:4080:9876", czxid: 12884902050], load : [numTopics: 0]
        X.Y.Z.112:4080:9876 :     info : [hostname:
"X.Y.Z.112:4080:9876", czxid: 12884902051], load : [numTopics: 1]

[hedwig: (reg2) 8] show hubs
Available Hub Servers:
        X.Y.Z.116:4080:9876 :     info : [hostname:
"X.Y.Z.116:4080:9876", czxid: 34359738415], load : [numTopics: 0]
        X.Y.Z.114:4080:9876 :     info : [hostname:
"X.Y.Z.114:4080:9876", czxid: 34359738416], load : [numTopics: 0]
        X.Y.Z.112:4080:9876 :     info : [hostname:
"X.Y.Z.112:4080:9876", czxid: 34359738417], load : [numTopics: 0]

Another thing I observed is that the "sub" command sometimes hung in
this case, but I did not find any error messages in the log files.
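
A rough way to double-check "show hubs" listings like the ones above is to
parse them and flag any hub whose address is not one of the region's own
VMs. The sketch below is only an illustration: the names parse_hubs and
foreign_hubs are made up here, and the regular expression assumes the
console prints hubs exactly in the wrapped form shown in this message.

```python
import re

# Matches one hub entry; \s* also spans the line break inserted by the mail client.
HUB_RE = re.compile(
    r'hostname:\s*"([^"]+)",\s*czxid:\s*(\d+)\],\s*load\s*:\s*\[numTopics:\s*(\d+)\]'
)


def parse_hubs(text):
    """Return (host, num_topics) for every hub entry found in the listing."""
    hubs = []
    for name, _czxid, topics in HUB_RE.findall(text):
        host = name.split(":", 1)[0]  # drop the two port fields after the host
        hubs.append((host, int(topics)))
    return hubs


def foreign_hubs(text, local_hosts):
    """Return the hosts in the listing that are not VMs of the local region."""
    return [host for host, _ in parse_hubs(text) if host not in local_hosts]


# The reg1 listing from this message, copied verbatim.
SAMPLE = '''Available Hub Servers:
        X.Y.Z.111:4080:9876 :     info : [hostname:
"X.Y.Z.111:4080:9876", czxid: 12884902049], load : [numTopics: 0]
        X.Y.Z.116:4080:9876 :     info : [hostname:
"X.Y.Z.116:4080:9876", czxid: 12884902050], load : [numTopics: 0]
        X.Y.Z.112:4080:9876 :     info : [hostname:
"X.Y.Z.112:4080:9876", czxid: 12884902051], load : [numTopics: 1]
'''

REG1_HOSTS = {"X.Y.Z.111", "X.Y.Z.112", "X.Y.Z.113"}
print(foreign_hubs(SAMPLE, REG1_HOSTS))  # -> ['X.Y.Z.116']
```

For the reg1 listing this reports X.Y.Z.116, a reg2 VM, which matches the
observation that reg1 contains a hub from reg2.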

Am I doing something wrong? Did I set the "regions" option correctly?
Any help is deeply appreciated.

Thanks,
Ming

Re: Hedwig Across-Region Configuration

Posted by Ivan Kelly <iv...@apache.org>.
Hi Ming,

It's been a long time since I looked at the region stuff in Hedwig, but I
think the problem could be that you are not setting the region identifier
in hw_server.conf. You need to change "region" in hw_server.conf to some
identifier, like reg1 and reg2 in your example.

Hope this helps,
Ivan


Re: Hedwig Across-Region Configuration

Posted by Ming Chen <mc...@cs.stonybrook.edu>.
Hi Ivan,

Thanks for the heads-up. Sorry that I didn't make it clear, but I did set
the region option in hw_server.conf to "reg1" and "reg2" for the two
regions, respectively.

I tried some more experiments, and got some error messages from the
following operations on just one region:
(1) format
(2) show topics # it throws an IOException, which is probably okay as we
did not have any topic to show
(3) pub mytopic1 hello-topic1
(4) sub mytopic1 myid1 2

[hedwig: (reg1) 88] format
You ask to format hedwig metadata stored in org.apache.hedwig.server.meta.ZkMetadataManagerFactory.
Press <Return> to continue, or Q to cancel ...
2015-01-08 00:09:45,752 - INFO  - [main:HedwigAdmin@541] - Formatted Hedwig metadata successfully.
2015-01-08 00:09:45,757 - INFO  - [main:HedwigAdmin@544] - Removed old factory layout.
2015-01-08 00:09:45,770 - INFO  - [main:HedwigAdmin@548] - Created new factory layout.
Formatted hedwig metadata successfully.
Finished 2.352 s.
[hedwig: (reg1) 89] show topics
Unable to fetch the list of topics
java.io.IOException: Failed to get topics list :
        at org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:98)
        at org.apache.hedwig.admin.HedwigAdmin.getTopics(HedwigAdmin.java:331)
        at org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.showTopics(HedwigConsole.java:588)
        at org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.runCmd(HedwigConsole.java:564)
        at org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
        at org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
        at org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
        at org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hedwig/reg1/topics
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
        at org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:96)
        ... 7 more
Finished 0.015 s.
[hedwig: (reg1) 90] pub mytopic1 hello-topic1
PUB DONE
Finished 0.472 s.
[hedwig: (reg1) 91] sub mytopic1 myid1 2
2015-01-08 00:13:38,021 - INFO  - [New I/O worker #6:HChannelHandler@228] - Channel [id: 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080] was disconnected to host localhost/127.0.0.1:4080.
2015-01-08 00:13:38,022 - INFO  - [New I/O worker #6:AbstractHChannelManager@357] - NonSubscription Channel [id: 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080] to localhost/127.0.0.1:4080 disconnected.
2015-01-08 00:13:38,030 - INFO  - [New I/O worker #7:HChannelHandler@228] - Channel [id: 0x9615a67b, /127.0.0.1:52098 :> localhost/127.0.0.1:4080] was disconnected to host localhost/127.0.0.1:4080.
2015-01-08 00:13:38,031 - INFO  - [New I/O worker #7:SimpleHChannelManager@191] - Subscription Channel [id: 0x9615a67b, /127.0.0.1:52098 :> localhost/127.0.0.1:4080] disconnected from localhost/127.0.0.1:4080.
2015-01-08 00:13:38,037 - ERROR - [main:HedwigSubscriber@130] - Unexpected PubSubException thrown:
org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
        at org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
        at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
SUB FAILED
org.apache.hedwig.exceptions.PubSubException$ServiceDownException: org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
        at org.apache.hedwig.client.netty.HedwigSubscriber.subUnsub(HedwigSubscriber.java:133)
        at org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:194)
        at org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:181)
        at org.apache.hedwig.admin.console.HedwigConsole$SubCmd.runCmd(HedwigConsole.java:291)
        at org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
        at org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
        at org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
        at org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
Caused by: org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
        at org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
        at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Thanks,
Ming



Re: Hedwig Across-Region Configuration

Posted by Ivan Kelly <iv...@apache.org>.
Hi Ming,

This looks like a bug. Feel free to dig in and try to fix it :)

The cross-region stuff in Hedwig was never tested extensively, so there are
probably quite a few bugs in there.

Regards
Ivan

On Mon, Jan 12, 2015 at 7:42 PM, Ming Chen <mc...@cs.stonybrook.edu> wrote:

> FYI,  the cross-region communication is working now after I used the
> latest code from git and enabled SSL in conf.
>
> However, there seems to be an infinite loop when I do "sub mytopic
> myid1-1 2" in "hedwig console":
> [hedwig: (reg1) 164] sub mytopic myid1-1 2
> SUB DONE AND RECEIVE
> Finished 0.031 s.
> [hedwig: (reg1) 165] Received message from topic mytopic for subscriber
> myid1-1 : neeeeew-msg-from-reg2
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 :
> abs-new-msg-from-reg1
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 :
> neeeeew-msg-from-reg2
> Received message from topic mytopic for subscriber myid1-1 : msg-2-1
> Received message from topic mytopic for subscriber myid1-1 :
> abs-new-msg-from-reg1
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 :
> neeeeew-msg-from-reg2
> ...
>
> Thanks,
> Ming
>

Re: Hedwig Across-Region Configuration

Posted by Ming Chen <mc...@cs.stonybrook.edu>.
I found that the problem was caused by my configuration. When setting
"regions" in hw_server.conf, the local region should NOT be included.

The problem went away after setting "regions=X.Y.Z.114:4080" for the first
region (reg1) and "regions=X.Y.Z.111:4080" for the second region (reg2).
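
This rule can be checked mechanically before restarting the hubs. The
sketch below is an illustration only, not part of Hedwig: local_entries is
a hypothetical helper, and it assumes "regions" is a whitespace-separated
list of host:port entries, as in the values quoted in this thread.

```python
def local_entries(regions_value, local_hosts):
    """Return the entries of a `regions` value whose host is in the local region."""
    flagged = []
    for entry in regions_value.split():
        host = entry.rsplit(":", 1)[0]  # strip the trailing :port, if any
        if host in local_hosts:
            flagged.append(entry)
    return flagged


# The three VMs of reg1, as listed in the first message of the thread.
REG1_HOSTS = {"X.Y.Z.111", "X.Y.Z.112", "X.Y.Z.113"}

# Original setting used on all 6 VMs: it names reg1's own hub.
print(local_entries("X.Y.Z.111:4080 X.Y.Z.114:4080", REG1_HOSTS))  # -> ['X.Y.Z.111:4080']

# Corrected setting for reg1: only the remote region is listed.
print(local_entries("X.Y.Z.114:4080", REG1_HOSTS))  # -> []
```

An empty result means the value lists only remote regions, which is the
configuration that worked here.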

Thanks,
Ming

On Tue, Jan 13, 2015 at 4:38 AM, Ivan Kelly <iv...@apache.org> wrote:

>    Hi Ming,
>
>  This looks like a bug. Feel free to dig in and try and fix it :)
>
>  The cross region stuff in hedwig was never tested extensively, so there's
> probably quite a few bugs in there.
>
>  Regards
>  Ivan
>
> On Mon, Jan 12, 2015 at 7:42 PM, Ming Chen <mc...@cs.stonybrook.edu>
> wrote:
>
>> FYI,  the cross-region communication is working now after I used the
>> latest code from git and enabled SSL in conf.
>>
>>  Even though there seems to be an infinite loop when I do "sub mytopic
>> myid1-1 2" in "hedwig console":
>>  [hedwig: (reg1) 164] sub mytopic myid1-1 2
>> SUB DONE AND RECEIVE
>> Finished 0.031 s.
>> [hedwig: (reg1) 165] Received message from topic mytopic for subscriber
>> myid1-1 : neeeeew-msg-from-reg2
>> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
>>  Received message from topic mytopic for subscriber myid1-1 :
>> abs-new-msg-from-reg1
>> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
>> Received message from topic mytopic for subscriber myid1-1 :
>> neeeeew-msg-from-reg2
>> Received message from topic mytopic for subscriber myid1-1 : msg-2-1
>>  Received message from topic mytopic for subscriber myid1-1 :
>> abs-new-msg-from-reg1
>> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
>> Received message from topic mytopic for subscriber myid1-1 :
>> neeeeew-msg-from-reg2
>>  ...
>>
>>  Thanks,
>> Ming
>>
>> On Thu, Jan 8, 2015 at 11:24 AM, Ming Chen <mc...@cs.stonybrook.edu>
>> wrote:
>>
>>>   Hi Ivan,
>>>
>>>  Thanks for the heads-up. Sorry that I didn't make it clear, but I did
>>> set the region option in hw_server.conf to "reg1" and "reg2" for the two
>>> regions, respectively.
>>>
>>>  I tried some more experiments, and got some error message with the
>>> following operations on just one region:
>>> (1) format
>>> (2) show topics # it throws an IOException, which is probably okay as we
>>> did not have any topic to show
>>> (3) pub mytopic1 hello-topic1
>>> (4) sub mytopic1 myid1 2
>>>
>>>  [hedwig: (reg1) 88] format
>>> You ask to format hedwig metadata stored in
>>> org.apache.hedwig.server.meta.ZkMetadataManagerFactory.
>>> Press <Return> to continue, or Q to cancel ...
>>> 2015-01-08 00:09:45,752 - INFO  - [main:HedwigAdmin@541] - Formatted
>>> Hedwig metadata successfully.
>>> 2015-01-08 00:09:45,757 - INFO  - [main:HedwigAdmin@544] - Removed old
>>> factory layout.
>>> 2015-01-08 00:09:45,770 - INFO  - [main:HedwigAdmin@548] - Created new
>>> factory layout.
>>> Formatted hedwig metadata successfully.
>>> Finished 2.352 s.
>>> [hedwig: (reg1) 89] show topics
>>> Unable to fetch the list of topics
>>> java.io.IOException: Failed to get topics list :
>>>         at
>>> org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:98)
>>>         at
>>> org.apache.hedwig.admin.HedwigAdmin.getTopics(HedwigAdmin.java:331)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.showTopics(HedwigConsole.java:588)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.runCmd(HedwigConsole.java:564)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>>> KeeperErrorCode = NoNode for /hedwig/reg1/topics
>>>         at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>>>         at
>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>>         at
>>> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
>>>         at
>>> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
>>>         at
>>> org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:96)
>>>         ... 7 more
>>> Finished 0.015 s.
>>> [hedwig: (reg1) 90] pub mytopic1 hello-topic1
>>> PUB DONE
>>> Finished 0.472 s.
>>> [hedwig: (reg1) 91] sub mytopic1 myid1 2
>>> 2015-01-08 00:13:38,021 - INFO  - [New I/O worker #6:HChannelHandler@228]
>>> - Channel [id: 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080]
>>> was disconnected to host localhost/127.0.0.1:4080.
>>> 2015-01-08 00:13:38,022 - INFO  - [New I/O worker
>>> #6:AbstractHChannelManager@357] - NonSubscription Channel [id:
>>> 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080] to
>>> localhost/127.0.0.1:4080 disconnected.
>>> 2015-01-08 00:13:38,030 - INFO  - [New I/O worker #7:HChannelHandler@228]
>>> - Channel [id: 0x9615a67b, /127.0.0.1:52098 :> localhost/127.0.0.1:4080]
>>> was disconnected to host localhost/127.0.0.1:4080.
>>> 2015-01-08 00:13:38,031 - INFO  - [New I/O worker
>>> #7:SimpleHChannelManager@191] - Subscription Channel [id: 0x9615a67b,
>>> /127.0.0.1:52098 :> localhost/127.0.0.1:4080] disconnected from
>>> localhost/127.0.0.1:4080.
>>> 2015-01-08 00:13:38,037 - ERROR - [main:HedwigSubscriber@130] -
>>> Unexpected PubSubException thrown:
>>> org.apache.hedwig.exceptions.PubSubException$UncertainStateException:
>>> Server ack response never received before server connection disconnected!
>>>         at
>>> org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
>>>         at
>>> org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>         at
>>> org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>         at
>>> org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>>>         at
>>> org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>>>         at
>>> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>>         at
>>> org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>>         at
>>> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>>         at
>>> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> SUB FAILED
>>> org.apache.hedwig.exceptions.PubSubException$ServiceDownException:
>>> org.apache.hedwig.exceptions.PubSubException$UncertainStateException:
>>> Server ack response never received before server connection disconnected!
>>>         at
>>> org.apache.hedwig.client.netty.HedwigSubscriber.subUnsub(HedwigSubscriber.java:133)
>>>         at
>>> org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:194)
>>>         at
>>> org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:181)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole$SubCmd.runCmd(HedwigConsole.java:291)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
>>>         at
>>> org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
>>> Caused by:
>>> org.apache.hedwig.exceptions.PubSubException$UncertainStateException:
>>> Server ack response never received before server connection disconnected!
>>>         at
>>> org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
>>>         at
>>> org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>         at
>>> org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>>         at
>>> org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>>>         at
>>> org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>>>         at
>>> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>>         at
>>> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>>         at
>>> org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>>         at
>>> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>>         at
>>> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>         at java.lang.Thread.run(Thread.java:745)
>>>
>>>  Thanks,
>>> Ming
>>>
>>>
>>> On Thu, Jan 8, 2015 at 6:05 AM, Ivan Kelly <iv...@apache.org> wrote:
>>> > Hi Ming,
>>> >
>>> > It's been a long time since I looked at the region stuff in hedwig,
>>> > but I think it could be that you don't seem to be setting the region
>>> > identifier in hw_server.conf. You need to change "region" in hw_server
>>> > to some identifier, like reg1 and reg2 for your example.
>>> >
>>> > Hope this helps,
>>> > Ivan
>>> >
>>>
>>>
>>
>

Re: Hedwig Across-Region Configuration

Posted by Ming Chen <mc...@cs.stonybrook.edu>.
FYI, the cross-region communication is working now after I switched to the
latest code from git and enabled SSL in the conf.

However, there seems to be an infinite loop when I run "sub mytopic
myid1-1 2" in "hedwig console"; the same messages are delivered repeatedly:
[hedwig: (reg1) 164] sub mytopic myid1-1 2
SUB DONE AND RECEIVE
Finished 0.031 s.
[hedwig: (reg1) 165] Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
Received message from topic mytopic for subscriber myid1-1 : abs-new-msg-from-reg1
Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
Received message from topic mytopic for subscriber myid1-1 : msg-2-1
Received message from topic mytopic for subscriber myid1-1 : abs-new-msg-from-reg1
Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
...
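
A rough way to quantify the repetition above is to count how often each
message body shows up in the captured console output. The Python sketch
below is only an illustration and is not part of Hedwig: the function name
count_deliveries is made up, and it assumes every delivery is printed in the
"Received message from topic ... for subscriber ... : ..." form shown above,
possibly wrapped by the mail client after "subscriber" or after the colon.

```python
import re
from collections import Counter

# Matches one delivery line as printed by the console, for example:
# "Received message from topic mytopic for subscriber myid1-1 : msg-2-1"
DELIVERY = re.compile(
    r"Received message from topic (\S+) for subscriber (\S+) :\s*(.*)"
)


def count_deliveries(console_text):
    """Count deliveries per (topic, subscriber, message) in console output."""
    # Undo mail-client wrapping: rejoin lines that were broken right after
    # "for subscriber" or right after the " :" separator.
    joined = re.sub(r"(for subscriber| :)[ \t]*\n[ \t]*", r"\1 ", console_text)
    counts = Counter()
    for line in joined.splitlines():
        # search() rather than match(), because the first delivery can
        # follow a console prompt such as "[hedwig: (reg1) 165] ".
        found = DELIVERY.search(line)
        if found:
            topic, subscriber, message = found.groups()
            counts[(topic, subscriber, message.strip())] += 1
    return counts
```

Feeding it the saved console output and printing counts.most_common() shows
which messages were delivered more than once. A message that was published
only once but has a count above 1 was redelivered.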

Thanks,
Ming
