Posted to user@storm.apache.org by Sajith <sa...@gmail.com> on 2014/04/25 11:40:33 UTC

Need help with Storm remote cluster deployment (workers on one machine are not getting tuples)

Hi all,

I was trying out Storm in a remote deployment and trying to measure
end-to-end latency.

I was using 4 machines; let's call them M1, M2, M3 and M4. Each machine is
used as follows:
M1 - Nimbus
M2 - Supervisor
M3 - Supervisor
M4 - ZooKeeper

Then I wrote a simple topology which contains 1 spout (StampTimeSpout) and
2 bolts (PassThroughBolt, LatencyMeasureBolt). StampTimeSpout generates
tuples which contain the current time, a group id (guaranteed to be either
"0" or "1") and the task id. These tuples are sent to PassThroughBolt, which
adds its own task id to the tuple and forwards it to LatencyMeasureBolt.
LatencyMeasureBolt calculates the latency using the time contained in the
tuple.

Following is the topology.

 builder.setSpout("timestamp", new StampTimeSpout(), 10);
 builder.setBolt("pass", new PassThroughBolt(), 2)
                 .shuffleGrouping("timestamp");
 builder.setBolt("measure-bolt", new LatencyMeasureBolt(), 2)
                 .fieldsGrouping("pass", new Fields("groupid"));
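
For readers without the attached files, the per-tuple flow described above can be sketched in plain Java roughly as follows. This is only a model of the three stages, not the attached spout and bolt code: it uses no Storm classes, and the class name TupleFlowSketch, the method names and the field order are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the tuple flow; hypothetical names, no Storm classes.
class TupleFlowSketch {
    // StampTimeSpout stage: build [timestamp, groupid, spout task id].
    static List<Object> stamp(long nowMillis, String groupId, int spoutTaskId) {
        return new ArrayList<Object>(Arrays.asList(nowMillis, groupId, spoutTaskId));
    }

    // PassThroughBolt stage: copy the tuple and append this bolt's task id.
    static List<Object> passThrough(List<Object> tuple, int boltTaskId) {
        List<Object> out = new ArrayList<Object>(tuple);
        out.add(boltTaskId);
        return out;
    }

    // LatencyMeasureBolt stage: receive time minus the stamped time.
    static long latencyMillis(List<Object> tuple, long receivedAtMillis) {
        return receivedAtMillis - (Long) tuple.get(0);
    }

    public static void main(String[] args) {
        List<Object> tuple = stamp(System.currentTimeMillis(), "0", 3);
        List<Object> forwarded = passThrough(tuple, 7);
        long latency = latencyMillis(forwarded, System.currentTimeMillis());
        System.out.println("tuple: " + forwarded + ", latency ms: " + latency);
    }
}
```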

When I deployed this on the cluster:
- the spout workers were spread across M2 and M3
- both workers of PassThroughBolt were deployed on M2
- one worker of LatencyMeasureBolt was deployed on M2 and the other on M3

As I mentioned, the tuples have a field called "groupid", and I make sure
the group id is either 0 or 1. I use fields grouping on "groupid".
Therefore, both workers of LatencyMeasureBolt should get messages.
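
To make that expectation concrete, here is a minimal sketch of how a fields grouping could choose a target task, assuming it hashes the list of grouping values and takes the result modulo the number of target tasks. The class FieldsGroupingSketch and the method targetTask are hypothetical names, and this is not taken from Storm's source, so the real selection logic may differ.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model of fields grouping; hypothetical names, not Storm code.
class FieldsGroupingSketch {
    // Assumption: target index = hash of the grouping values mod task count.
    // floorMod keeps the index non-negative even for negative hash codes.
    static int targetTask(List<Object> groupingValues, int numTasks) {
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        // Two LatencyMeasureBolt tasks, group ids "0" and "1" as in the topology.
        for (String groupId : new String[] {"0", "1"}) {
            int task = targetTask(Arrays.<Object>asList(groupId), 2);
            System.out.println("groupid " + groupId + " -> task index " + task);
        }
    }
}
```

Under this assumption the two group ids map to different task indices, which is why I expect both LatencyMeasureBolt workers to receive tuples.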

The problem I face is that the worker of LatencyMeasureBolt on M3 is not
getting any messages from the workers of PassThroughBolt, which are on M2.
The worker of LatencyMeasureBolt on M2, however, is working as intended.

I also observe that the worker on M2 only receives messages of a single
group, because of the fields grouping, and all the messages of the other
group seem to be lost.

Any idea what's going on, or have I done something wrong? I have attached
the .java files I used.

PS:
Please note that I verified the connectivity among the machines using ping
and telnet. Also, both the M2 and M3 supervisors were listed in the UI,
which showed that one task of LatencyMeasureBolt was running on M3 and that
spout workers were also running on M3. This is not intermittent; it happens
every time.