You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by Guillermo Szeliga <gs...@dexmatech.com> on 2014/08/28 12:02:59 UTC

Fwd: [URGENT] False tuple failure?

Hi there,

Lately we spent a considerable amount of time trying to figure out the
reason behind a misleading case (that we'll expose below) and that still
today can't understand the given conditions that trigger the fail of the
tuple , so we thought in asking the pros and experienced ones out there
hoping that you can guide us though the dark path of storm behaviour so,
here we go:

We've been performing several stress tests playing around with different
configuration settings. We started off by setting one instance per
bolt/spout (hint: 1, task: 1) and everything worked smooth, no failures at
all. Then, we decided to put some strain on the rest of the topology by
using the following configuration:

- RabbitMQ spout (hint: 2, tasks:4)
- Discretization bolt (hint: 1, tasks:1)
- Fatten-The-Pig bolt (hint: 1, tasks:1)
- Bouncer bolt (hint: 1, tasks:1)
- Weighting bolt (hint: 1, tasks:1)
- Sysout bolt (hint: 1, tasks:1)

also, in order to have more visibility on how contention and backpressure
strategies are used and affects the whole performance, we also used the
following configuration:

        config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 128);
        config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 128);
        config.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 120);
        config.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 64);


Now, after a while, we start getting traces as the ones below:

2014-08-28 10:44:28 c.d.i.d.s.s.RabbitMQSpout [DEBUG] New reading has been
retrieved and successfully parsed [uuid:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2',
msg: '279', task: '7']
2014-08-28 10:44:28 c.d.i.d.s.s.RabbitMQSpout [DEBUG] EMIT [msg:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2']
2014-08-28 10:44:45 b.s.d.executor [DEBUG] Failing message 279: {:stream
"default", :values ["4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2"
"genericftp-177091" "1" "2014-08-12T21:30:00.000 UTC" "127.0.0.1"
"2014-08-27 15:50:42.375541" (#<Tuple (402,1448.3394702747873)>)]}
2014-08-28 10:44:56 c.d.i.d.s.b.DiscretizationBolt [INFO] Discretization
requested [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', nnid: '1', ts:
'2014-08-12T21:30:00.000 UTC', raw: '[(402,1448.3394702747873)]']
2014-08-28 10:44:56 c.d.i.d.s.b.DiscretizationBolt [DEBUG] EMIT [msg:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2']
2014-08-28 10:44:56 c.d.i.d.s.b.DiscretizationBolt [DEBUG] ACK [msg:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2']
2014-08-28 10:44:58 c.d.i.d.s.b.FattenThePigBolt [DEBUG] Tuple enrichment
requested [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', gwmac:
'genericftp-177091', nid: '1']
2014-08-28 10:44:58 c.d.i.d.s.b.FattenThePigBolt [DEBUG] Mote matched
[uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', gwmac: 'genericftp-177091',
nid: '1', mt_id: '74735', gw_id: '8879']
2014-08-28 10:44:58 c.d.i.d.s.b.FattenThePigBolt [INFO] Tuple successfully
enriched  [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', gwmac:
'genericftp-177091', nid: '1', mt_id: '74735', gw_id: '8879']
2014-08-28 10:44:58 c.d.i.d.s.b.BouncerBolt [DEBUG] Mote dealing with The
Bouncer [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', mt_id: '74735',
gw_id: '8879']
2014-08-28 10:44:58 c.d.i.d.s.b.BouncerBolt [DEBUG] Mote fullfils
specifications. Now checking datapoints [uuid:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2',
mt_id: '74735', gw_id: '8879']
2014-08-28 10:44:58 c.d.i.d.s.b.BouncerBolt [INFO] Mote accepted [uuid:
'4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', mt_id: '74735', gw_id: '8879']
2014-08-28 10:44:58 c.d.i.d.s.b.WeightingBolt [INFO] Received '1' readings
to be weighted [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', up-to-ts:
'2014-08-12T23:30:00.000+02:00', mt_id: '74735']
2014-08-28 10:44:58 c.d.i.d.s.b.WeightingBolt [DEBUG] Weighting datapoint
'402' [uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', up-to-ts:
'2014-08-12T23:30:00.000+02:00', mt_id: '74735']
2014-08-28 10:44:58 c.d.i.d.s.b.SysOutBolt [INFO] SYSOUT :== source:
weighting-bolt:12, stream: default, id:
{-4612146741903420812=-8379812311928270863},
[4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2, genericftp-177091, 1,
2014-08-12T21:30:00.000 UTC, 127.0.0.1, 2014-08-27 15:50:42.375541,
[DiscretizedReading(datapoint=402, discreteValue=3.3479927538794527,
accumulatedValue=1448.3394702747873, frecuency=NONE, operation=ACUM,
refTimestamp=2014-08-12T21:15:00.000 UTC, dirty=false)],
Mote(gateway=Gateway(id=8879, mac=genericftp-177091,
timezone=Europe/London, connected=true, registered=true, depId=1659),
id=74735, status=PENDING, nid=1, name=device 1, datapoints=[Datapoint
[id=127326, networkid=402, status=PENDING, mote=74735]]),
{402=[WeightedReading(year=2014, month=8, day=12, hour=22, minute=15,
value=1.1159975846264842, frequency=NONE, dst=true),
WeightedReading(year=2014, month=8, day=12, hour=22, minute=20,
value=1.1159975846264842, frequency=NONE, dst=true),
WeightedReading(year=2014, month=8, day=12, hour=22, minute=25,
value=1.1159975846264842, frequency=NONE, dst=true),
WeightedReading(year=2014, month=8, day=12, hour=22, minute=30, value=0.0,
frequency=NONE, dst=true)]}]
2014-08-28 10:44:58 c.d.i.d.s.b.WeightingBolt [INFO] Weighting emited
[uuid: '4e0ff06d-77a1-4341-ab4e-79cc7a33b3d2', up-to-ts:
'2014-08-12T23:30:00.000+02:00', mt_id: '74735']


See how a tuple is flagged a failed after it is emmited by the spout but
still it goes on until the end of the stream (SysOutBolt). As far as we
understand, this tuple did not fail, it reached the final bolt as expected
but, despite that, the tuple is marked as 'FAILED'. We've detected that we
increased latency on the first bolt (discretization) up to the point where
its capacity is above one but we don't wan't to scale out until we figure
out why we're getting this false failres on some tuples. Before you suggest
the usual stuff you can find on the google groups, we've double checked
that all ACKs and EMITS are correctly invoked so we've got everything
anchored as expected ;)

By the way, we're using v0.9.2

Cheers and thanks in advance.