Posted to user@storm.apache.org by Sheikh Al Amin Sunoyon <su...@gmail.com> on 2014/08/12 21:04:57 UTC

Nimbus, supervisor, and worker processes died due to increasing cache memory in Storm cluster

I've used the following topology to process a high-traffic stream. I'm
processing at a rate of ~2500 records/sec.

Storm Cluster:
4 m3.xlarge EC2 machines in AWS.

Topology:

TopologyBuilder builder = new TopologyBuilder();

builder.setSpout("kinesis_spout", spout, 5);

builder.setBolt("redis", new KinesisSplitLog(), 60)
        .setNumTasks(2400)
        .shuffleGrouping("kinesis_spout");

// the bolt only writes some count value into redis
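
Roughly, the bolt looks like this (a simplified sketch, not the exact class; the
Redis host and key names are placeholders, and it assumes an older Jedis API
where connections are handed back with returnResource):

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class KinesisSplitLog extends BaseRichBolt {

    private transient JedisPool pool;   // created in prepare(), never serialized with the bolt
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        // "redis-host" is a placeholder; the pool/timeout setup is shown further below
        this.pool = new JedisPool("redis-host", 6379);
    }

    @Override
    public void execute(Tuple tuple) {
        Jedis jedis = pool.getResource();
        try {
            // the bolt only increments a counter in Redis ("record_count" is illustrative)
            jedis.incr("record_count");
            collector.ack(tuple);
        } finally {
            // hand the connection back so the pool does not leak connections
            pool.returnResource(jedis);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // nothing is emitted downstream
    }
}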

Configuration:

Config conf = new Config();

conf.registerSerialization(java.nio.ByteBuffer.class);
conf.registerSerialization(com.amazonaws.services.kinesis.model.Record.class,
        RecordSerializer.class);
conf.setSkipMissingKryoRegistrations(true);
conf.setFallBackOnJavaSerialization(false);

conf.setNumAckers(10);
conf.setDebug(false);
conf.setNumWorkers(5);
conf.setMaxSpoutPending(3000);
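
RecordSerializer is a small custom Kryo serializer for the Kinesis Record,
roughly along these lines (a simplified sketch; the exact fields written may
differ):

import java.nio.ByteBuffer;

import com.amazonaws.services.kinesis.model.Record;
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class RecordSerializer extends Serializer<Record> {

    @Override
    public void write(Kryo kryo, Output output, Record record) {
        output.writeString(record.getPartitionKey());
        output.writeString(record.getSequenceNumber());
        // copy the payload without disturbing the original buffer position
        byte[] data = new byte[record.getData().remaining()];
        record.getData().duplicate().get(data);
        output.writeInt(data.length, true);
        output.writeBytes(data);
    }

    @Override
    public Record read(Kryo kryo, Input input, Class<Record> type) {
        Record record = new Record();
        record.setPartitionKey(input.readString());
        record.setSequenceNumber(input.readString());
        int length = input.readInt(true);
        record.setData(ByteBuffer.wrap(input.readBytes(length)));
        return record;
    }
}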


The cache memory increased gradually, and after some time (10-12 hours
later) the nimbus, supervisor, and worker processes died. The memory occupied
by the processes themselves stayed fixed.

I've used a Jedis connection pool in the bolt, where the timeout is 2 minutes.
Does it cache the data during the session connection with Redis?
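
The pool is created more or less like this (JedisPool/JedisPoolConfig from
redis.clients.jedis; the host and port are placeholders, and 120000 ms is the
2-minute timeout I mentioned):

JedisPoolConfig poolConfig = new JedisPoolConfig();

// the last argument is the timeout in milliseconds (2 minutes)
JedisPool pool = new JedisPool(poolConfig, "redis-host", 6379, 120000);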

I've attached screenshots of the memory state of each machine, captured with
Ganglia, and of the Storm UI.

Why do all the processes of the Storm cluster and the topology die? Please help me out.