Posted to user@storm.apache.org by 李家宏 <jh...@gmail.com> on 2014/06/06 10:01:19 UTC
Re: Tuples lost in Storm 0.9.1
Hi, Daria
The lost tuples may be going to one of two places:
1) the message_queue in the Netty client, which will cause a memory leak;
2) Netty's internal buffer; if the connection is lost, all tuples in it are lost.
So, check your worker logs to see whether there are any connection-lost errors.
Regards
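To put rough numbers on the reconnect behaviour: with the settings quoted further down (min_wait_ms 100, max_wait_ms 1000, max_retries 30), and assuming the client doubles the wait from the minimum up to the maximum cap on each retry (the exact schedule depends on the Storm version), the client gives up after roughly half a minute, and anything still queued at that point is dropped. A small sketch of that arithmetic:

```java
// Rough estimate of how long the Netty client keeps retrying before giving
// up, assuming the wait doubles from min_wait_ms up to the max_wait_ms cap
// on each retry. This illustrates the arithmetic only -- it is not the exact
// Storm 0.9.1 retry schedule.
public class RetryWindowSketch {
    public static long totalWaitMs(int maxRetries, long minWaitMs, long maxWaitMs) {
        long total = 0;
        long wait = minWaitMs;
        for (int i = 0; i < maxRetries; i++) {
            total += wait;
            wait = Math.min(wait * 2, maxWaitMs); // exponential backoff, capped
        }
        return total;
    }

    public static void main(String[] args) {
        // Values taken from the storm.yaml quoted below.
        System.out.println(totalWaitMs(30, 100, 1000) + " ms");
    }
}
```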
2014-05-16 12:18 GMT+08:00 李家宏 <jh...@gmail.com>:
> I am running into the same issue. Where have the lost tuples gone? If they
> were queueing in the transport layer, the memory usage should keep
> increasing, but I didn't see any noticeable memory leaks.
>
> Does Storm guarantee that all tuples sent from task A to task B will
> be received by task B? Moreover, are they received in order?
>
> Can anybody shed some light on this issue?
>
>
> 2014-04-02 20:56 GMT+08:00 Daria Mayorova <d....@gmail.com>:
>
> Hi everyone,
>>
>> We are having some issues with our Storm topology. The problem is that
>> some tuples are being lost somewhere in the topology. Just after the
>> topology is deployed it runs pretty well, but after several hours it
>> starts to lose a significant number of tuples.
>>
>> From what we've found in the logs, the tuples exit one bolt/spout
>> but never enter the next bolt.
>>
>> Here is some info about the topology:
>>
>> - The version is 0.9.1, and netty is used as transport
>> - The spout is extending BaseRichSpout, and the bolts extend
>> BaseBasicBolt
>> - The spout is using Kestrel message queue
>> - The cluster consists of 2 nodes: zookeeper, nimbus and ui are
>> running on one node, and the workers run on another node. I am attaching
>> the content of the config files below. We have also tried running the
>> workers on another node (the same where nimbus and zookeeper are), and also
>> on both nodes, but the behavior is the same.
>>
>> According to the Storm UI there are no failed tuples. Can anybody give
>> any idea of what might be causing the tuples to get lost?
>>
>> Thanks.
>>
>> *Storm config (storm.yaml)*
>> (In case both nodes have workers running, the configuration is the same
>> on both nodes, just the "storm.local.hostname" parameter changes)
>>
>> storm.zookeeper.servers:
>> - "zkserver1"
>> nimbus.host: "nimbusserver"
>> storm.local.dir: "/mnt/storm"
>> supervisor.slots.ports:
>> - 6700
>> - 6701
>> - 6702
>> - 6703
>> storm.local.hostname: "storm1server"
>>
>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>> supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>> worker.childopts: "-Xmx3548m -Djava.net.preferIPv4Stack=true"
>>
>> storm.cluster.mode: "distributed"
>> storm.local.mode.zmq: false
>> storm.thrift.transport:
>> "backtype.storm.security.auth.SimpleTransportPlugin"
>>
>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>
>> storm.messaging.netty.server_worker_threads: 1
>> storm.messaging.netty.client_worker_threads: 1
>> storm.messaging.netty.buffer_size: 5242880 #5MB buffer
>> storm.messaging.netty.max_retries: 30
>> storm.messaging.netty.max_wait_ms: 1000
>> storm.messaging.netty.min_wait_ms: 100
>>
>> *Zookeeper config (zoo.cfg):*
>> tickTime=2000
>> initLimit=10
>> syncLimit=5
>> dataDir=/var/zookeeper
>> clientPort=2181
>> autopurge.purgeInterval=24
>> autopurge.snapRetainCount=5
>> server.1=localhost:2888:3888
>>
>> *Topology configuration* passed to the StormSubmitter:
>> Config conf = new Config();
>> conf.setNumAckers(6);
>> conf.setNumWorkers(4);
>> conf.setMaxSpoutPending(100);
>>
>>
>> Best regards,
>> Daria Mayorova
>>
>
>
>
> --
>
> ======================================================
>
> Gvain
>
> Email: jh.li.em@gmail.com
>
--
======================================================
Gvain
Email: jh.li.em@gmail.com
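On the delivery-guarantee question in the quoted thread: Storm gives at-least-once delivery only when the spout emits each tuple with a message id and downstream bolts anchor and ack (BaseBasicBolt does the anchoring and acking automatically); failed or timed-out tuples are then replayed from the spout. Ordering across tasks is not guaranteed. A minimal self-contained sketch of the spout-side bookkeeping that makes replay possible (plain Java, not the Storm API):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.UUID;

// Toy model of spout-side reliability bookkeeping: every emitted tuple is
// tracked by message id until it is acked; failed (or timed-out) ids are
// queued for replay. This is how at-least-once delivery is achieved -- note
// it does NOT give ordering or exactly-once semantics.
public class AtLeastOnceSketch {
    private final Map<String, String> pending = new HashMap<>(); // msgId -> payload
    private final Queue<String> replay = new ArrayDeque<>();     // payloads to re-emit

    public String emit(String payload) {
        String msgId = UUID.randomUUID().toString();
        pending.put(msgId, payload); // tracked until acked or failed
        return msgId;
    }

    public void ack(String msgId) {
        pending.remove(msgId); // fully processed, forget it
    }

    public void fail(String msgId) {
        String payload = pending.remove(msgId);
        if (payload != null) replay.add(payload); // will be emitted again
    }

    public String nextReplay() {
        return replay.poll();
    }

    public static void main(String[] args) {
        AtLeastOnceSketch spout = new AtLeastOnceSketch();
        String id1 = spout.emit("tuple-1");
        String id2 = spout.emit("tuple-2");
        spout.ack(id1);  // delivered and fully processed
        spout.fail(id2); // lost downstream -> queued for replay
        System.out.println(spout.nextReplay());
    }
}
```

If the spout in the thread above emits without a message id, Storm never reports those tuples as failed, which is consistent with the UI showing no failures while tuples disappear.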
Ensuring QoS in multi-tenant storm - high level questions
Posted by adiya n <na...@yahoo.com>.
Hello Storm experts,
A few high-level questions.
1) DRPC in production:
- Has anyone been successful in running DRPC in production?
- For example, running DRPC behind a lightweight webserver or something
similar while meeting SLA requirements. Obviously it all depends on what
processing you are doing, but I wanted to know in general.
2) Ensuring QoS:
- We have a mix of traffic that comes in from properties A to Z, but we
want to meet the SLA for each property. Think of a property as the source
of a data stream, or perhaps a topic in Kafka.
- It can happen that one property comes in with a whole lot of traffic and
chokes the processing of another property B.
- Is it better to replicate the topology, with dedicated spouts and
dedicated bolts for each property? But that's expensive.
What is the best way to guarantee an SLA?
3) Making HTTP calls to external services from bolts:
- One part of the processing also involves processing images, and we need
to download these images.
- Is it better to do this in a bolt? But then I will be under-utilizing bolt
resources while the bolt is waiting for images to be downloaded.
- What is the best way to handle HTTP communication with external
services from a bolt?
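For question 3, a common pattern is to hand the blocking download off to a small thread pool inside the bolt so execute() returns immediately, then drain completed results from the bolt's own thread (in Storm 0.9.x the OutputCollector is not thread-safe, so pool threads should not emit directly). A minimal plain-Java sketch of that hand-off; the names and the placeholder fetch are illustrative, not Storm API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of offloading blocking HTTP work from a bolt: submit() schedules the
// fetch on a pool and returns immediately; drain() is called from the bolt's
// own thread (e.g. on the next execute() or a tick tuple) to collect and emit
// whatever has completed. Names are illustrative, not the Storm API.
public class AsyncDownloadSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final ConcurrentLinkedQueue<String> finished = new ConcurrentLinkedQueue<>();

    // Called from execute(tuple): schedule the blocking fetch and return.
    public void submit(final String url) {
        pool.submit(new Runnable() {
            public void run() {
                // Placeholder for the real blocking HTTP GET.
                finished.add("downloaded:" + url);
            }
        });
    }

    // Called from the bolt thread: collect whatever has completed so far.
    public List<String> drain() {
        List<String> out = new ArrayList<>();
        String r;
        while ((r = finished.poll()) != null) out.add(r);
        return out;
    }

    public void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        AsyncDownloadSketch bolt = new AsyncDownloadSketch();
        bolt.submit("http://example.com/a.jpg");
        bolt.submit("http://example.com/b.jpg");
        bolt.shutdown(); // wait so this demo sees both results
        System.out.println(bolt.drain().size());
    }
}
```

The pool size bounds how many downloads run concurrently, so one slow external service cannot tie up more than that many threads.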
thanks!!