Posted to user@storm.apache.org by 李家宏 <jh...@gmail.com> on 2014/06/06 10:01:19 UTC

Re: Tuples lost in Storm 0.9.1

Hi, Daria

The lost tuples may be going to one of two places:
1) the message_queue in the Netty client, which would show up as a memory leak;
2) Netty's internal buffer; if the connection is lost, every tuple sitting in it is lost with it.

So check your worker logs for any connection-lost errors.
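
For what it's worth, since your topology already runs ackers (conf.setNumAckers(6) below) and BaseBasicBolt anchors and acks automatically, a tuple dropped in the transport layer can only be recovered if the spout emits with a message id and replays it on fail(); otherwise it just disappears. Below is a minimal sketch of such a spout, not your actual Kestrel spout, and fetchFromQueue() is only a placeholder for the real read:

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical replaying spout: emitting with a message id makes the ackers
// track the tuple tree, so a tuple lost in the transport layer times out and
// comes back through fail() instead of vanishing silently.
public class ReplayingSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private final Map<String, String> pending = new ConcurrentHashMap<String, String>();

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        String payload = fetchFromQueue();            // placeholder for the Kestrel read
        if (payload == null) {
            return;
        }
        String msgId = UUID.randomUUID().toString();
        pending.put(msgId, payload);
        collector.emit(new Values(payload), msgId);   // message id => tracked by the ackers
    }

    @Override
    public void ack(Object msgId) {
        pending.remove(msgId);                        // fully processed, safe to forget
    }

    @Override
    public void fail(Object msgId) {
        String payload = pending.get(msgId);          // timed out or failed downstream: replay it
        if (payload != null) {
            collector.emit(new Values(payload), msgId);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("payload"));
    }

    private String fetchFromQueue() {
        return null;                                  // stand-in; the real spout would poll Kestrel here
    }
}

If tuples were tracked this way, transport drops would show up as Failed counts on the spout in the UI rather than as silent gaps.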

Regards


2014-05-16 12:18 GMT+08:00 李家宏 <jh...@gmail.com>:

> I am running into the same issue. Where do the lost tuples go? If they
> were queuing up in the transport layer, the memory usage should keep
> increasing, but I didn't see any noticeable memory leaks.
>
> Does Storm guarantee that all tuples sent from task A to task B will
> be received by task B? Moreover, are they received in order?
>
> Can anybody shed some light on this issue?
>
>
> 2014-04-02 20:56 GMT+08:00 Daria Mayorova <d....@gmail.com>:
>
>> Hi everyone,
>>
>> We are having some issues with our Storm topology. The problem is that
>> some tuples are being lost somewhere in the topology. Just after the
>> topology is deployed it runs fine, but after several hours it starts
>> to lose a significant number of tuples.
>>
>> From what we've found in the logs, tuples exit one bolt/spout and
>> never enter the next bolt.
>>
>> Here is some info about the topology:
>>
>>    - The Storm version is 0.9.1, and Netty is used as the transport
>>    - The spout extends BaseRichSpout, and the bolts extend BaseBasicBolt
>>    - The spout reads from a Kestrel message queue
>>    - The cluster consists of 2 nodes: ZooKeeper, Nimbus and the UI run on
>>    one node, and the workers run on the other. The content of the config
>>    files is included below. We have also tried running the workers on the
>>    same node as Nimbus and ZooKeeper, and also on both nodes, but the
>>    behavior is the same.
>>
>> According to the Storm UI there are no Failed tuples. Can anybody
>> suggest what might be causing the tuples to get lost?
>>
>> Thanks.
>>
>> *Storm config (storm.yaml)*
>> (When both nodes run workers, the configuration is the same on both;
>> only the "storm.local.hostname" parameter changes)
>>
>> storm.zookeeper.servers:
>>      - "zkserver1"
>> nimbus.host: "nimbusserver"
>> storm.local.dir: "/mnt/storm"
>> supervisor.slots.ports:
>>     - 6700
>>     - 6701
>>     - 6702
>>     - 6703
>> storm.local.hostname: "storm1server"
>>
>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>> supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>> worker.childopts: "-Xmx3548m -Djava.net.preferIPv4Stack=true"
>>
>> storm.cluster.mode: "distributed"
>> storm.local.mode.zmq: false
>> storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
>>
>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>
>> storm.messaging.netty.server_worker_threads: 1
>> storm.messaging.netty.client_worker_threads: 1
>> storm.messaging.netty.buffer_size: 5242880 #5MB buffer
>> storm.messaging.netty.max_retries: 30
>> storm.messaging.netty.max_wait_ms: 1000
>> storm.messaging.netty.min_wait_ms: 100
>>
>> *Zookeeper config (zoo.cfg):*
>> tickTime=2000
>> initLimit=10
>> syncLimit=5
>> dataDir=/var/zookeeper
>> clientPort=2181
>> autopurge.purgeInterval=24
>> autopurge.snapRetainCount=5
>> server.1=localhost:2888:3888
>>
>> *Topology configuration* passed to the StormSubmitter:
>> Config conf = new Config();
>> conf.setNumAckers(6);         // acker executors that track tuple trees
>> conf.setNumWorkers(4);        // worker processes for the topology
>> conf.setMaxSpoutPending(100); // max un-acked tuples in flight per spout task
>>
>>
>> Best regards,
>> Daria Mayorova
>>
>
>
>
> --
>
> ======================================================
>
> Gvain
>
> Email: jh.li.em@gmail.com
>



-- 

======================================================

Gvain

Email: jh.li.em@gmail.com

Ensuring QoS in multi-tenant storm - high level questions

Posted by adiya n <na...@yahoo.com>.
Hello Storm experts,

A few high-level questions.

1) DRPC in production:
    -    Has anyone been successful in running DRPC in production?
    -    For example, running DRPC behind a lightweight webserver or something
          like that, and meeting SLA requirements. Obviously it all depends on
          what processing you are doing, but I wanted to know in general.
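
For reference, here is a minimal local-mode sketch of the 0.9.x DRPC call path using LinearDRPCTopologyBuilder. ExclaimBolt is a toy stand-in for real processing; in production a web tier would call backtype.storm.utils.DRPCClient against the cluster's DRPC servers instead of using LocalDRPC.

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.LocalDRPC;
import backtype.storm.drpc.LinearDRPCTopologyBuilder;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class DrpcSketch {

    // Each DRPC tuple carries (request-id, arguments); the last bolt must
    // emit (request-id, result) so the DRPC server can match the reply.
    public static class ExclaimBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            collector.emit(new Values(tuple.getValue(0), tuple.getString(1) + "!"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("id", "result"));
        }
    }

    public static void main(String[] args) throws Exception {
        LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclaim");
        builder.addBolt(new ExclaimBolt(), 3);

        // Local mode just to show the call path; a real deployment would use
        // DRPCClient against the drpc.servers listed in storm.yaml.
        LocalDRPC drpc = new LocalDRPC();
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("drpc-sketch", new Config(), builder.createLocalTopology(drpc));

        System.out.println(drpc.execute("exclaim", "hello"));  // prints "hello!"

        cluster.shutdown();
        drpc.shutdown();
    }
}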
   

2) Ensuring QoS:
    - We have a mix of traffic that comes in from properties A to Z, but we
      want to meet the SLA for each property. Think of a property as the
      source of a data stream, or maybe a topic in Kafka.
    - It can happen that one property comes in with a whole lot of traffic
      and chokes the processing of another property B.
    - Is it better to have the topology replicated, with dedicated spouts and
      dedicated bolts for each property? But that's expensive.

     What is the best way to guarantee the SLA?
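
For what it's worth, the dedicated-components layout mentioned above is straightforward to express with TopologyBuilder. PropertySpout and ProcessBolt below are hypothetical names; the point is one spout/bolt chain per property, so a burst on one property can only saturate its own executors:

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class PerPropertyTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // One spout + bolt chain per property (hypothetical PropertySpout reads
        // that property's stream, e.g. one Kafka topic; ProcessBolt is the shared logic).
        for (String property : new String[] {"A", "B", "C"}) {
            builder.setSpout("spout-" + property, new PropertySpout(property), 1);
            builder.setBolt("process-" + property, new ProcessBolt(), 4)
                   .shuffleGrouping("spout-" + property);
        }

        Config conf = new Config();
        conf.setNumWorkers(3);
        StormSubmitter.submitTopology("per-property", conf, builder.createTopology());
    }
}

This trades extra executors for isolation; as far as I know, a shared pipeline in 0.9.x has no built-in per-source rate limiting, so one noisy property can otherwise crowd out the rest.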
   
3) Making HTTP calls to external services from bolts:
    - One part of the processing also involves processing images, and we need
      to download these images.
    - Is it better to do this in a bolt? But then I will be under-utilizing the
      bolt's resources while it is waiting for images to be downloaded.
    - What is the best way to handle HTTP communication with external
      services from a bolt?
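
One common pattern, sketched below with a hypothetical ImageFetchBolt: run the blocking downloads on an internal thread pool, but keep all OutputCollector calls on the bolt's own executor thread (the conservative rule for 0.9.x), draining finished work from execute(). The simpler alternative is just to raise the bolt's parallelism hint so that many blocking downloads run concurrently.

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical ImageFetchBolt: downloads run on an internal thread pool so the
// executor thread is not blocked, while all collector calls stay on the
// executor thread.
public class ImageFetchBolt extends BaseRichBolt {
    private OutputCollector collector;
    private transient ExecutorService pool;
    private transient ConcurrentLinkedQueue<Tuple> finished;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        this.pool = Executors.newFixedThreadPool(16);       // sized for I/O wait, not CPU
        this.finished = new ConcurrentLinkedQueue<Tuple>();
    }

    @Override
    public void execute(Tuple input) {
        final Tuple in = input;
        pool.submit(new Runnable() {
            public void run() {
                download(in.getStringByField("url"));       // blocking HTTP call, off-thread
                finished.add(in);                            // hand back to the executor thread
            }
        });

        // Ack whatever the pool has completed; only this thread touches the collector.
        // Caveat: queued results are only drained when another tuple arrives, so a
        // tick tuple (or a modest max.spout.pending) is needed to flush the tail.
        Tuple done;
        while ((done = finished.poll()) != null) {
            collector.ack(done);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no downstream stream in this sketch; real code would declare and emit results
    }

    private void download(String url) {
        // placeholder for the real HTTP client call
    }
}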
 
thanks!!