Posted to user@storm.apache.org by 徐鹏 <xp...@outlook.com> on 2015/07/08 11:01:59 UTC

Two topologies have the same name

Hi: As shown in the figure, two topologies have the same name, which should be impossible. Their uptimes are close, so is it a concurrency issue?
Cheers

Re: Problem to recept massive tuples

Posted by Kyle Nusbaum <kn...@yahoo-inc.com>.
The 'back pressure' only works if acking is enabled, which is not the case in his topology.
 -- Kyle
 


     On Wednesday, July 8, 2015 11:37 AM, Matthias J. Sax <mj...@informatik.hu-berlin.de> wrote:
   

 Hi,

this sounds weird... Storm should apply back pressure and slow down the
spout if bolts cannot keep up... Have you tried to increase the dop
(degree of parallelism) of the bolt? Which bolt is the problematic one?
NumberAvg or TimeGlobalAvg?

You might also check out the `max spout pending` property to solve the
problem.


-Matthias




  

Re: Problem to recept massive tuples

Posted by "Matthias J. Sax" <mj...@informatik.hu-berlin.de>.
Hi,

this sounds weird... Storm should apply back pressure and slow down the
spout if bolts cannot keep up... Have you tried to increase the dop
(degree of parallelism) of the bolt? Which bolt is the problematic one?
NumberAvg or TimeGlobalAvg?

You might also check out the `max spout pending` property to solve the
problem.


-Matthias


On 07/08/2015 05:54 PM, charlie quillard wrote:
> Hi,
> 
> 
> I have started performance testing on Storm. In my full use case, when I
> send many tuples (> 1000), I can get a "core dump" because my bolt
> cannot process all the tuples from my spout.
> 
> For testing, I added a gist:
> https://gist.github.com/episanchez/a5c101bdf637a5ff2e28 . When I did
> not put in a 1 millisecond sleep, I had the same problem.
> 
> So I would like to know how to fix it without adding a sleep.
> 
> 
> Thanks in advance,
> 
> Charlie QUILLARD
> 


RE: Problem to recept massive tuples

Posted by charlie quillard <ch...@epitech.eu>.
You're right, my last bolt (it was kafkaBolt) did not emit anything.
Thank you, I now have a better understanding of Storm processing.

________________________________________
From: Matthias J. Sax <mj...@informatik.hu-berlin.de>
Sent: Thursday, July 9, 2015 13:27
To: user@storm.apache.org
Subject: Re: Problem to recept massive tuples

I guess it is an anchoring "problem" if you do not receive acks.

Let's assume you have the following dependency chain of tuples:

t1 -> t2 -> t3 -> t4

i.e., t1 is emitted by a spout, a bolt b1 processes t1 and emits t2, and so
forth. Furthermore, t2 is anchored to t1, and so forth.

The last bolt, consuming t4, does not emit anything, and t4 is not used as
an anchor tuple.

The ack for t1 will only be delivered to the spout if all tuples t1 to
t4 got acked.


You should check whether your anchoring and acking work accordingly.


-Matthias





Re: Problem to recept massive tuples

Posted by "Matthias J. Sax" <mj...@informatik.hu-berlin.de>.
I guess it is an anchoring "problem" if you do not receive acks.

Let's assume you have the following dependency chain of tuples:

t1 -> t2 -> t3 -> t4

i.e., t1 is emitted by a spout, a bolt b1 processes t1 and emits t2, and so
forth. Furthermore, t2 is anchored to t1, and so forth.

The last bolt, consuming t4, does not emit anything, and t4 is not used as
an anchor tuple.

The ack for t1 will only be delivered to the spout if all tuples t1 to
t4 got acked.


You should check whether your anchoring and acking work accordingly.
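
The chain t1 -> t2 -> t3 -> t4 described above can be illustrated with a small stand-alone model. This is only a sketch and not Storm code: the class AckTreeModel, its methods, and the fixed tuple ids are hypothetical names chosen for illustration. It assumes the XOR-based bookkeeping described in Storm's "Guaranteeing Message Processing" documentation, where each ack XORs the acked tuple id and the ids of newly anchored tuples into a running value, and the spout is notified once that value reaches zero.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (not Storm code) of tracking a tuple tree with a running XOR. */
final class AckTreeModel {
    // Per spout message id: XOR of the ids of tuples emitted but not yet acked.
    private final Map<String, Long> ackVal = new HashMap<>();
    // Message ids whose whole tuple tree has been acked.
    private final Set<String> completed = new HashSet<>();

    /** The spout emits a root tuple under the given message id. */
    void spoutEmit(String msgId, long tupleId) {
        ackVal.put(msgId, tupleId);
    }

    /** A bolt acks its input after emitting zero or more tuples anchored to it. */
    void boltAck(String msgId, long inputId, long... anchoredChildIds) {
        Long current = ackVal.get(msgId);
        if (current == null) {
            return; // unknown or already completed tree
        }
        long value = current ^ inputId;
        for (long child : anchoredChildIds) {
            value ^= child;
        }
        if (value == 0L) {
            // Every emitted tuple has been acked: the spout would get its ack now.
            ackVal.remove(msgId);
            completed.add(msgId);
        } else {
            ackVal.put(msgId, value);
        }
    }

    boolean spoutAcked(String msgId) {
        return completed.contains(msgId);
    }

    /** Plays the chain t1 -> t2 -> t3 -> t4; the last bolt may or may not ack t4. */
    static boolean chainCompletes(boolean lastBoltAcks) {
        long t1 = 0x11L, t2 = 0x22L, t3 = 0x44L, t4 = 0x88L; // illustrative ids
        AckTreeModel acker = new AckTreeModel();
        acker.spoutEmit("m1", t1);
        acker.boltAck("m1", t1, t2); // b1 acks t1, emits t2 anchored to t1
        acker.boltAck("m1", t2, t3); // b2 acks t2, emits t3
        acker.boltAck("m1", t3, t4); // b3 acks t3, emits t4
        if (lastBoltAcks) {
            acker.boltAck("m1", t4); // last bolt acks t4 and emits nothing
        }
        return acker.spoutAcked("m1");
    }

    public static void main(String[] args) {
        System.out.println(chainCompletes(true));
        System.out.println(chainCompletes(false));
    }
}
```

In this model, the spout-side ack for "m1" only fires when the last bolt acks t4 as well; if any bolt in the chain forgets to ack, the running value never returns to zero and the spout's ack callback is never invoked.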


-Matthias



On 07/09/2015 12:41 PM, charlie quillard wrote:
> Hi Matthias,
> 
> Thanks for your advice. I increased the parallelism of my bolts and set "max spout pending" to 10, which works, but my ack function is never called when my processing chain is heavy.
> For example, I received all my ack messages with my example Storm chain, but when I used a complex chain (ASN.1 decoding and Avro encoding), I did not receive any ack messages and I don't understand why.
> 
> Best regards,
> Charlie
> 


RE: Problem to recept massive tuples

Posted by charlie quillard <ch...@epitech.eu>.
Hi Matthias,

Thanks for your advice. I increased the parallelism of my bolts and set "max spout pending" to 10, which works, but my ack function is never called when my processing chain is heavy.
For example, I received all my ack messages with my example Storm chain, but when I used a complex chain (ASN.1 decoding and Avro encoding), I did not receive any ack messages and I don't understand why.

Best regards,
Charlie

________________________________________
From: Matthias J. Sax <mj...@informatik.hu-berlin.de>
Sent: Thursday, July 9, 2015 12:02
To: user@storm.apache.org
Subject: Re: Problem to recept massive tuples

Hi Charlie,

yes, if you want to use back pressure, you need to use message IDs for
tuples in spouts and, in bolts, anchor emitted tuples (input tuples are
anchors) and ack processed tuples.

On the other hand, I was wondering whether you tried to increase the
parallelism of NumberAvgBolt to avoid the bottleneck.

-Matthias




Re: Problem to recept massive tuples

Posted by "Matthias J. Sax" <mj...@informatik.hu-berlin.de>.
Hi Charlie,

yes, if you want to use back pressure, you need to use message IDs for
tuples in spouts and, in bolts, anchor emitted tuples (input tuples are
anchors) and ack processed tuples.

On the other hand, I was wondering whether you tried to increase the
parallelism of NumberAvgBolt to avoid the bottleneck.

-Matthias


On 07/09/2015 09:54 AM, charlie quillard wrote:
> Hi,
> 
> 
> Thanks for your help. My problem is with NumberAvgBolt, which is
> overwhelmed by the spout tuples. To summarize, if I understand
> correctly, I have to implement reliability as in this example:
> http://www.datasalt.com/2012/01/real-time-feed-processing-with-storm/
> 
> With this solution, could I run into memory overflow problems if I
> receive too many messages from my sources?
> 
> Best regards,
> Charlie QUILLARD
> 


RE: Problem to recept massive tuples

Posted by charlie quillard <ch...@epitech.eu>.
Hi,


Thanks for your help. My problem is with NumberAvgBolt, which is overwhelmed by the spout tuples. To summarize, if I understand correctly, I have to implement reliability as in this example: http://www.datasalt.com/2012/01/real-time-feed-processing-with-storm/

With this solution, could I run into memory overflow problems if I receive too many messages from my sources?

Best regards,
Charlie QUILLARD

________________________________
From: 임정택 <ka...@gmail.com>
Sent: Wednesday, July 8, 2015 23:18
To: user@storm.apache.org
Subject: Re: Problem to recept massive tuples

Hi, Charlie.

Your spout has the issue. A spout shouldn't stay long in nextTuple(), because the spout handles events (including calls to ack, fail, and nextTuple) in an event loop with just one thread.

In other words, back pressure cannot work in your spout, because max spout pending works by checking the pending queue size before calling nextTuple(); if it is greater than max spout pending, the call to nextTuple() is skipped.

If you want to follow max spout pending strictly, nextTuple() should emit only one tuple, but this is not a hard rule.

Please note that max spout pending only works in ack mode, and for now it is the only way for Storm to handle back pressure.

Hope this helps.

Thanks,
Jungtaek Lim (HeartSaVioR)



Re: Problem to recept massive tuples

Posted by 임정택 <ka...@gmail.com>.
Hi, Charlie.

Your spout has the issue. A spout shouldn't stay long in nextTuple(), because
the spout handles events (including calls to ack, fail, and nextTuple) in an
event loop with just one thread.

In other words, back pressure cannot work in your spout, because max spout
pending works by checking the pending queue size before calling nextTuple();
if it is greater than max spout pending, the call to nextTuple() is skipped.

If you want to follow max spout pending strictly, nextTuple() should emit
only one tuple, but this is not a hard rule.

Please note that max spout pending only works in ack mode, and for now it
is the only way for Storm to handle back pressure.
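
The loop described above can be sketched as a small stand-alone simulation. It is a toy model and not Storm's executor code: the class SpoutLoopModel, the method run, and its parameters are hypothetical, and the exact boundary (nextTuple() is skipped once the pending count reaches the limit) is an assumption of the sketch. Each turn of the single thread first delivers at most one ack and then calls nextTuple() only if there is room in the pending set. Tuples emitted without a message id are not tracked, which is why the limit has no effect without acking.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (not Storm code) of a single-threaded spout loop with a pending limit. */
final class SpoutLoopModel {
    /**
     * Simulates the loop for a number of turns and returns how often nextTuple() ran.
     *
     * @param iterations number of turns of the event loop
     * @param maxPending limit on tracked, not yet acked tuples
     * @param ackEvery   one ack arrives on every ackEvery-th turn; 0 means acks never arrive
     * @param reliable   true if the spout emits with a message id (ack mode)
     */
    static int run(int iterations, int maxPending, int ackEvery, boolean reliable) {
        Deque<Integer> pending = new ArrayDeque<>();
        int emitted = 0;
        for (int turn = 1; turn <= iterations; turn++) {
            // Acks are handled on the same thread that calls nextTuple().
            if (ackEvery > 0 && turn % ackEvery == 0 && !pending.isEmpty()) {
                pending.removeFirst();
            }
            // nextTuple() is skipped while the pending set is full.
            if (pending.size() >= maxPending) {
                continue;
            }
            emitted++; // nextTuple() emits exactly one tuple per call
            if (reliable) {
                pending.addLast(emitted); // only tuples with a message id are tracked
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(100, 10, 5, true));  // throttled to the ack rate
        System.out.println(run(100, 10, 5, false)); // no message ids: never throttled
        System.out.println(run(100, 10, 0, true));  // acks never arrive: stalls at the limit
    }
}
```

With 100 turns, a limit of 10, and one ack every 5 turns, the reliable spout in this model emits 30 tuples, the unreliable one emits 100, and a reliable spout whose acks never arrive stops after 10. The last case matches a topology in which some bolt never acks its input.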

Hope this helps.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thursday, July 9, 2015, charlie quillard <ch...@epitech.eu> wrote:

>  Hi,
>
>
> I have started performance testing on Storm. In my full use case, when I
> send many tuples (> 1000), I can get a "core dump" because my bolt cannot
> process all the tuples from my spout.
>
> For testing, I added a gist:
> https://gist.github.com/episanchez/a5c101bdf637a5ff2e28 . When I did
> not put in a 1 millisecond sleep, I had the same problem.
>
> So I would like to know how to fix it without adding a sleep.
>
>
>  Thanks in advance,
>
> Charlie QUILLARD
>


-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Problem to recept massive tuples

Posted by charlie quillard <ch...@epitech.eu>.
Hi,


I have started performance testing on Storm. In my full use case, when I send many tuples (> 1000), I can get a "core dump" because my bolt cannot process all the tuples from my spout.

For testing, I added a gist: https://gist.github.com/episanchez/a5c101bdf637a5ff2e28 . When I did not put in a 1 millisecond sleep, I had the same problem.

So I would like to know how to fix it without adding a sleep.


Thanks in advance,

Charlie QUILLARD