Posted to dev@storm.apache.org by 임정택 <ka...@gmail.com> on 2015/06/24 00:03:42 UTC

Gathering opinion: changing multilang heartbeat mechanism

Hi!

Since this concerns the multilang feature, and you can use your own
implementation of multilang (I believe multilang library developers are
subscribed to the user group), I'd like to gather opinions about changing
the multilang heartbeat mechanism.

Storm 0.9.3 introduced the multilang heartbeat feature.
http://storm.apache.org/documentation/Multilang-protocol.html
If you use Storm 0.9.3 or higher and didn't know about the change, you
may skip this mail.

The feature carries a design constraint. I'm doing my best to add
workarounds, but they cannot cover every situation (STORM-738
<https://issues.apache.org/jira/browse/STORM-738>). That's why I want to
change the mechanism and get rid of the design constraint.

AS-IS (STORM-513 <https://issues.apache.org/jira/browse/STORM-513>)

- When the subprocess receives a heartbeat tuple, it sends sync to the
parent.
- ShellSpout / ShellBolt updates the last heartbeat timestamp when it
receives sync.
-- Added workaround: ShellSpout / ShellBolt updates the timestamp when it
receives any kind of message. (This isn't applied to ShellBolt yet, but
it's ready for review: STORM-742
<https://issues.apache.org/jira/browse/STORM-742>)
- ShellSpout / ShellBolt checks the last heartbeat timestamp periodically,
and if the timestamp has not been updated in time, it kills itself.
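
As a rough illustration of the AS-IS flow above, here is a minimal Python
sketch. The heartbeat tuple shape (task -1, stream "__heartbeat"), the sync
reply, and the "end" line framing follow the Multilang protocol page linked
above. The function names and the Watchdog class are hypothetical: they only
model the behavior and are not Storm's actual ShellBolt code.

```python
import json
import time

HEARTBEAT_STREAM = "__heartbeat"  # stream name carried by heartbeat tuples


def is_heartbeat(msg):
    """Return True if a decoded multilang message is a heartbeat tuple."""
    return msg.get("task") == -1 and msg.get("stream") == HEARTBEAT_STREAM


def encode(msg):
    """Frame a message as the protocol does: JSON followed by a line 'end'."""
    return json.dumps(msg) + "\nend\n"


def handle(msg, process):
    """Subprocess side: answer heartbeats with sync, pass other tuples on.

    `process` is a hypothetical callback returning the reply command dict.
    """
    if is_heartbeat(msg):
        return encode({"command": "sync"})
    return encode(process(msg))


class Watchdog:
    """Toy model of the parent side (hypothetical, not Storm's class)."""

    def __init__(self, timeout_secs, now=time.monotonic):
        self.timeout_secs = timeout_secs
        self.now = now
        self.last_heartbeat = now()

    def on_message(self, msg):
        # STORM-742 workaround: any message counts, not only sync.
        self.last_heartbeat = self.now()

    def timed_out(self):
        """True once no message has arrived for longer than the timeout."""
        return self.now() - self.last_heartbeat > self.timeout_secs
```

Note that `handle` runs in the same loop that processes tuples, so a reply
to a heartbeat can only go out between tuples. That is the design constraint
discussed in this mail.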

TO-BE (STORM-871 <https://issues.apache.org/jira/browse/STORM-871>)

- The subprocess has to update the pid file's modified time periodically.
-- In the default implementation, it updates the pid file every 1 sec.
-- This should be handled concurrently with executing pending tuples.
-- Some languages may not be able to implement this cleanly, but I don't
know which languages those would be.
- ShellSpout / ShellBolt checks the last heartbeat timestamp periodically
by reading the pid file's modified time, and if the timestamp has not been
updated in time, it kills itself.
- The heartbeat tuple is removed.
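
The TO-BE steps above could look roughly like the following Python sketch.
It is only an illustration under assumptions: the function names, the pid
file path passed in, and the timeout value are hypothetical, and the parent
side would really live in ShellSpout / ShellBolt rather than in Python.

```python
import os
import threading
import time


def start_pid_file_heartbeat(pid_file, interval_secs=1.0):
    """Touch `pid_file` every `interval_secs` from a daemon thread.

    Returns a threading.Event; set it to stop the heartbeat thread.
    """
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_secs):
            os.utime(pid_file, None)  # set atime/mtime to the current time

    threading.Thread(target=run, daemon=True).start()
    return stop


def heartbeat_expired(pid_file, timeout_secs):
    """Parent-side check: is the file's mtime older than the timeout?"""
    return time.time() - os.path.getmtime(pid_file) > timeout_secs
```

The heartbeat thread runs next to the tuple-processing loop, which is why
this approach depends on how well a language can run two things at once.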

Please let me know your opinion, especially if you're developing
multilang libraries.

Thanks,
Jungtaek Lim (HeartSaVioR)

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by Srikanth <sr...@gmail.com>.
The RStorm package adds multilang support for the R language. Bringing
multithreading to R is not easy.
The current design is fine, as long as every message is treated as a
heartbeat by ShellBolt.

Srikanth

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by 임정택 <ka...@gmail.com>.
I agree.

I didn't want to keep the "design constraint", but given the GIL I can't
find a better solution for now.
I've changed my mind and will stick with the current approach; in that
case at least STORM-742 should be merged.

Actually we can adjust SUPERVISOR_WORKER_TIMEOUT_SECS to make it work, but
if we want a separate variable, I'll be happy to add one.

Thanks for following up this thread, Dan.

Best,
Jungtaek Lim (HeartSaVioR)


-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by Dan Blanchard <da...@parsely.com>.
If the GIL is a problem with both approaches, I think the best course of action would be to just stick with what is already in the Multi-Lang protocol, rather than adding another thing that Storm libraries will need to support.

Also, as long as the amount of time that a ShellBolt will wait to hear from a subprocess is configurable, I don't think the current approach would be a problem for CPU intensive tasks, as people can just bump up the wait time. 

-Dan

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by 임정택 <ka...@gmail.com>.
Thinking about the GIL once more, the current approach can't deal with it
either.
If one tuple takes longer than the heartbeat timeout because it is doing a
heavily CPU-intensive job, the subprocess cannot ack or emit anything until
processing ends.

The GIL is a limitation of the languages, not a multi-lang issue.
And the GIL bothers us however we check the heartbeat from the subprocess.
The only way to avoid this situation is multiprocessing, which is complex
enough that I'm wary of going down that path.
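
A rough way to observe the effect described above is to run a periodic
timer thread next to a CPU-bound thread and record how late each tick
fires. This is only a sketch with hypothetical function names. It is not
the gist from my experiment, and the numbers it produces depend heavily on
the interpreter version and platform.

```python
import threading
import time


def burn_cpu(seconds):
    """Pure-Python busy loop that competes for the GIL."""
    deadline = time.monotonic() + seconds
    n = 0
    while time.monotonic() < deadline:
        n += 1
    return n


def measure_timer_lateness(interval_secs, busy_secs):
    """Return how late (in seconds) each timer tick fired under CPU load."""
    lateness = []
    stop = threading.Event()

    def timer():
        next_due = time.monotonic() + interval_secs
        while not stop.is_set():
            time.sleep(max(0.0, next_due - time.monotonic()))
            # Waking up requires the GIL, so a busy thread can delay this.
            lateness.append(time.monotonic() - next_due)
            next_due += interval_secs

    t = threading.Thread(target=timer)
    t.start()
    burn_cpu(busy_secs)  # the CPU-intensive "tuple" runs in the main thread
    stop.set()
    t.join()
    return lateness
```

For example, `max(measure_timer_lateness(1.0, 10.0))` gives the worst delay
seen by a 1-second timer during 10 seconds of CPU-bound work.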

Best,
Jungtaek Lim (HeartSaVioR)


2015-07-10 11:19 GMT+09:00 임정택 <ka...@gmail.com>:

> Dan,
>
> I just experimented with Python's GIL, and with Python 2.7.6 on OS X I
> found that another thread can hold the CPU for more than 1 sec after the
> timer has expired.
> https://gist.github.com/HeartSaVioR/34d90cdd6af906e72935
>
> I wasn't affected by this issue while I was working with Python because
> my job was I/O intensive; it seems a CPU-intensive job behaves
> differently.
>
> The default tick time is quite long. I found one document that says the
> tick time is about ~6.5 secs, which doesn't meet our requirement.
>
> I don't think my experiment represents normal usage of a multilang bolt,
> but who knows?
>
> - To all,
>
> So in the end, the newer heartbeat mechanism has another constraint, and
> it appears to affect the very languages that are mainly supported now.
>
> I think the newer heartbeat mechanism can solve more issues than the
> current mechanism, but that is just my opinion.
> I don't feel strongly about applying the newer heartbeat mechanism now
> that I've found another constraint.
>
> I'd like to hear any opinions, objections, or suggestions, so please
> don't hesitate to share them.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>
>
> 2015-07-10 8:04 GMT+09:00 임정택 <ka...@gmail.com>:
>
>> Thanks, Dan, for giving your opinion. :)
>>
>> To tell the truth, when I was implementing STORM-513, Sean talked to me
>> privately about why the design constraint is necessary. It was a valid
>> opinion, actually.
>>
>> I was thinking the multilang feature should consider all languages. That
>> rules out many kinds of approaches, and in the end it introduced the
>> design constraint.
>>
>> After this constraint was introduced, Dashengju pointed out to me that
>> the design constraint can't cover some situations, which STORM-742 still
>> can't cover either.
>>
>> I agree, and I've changed my mind: it's time for the multilang feature to
>> drop support for languages that don't meet future requirements.
>>
>> I know the default implementations of Python and Ruby have the GIL issue,
>> but AFAIK the context switch interval is not too long, so it doesn't
>> prevent the heartbeat timer from acting on time.
>> (Please let me know if you've met a GIL issue that blocks one thread for
>> seconds.)
>>
>> I don't expect the subprocess to change the modified time exactly every
>> 1 sec, and ShellSpout and ShellBolt will allow for that, too.
>>
>> It is a replacement for the current heartbeat mechanism, so when we
>> introduce the new heartbeat, the old one should be removed.
>> It could introduce a backward compatibility issue (especially changing
>> the protocol), so we should consider which version can adopt this.
>>
>> Thanks for reading this long mail.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Friday, July 10, 2015, Dan Blanchard <da...@parsely.com> wrote:
>>
>>> Hi Jungtaek,
>>>
>>> Sorry I didn’t notice this earlier, as I was the person who filed
>>> STORM-513 <https://issues.apache.org/jira/browse/STORM-513> in the
>>> first place.
>>>
>>> Having just implemented the new heartbeat protocol in Python (for
>>> streamparse <https://github.com/Parsely/streamparse/pull/87>) and Perl (for
>>> IO::Storm
>>> <https://github.com/dan-blanchard/io-storm/commit/d1bac6bcac9fa2f8c6eee5ce3eae7f98eb45930e>),
>>> I’m not crazy about needing to add another heartbeat approach to multiple
>>> libraries so soon.
>>>
>>> I also am against needing to deal with multithreading in Python (where
>>> there will be GIL issues) just to accommodate a change to the heartbeat
>>> protocol. It seems to me that the workaround you proposed in STORM-742
>>> <https://issues.apache.org/jira/browse/STORM-742> (where any command
>>> the ShellBolt receives counts as a heartbeat) should be sufficient.
>>>
>>> Thanks,
>>> Dan
>>>

-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by 임정택 <ka...@gmail.com>.
Dan,

I just ran an experiment on Python's GIL, and with Python 2.7.6 on OS X I
found that another thread can hold the CPU for more than 1 sec after the
timer has expired.
https://gist.github.com/HeartSaVioR/34d90cdd6af906e72935

Actually, I wasn't affected by this issue while I was working with Python,
because those were I/O-intensive jobs, and it seems CPU-intensive jobs
behave differently.
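
For readers who want to try something similar, here is a minimal sketch of
this kind of measurement. It is not the code from the gist above: it assumes
Python 3 (for time.monotonic), and the function name measure_timer_lag and
its parameters are made up for illustration. A background thread tries to
wake up at fixed intervals while the main thread runs a CPU-bound loop, and
the sketch records how late each wake-up was.

```python
import threading
import time


def measure_timer_lag(interval=0.1, ticks=5, busy_seconds=1.0):
    """Return how late (in seconds) each timer tick fired.

    Hypothetical helper for illustration only.
    """
    lags = []

    def ticker():
        expected = time.monotonic() + interval
        for _ in range(ticks):
            # Sleep until the next expected tick, then record the delay.
            time.sleep(max(0.0, expected - time.monotonic()))
            lags.append(time.monotonic() - expected)
            expected += interval

    timer_thread = threading.Thread(target=ticker)
    timer_thread.start()

    # CPU-bound loop in the main thread; in CPython it competes for the GIL
    # with the timer thread.
    deadline = time.monotonic() + busy_seconds
    counter = 0
    while time.monotonic() < deadline:
        counter += 1

    timer_thread.join()
    return lags


if __name__ == "__main__":
    lags = measure_timer_lag()
    print("max lag: %.3f s" % max(lags))
```

The numbers depend heavily on the interpreter version, the OS, and the kind
of CPU-bound work, so treat any single run as an indication only.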

The default tick time is quite long. I found one document which says the
tick time is about 6.5 secs, which doesn't meet our requirement.

I don't think my experiment represents normal usage of a multilang bolt, but
who knows?

- To all,

So in the end, the newer heartbeat mechanism has another constraint: the
language matters, and it affects languages that are mainly supported now.

I think the newer heartbeat mechanism can solve more issues than the current
mechanism, but that is just my opinion.
I no longer have a strong opinion in favor of applying the newer heartbeat
mechanism, since I found another constraint.

I'd like to hear any opinions, objections, or suggestions, so please don't
hesitate to share them.

Thanks,
Jungtaek Lim (HeartSaVioR)





-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by 임정택 <ka...@gmail.com>.
Thanks, Dan, for giving your opinion. :)

To tell the truth, when I was implementing STORM-513, Sean talked to me
privately about why the design constraint is necessary. It was actually a
valid opinion.

I was thinking that the multilang feature should take all languages into
account. That rules out many kinds of approaches, and in the end it
introduced the design constraint.

After this constraint was introduced, Dashengju pointed out to me that it
can't cover some situations, which STORM-742 still can't cover either.

I agree, and I have changed my mind: it's time for the multilang feature to
drop support for languages that don't meet future requirements.

I know the default implementations of Python and Ruby have the GIL issue,
but AFAIK the context switch interval is not too long, so it doesn't keep the
heartbeat timer from firing on time.
(Please let me know if you have hit a GIL issue that makes one thread wait
for seconds.)

I don't expect the subprocess to update the modified time exactly every
1 sec, and ShellSpout and ShellBolt will allow for that, too.

This is a replacement for the current heartbeat mechanism, so when we
introduce the new heartbeat, the old one should be removed.
It could introduce backward-compatibility issues (especially because it
changes the protocol), so we should consider which version can adopt it.

Thanks for reading this long mail.

Thanks,
Jungtaek Lim (HeartSaVioR)


-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Re: Gathering opinion: changing multilang heartbeat mechanism

Posted by Dan Blanchard <da...@parsely.com>.
Hi Jungtaek,

Sorry I didn’t notice this earlier, as I was the person who filed STORM-513 in the first place.

Having just implemented the new heartbeat protocol in Python (for streamparse) and Perl (for IO::Storm), I’m not crazy about needing to add another heartbeat approach to multiple libraries so soon.

I also am against needing to deal with multithreading in Python (where there will be GIL issues) just to accommodate a change to the heartbeat protocol. It seems to me that the workaround you proposed in STORM-742 (where any command the ShellBolt receives counts as a heartbeat) should be sufficient.

Thanks,
Dan




On June 23, 2015 at 6:04:11 PM, 임정택 (kabhwan@gmail.com) wrote:

Hi!

Since this is about the multilang feature, and you can use your own implementation of multilang (and I believe multilang library developers subscribe to the user group), I'd like to gather opinions about changing the multilang heartbeat mechanism.

Storm 0.9.3 introduced the multilang heartbeat feature.
http://storm.apache.org/documentation/Multilang-protocol.html
If you use Storm 0.9.3 or higher and didn't know about the change, you may skip this mail.

Since it comes with a design constraint, I'm trying my best to add workarounds, but they cannot cover every situation (STORM-738). That's why I want to change the mechanism to get rid of the design constraint.

AS-IS (STORM-513)

- When the subprocess receives a heartbeat tuple, it sends sync to the parent.
- ShellSpout / ShellBolt updates the last heartbeat timestamp when it receives sync.
-- Added workaround: ShellSpout / ShellBolt updates the timestamp when it receives any kind of message. (It isn't applied to ShellBolt yet, but it's ready for review: STORM-742.)
- ShellSpout / ShellBolt checks the last heartbeat timestamp periodically, and if the timestamp has not been updated in time, it terminates itself.
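
The parent-side bookkeeping in the list above could be sketched roughly as follows. This is only an illustration in Python, not Storm's actual ShellSpout / ShellBolt code (which is Java); the class name HeartbeatMonitor, the injectable clock, and the 30-second timeout used in the usage note are all assumptions.

```python
import time


class HeartbeatMonitor:
    """Tracks when the subprocess was last heard from (hypothetical helper)."""

    def __init__(self, timeout_secs, clock=time.monotonic):
        self._timeout = timeout_secs
        self._clock = clock
        self._last_heartbeat = clock()

    def on_message(self, message):
        # With the STORM-742 style workaround, any message from the
        # subprocess refreshes the timestamp, not only "sync".
        self._last_heartbeat = self._clock()

    def is_expired(self):
        # Called periodically by the parent; True means the subprocess has
        # been silent for longer than the timeout.
        return self._clock() - self._last_heartbeat > self._timeout
```

A parent would call on_message() for every message it reads from the subprocess and poll is_expired() from a timer, for example HeartbeatMonitor(30).is_expired(), terminating itself when it returns True.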

TO-BE (STORM-871)

- The subprocess has to update the pid file's modified time periodically.
-- In the default implementation, it updates the pid file every 1 sec.
-- This should be handled concurrently with executing pending tuples.
-- Some languages may not be able to implement this cleanly, but I don't know which languages those would be.
- ShellSpout / ShellBolt checks the last heartbeat timestamp periodically by reading the pid file's modified time, and if the timestamp has not been updated in time, it terminates itself.
- The heartbeat tuple is removed.

Please let me know your opinion, especially if you're developing multilang libraries.
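
As a rough sketch of what the proposal above would ask of a multilang library, here is one possible shape in Python 3. Nothing here is taken from STORM-871 itself: the function names, the daemon-thread approach, and the timeout parameter are assumptions made for illustration. The first function is the subprocess side (touch the pid file periodically in the background), the second is the check the parent would make.

```python
import os
import threading
import time


def start_pid_file_heartbeat(pid_file, interval=1.0):
    """Touch pid_file every `interval` seconds in a daemon thread.

    Returns an Event; set it to stop the heartbeat. Hypothetical helper.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval):
            os.utime(pid_file, None)  # set access/modified time to now

    threading.Thread(target=loop, daemon=True).start()
    return stop


def heartbeat_is_stale(pid_file, timeout_secs):
    """Parent-side check: True if the pid file was not touched recently."""
    return time.time() - os.path.getmtime(pid_file) > timeout_secs
```

In CPython the background thread still has to acquire the GIL, so a long CPU-bound tuple can delay the touch; that is why the parent's timeout would need to be comfortably larger than the update interval.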

Thanks,
Jungtaek Lim (HeartSaVioR)