Posted to dev@storm.apache.org by Jungtaek Lim <ka...@gmail.com> on 2018/01/07 23:53:25 UTC

[DISCUSS] Regarding support old Storm workers in Storm 2.0.0

Hi devs,

We added support for old Storm workers in Storm 2.0.0 via STORM-2448 [1].
That was OK with me before the metrics issue came up, but now I think it is
worth discussing.

STORM-2448 assumes that interaction between the daemons
(Nimbus/Supervisor/etc.) and workers stays backward compatible in Storm
2.0.0. This covers not only interaction via Thrift, but interaction by any
means, including ZooKeeper.

STORM-2693 [2] came in as a nice improvement: it changes the heartbeat
mechanism (replacing ZK with Thrift RPC for interprocess heartbeat transfer),
and it is not compatible with old Storm workers. (We could still make it
backward compatible by letting Nimbus also support old-style heartbeats,
i.e. reading ZK periodically, but that clearly reduces the performance
gain.)

Now I can see a patch for STORM-2156 [3], which stores metrics in RocksDB,
but worker metrics are not addressed yet. I guess it will depend on Metrics
V2 (STORM-2153) [4]. Regardless of that dependency, if STORM-2156 wants to
change how workers publish metrics (to Thrift RPC), it will also be backward
incompatible (for the same reason as STORM-2693).

We should break backward compatibility eventually to get the full benefit of
this (and of similar improvements, if we have any), and I'm not sure why
that can't happen in Storm 2.0.0 (a major release, nearly 2 years after
1.0.0). Some users might be upset by backward incompatibility, but I don't
think they would be any less upset if we postpone the breaking changes and
finally bring them in Storm 3.0.0.

I would like to hear everyone's opinions on how to handle this situation.
We might have some workarounds that let us bring in both features, but with
reduced benefit.

Thanks,
Jungtaek Lim (HeartSaVioR)

1. https://issues.apache.org/jira/browse/STORM-2448
2. https://issues.apache.org/jira/browse/STORM-2693
3. https://issues.apache.org/jira/browse/STORM-2156
4. https://issues.apache.org/jira/browse/STORM-2153

ps. I wonder where our consensus would land in this kind of situation: when
we could bring large improvements, but only in a backward-incompatible way.
One possible change would be dropping the Acker mechanism and adopting
distributed snapshots. I have been thinking this is worth doing, and JStorm
already made this change to gain performance and also to get an advantage
with windowing.

Re: [DISCUSS] Regarding support old Storm workers in Storm 2.0.0

Posted by Alexandre Vermeerbergen <av...@gmail.com>.
Hello Jungtaek,

+1 for distributed snapshot support in Storm!

Regarding breaking worker compatibility, on my side that wouldn't be a big
deal, as we do not yet do "rolling upgrades" of our Storm clusters.

Even if we were doing rolling upgrades for normal upgrades, getting a great
improvement such as distributed snapshots would be a good reason to do a
"cold upgrade" of our clusters.

Thanks,
Alexandre Vermeerbergen

