Posted to dev@storm.apache.org by "P. Taylor Goetz (JIRA)" <ji...@apache.org> on 2015/01/29 21:26:34 UTC

[jira] [Resolved] (STORM-127) Implement protocol buffer encoding for shell spouts and bolts

     [ https://issues.apache.org/jira/browse/STORM-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

P. Taylor Goetz resolved STORM-127.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.9.2-incubating

> Implement protocol buffer encoding for shell spouts and bolts
> -------------------------------------------------------------
>
>                 Key: STORM-127
>                 URL: https://issues.apache.org/jira/browse/STORM-127
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: James Xu
>            Priority: Minor
>             Fix For: 0.9.2-incubating
>
>
> https://github.com/nathanmarz/storm/issues/654
> The current multilang protocol, which uses JSON encoding, is pretty slow. I plan to add support for protocol buffer encoding to shell spouts and bolts.
> I've completed the design from the non-JVM language side and the protocol buffer side. I would just like some feedback from the Storm community on how to integrate this feature into the codebase.
> Should the feature be fully backwards compatible?
> Should there be two types of shell spouts and bolts (JSON and protobuf)?
> Or, what I think is the better and more generic solution: a shell bolt takes an interface for encoding and decoding (or separate encoding and decoding interfaces). JSON and protobuf implementations of that interface would then exist, and the user selects which one(s) to plug in.
> ----------
> nathanmarz: I think making the serialization interface pluggable is fine, and it should be configurable via the topology config. Please open a pull request for that.
> The best place for the protobuf implementation would be under the @stormprocessor account.
> ----------
> jsgilmore: I've done some refactoring of the Storm shell components. I moved ShellProcess, ShellSpout and ShellBolt to backtype.storm.multilang, since the refactoring also created other classes. I think it makes more sense to have all multilang code live together.
> The new design overloads the ShellSpout and ShellBolt constructors with an ISerializer argument. The ShellProcess contains a serializer field, which it can use to send and receive objects. By default, a ShellComponent uses a JsonSerializer, which was mostly factored out of the ShellProcess code.
> Instead of ShellComponents working directly with JSON objects, I created the following abstract data types: Emission, SpoutMsg and Immission (maybe another name would be better, but this is technically correct). These objects are written to and read from the serializer interface. The interface implementation can then use any wire protocol to get data in and out of those objects.
> I would appreciate comments on the design. I shall then submit a pull request.
> I've also implemented a protocol buffer serializer that can substitute for the JSON serializer. This serializer uses a binary, varint-delimited wire protocol to serialise the protocol buffer messages. I can add this to the @stormprocessor account.
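
For readers unfamiliar with the "varint delimited" framing mentioned in the last comment, here is a minimal sketch of the general technique in Java. It is not the code from this issue: the class name VarintFraming and both method names are hypothetical, and the payload is treated as opaque bytes rather than a real protocol buffer message. The idea is that each message on the stream is preceded by its length, encoded as a base-128 varint, so the reader knows where one message ends and the next begins. The protobuf Java library offers the same style of framing through writeDelimitedTo and parseDelimitedFrom; whether the serializer described above used those calls is not stated in the thread.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of Storm or of the
// serializer discussed in this issue.
final class VarintFraming {
    private VarintFraming() {}

    /** Writes the payload length as an unsigned base-128 varint, then the payload. */
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int length = payload.length;
        // Emit 7 bits at a time, low-order bits first.
        // A set high bit means "more length bytes follow".
        while ((length & ~0x7F) != 0) {
            out.write((length & 0x7F) | 0x80);
            length >>>= 7;
        }
        out.write(length);
        out.write(payload);
    }

    /** Reads one length-prefixed payload, or returns null on a clean end of stream. */
    static byte[] readDelimited(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null; // no more messages
        }
        int length = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            if (shift >= 32) {
                throw new IOException("varint length prefix is too long");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("stream ended inside the length prefix");
            }
            length |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (length < 0) {
            throw new IOException("length prefix does not fit in a signed int");
        }
        // A single read() may return fewer bytes than requested, so loop.
        byte[] payload = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(payload, offset, length - offset);
            if (n == -1) {
                throw new EOFException("stream ended inside the payload");
            }
            offset += n;
        }
        return payload;
    }
}
```

In a shell component, the writer side would call writeDelimited on the subprocess's stdin and the reader side would call readDelimited on its stdout in a loop. Compared with the newline-and-"end" text framing of the JSON protocol, a length prefix lets binary payloads contain any byte value without escaping.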



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)