Posted to dev@storm.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2014/05/01 12:32:17 UTC

[jira] [Commented] (STORM-138) Pluggable serialization for multilang

    [ https://issues.apache.org/jira/browse/STORM-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986485#comment-13986485 ] 

ASF GitHub Bot commented on STORM-138:
--------------------------------------

Github user jsgilmore commented on the pull request:

    https://github.com/apache/incubator-storm/pull/84#issuecomment-41898570
  
    The multilang configuration is now topology specific.


> Pluggable serialization for multilang
> -------------------------------------
>
>                 Key: STORM-138
>                 URL: https://issues.apache.org/jira/browse/STORM-138
>             Project: Apache Storm (Incubating)
>          Issue Type: New Feature
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/373
> Currently JSON is used to serialize tuples for multilang. It would be great if the serialization mechanism were pluggable so that using richer types with multilang would be possible.
> ---------
> francis-liberty: Hello, I am a newbie here, and I would like to pick up this issue. I also noticed a recent PR, #697 by jsgilmore. Does it cover this issue, too?
> I looked around the source code, and I would like to talk about my opinions on this issue here.
> For now, ShellProcess only supports JSON for communicating with the multilang process (read and write). ShellSpout and ShellBolt talk to ShellProcess through JSON, too. This is all because ShellProcess's interface uses JSONObject only. Conceptually, ShellProcess should encapsulate the multilang details and talk to Bolt and Spout using Tuple. (jsgilmore introduced two new classes, Immission and Emission, but I think all the information Bolt and Spout need is already in Tuple, so there is no need for new data structures.) So I think it would be much cleaner to do serialization in ShellProcess only, so that neither ShellSpout nor ShellBolt knows anything about how ShellProcess converts between Tuple and strings.
> So, I suppose I can do the work of
> 1. Change the interface of ShellProcess to accept and return the Tuple data structure instead of JSONObject.
> 2. Make ShellSpout and ShellBolt work on Tuple; all information such as task_id, stream_id and tuples should be retrieved from or encapsulated in this data structure.
> 3. What other serialization formats would you like to add? I think in the end we need to add an example other than JSON to storm-starter (storm.py/rb), which I would also like to work on.
> ----------
> jsgilmore: Hi, all serialisation is done in the JSONSerialiser, so no serialisation is done in ShellBolt, ShellProcess or ShellSpout. They just send around the Emission and Immission classes. The point of the ISerializer interface is to achieve the separation of serialisation.
> I come from the multilang side of Storm, so I'm not that familiar with the internal Storm structures. If there is a class that the ISerializer interface can use, instead of the Emission and Immission classes, I'm open to it.
> I would recommend that further discussion of PR #697 happen in the PR thread itself, though.
> I created issue #654 to add protocol buffer serialisation for multilang to Storm, but I hadn't seen this issue at the time. The whole purpose of PR #697 is to solve this issue.
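
For readers skimming the thread, the separation discussed above (shell components hand messages to a serializer chosen through topology configuration) might look roughly like the following. This is only a sketch under stated assumptions: ShellMessage, MessageSerializer, PipeSerializer and the configuration key are hypothetical names made up for illustration. They are not the ISerializer, Emission or Immission definitions from the pull request, and the toy pipe-delimited format merely stands in for JSON or protocol buffers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PluggableSerializerSketch {

    // Hypothetical configuration key; not taken from the thread or from Storm.
    public static final String SERIALIZER_KEY = "example.multilang.serializer";

    /** Hypothetical message type carrying what a shell component emits. */
    public static final class ShellMessage {
        public final String streamId;
        public final int taskId;
        public final List<String> values;

        public ShellMessage(String streamId, int taskId, List<String> values) {
            this.streamId = streamId;
            this.taskId = taskId;
            this.values = values;
        }
    }

    /** Hypothetical pluggable interface: the only place that knows the wire format. */
    public interface MessageSerializer {
        String write(ShellMessage msg);
        ShellMessage read(String wire);
    }

    /**
     * Toy format "stream|task|v1,v2". It does not escape '|' or ',' inside
     * values, so it is only suitable for illustration.
     */
    public static final class PipeSerializer implements MessageSerializer {
        public PipeSerializer() {}

        @Override
        public String write(ShellMessage msg) {
            return msg.streamId + "|" + msg.taskId + "|" + String.join(",", msg.values);
        }

        @Override
        public ShellMessage read(String wire) {
            String[] parts = wire.split("\\|", 3);
            List<String> values = parts[2].isEmpty()
                    ? new ArrayList<>()
                    : Arrays.asList(parts[2].split(","));
            return new ShellMessage(parts[0], Integer.parseInt(parts[1]), values);
        }
    }

    /** Instantiates the serializer class named in the per-topology configuration map. */
    public static MessageSerializer load(Map<String, Object> topologyConf)
            throws ReflectiveOperationException {
        String className = (String) topologyConf.get(SERIALIZER_KEY);
        return (MessageSerializer) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> conf = new HashMap<>();
        conf.put(SERIALIZER_KEY, PipeSerializer.class.getName());

        MessageSerializer serializer = load(conf);
        String wire = serializer.write(new ShellMessage("default", 3, Arrays.asList("a", "b")));
        System.out.println(wire); // prints default|3|a,b

        ShellMessage back = serializer.read(wire);
        System.out.println(back.streamId + " " + back.taskId + " " + back.values);
        // prints default 3 [a, b]
    }
}
```

With this shape, the calling code never touches the wire format, so swapping in another serializer would only mean changing the class name stored in the topology configuration.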



--
This message was sent by Atlassian JIRA
(v6.2#6252)