Posted to commits@samza.apache.org by "Yan Fang (JIRA)" <ji...@apache.org> on 2015/06/11 02:02:14 UTC

[jira] [Updated] (SAMZA-4) SerdeManager should cache serdes on startup

     [ https://issues.apache.org/jira/browse/SAMZA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Fang updated SAMZA-4:
-------------------------
    Assignee:     (was: Yan Fang)

> SerdeManager should cache serdes on startup
> -------------------------------------------
>
>                 Key: SAMZA-4
>                 URL: https://issues.apache.org/jira/browse/SAMZA-4
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>
> The SerdeManager does a complex set of evaluations to determine which serde should be used for a given incoming/outgoing message envelope. For example, the ordered list of rules for outgoing keys (toBytes) is:
> 1. Use the key object, itself (don't serialize), if the stream is a changelog stream.
> 2. Use the key serializer defined in the envelope, if it's defined.
> 3. Use the stream's key serde defined in config, if defined.
> 4. Use the system's key serde defined in config, if defined.
> 5. Otherwise, use the key object, itself (don't serialize).
> These rules are currently re-evaluated on every incoming/outgoing message (it's a chain of if statements). Instead, we should resolve the appropriate serde once and cache it, rather than doing the full re-evaluation every time.
> For outgoing messages, we'll need to make sure that we can handle arbitrary streams, since an outgoing message can be sent to any (undefined) stream.
> The two ways that I can think of to do this are:
> 1. Cache when constructing the SerdeManager.
> 2. Cache in toBytes/fromBytes.
> I'm in favor of #2, since it means we can also cache the decisions for outgoing message envelopes that are sent to a new (undefined in config) stream. We couldn't do this in the constructor because we don't know all outgoing streams at that point.
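
For illustration only, option #2 might look roughly like the Java sketch below. It is not Samza's actual SerdeManager: the class name, fields, constructor arguments, and the "system.stream" key format are all hypothetical, and it assumes the envelope carries an optional serializer name that is looked up in a registry of named serdes. The five outgoing-key rules are evaluated once per distinct (system, stream, envelope serde name) combination, and the decision is cached on first use, so streams that are not defined in config are handled as well. The envelope serde name is part of the cache key because rule 2 depends on the envelope rather than on the stream alone.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: all names here are illustrative, not Samza's API.
final class SerdeCacheSketch {
    // "Don't serialize": hand back the key object itself.
    static final Function<Object, Object> IDENTITY = key -> key;
    // Example serializer used only to make the sketch runnable.
    static final Function<Object, Object> UTF8_STRING =
        key -> key.toString().getBytes(StandardCharsets.UTF_8);

    private final Map<String, Function<Object, Object>> serializersByName;
    private final Map<String, String> streamKeySerde; // "system.stream" -> serde name
    private final Map<String, String> systemKeySerde; // system -> serde name
    private final Set<String> changelogStreams;       // "system.stream"

    // Cached decisions, filled lazily on the first message per combination.
    private final Map<List<String>, Function<Object, Object>> cache = new ConcurrentHashMap<>();
    // Counts full rule evaluations, so the effect of the cache is observable.
    private final AtomicInteger resolutions = new AtomicInteger();

    SerdeCacheSketch(Map<String, Function<Object, Object>> serializersByName,
                     Map<String, String> streamKeySerde,
                     Map<String, String> systemKeySerde,
                     Set<String> changelogStreams) {
        this.serializersByName = serializersByName;
        this.streamKeySerde = streamKeySerde;
        this.systemKeySerde = systemKeySerde;
        this.changelogStreams = changelogStreams;
    }

    // envelopeSerdeName may be null when the envelope does not define one.
    Object toBytes(String system, String stream, String envelopeSerdeName, Object key) {
        List<String> cacheKey = Arrays.asList(system, stream, envelopeSerdeName);
        return cache
            .computeIfAbsent(cacheKey, k -> resolve(system, stream, envelopeSerdeName))
            .apply(key);
    }

    // The ordered rules from the issue description, evaluated only on a cache miss.
    private Function<Object, Object> resolve(String system, String stream, String envelopeSerdeName) {
        resolutions.incrementAndGet();
        String systemStream = system + "." + stream;
        if (changelogStreams.contains(systemStream)) return IDENTITY;      // rule 1
        if (envelopeSerdeName != null) return lookup(envelopeSerdeName);   // rule 2
        if (streamKeySerde.containsKey(systemStream)) {
            return lookup(streamKeySerde.get(systemStream));               // rule 3
        }
        if (systemKeySerde.containsKey(system)) {
            return lookup(systemKeySerde.get(system));                     // rule 4
        }
        return IDENTITY;                                                   // rule 5
    }

    private Function<Object, Object> lookup(String name) {
        Function<Object, Object> serializer = serializersByName.get(name);
        if (serializer == null) {
            throw new IllegalArgumentException("No serde registered under name: " + name);
        }
        return serializer;
    }

    int resolutionCount() {
        return resolutions.get();
    }
}
```

With this shape, a second message sent to the same stream with the same envelope serde name skips the if chain entirely. The cache is unbounded in this sketch; whether that is acceptable depends on how many distinct outgoing streams a job writes to.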


