Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/04/28 18:20:15 UTC

[jira] [Updated] (SAMZA-232) Keys and values in state should be versioned

     [ https://issues.apache.org/jira/browse/SAMZA-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-232:
----------------------------------

    Fix Version/s:     (was: 0.7.0)

> Keys and values in state should be versioned
> --------------------------------------------
>
>                 Key: SAMZA-232
>                 URL: https://issues.apache.org/jira/browse/SAMZA-232
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Martin Kleppmann
>
> At the moment, keys and values that are written to a task's key-value store (and the associated changelog stream) are just the bytes that were generated by the serde. This will be a problem in future, since it gives us no way of changing the storage format.
> For example, in order to implement exactly-once semantics, we may want to associate additional metadata with each value (and that metadata would be managed by the framework, and would not be seen by serdes). The current implementation does not give us any room to make such a change, because a job would not know whether the value it is reading includes metadata or not.
> I propose that we prefix every key and every value in the key-value store and the changelog stream with a version number, initially a single zero byte. This is an incompatible change, so we should make it before the 0.7.0 release. In future, if we ever need to change the storage format, we can bump the version number and thus allow jobs to be gracefully upgraded in place.
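
The version-prefix proposal above could be sketched roughly as follows. This is only an illustration, not Samza code: the class VersionedBytes, its methods wrap and unwrap, and the constant CURRENT_VERSION are hypothetical names, and the sketch assumes the version is a single byte placed in front of whatever bytes the serde produced.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Samza: prefixes serialized bytes with a format version.
final class VersionedBytes {
    // Version 0 is the only format in this sketch: one version byte, then the raw serde output.
    static final byte CURRENT_VERSION = 0;

    // Prepend the version byte to the bytes produced by the serde.
    static byte[] wrap(byte[] serdeBytes) {
        byte[] stored = new byte[serdeBytes.length + 1];
        stored[0] = CURRENT_VERSION;
        System.arraycopy(serdeBytes, 0, stored, 1, serdeBytes.length);
        return stored;
    }

    // Check the version byte and return the serde bytes that follow it.
    static byte[] unwrap(byte[] stored) {
        if (stored.length == 0) {
            throw new IllegalArgumentException("missing version byte");
        }
        if (stored[0] != CURRENT_VERSION) {
            throw new IllegalArgumentException("unsupported storage format version: " + stored[0]);
        }
        return Arrays.copyOfRange(stored, 1, stored.length);
    }

    public static void main(String[] args) {
        byte[] stored = wrap(new byte[] {42, 7});
        System.out.println(Arrays.toString(stored));          // [0, 42, 7]
        System.out.println(Arrays.toString(unwrap(stored)));  // [42, 7]
    }
}
```

Because every key would carry the same leading byte, the relative byte-wise ordering of keys is unchanged by the prefix. A later format would add a branch in unwrap for its own version number instead of rejecting it.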



--
This message was sent by Atlassian JIRA
(v6.2#6252)