Posted to dev@parquet.apache.org by Ryan Blue <bl...@cloudera.com> on 2015/04/01 19:53:56 UTC

Semantic versioning and format compatibility

Hi everyone,

Just following up on how we want to do versioning (I think a thread is 
better than a JIRA issue for this one). I have two main versioning 
questions:

1. How do we want to handle format changes with semantic versioning?

It would be awesome to have a version number reserved for the format 
version, but semver has no component for it. That leaves us with the 
following options:

* Increment the major version only for a format update. This prevents us 
from removing anything from the public API until we also change the 
format, so I think it's a bad idea.

* Increment the major version for a format update OR a breaking API 
change. This is more reasonable, but might be a problem for upgrades 
because a single release could bring both API and data changes.

* Increment the major version for a format update XOR a breaking API 
change. This is what I prefer: never update the format and break the API 
at the same time.

* Use semver for the API and something else for the format. We could use 
the classifier, so 1.7.0-format-2 and 1.7.0-format-1, but that requires 
fighting Maven.
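
To make the XOR option concrete, here is a minimal sketch of the bump 
rule in Python. The names (Version, next_version) and the flags are 
hypothetical and only illustrate the policy; they are not part of any 
Parquet tooling.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A plain semver triple (hypothetical helper for illustration)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_version(current: Version, format_update: bool, api_break: bool,
                 new_feature: bool = False) -> Version:
    """Apply the XOR rule: a major release carries either a format
    update or a breaking API change, never both at once."""
    if format_update and api_break:
        # The XOR option forbids combining the two in one release.
        raise ValueError("split the format update and the API break "
                         "into separate major releases")
    if format_update or api_break:
        return Version(current.major + 1, 0, 0)
    if new_feature:
        return Version(current.major, current.minor + 1, 0)
    return Version(current.major, current.minor, current.patch + 1)


if __name__ == "__main__":
    v = Version(1, 7, 0)
    print(next_version(v, format_update=True, api_break=False))
    print(next_version(v, format_update=False, api_break=False,
                       new_feature=True))
```

Under the OR option the ValueError branch would simply be dropped, 
which is exactly the case where one upgrade changes both the API and 
the data.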

2. Are we going to guarantee forward-compatibility for data across major 
version upgrades?

If yes, then users would be able to update to 2.0 and still roll back to 
1.x and read the data written in 2.0. I think this is something we 
should commit to. Otherwise, users who have to roll back for any reason 
lose access to that data, at least temporarily.
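
One way to state that guarantee as a check, as a rough sketch: model 
each release by the format it writes by default and the formats it can 
read, then ask whether the older release can read what the newer one 
wrote. The Release type, safe_to_roll_back, and the format numbers 
below are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical description of a library release."""
    version: str
    writes_format: int        # format version written by default
    reads_formats: frozenset  # format versions this release can read


def safe_to_roll_back(upgraded: Release, previous: Release) -> bool:
    """True if data written by `upgraded` stays readable after
    rolling back to `previous`."""
    return upgraded.writes_format in previous.reads_formats


if __name__ == "__main__":
    v1 = Release("1.7.0", writes_format=1, reads_formats=frozenset({1}))
    # A 2.0 that still writes format 1 by default keeps rollback safe.
    v2_compatible = Release("2.0.0", 1, frozenset({1, 2}))
    # A 2.0 that writes format 2 by default strands data on rollback.
    v2_breaking = Release("2.0.0", 2, frozenset({1, 2}))
    print(safe_to_roll_back(v2_compatible, v1))
    print(safe_to_roll_back(v2_breaking, v1))
```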


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.