Posted to dev@parquet.apache.org by Ryan Blue <bl...@cloudera.com> on 2015/04/01 19:53:56 UTC
Semantic versioning and format compatibility
Hi everyone,
Just following up on how we want to do versioning (I think a thread is
better than a JIRA issue for this one). I have two main versioning
questions:
1. How do we want to handle format changes with semantic versioning?
It would be awesome to have a version number reserved for the format
version, but semver doesn't include it. That leaves us with the
following options:
* Increment the major version only for a format update. This prevents us
from removing anything from the public API until we also change the
format, so I think it's a bad idea.
* Increment the major version for a format update OR a breaking API
change. This is more reasonable, but might be a problem for upgrades
because it causes both API and data changes.
* Increment the major version for a format update XOR a breaking API
change. This is what I prefer: never update the format and break the API
at the same time.
* Use semver for the API and something else for the format. We could use
the Maven classifier, giving 1.7.0-format-2 and 1.7.0-format-1, but that
requires fighting Maven.
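
To make the XOR option concrete, here is a minimal sketch in Python of a release check that refuses a major bump combining a format update with a breaking API change. The names Version and next_major are hypothetical and only for illustration; nothing like this exists in the Parquet build today.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Plain semver triple (hypothetical helper, not part of Parquet)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_major(current: Version, format_update: bool,
               breaking_api_change: bool) -> Version:
    """Return the next major version under the XOR rule.

    Exactly one of the two reasons must apply: a release may update the
    format or break the API, never both at once.
    """
    if format_update and breaking_api_change:
        raise ValueError(
            "a major release may update the format or break the API, not both")
    if not (format_update or breaking_api_change):
        raise ValueError("no format update or breaking change: no major bump")
    return Version(current.major + 1, 0, 0)


# A format update on its own moves 1.7.0 to 2.0.0.
print(next_major(Version(1, 7, 0), format_update=True,
                 breaking_api_change=False))
```

Under this rule a breaking API change that follows a format update would have to wait for the next major version, so the two never land in the same upgrade.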
2. Are we going to guarantee forward-compatibility for data across major
version upgrades?
If yes, then users would be able to update to 2.0, roll back to 1.x, and
still read the data written in 2.0. I think this is something we should
commit to. Otherwise, if users have to roll back for any reason, they've
lost data, at least temporarily.
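
One way to picture that guarantee is the rough Python sketch below. It assumes format versions are plain integers and that each release has a newest format it can read and a format it writes by default. The Release class, its fields, and the numbers are all hypothetical, not a description of how Parquet tracks this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical description of a library release."""
    name: str
    max_readable_format: int   # newest format version this release can read
    default_write_format: int  # format version this release writes by default


def rollback_is_safe(old: Release, new: Release) -> bool:
    """True if data written by `new` is still readable after rolling back to `old`."""
    return new.default_write_format <= old.max_readable_format


# Made-up numbers, for illustration only.
v1 = Release("1.x", max_readable_format=1, default_write_format=1)
v2_compatible = Release("2.0", max_readable_format=2, default_write_format=1)
v2_new_format = Release("2.0", max_readable_format=2, default_write_format=2)

print(rollback_is_safe(v1, v2_compatible))
print(rollback_is_safe(v1, v2_new_format))
```

In this sketch the guarantee holds only while the new release keeps writing a format the previous major version can read, which is one possible way to meet the commitment, not the only one.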
--
Ryan Blue
Software Engineer
Cloudera, Inc.