Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/02/18 00:22:38 UTC

[GitHub] [lucene-solr] jtibshirani commented on a change in pull request #2395: LUCENE-9616: Add developer docs on how to update a format.

jtibshirani commented on a change in pull request #2395:
URL: https://github.com/apache/lucene-solr/pull/2395#discussion_r578036823



##########
File path: lucene/backward-codecs/README.md
##########
@@ -0,0 +1,45 @@
+# Index backwards compatibility
+
+This README describes the approach to maintaining compatibility with indices
+from previous versions and gives guidelines for making format changes.
+
+## Compatibility strategy
+
+Lucene supports the ability to read segments created in older versions by
+maintaining old codec classes along with their formats. When making a change
+to a file format, we create a fresh format class and copy the existing one
+into the backward-codecs package.
+
+These older formats are tested in two ways:
+* Through unit tests like TestLucene80NormsFormat, which checks we can write
+then read data using the old format
+* Through TestBackwardsCompatibility, which loads indices created in previous
+versions and checks that we can search them
+
+## Making index format changes
+
+As an example, let's say we're making a change to the norms file format, and
+the current class in core is Lucene80NormsFormat. We'd perform the following
+steps:
+
+1. Create a new format with the target version for the changes, for example
+Lucene90NormsFormat. This includes creating copies of its writer and reader
+classes, as well as any helper classes. Make sure to copy unit tests too, like
+TestLucene80NormsFormat.
+2. Move the old Lucene80NormsFormat, along with its writer, reader, tests, and
+helper classes to the backward-codecs package. If the format will only be
+used for reading, then delete the write-side logic and move it to a test-only
+class like Lucene80RWNormsFormat to support unit tests. Note that most formats
+only need read logic, but a small set including DocValuesFormat and
+FieldInfosFormat will need to retain write logic since they can be used to update
+old segments.
+3. Make a change to the new format!
+
+## Internal format versions
+
+Each format class maintains an internal version which is written into the

Review comment:
       This is based on the discussion in https://issues.apache.org/jira/browse/LUCENE-9616.
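
To illustrate step 2 of the quoted README (read-only old format plus a test-only RW variant), here is a minimal sketch. It assumes nothing about the real Lucene classes: every type below (`SketchNormsFormat`, `SketchOld80NormsFormat`, `SketchOld80RWNormsFormat`, and the list-backed "segment") is a hypothetical stand-in, not Lucene's `NormsFormat` API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a format class; NOT Lucene's NormsFormat API.
abstract class SketchNormsFormat {
    abstract SketchWriter writer(List<Long> segment);

    SketchReader reader(List<Long> segment) {
        return new SketchReader(segment);
    }
}

// Hypothetical writer: appends one norm value per document to the "segment".
class SketchWriter {
    private final List<Long> segment;

    SketchWriter(List<Long> segment) {
        this.segment = segment;
    }

    void addNorm(long value) {
        segment.add(value);
    }
}

// Hypothetical reader: looks up the norm value stored for a document.
class SketchReader {
    private final List<Long> segment;

    SketchReader(List<Long> segment) {
        this.segment = segment;
    }

    long norm(int docId) {
        return segment.get(docId);
    }
}

// The old format as it might look after moving it: read logic only.
class SketchOld80NormsFormat extends SketchNormsFormat {
    @Override
    SketchWriter writer(List<Long> segment) {
        throw new UnsupportedOperationException("old formats are read-only");
    }
}

// Test-only variant that restores write logic so unit tests can create data.
class SketchOld80RWNormsFormat extends SketchOld80NormsFormat {
    @Override
    SketchWriter writer(List<Long> segment) {
        return new SketchWriter(segment);
    }
}

class BackwardCodecSketch {
    public static void main(String[] args) {
        List<Long> segment = new ArrayList<>();
        // A unit test writes with the RW variant...
        new SketchOld80RWNormsFormat().writer(segment).addNorm(42L);
        // ...and verifies the data through the read-only format.
        long norm = new SketchOld80NormsFormat().reader(segment).norm(0);
        System.out.println("norm read through old format: " + norm);
    }
}
```

The point of the split is that production code can only reach the read-only class, while tests still get a write-then-read round trip through the RW subclass.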



