Posted to commits@subversion.apache.org by Apache subversion Wiki <co...@subversion.apache.org> on 2013/12/02 14:36:59 UTC

[Subversion Wiki] Update of "UnicodeNormalization" by brane

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Subversion Wiki" for change notification.

The "UnicodeNormalization" page has been changed by brane:
https://wiki.apache.org/subversion/UnicodeNormalization

New page:
= Unicode Normalization for Path and Mergeinfo Lookup =

This specification is the result of a number of ongoing discussions, including issue [[http://subversion.tigris.org/issues/show_bug.cgi?id=2464|#2464]] and the various discussions gathered on the UnicodeComposition page. It has also been strongly influenced by [[https://blogs.oracle.com/nico/entry/normalization_insensitivity_should_be_the|this blog post]], which discusses the solution adopted by the ZFS filesystem.

== Constraints ==

Any solution to the normalization problem must maintain strict backwards compatibility between clients and servers. This implies that:
* we cannot change the network protocol to require that all paths are normalized;
* the server cannot store paths, or return them to clients, in a different representation than the one they were originally created with.
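
To make these two constraints concrete, here is a minimal Python sketch of a lookup that is insensitive to normalization yet never changes the stored form of a path. It is an illustration only, not Subversion code: the `PathIndex` class, its methods, and the choice of NFC as the comparison form are all assumptions made for the example.

```python
import unicodedata


class PathIndex:
    """Hypothetical lookup table: normalized key -> path exactly as created."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _key(path):
        # Assumption: NFC is used as the comparison form. The key exists
        # only for lookup; it is never stored or returned in place of the
        # path.
        return unicodedata.normalize("NFC", path)

    def add(self, path):
        # setdefault keeps the representation the path was first created
        # with, even if it is later added again in another form.
        return self._by_key.setdefault(self._key(path), path)

    def lookup(self, path):
        # Returns the originally created representation, or None.
        return self._by_key.get(self._key(path))


# The same visible name in two representations.
decomposed = "cafe\u0301/readme.txt"   # 'e' followed by a combining acute
precomposed = "caf\u00e9/readme.txt"   # single precomposed character

index = PathIndex()
index.add(decomposed)
found = index.lookup(precomposed)
```

In this sketch `found` is the decomposed string that was originally added, although the lookup used the precomposed one. The caller sends whatever representation it has, and the stored path comes back unchanged.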

The solution also must not drastically affect the performance of the server or working copy. For example, the working copy database cannot use a normalization-independent collation for indexing paths, because that limits SQLite's ability to optimize queries.

For repositories that use the FSFS backend, the solution must not affect the layout of the revision files or directory contents. The repository administrator should be given the choice whether to implement the solution, regardless of format version.

FSX should incorporate the solution as a mandatory feature. BDB will not support it, ever.