Posted to commits@subversion.apache.org by br...@apache.org on 2013/11/29 21:41:56 UTC

svn commit: r1546640 - in /subversion/branches/fsfs-ucsnorm: BRANCH-README subversion/libsvn_fs_fs/fs.h subversion/libsvn_fs_fs/fs_fs.c

Author: brane
Date: Fri Nov 29 20:41:56 2013
New Revision: 1546640

URL: http://svn.apache.org/r1546640
Log:
On the fsfs-ucsnorm branch: Upon reflection, I decided to not add a
path normalization check to FSFS, but only the normalized lookup
option. The new BRANCH-README contains a reference to an article
explaining why.

[in subversion/branches/fsfs-ucsnorm]

* BRANCH-README: Rewrite.

* subversion/libsvn_fs_fs/fs.h
  (CONFIG_SECTION_NORMALIZATION,
   CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP): New symbols.
* subversion/libsvn_fs_fs/fs_fs.c (write_config):
   Include and describe enable-normalized-lookup option in the
   default FSFS configuration file.

Modified:
    subversion/branches/fsfs-ucsnorm/BRANCH-README
    subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h
    subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c

Modified: subversion/branches/fsfs-ucsnorm/BRANCH-README
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/BRANCH-README?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] (original)
+++ subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] Fri Nov 29 20:41:56 2013
@@ -1,68 +1,52 @@
-The purpose of this [fsfs-ucsnorm] branch is to implement two optional
-checks related to Unicode normalisation to FSFS.
+Enabling Normalized Path Lookup in FSFS
+=======================================
 
+The purpose of this [fsfs-ucsnorm] branch is to implement
+normalization-insensitive path lookups in FSFS. This will prevent the
+creation of paths that differ only in normalization, and will also
+remove the current constraint that paths used in the FS API must be
+byte-for-byte identical to those stored in the filesystem.
 
-Option: Prevent name collisions
-===============================
+The filesystem will *not* impose a particular normalization form, and
+it *will* preserve whatever representation it receives when a new path
+is created.
 
-If this option is enabled, FSFS will reject operations that would
-create two different representations of the same name in the same
-directory. This will prevent situations where a user could see more
-than one form of the name in a directory listing:
+This option would be enabled by default for all new FSFS-based
+repositories and *disabled* during repository format upgrade. The
+option can be disabled or enabled at any time during the lifetime of
+the repository; however, it is not safe to enable it without first
+running:
 
-        /Namárië.txt
-        /Namárië.txt
-        /Namárië.txt
-        /Namárië.txt
+        svnadmin verify REPOS --check-ucs-normalization
 
-Note that the representations of these names are all different.
 
-FSFS would not require newly created paths to be normalised, nor would
-it normalise them when thy are stored; it would only forbid collisions
-of identical names.
+Proposed argument to 'svnadmin create':
 
-Option in fsfs.conf:
+        svnadmin create REPOS --disable-normalized-lookup
 
-        [path-representation]
-        prevent-name-collisions = true|false
+Proposed argument to 'svnadmin upgrade':
 
-        Default value: true
-
-Note: turning on this option would have some performance impact since
-all paths would have to be normalised for lookups.
-
-
-Option: Require normalised paths
-================================
-
-If this option is enabled, FSFS will reject operations that would
-create paths in the filesystem that are not in NFC. For example, the
-attempt to create a file called
-
-        /Namárië.txt
+        svnadmin upgrade REPOS --enable-normalized-lookup
 
-would fail, because the name is in NFD; but creating a file named
+Proposed option in fsfs.conf:
 
-        /Namárië.txt
+        [normalization]
+        enable-normalized-lookup = true|false
 
-would succeed.
-
-Option in fsfs.conf:
-
-        [path-representation]
-        require-normalized-paths = true|false
+        Default value: true
 
-        Default value: false
 
-Note that this option cannot be enabled by default, since it might
-prevent users from modifying existing files.
+References
+==========
 
+Unicode Normalization Forms
+        http://unicode.org/reports/tr15/#Norm_Forms
 
-Glossary
-========
+Normalization Insensitivity (blog post)
+        https://blogs.oracle.com/nico/entry/normalization_insensitivity_should_be_the
 
-NFC     Unicode Normalization Form C
-        http://unicode.org/reports/tr15/#Norm_Forms
+zfs(1M)
+        http://www.freebsd.org/cgi/man.cgi?query=zfs&apropos=0&sektion=0&manpath=FreeBSD+8.1-RELEASE&format=html
 
-NFD     Unicode Normalization Form D
-        http://unicode.org/reports/tr15/#Norm_Forms
+zfs_share(1M)
+        http://docs.oracle.com/cd/E23824_01/html/821-1462/zfs-share-1m.html#scrolltoc

Modified: subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h (original)
+++ subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h Fri Nov 29 20:41:56 2013
@@ -101,6 +101,8 @@ extern "C" {
 #define CONFIG_SECTION_PACKED_REVPROPS   "packed-revprops"
 #define CONFIG_OPTION_REVPROP_PACK_SIZE  "revprop-pack-size"
 #define CONFIG_OPTION_COMPRESS_PACKED_REVPROPS  "compress-packed-revprops"
+#define CONFIG_SECTION_NORMALIZATION     "normalization"
+#define CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP  "enable-normalized-lookup"
 
 /* The format number of this filesystem.
    This is independent of the repository format number, and

Modified: subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c (original)
+++ subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c Fri Nov 29 20:41:56 2013
@@ -616,6 +616,32 @@ write_config(svn_fs_t *fs,
 "### unless you often modify revprops after packing."                        NL
 "### Compressing packed revprops is disabled by default."                    NL
 "# " CONFIG_OPTION_COMPRESS_PACKED_REVPROPS " = false"                       NL
+""                                                                           NL
+"[" CONFIG_SECTION_NORMALIZATION "]"                                         NL
+"### Subversion decrees that paths in the repository must be in the Unicode" NL
+"### character set, and further requires that they are encoded in UTF-8."    NL
+"### Unfortunately it does not prescribe whether or how the names should"    NL
+"### be normalized. Consequently, it is possible to create two paths that"   NL
+"### appear to be identical on screen, but contain different Unicode code"   NL
+"### points for the same glyphs. Apart from being confusing, this is not"    NL
+"### supported by some filesystems (e.g., OSX HFS+, ZFS with normalization"  NL
+"### enabled)."                                                              NL
+"###"                                                                        NL
+"### When this option is enabled, FSFS will perform all path lookups in a"   NL
+"### normalization-insensitive way. This will prevent the creation of new"   NL
+"### paths with conflicting names, and will also remove the restriction on"  NL
+"### clients to send paths in exactly the same form as is stored in the"     NL
+"### filesystem. The representation of new paths will still be preserved;"   NL
+"### FSFS will not normalize them, and will return them from queries in the" NL
+"### same form in which they were created."                                  NL
+"### Normalized lookup is enabled by default for new FSFS repositories."     NL
+"# " CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP " = true"                        NL
+"###"                                                                        NL
+"### WARNING: Before enabling this option for existing repositories, you "   NL
+"###          must verify that there are no extant name collisions by"       NL
+"###          running the following command:"                                NL
+"###"                                                                        NL
+"###              svnadmin verify <REPOS-PATH> --check-ucs-normalization"    NL
 ;
 #undef NL
   return svn_io_file_create(svn_dirent_join(fs->path, PATH_CONFIG, pool),
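
The BRANCH-README above says that lookups should be
normalization-insensitive while the stored representation is left
untouched. The following is a minimal sketch of why byte-for-byte
comparison is not enough, not the branch's implementation. It uses two
UTF-8 spellings of "Namárië.txt" (precomposed and decomposed) and a toy
composition table covering only the two accented letters in that name.
All names in it (NAME_NFC, NAME_NFD, toy_compose, toy_names_equal) are
hypothetical; a real implementation would need a complete Unicode
normalization library rather than a two-entry table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two UTF-8 spellings of the same visible name "Namárië.txt".
   NAME_NFC uses precomposed U+00E1 and U+00EB; NAME_NFD uses the base
   letters followed by combining U+0301 and U+0308. */
static const char NAME_NFC[] = "Nam" "\xc3\xa1" "ri" "\xc3\xab" ".txt";
static const char NAME_NFD[] = "Nama" "\xcc\x81" "rie" "\xcc\x88" ".txt";

/* Hypothetical toy table: decomposed sequence -> precomposed character.
   It covers only the two letters used above (an assumption made for
   illustration; it is NOT a general normalization table). */
struct toy_pair { const char *decomposed; const char *composed; };
static const struct toy_pair TOY_TABLE[] = {
  { "a" "\xcc\x81", "\xc3\xa1" },
  { "e" "\xcc\x88", "\xc3\xab" },
};

/* Copy SRC into DST (DSTLEN bytes), replacing every sequence listed in
   TOY_TABLE with its precomposed form.  Returns 0 on success, or -1 if
   DST is too small. */
static int toy_compose(const char *src, char *dst, size_t dstlen)
{
  const size_t n = sizeof(TOY_TABLE) / sizeof(TOY_TABLE[0]);
  size_t out = 0;

  if (dstlen == 0)
    return -1;
  while (*src)
    {
      const char *piece = src;   /* default: copy one byte verbatim */
      size_t piece_len = 1, consumed = 1, i;

      for (i = 0; i < n; ++i)
        {
          size_t dlen = strlen(TOY_TABLE[i].decomposed);
          if (strncmp(src, TOY_TABLE[i].decomposed, dlen) == 0)
            {
              piece = TOY_TABLE[i].composed;
              piece_len = strlen(piece);
              consumed = dlen;
              break;
            }
        }
      if (out + piece_len >= dstlen)   /* keep room for the NUL */
        return -1;
      memcpy(dst + out, piece, piece_len);
      out += piece_len;
      src += consumed;
    }
  dst[out] = '\0';
  return 0;
}

/* Return nonzero if A and B are equal after toy composition.  Neither
   input is modified, mirroring the README's statement that the stored
   representation is preserved. */
static int toy_names_equal(const char *a, const char *b)
{
  char na[64], nb[64];

  assert(a != NULL && b != NULL);
  if (toy_compose(a, na, sizeof(na)) != 0
      || toy_compose(b, nb, sizeof(nb)) != 0)
    return 0;
  return strcmp(na, nb) == 0;
}
```

With these definitions, strcmp(NAME_NFC, NAME_NFD) is nonzero because
the byte sequences differ, while toy_names_equal(NAME_NFC, NAME_NFD)
reports a match. Comparing composed copies instead of rewriting the
inputs is what lets a lookup be insensitive to normalization without
imposing a normalization form on stored paths.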