Posted to commits@subversion.apache.org by br...@apache.org on 2013/11/29 21:41:56 UTC
svn commit: r1546640 - in /subversion/branches/fsfs-ucsnorm: BRANCH-README
subversion/libsvn_fs_fs/fs.h subversion/libsvn_fs_fs/fs_fs.c
Author: brane
Date: Fri Nov 29 20:41:56 2013
New Revision: 1546640
URL: http://svn.apache.org/r1546640
Log:
On the fsfs-ucsnorm branch: Upon reflection, I decided to not add a
path normalization check to FSFS, but only the normalized lookup
option. The new BRANCH-README contains a reference to an article
explaining why.
[in subversion/branches/fsfs-ucsnorm]
* BRANCH-README: Rewrite.
* subversion/libsvn_fs_fs/fs.h
(CONFIG_SECTION_NORMALIZATION,
CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP): New symbols.
* subversion/libsvn_fs_fs/fs_fs.c (write_config):
Include and describe enable-normalized-lookup option in the
default FSFS configuration file.
Modified:
subversion/branches/fsfs-ucsnorm/BRANCH-README
subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h
subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c
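
Illustration (not part of the commit): the log message above refers to
normalized lookup. The sketch below only shows the problem that such a
lookup addresses, namely that two spellings of the same visible name can
differ byte-for-byte. The name "Namárië.txt" is borrowed from the old
BRANCH-README; the identifiers name_nfc, name_nfd, names_equal_bytewise
and check_example are hypothetical and do not exist in Subversion. No
normalization is performed here; a real implementation would need a
Unicode library for that.

```c
#include <assert.h>
#include <string.h>

/* "Namárië.txt" in NFC: á (U+00E1) and ë (U+00EB) are precomposed
   code points, each encoded as two UTF-8 bytes. */
static const char name_nfc[] = "Nam" "\xC3\xA1" "ri" "\xC3\xAB" ".txt";

/* The same name in NFD: plain base letters followed by combining marks
   (U+0301 COMBINING ACUTE ACCENT, U+0308 COMBINING DIAERESIS). */
static const char name_nfd[] = "Nama" "\xCC\x81" "rie" "\xCC\x88" ".txt";

/* Hypothetical helper: the byte-for-byte comparison that a lookup
   without normalization effectively performs. */
int
names_equal_bytewise(const char *a, const char *b)
{
  return strcmp(a, b) == 0;
}

/* Small self-check of the claims made in the comments above. */
void
check_example(void)
{
  /* The two forms have different encoded lengths ... */
  assert(strlen(name_nfc) != strlen(name_nfd));
  /* ... so a byte-wise lookup treats them as two distinct names. */
  assert(!names_equal_bytewise(name_nfc, name_nfd));
  /* Identical representations still match, as expected. */
  assert(names_equal_bytewise(name_nfd, name_nfd));
}
```

With a byte-wise comparison both entries could coexist in one directory
while looking identical on screen. A normalization-insensitive lookup
would instead bring both keys to a common form before comparing, while
leaving the stored representation untouched.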
Modified: subversion/branches/fsfs-ucsnorm/BRANCH-README
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/BRANCH-README?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] (original)
+++ subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] Fri Nov 29 20:41:56 2013
@@ -1,68 +1,52 @@
-The purpose of this [fsfs-ucsnorm] branch is to implement two optional
-checks related to Unicode normalisation to FSFS.
+Enabling Normalized Path Lookup in FSFS
+=======================================
+The purpose of this [fsfs-ucsnorm] branch is to implement
+normalization-insensitive path lookups in FSFS. This will prevent the
+creation of paths that differ only in normalization, and will also
+remove the current constraint that paths used in the FS API must be
+byte-for-byte identical to those stored in the filesystem.
-Option: Prevent name collisions
-===============================
+The filesystem will *not* impose a particular normalization form, and
+it *will* preserve whatever representation it receives when a new path
+is created.
-If this option is enabled, FSFS will reject operations that would
-create two different representations of the same name in the same
-directory. This will prevent situations where a user could see more
-than one form of the name in a directory listing:
+This option would be enabled by default for all new FSFS-based
+repositories and *disabled* during repository format upgrade. The
+option can be disabled or enabled at any time during the lifetime of
+the repository; however, it is not safe to enable it without first
+running:
- /Namárië.txt
- /Namárië.txt
- /Namárië.txt
- /Namárië.txt
+ svnadmin verify REPOS --check-ucs-normalization
-Note that the representations of these names are all different.
-FSFS would not require newly created paths to be normalised, nor would
-it normalise them when thy are stored; it would only forbid collisions
-of identical names.
+Proposed argument to 'svnadmin create':
-Option in fsfs.conf:
+ svnadmin create REPOS --disable-normalized-lookup
- [path-representation]
- prevent-name-collisions = true|false
+Proposed argument to 'svnadmin upgrade':
- Default value: true
-
-Note: turning on this option would have some performance impact since
-all paths would have to be normalised for lookups.
-
-
-Option: Require normalised paths
-================================
-
-If this option is enabled, FSFS will reject operations that would
-create paths in the filesystem that are not in NFC. For example, the
-attempt to create a file called
-
- /Namárië.txt
+ svnadmin upgrade REPOS --enable-normalized-lookup
-would fail, because the name is in NFD; but creating a file named
+Proposed option in fsfs.conf:
- /Namárië.txt
+ [normalization]
+ enable-normalized-lookup = true|false
-would succeed.
-
-Option in fsfs.conf:
-
- [path-representation]
- require-normalized-paths = true|false
+ Default value: true
- Default value: false
-Note that this option cannot be enabled by default, since it might
-prevent users from modifying existing files.
+References
+==========
+Unicode Normalization Forms
+ http://unicode.org/reports/tr15/#Norm_Forms
-Glossary
-========
+Normalization Insensitivity (blog post)
+ https://blogs.oracle.com/nico/entry/normalization_insensitivity_should_be_the
-NFC Unicode Normalization Form C
- http://unicode.org/reports/tr15/#Norm_Forms
+zfs(1M)
+ http://www.freebsd.org/cgi/man.cgi?query=zfs&apropos=0&sektion=0&manpath=FreeBSD+8.1-RELEASE&format=html
-NFD Unicode Normalization Form D
- http://unicode.org/reports/tr15/#Norm_Forms
+zfs_share(1M)
+ http://docs.oracle.com/cd/E23824_01/html/821-1462/zfs-share-1m.html#scrolltoc
Modified: subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h (original)
+++ subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs.h Fri Nov 29 20:41:56 2013
@@ -101,6 +101,8 @@ extern "C" {
#define CONFIG_SECTION_PACKED_REVPROPS "packed-revprops"
#define CONFIG_OPTION_REVPROP_PACK_SIZE "revprop-pack-size"
#define CONFIG_OPTION_COMPRESS_PACKED_REVPROPS "compress-packed-revprops"
+#define CONFIG_SECTION_NORMALIZATION "normalization"
+#define CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP "enable-normalized-lookup"
/* The format number of this filesystem.
This is independent of the repository format number, and
Modified: subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c
URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c?rev=1546640&r1=1546639&r2=1546640&view=diff
==============================================================================
--- subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c (original)
+++ subversion/branches/fsfs-ucsnorm/subversion/libsvn_fs_fs/fs_fs.c Fri Nov 29 20:41:56 2013
@@ -616,6 +616,32 @@ write_config(svn_fs_t *fs,
"### unless you often modify revprops after packing." NL
"### Compressing packed revprops is disabled by default." NL
"# " CONFIG_OPTION_COMPRESS_PACKED_REVPROPS " = false" NL
+"" NL
+"[" CONFIG_SECTION_NORMALIZATION "]" NL
+"### Subversion decrees that paths in the repository must be in the Unicode" NL
+"### character set, and further requires that they are encoded in UTF-8." NL
+"### Unfortunately it does not prescribe whether or how the names should" NL
+"### be normalized. Consequently, it is possible to create two paths that" NL
+"### appear to be identical on screen, but contain different Unicode code" NL
+"### points for the same glyphs. Apart from being confusing, this is not" NL
+"### supported by some filesystems (e.g., OSX HFS+, ZFS with normalization" NL
+"### enabled)." NL
+"###" NL
+"### When this option is enabled, FSFS will perform all path lookups in a" NL
+"### normalization-insensitive way. This will prevent the creation of new" NL
+"### paths with conflicting names, and will also remove the restriction on" NL
+"### clients to send paths in exactly the same form as is stored in the" NL
+"### filesystem. The representation of new paths will still be preserved;" NL
+"### FSFS will not normalize them, and will return them from queries in the" NL
+"### same form in which they were created." NL
+"### Normalized lookup is enabled by default for new FSFS repositories." NL
+"# " CONFIG_OPTION_ENABLE_NORMALIZED_LOOKUP " = true" NL
+"###" NL
+"### WARNING: Before enabling this option for existing repositories, you " NL
+"### must verify that there are no extant name collisions by" NL
+"### running the following command:" NL
+"###" NL
+"### svnadmin verify <REPOS-PATH> --check-ucs-normalization" NL
;
#undef NL
return svn_io_file_create(svn_dirent_join(fs->path, PATH_CONFIG, pool),