Posted to commits@subversion.apache.org by br...@apache.org on 2012/11/06 21:02:14 UTC

svn commit: r1406291 - /subversion/branches/wc-collate-path/subversion/tests/diacritical.txt

Author: brane
Date: Tue Nov  6 20:02:13 2012
New Revision: 1406291

URL: http://svn.apache.org/viewvc?rev=1406291&view=rev
Log:
On the wc-collate-path branch:

* subversion/tests/diacritical.txt: New;
   describes the normalized-compare test data.

Added:
    subversion/branches/wc-collate-path/subversion/tests/diacritical.txt

Added: subversion/branches/wc-collate-path/subversion/tests/diacritical.txt
URL: http://svn.apache.org/viewvc/subversion/branches/wc-collate-path/subversion/tests/diacritical.txt?rev=1406291&view=auto
==============================================================================
--- subversion/branches/wc-collate-path/subversion/tests/diacritical.txt (added)
+++ subversion/branches/wc-collate-path/subversion/tests/diacritical.txt Tue Nov  6 20:02:13 2012
@@ -0,0 +1,41 @@
+-*- coding: utf-8 -*-
+
+This is the source of the test data used by the normalized Unicode
+string comparison tests.
+
+
+Whole word: Ṩůḇṽḝȑšḯờṋ
+
+Individual letters:
+
+char    name                            NFC UCS-4      NFC UTF-8      NFD UCS-4      NFD UTF-8
+
+Ṩ       S with dot above and below      \u1E68         \xe1\xb9\xa8   S\u0323\u0307  S\xcc\xa3\xcc\x87
+ů       u with ring                     \u016F         \xc5\xaf       u\u030A        u\xcc\x8a
+ḇ       b with macron below             \u1E07         \xe1\xb8\x87   b\u0331        b\xcc\xb1
+ṽ       v with tilde                    \u1E7D         \xe1\xb9\xbd   v\u0303        v\xcc\x83
+ḝ       e with breve and cedilla        \u1E1D         \xe1\xb8\x9d   e\u0327\u0306  e\xcc\xa7\xcc\x86
+ȑ       r with double grave             \u0211         \xc8\x91       r\u030F        r\xcc\x8f
+š       s with caron                    \u0161         \xc5\xa1       s\u030C        s\xcc\x8c
+ḯ       i with diaeresis and acute      \u1E2F         \xe1\xb8\xaf   i\u0308\u0301  i\xcc\x88\xcc\x81
+ờ       o with horn and grave           \u1EDD         \xe1\xbb\x9d   o\u031B\u0300  o\xcc\x9b\xcc\x80
+ṋ       n with circumflex below         \u1E4B         \xe1\xb9\x8b   n\u032D        n\xcc\xad
+
+Combining diacriticals:
+
+char    name                            UCS-4          UTF-8
+
+̇       dot above                       \u0307         \xcc\x87
+̣       dot below                       \u0323         \xcc\xa3
+̊       ring                            \u030A         \xcc\x8a
+̱       macron below                    \u0331         \xcc\xb1
+̃       tilde                           \u0303         \xcc\x83
+̆       breve                           \u0306         \xcc\x86
+̧       cedilla                         \u0327         \xcc\xa7
+̏       double grave                    \u030F         \xcc\x8f
+̌       caron                           \u030C         \xcc\x8c
+̈       diaeresis                       \u0308         \xcc\x88
+́       acute                           \u0301         \xcc\x81
+̀       grave                           \u0300         \xcc\x80
+̛       horn                            \u031B         \xcc\x9b
+̭       circumflex below                \u032D         \xcc\xad
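
The tables above can be cross-checked mechanically. The following is a
minimal sketch, not part of the commit, that uses Python's standard
unicodedata module to confirm that each NFC code point decomposes to the
listed NFD sequence and that the two forms compare equal once normalized.
The names ROWS, forms, normalized_equal, check_rows and escape_utf8 are
hypothetical helpers introduced only for this illustration; the actual
tests on the branch may compare strings differently.

```python
import unicodedata

# Rows copied from the "Individual letters" table:
# (NFC code point, base letter, combining marks in NFD order).
ROWS = [
    (0x1E68, "S", [0x0323, 0x0307]),
    (0x016F, "u", [0x030A]),
    (0x1E07, "b", [0x0331]),
    (0x1E7D, "v", [0x0303]),
    (0x1E1D, "e", [0x0327, 0x0306]),
    (0x0211, "r", [0x030F]),
    (0x0161, "s", [0x030C]),
    (0x1E2F, "i", [0x0308, 0x0301]),
    (0x1EDD, "o", [0x031B, 0x0300]),
    (0x1E4B, "n", [0x032D]),
]


def forms(nfc_cp, base, marks):
    """Build the precomposed (NFC) and decomposed (NFD) strings for a row."""
    nfc = chr(nfc_cp)
    nfd = base + "".join(chr(m) for m in marks)
    return nfc, nfd


def normalized_equal(a, b):
    """Compare two strings after normalizing both to NFC (an assumed strategy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def escape_utf8(s):
    """Render a string's UTF-8 bytes in the \\xNN notation used by the table."""
    return "".join("\\x%02x" % b for b in s.encode("utf-8"))


def check_rows(rows=ROWS):
    """Assert that every row round-trips between its NFC and NFD forms."""
    for nfc_cp, base, marks in rows:
        nfc, nfd = forms(nfc_cp, base, marks)
        assert unicodedata.normalize("NFD", nfc) == nfd
        assert unicodedata.normalize("NFC", nfd) == nfc
        # The raw strings differ, but they are equal after normalization.
        assert nfc != nfd and normalized_equal(nfc, nfd)


if __name__ == "__main__":
    check_rows()
    for row in ROWS:
        nfc, nfd = forms(*row)
        print(nfc, escape_utf8(nfc), escape_utf8(nfd))
```

Running the script prints one line per letter with the UTF-8 escapes of
both forms, which can be compared by eye against the table columns.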