Posted to commits@subversion.apache.org by br...@apache.org on 2012/11/06 21:02:14 UTC
svn commit: r1406291 - /subversion/branches/wc-collate-path/subversion/tests/diacritical.txt
Author: brane
Date: Tue Nov 6 20:02:13 2012
New Revision: 1406291
URL: http://svn.apache.org/viewvc?rev=1406291&view=rev
Log:
On the wc-collate-path branch:
* subversion/tests/diacritical.txt: New;
describes the normalized-compare test data.
Added:
subversion/branches/wc-collate-path/subversion/tests/diacritical.txt
Added: subversion/branches/wc-collate-path/subversion/tests/diacritical.txt
URL: http://svn.apache.org/viewvc/subversion/branches/wc-collate-path/subversion/tests/diacritical.txt?rev=1406291&view=auto
==============================================================================
--- subversion/branches/wc-collate-path/subversion/tests/diacritical.txt (added)
+++ subversion/branches/wc-collate-path/subversion/tests/diacritical.txt Tue Nov 6 20:02:13 2012
@@ -0,0 +1,41 @@
+-*- coding: utf-8 -*-
+
+This is the source of the test data used by the normalized Unicode
+string comparison tests.
+
+
+Whole word: Ṩůḇṽḝȑšḯờṋ
+
+Individual letters:
+
+char  name                        NFC UCS-4  NFC UTF-8     NFD UCS-4      NFD UTF-8
+
+Ṩ     S with dot above and below  \u1E68     \xe1\xb9\xa8  S\u0323\u0307  S\xcc\xa3\xcc\x87
+ů     u with ring                 \u016F     \xc5\xaf      u\u030A        u\xcc\x8a
+ḇ     b with macron below         \u1E07     \xe1\xb8\x87  b\u0331        b\xcc\xb1
+ṽ     v with tilde                \u1E7D     \xe1\xb9\xbd  v\u0303        v\xcc\x83
+ḝ     e with breve and cedilla    \u1E1D     \xe1\xb8\x9d  e\u0327\u0306  e\xcc\xa7\xcc\x86
+ȑ     r with double grave         \u0211     \xc8\x91      r\u030F        r\xcc\x8f
+š     s with caron                \u0161     \xc5\xa1      s\u030C        s\xcc\x8c
+ḯ     i with diaeresis and acute  \u1E2F     \xe1\xb8\xaf  i\u0308\u0301  i\xcc\x88\xcc\x81
+ờ     o with grave and horn       \u1EDD     \xe1\xbb\x9d  o\u031B\u0300  o\xcc\x9b\xcc\x80
+ṋ     n with circumflex below     \u1E4B     \xe1\xb9\x8b  n\u032D        n\xcc\xad
+
+Combining diacriticals:
+
+char  name              UCS-4   UTF-8
+
+̇      dot               \u0307  \xcc\x87
+̣      dot below         \u0323  \xcc\xa3
+̊      ring              \u030A  \xcc\x8a
+̱      macron below      \u0331  \xcc\xb1
+̃      tilde             \u0303  \xcc\x83
+̆      breve             \u0306  \xcc\x86
+̧      cedilla           \u0327  \xcc\xa7
+̏      double grave      \u030F  \xcc\x8f
+̌      caron             \u030C  \xcc\x8c
+̈      diaeresis         \u0308  \xcc\x88
+́      acute             \u0301  \xcc\x81
+̀      grave             \u0300  \xcc\x80
+̛      horn              \u031B  \xcc\x9b
+̭      circumflex below  \u032D  \xcc\xad
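
Note on the test data above: the NFC and NFD columns can be cross-checked
with a few lines of Python. The sketch below is not part of the commit and
is not how the Subversion tests consume the data; the names PAIRS,
WHOLE_WORD, normalized_equal, check_pairs and base_letters are hypothetical
and exist only for illustration. It assumes only the standard unicodedata
module.

```python
import unicodedata

# (NFC form, NFD form) pairs transcribed from the "Individual letters" table.
PAIRS = [
    ("\u1E68", "S\u0323\u0307"),
    ("\u016F", "u\u030A"),
    ("\u1E07", "b\u0331"),
    ("\u1E7D", "v\u0303"),
    ("\u1E1D", "e\u0327\u0306"),
    ("\u0211", "r\u030F"),
    ("\u0161", "s\u030C"),
    ("\u1E2F", "i\u0308\u0301"),
    ("\u1EDD", "o\u031B\u0300"),
    ("\u1E4B", "n\u032D"),
]

# The "whole word" line is the ten NFC letters in table order.
WHOLE_WORD = "".join(nfc for nfc, _ in PAIRS)


def normalized_equal(a, b, form="NFD"):
    """Compare two strings after bringing both to the same normal form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def check_pairs(pairs=PAIRS):
    """Verify that each table row round-trips between NFC and NFD."""
    for nfc, nfd in pairs:
        assert unicodedata.normalize("NFD", nfc) == nfd
        assert unicodedata.normalize("NFC", nfd) == nfc
        assert nfc != nfd                  # a naive comparison sees a mismatch
        assert normalized_equal(nfc, nfd)  # a normalized comparison does not
    return len(pairs)


def base_letters(text):
    """Decompose the text and drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
```

Calling check_pairs() walks all ten rows, and base_letters(WHOLE_WORD)
recovers the undecorated word the test string is built from.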