| -*- coding: utf-8 -*- |
| |
| This is the source of the test data used by the normalized unicode |
| string comparison tests. |
| |
| |
| Whole word: Ṩůḇṽḝȑšḯờṋ |
| |
| Individual letters: |
| |
| char name NFC UCS-4 NFC UTF-8 NFD UCS-4 NFD UTF-8 |
| |
| Ṩ S with dot above and below \u1E68 \xe1\xb9\xa8 S\u0323\u0307 S\xcc\xa3\xcc\x87 |
| ů u with ring \u016F \xc5\xaf u\u030A u\xcc\x8a |
| ḇ b with macron below \u1E07 \xe1\xb8\x87 b\u0331 b\xcc\xb1 |
| ṽ v with tilde \u1E7D \xe1\xb9\xbd v\u0303 v\xcc\x83 |
| ḝ e with breve and cedilla \u1E1D \xe1\xb8\x9d e\u0327\u0306 e\xcc\xa7\xcc\x86 |
| ȑ r with double grave \u0211 \xc8\x91 r\u030F r\xcc\x8f |
| š s with caron \u0161 \xc5\xa1 s\u030C s\xcc\x8c |
| ḯ i with diaeresis and acute \u1E2F \xe1\xb8\xaf i\u0308\u0301 i\xcc\x88\xcc\x81 |
| ờ o with grave and hook \u1EDD \xe1\xbb\x9d o\u031B\u0300 o\xcc\x9b\xcc\x80 |
| ṋ n with circumflex below \u1E4B \xe1\xb9\x8b n\u032D n\xcc\xad |
| |
| Combining diacriticals: |
| |
| char name UCS-4 UTF-8 |
| |
| ̇ dot \u0307 \xcc\x87 |
| ̣ dot below \u0323 \xcc\xa3 |
| ̊ ring \u030A \xcc\x8a |
| ̱ macron below \u0331 \xcc\xb1 |
| ̃ tilde \u0303 \xcc\x83 |
| ̆ breve \u0306 \xcc\x86 |
| ̧ cedilla \u0327 \xcc\xa7 |
| ̏ double grave \u030F \xcc\x8f |
| ̌ caron \u030C \xcc\x8c |
| ̈ diaeresis \u0308 \xcc\x88 |
| ́ acute \u0301 \xcc\x81 |
| ̀ grave \u0300 \xcc\x80 |
| ̛ horn \u031B \xcc\x9b |
| ̭ circumflex below \u032D \xcc\xad |