A Unicode Test Page

Spaces and related

The following should all appear as spaces of various widths:

General punctuation

The convention in English is “to use double quotation marks to indicate quotation, and ‘single quotation marks’ for nested quotations.”

En français la convention est « d'utiliser les guillemets français doubles pour les citations, et “ les guillemets anglais doubles ” ou bien ‹ les guillemets français simples › pour les citations imbriquées. »

Auf Deutsch ist die Vereinbarung »umgekehrte zweifache Anführungszeichen für die Zitate zu benutzen, sogar ›einfache Anführungszeichen‹ für die verschachtelten Zitate«; diese Anführungszeichen „dürfen auch solche ‚englische‘ Anführungszeichen sein.“

The en-dash is used between numbers such as in: 1685–1750 (J. S. Bach). It is longer than the hyphen (as in “en-dash”, or, more properly, “en‐dash”) but shorter than the em-dash, which is used — like this — as a sort of parenthesis. Neither should be confused with the horizontal bar which is used to introduce quotation in some cases.
― Like this?
― Right.
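
Since these dash-like characters are easy to confuse by eye, it can help to look at their code points directly. The following is a minimal sketch, assuming Python and its standard unicodedata module; the helper name describe is hypothetical and exists only for this illustration.

```python
import unicodedata

# Dash-like characters mentioned in the paragraph above, written as escapes
# so that the source stays unambiguous whatever the font.
DASHES = ["\u002d", "\u2010", "\u2013", "\u2014", "\u2015"]


def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in DASHES:
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+2010 HYPHEN
# U+2013 EN DASH
# U+2014 EM DASH
# U+2015 HORIZONTAL BAR
```

The first entry is the ASCII character typed in “en-dash”, and the second is the dedicated hyphen used in the “more proper” spelling.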

The ellipsis is… well, it just is.

A table of (some) accents

In the following table, characters that are missing from Unicode as of version 3.0, latest draft (i.e. that cannot be represented as a single character but must use the more general form of combining diacritics) have been replaced by X, so you can tell they are not your browser's fault.

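Base | Grave | Acute | Circumflex | Tilde | Diaeresis | Macron | Breve | Ogonek | Dot | Double acute | Caron | Double grave | Inverted breve
a | à | á | â | ã | ä | ā | ă | ą | ȧ | X | ǎ | ȁ | ȃ
e | è | é | ê | ẽ | ë | ē | ĕ | ę | ė | X | ě | ȅ | ȇ
i | ì | í | î | ĩ | ï | ī | ĭ | į | i | X | ǐ | ȉ | ȋ
o | ò | ó | ô | õ | ö | ō | ŏ | ǫ | ȯ | ő | ǒ | ȍ | ȏ
u | ù | ú | û | ũ | ü | ū | ŭ | ų | X | ű | ǔ | ȕ | ȗ
y | ỳ | ý | ŷ | ỹ | ÿ | ȳ | X | X | ẏ | X | X | X | X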

Note that the three characters “LATIN SMALL LETTER A WITH DOT ABOVE”, “LATIN SMALL LETTER O WITH DOT ABOVE” and “LATIN SMALL LETTER Y WITH MACRON” were not present in version 2.0 of the Unicode standard. So it is quite understandable if you do not see the corresponding entries.
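
Whether a given base letter and accent have a single-character form can also be checked programmatically. The sketch below assumes Python's standard unicodedata module, whose answers depend on the Unicode version bundled with the interpreter rather than on version 3.0; the function name has_precomposed is hypothetical. It relies on NFC composition, so a character that Unicode deliberately excludes from composition would be reported as absent.

```python
import unicodedata


def has_precomposed(base, mark):
    """True if base + combining mark composes to one code point under NFC."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


DOUBLE_ACUTE = "\u030b"  # COMBINING DOUBLE ACUTE ACCENT
MACRON = "\u0304"        # COMBINING MACRON

print(has_precomposed("o", DOUBLE_ACUTE))  # True: composes to U+0151
print(has_precomposed("a", DOUBLE_ACUTE))  # False: remains two code points
print(has_precomposed("y", MACRON))        # True: composes to U+0233
```

The False result for a with double acute corresponds to an X entry in the table above.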

Combining diacritics

The following paragraph shows a few characters built with combining diacritics. The equals sign means that the combined form on the left should be identical in all respects (and in particular, represented identically) to the atomic form on the right. To emphasize further: you should not see two signs to the left of the equals sign but one, the same as on the right.

à=à; é=é; î=î (not the same as ı̂ but may be graphically identical); õ=õ; ū=ū (whereas u¯ is two different symbols); ă=ă; ė=ė (also note i̇ should be essentially i); ï=ï (whereas i̇̈ has three dots on the i); å=å (not to be confused with a° (read “a degrees”) nor a˚); ő=ő; č=č
ç=ç; ḅ=ḅ; ḏ=ḏ (this is supposedly different from d̠ but may be graphically identical); ḙ=ḙ
ǖ=ǖ=ǖ (not the same as ṻ which has the diaeresis on top of the macron); ǡ=ǡ=ǡ
ǭ=ǭ=ǭ (also ǭ but the latter is not so canonical); ó̷=ǿ=ǿ (not so sure about this one)
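
The equalities above are what the Unicode standard calls canonical equivalence, and they can be checked by normalizing both sides to the same form before comparing. Here is a minimal sketch, assuming Python's standard unicodedata module; the function name canonically_equal is hypothetical, and the code points are written as escapes so that no editor can silently recompose them.

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# a + COMBINING GRAVE ACCENT versus the atomic U+00E0
print(canonically_equal("a\u0300", "\u00e0"))        # True
# u + diaeresis + macron versus the atomic U+01D6
print(canonically_equal("u\u0308\u0304", "\u01d6"))  # True
# Macron first, then diaeresis: a different character (U+1E7B)
print(canonically_equal("u\u0304\u0308", "\u01d6"))  # False
# a followed by the spacing DEGREE SIGN is not a with ring above
print(canonically_equal("a\u00b0", "\u00e5"))        # False
```

The third case fails because both marks sit above the letter, so their relative order is significant and is preserved by normalization.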

Various symbols

Here is a table of the constellations of the Zodiac, in which the first column should contain the relevant astrological symbol:

Sym. | English name | Latin name | Latin genitive | α star
♈ | The Ram | Aries | Arietis | Hamal
♉ | The Bull | Taurus | Tauri | Aldebaran
♊ | The Twins | Gemini | Geminorum | Castor
♋ | The Crab | Cancer | Cancri | Acubens
♌ | The Lion | Leo | Leonis | Regulus
♍ | The Virgin | Virgo | Virginis | Spica
♎ | The Scales | Libra | Libræ | Zuben el Genubi
♏ | The Scorpion | Scorpius | Scorpii | Antares
♐ | The Archer | Sagittarius | Sagittarii | Rukbat
♑ | The Sea Goat | Capricornus | Capricorni | Giedi
♒ | The Water Bearer | Aquarius | Aquarii | Sadalmelik
♓ | The Fishes | Pisces | Piscium | El Rischa

The following table should show a chessboard, with a pictorial representation of the pieces:

ABCDEFGH
8
7
6
5
4
3
2
1

Here is a snowflake: ❄.

Some verses in Russian

The following is a five-verse extract from the introduction to the poem Mednyj Vsadnik (“The Bronze Horseman”) by A. S. Pushkin (in Russian):

По оживлённым берегам
Громады стройные теснятся
Дворцов и башен; корабли
Толпой со всех концов земли
К богатым пристаням стремятся;

Here is what the above might look like if your browser supports the Cyrillic block of Unicode:

[Five verses in Russian]

And here is a transcription of it:

Po oživlënnym beregam
Gromady strojnye tesnâtsâ
Dvorcov i bašen; korabli
Tolpoj so vseh koncov zemli
K bogatym pristanâm stremâtsâ;

A rough translation might be:

Along the animated banks [of the Neva] / the shapely masses press / of palaces and towers; ships / in crowds from all corners of the Earth / rush toward its rich quays.

Some verses in ancient Greek

The following verses are lines 1182–1185 of the tragedy Oedipus Rex by Sophocles (in ancient Greek):

Ἰοὺ ἰού· τὰ πάντʼ ἂν ἐξήκοι σαφῆ.
Ὦ φῶς, τελευταῖόν σε προσϐλέψαιμι νῦν,
ὅστις πέφασμαι φύς τʼ ἀφʼ ὧν οὐ χρῆν, ξὺν οἷς τʼ
οὐ χρῆν ὁμιλῶν, οὕς τέ μʼ οὐκ ἔδει κτανών.

Here is what the above might look like if your browser supports the Greek and Greek Extended blocks of Unicode (note that this representation uses the wrong shape of beta on the second line, because I didn't have the right one in the font I used):

[Four verses in Greek]

And here is the transcription of it:

Iou iou; ta pant' an exēkoi saphē.
Ō phōs, teleutaion se prosblepsaimi nun,
hostis pephasmai phus t' aph' hōn ou khrēn, xun hois t'
ou khrēn homilōn, hous te m' ouk edei ktanōn.

A rough translation might be:

Alas! All would become clear. / O light, may I see you for the last time, / I who was born of those of whom it is a crime to be born, who live with those / with whom it is a crime to live, and who killed those whom I must not kill.

Some verses in Sanskrit

The following is one stanza of canto Ⅵ of the Kumāra-saṃbhava (“the birth of Kumāra”) by the great Sanskrit poet Kālidāsa:

पशुपतिरपि तान्यहानि कृच्छ्राद्
अगमयदद्रिसुतासमागमोत्कः ।
कमपरमवशं न विप्रकुर्युर्
विभुमपि तं यदमी स्पृशन्ति भावाः ॥

Here is what the above might look like if your browser supports the Devanāgarī block of Unicode:

[A stanza in Sanskrit]

And here is the transcription of it:

Paśupatirapi tānyahāni kṛcchrād
agamayadadrisutāsamāgamotkaḥ;
kamaparamavaśaṃ na viprakuryur
vibhumapi taṃ yadamī spṛśanti bhāvāḥ?

A rough translation might be:

And Paśupati passed those days with hardship, / eager for union with the daughter of the mountain. / Which other powerless [creature] would they not torment, / such emotions, when they affect even the powerful [Śiva]?

Some Chinese

The following are the first two lines of the Analects by Confucius:

子曰:「學而時習之,不亦說乎?有朋自遠方來,不亦樂乎?
人不知而不慍,不亦君子乎?」

有子曰:「其為人也孝弟,而好犯上者,鮮矣;
不好犯上,而好作亂者,未之有也。君子務本,本立而道生。
孝弟也者,其為仁之本與!」

Here is what the above might look like if your browser supports the CJK block of Unicode:

[Two lines in Chinese]

And here is the transcription of it:

Zǐ yuē: “Xué ér shí xí zhī, bú yì yuè hū? Yǒu péng zì yuǎnfāng lái, bú yì lè hū? Rén bù zhī, ér bú yùn, bú yì jūnzǐ hū?”

Yǒuzǐ yuē: “Qí wéi rén yě xiàodì, ér hàofànshàngzhě, xiǎn yǐ; bú hào fànshàng, ér hàozuòluànzhě, wèi zhī yǒu yě. Jūnzǐ wù běn, běn lì ér dào shēng. Xiàodì yě zhě, qí wéi rén zhī běn yú!”

A rough translation might be:

The Master [Confucius] said: “To study and to practice, it is a joy, isn't it? When friends come from afar, it is a pleasure, isn't it? If one remains unknown and isn't hurt, isn't one an honorable man?”

Master You said: “Few of the men who act well filially and fraternally are also fond of offending their superiors; men who are not fond of offending their superiors, but who like to cause trouble, such do not exist. The honorable man concerns himself with the foundations. Once the foundations are established, the Way is born. Is not acting well filially and fraternally the foundation of humanity?”

A Tamil name

The following is the original (Tamil) name of a famous mathematician:

ஸ்றீனிவாஸ ராமானுஜன் ஐயங்கார்

Here is what the above might look like if your browser supports the Tamil block of Unicode (note, however, that this representation is less than optimal, since the font I used didn't have the ‘sr’ ligature; so if the first two characters are replaced by a single one which looks very different, it is probably normal):

[A name in Tamil]

And here is a transcription of it:

Sṟīṉivāsa Rāmāṉujaṉ Aiyaṅkār

Here there can be no translation, of course, since this is a proper noun. But I note that the mathematician in question (1887–1920) is typically named “Srinivasa Ramanujan Iyengar” in English.

Some Arabic

The following lines are the first chapter of the Qur'an (note that the text runs right to left, and should probably be aligned on the right margin):

بِسْمِ ٱللّٰهِ ٱلرَّحْمـَٰنِ ٱلرَّحِيمِ

ٱلْحَمْدُ لِلّٰهِ رَبِّ ٱلْعَالَمِينَ

ٱلرَّحْمـَٰنِ ٱلرَّحِيمِ

مَـالِكِ يَوْمِ ٱلدِّينِ

إِيَّاكَ نَعْبُدُ وَإِيَّاكَ نَسْتَعِينُ

ٱهْدِنَــــا ٱلصِّرَاطَ ٱلمُسْتَقِيمَ

صِرَاطَ ٱلَّذِينَ أَنعَمْتَ عَلَيهِمْ غَيرِ ٱلمَغضُوبِ عَلَيهِمْ وَلاَ ٱلضَّالِّينَ

Here is what the above might look like if your browser supports the Arabic block of Unicode:

[Seven verses in Arabic]

And here is a transcription of it:

bismi ăl-la'hi ăr-raḥma'ni ăr-raḥiymi

ăl-ḥamdu li-lla'hi rabbi ăl-`a'lamiyna

ăr-raḥma'ni ăr-raḥiymi

ma'liki yawmi ăd-diyni

'iyya'ka na`budu wa-'iyya'ka nasta`iynu

ĭhdina' ăṣ-ṣira'ṭa ăl-mustaqiyma

ṣira'ṭa ăllaḏiyna 'an`amta `alayhim ġayri ăl-maġḍuwbi `alayhim wala' ăḍ-ḍa'lliyna

A rough translation might be:

In the name of God, the beneficent, the merciful.

Praise be to God, lord of the worlds.

The beneficent, the merciful.

Master of the day of judgment.

Thee do we worship, and Thine aid we seek.

Lead us on the right path.

The path of those on whom Thou hast bestowed favors. Not of those who have earned Thy wrath, nor of those who go astray.