Comments on A Unicode-obfuscated programming language proposal

SB (2016-04-15T18:25:39Z)

@Hélène: https://web.archive.org/web/20080414231249/http://lorien.sdsu.edu/~carroll/shrub.html

Ruxor (2016-04-14T14:09:10Z)

I think it was a reference to this: <URL: http://www.youtube.com/watch?v=zIV4poUZAQo >.

Hélène (2016-04-14T06:35:59Z)

> it says NI and GOD at once

The link is completely broken and a Google search turns up nothing relevant. Can you explain what it was about? (if you remember…)

Fork (2011-08-01T15:28:06Z)

Maman Oo

bi (2005-11-07T11:34:51Z)

R, phi, Ruxor: Mongolian letters have initial, medial, and final forms, like in Arabic ( http://www.omniglot.com/writing/arabic.htm ). The Unicode charts list only one form for each letter, usually the initial form. If I remember correctly, the Code2000 font also includes only 1 form for each letter. The MonTeX package does generate the correct forms and even the correct (vertical) layout (!), but it works only with eLaTeX it seems.

jonas (2005-09-06T10:04:14Z)

Let me argue with your statement that most programming languages do not allow non-ASCII characters.

Java and Perl definitely allow arbitrary non-ASCII characters in their identifiers, even though people rarely use this feature. In Perl, you can declare the encoding of the source with the encoding pragma (the default is utf-8), so that you can write accented characters in your native encoding.

C++ also has some support for non-ASCII characters in identifiers, although I don't really know how.

AFAIK, Visual Basic allows non-ASCII characters in identifiers too, but I'm not sure how.

Non-English versions of MS Excel (or at least the Hungarian version) have the names of most built-in functions translated, so they include non-ASCII characters too. User-defined identifiers can also contain (certain) non-ASCII chars.

Perl6 will have at least three non-ASCII punctuation characters (0xAB, 0xBB, 0xA5) which have built-in meaning, although there are ASCII-only constructs with the same function as these.

(I personally don't like the way these last two languages use non-ASCII characters.)
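
As a hedged illustration of the point about identifiers, here is a minimal sketch in Python 3, a language this comment does not mention but which also accepts non-ASCII letters in identifiers (PEP 3131). The names prénom, aire_carré, and côté are invented for the example.

```python
# Python 3 accepts non-ASCII letters in identifiers (PEP 3131).
# All names here are hypothetical, chosen only to show accented identifiers.
prénom = "Ada"


def aire_carré(côté):
    """Return the area of a square with the given side length."""
    return côté * côté


# str.isidentifier() reports whether a string is a valid identifier.
print(prénom, aire_carré(3))
print("prénom".isidentifier())  # accented letters are allowed
print("«".isidentifier())       # punctuation such as U+00AB is not
```

Punctuation such as U+00AB is rejected here because Python only admits letter-like and digit-like characters in identifiers; other languages draw that line differently.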

Ewww (2005-09-05T15:43:02Z)

My mind is in a world of hurt!

Muriel (2004-12-12T09:18:24Z)

Just for the author:
it's funny, the temporary variable "無"
looks to me like a Chinese character meaning
"there is not" or "not". That Chinese character is the only part of the text that makes sense to me.

phi (2004-12-05T15:39:29Z)

one more page full of stuff
http://www.omniglot.com/writing/mongolian.htm
It's very adequate for other languages as well.

phi (2004-12-04T23:51:32Z)

btw, see also http://www.babelstone.co.uk/Scripts/Mongolian.html
One should also define the corresponding characters for hexadecimal digits whose value is greater than or equal to the normal number of our fingers.

R (2004-12-04T20:03:24Z)

There actually is a free Mongolian font available; it is in Type 1 format and is used for Mongolian support in LaTeX (which was achieved by the quite impressive Oliver Corff, a researcher in linguistics now working at "Freie Universitaet Berlin"). Quite unfortunately, it is not Unicode-compliant, but it could be converted (something I already started doing a while ago, actually).

Anyway, you may find it at
<URL: ftp://ftp.ctan.org/tex-archive/language/mongolian/montex/contrib/montex.type1/bicig/>
(try bcghsm.pfb, for example) and its various mirrors all over the world (see http://www.ctan.org/ for a complete list).

And by the way, I will remember to try and avoid speaking on specific subjects at dinner from now on ;-)

bort (2004-12-04T15:24:50Z)

To a certain extent, the relatively well-known paper "Generalizing Overloading for C++2000" by Bjarne Stroustrup uses some of the same ideas as your proposed language. It doesn't mandate non-ASCII identifiers, although it allows them.

<URL: http://www.research.att.com/~bs/whitespace98.pdf>

phi (2004-12-04T13:12:40Z)

The TrueType font Code2000 does have Mongolian characters, from U+1800 : MONGOLIAN BIRGA to U+18A9 : MONGOLIAN LETTER ALI GALI DAGALGA. And it's donation-ware.
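
One way to check those two endpoints is sketched below in Python, using the standard unicodedata module. The helper name mongolian_names is made up for this example, and which code points inside the range carry a name depends on the Unicode version bundled with your Python.

```python
import unicodedata

FIRST, LAST = 0x1800, 0x18A9  # range quoted in the comment above


def mongolian_names(first=FIRST, last=LAST):
    """Yield (code point, name) for every named character in the range."""
    for cp in range(first, last + 1):
        # name() returns the default (None) for unassigned code points.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            yield cp, name


names = dict(mongolian_names())
print(f"U+{FIRST:04X}", names[FIRST])
print(f"U+{LAST:04X}", names[LAST])
```

This only lists what the character database knows; whether a glyph actually shows up on screen still depends on the installed fonts.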

Ruxor (2004-12-04T13:00:30Z)

kox → I'm afraid I don't know of any Mongolian fonts that would be freely available. (Besides, there's the annoying complication that it's supposed to be vertically written, and only CSS3 handles this properly, and there exist no full implementations of CSS3 to date, only very partial ones.) If you merely wish to know what they look like, you can go to the Unicode web site and download the PDF code chart for Mongolian (<URL: http://www.unicode.org/charts/PDF/U1800.pdf >); and if you're a PDF guru you can actually extract the vector font which is embedded in the PDF file (doing so is illegal, of course).

phi (2004-12-04T12:47:03Z)

Oh no, come on, let's stop doing bad popularization: our readers have the right to know, and to know everything.

For variables, one can use the Latin alphabet, provided of course that one uses the series starting at U+1D434 : MATHEMATICAL ITALIC CAPITAL A in plane 01. For subscripts, it suffices to use U+2080 : SUBSCRIPT ZERO.

For numbers, the series U+1D7CE : MATHEMATICAL BOLD DIGIT ZERO is the obvious choice for all exactly represented values. For floating-point approximations, it is better to use U+1D7E2 : MATHEMATICAL SANS-SERIF DIGIT ZERO. For fixed-size numbers (32-bit integers, for example), you must use U+1D7E2 : MATHEMATICAL SANS-SERIF DIGIT ZERO. Thanks to this scheme, you no longer need to declare beforehand the integer, float, or real type of your numeric constants.

Ah, one last detail: every program line must carry a line number as in BASIC, and that line number must be written with U+FF10 : FULLWIDTH DIGIT ZERO. In addition, you are entitled to 10 goto labels, named U+E0030 : TAG DIGIT ZERO.

There, I hope I have clarified the use of digits. I wish you long hours of happy programming.
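
For anyone wanting to play with these digit series, here is a rough sketch in Python. It is not part of the proposal: the SERIES table and the helpers encode and series_of are invented for illustration, and the mapping from series to numeric type is deliberately left out, since the comment assigns U+1D7E2 to two different roles.

```python
import unicodedata

# Code point of the digit zero in each series named in the comment.
SERIES = {
    "bold": 0x1D7CE,        # MATHEMATICAL BOLD DIGIT ZERO
    "sans-serif": 0x1D7E2,  # MATHEMATICAL SANS-SERIF DIGIT ZERO
    "fullwidth": 0xFF10,    # FULLWIDTH DIGIT ZERO
}


def encode(n, series):
    """Write the non-negative integer n with the digits of the given series."""
    base = SERIES[series]
    return "".join(chr(base + int(d)) for d in str(n))


def series_of(text):
    """Return the series name of the first digit in text, or None."""
    if not text:
        return None
    first = ord(text[0])
    for name, base in SERIES.items():
        if base <= first <= base + 9:
            return name
    return None


line_number = encode(10, "fullwidth")
exact = encode(42, "bold")
# int() accepts any Unicode decimal digits, so both strings parse directly.
print(line_number, int(line_number), series_of(line_number))
print(exact, int(exact), series_of(exact))
print(unicodedata.name(exact[0]))
```

The sketch covers only the three decimal digit series. U+2080 : SUBSCRIPT ZERO and U+E0030 : TAG DIGIT ZERO are not decimal digits in the Unicode database, so int() would not accept them and they would need their own handling.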

kox (2004-12-04T09:31:29Z)

And if I don't have the Mongolian digits, how can I get them? At least under Firefox, because I want to see what they look like.

bidibulle (2004-12-04T09:30:52Z)

And what about Brainfuck???

Ruxor (2004-12-04T01:14:21Z)

Ouarf…

Joël (2004-12-04T01:01:41Z)

U+2302 HOUSE / U+00A0 NO-BREAK SPACE / U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK / U+1D157 MUSICAL SYMBOL VOID NOTEHEAD / U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
U+2639 WHITE FROWNING FACE
U+043F CYRILLIC SMALL LETTER PE / U+0438 CYRILLIC SMALL LETTER I / U+0441 CYRILLIC SMALL LETTER ES / U+0430 CYRILLIC SMALL LETTER A / U+0442 CYRILLIC SMALL LETTER TE / U+044C CYRILLIC SMALL LETTER SOFT SIGN / U+00A0 NO-BREAK SPACE / U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK / U+201C LEFT DOUBLE QUOTATION MARK / U+261E WHITE RIGHT POINTING INDEX / U+2420 SYMBOL FOR SPACE / U+2642 MALE SIGN / U+2420 SYMBOL FOR SPACE / U+24D4 CIRCLED LATIN SMALL LETTER E / U+24E2 CIRCLED LATIN SMALL LETTER S / U+24E3 CIRCLED LATIN SMALL LETTER T / U+2420 SYMBOL FOR SPACE / U+24BB CIRCLED LATIN CAPITAL LETTER F / U+24C4 CIRCLED LATIN CAPITAL LETTER O / U+24CA CIRCLED LATIN CAPITAL LETTER U / U+0589 ARMENIAN FULL STOP / U+2424 SYMBOL FOR NEWLINE / U+201D RIGHT DOUBLE QUOTATION MARK / U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK / U+0964 DEVANAGARI DANDA
U+263A WHITE SMILING FACE
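
For readers who would rather see the characters than their names, the spelled-out notation above can be turned back into text. This is a minimal sketch in Python, assuming items are separated by " / " and each item has the form "U+XXXX NAME"; the function name decode is invented for the example.

```python
import unicodedata


def decode(spelled):
    """Convert 'U+XXXX NAME / U+YYYY NAME' notation into the characters."""
    chars = []
    for item in spelled.split(" / "):
        code, _, name = item.strip().partition(" ")
        if not code.startswith("U+"):
            raise ValueError(f"not a code point: {code!r}")
        ch = chr(int(code[2:], 16))
        # Cross-check the spelled-out name against the character database.
        if unicodedata.name(ch, "") != name:
            raise ValueError(f"{code} is not named {name!r}")
        chars.append(ch)
    return "".join(chars)


first_line = (
    "U+2302 HOUSE / U+00A0 NO-BREAK SPACE / "
    "U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK / "
    "U+1D157 MUSICAL SYMBOL VOID NOTEHEAD / "
    "U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK"
)
print(decode(first_line))
print(decode("U+2639 WHITE FROWNING FACE"))
```

The name check is there so that a mistyped code point raises an error instead of silently producing the wrong character.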

