For those not familiar with the expression, the phrase “Turing
tarpit” refers to a situation in computer science (typically, a
programming language) from which “everything” is possible
(everything that a Turing machine can do can be done) but nothing is
easy. Some programming languages (such as the dire Unlambda which I unleashed
upon the world) achieve this situation deliberately, but more
frequently it occurs unsought for. As a matter of fact, I do believe
we are all—all of computing, that is—in the Turing tarpit.
Some are deeper than others, of course.
Here, then, is a thought rant as to what archaeologists
of future times will find within the tarpit, under the auspices of a
famous epigram by Alan Perlis:
“A programming language is low level when its programs require attention to the irrelevant.”
“The irrelevant” refers to anything that is not part of the
“algorithm” behind the problem, the latter being what one would
describe to a fellow human being when explaining how things work; or,
if you will, the algorithm is what requires inspiration whereas
the irrelevant is merely perspiration (apologies due to Thomas
Edison). The irrelevant can be further categorized into: syntactic
sugar (which, Perlis notes in another epigram, causes cancer of the
semicolon), propitiatory rites (to ward off evil deities), reinventing
the wheel (because you need a different color of wheel than the ones
provided in the tarpit), feeding fuel to the Shaddock pump (if you
haven't lived in France, you might not know what Shaddocks
are—besides a kind of grapefruit—but it doesn't matter
much), keeping track of the zorkmids, taking out the garbage, and
generally working around the many gratuitous barriers which someone
decided to erect in your way to prevent you from reaching your goal
too easily. Naturally, what is irrelevant to one man, or in one
context, can be of the highest importance elsewhere: another insidious
form of tarpit is that in which one lacks control over something
because it was arbitrarily categorized as irrelevant.
The dominant programming language of our times is probably
C. Accordingly, C is responsible for the
sorry state of most programs nowadays: it lacks even elective bounds
checking, strict type checking, stack checking and integer overflow
checking, whether static (at compile time) or dynamic (at run time);
and, certainly, C's absence of any kind of optional
verification is the cause of a great many bugs and security holes
in a vast repertoire of programs (one might argue, of course, that the
bug is always the programmer's fault: prima facie this is
admittedly true, but it is at least debatable whether the programmer's
role is to count the zorkmids like a Byzantine monk, just as it is
debatable whether a mathematician's work is to provide complete proofs
in some formal system, a similarly tedious kind of job). Another of
C's nastinesses is the lack of any sort of exception mechanism
(even flow control is very limited: in the absence of labeled
break statements, goto must be used instead), except for
setjmp()/longjmp(), which has been so carefully maimed as to make
it useless. Let us also
mention the lack of inner (nested) functions, and the necessity of
explicitly constructing any kind of closure or continuation as a
manually allocated structure (for this reason, all callback data is
systematically passed as void* and benefits from no kind
of type checking), making all manner of functional programming or
polymorphism absolutely impractical. C encourages the
use of null-terminated strings, which cause all sorts of problems
(such as mishandling of null characters, possibly with underlying
security problems, in a huge number of programs). C
requires the programmer to collect all his garbage himself: not only
does it not promote the use of a garbage collector (or
promote a specific one, and this is probably a good thing, because
having a GC, let alone a specific GC, forced
upon oneself, is not always nice), it actually discourages it in every
way (for example, most of C's very extensive external
library is only remotely usable with a GC), and allows
only very conservative garbage collection. And let's not even mention
the worthlessness of C's absurdly complex and
aggressively syntactical preprocessor.
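To make the complaint about null-terminated strings concrete, here is a minimal sketch in C. The type counted_str and the functions c_visible_length and counted_equal are hypothetical names invented for this illustration, not part of any standard library; the only real library calls are strlen and memcmp. It shows how a zero byte in the middle of the data silently truncates what the C string functions see, and what carrying the length explicitly would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the bytes,
   so a zero byte is just another byte. */
struct counted_str {
    const char *data;
    size_t len;
};

/* Length as the C string functions see it: everything after the
   first zero byte is invisible. */
size_t c_visible_length(const char *bytes)
{
    assert(bytes != NULL);
    return strlen(bytes);
}

/* Compare two counted strings byte for byte, embedded zero bytes
   included. Returns 1 if equal, 0 otherwise. */
int counted_equal(struct counted_str a, struct counted_str b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

Given the 10-byte buffer "user\0admin", c_visible_length reports 4 and strcmp considers the buffer equal to "user", whereas counted_equal, handed the true lengths, does not. The price, of course, is that the programmer now counts the zorkmids by hand.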
C was invented so that Ken Thompson and Dennis Ritchie could write Unix for the PDP-11 in something other than assembly language, and it shows: it was designed to function as a portable assembler for Digital's computers, and it is dubious whether it is good for anything else (even as a general-purpose portable assembler, it isn't very good as it obscures many implementation details). C makes modular programming (at least top-down, functional and callbackable modular programming) very difficult, and any kind of code reusability is severely limited by the way C works (and the flat namespace is only one aspect of the problem). De facto, code reusability is low, metaprogramming is practically nonexistent, even automated code analysis is extremely difficult, data structures are kept to a bare minimum, and all maintenance work on the code (such as upgrading to a new set of specifications, or providing compatibility bindings) must be done by hand. Worse: all these limitations have so severely penetrated the minds of programmers that many of them are persuaded that they are some fundamental part of computer science.
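As a rough illustration of what callback-style modularity costs in C, here is a sketch of a closure built by hand. All the names (int_visitor, for_each_int, sum_above_env, add_if_above, sum_above) are made up for this example, and the environment lives on the stack rather than on the heap to keep things short. The point is that the captured variables must be packed into a structure by the programmer and passed through an untyped void* that the compiler cannot check.

```c
#include <assert.h>
#include <stddef.h>

/* Generic traversal: the callback's captured state travels as an
   untyped pointer, so nothing verifies that it has the right type. */
typedef void (*int_visitor)(int value, void *context);

void for_each_int(const int *items, size_t count,
                  int_visitor visit, void *context)
{
    for (size_t i = 0; i < count; i++)
        visit(items[i], context);
}

/* Hand-built closure environment: what a nested function would
   have captured automatically. */
struct sum_above_env {
    int threshold;  /* captured variable */
    long total;     /* result accumulated across calls */
};

/* The closure body. The conversion from void* is unchecked: passing
   any other pointer here compiles without complaint. */
void add_if_above(int value, void *context)
{
    struct sum_above_env *env = context;
    assert(env != NULL);
    if (value > env->threshold)
        env->total += value;
}

/* Sum the items strictly greater than threshold. */
long sum_above(const int *items, size_t count, int threshold)
{
    struct sum_above_env env = { threshold, 0 };
    for_each_int(items, count, add_if_above, &env);
    return env.total;
}
```

Called on the array {1, 5, 10, 20} with a threshold of 4, sum_above returns 35. In a language with nested functions the environment structure, the cast and the extra parameter would all disappear, which is precisely the sort of perspiration the epigram calls irrelevant.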
Given all these problems, one may ask of me: why program mostly in
C, then? Besides inertia, there is the simple fact that
all libraries are written for and in C, and using them in
any other language means a load of extra work to convert the bindings
and very little gain since the library semantics are in any case
restricted to those that C can afford; and among these
“libraries” I count the operating system I use, Unix, which is
sadly dependent on C from its inception. So
C is the mammoth that is dragging us all with it into the
tarpit. In any case, I have almost entirely ceased to program, given
that all programming languages that currently exist are either full of
tar or unusable, or both: it seems to me that the only really useful
program to write would be a compiler for an entirely new programming
language.
What about existing languages other than C?
Java, for example, seems to improve considerably upon some of
C's most ridiculous limitations. Still, Java is rather
conservative in the way it departs from C, its main
improvements being usable exceptions and generalized garbage
collection: these are not certain to compensate for the practical problems
associated with Java (the lack of a really usable free-as-in-speech
implementation, and the slowness of the existing environments). Even
insofar as it differs from C, Java is not perfect either
(it has pretty much failed at providing polymorphism).
Likewise, C++ provides some improvements on
C (but still no garbage collector), at the cost of an
unreasonable complexity of the language. Functional programming
languages are no better: OCaml is plagued by, besides a horrible
syntax, a type system with an unreasonable number of features, the sum
of all previous research work and experimentation; Haskell is highly
elegant and very understandable, but its lazy evaluation is so
pervasive and impossible to escape that it becomes a plague (not even
counting the consequently atrocious performance); Scheme is a
toy language: its ridiculous standard library (not even a decent
printf!) competes with the fact that one must create types
“by hand” by composing pairs and tags to make the language
useless. Common Lisp presumably has every imaginable feature
from every imaginable language, but this makes it impossibly
complicated. Perl is as ugly as my worst nightmares (except that
perhaps Perl6 will be better, but so far this is complete vaporware);
Python is better but still has some annoying defects (the gratuitous
separation of expressions and instructions, which has no place in this
kind of imperative high-level language, is certainly one). Prolog, or
its much better cousin Mercury, is too specialized. The list is long
(even just counting languages which I know: I have yet to learn
Smalltalk, for example).
The bottom line is that good—high-level—programming
languages, in my opinion, are still to be invented. But I don't think
it would be impossibly complicated: merely learning the design
principles and the motivating logic behind a number of very different
languages which come closest to being “good” (say: Algol,
ML, Haskell, Scheme, Common Lisp, Smalltalk, Java,
PostScript, Mercury, Dylan, Erlang and Python), and avoiding each
one's most glaring mistakes, should be a good start. (I'm not saying
it is possible to produce a language that is excellent in every
circumstance; but it is assuredly possible to produce one which is not uniformly
catastrophic.) Then we might start escaping the Turing tarpit,
and then we can start thinking about truly novel features for
languages (one that comes to my mind is: bounded resource
reflexivity/sandboxes, which I believe no known language implements in
any way).