Ian Hickson has written an
interesting note on why
not to use XHTML for the moment. He raises some very
interesting issues. One of them is that the overwhelming majority of
Web authors are hopelessly clueless and will just copy their
HTML code from some other site or some poorly written
book. Now when they start copying things like
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
and
<?xml version="1.0" encoding="utf-8"?>
without understanding what they mean, then we have problems. Also when
they start writing <br> when they should have written <br />
or vice versa, because they haven't heard of XHTML or don't know
how it differs from SGML-based HTML. Hell
will not break loose now, because existing Web browsers have
been built to be very fault-tolerant, but it may break loose
in the future.

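To make the <br> versus <br /> difference concrete, here is a minimal
Python sketch that feeds both spellings to the strict XML parser from
the standard library. It is only an illustration added here, not
something taken from Ian Hickson's note, and the helper name
is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if `fragment` parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# XHTML spelling: the empty element is explicitly closed.
xhtml_style = "<p>first line<br />second line</p>"
# SGML-based HTML spelling: <br> is never closed, so an XML parser
# reports a mismatched tag when it reaches </p>.
html_style = "<p>first line<br>second line</p>"

print(is_well_formed(xhtml_style))  # True
print(is_well_formed(html_style))   # False
```

A fault-tolerant browser displays both fragments the same way, which
is exactly why the mistake goes unnoticed today.
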
So, my important advice to Web authors: if you don't want to write
markup that validates (or if this all sounds Chinese to you), fine,
but then make sure of one thing: don't include
in your HTML code anything which contains the characters <?
or the word DOCTYPE. Just don't. Unless you know
exactly what they mean, that is, and are prepared to face the
consequences. If you don't, what you're writing is known as tag
soup, and the correct way to start an HTML tag soup
is with <html>
(or perhaps <html lang="en">
or some such thing).
If you start the document with
<?xml version="1.0" encoding="utf-8"?>
then you are promising well-formed
XML, so you had better know what this means. If you
include a line such as
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
then you are promising markup that validates against a specific
DTD, so you had better check that it does validate. If
you aren't prepared to go through all that, then start with
<html>
and write tag soup, which just
works.

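The rule of thumb above can be sketched as a small Python function
that reports what the opening of a document commits its author to.
This is a rough illustration under simple assumptions: the name
declared_promises is hypothetical, the checks are plain string tests,
and the function only lists what is claimed. It does not verify that
the claim holds, which takes a real XML parser or validator.

```python
def declared_promises(document):
    """List what the prologue of `document` promises to the reader."""
    promises = []
    # An XML declaration announces that what follows is well-formed XML.
    if document.lstrip().startswith("<?xml"):
        promises.append("well-formed XML")
    # A DOCTYPE names a DTD that the markup claims to validate against.
    if "<!DOCTYPE" in document:
        promises.append("valid against the declared DTD")
    # Neither marker present: no promise made, so tag soup is fair game.
    return promises or ["nothing: plain tag soup"]


tag_soup = "<html><p>hello"
strict = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    "<html><head><title>t</title></head><body><p>hello</p></body></html>"
)

print(declared_promises(tag_soup))
print(declared_promises(strict))
```
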
Now why do I write XHTML when Ian Hickson
quite rightly points out that it will bring me no advantage
whatsoever (since it is served as text/html
and not
application/xhtml+xml
)? Well, for one thing, my
XHTML is valid: but the point of being valid is
not that it makes the page any better per se, it
simply helps me check for some basic mistakes that even using
two-and-forty different Web browsers wouldn't catch. But also, quite
trivially, I find XHTML simpler to write than
HTML4: writing <br>
without ever closing the tag, for example, just seems wrong.
And when the pages are computer-generated it's even more obvious: it
is such a pain to write a program that will have to remember that the
<br> tag may not be closed, for example,
whereas in XHTML we simply close every tag, no questions
asked.
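
The point about computer-generated pages can be sketched in Python as
two tiny serializers. The function names are hypothetical and the
sketch ignores attributes and escaping; it only shows that the HTML 4
generator has to carry a table of elements whose end tag is forbidden,
while the XHTML one applies a single rule.

```python
# Elements declared EMPTY in the HTML 4.01 DTDs: their end tag is forbidden.
HTML4_EMPTY = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}


def render_html4(tag, children=()):
    """Serialize one element as HTML 4, special-casing empty elements."""
    if tag in HTML4_EMPTY:
        return f"<{tag}>"
    return f"<{tag}>{''.join(children)}</{tag}>"


def render_xhtml(tag, children=()):
    """Serialize one element as XHTML: every element gets closed."""
    inner = "".join(children)
    if not inner:
        return f"<{tag} />"
    return f"<{tag}>{inner}</{tag}>"


print(render_html4("p", ["one", render_html4("br"), "two"]))  # <p>one<br>two</p>
print(render_xhtml("p", ["one", render_xhtml("br"), "two"]))  # <p>one<br />two</p>
```

One caveat: for XHTML served as text/html, the compatibility
guidelines in XHTML 1.0 advise against the minimized form for elements
that merely happen to be empty, such as an empty paragraph, so a real
generator may still want to write <p></p> there.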