Personality tests abound on the Web: I have yet to find one that
would tell me something interesting, useful, insightful, or
non-trivial about myself; usually I end up answering most questions
more or less randomly (and I'm sure that running the test a second
time would give entirely different results). At least certain
tests have a "not sure" category for answers, which makes them a
little less haphazard. Anyway, I wonder who invents these quizzes:
manic taxonomists, I tend to think, who want to classify people
according to criteria which they think intellectually elegant (think
of the Keirsey or the enneagram tests: the
categories are certainly seductive in their taxonomical harmony, but
are they pertinent?), but use completely heuristic and unscientific
methods for devising their tools. I'm sure I could produce a lot of
fun but entirely meaningless personality tests that would classify you
as one of the four elements, or one of the sixty-four 易經
(Yi Jing) hexagrams, or one of the twenty-two major arcana of the
tarot, or one of the three cardinal causes (power, will and knowledge)
of my mumbo-jumbo philosophy, or
whatever (or perhaps as a combination of all this).

But how about a truly scientific test with a really pertinent
statistical basis? Can such a thing be devised and, if so, how? I
tend to believe that the actual questions are of little importance
(they might be as stupid as "what is your favorite color?", and
perhaps something about swallows, too, provided they are numerous
enough for statistical significance); what really
matters is how the data are interpreted.

If, for example, one asks three hundred yes/no questions, then a
data point is a point in the three-hundred-dimensional discrete
hypercube (or perhaps the full hypercube if the questions admit a
continuous 0-through-1 answer scale). If a sample population is
subjected to the test, one obtains a cluster of points in said
hypercube. Then, out of that cluster, one wishes to extract certain
statistically meaningful variables, or subpopulations: how can this be
done? Here's at least one way to do it (probably not a very smart
one, but at least it shows the sort of thing I'm after). Take the
line in the hypercube that minimizes the sum of squares of distances
to the population points: in other words, the one-dimensional
least-squares (Gaussian) best fit for the data, which is the first
principal axis in the language of principal component analysis.
Projecting the quiz result
point onto this line determines a one-parameter classification which is
in a certain way the most significant one (its main
drawback is that it takes the Euclidean structure on the cube as a
natural one, which it is not, so it is really biased by the choice of
questions; but you get the idea of the sort of things that could be
done). After that, if one wishes to extract a second, and a third,
variable from the quiz, one could simply take the best-fitting two- and
three-dimensional least-squares subspaces for the data points. Of course, *interpreting*
these variables demands some competence in psychology, but at least
their *definition* would be based on some objective statistical
data, not merely the test author's intuition on how to classify the
human mind.
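
To make the idea concrete, here is a minimal sketch in Python with NumPy, not a real test. The line that minimizes the sum of squared distances to the points passes through their mean, and its direction is the first principal axis; the next axes span the best-fitting plane, three-dimensional subspace, and so on. The function name `principal_axes`, the sample size of 200 and the randomly generated answers are all assumptions made up for illustration; no actual quiz data is involved.

```python
import numpy as np

def principal_axes(answers, n_components=3):
    """Least-squares axes of a cloud of quiz answers (hypothetical helper).

    answers: array of shape (n_people, n_questions), entries in [0, 1].
    Returns the mean point, the first n_components orthonormal axes,
    each person's coordinates along those axes, and the fraction of the
    total variance carried by each axis.
    """
    X = np.asarray(answers, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean  # the best-fit line passes through the mean
    # Rows of vt are orthonormal directions, sorted by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:n_components]
    scores = centered @ axes.T  # projection of each person onto the axes
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return mean, axes, scores, explained

# Synthetic sample (an assumption, not real data): 200 people answering
# 300 yes/no questions, with one hidden trait influencing every answer.
rng = np.random.default_rng(0)
hidden_trait = rng.normal(size=200)
question_weights = rng.normal(size=300)
p_yes = 1.0 / (1.0 + np.exp(-np.outer(hidden_trait, question_weights)))
answers = (rng.random((200, 300)) < p_yes).astype(float)

mean, axes, scores, explained = principal_axes(answers)
print("share of variance per axis:", explained)
print("first person's three coordinates:", scores[0])
```

A new respondent would be classified the same way: subtract `mean` from their answer vector and take the dot product with each row of `axes`. The sketch inherits the drawback already noted: it treats the Euclidean structure on the cube as natural, so the axes depend on the choice of questions.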

Has such a test been devised already? If not, I might try creating one myself, if only as a proof of concept. (The main difficulty would be finding a sample population willing to take the test even though they would get no results from it until all the data had been processed.)