David Madore's WebLog: Results of my pronunciation poll


Entry #2618

(Thursday)

Results of my pronunciation poll

A little over a week ago, I launched a not-at-all-scientific online poll on the pronunciation of English vowels, in order to gain some insight into (a) how much the written form of a word influences how we think it is pronounced, and (b) how well English pronunciation is taught to foreigners, especially in France, and what vowel distinctions they uphold, or think they do. I closed this poll on Wednesday after receiving 259 responses, of which 77 self-reported as native speakers and 182 as non-native speakers (there was also one entirely blank answer, which is not included in these statistics). This is a tabulation of results, along with some comments.

The poll consisted of 40 pairs of words (like pin / pen), displayed in a (fixed but) randomly chosen order, and for each pair, the respondent was asked whether they pronounce the words identically or not, with the choice given between four possible answers: identical, unclear / varies, distinct or don't know (instructions were given to choose the don't know answer when the respondent was not familiar with one of the words or how to pronounce it).

For each of the 40 pairs, I give below a table whose last two lines show the proportion of native and non-native speakers who chose each of the proposed answers (excluding those, never more than two, who skipped the question altogether); the most frequent answer in each category has been highlighted in green. The first two lines of the table give, as an asterisk (‘✱’), the “expected” answer for two (somewhat idealized or stereotypical) standardized accents: English Received Pronunciation and General American Pronunciation (sadly, I do not have any reliable dictionary of Australian pronunciation at hand). The phonetic transcription used to reach this conclusion is shown in the last column of these lines; sometimes, a question mark has been added to indicate that notable variant pronunciations make the answers in question also predictably plausible (or plausibly predictable). I also added some comments as to why the pair was included and what it was meant to test (and why, in some cases, it was stupid of me to include it).
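As a rough illustration of how one line of such a table can be computed, here is a minimal Python sketch. The answer labels, the `tabulate` function and the sample data are all hypothetical (they are not taken from the poll's raw results); the only rule borrowed from the description above is that respondents who skipped the question are left out of the denominator.

```python
from collections import Counter

# Hypothetical labels for the four possible answers.
ANSWERS = ("identical", "unclear", "distinct", "dont_know")

def tabulate(responses):
    """Return the share of each answer among respondents who answered.

    `responses` holds one entry per respondent: one of ANSWERS, or None
    when the respondent skipped the question (skips are excluded).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("nobody answered this question")
    counts = Counter(answered)
    return {a: counts[a] / len(answered) for a in ANSWERS}

# Made-up data: 10 respondents, 2 of whom skipped the question.
sample = ["identical"] * 4 + ["distinct"] * 3 + ["unclear"] + [None] * 2
shares = tabulate(sample)

# The answer that would be highlighted as the most frequent one.
most_frequent = max(shares, key=shares.get)
```

With this sample, the shares are computed over the 8 respondents who answered, not over all 10.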

The poll also asked the respondent where they learned English (in hindsight, it would have been better to also ask where they were from, and, in the case of non-native respondents, what their native tongue was; this suggestion was made in the comments, but I did not wish to alter the questions once the poll had started). The distribution of answers is as follows:

  • Native respondents (77): England 15 (including 6 from London, and including 2 who did not specify beyond UK, but presumed to be from England); United States 40 (mostly from California and the Midwestern US; but a few did not disclose beyond the country); Canada 4; Australia 12; New Zealand 2; others 1 (Wales and Nigeria); no answer 3.
  • Non-native respondents (182): overwhelmingly in France (117 answered France, possibly with a more specific place; another 9 included France as part of their answer); among the most common answers not including France were Russia (10, plus 1 including Russia), Germany (3), a few other non-English-speaking EU countries (17), and various English-speaking countries (10).

In the comments below I will use expressions such as native respondents from the US as a shortcut to designate respondents who self-reported as native English speakers and who answered the question of where they learned English with a place in the US (or the US without further information).

The highly skewed number of French respondents is due to the way the poll was announced (on my blog, which is mostly in French, and Twitter feed, which is partially in French).

English vowels are, of course, a mess (see also this old entry), and there isn't even any clear and definitive answer to how many different vowels (phonemes?) English has, let alone how they should be transcribed. The “lexical sets” chosen by John C. Wells (namely, the vowels of: KIT, DRESS, TRAP, LOT, STRUT, FOOT, BATH, CLOTH, NURSE, FLEECE, FACE, PALM, THOUGHT, GOAT, GOOSE, PRICE, CHOICE, MOUTH, NEAR, SQUARE, START, NORTH, FORCE, CURE) are an attempt at forming a repertoire (but no accent has a different vowel for each set, and conversely, some may subdivide some of the sets; a lexical set like CLOTH has the same vowel as LOT in RP and the same vowel as THOUGHT in GA; vowels with a following ‘r’ are generally classified separately; and the NURSE vowel is not even a single vowel in Irish accents), so it is used in giving the phonetic key below, and in discussions. I encourage learners of English to memorize this set of words, try to keep apart those which are indeed pronounced separately in the accent(s) they target (so, probably forget about the distinction between NORTH and FORCE), and try to note, whenever encountering a difficult vowel, which lexical set it relates to.

The following phonetic key has been used in transcription; it is a sort of hybrid between the one used in Wells's own Longman Pronunciation Dictionary (with the notable difference that /ɛ/ rather than /e/ has been used for the DRESS vowel), and the one used in Wiktionary (with the notable difference that some vowels have been marked with ‘ː’ even in American where such distinction of length is dubious):

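Lexical set  RP    GA
KIT          ɪ     ɪ
DRESS        ɛ     ɛ
TRAP         æ     æ
LOT          ɒ     ɑː
STRUT        ʌ     ʌ
FOOT         ʊ     ʊ
BATH         ɑː    æ
CLOTH        ɒ     ɔː
NURSE        ɜː    ɝː
FLEECE       iː    iː
FACE         eɪ    eɪ
PALM         ɑː    ɑː
THOUGHT      ɔː    ɔː
GOAT         əʊ    oʊ
GOOSE        uː    uː
PRICE        aɪ    aɪ
CHOICE       ɔɪ    ɔɪ
MOUTH        aʊ    aʊ
NEAR         ɪə    ɪɹ
SQUARE       ɛə    ɛɹ
START        ɑː    ɑːɹ
NORTH        ɔː    ɔːɹ
FORCE        ɔː    ɔːɹ
CURE         ʊə    ʊɹ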

It should be noted that, despite the transcription which distinguishes them, most Americans now do not seem to separate the LOT and THOUGHT vowels (this is the cot–caught merger), and, conversely, a small handful still pronounce the NORTH and FORCE vowels differently (in which case the latter might be transcribed /oːɹ/).
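One way to read the phonetic key is as a lookup table: two lexical sets count as merged in an accent when the key gives them the same symbol. The sketch below, in Python, encodes only a subset of the key (the sets discussed in this paragraph and the previous ones); the names `KEY`, `same_vowel` and `merged_groups` are made up for the illustration, and the result reflects the idealized transcription only, not what speakers actually do (the cot–caught merger, for instance, is not visible in it).

```python
# Subset of the phonetic key: lexical set -> (RP symbol, GA symbol).
KEY = {
    "TRAP":    ("æ",  "æ"),
    "BATH":    ("ɑː", "æ"),
    "PALM":    ("ɑː", "ɑː"),
    "LOT":     ("ɒ",  "ɑː"),
    "CLOTH":   ("ɒ",  "ɔː"),
    "THOUGHT": ("ɔː", "ɔː"),
    "NORTH":   ("ɔː", "ɔːɹ"),
    "FORCE":   ("ɔː", "ɔːɹ"),
}
ACCENTS = {"RP": 0, "GA": 1}

def same_vowel(set_a, set_b, accent):
    """True if the key transcribes both lexical sets identically in `accent`."""
    column = ACCENTS[accent]
    return KEY[set_a][column] == KEY[set_b][column]

def merged_groups(accent):
    """Group the lexical sets that share one transcription in `accent`."""
    groups = {}
    for name, symbols in KEY.items():
        groups.setdefault(symbols[ACCENTS[accent]], []).append(name)
    return [names for names in groups.values() if len(names) > 1]

cloth_is_lot_in_rp = same_vowel("CLOTH", "LOT", "RP")          # True
cloth_is_thought_in_ga = same_vowel("CLOTH", "THOUGHT", "GA")  # True
lot_is_thought_in_ga = same_vowel("LOT", "THOUGHT", "GA")      # False in the key
ga_groups = merged_groups("GA")
```

In this subset, the GA column pairs TRAP with BATH, PALM with LOT, CLOTH with THOUGHT, and NORTH with FORCE.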

Caveat: While the percentages in the tables have been computed automatically, everything else is written by hand, and, as humans are prone to making mistakes and I am exceptionally human, probably littered with mistakes of all sorts. Percentages might not sum to 100% because of rounding, of course; concerning rounding, I have rounded to the nearest integer or, in case of a tie (which occurs fairly frequently because I had 40 native respondents from the US and I often give the details for those), to the nearest even integer.
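The rounding rule just described (nearest integer, ties going to the nearest even integer) can be sketched as follows in Python. This is not the script actually used for the tables: the function name `percent_half_even` is made up, and the counts are hypothetical figures for a group of 40, chosen so that two of them land exactly on a tie. Integer arithmetic is used so that ties are detected exactly.

```python
def percent_half_even(count, total):
    """Percentage of `count` in `total`, rounded to the nearest integer,
    with exact ties rounded to the nearest even integer."""
    if total <= 0:
        raise ValueError("total must be positive")
    quotient, remainder = divmod(100 * count, total)
    # Round up if past the halfway point, or exactly halfway with an odd quotient.
    if 2 * remainder > total or (2 * remainder == total and quotient % 2 == 1):
        quotient += 1
    return quotient

# Hypothetical counts for 40 respondents:
# identical, unclear, distinct, don't know.
counts = [23, 10, 7, 0]
percents = [percent_half_even(c, sum(counts)) for c in counts]
total_percent = sum(percents)
```

Here 23/40 = 57.5% and 7/40 = 17.5% are both ties and both round up to the even neighbour (58 and 18), so the rounded figures sum to 101 rather than 100. Conversely, 5/40 = 12.5% would round down to 12.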

For those who wish to analyse the results themselves, the raw results are here.

warn / worn
        Ident.  Uncl.  Dist.     DK  warn / worn
RP                                   wɔːn
GA                                   wɔːɹn
Native     71%     9%    20%     0%
NonNat     27%     7%    62%     4%

This pair is homophonous in all English accents I know of. It was included to test the effect of spelling differences, and as a possible comparison with the farm / form question.

fairy / ferry
        Ident.  Uncl.  Dist.     DK  fairy / ferry
RP                                   ˈfɛəɹi / ˈfɛɹi
GA                                   ˈfɛɹi
Native     42%     8%    51%     0%
NonNat     24%    13%    62%     1%

This pair is a test of the Mary–merry merger (merger of SQUARE and DRESS vowels before intervocalic ‘r’) which occurred in North American accents. Of the 40 native respondents reporting from the US, 70% reported identical, 12% reported unclear and 18% reported distinct for this question.

spear it / spirit
        Ident.  Uncl.  Dist.     DK  spear it / spirit
RP                                   ˈspɪəɹɪt / ˈspɪɹɪt
GA                                   ˈspɪɹɪt
Native     42%     5%    53%     0%
NonNat     14%     9%    74%     3%

This pair was included as a test of the merger of KIT and NEAR vowels before intervocalic ‘r’ (sometimes known as the mirror–nearer merger, and analogous to the Mary–merry merger discussed above) which occurred in North American accents. Of the 40 native respondents reporting from the US, 65% reported identical, 10% reported unclear and 25% reported distinct for this question.

fire / far
        Ident.  Uncl.  Dist.     DK  fire / far
RP ?                                 faɪə / fɑː
GA                                   ˈfaɪɚ / fɑːɹ
Native      0%     0%   100%     0%
NonNat      0%     2%    98%     0%

This pair was included as a test of vowel smoothing: it is expected that in some English accents, [faɪə] can smooth to [faə] or [fɑə] or even monophthongize to [faː] or [fɑː], which could then make it homophonous with far. Upon further research, it seems a front vowel (closer to [a]) is to be expected in this word, and tire / tar would have been a more plausible test; see also tower / tar below.

law / lore
        Ident.  Uncl.  Dist.     DK  law / lore
RP                                   lɔː
GA                                   lɔː / lɔːɹ
Native     29%     5%    66%     0%
NonNat     12%     7%    76%     5%

This is a classic test of rhoticity. As expected, while 98% of the native respondents reporting from the US marked this pair as distinct (the remaining one as unclear), 80% of the (15) respondents from England said these were identical (the other 20% called them distinct).

I was also interested in knowing whether the non-native respondents would report the words as identical, especially given that most English teachers in France are presumably from the UK; I suspect, however, that ‘r’-dropping is not really covered in English classes in France.

ant / aunt
        Ident.  Uncl.  Dist.     DK  ant / aunt
RP ?                                 ænt / ɑːnt
GA ?                                 ænt
Native     39%    16%    45%     0%
NonNat     13%     9%    75%     3%

These words are distinct in RP (following the TRAP and BATH (or PALM) vowels respectively) and identical in American; there are, however, a sizable number of Americans who pronounce aunt with the PALM vowel, and there could be some variability in English accents as well. Of the 40 native respondents reporting from the US, 58% reported identical, 25% reported unclear and 18% reported distinct for this question; while of the 15 native respondents reporting from England, 73% reported distinct (leaving only three identical and one unclear).

full / fool
        Ident.  Uncl.  Dist.     DK  full / fool
RP                                   fʊl / fuːl
GA                                   fʊl / fuːl
Native      3%     4%    94%     0%
NonNat     24%    12%    63%     1%

This is a merger (between the FOOT and GOOSE vowels) expected to occur in Scottish accents, but as was pointed out to me, the example is perhaps badly chosen given that postvocalic ‘l’ tends to alter vowel quality significantly (especially in Great Britain); maybe look / Luke would have been better.

I was also interested to know how French respondents would do on this one, since the FOOT and GOOSE vowels seem to merge in French accents in English (in much the same way as KIT and FLEECE). Of the 117 non-native respondents reporting from France specifically, 32% reported identical, 10% reported unclear, 56% reported distinct and one (1%) not knowing.

sun / son
        Ident.  Uncl.  Dist.     DK  sun / son
RP                                   sʌn
GA                                   sʌn
Native    100%     0%     0%     0%
NonNat     27%    10%    61%     1%

I was surprised by the results of this one: as far as I can tell, sun and son are homophonous in every accent of English. (It is worth recalling that they had the same vowel in Old English and that the distinction in spelling is a historical artifact of medieval handwriting, in which ‘o’ was written in place of ‘u’ next to letters made of minim strokes, such as ‘n’, to avoid some ambiguities.) And I didn't expect such common words to cause so much trouble. I am tempted to say that one of the lessons here is that one shouldn't expect learners of English to just pick up these things: they need to be pointed out explicitly.

horse / hoarse
        Ident.  Uncl.  Dist.     DK  horse / hoarse
RP                                   hɔːs
GA ??                                hɔːɹs
Native     97%     1%     1%     0%
NonNat     46%    16%    20%    18%

This is a classic merger between NORTH (horse, morning, for, short, fork, corn) and FORCE (hoarse, mourning, four=fore, sport, pork, torn) which is reported to be complete in almost every accent of English, and the distinction between these two sets can be considered mostly dead. The two native speakers who reported the words as not identical were from Yorkshire (distinct) and South Australia (unclear / varies). See also morning / mourning and for / four below.

pain / pane
        Ident.  Uncl.  Dist.     DK  pain / pane
RP                                   peɪn
GA                                   peɪn
Native    100%     0%     0%     0%
NonNat     71%    11%    14%     4%

This merger (between earlier /ɛi/ and /ɛː/) has been complete since just after the Great Vowel Shift (the long mid mergers, around the 16th century), and people making the distinction are expected to be long dead. 😉

hire / higher
        Ident.  Uncl.  Dist.     DK  hire / higher
RP ?                                 haɪə
GA ?                                 ˈhaɪɚ
Native     64%    17%    19%     0%
NonNat     69%    15%    15%     1%

The main issue here is the number of syllables: is the word analysed as a single syllable with the PRICE vowel followed by ‘r’ (a syllable which could then undergo smoothing); or is it disyllabic, having one stressed syllable with the PRICE vowel followed by a weak one with a (possibly rhoticized) schwa sound? Or is the distinction meaningless? And if not, is the disyllabic version more likely to occur in higher where it is suggested by the morphemic analysis high+er? The results are unclear and may warrant further investigation (a related issue is whether idea and I, dear are homophonous in RP).

threw / through
        Ident.  Uncl.  Dist.     DK  threw / through
RP                                   θɹuː
GA                                   θɹuː
Native     92%     4%     4%     0%
NonNat     61%    13%    25%     2%

luck / look
        Ident.  Uncl.  Dist.     DK  luck / look
RP                                   lʌk / lʊk
GA                                   lʌk / lʊk
Native      3%     0%    97%     0%
NonNat      1%     0%    99%     0%

This question tests a split of Middle English short ‘u’ into the STRUT (/ʌ/, as in luck, putt, rush) and FOOT (/ʊ/, as in look, put, push) vowels; this split did not occur in Northern England (one of the two native respondents who reported these words as identical did indeed indicate North-West England as origin; the other just wrote United Kingdom).

The STRUT–FOOT distinction is thought to be unproblematic for French speakers (who rather tend to conflate FOOT with GOOSE, see above).

would / wood
        Ident.  Uncl.  Dist.     DK  would / wood
RP                                   wʊd
GA                                   wʊd
Native     92%     6%     1%     0%
NonNat     61%    17%    22%     0%

poor / pure
        Ident.  Uncl.  Dist.     DK  poor / pure
RP                                   pɔː / pjʊə
GA                                   pʊɹ / pjʊɹ
Native      0%     0%   100%     0%
NonNat      1%     0%    99%     1%

This is a mess, and this pair was included more or less by mistake. There are two issues here: one is the tendency for the CURE vowel to merge with the NORTH (and hence FORCE) vowel in English accents in general and perhaps even more so on the word poor, or sometimes with the NURSE vowel; the second is the possibility of yod-dropping (here yod refers to /j/), which should not occur here; so the two words are distinct, but this tells us little about the vowel, which is a mess anyway, and this question is useless. Sorry about that.

brewed / brood
        Ident.  Uncl.  Dist.     DK  brewed / brood
RP                                   bɹuːd
GA                                   bɹuːd
Native     83%     4%    12%     1%
NonNat     39%    17%    32%    12%

steering / stirring
        Ident.  Uncl.  Dist.     DK  steering / stirring
RP                                   ˈstɪəɹɪŋ / ˈstɜːɹɪŋ
GA                                   ˈstɪɹɪŋ / ˈstɝːɹɪŋ
Native      0%     1%    99%     0%
NonNat     15%    10%    71%     4%

This is a fake test (a kind of trap, if you will) against the mirror–nearer merger mentioned above, as non-natives may fail to realize that stirring has the NURSE vowel (not the KIT vowel which may be merged with NEAR). I'm not sure there's anything intelligent to conclude here.

shed / shared
        Ident.  Uncl.  Dist.     DK  shed / shared
RP                                   ʃɛd / ʃɛəd
GA                                   ʃɛd / ʃɛɹd
Native      0%     1%    99%     0%
NonNat      4%     4%    91%     1%

The two words should, of course, be unambiguously different in rhotic accents; I thought some speakers with non-rhotic accents might perhaps fail to note the difference (even if they were pronouncing it themselves). This was included as a baseline against which to compare the fairy / ferry question, but it appears to have been useless as such. The one native respondent who reported the distinction as unclear was from New Zealand.

morning / mourning
        Ident.  Uncl.  Dist.     DK  morning / mourning
RP ?                                 ˈmɔːnɪŋ
GA ??                                ˈmɔːɹnɪŋ
Native     86%     6%     8%     0%
NonNat     27%    16%    51%     5%

Compare with horse / hoarse above and for / four below. Here, three (8%) of the 40 native respondents from the US reported the distinction as unclear, none as distinct; the six native respondents who reported the words as distinct were four from the UK, one from Australia, and one from an unspecified location.

tower / tire
        Ident.  Uncl.  Dist.     DK  tower / tire
RP ?                                 taʊə / taɪə
GA                                   ˈtaʊɚ / ˈtaɪɚ
Native      0%     0%   100%     0%
NonNat      1%     0%    98%     1%

I thought that vowel smoothing (see above) might sometimes make these words homophonous as [taə] or [tɑə].

farm / form
        Ident.  Uncl.  Dist.     DK  farm / form
RP                                   fɑːm / fɔːm
GA                                   fɑːɹm / fɔːɹm
Native      0%     0%   100%     0%
NonNat      1%     2%    97%     0%

This was mostly included as a comparison baseline for the warn / worn question for non-native speakers.

sat / set
        Ident.  Uncl.  Dist.     DK  sat / set
RP                                   sæt / sɛt
GA                                   sæt / sɛt
Native      0%     0%   100%     0%
NonNat      1%     4%    94%     1%

This was included because German speakers seem to have problems distinguishing the DRESS and TRAP vowels. But there were few responses from Germany, and the only two respondents who reported the words as identical were from France and Finland.

dolly / Dali
        Ident.  Uncl.  Dist.     DK  dolly / Dali
RP                                   ˈdɒli / ˈdɑːli
GA                                   ˈdɑːli
Native     23%     8%    66%     3%
NonNat      1%     7%    75%    18%

The LOT vowel is identical to the PALM vowel in American accents (this is the father–bother merger); since the PALM vowel is rare, it is not easy to give minimal pair examples of this: maybe it would have been better to ask about [f]ather / [b]other, because the pronunciation of the foreign name Dali is, of course, of little value as a reference. Of the 40 native respondents reporting from the US, 42% reported identical, 12% reported unclear and 45% reported distinct for this question; the 15 native respondents from England all reported the two words as distinct.

hit / heat
        Ident.  Uncl.  Dist.     DK  hit / heat
RP                                   hɪt / hiːt
GA                                   hɪt / hiːt
Native      0%     0%   100%     0%
NonNat     15%     7%    78%     0%

French speakers (among others) have considerable difficulty keeping the KIT and FLEECE vowels separate. Of course, whether they claim to hear a difference and whether they actually do pronounce one are different matters. Here, of the 117 non-native respondents reporting from France specifically, 14% reported identical, 9% reported unclear and 77% reported distinct.

bury / berry
        Ident.  Uncl.  Dist.     DK  bury / berry
RP                                   ˈbɛɹi
GA                                   ˈbɛɹi
Native     75%     8%    17%     0%
NonNat     20%     7%    70%     3%

The two words are expected to be homophonous, as bury unexpectedly has the DRESS vowel. I know that there are exceptions, though, because my own father (who grew up mainly in Ontario, Canada) pronounces it with the NURSE vowel. In this poll, of the 40 native respondents reporting from the US, 70% reported identical, 8% reported unclear and 22% reported distinct for this question (this does not tell us, of course, whether they use the NURSE vowel or something else, or were simply misled by the spelling).

putt / put
        Ident.  Uncl.  Dist.     DK  putt / put
RP                                   pʌt / pʊt
GA                                   pʌt / pʊt
Native      8%     0%    92%     0%
NonNat     30%     9%    38%    23%

This should be compared with luck / look above as the distinction should be the same (STRUT–FOOT). The fact that a considerably larger number of non-native speakers reported the words as identical suggests that they take cues from the spelling.

nose / knows
        Ident.  Uncl.  Dist.     DK  nose / knows
RP                                   nəʊz
GA                                   noʊz
Native     95%     0%     5%     0%
NonNat     71%    13%    16%     0%

tower / tar
        Ident.  Uncl.  Dist.     DK  tower / tar
RP ?                                 taʊə / tɑː
GA                                   ˈtaʊɚ / tɑːɹ
Native      1%     0%    99%     0%
NonNat      1%     4%    87%     8%

Yet another test of smoothing (there are too many of these): see above.

earn / urn
        Ident.  Uncl.  Dist.     DK  earn / urn
RP                                   ɜːn
GA                                   ɝːn
Native     91%     3%     6%     0%
NonNat     58%     9%    20%    13%

Both words should have the NURSE vowel. I expect that Irish accents may tell the two apart, but my poll does not seem to have gotten any responses from Ireland. The five native respondents who reported the words as distinct, and the two who reported them as unclear, were all from North America (plus one who did not report a location).

[h]urry / [f]urry
        Ident.  Uncl.  Dist.     DK  [h]urry / [f]urry
RP                                   ˈ[h]ʌɹi / ˈ[f]ɜːɹi
GA                                   ˈ[h]ɝːɹi / ˈ[f]ɝːɹi
Native     56%     1%    43%     0%
NonNat     66%     6%    21%     6%

This merger between the STRUT and NURSE vowels before an intervocalic ‘r’ occurs in North American accents (and, I expect, in others as well). Of the 40 native respondents reporting from the US, 90% reported identical, 2% reported unclear and 8% reported distinct for this question; while of the 15 native respondents reporting from England, all reported the words as distinct.

[n]earer / [m]irror
        Ident.  Uncl.  Dist.     DK  [n]earer / [m]irror
RP                                   ˈ[n]ɪəɹə / ˈ[m]ɪɹə
GA                                   ˈ[n]ɪɹɚ / ˈ[m]ɪɹɚ
Native     48%     8%    44%     0%
NonNat     19%    13%    64%     4%

Compare with spear it / spirit above. Of the 40 native respondents reporting from the US, 75% reported identical, 8% reported unclear and 18% reported distinct for this question; while of the 15 native respondents reporting from England, all but one (93%) reported the words as distinct (and the last one as unclear).

stow / store
        Ident.  Uncl.  Dist.     DK  stow / store
RP                                   stəʊ / stɔː
GA                                   stoʊ / stɔːɹ
Native      0%     0%   100%     0%
NonNat      3%     4%    87%     6%

This was a test to see if non-native speakers, who might have learned about ‘r’-dropping, were led to conclude that the two words would sound identical. Not a very interesting test.

poor / pour
        Ident.  Uncl.  Dist.     DK  poor / pour
RP ?                                 pɔː
GA ?                                 pʊɹ / pɔːɹ
Native     70%     9%    21%     0%
NonNat     55%    14%    28%     3%

The word poor takes the CURE vowel, which has a tendency to merge with the NORTH (and hence FORCE) vowel, especially in England; the word pour takes the NORTH vowel (or perhaps FORCE, but the distinction has become essentially non-existent, see above). So I expected a certain amount of confusion. I am slightly surprised, however, to see that the percentage of respondents who report the words as identical is the same (67%) among natives from England as from the US.

hairy / Harry
        Ident.  Uncl.  Dist.     DK  hairy / Harry
RP                                   ˈhɛəɹi / ˈhæɹi
GA ??                                ˈhɛɹi / ˈhæɹi
Native     43%     4%    53%     0%
NonNat     11%     9%    79%     1%

This pair is a test of the (Mary=merry)–marry merger (merger of the already merged SQUARE and DRESS vowels with TRAP before intervocalic ‘r’) which occurred in some North American accents. Of the 40 native respondents reporting from the US, 72% reported identical, 5% reported unclear and 22% reported distinct for this question; while of the 15 native respondents reporting from England, all reported the words as distinct.

fir / fur
        Ident.  Uncl.  Dist.     DK  fir / fur
RP                                   fɜː
GA                                   fɝː
Native     92%     4%     4%     0%
NonNat     36%     9%    27%    28%

Compare with earn / urn above. Both words should have the NURSE vowel. Again, I expect that Irish accents may tell the two apart, but my poll does not seem to have gotten any responses from Ireland. Three native respondents reported the words as distinct, one from North-West England, one did not specify beyond UK, and one did not specify at all.

for / four
        Ident.  Uncl.  Dist.     DK  for / four
RP                                   fɔː
GA ??                                fɔːɹ
Native     88%     6%     5%     0%
NonNat     52%    16%    32%     1%

Compare with horse / hoarse and morning / mourning above. Here, three (8%) of the 40 native respondents from the US reported the distinction as unclear, two (5%) as distinct; the four native respondents who reported the words as distinct were two from the UK and two from the US.

surely / Shirley
        Ident.  Uncl.  Dist.     DK  surely / Shirley
RP                                   ˈʃɔːli / ˈʃɜːli
GA ?                                 ˈʃʊɹli / ˈʃɝːli
Native     42%    12%    47%     0%
NonNat     16%    12%    67%     4%

The (tonic) vowel in surely seems to be either that of CURE, which has a tendency to merge with the NORTH (and hence FORCE) vowel, or, at least in the US, that of NURSE. The (tonic) vowel in Shirley is unambiguously that of NURSE. Of the 40 native respondents reporting from the US, 65% reported identical, 15% reported unclear and 20% reported distinct for this question.

cot / caught
        Ident.  Uncl.  Dist.     DK  cot / caught
RP                                   kɒt / kɔːt
GA ??                                kɑːt / kɔːt
Native     29%     5%    66%     0%
NonNat     20%    10%    64%     7%

This is a merger occurring in many areas of North America (broadly speaking, much of Canada and the Western United States). Of the 40 native respondents reporting from the US, 42% reported identical, 10% reported unclear and 48% reported distinct for this question; while of the 15 native respondents reporting from England, all reported the words as distinct.

meet / meat
        Ident.  Uncl.  Dist.     DK  meet / meat
RP                                   miːt
GA                                   miːt
Native    100%     0%     0%     0%
NonNat     77%    11%    11%     1%

This is the FLEECE merger, which took place after the Great Vowel Shift, around the 17th century, between the /iː/ and /eː/ vowels resulting from the Great Vowel Shift (from earlier /eː/ and /ɛː/ vowels). This merger is reportedly not complete in some parts of Northern England, but this does not show up in this poll.

cap / cup
        Ident.  Uncl.  Dist.     DK  cap / cup
RP                                   kæp / kʌp
GA                                   kæp / kʌp
Native      0%     0%   100%     0%
NonNat      3%     5%    90%     2%
