The IPA is a relatively complex system of symbols, intended to facilitate the representation of every sound speakable by humans. It uses many symbols not in the Latin alphabet, and as such is not directly renderable in 7-bit ASCII. For that matter, it's not even directly representable in ISO-Latin-1. So until Unicode becomes more standard, users of IPA on the internet have two options: make graphics or fake it up with ASCII. As for the former, check out my webfont, used to do these pages; but in email, in newsgroups, and in other text-only media, even graphics are out.
Unicode is a lot better supported now than when I first did these pages in 1997, but it's still not perfect. Recent-model web browsers can handle it (assuming the correct fonts are installed), though there are still a lot of people using older browsers, and of course mailing lists and newsgroups still have difficulty with it. In any case, to insert Unicode into HTML, you type &#x000; into your document, replacing "000" with the hex codes given below. For instance, if I type "&#x0250;", I get "ɐ"---that's a turned a, in case your browser doesn't display it right. Your own authoring software may give you a less painful way to do this. The leftmost column below has the IPA in this form, and is pasteable text on systems that support that. (I've left the graphical version in for those whose browsers don't do Unicode yet.)
For those who want to use a Unicode-less forum, or appeal to a broader audience, then, we have ASCII-IPA. The problem is, no representation is perfect; different systems are better for different purposes. Here, though, are several commonly used ASCII-IPA systems: Kirshenbaum, Coutts-Barrett, Branner, Carrasquer, and SAMPA. Kirshenbaum is popular among hobbyists because, for most things, it tries either to stay close to the physical appearance of the symbol in ASCII or to have a decent mnemonic. SAMPA seems more popular among professional linguists, for reasons which elude me.
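The HTML trick described above can be sketched in a few lines of Python. This is only an illustration: the function name `to_char_ref` is made up for this example, and it assumes you want every non-ASCII character turned into a hexadecimal character reference.

```python
import html


def to_char_ref(text: str) -> str:
    """Replace each non-ASCII character with a hex HTML character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


encoded = to_char_ref("\u0250")   # U+0250 is the turned a
print(encoded)                    # &#x0250;
print(html.unescape(encoded))     # the turned a again, if your terminal shows it
```

Plain ASCII passes through untouched, so the result is safe to paste into a 7-bit document.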
And of course, in addition to these schemes you will see various home-grown schemes, which may or may not be marked as such. But notice that the most commonly used IPA symbols tend to be pretty uniform among all the schemes. Usually, if an author uses one of the more esoteric bits of IPA, e will specify what scheme e's using to transcribe it to ASCII. And without further ado...
IPA a b c c Unicode
U+0061 U+0250 U+0251 U+0252 U+00E6 U+028C U+0062 U+0253 U+0299 U+03B2 U+0063 U+0063 U+02BC
Kirsh. a
A A. & V b b` b<trl> B c
CB Bran. Carr. SAMPA Description Cardinal vowel 4: open front unrounded a a a a ("lower-case a") Almost fully open central unrounded vowel ;a a& a" 6 ("turned a") Cardinal vowel 5: open back unrounded @ A A A ("script a") Cardinal vowel 13: open back rounded ;@ A& A" Q ("turned script a") Almost fully open front unrounded vowel .ae ae) & { ("ash") Cardinal vowel 14: open mid back ;v v& A* V unrounded ("inverted v") b b b b" B" B c c` B\ B c b Voiced bilabial stop ("lower-case b") Voiced bilabial implosive ("hooktop b") Bilabial trill ("small capital b") Voiced bilabial fricative ("beta") Voiceless palatal stop ("lower-case c") Palatal ejective
(b b$ B B
|B V c c' c
d d e
U+00E7 U+0255 U+0064 U+0064 U+032a U+0257 U+0256 U+00F0 U+0065 U+0259 U+025a U+0258 U+025b
C c" d
C s\ d
Voiceless palatal fricative ("c cedilla") Voiceless alveolo-palatal fricative ("curly-tail c") Voiced alveolar stop ("lower-case d") Voiced dental stop
d d[ d` d. D e @ R @ E
(d d$ <d dr) -d e ;e D e @
d" d. D e @ d` D e @
Voiced dental/alveolar implosive ("hooktop d") Voiced retroflex stop ("right-tail d") Voiced dental fricative ("eth") Cardinal vowel 2: close-mid front unrounded ("lower-case e") Mid central vowel ("schwa") Rhotic mid central vowel ("right-hook schwa") Close-mid central unrounded vowel ("reversed e") Cardinal vowel 3: open-mid front unrounded ("epsilon") alt Open-mid central unrounded vowel ("reversed epsilon") alt Rhotic open-mid central unrounded vowel ("right-hook reversed epsilon") Open-mid central rounded vowel ("closed epsilon") Voiceless labiodental fricative ("lower-case f") Voiced velar stop ("lower-case g") Voiced labiovelar stop ("g-b tie")
9 ;3 E
e& E
e" E
@\ E
U+025c
V"
3 ;E
E&
E"
g` G G` Q ~ oh
Voiced velar implosive ("hooktop g") Voiced uvular stop ("small capital g") Voiced uvular implosive ("hooktop small capital g") Voiced velar fricative ("gamma") Velarized diacritic ("superscript gamma") Cardinal vowel 15: close-mid back unrounded ("baby gamma") Voiceless glottal fricative ("lower-case h")
i j k k
U+02b0 U+0127 U+0266 U+0267 U+0265 U+029c U+0069 U+0268 U+026a U+006a U+02b2 U+029d U+025f U+0284 U+006b U+006B U+02BC
<h> H h<?>
Aspirated diacritic ("superscript h") Voiceless pharyngeal fricative ("crossed h") Voiced glottal fricative ("hooktop h") Simultaneous S and x ("hooktop heng") Voiced labial-palatal approximant ("turned h") Voiceless epiglottal fricative ("small capital h") Cardinal vowel 1: close front unrounded ("lower-case i") Close central unrounded vowel ("barred i") Almost fully close front unrounded vowel ("small capital i") Palatal approximant ("lower-case j") Palatalized diacritic ("superscript j") Voiced palatal fricative ("curly-tail j") Voiced palatal stop ("barred dotless j") Voiced palatal implosive ("barred esh") Voiceless velar stop ("lower-case k") Velar ejective Voiceless labiovelar stop ("k-p tie") ("turned k")
j<rnd>
;h H
h& H i iI j j^
i i" I j ;
i -i I j ^j
l l
l l[ <lat>
^l ~l
l^
_l 5 s* l. z* L" m K l` K\ L\ m
Lateral release diacritic ("superscript l") ("l with tilde") Voiceless alveolar lateral fricative ("belted l") Retroflex lateral approximant ("l with right tail") Voiced alveolar lateral fricative ("l-ezh ligature") Velar lateral approximant ("small capital l") Bilabial nasal ("lower-case m")
s<lat> l. z<lat> L m
U+0271 U+026f
M u-
m) M ;m W m&
M U"
n n
j<vel> n n[
;|m W" n n
W" M\ n n
Labiodental nasal ("m with leftward tail at right") Cardinal vowel 16: close back unrounded ("turned m") Alternate Velar approximant ("turned m with long right leg") Alveolar nasal ("lower-case n") Dental nasal
^n n^
U+0272 U+014b
_n n" N J N
Nasal release diacritic ("superscript n") Palatal nasal ("n with leftward hook at left") Velar nasal ("eng") Labiovelar nasal ("eng-m tie")
n^ N n<lbv>
|n
nj)
n. n" o p! @. Y &. W O
n` N\ o O\ 8 2 9 & O
Retroflex nasal ("n with right tail") Uvular nasal ("small capital n") Cardinal vowel 7: close-mid back rounded ("lower-case o") Bilabial click ("bull's eye") Close-mid central rounded vowel ("barred o") Cardinal vowel 10: close-mid front rounded ("slashed o") Cardinal vowel 11: open-mid front rounded ("o-e ligature") Cardinal vowel 12: open front rounded ("small capital o-e ligature") Cardinal vowel 6: open-mid back rounded ("turned c") Alternate Voiceless bilabial stop ("lower-case p") Bilabial ejective Voiceless bilabial fricative ("phi")
!* p! -o o-
p p q q r
U+0070 U+0070 U+02BC U+0278 U+0071 U+0071 U+02BC U+0072 U+027e U+027c
p p` F q q` r<trl> *
p p' |o q q' r ;J
p p` F q
p p` P q q`
r d" Rr)
r r" r*
Alveolar trill ("lower-case r") Alveolar flap ("fish-hook r") Retroflex trill ("r with long leg")
r s s
U+027d U+0279 U+0072 U+032A U+027b U+027a U+0280 U+0281 U+0073 U+0073 U+02BC U+0282
r. R r\
Retroflex flap ("r with right tail") Alveolar approximant ("turned r") Dental approximant
R. l" R" R* s
r\` l\
Retroflex approximant ("turned r with right tail") Alveolar lateral flap ("turned long-legged r") Uvular trill ("small capital r")
R s
Uvular fricative ("inverted small capital r") Voiceless alveolar fricative ("lower-case s") Alveolar fricative ejective
s.
s`
Voiceless retroflex fricative ("s with right tail") Voiceless postalveolar fricative ("esh") Voiceless alveolar stop ("lower-case t") Voiceless dental stop Voiceless dental ejective
t t t t u v w x
U+0283 U+0074 U+0074 U+032A U+0074 U+032A U+02BC U+0074 U+02BC U+0288 U+03b8 U+0075 U+0289 U+028a U+0076 U+028b U+0077 U+02b7 U+028d U+0078 U+03c7
S t
S t
S t
S t
t'
t`
t` t. T u u" U v V w t` T u } U v P, v\ w _w W x X W x X
Dental/alveolar ejective Voiceless retroflex stop ("t with right tail") Voiceless dental fricative ("theta") Cardinal vowel 8: close back rounded ("lower-case u") Close central rounded vowel ("barred u") Almost fully close back rounded vowel ("upsilon") Voiced labiodental fricative ("lower-case v") Labiodental approximant ("script v") Voiced labial-velar approximant ("lower-case w") Labialized diacritic ("superscript w") Voiceless labial-velar fricative ("inverted w") Voiceless velar fricative ("lower-case x") Voiceless uvular fricative ("chi")
^w w^
w<vls> ;w w& x X x X x X
y z
y l^ I. z
y ;y Y z
y y& Y z
y L Y z z" z. Z ?
y L Y z z\ z` Z ?
Cardinal vowel 9: close front rounded ("lower-case y") Palatal lateral approximant ("turned y") Almost fully close front rounded vowel ("small capital y") Voiced alveolar fricative ("lower-case z") Voiced alveolo-palatal fricative ("curly-tail z") Voiced retroflex fricative ("z with right tail") Voiced postalveolar fricative ("ezh") Glottal stop Glottal stop (optional substitute for ?)
-? H<vcd> ;?
??&
?" # #"
>\ ?\ <\
Epiglottal plosive ("barred glottal stop") Voiced pharyngeal fricative ("reversed glottal stop") Voiced epiglottal fricative ("barred reversed glottal stop") Alternate Pharyngealized diacritic ("superscript reversed glottal stop") (Post)alveolar click ("exclamation point") Dental click ("pipe") Palatoalveolar click ("double-barred pipe") Alveolar lateral click ("double pipe") Minor (foot) group ("pipe") Major (intonation) group ("double pipe")
<H> c! t! c! l!
_?\ !\ |\ =\ |\|\
_ [ ] +
__d _a _+ _r _o " %
Retracted diacritic ("under-bar") Dental diacritic ("subscript bridge") Apical diacritic ("subscript inverted bridge") Advanced diacritic ("subscript plus") Raised diacritic ("raising sign") Lowered diacritic ("lowering sign") Primary stress ("superior vertical stroke") Secondary stress ("inferior vertical stroke")
U+02c8 U+02cc
' ,
' ,
' ,
U+0329 U+031a
<o>
,)
=, _= _}
Syllabic diacritic ("syllabicity mark") No audible release diacritic ("corner") Syllable break ("period")
^7 .) . .
:\ _" _t :
Half-long ("half-length mark") Centralized diacritic ("umlaut") Breathy voiced diacritic ("subscript umlaut") Long ("length mark") Ejective ("apostrophe")
U+02d0 U+02bc U+0325 U+030a U+031c U+0339 U+0303 U+0334 U+0330 U+032c U+0306 U+032f
_0
Voiceless diacritic ("under-ring") Voiceless diacritic (use if character has descender) Less rounded diacritic ("subscript left halfring") More rounded diacritic ("subscript right halfring") Nasalized diacritic ("superscript tilde") Velarized or pharyngealized diacritic ("superimposed tilde") Creaky voiced diacritic ("subscript tilde") Voiced diacritic ("subscript wedge") Extra-short ("breve") Non-syllabic diacritic ("subscript arch") Tie bar ("top ligature") Linking (absence of a break) ("bottom ligature") Mid-centralized diacritic ("superscript x")
< > ~ ~ ~ = ;~
U) u) ~^ ~) ~
_c _O ~, _~ _e _k _v _X _^ _ -\
^v v) .' ;$ . (^ ( )) =)
U+033d
Rhoticity diacritic ("rhoticity mark") Laminal diacritic ("subscript box") Advanced tongue-root diacritic ("advancing sign") Retracted tongue-root diacritic ("retracting sign") Linguolabial diacritic ("subscript seagull") Global rise ("diagonal up arrow")
U+2197
/) ;^ \ \) 1 11 13 15
! _B
_R _L _B_L _M _H_T _H _F
U+0300
22 31
U+0304
33 35
U+0301
44 51 53
U+030b
55
_T
(tones)
It is expected that when the Unicode/ISO 10646 character set becomes commonly used for mail, news, and web pages, this transcription will no longer be needed, as the IPA characters can then be used directly. Included in this archive are the specification itself and the "Pronunciation Symbols" page of Merriam-Webster's New Collegiate Dictionary, done over in this transcription. The latter should be of use to American English speakers who are not used to the IPA symbols. In the future I hope to add a version of the specification which includes images of the actual IPA characters as well as sound clips of each of the segments.
This article describes a standard scheme for representing IPA transcriptions in ASCII for use in Usenet articles and email. The following guidelines were kept in mind:
- It should be usable for both phonemic and narrow phonetic transcription.
- It should be possible to represent all symbols and diacritics in the IPA.
- The previous guideline notwithstanding, it is expected that (as in the past) most use will be in transcribing English, so where tradeoffs are necessary, decisions should be made in favor of ease of representation of phonemes which are common in English.
- The representation should be readable.
- It should be possible to mechanically translate from the representation to a character set which includes IPA. The reverse would also be nice.
In order to be able to represent a wide range of segments while making common segments easy to type, we allow more than one representation for a given segment. Each segment has an "explicit" representation, which is a set of features between curly braces ("{" and "}"). Each feature is represented as a three-letter abbreviation taken from a standardized set. The phoneme /b/ (a voiced bilabial stop) could be represented as /{vcd,blb,stp}/. A first cut at the feature set appears in appendix A below. The word "tag" could thus be represented phonemically as
/{vls,alv,stp}{low,fnt,unr,vwl}{vcd,vel,stp}/
and phonetically as
[{vls,asp,alv,stp}{low,fnt,lng,unr,vwl}{unx,vcd,vel,stp}]
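To make the explicit notation concrete, here is a minimal sketch of a parser for it. The function name `parse_explicit` and the regular expression are inventions for this example. It assumes every segment is written as a brace-delimited, comma-separated list of three-letter features, as in the two transcriptions above.

```python
import re

# One feature set: three-letter abbreviations separated by commas, in braces.
FEATURE_SET = re.compile(r"\{([a-z]{3}(?:,[a-z]{3})*)\}")


def parse_explicit(transcription: str) -> list[frozenset[str]]:
    """Return one feature set per segment of an explicit transcription."""
    body = transcription.strip()
    # Drop the phonemic (/.../) or phonetic ([...]) delimiters if present.
    if len(body) >= 2 and body[0] + body[-1] in ("//", "[]"):
        body = body[1:-1]
    segments = []
    pos = 0
    while pos < len(body):
        match = FEATURE_SET.match(body, pos)
        if match is None:
            raise ValueError(f"expected a feature set at offset {pos}")
        segments.append(frozenset(match.group(1).split(",")))
        pos = match.end()
    return segments


tag = parse_explicit("/{vls,alv,stp}{low,fnt,unr,vwl}{vcd,vel,stp}/")
print(len(tag))  # 3
```

Feature order inside the braces carries no meaning here, which is why each segment comes back as a set rather than a list.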
This works, but it's a bit of a pain. To simplify transcription, we allow an "implicit" representation for a segment which consists of a (generally alphabetic) symbol followed by diacritics. Thus /b/ stands for /{vcd,blb,stp}/. Case is significant (/n/ and /N/ are different segments). The segment symbols are given in appendix B below. The word tag can thus be represented phonemically as
/t&g/
The diacritics for a segment are represented between angle brackets ("<" and ">") and consist of symbols or features. (In the common case where the diacritic symbol is a single character which does not encode a segment, the brackets may be removed.) The features which the diacritics map to override those of the segment. The word "tag" can thus be represented phonetically as
[t<h>&<:>g<o>]
or
[t<h>&:g<o>]
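The relationship between the implicit and explicit forms can be sketched as follows. The segment and diacritic values are copied from the appendices of this specification, but only the handful needed for "tag" are included. The function name `expand` is made up, and the sketch simply adds diacritic features to the segment's set, without modelling the case where a diacritic replaces a conflicting feature.

```python
# Segment symbols (a small subset of appendix F).
SEGMENTS = {
    "t": {"vls", "alv", "stp"},
    "&": {"low", "fnt", "unr", "vwl"},
    "g": {"vcd", "vel", "stp"},
    "b": {"vcd", "blb", "stp"},
    "h": {"glt", "apr"},
    "o": {"umd", "bck", "rnd", "vwl"},
}
# Diacritic symbols (a small subset of appendix C).
DIACRITICS = {"h": "asp", ":": "lng", "o": "unx"}


def expand(transcription: str) -> list[set[str]]:
    """Expand an implicit transcription into one feature set per segment."""
    body = transcription[1:-1]  # drop the /.../ or [...] delimiters
    result = []
    i = 0
    while i < len(body):
        features = set(SEGMENTS[body[i]])
        i += 1
        while i < len(body):
            if body[i] == "<":
                # Bracketed diacritic: a symbol such as <h> or a feature such as <asp>.
                close = body.index(">", i)
                name = body[i + 1:close]
                i = close + 1
            elif body[i] in DIACRITICS and body[i] not in SEGMENTS:
                # Bare diacritic: allowed only if the character is not a segment.
                name = body[i]
                i += 1
            else:
                break
            features.add(DIACRITICS.get(name, name))
        result.append(features)
    return result


print(expand("[t<h>&<:>g<o>]") == expand("[t<h>&:g<o>]"))  # True
```

Because "h" and "o" are also segment symbols, they are only read as diacritics inside angle brackets, whereas ":" may appear bare.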
Some diacritic symbols encode more than one feature set. Which one is meant should be apparent from context. For example, "." stands for "{rnd}" when attached to a vowel, but "{rfx}" when attached to a consonant.
Clicks are common to many languages (especially in Africa), but there is no IPA diacritic that means "click". Rather than use up several characters for clicks (which are infrequent in the languages most often discussed), we instead use the diacritic "!" after the homorganic unvoiced stop. Thus /t!/ (= /t<clk>/ = /{alv,clk}/) is the sound commonly written "tsk" and used in English to show disapproval.
The complete set of diacritic symbols appears in appendix C below. Appendices D and E contain representations of segments more or less ordered by feature (appendix D in tabular form, appendix E as a list). Appendix F contains a list of all of the ASCII characters and the uses to which they have been pressed.
For transcription of any specific language, a group can by convention alter the character mappings (as an example, for Spanish /R/ may be better used to represent /{alv,trl}/ than /{mid,cnt,rzd,vwl}/). An author may also press a little-used symbol (for the language under consideration) into service to highlight a distinction. Such an alteration should be made explicitly to avoid confusion.
The diacritics "+" and "=" and the segment symbols "$" and "%" are explicitly left unspecified so that they can be used to mark language-specific features (that are otherwise cumbersome to mark). Such symbols can be assigned either by convention for a specific language or in an ad-hoc manner by an individual author.
Stress marks are prepended to the syllable they attach to. "'" signals primary stress, "," signals secondary stress. Spaces should be employed to separate words (cliticized words may be written unseparated). When discussing single words, it may be helpful to insert a space before each syllable that doesn't carry a suprasegmental marker.
Thus, "I hear the secretary" for an American might be something like
/aI hir D@ 'sEkrI,t&ri/
Transcribing tone is harder. Here's an attempt. For register tone languages (e.g., Hausa, Navajo), numbers should be used, with one being the lowest. Thus in Navajo, "1" is low tone and "2" is high. In Yoruba, "1" is low, "2" is mid, and "3" is high. The language's "default" tone need not be specified. For contour tone languages (e.g., Mandarin, Thai), there is generally a numeric system in place (Mandarin: "1" is high, "2" is rising, "3" is falling-rising, "4" is falling). The tone indication should follow the syllable (vowel?). The symbol "#" is used to represent a syllable or word boundary.
blb  bilabial
lbd  labio-dental
dnt  dental
alv  alveolar
rfx  retroflex
pla  palato-alveolar
pal  palatal
vel  velar
lbv  labio-velar
uvl  uvular
phr  pharyngeal
glt  glottal
stp  stop
frc  fricative
apr  approximant
vwl  vowel
lat  lateral
ctl  central
trl  trill
flp  flap
clk  click
ejc  ejective
imp  implosive
hgh  high
smh  semi-high
umd  upper-mid
mid  mid
lmd  lower-mid
low  low
bck  back
unr  unrounded
rnd  rounded
asp  aspirated
unx  unexploded
syl  syllabic
mrm  murmured
lng  long
vzd  velarized
lzd  labialized
pzd  palatalized
rzd  rhoticized
nzd  nasalized
fzd  pharyngealized
U+0262 LATIN LETTER SMALL CAPITAL G U+0127 LATIN SMALL LETTER H BAR
U+026A LATIN LETTER SMALL CAPITAL I U+0269 LATIN SMALL LETTER IOTA U+025F LATIN SMALL LETTER DOTLESS J BAR ?? U+026B LATIN SMALL LETTER L WITH MIDDLE TILDE U+029F LATIN LETTER SMALL CAPITAL L U+026C LATIN SMALL LETTER L BELT U+0271 LATIN SMALL LETTER M HOOK U+014B LATIN SMALL LETTER ENG U+0254 LATIN SMALL LETTER OPEN O U+03A6 GREEK CAPITAL LETTER PHI U+0263 LATIN SMALL LETTER GAMMA U+025A LATIN SMALL LETTER SCHWA HOOK U+0280 LATIN LETTER SMALL CAPITAL R U+0283 LATIN SMALL LETTER ESH U+03B8 GREEK SMALL LETTER THETA U+028A LATIN SMALL LETTER UPSILON U+0277 LATIN SMALL LETTER CLOSED OMEGA U+028C LATIN SMALL LETTER TURNED V U+0153 LATIN SMALL LETTER O E U+03C7 GREEK SMALL LETTER CHI U+00F8 LATIN SMALL LETTER O SLASH U+0153 LATIN SMALL LETTER O E U+0292 LATIN SMALL LETTER YOGH LATIN LATIN LATIN LATIN SMALL SMALL SMALL SMALL LETTER LETTER LETTER LETTER GLOTTAL STOP SCHWA A E FISHHOOK R
{vls,alv,lat,frc} ?? {lbd,nas} {vel,nas} {lmd,bck,rnd,vwl} {vls,blb,frc} {vcd,vel,frc} {mid,cnt,rzd,vwl} ?? {alv,trl} ?? S {vls,pla,frc} T {vls,dnt,frc} U {smh,bck,rnd,vwl} V W X Y {lmd,bck,unr,vwl} {lmd,fnt,rnd,vwl} ?? {vls,uvl,frc} {umd,fnt,rnd,vwl} ?? {lmd,fnt,rnd,vwl} ?? Z {vcd,pla,frc} ? @ & * % $
{glt,stp} U+0294 {mid,cnt,unr,vwl} U+0259 {low,fnt,unr,vwl} U+00E6 {vcd,alv,flp} U+027E -- Ad Hoc Segment --- Ad Hoc Segment --
Appendix C. Diacritics
~     Vowels: {nzd}       U+0303 NON-SPACING TILDE
      Consonants: {vzd}   U+0334 NON-SPACING TILDE OVERLAY
:     {lng}               U+02D0 MODIFIER LETTER TRIANGULAR COLON
-     Vowels: {unr}       -- No equivalent --
      Consonants: {syl}   U+0329 NON-SPACING VERTICAL LINE BELOW
!     {clk}               -- No equivalent --
.     Vowels: {rnd}       -- No equivalent --
      Consonants: {rfx}   U+0322 NON-SPACING RETROFLEX HOOK BELOW
                          U+0323 NON-SPACING DOT BELOW
`     Voiceless: {ejc}    U+02BC MODIFIER LETTER APOSTROPHE
      Voiced: {imp}       -- No equivalent --
[     {dnt}               U+032A NON-SPACING BRIDGE BELOW
;     {pzd}               U+02B2 MODIFIER LETTER SMALL J
                          U+0321 NON-SPACING PALATALIZED HOOK BELOW
"     Vowels: {cnt}       -- No equivalent --
      Consonants: {uvl}   -- No equivalent --
^     {pal}               -- No equivalent --
+     -- Ad Hoc Diacritic --
=     -- Ad Hoc Diacritic --
<H>   {fzd}               U+0334 NON-SPACING TILDE OVERLAY
<h>   {asp}               U+02B0 MODIFIER LETTER SMALL H
<o>   {unx}               ?? U+02DA SPACING RING ABOVE
      {vls}               ?? U+0325 NON-SPACING RING BELOW
<r>   {rzd}               U+02B3 MODIFIER LETTER SMALL R
<w>   {lzd}               U+02B7 MODIFIER LETTER SMALL W
                          U+032B NON-SPACING INVERTED DOUBLE ARCH BELOW
<?>   {mrm}               U+02B1 MODIFIER LETTER SMALL H HOOK
                          U+0324 NON-SPACING DOUBLE DOT BELOW
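Several entries in this appendix mean one thing on a vowel and another on a consonant. A small sketch of how a program might pick the right reading is given below. The table is copied from the entries above, while the function name `resolve` and the rule "a segment is a vowel if its feature set contains vwl" are assumptions made for this example.

```python
# Diacritics whose meaning depends on the kind of segment they attach to.
CONTEXT_DIACRITICS = {
    "~": {"vowel": "nzd", "consonant": "vzd"},
    "-": {"vowel": "unr", "consonant": "syl"},
    ".": {"vowel": "rnd", "consonant": "rfx"},
    '"': {"vowel": "cnt", "consonant": "uvl"},
}


def resolve(diacritic: str, segment_features: set[str]) -> str:
    """Return the feature a context-dependent diacritic stands for."""
    kind = "vowel" if "vwl" in segment_features else "consonant"
    return CONTEXT_DIACRITICS[diacritic][kind]


print(resolve(".", {"low", "bck", "unr", "vwl"}))  # rnd
print(resolve(".", {"vls", "alv", "stp"}))         # rfx
```

So /A./ comes out as a rounded vowel and /t./ as a retroflex stop, matching the "." example given in the main text.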
S Z
--phr--
H H<vcd>
U+0299 LATIN LETTER SMALL CAPITAL B U+0253 LATIN SMALL LETTER B HOOK U+0298 LATIN LETTER BULLSEYE
/n[/ /t[/ /d[/ /T/ /D/ /r[/ /l[/ /d`/ (same as {alv,imp}) /t[`/
{dnt,clk}
/t!/ U+0287 LATIN SMALL LETTER TURNED T (by rights this should be alveolar, but the alveolar and palatal clicks use the same symbol (/c!/))
/n/ /t/ /d/ /s/ /z/ /r/ /l/ /r<trl>/ U+0072 LATIN SMALL LETTER R (perhaps /R/) {alv,flp} /*/ {vls,alv,lat,frc} /s<lat>/ U+026C LAIN SMALL LETTER L BELT {vcd,alv,lat,frc} /z<lat>/ U+026E LATIN SMALL LETTER L YOGH {alv,imp} /d`/ U+0257 LATIN SMALL LETTER D HOOK {alv,ejc} /t`/ {alv,clk} /c!/ (same as {pal,clk}) {rfx,nas} {vls,rfx,stp} {vcd,rfx,stp} {vls,rfx,frc} {vcd,rfx,frc} {rfx,apr} {rfx,lat} {rfx,flp} {vls,pla,frc} {vcd,pla,frc} {pal,nas} {vls,pal,stp} {vcd,pal,stp} {vls,pal,frc} {vcd,pal,frc} {pal,apr} {rnd,pal,apr} {pal,lat} {pal,imp} {pal,clk} {vel,nas} {vls,vel,stp} {vcd,vel,stp} {vls,vel,frc} {vcd,vel,frc} {vel,apr} {vel,lat} {vel,imp} {vel,ejc} {vel,clk} {lbv,nas} {vls,lbv,stp} {vcd,lbv,stp} {vls,lbv,frc} {vcd,lbv,frc} {lbv,apr} {uvl,nas} {vls,uvl,stp} {vcd,uvl,stp} /n./ /t./ /d./ /s./ /z./ /r./ /l./ /*./ /S/ /Z/ /n^/ /c/ /J/ /C/ /C<vcd>/ U+029D LATIN SMALL LETTER CROSSED-TAIL J (perhaps /j/ (same as {pal,apr})) /j/ /j<rnd>/ /l^/ /J`/ /c!/ /N/ /k/ /g/ /x/ /Q/ /j<vel>/ /L/ /g`/ /k'/ /k!/ U+0265 U+028E U+0284 U+0297 LATIN LATIN LATIN LATIN SMALL LETTER TURNED H SMALL LETTER TURNED Y SMALL LETTER DOTLESS J BAR HOOK LETTER STRETCHED C U+0273 U+0288 U+0256 U+0282 U+0290 U+027B U+026D U+027D LATIN LATIN LATIN LATIN LATIN LATIN LATIN LATIN SMALL SMALL SMALL SMALL SMALL SMALL SMALL SMALL LETTER LETTER LETTER LETTER LETTER LETTER LETTER LETTER N RETROFLEX HOOK T RETROFLEX HOOK D RETROFLEX HOOK S HOOK Z RETROFLEX HOOK TURNED R HOOK L RETROFLEX HOOK R HOOK
U+0270 LATIN SMALL LETTER TURNED M WITH LONG LEG U+0260 LATIN SMALL LETTER G HOOK U+029E LATIN SMALL LETTER TURNED K
/n<lbv>/ Written as "ng" with tie above /t<lbv>/ Written as "kp" with tie above /d<lbv>/ Written as "gb" with tie above /w<vls>/ U+028D LATIN SMALL LETTER TURNED W /w/ (same as {lbv,apr}) /w/ /n"/ /q/ /G/ U+0274 LATIN LETTER SMALL CAPITAL N
{vls,uvl,frc} {vcd,uvl,frc} {uvl,apr} {uvl,trl} {vls,uvl,imp} {vcd,uvl,imp} {vls,phr,frc} {vcd,phr,frc} {glt,stp} {glt,apr} {mrm,glt,frc} {vcd,lat,flp} {lat,clk}
/X/ U+03C7 GREEK SMALL LETTER /g"/ (same as {uvl,apr}) /g"/ U+0281 LATIN LETTER SMALL /r"/ U+0280 LATIN LETTER SMALL /q`/ U+02A0 LATIN SMALL LETTER /G`/ U+029B LATIN LETTER SMALL /H/ /H<vcd>/ /?/ /h/ /h<?>/ /*<lat>/ /l!/ /i/ /y/ /I/ /I./ /e/ /Y/ /E/ /W/ /&/ /&./
U+0266 LATIN SMALL LETTER H HOOK U+027A LATIN SMALL LETTER TURNED R WITH LONG LEG U+0296 LATIN LETTER INVERTED GLOTTAL STOP
{hgh,fnt,unr,vwl} {hgh,fnt,rnd,vwl} {smh,fnt,unr,vwl} {smh,fnt,rnd,vwl} {umd,fnt,unr,vwl} {umd,fnt,rnd,vwl} {lmd,fnt,unr,vwl} {lmd,fnt,rnd,vwl} {low,fnt,unr,vwl} {low,fnt,rnd,vwl}
{hgh,cnt,unr,vwl} /i"/ U+0268 LATIN SMALL LETTER BARRED I {hgh,cnt,rnd,vwl} /u"/ U+0289 LATIN SMALL LETTER U BAR {umd,cnt,unr,vwl} /@<umd>/ U+0258 LATIN SMALL LETTER REVERSED E {umd,cnt,unr,rzd,vwl} /R<umd>/ U+025D LATIN SMALL LETTER REVERSED EPSILON HOOK {mid,cnt,unr,vwl} /@/ {mid,cnt,unr,rzd,vwl} /R/ {mid,cnt,rnd,vwl} /@./ U+0275 LATIN SMALL LETTER BARRED O {lmd,cnt,unr,vwl} /V"/ U+025C LATIN SMALL LETTER REVERSED EPSILON {lmd,cnt,rnd,vwl} /O"/ U+025E LATIN SMALL LETTER CLOSED REVERSED EPSILON {low,cnt,unr,vwl} /a/ U+0061 LATIN SMALL LETTER A {hgh,bck,unr,vwl} /u-/ {hgh,bck,rnd,vwl} /u/ {smh,bck,rnd,vwl} /U/ {umd,bck,unr,vwl} {umd,bck,rnd,vwl} {lmd,bck,unr,vwl} {lmd,bck,rnd,vwl} {low,bck,unr,vwl} {low,bck,rnd,vwl} /o-/ /o/ /V/ /O/ /A/ /A./ U+026F LATIN SMALL LETTER TURNED M U+028A LATIN SMALL LETTER UPSILON U+0277 LATIN SMALL LETTER CLOSED OMEGA U+0264 LATIN SMALL LETTER BABY GAMMA
$   S: Ad Hoc
%   S: Ad Hoc
&   S: {low,fnt,unr,vwl}
'   P: Primary stress
(   Unused
)   Unused
*   S: {vcd,alv,flp}
+   D: Ad Hoc
,   P: Secondary stress
-   D: Vowel: {unr} Cons: {syl}
.   D: Vowel: {rnd} Cons: {rfx}
/   P: Phonemic delimiter
0   Unused
1   P: Tone 1
2   P: Tone 2
3   P: Tone 3
4   P: Tone 4
5   Unused
6   Unused
7   Unused
8   Unused
9   Unused
:   D: {lng}
;   D: {pzd}
<   P: Diacritic delimiter
=   D: Ad Hoc
>   P: Diacritic delimiter
?   S: {glt,stp}  <D>: {mrm}
@   S: {mid,cnt,unr,vwl}
A   S: {low,bck,unr,vwl}
B   S: {vcd,blb,frc}
C   S: {vls,pal,frc}
D   S: {vcd,dnt,frc}
E   S: {lmd,fnt,unr,vwl}
F   Unused
G   S: {vcd,uvl,stp}
H   S: {vls,phr,frc}  <D>: {fzd}
I   S: {smh,fnt,unr,vwl}
J   S: {vcd,pal,stp}
K   Unused
L   S: {vcd,vel,lat}
M   S: {lbd,nas}
N   S: {vel,nas}
O   S: {lmd,bck,rnd,vwl}
P   S: {vls,blb,frc}
V   S: {lmd,bck,unr,vwl}
W   S: {lmd,fnt,rnd,vwl}
X   S: {vls,uvl,frc}
Y   S: {umd,fnt,rnd,vwl}
Z   S: {vcd,pla,frc}
[   P: Phonetic delimiter  D: {dnt}
\   Unused
]   P: Phonetic delimiter
^   D: {pal}
_   Unused
`   D: Voiced: {imp} Voiceless: {ejc}
a   S: {low,cnt,unr,vwl}
b   S: {vcd,blb,stp}
c   S: {vls,pal,stp}
d   S: {vcd,alv,stp}
e   S: {umd,fnt,unr,vwl}
f   S: {vls,lbd,frc}
g   S: {vcd,vel,stp}
h   S: {glt,apr}  <D>: {asp}
i   S: {hgh,fnt,unr,vwl}
j   S: {pal,apr}/{vcd,pal,frc}  <D>: {pzd}
k   S: {vls,vel,stp}
l   S: {vcd,alv,lat}
m   S: {blb,nas}
n   S: {alv,nas}
o   S: {umd,bck,rnd,vwl}  <D>: {unx}
p   S: {vls,blb,stp}
q   S: {vls,uvl,stp}
r   S: {alv,apr}  <D>: {rzd}
s   S: {vls,alv,frc}
t   S: {vls,alv,stp}
u   S: {hgh,bck,rnd,vwl}
v   S: {vcd,lbd,frc}
w   S: {lbv,apr}/{vcd,lbv,frc}  <D>: {lzd}
x   S: {vls,vel,frc}
y   S: {hgh,fnt,rnd,vwl}
z   S: {vcd,alv,frc}
{   P: Feature set delimiter
|   Unused
}   P: Feature set delimiter
~   D: Cons: {vzd} Vowel: {nzd}
To aid English speakers in using the phonetic transcription, this document describes the mapping onto a standard American dictionary transcription system for sounds that commonly occur in the English language. When it differs from the symbol used, I've also included a description of the IPA symbol for the benefit of non-Americans. The table is taken from the 'Pronunciation Symbols' page of Merriam-Webster's New Collegiate Dictionary. In the examples, the letters which spell the sound are bracketed by '<...>'.
Note that this only describes a small subset of the transcription system. There are far more sounds (used in other languages) and nuances of sound that can be captured. See the document describing the full standard for complete details.
Phonemic (broad) transcriptions are bracketed by '/.../'. Phonetic (narrow) transcriptions are bracketed by '[...]'. Syllables that carry primary stress are preceded by "'". Syllables that carry secondary stress are preceded by ",". When giving the transcription of a single word, spaces are generally inserted between syllables (often omitted before syllables that have stress marks). When giving the transcription of a multiword utterance, it is common to put spaces between words and omit them between syllables.
/@/: schwa (upside-down 'e'). Used in both unaccented ('b<a>nan<a>', 'c<o>llide', '<a>but') and accented ('h<u>mdr<u>m', 'ab<u>t') contexts. The IPA symbol is a schwa. [British speakers often have different vowels in these two contexts. The accented one is further back and is written /V/. Its IPA symbol is a 'wedge' or upside-down 'v'.]
/l-/, /n-/, /m-/, /N-/: Superscript schwa preceding consonant. As in 'batt<le>', 'mitt<en>', 'eat<en>'. Signifies that the consonant is pronounced as a syllable by itself. The IPA symbol is a vertical bar below the consonant.
/R/: schwa followed by 'r'. 'op<er>ation', 'f<ur>th<er>', '<ur>g<er>'. The IPA symbol is a schwa with a hook.
/&/: short a. 'm<a>t', 'm<a>p', 'm<a>d', 'g<a>g', 'sn<a>p', 'p<a>tch'. The IPA symbol is an 'a-e' digraph.
/eI/: long a ('a' with bar above). 'd<ay>', 'f<a>de', 'd<a>te', '<a>orta', 'dr<a>pe', 'c<a>pe'.
/A/: a with diaeresis (two dots) above. 'b<o>ther', 'c<o>t', and, with most American speakers, 'f<a>ther', 'c<a>rt'. The IPA symbol is a script 'a'.
/a/: a with dot above. 'f<a>ther' as pronounced by speakers who do not rhyme it with 'bother'.
/AU/: a followed by u with dot. 'n<ow>', 'l<ou>d', '<ou>t'.
/b/: '<b>a<b>y', 'ri<b>'.
/tS/: ch. The dictionary notes "(actually, this sound is \t\ + \sh\)". '<ch>in', 'na<tu>re' (/'neI tSR/). In IPA transcription, this is sometimes spelled as 'c with hacek'.
/d/: '<d>i<d>', 'a<dd>er'.
/E/: short e. 'b<e>t', 'b<e>d', 'p<e>ck'. The IPA symbol is a lower-case epsilon. It is sometimes spelled with a small capital E.
/i/: long e ('e' with bar above). 'b<ea>t', 'nosebl<ee>d', '<e>venl<y>', '<ea>s<y>'.
/f/: '<f>i<f>ty', 'cu<ff>'.
/g/: '<g>o', 'bi<g>', '<g>ift'.
/h/: '<h>at', 'a<h>ead'.
/hw/: '<wh>ale' as pronounced by those who do not have the same pronunciation for both 'whale' and 'wail'.
/I/: short i. 't<i>p', 'ban<i>sh', 'act<i>ve'. The IPA symbol is a small capital I or a lower-case iota.
/aI/: long i ('i' with bar above). 's<i>te', 's<i>de', 'b<uy>', 'tr<i>pe'.
/dZ/: j. The dictionary notes "(actually, this sound is \d\ + \zh\)". '<j>ob', '<g>em', 'e<dge>', '<j>oin', '<j>u<dge>'.
/k/: '<k>in', '<c>oo<k>', 'a<che>'.
/x/: k with bar below. (Same as /C/.) German 'Bu<ch>'.
/C/: k with bar below. (Same as /x/.) German 'i<ch>'.
/l/: '<l>i<l>y', 'poo<l>'.
/m/: '<m>ur<m>ur', 'di<m>', 'ny<m>ph'.
/n/: '<n>o', 'ow<n>'.
/<vowel>~/: superscript 'n'. "indicates that a preceding vowel or diphthong is pronounced with the nasal passages open as in French 'un bon vin blanc' /W~ bo~ va~ blA~/". The IPA diacritic is a tilde above the vowel.
/N/: eng ('n' with a tail). 'si<ng>' /sIN/, 'si<ng>er' /'sIN R/, 'fi<ng>er' /'fIN gR/, 'i<n>k' /INk/. The IPA symbol is an eng.
/oU/: long o ('o' with bar above). 'b<o>ne', 'kn<ow>', 'b<eau>'.
/O/: 'o' with dot above. 's<aw>', '<a>ll', 'gn<aw>'. The IPA symbol is a small open 'o' or upside-down 'c'.
/W/: o-e digraph. French 'b<oeu>f', German 'H<o:>lle'. The IPA symbol is an o-e digraph.
/Oi/: 'o' with dot above followed by 'i'. 'c<oi>n', 'destr<oy>'. [The dictionary also lists 's<awi>ng', but I pronounce that as two separate syllables /'sO IN/.]
/p/: '<p>e<pp>er', 'li<p>'.
/r/: '<r>ed', 'ca<r>', '<r>a<r>ity'.
/s/: '<s>our<ce>', 'le<ss>'.
/S/: sh. '<sh>y', 'mi<ssi>on', 'ma<ch>ine', 'spe<ci>al'. The IPA symbol is an esh: a tall, pulled 's' or long, barless 'f'.
/t/: '<t>ie', 'a<tt>ack'.
/T/: th. '<th>in', 'e<th>er'. The IPA symbol is a lower-case theta.
/D/: 'th' with bar below. '<th>en', 'ei<th>er', '<th>is'. The IPA symbol is an eth, sort of a script 'd' with the bar crossed.
/u/: 'u' with diaeresis (two dots) above. 'r<u>le', 'y<ou>th', 'union' /'jun j@n/, 'few' /fju/.
/U/: 'u' with dot above. 'p<u>ll', 'w<oo>d', 'b<oo>k', 'curable' /'kjUr @ b@l/. The IPA symbol is a small letter upsilon. A small capital U or closed lower-case omega is also used.
/y/: u-e digraph. German 'f<u:>llen', 'h<u:>bsch', French 'r<ue>'.
/v/: '<v>i<v>id', 'gi<ve>'.
/w/: '<w>e', 'a<w>ay'.
/j/: '<y>ard', '<y>oung', 'cue' /kju/, 'union' /'jun j@n/.
/<cons>;/: superscript 'y' following consonant; "indicates that during the articulation of the sound represented by the preceding character, the front of the tongue has substantially the position it has for the articulation of the first sound of 'yard', as in French 'digne' /din;/." The IPA diacritic is a superscript 'j' following or hook below the consonant.
/ju/: '<you>th', '<u>nion', 'c<ue>', 'f<ew>', 'm<u>te'.
/jU/: 'c<u>rable', 'f<u>ry'.
/z/: '<z>one', 'rai<se>'.
/Z/: zh. 'vi<si>on', 'azure' /'aZ R/. The IPA symbol is a yogh: like a flat-topped '3' lowered so that the top is the height of that of a 'z'.
SAMPA (Speech Assessment Methods Phonetic Alphabet) is a machine-readable phonetic alphabet. It was originally developed under the ESPRIT project 1541, SAM (Speech Assessment Methods) in 1987-89 by an international group of phoneticians, and was applied in the first instance to the European Communities languages Danish,Dutch, English, French, German, and Italian (by 1989); later to Norwegian and Swedish (by 1992); and subsequently to Greek, Portuguese, and Spanish (1993). Under the BABEL project, it has now been extended to Bulgarian, Estonian, Hungarian, Polish, and Romanian (1996). Under the aegis of COCOSDA it is hoped to extend it to cover many other languages (and in principle all languages). On the initiative of the OrienTel project, Arabic, Hebrew, and Turkish have been added. Other recent additions: Cantonese, Croatian, Czech, Russian, Slovenian, Thai. Coming shortly: Japanese, Korean. Where Unicode (ISO 10646) is not available or not appropriate, SAMPA and the proposed XSAMPA (Extended SAMPA) constitute the best robust international collaborative basis for a standard machine-readable encoding of phonetic notation.
Note about Unicode: Recent versions of the Internet Explorer and Netscape browsers are capable of handling WGL4, the subset of Unicode needed for the orthography of all the languages of Europe. Test yours by looking at this page, or download an up-to-date browser and a WGL4 font. Unicode SAMPA pages are now available with correct local orthography, for those with this capacity, for Bulgarian, Czech, Greek, Hungarian, Polish, Romanian, and Slovenian. See if your browser can cope with Unicode IPA symbols by looking at this special version of the English SAMPA page. For IPA in Unicode, see here.

SAMPA basically consists of a mapping of symbols of the International Phonetic Alphabet onto ASCII codes in the range 33..127, the 7-bit printable ASCII characters. Associated with the coding (mapping) are guidelines for the transcription of the languages to which SAMPA has been applied.

Unlike other proposals for mapping the IPA onto ASCII, SAMPA is not one single author's scheme, but represents the outcome of collaboration and consultation among speech researchers in many different countries. The SAMPA transcription symbols have been developed by or in consultation with native speakers of every language to which they have been applied, but are standardized internationally.
A SAMPA transcription is designed to be uniquely parsable. As with the ordinary IPA, a string of SAMPA symbols does not require spaces between successive symbols.

SAMPA has been applied not only by the SAM partners collaborating on EUROM 1, but also in other speech research projects (e.g. BABEL, Onomastica, OrienTel) and by Oxford University Press. It is included among the resources listed by the Linguistic Data Consortium.

In its basic form SAMPA was seen as catering essentially for segmental transcription, particularly of a traditional phonemic or near-phonemic kind. Prosodic notation was not adequately developed. This shortcoming has now been remedied by a proposed parallel system of prosodic notation, SAMPROSA. It is important that prosodic and segmental transcriptions be kept distinct from one another, on separate representational tiers (because certain symbols have different meanings in SAMPROSA from their meaning in SAMPA: e.g. H denotes a labial-palatal semivowel in SAMPA, but High tone in SAMPROSA).

A proposal for an extended version of the segmental alphabet, X-SAMPA, extends the basic agreed conventions so as to make provision for every symbol on the Chart of the International Phonetic Association, including all diacritics. In principle this makes it possible to produce a machine-readable phonetic transcription for every known human language.

The present SAMPA recommendations (as devised for the basic six languages) are set out in the following table. All IPA symbols that coincide with lower-case letters of the Latin alphabet remain the same; all other symbols are recoded within the ASCII range 37..126. In this current WWW document the IPA symbols cannot be shown, but the columns indicate respectively a SAMPA symbol, its ASCII/ANSI number (decimal), the shape of the corresponding IPA symbol, the Unicode number (hex, decimal) for the IPA symbol, and the symbol's meaning or use.
Vowels
SAMPA  ASCII  IPA shape       hex   dec  meaning or use
A      65                     0251  593  open back unrounded, Cardinal 5, Eng. start
{      123                    00E6  230  near-open front unrounded, Eng. trap
6      54     turned a        0250  592  open schwa, Ger. besser
Q      81                     0252  594  open back rounded, Eng. lot
E      69                     025B  603  open-mid front unrounded, C3, Fr. même
@      64     turned e        0259  601  schwa, Eng. banana
3      51                     025C  604  long mid central, Eng. nurse
I      73     small cap I     026A  618  lax close front unrounded, Eng. kit
O      79                     0254  596  open-mid back rounded, Eng. thought
2      50                     00F8  248  close-mid front rounded, Fr. deux
9      57                     0153  339  open-mid front rounded, Fr. neuf
&      38                     0276  630  open front rounded
U      85     upsilon         028A  650  lax close back rounded, Eng. foot
}      125    barred u        0289  649  close central rounded, Swedish sju
V      86     turned v        028C  652  open-mid back unrounded, Eng. strut
Y      89     small cap Y     028F  655  lax [y], Ger. hübsch

Consonants
SAMPA  ASCII  IPA shape       hex   dec  meaning or use
B      66                     03B2  946  voiced bilabial fricative, Sp. cabo
C      67                     00E7  231  voiceless palatal fricative, Ger. ich
D      68                     00F0  240  voiced dental fricative, Eng. then
G      71                     0263  611  voiced velar fricative, Sp. fuego
L      76     turned y        028E  654  palatal lateral, It. famiglia
J      74     left-tail n     0272  626  palatal nasal, Sp. año
N      78     eng             014B  331  velar nasal, Eng. thing
R      82     inv. s.c. R     0281  641  vd. uvular fric. or trill, Fr. roi
S      83     esh             0283  643  voiceless palatoalveolar fricative, Eng. ship
T      84                     03B8  952  voiceless dental fricative, Eng. thin
H      72     turned h        0265  613  labial-palatal semivowel, Fr. huit
Z      90                     0292  658  vd. palatoalveolar fric., Eng. measure
?      63     dotless ?       0294  660  glottal stop, Ger. Verein, also Danish stød

Length, stress and tone marks
SAMPA  ASCII  IPA shape       hex   dec  meaning or use
:      58                     02D0  720  length mark
"      34                     02C8  712  primary stress
%      37     low vert. str.  02CC  716  secondary stress
`      96     (see note 1)               falling tone
'      39     (see note 1)               rising tone

Note 1: The SAMPA tone mark recommendations were based on the IPA as it was up to 1989-90. Since then, however, the IPA has changed its symbols for falling and rising tones. These SAMPA tone marks may now be considered obsolete, having in practice been superseded by the SAMPROSA proposals.

Diacritics (shown with another symbol as an example)
SAMPA  ASCII  IPA shape       hex   dec  meaning or use
=n     61     inf. stroke     0329  809  syllabic consonant, Eng. garden (see note 2)
O~     126                    0303  771  nasalized
Note 2: At the time SAMPA was established it was assumed that the syllabicity diacritic should precede the base character. More recently, ISO and Unicode have established that all diacritics should follow the base character, and this principle should be applied in future work.
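The mapping described above can be sketched as a simple lookup, because every basic SAMPA symbol outside the diacritics is a single ASCII character. The following is a minimal illustration only: the names `SAMPA_TO_IPA` and `sampa_to_ipa` are made up for this example and are not part of any official SAMPA tool. The code points are the Unicode hex numbers from the table, and the symbols A, {, 6, Q, E, @, 3 and I are assumed to be the usual SAMPA symbols for the first eight vowel rows. Diacritics (= and ~) and the obsolete tone marks are deliberately left untouched.

```python
# Hypothetical helper: maps single-character SAMPA symbols to IPA code points.
SAMPA_TO_IPA = {
    # vowels
    "A": 0x0251, "{": 0x00E6, "6": 0x0250, "Q": 0x0252,
    "E": 0x025B, "@": 0x0259, "3": 0x025C, "I": 0x026A,
    "O": 0x0254, "2": 0x00F8, "9": 0x0153, "&": 0x0276,
    "U": 0x028A, "}": 0x0289, "V": 0x028C, "Y": 0x028F,
    # consonants
    "B": 0x03B2, "C": 0x00E7, "D": 0x00F0, "G": 0x0263,
    "L": 0x028E, "J": 0x0272, "N": 0x014B, "R": 0x0281,
    "S": 0x0283, "T": 0x03B8, "H": 0x0265, "Z": 0x0292,
    "?": 0x0294,
    # length and stress marks
    ":": 0x02D0, '"': 0x02C8, "%": 0x02CC,
}


def sampa_to_ipa(text):
    """Convert a SAMPA string to IPA, one character at a time.

    Lower-case Latin letters are the same in both systems, so anything
    that is not in the table is copied through unchanged.
    """
    return "".join(
        chr(SAMPA_TO_IPA[c]) if c in SAMPA_TO_IPA else c for c in text
    )


if __name__ == "__main__":
    # English "thing" and "ship" in SAMPA
    print(sampa_to_ipa("TIN"))
    print(sampa_to_ipa("SIp"))
```

A real converter would also need to handle the diacritics, which combine two ASCII characters into one IPA symbol plus a combining mark.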
The phonemic notation of individual languages
These pages provide a brief outline of the phonemic distinctions in various languages: Arabic, Bulgarian, Cantonese, Czech, Croatian, Danish, Dutch, English, Estonian, French, German, Greek, Hebrew, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish.

Extensions
These pages provide extensions of the basic segmental SAMPA: SAMPROSA (prosodic), X-SAMPA (other symbols, mainly segmental).

A utility: Instant IPA in Word - converts SAMPA to IPA.

To refer to SAMPA cite this website (www.phon.ucl.ac.uk/home/sampa) or the printed version [Wells, J.C.], 1997. 'SAMPA computer readable phonetic alphabet'. In Gibbon, D., Moore, R. and Winski, R. (eds.), 1997. Handbook of Standards and Resources for Spoken Language Systems. Berlin and New York: Mouton de Gruyter. Part IV, section B.

For queries please contact: John Wells by e-mail or at Department of Speech, Hearing and Phonetic Sciences, University College London, Gower Street, London WC1E 6BT.
Last revised 2005 October 25
Download Unicode Phonetic Keyboard 1.02 (self-install executable, 2MB) Download Unicode Phonetic Keyboard 1.10 (Vista/Vista64) (self-install executable, 2MB) Download Keyboard Layout (PDF)
John Wells has written a number of pages which give more information about the set of phonetic symbols available in Unicode, and about how these can be used in Microsoft Word and other applications:
The IPA-SAM fonts are a set of TrueType fonts (not Unicode) suitable for Windows and MacOS that include all current IPA symbols. The keyboard layout is designed to be compatible with SAMPA.
Unicode. There is also another version, with no font specified, that you can use to test fonts.
In Word, with a Unicode font selected, use Insert | Symbol (normal text) and scroll down the box until you find the character you want. Select it, and Insert. With Word 2003 and later, you can alternatively type in the Unicode hex number (see below), select it, and do Alt-X. The character will appear. If you are going to use the character frequently, it might be worthwhile assigning a Shortcut Key (macro) for it. You can also use the program Character Map to find your character, then select, copy and paste it. Or you can use a keyboard facility such as this.
Afterwards, save the document using File | Save as HTML. Word will automatically convert the character into the corresponding numeric entity (see next para) or the corresponding UTF-8 encoding.

Alternatively, write direct HTML, referencing each IPA symbol using the code numbers listed below. You can do this using either decimal or hex numbers. To create such a "numeric entity", you put ampersand (&), number sign (#), the Unicode number for the symbol, and semicolon. If using hex numbers, you must place an x between the number sign and the number. For example, to include the velar nasal symbol, ŋ, which has the Unicode decimal number 331, write &#331;, or, since its hex number is 014B, you can alternatively write &#x014B;. To transcribe the English word thing, θɪŋ, write &#952;&#618;&#331; or, alternatively, &#x03B8;&#x026A;&#x014B;. The browser will render these with the correct IPA symbols, always provided an appropriate font is available.

Force the use of an appropriate font by including a font tag as mentioned above, for example in your cascading style sheet, p {font-family:"lucida sans unicode";}, or in the text, an in-line tag <font face="Lucida Sans Unicode">.
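If you would rather generate the numeric entities than type them by hand, the rule above (ampersand, number sign, optional x, number, semicolon) is easy to automate. This is a minimal sketch; the function name `to_numeric_entities` is invented for this example. It leaves plain ASCII alone and does not escape &, < or >, which you would still need to handle separately in real HTML.

```python
def to_numeric_entities(text, hex_form=False):
    """Replace every non-ASCII character with an HTML numeric entity.

    hex_form=False gives decimal entities such as &#331;
    hex_form=True gives hex entities such as &#x014B;
    """
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 128:
            pieces.append(ch)  # plain ASCII is copied unchanged
        elif hex_form:
            pieces.append("&#x{:04X};".format(code_point))
        else:
            pieces.append("&#{};".format(code_point))
    return "".join(pieces)


if __name__ == "__main__":
    word = "\u03b8\u026a\u014b"  # the transcription of English "thing"
    print(to_numeric_entities(word))
    print(to_numeric_entities(word, hex_form=True))
```

For the decimal form only, Python's built-in `text.encode("ascii", "xmlcharrefreplace")` does the same job and returns a bytes object.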
Symbol  decimal  hex   value
ɑ       593      0251  open back unrounded
ɐ       592      0250  open-mid schwa
ɒ       594      0252  open back rounded
æ       230      00E6  raised open front unrounded
ɓ       595      0253  vd bilabial implosive
ʙ       665      0299  vd bilabial trill
β       946      03B2  vd bilabial fricative
ɔ       596      0254  open-mid back rounded
ɕ       597      0255  vl alveolopalatal fricative
ç       231      00E7  vl palatal fricative
ɗ       599      0257  vd alveolar implosive
ɖ       598      0256  vd retroflex plosive
ð       240      00F0  vd dental fricative
ʤ       676      02A4  vd postalveolar affricate
ə       601      0259  schwa
ɘ       600      0258  close-mid schwa
ɚ       602      025A  rhotacized schwa
ɛ       603      025B  open-mid front unrounded
ɜ       604      025C  open-mid central
ɝ       605      025D  rhotacized open-mid central
ɞ       606      025E  open-mid central rounded
ɟ       607      025F  vd palatal plosive
ʄ       644      0284  vd palatal implosive
ɡ       609      0261  vd velar plosive (but the IPA has ruled that an ordinary g is also acceptable)
ɠ       608      0260  vd velar implosive
ɢ       610      0262  vd uvular plosive
ʛ       667      029B  vd uvular implosive
ɦ       614      0266  vd glottal fricative
ɧ       615      0267  vl multiple-place fricative
ħ       295      0127  vl pharyngeal fricative
ɥ       613      0265  labial-palatal approximant
ʜ       668      029C  vl epiglottal fricative
ɨ       616      0268  close central unrounded
ɪ       618      026A  lax close front unrounded
ʝ       669      029D  vd palatal fricative
ɭ       621      026D  vd retroflex lateral
ɬ       620      026C  vl alveolar lateral fricative
ɫ       619      026B  velarized vd alveolar lateral
ɮ       622      026E  vd alveolar lateral fricative
ʟ       671      029F  vd velar lateral
ɱ       625      0271  vd labiodental nasal
ɯ       623      026F  close back unrounded
ɰ       624      0270  velar approximant
ŋ       331      014B  vd velar nasal
ɳ       627      0273  vd retroflex nasal
ɲ       626      0272  vd palatal nasal
ɴ       628      0274  vd uvular nasal
ø       248      00F8  front close-mid rounded
ɵ       629      0275  rounded schwa
ɸ       632      0278  vl bilabial fricative
θ       952      03B8  vl dental fricative
œ       339      0153  front open-mid rounded
ɶ       630      0276  front open rounded
ʘ       664      0298  bilabial click
ɹ       633      0279  vd (post)alveolar approximant
ɺ       634      027A  vd alveolar lateral flap
ɾ       638      027E  vd alveolar tap
ɻ       635      027B  vd retroflex approximant
ʀ       640      0280  vd uvular trill
ʁ       641      0281  vd uvular fricative
ɽ       637      027D  vd retroflex flap
ʂ       642      0282  vl retroflex fricative
ʃ       643      0283  vl postalveolar fricative
ʈ       648      0288  vl retroflex plosive
ʧ       679      02A7  vl postalveolar affricate
ʉ       649      0289  close central rounded
ʊ       650      028A  lax close back rounded
ʋ       651      028B  vd labiodental approximant
ⱱ       11377    2C71  voiced labiodental flap
ʌ       652      028C  open-mid back unrounded
ɣ       611      0263  vd velar fricative
ɤ       612      0264  close-mid back unrounded
ʍ       653      028D  vl labial-velar fricative
χ       967      03C7  vl uvular fricative
ʎ       654      028E  vd palatal lateral
ʏ       655      028F  lax close front rounded
ʑ       657      0291  vd alveolopalatal fricative
ʐ       656      0290  vd retroflex fricative
ʒ       658      0292  vd postalveolar fricative
ʔ       660      0294  glottal plosive
ʡ       673      02A1  vd epiglottal plosive
ʕ       661      0295  vd pharyngeal fricative
ʢ       674      02A2  vd epiglottal fricative
ǀ       448      01C0  dental click
ǁ       449      01C1  alveolar lateral click
ǂ       450      01C2  alveolar click
ǃ       451      01C3  retroflex click
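The decimal and hex columns are just two spellings of the same code point, so a list like this one can be checked mechanically. The sketch below uses Python's standard `unicodedata` module; the helper names `check_pair` and `describe` are invented for this example. Note that the official Unicode character names it prints do not always match the phonetic labels used in the list.

```python
import unicodedata


def check_pair(decimal, hex_code):
    """Return True if the decimal and hex numbers name the same code point."""
    return int(hex_code, 16) == decimal


def describe(hex_code):
    """Return (decimal number, character, Unicode name) for a hex code."""
    code_point = int(hex_code, 16)
    ch = chr(code_point)
    # The second argument is a fallback for code points the local
    # Unicode database has no name for.
    return code_point, ch, unicodedata.name(ch, "<unnamed>")


if __name__ == "__main__":
    for dec, hx in [(331, "014B"), (601, "0259"), (11377, "2C71")]:
        print(check_pair(dec, hx), describe(hx))
```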
Spacing diacritics and suprasegmentals
To study these, you may find it helpful to set your browser text size to Largest.

Symbol  decimal  hex   value
ˈ       712      02C8  (primary) stress mark
ˌ       716      02CC  secondary stress
ː       720      02D0  length mark (NB: there is a bug in some versions of MS IExplorer that causes this character not to display. It is probably best to use a simple colon instead.)
ˑ       721      02D1  half-length
ʼ       700      02BC  ejective
ʴ       692      02B4  rhotacized
ʰ       688      02B0  aspirated
ʱ       689      02B1  breathy-voice-aspirated
ʲ       690      02B2  palatalized
ʷ       695      02B7  labialized
ˠ       736      02E0  velarized
ˤ       740      02E4  pharyngealized
˞       734      02DE  rhotacized
Note the ready-made characters 602 025A (combining 601 0259 and 734 02DE) and 605 025D (combining 604 025C and 734 02DE).
Non-spacing diacritics and suprasegmentals
As you can see, several of these are unsatisfactory, particularly in smaller sizes. They are shown here with an appropriate supporting base character. When composing a text in HTML, enter the diacritic after the base character, thus (voiceless n) n&#805;. The browser automatically backspaces the diacritic, but by a constant amount, which may or may not produce a satisfactory result.
Base  decimal  hex   value
nd    805      0325  voiceless
      778      030A  voiceless (use if character has descender)
ba    804      0324  breathy voiced
td    810      032A  dental
st    812      032C  voiced
ba    816      0330  creaky voiced
td    826      033A  apical
td    828      033C  linguolabial
td    827      033B  laminal
t     794      031A  not audibly released
      825      0339  more rounded
      771      0303  nasalized
      796      031C  less rounded
u     799      031F  advanced
e     800      0320  retracted
e     776      0308  centralized
ln    820      0334  velarized or pharyngealized
      619      026B  (ready-made combination, dark l)
e     829      033D  mid-centralized
e     797      031D  raised
mnl   809      0329  syllabic
e     798      031E  lowered
e     815      032F  non-syllabic
e     792      0318  advanced tongue root
e     793      0319  retracted tongue root
e     774      0306  extra-short
e     779      030B  extra high tone
      769      0301  high tone
e     772      0304  mid tone
      768      0300  low tone
e     783      030F  extra low tone
xx    860      035C  tie bar below
xx    865      0361  tie bar above

Arrows
value
downstep
upstep
becomes, is realized as (not recognized by the IPA)
global rise
global fall
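The rule that a non-spacing diacritic is entered after its base character can be tried out directly in code. This is a small sketch using Python's standard `unicodedata` module; the name `with_diacritic` is invented for this example. It also shows why some combinations come "ready-made": for a few base-plus-diacritic pairs Unicode defines a single precomposed character, which NFC normalization will substitute, while for others (such as n with ring below) the two code points simply stay side by side.

```python
import unicodedata

RING_BELOW = "\u0325"    # decimal 805, "voiceless"
ACUTE_ABOVE = "\u0301"   # decimal 769, "high tone"


def with_diacritic(base, diacritic):
    """Attach a non-spacing diacritic by placing it AFTER the base character."""
    return base + diacritic


voiceless_n = with_diacritic("n", RING_BELOW)
high_e = with_diacritic("e", ACUTE_ABOVE)

if __name__ == "__main__":
    # Two code points, displayed as one symbol by a capable font
    print(voiceless_n, len(voiceless_n))
    # A non-zero combining class marks a character as non-spacing
    print(unicodedata.combining(RING_BELOW))
    # e + acute has a precomposed form; n + ring below does not
    print(len(unicodedata.normalize("NFC", high_e)))
    print(len(unicodedata.normalize("NFC", voiceless_n)))
```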
For a much more thorough discussion of displaying and using Unicode characters, see Alan Wood's Unicode resources.
Syllabic, devoiced
Advantages: Straightforward.
Disadvantages: You need to keep this page on-screen. Some symbols are not shown here.

(b) Do Insert | Symbol, and find the symbol in the drop-down box that appears.
Advantages: Easy. You can even define a Shortcut Key for a symbol you need a lot.
Disadvantages: Fiddly. The on-screen symbols are very small. Some diacritics are too small to distinguish. (This criticism does not apply to Word 2002, where the drop-down box is much improved.)

Advantages: Gives a label for each character, helping you to be sure you've got the right one.
Disadvantages: Even more fiddly. Doesn't work in older versions of Windows.

(d) Read the articles Eureka and Eureka-IPA and create AutoCorrect shortcuts as described there.
Advantages: Excellent if you need to use each phonetic symbol several times. Builds on (b) above.
Disadvantages: Takes some time to set up.

(e) In Word 2002, type the symbol's Unicode number and do Alt-x. The Unicode number must be in hexadecimal form; e.g. the number for the velar nasal is 014B. (A complete list of the hex numbers of phonetic symbols.)
Advantages: Easy, if you know the number.
Disadvantages: You need to know the number. Does not work with previous versions of Word.

(f) Use a phonetic keyboard. You can install a virtual keyboard allowing you to access phonetic symbols by using the ordinary keys. I recommend Mark Huckvale's Unicode Phonetic Keyboard, which you can download free from here. (Windows PC only.)
Advantages: Easy, using the chart supplied; does not require knowledge of Unicode numbers.
Disadvantages: You have to toggle into and out of the special keyboard.
to create a Word document with the string of symbols you want, and then copy (ctrl-C) and paste (ctrl-V) them into Excel/Powerpoint etc.; or to use a virtual keyboard.
This works with any Unicode-enabled application, but not of course with those that are not Unicode-compliant.
In the table of pulmonic consonants, below, the first symbol within a cell denotes an unvoiced sound (e.g., t), while the second symbol denotes the corresponding voiced sound (e.g., d).
Pulmonic Consonants
Places of articulation (columns): Bilabial, Labiodental, Dental, Alveolar, Postalveolar, Retroflex, Palatal, Velar, Uvular
Manners of articulation (rows): Plosive, Nasal, Trill, Tap or Flap, Fricative, Lateral fricative, Approximant, Lateral approximant
Symbols: p F B f v T D s 4 V b m } M t d n R P z 5 r { | j K S Z $ [ 2 C J x \ / % c ] # k g q N G 7 8 + X Q H 9 h W L
In the table of vowels, below, wherever symbols appear in pairs, the leftmost symbol of the pair denotes an unrounded vowel, while the rightmost symbol denotes the corresponding rounded vowel.

Vowels
Backness (columns): Front, Central, Back
Height (rows): Close, Close-mid, Open-mid, Open
Symbols: i y I e 0 Y ) = @ 1 w U u , ; * ^ A ~ & <
The above two tables span the range of most common sounds (pulmonic consonants and vowels). There are a few remaining ASCII characters (., !, :, ', >, _ and `), and a number of sounds (non-pulmonic consonants, affricates) and symbols that are not included in the above tables. We suggest using some of the remaining ASCII characters as starting tokens of "escape sequences" to represent the remaining IPA symbols.
The following table suggests escape sequences (all starting with the ! symbol) for non-pulmonic consonants:

Non-pulmonic Consonants

Clicks
ASCII sequence  description
!0              bilabial
!|              dental
!!              (post)alveolar
!=              palatoalveolar
!#              alveolar lateral

Voiced Implosives
ASCII sequence  description
!b              bilabial
!d              dental/alveolar
!f              palatal
!g              velar
!G              uvular

Ejectives
ASCII sequence  description
!'
!p              bilabial
!t              dental/alveolar
!k              velar
!s              alveolar fricative
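Because every escape sequence starts with a reserved character, a transcription in this scheme can be split into symbols with a very small scanner. The following is a rough sketch, not part of the scheme itself: the names `ESCAPES`, `tokenize` and `explain` are invented for this example, and it assumes that every sequence starting with ! or . is exactly two characters long, as the sequences listed here are. Only the clicks and implosives are included in the lookup table.

```python
# Hypothetical lookup table built from the click and implosive lists above.
ESCAPES = {
    "!0": "bilabial click",
    "!|": "dental click",
    "!!": "(post)alveolar click",
    "!=": "palatoalveolar click",
    "!#": "alveolar lateral click",
    "!b": "bilabial implosive",
    "!d": "dental/alveolar implosive",
    "!f": "palatal implosive",
    "!g": "velar implosive",
    "!G": "uvular implosive",
}


def tokenize(text):
    """Split a transcription into one-character symbols and
    two-character escape sequences (assumed to start with ! or .)."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i] in "!." and i + 1 < len(text):
            tokens.append(text[i:i + 2])  # escape character plus its argument
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def explain(text):
    """Pair each token with its description, or None if it is not in ESCAPES."""
    return [(tok, ESCAPES.get(tok)) for tok in tokenize(text)]


if __name__ == "__main__":
    print(tokenize("a!bu"))
    print(explain("!!a"))
```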
The following table suggests escape sequences (all starting with the . symbol) for the affricates, other double articulations, and other symbols:

Other symbols
ASCII sequence  description
.m              voiceless labio-velar fricative
.w              voiced labio-velar approximant
.h              voiced labiopalatal approximant
.H              voiceless epiglottal fricative
.9              voiced epiglottal fricative
.I              alveolar lateral flap

Affricates and other symbols
ASCII sequence  description
.s
.S
.?              epiglottal plosive
.X              simultaneous [...] and [...]
.z
.c
'               primary stress
.Z
.7              secondary stress
You may also want to take a look at this page, which describes various other conventions, considered more-or-less "standard". A related page of mine is: Greek Sounds in the International Phonetic Alphabet