Diacritic: Difference between revisions

Alphabetization or collation: more angbr instead of italic
No edit summary
Tags: Mobile edit Mobile web edit Advanced mobile edit
 
(22 intermediate revisions by 10 users not shown)
Line 21:
* accents (so called because the acute, grave, and circumflex were originally used to indicate different types of [[pitch accent]]s in the [[polytonic transcription]] of [[Greek language|Greek]])
<!-- This list uses <span style="font-family: serif"> because of rendering limitation in Android (as of v13), that its default sans font fails to render "dotted circle + diacritic", so visitors just get a meaningless (to most) [X] mark. Please retain at least until the issue is resolved because this is a very large proportion of visitors. -->
** <span style="font-family: serif">{{char|◌́}}</span> – [[acute accent|acute]] ({{langx|la|[[apex (diacritic)|apex]]}}); for example {{char|ó}}
** <span style="font-family: serif">{{char|◌̀}}</span> – [[grave accent|grave]]; for example {{char|ò}}
** <span style="font-family: serif">{{char|◌̂}}</span> – [[circumflex accent|circumflex]]; for example {{char|ô}}
Line 60:
** <span style="font-family: serif">{{char|◌̒}}</span> – [[inverted apostrophe]]
** <span style="font-family: serif">{{char|◌̔}}</span> – [[reversed apostrophe]]
** <span style="font-family: serif">{{char|◌̉}}</span> – [[hook above]] ({{langx|vi|dấu hỏi}})
** <span style="font-family: serif">{{char|◌̛}}</span> – [[horn (diacritic)|horn]] ({{langx|vi|dấu móc}}); for example {{char|ơ}}
* subscript curls
** <span style="font-family: serif">{{char|◌̦}}</span> – [[comma#Diacritical usage|undercomma]]; for example {{char|ș}}
Line 114:
These diacritics are used in addition to the acute, grave, and circumflex accents and the diaeresis:
* <span style="font-family: serif">{{char|◌ͺ}}</span> – [[iota subscript]] ({{lang|grc|ᾳ, εͅ, ῃ, ιͅ, οͅ, υͅ, ῳ}})
* <span style="font-family: serif">{{char|῾◌}}</span> – [[rough breathing]] ({{langx|grc|δασὺ πνεῦμα|dasỳ pneûma}}, {{langx|la|spīritus asper}}): aspiration
* <span style="font-family: serif">{{char|᾿◌}}</span> – [[smooth breathing|smooth (or soft) breathing]] ({{langx|grc|ψιλὸν πνεῦμα|psilòn pneûma}}, {{langx|la|spīritus lēnis}}): lack of aspiration
 
===Hebrew===
Line 121:
[[File:Example of biblical Hebrew trope.svg|thumb|upright=1.6|right|'''Genesis 1:9 "And God said, Let the waters be collected".'''<br>Letters in black, <span style="color:#CC0000;">[[niqqud]] in red</span>, <span style="color:#0000CC;">[[Hebrew cantillation|cantillation]] in blue</span>]]
* [[Niqqud]]
** {{big|{{char| ּ}}}} – [[Dagesh]]
** {{big|{{char| ּ}}}} – [[Mappiq]]
** {{big|{{char| ֿ}}}} – [[Rafe]]
** {{big|{{char| ׁ}}}} – [[Shin dot]] (at top right corner)
** {{big|{{char| ׂ}}}} – [[Sin dot]] (at top left corner)
** {{big|{{char| ְ}}}} – [[Shva]]
** {{big|{{char| ֻ}}}} – [[Kubutz]]
** {{big|{{char|ֹ◌}}}} – [[Holam]]
** {{big|{{char| ָ}}}} – [[Kamatz]]
** {{big|{{char| ַ}}}} – [[Patakh]]
** {{big|{{char| ֶ}}}} – [[Segol]]
** {{big|{{char| ֵ}}}} – [[Tzeire]]
** {{big|{{char| ִ}}}} – [[Hiriq]]
 
* ([[Hebrew cantillation|Cantillation]] marks do not generally render correctly; refer to [[Hebrew cantillation#Names and shapes of the ta'amim]] for a complete table together with instructions for how to maximize the possibility of viewing them in a web browser.)
* Other
** {{big|{{char| ׳}}}} – [[Geresh]]
** {{big|{{char| ״}}}} – [[Gershayim]]
 
===Korean===
Line 186 ⟶ 187:
 
==Generation with computers==
{{main|Unicode input}}
[[File:Germanic umlaut on keyboard.jpg|thumb|German keyboard with umlaut letters]]
Modern computer technology was developed mostly in countries that speak Western European languages (particularly English), and many early binary encodings were developed with a bias favoring English{{mdash}}a language written without diacritical marks. With [[computer memory]] and [[computer storage]] at a premium, early [[character set]]s were limited to the Latin alphabet, the ten digits and a few punctuation marks and conventional symbols. The American Standard Code for Information Interchange ([[ASCII]]), first published in 1963, encoded just 95 printable characters. It included just four free-standing diacritics{{mdash}}acute, grave, circumflex and tilde{{mdash}}which were to be used by backspacing and overprinting the base letter. The [[ISO/IEC 646]] standard (1967) defined national variations that replace some American graphemes with [[precomposed character]]s (such as {{angbr|é}}, {{angbr|è}} and {{angbr|ë}}), according to language{{mdash}}but remained limited to 95 printable characters.
Modern computer technology was developed mostly in English-speaking countries, so data formats, keyboard layouts, etc. were developed with a bias favoring English, a language with an alphabet without diacritical marks. Efforts have been made to create [[internationalized domain names]] that further extend the English alphabet (e.g., "pokémon.com").
 
Depending on the [[keyboard layout]], which differs amongst countries, it is more or less easy to enter letters with diacritics on computers and typewriters. Some have their own keys; some are created by first pressing the key with the [[Combining character|diacritic mark]] followed by the letter to place it on. Such a key is sometimes referred to as a [[dead key]], as it produces no output of its own but modifies the output of the key pressed after it.
 
In modern Microsoft Windows and Linux operating systems, the keyboard layouts ''US International'' and ''UK International'' feature [[dead key]]s that allow one to type Latin letters with the acute, grave, circumflex, diaeresis/umlaut, tilde, and cedilla found in Western European languages (specifically, those combinations found in the [[ISO Latin-1]] character set) directly: {{keypress|¨}} + {{keypress|e}} gives ''ë'', {{keypress|~}} + {{keypress|o}} gives ''õ'', etc. On [[Apple Macintosh]] computers, there are keyboard shortcuts for the most common diacritics; {{keypress|Option|E}} followed by a vowel places an acute accent, {{keypress|Option|U}} followed by a vowel gives an umlaut, {{keypress|Option|C}} gives a cedilla, etc. Diacritics can be [[Compose key|composed]] in most [[X Window System]] keyboard layouts, as well as other operating systems, such as Microsoft Windows, using additional software.
 
[[Unicode]] was conceived to solve this problem by assigning every known character its own code; if this code is known, most modern computer systems provide a [[Unicode#Input methods|method to input it]]. For historical reasons, almost all the letter-with-accent combinations used in European languages were given unique [[code point]]s and these are called [[precomposed character]]s. For other languages, it is usually necessary to use a [[combining character]] diacritic together with the desired base letter. Unfortunately, even as of 2024, many applications and web browsers remain unable to operate the combining diacritic concept properly.
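
The difference between a precomposed character and a base letter followed by a combining diacritic can be illustrated with Python's standard <code>unicodedata</code> module. This is only a sketch for illustration; the helper name <code>code_points</code> is invented for the example and is not part of any library.

```python
import unicodedata

precomposed = "\u00e9"   # é stored as one precomposed code point
combining = "e\u0301"    # e followed by U+0301 COMBINING ACUTE ACCENT


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The two strings usually render identically but are different sequences.
print(precomposed == combining)   # False

# NFC composes to the precomposed form where one exists;
# NFD decomposes into base letter plus combining mark(s).
nfc = unicodedata.normalize("NFC", combining)
nfd = unicodedata.normalize("NFD", precomposed)

print(code_points(nfc))   # ['U+00E9']
print(code_points(nfd))   # ['U+0065', 'U+0301']
```

Software that compares or searches text containing diacritics typically normalizes both sides to the same form first, since the two spellings are otherwise unequal.
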
Depending on the [[keyboard layout]] and [[keyboard mapping]], it is more or less easy to enter letters with diacritics on computers and typewriters. Keyboards used in countries where letters with diacritics are the norm have keys engraved with the relevant symbols. In other cases, such as when the [[US international]] or [[UK extended]] mappings are used, the accented letter is created by first pressing the key with the [[Combining character|diacritic mark]], followed by the letter to place it on. This method is known as the [[dead key]] technique, as it produces no output of its own but modifies the output of the key pressed after it.
 
==Languages with letters containing diacritics==
Line 200:
=== Latin script ===
====Baltic====
:* [[Latvian alphabet|Latvian]] has the following letters: {{angbr|[[ā]]}}, {{angbr|[[ē]]}}, {{angbr|[[ī]]}}, {{angbr|[[ū]]}}, {{angbr|[[č]]}}, {{angbr|[[ģ]]}}, {{angbr|[[ķ]]}}, {{angbr|[[ļ]]}}, {{angbr|[[ņ]]}}, {{angbr|[[š]]}}, {{angbr|[[ž]]}}
:* [[Lithuanian alphabet|Lithuanian]]. In general usage, where letters appear with the caron ({{angbr|č}}, {{angbr|š}} and {{angbr|ž}}), they are considered as separate letters from {{angbr|c}}, {{angbr|s}} or {{angbr|z}} and collated separately; letters with the [[ogonek]] ({{angbr|[[ą]]}}, {{angbr|[[ę]]}}, {{angbr|[[į]]}} and {{angbr|[[ų]]}}), the [[Macron (diacritic)|macron]] ({{angbr|[[ū]]}}) and the [[overdot]] ({{angbr|[[ė]]}}) are considered as separate letters as well, but not given a unique collation order.
 
====Celtic====
:* [[Welsh language|Welsh]] uses the circumflex, diaeresis, acute, and grave accents on its seven vowels {{angbr|a}}, {{angbr|e}}, {{angbr|i}}, {{angbr|o}}, {{angbr|u}}, {{angbr|w}}, {{angbr|y}} (hence the composites {{angbr|â}}, {{angbr|ê}}, {{angbr|î}}, {{angbr|ô}}, {{angbr|û}}, {{angbr|ŵ}}, {{angbr|ŷ}}, {{angbr|ä}}, {{angbr|ë}}, {{angbr|ï}}, {{angbr|ö}}, {{angbr|ü}}, {{angbr|ẅ}}, {{angbr|ÿ}}, {{angbr|á}}, {{angbr|é}}, {{angbr|í}}, {{angbr|ó}}, {{angbr|ú}}, {{angbr|ẃ}}, {{angbr|ý}}, {{angbr|à}}, {{angbr|è}}, {{angbr|ì}}, {{angbr|ò}}, {{angbr|ù}}, {{angbr|ẁ}}, {{angbr|ỳ}}). However, all except the circumflex (which is used as a macron) are fairly rare.
:* Following spelling reforms since the 1970s, [[Scottish Gaelic]] uses graves only, which can be used on any vowel ({{angbr|[[à]]}}, {{angbr|[[è]]}}, {{angbr|[[ì]]}}, {{angbr|[[ò]]}}, {{angbr|[[ù]]}}). Formerly acute accents could be used on {{angbr|á}}, {{angbr|ó}} and {{angbr|é}}, which were used to indicate a specific vowel quality. With the elimination of these accents, the new orthography relies on the reader having prior knowledge of pronunciation of a given word.
:* [[Manx language|Manx]] uses the cedilla diacritic {{angbr|[[ç]]}} combined with h to give the digraph {{angle bracket|çh}} (pronounced {{IPA|/tʃ/}}) to mark the distinction between it and the digraph {{angle bracket|ch}} (pronounced {{IPA|/h/}} or {{IPA|/x/}}). Other diacritics used in Manx include the circumflex and diaeresis, as in {{angbr|â}}, {{angbr|ê}}, {{angbr|ï}}, etc., to mark the distinction between two similarly spelled words with slightly differing pronunciation.
:* [[Irish language|Irish]] uses only acute accents to mark long vowels, following the 1948 spelling reform. [[Lenition]] is indicated using an [[overdot]] in [[Gaelic type]] ({{angbr|[[ċ]]}}, {{angbr|ḋ}}, {{angbr|ḟ}}, {{angbr|[[ġ]]}}, {{angbr|ṁ}}, {{angbr|ṗ}}, {{angbr|[[ṡ]]}}, {{angbr|ṫ}}); in [[Roman type]], a suffixed {{angbr|h}} is used. Thus, <span style="font-family:Duibhlinn, Ceanannas, Corcaigh, sans-serif">{{lang|ga|a ṁáṫair}}</span> is equivalent to <span style="font-family:Times New Roman, serif">{{lang|ga|a mháthair}}</span>.
:* [[Breton orthography|Breton]] does not have a single orthography (spelling system), but uses diacritics for a number of purposes. The diaeresis is used to mark that two vowels are pronounced separately and not as a diphthong/digraph. The circumflex is used to mark long vowels, but usually only when the vowel length is not predictable by phonology. Nasalization of vowels may be marked with a tilde, or by following the vowel with the letter {{angbr|ñ}}. The plural suffix -où is used as a unified spelling to represent a suffix with a number of pronunciations in different dialects, and to distinguish this suffix from the digraph {{angbr|ou}}, which is pronounced as {{IPA|/u:/}}. An apostrophe is used to distinguish {{angbr|c'h}}, pronounced {{IPA|/x/}} as the digraph {{angbr|ch}} is used in other Celtic languages, from the French-influenced digraph ch, pronounced {{IPA|/ʃ/}}.
 
====Finno-Ugric====
:* [[Estonian alphabet|Estonian]] has a distinct letter {{angbr|[[õ]]}}, which contains a tilde. Estonian "dotted vowels" with [[double dot (diacritic)|double-dot diacritics]] {{angbr|ä}}, {{angbr|ö}}, {{angbr|ü}} are similar to German, but these are also distinct letters, unlike [[Umlaut (diacritic)|German umlauted]] letters. All four have their own place in the alphabet, between {{angbr|w}} and {{angbr|x}}. [[Caron]]s in {{angbr|š}} or {{angbr|ž}} appear only in foreign proper names and [[loanwords]]. These are also distinct letters, placed in the alphabet between ''s'' and ''t''.
:* [[Finnish alphabet|Finnish]] uses double-dotted (umlauted) vowels ({{angbr|ä}} and {{angbr|ö}}). As in Swedish and Estonian, these are regarded as individual letters, rather than 'vowel + diacritic' combinations (as happens in German). It also uses the characters {{angbr|å}}, {{angbr|š}} and {{angbr|ž}} in foreign names and loanwords. In the Finnish and Swedish alphabets, {{angbr|å}}, {{angbr|ä}} and {{angbr|ö}} collate as separate letters after {{angbr|z}}, the others as variants of their base letter.
:* [[Hungarian alphabet|Hungarian]] uses the double-dot, the acute and double acute diacritics (the last is unique to Hungarian): ({{angbr|ö}}, {{angbr|ü}}), ({{angbr|á}}, {{angbr|é}}, {{angbr|í}}, {{angbr|ó}}, {{angbr|ú}}) and ({{angbr|ő}}, {{angbr|ű}}). The acute accent indicates the long form of a vowel (in the case of {{angbr|i}}/{{angbr|í}}, {{angbr|o}}/{{angbr|ó}}, {{angbr|u}}/{{angbr|ú}}) while the double acute performs the same function for {{angbr|ö}} and {{angbr|ü}}. The acute accent can also indicate a different sound (more open, as in the case of {{angbr|a}}/{{angbr|á}}, {{angbr|e}}/{{angbr|é}}). Both long and short forms of the vowels are listed separately in the [[Hungarian alphabet]], but members of the pairs {{angbr|a}}/{{angbr|á}}, {{angbr|e}}/{{angbr|é}}, {{angbr|i}}/{{angbr|í}}, {{angbr|o}}/{{angbr|ó}}, {{angbr|ö}}/{{angbr|[[ő]]}}, {{angbr|u}}/{{angbr|ú}} and {{angbr|ü}}/{{angbr|[[ű]]}} are collated in dictionaries as the same letter.
:* [[Livonian language|Livonian]] has the following letters: {{angbr|ā}}, {{angbr|ä}}, {{angbr|[[ǟ]]}}, {{angbr|[[ḑ]]}}, {{angbr|ē}}, {{angbr|ī}}, {{angbr|ļ}}, {{angbr|ņ}}, {{angbr|ō}}, {{angbr|[[ȯ]]}}, {{angbr|[[ȱ]]}}, {{angbr|[[õ]]}}, {{angbr|[[ȭ]]}}, {{angbr|ŗ}}, {{angbr|š}}, {{angbr|ț}}, {{angbr|ū}}, {{angbr|ž}}.
 
====Germanic====
:* [[German orthography|German]] uses the [[two dots (diacritic)|two-dots diacritic]] ({{langx|de|[[Umlaut (diacritic)|umlaut]]}}): letters {{Angbr|[[ä]]}}, {{angbr|[[ö]]}}, {{angbr|[[ü]]}}, used to indicate the [[fronting (phonology)|fronting]] of back vowels (see [[umlaut (linguistics)]]).
:* [[Dutch orthography|Dutch]] uses [[acute accent|acute]], [[circumflex]], [[grave accent|grave]] and two-dots diacritics with most vowels and [[cedilla]] with c, as in French. This results in {{angbr|[[á]]}}, {{angbr|[[à]]}}, {{angbr|[[ä]]}}, {{angbr|[[é]]}}, {{angbr|[[è]]}}, {{angbr|[[ê]]}}, {{angbr|[[ë]]}}, {{angbr|[[í]]}}, {{angbr|[[î]]}}, {{angbr|[[ï]]}}, {{angbr|[[ó]]}}, {{angbr|[[ô]]}}, {{angbr|[[ö]]}}, {{angbr|[[ú]]}}, {{angbr|[[û]]}}, {{angbr|[[ü]]}} and {{angbr|[[ç]]}}. This is mostly on words (and names) originating from French (like ''crème, café, gêne, façade''). The acute accent is also used to stress the vowel (like ''één''). The [[two dots (diacritic)|two-dots diacritic]] ({{char|¨}}) is used as a linguistic diaeresis (indicating a [[vowel hiatus]]) that splits the two vowels (e.g., ''reële, reünie, coördinatie''), rather than to indicate a [[umlaut (linguistics)|linguistic {{lang|de|umlaut}}]] as used in German.
:* [[Afrikaans alphabet|Afrikaans]] uses 16 additional vowel forms, both uppercase and lowercase: {{angbr|[[á]]}}, {{angbr|[[ä]]}}, {{angbr|[[é]]}}, {{angbr|[[è]]}}, {{angbr|[[ê]]}}, {{angbr|[[ë]]}}, {{angbr|[[í]]}}, {{angbr|[[î]]}}, {{angbr|[[ï]]}}, [[ʼn]], {{angbr|[[ó]]}}, {{angbr|[[ô]]}}, {{angbr|[[ö]]}}, {{angbr|[[ú]]}}, {{angbr|[[û]]}}, {{angbr|[[ü]]}}, {{angbr|[[ý]]}}. <!-- The precomposed digraph ʼn is not a letter and its use is deprecated. -->
:* [[Faroese alphabet|Faroese]] uses [[acute accent|acutes]] and some additional letters. All are considered separate letters and have their own place in the alphabet: {{angbr|[[á]]}}, {{angbr|[[í]]}}, {{angbr|[[ó]]}}, {{angbr|[[ú]]}}, {{angbr|[[ý]]}} and {{angbr|[[ø]]}}.
:* [[Icelandic orthography|Icelandic]] uses acutes and other additional letters. All are considered separate letters, and have their own place in the alphabet: {{angbr|[[á]]}}, {{angbr|[[é]]}}, {{angbr|[[í]]}}, {{angbr|[[ó]]}}, {{angbr|[[ú]]}}, {{angbr|[[ý]]}} and {{angbr|[[ö]]}}.
:* [[Danish alphabet|Danish]] and [[Norwegian language|Norwegian]] use additional characters like the o-slash {{angbr|[[ø]]}} and the a-overring {{angbr|[[å]]}}. These letters come after {{angbr|z}} and {{angbr|[[æ]]}} in the order {{angbr|ø}}, {{angbr|å}}. Historically, the {{angbr|å}} has developed from a ligature by writing a small superscript {{angbr|a}} over a lowercase {{angbr|a}}; if an {{angbr|å}} character is unavailable, some Scandinavian languages allow the substitution of a doubled ''a'', thus {{angbr|aa}}. The Scandinavian languages collate these letters after {{angbr|z}}, but have different national [[collation]] standards.
:* [[Swedish alphabet|Swedish]] uses a-diaeresis ({{angbr|[[ä]]}}) and o-diaeresis ({{angbr|[[ö]]}}) in the place of {{lang|sv|ash}} ({{angbr|æ}}) and slashed o ({{angbr|[[ø]]}}) in addition to the a-overring ({{angbr|å}}). Historically, the two-dots diacritic for the Swedish letters {{angbr|ä}} and {{angbr|ö}} developed from a small Gothic {{angbr|e}} written above the letters. These letters are collated after {{angbr|z}}, in the order {{angbr|å}}, {{angbr|ä}}, {{angbr|ö}}.
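
Language-specific collation of the kind described in this list, such as placing {{angbr|å}}, {{angbr|ä}}, {{angbr|ö}} after {{angbr|z}} in Swedish, does not follow from raw code point order. The following is a minimal sketch of a tailored sort key, assuming lowercase Latin words only; the names <code>SWEDISH_ALPHABET</code>, <code>RANK</code> and <code>swedish_key</code> are invented for the example, and production software would normally rely on a full collation library rather than a hand-written table.

```python
import unicodedata

# Hypothetical collation table: the Swedish alphabet in dictionary order.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Sort key that ranks each letter by its position in SWEDISH_ALPHABET."""
    # Compose to NFC so that 'a' + combining ring becomes the single letter å.
    normalized = unicodedata.normalize("NFC", word).lower()
    # Simplification: characters outside the table sort after all known letters.
    return [RANK.get(ch, len(RANK)) for ch in normalized]


words = ["öl", "zebra", "ärta", "apa", "år"]

# Plain sorting compares code points, which places ä (U+00E4) before å (U+00E5).
by_code_point = sorted(words)

# The tailored key gives the order å, ä, ö after z.
by_swedish_order = sorted(words, key=swedish_key)
print(by_swedish_order)   # ['apa', 'zebra', 'år', 'ärta', 'öl']
```

The same approach would need a different table for each language, which is why the national collation standards mentioned above differ from one another.
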
 
====Romance====
:* In [[Asturian language|Asturian]], [[Galician language|Galician]] and [[Spanish alphabet|Spanish]], the character {{angbr|[[ñ]]}} is a letter and collated between ''n'' and ''o''.
:* [[Asturian language|Asturian]] uses an underdot: {{angbr|[[Ḷ]]}} ([[lower case]] {{angbr|ḷ}}), and {{angbr|[[Voiceless glottal fricative|Ḥ]]}} ([[lower case]] {{angbr|ḥ}})<ref>{{cite book|url=http://www.academiadelallingua.com/diccionariu/gramatica_llingua.pdf |title=Gramática de la Llingua Asturiana |access-date=2011-06-07 |url-status=dead |archive-url=https://web.archive.org/web/20110525120027/http://www.academiadelallingua.com/diccionariu/gramatica_llingua.pdf |archive-date=2011-05-25 |publisher=Academia de la Llingua Asturiana | edition=3rd | date=2001 | isbn=84-8168-310-8 | at=section 1.2}}</ref>
:* [[Catalan language|Catalan]] uses the acute accent {{angbr|é}}, {{angbr|í}}, {{angbr|ó}}, {{angbr|ú}}, the grave accent {{angbr|à}}, {{angbr|è}}, {{angbr|ò}}, the diaeresis {{angbr|ï}}, {{angbr|ü}}, the cedilla {{angbr|ç}}, and the [[interpunct]] {{angbr|l·l}}.
::* In [[Valencian language|Valencian]], the circumflex {{angbr|â}}, {{angbr|ê}}, {{angbr|î}}, {{angbr|ô}}, {{angbr|û}} may also be used.
:* [[Corsican language|Corsican]] uses the following in [[Corsican alphabet|its alphabet]]: {{angbr|À}}/{{angbr|à}}, {{angbr|È}}/{{angbr|è}}, {{angbr|Ì}}/{{angbr|ì}}, {{angbr|Ò}}/{{angbr|ò}}, {{angbr|Ù}}/{{angbr|ù}}.
:* [[French language|French]] uses four diacritics appearing on vowels (circumflex, acute, grave, diaeresis) and the cedilla appearing in {{angbr|ç}}.
:* [[Italian language|Italian]] uses two diacritics appearing on vowels (acute, grave).
:* [[Leonese language|Leonese]] could use {{angbr|ñ}} or {{angbr|[[List of Latin digraphs#N|nn]]}}.
:* [[Portuguese language|Portuguese]] uses a tilde with the vowels {{angbr|a}} and {{angbr|o}} and a cedilla with c.
:* [[Romanian alphabet|Romanian]] uses a [[breve]] on the letter ''a'' ({{angbr|[[ă]]}}) to indicate the sound [[schwa]] {{IPA|/ə/}}, as well as a circumflex over the letters ''a'' ({{angbr|[[â]]}}) and ''i'' ({{angbr|[[î]]}}) for the sound {{IPA|/ɨ/}}. Romanian also writes a [[comma below]] the letters ''s'' ({{angbr|[[ș]]}}) and ''t'' ({{angbr|[[ț]]}}) to represent the sounds {{IPA|/ʃ/}} and {{IPA|/t͡s/}}, respectively. These characters are collated after their non-diacritic equivalent.
:* [[Spanish language|Spanish]] uses acute accents ({{angbr|á}}, {{angbr|é}}, {{angbr|í}}, {{angbr|ó}}, {{angbr|ú}}) to indicate stress falling on a different syllable than the one it would fall on based on default rules, and to distinguish certain one-syllable homonyms (e.g. {{lang|es|el}} (masculine singular definite article) and {{lang|es|él}} [he]). Diaeresis is used on u only, to distinguish the combinations {{lang|es|gue, gui}} {{IPA|/ge/, /gi/}} from {{lang|es|güe, güi}} {{IPA|/gwe/, /gwi/}}, e.g. {{lang|es|vergüenza, lingüística}}. The tilde on {{angbr|ñ}} is not considered a diacritic as {{angbr|ñ}} is considered a distinct letter from {{angbr|n}}, not a mutated form of it.
 
====Slavic====
:* [[Gaj's Latin alphabet]], used for Bosnian, [[Croatian language|Croatian]], Montenegrin and latinized [[Serbian language|Serbian]], has the symbols {{angbr|[[č]]}}, {{angbr|[[ć]]}}, {{angbr|[[đ]]}}, {{angbr|[[š]]}} and {{angbr|[[ž]]}}, which are considered separate letters and are listed as such in dictionaries and other contexts in which words are listed according to alphabetical order. It also has one [[digraph (orthography)|digraph]] including a diacritic, ''[[dž]]'', which is also alphabetized independently, and follows {{angbr|[[d]]}} and precedes {{angbr|[[đ]]}} in the alphabetical order.
:* The [[Czech alphabet]] uses the acute (á é í ó ú ý), caron ([[č]] [[ď]] [[ě]] [[ň]] [[ř]] [[š]] [[ť]] [[ž]]), and for one letter ([[ů]]) the ring. (In ď and ť the caron is modified to look rather like an apostrophe.) Letters with a caron are considered separate letters, whereas vowels with an acute are considered only as longer variants of the unaccented letters. The acute does not affect alphabetical order; letters with a caron are ordered after their original counterparts.
:* [[Polish alphabet|Polish]] has the following letters: [[ą]] [[ć]] [[ę]] [[ł]] [[ń]] [[ó]] [[ś]] [[ź]] [[ż]]. These are considered to be separate letters: each of them is placed in the alphabet immediately after its Latin counterpart (e.g. {{angbr|ą}} between {{angbr|a}} and {{angbr|b}}); {{angbr|ź}} and {{angbr|ż}} are placed after {{angbr|z}} in that order.
:* The [[Serbian Cyrillic alphabet|Serbian Cyrillic]] alphabet has no diacritics; instead, it has a grapheme ([[glyph]]) for every letter of [[Gaj's Latin alphabet|its Latin counterpart]] (including Latin letters with diacritics and the digraphs dž, ''[[Lje|lj]]'' and ''[[Nj (digraph)|nj]]'').
:* The [[Slovak alphabet]] uses the acute (á é í ó ú ý [[ĺ]] [[ŕ]]), caron (č ď ľ ň š ť ž dž), umlaut (ä) and circumflex accent (ô). All of those are considered separate letters and are placed directly after the original counterpart in the [[Slovak alphabet|alphabet]].<ref name="PSP2000">http://www.juls.savba.sk/ediela/psp2000/psp.pdf page 12, section I.2</ref>
:* The basic [[Slovenian alphabet]] has the symbols {{angbr|[[č]]}}, {{angbr|[[š]]}}, and {{angbr|[[ž]]}}, which are considered separate letters and are listed as such in dictionaries and other contexts in which words are listed according to alphabetical order. Letters with a [[caron]] are placed right after the letters as written without the diacritic. The letter {{angbr|đ}} ('d with bar') may be used in non-transliterated foreign words, particularly names, and is placed after {{angbr|č}} and before {{angbr|d}}.
Line 265 ⟶ 267:
:*[[Vietnamese alphabet|Vietnamese]] uses the [[horn (diacritic)|horn diacritic]] for the letters ''ơ'' and ''ư''; the [[circumflex]] for the letters ''â'', ''ê'', and ''ô''; the [[breve]] for the letter ''ă''; and a bar through the letter ''đ''. Separately, it also has á, à, ả, ã and ạ, the five tones used for vowels besides the flat tone 'a'.
 
===[[Cyrillic letters]]===
{{further|Cyrillic script}}
:* [[Belarusian alphabet|Belarusian]] and [[Uzbek alphabet#Correspondence chart|Uzbek Cyrillic]] have a letter {{angbr|[[Short U (Cyrillic)|ў]]}}.
:* Belarusian, [[Bulgarian language#Alphabet|Bulgarian]], Russian and Ukrainian have the letter {{angbr|[[Short I|й]]}}.
:* Belarusian and [[Russian alphabet|Russian]] have the letter {{angbr|[[Yo (Cyrillic)|ё]]}}. In Russian, this letter is usually replaced by {{angbr|[[Ye (Cyrillic)|е]]}}, although it has a different pronunciation. The use of {{angbr|е}} instead of {{angbr|ё}} does not affect the pronunciation. ''Ё'' is always used in children's books and in dictionaries. A [[minimal pair]] is все (''vs'e'', "everybody" pl.) and всё (''vs'o'', "everything" n. sg.). In Belarusian the replacement by {{angbr|е}} is a mistake; in Russian, it is permissible to use either {{angbr|е}} or {{angbr|ё}} for {{angbr|ё}} but the former is more common in everyday writing (as opposed to instructional or juvenile writing).
:* The Cyrillic [[Ukrainian alphabet]] has the letters {{angbr|[[ґ]]}}, {{angbr|[[й]]}} and {{angbr|[[ї]]}}. Ukrainian [[Latynka]] has many more.
:* [[Macedonian language|Macedonian]] has the letters {{angbr|[[kje|ќ]]}} and {{angbr|[[gje|ѓ]]}}.
:* In Bulgarian and [[Macedonian language|Macedonian]] the possessive pronoun ѝ (''ì'', "her") is spelled with a grave accent in order to distinguish it from the conjunction и (''i'', "and").
:* The acute accent {{char|◌́}} above any vowel in Cyrillic alphabets is used in dictionaries, books for children and foreign learners to indicate the word stress; it can also be used to disambiguate similarly spelled words with different lexical stresses.
 
==Diacritics that do not produce new letters==
Line 279 ⟶ 282:
===English===
{{main article|English terms with diacritical marks}}
[[English alphabet|English]] is one of the few European languages that does not have many words that contain diacritical marks. Instead, digraphs are the main way the Modern English alphabet adapts the Latin to its phonemes. Exceptions are unassimilated foreign loanwords, including borrowings from [[French language|French]] (and, increasingly, [[Spanish language|Spanish]], like ''jalapeño'' and ''piñata''); however, the diacritic is also sometimes omitted from such words. Loanwords that frequently appear with the diacritic in English include ''café'', ''résumé'' or ''resumé'' (a usage that helps distinguish it from the verb ''resume''), ''soufflé'', and ''naïveté'' (see ''[[English terms with diacritical marks]]''). In older practice (and even among some orthographically conservative modern writers), one may see examples such as ''élite'', ''mêlée'' and ''rôle''.
 
English speakers and writers once used the diaeresis more often than now in words such as ''coöperation'' (from Fr. ''coopération''), ''zoölogy'' (from Grk. ''zoologia''), and ''seeër'' (now more commonly ''see-er'' or simply ''seer'') as a way of indicating that adjacent vowels belonged to separate syllables, but this practice has become far less common. ''[[The New Yorker]]'' magazine is a major publication that continues to use the diaeresis in place of a hyphen for clarity and economy of space.<ref>{{cite magazine|last=Norris|first=Mary|title=The Curse of the Diaeresis|url=http://www.newyorker.com/online/blogs/culture/2012/04/the-curse-of-the-diaeresis.html|magazine=The New Yorker|date=26 April 2012|access-date=18 April 2014}}</ref>
Line 299 ⟶ 302:
* [[Filipino alphabet|Filipino]] has the following composite characters: ''á, à, â, é, è, ê, í, ì, î, ó, ò, ô, ú, ù, û''. Everyday use of diacritics for Filipino is, however, uncommon, and meant only to distinguish between [[homonym]]s between a word with the usual [[penult]]imate stress and one with a different stress placement. This aids both comprehension and pronunciation if both are relatively adjacent in a text, or if a word is itself ambiguous in meaning. The letter ''ñ'' ("''eñe''") is not a ''n'' with a diacritic, but rather collated as a separate letter, one of eight borrowed from Spanish. Diacritics appear in [[Spanish language in the Philippines|Spanish]] [[List of loanwords in Tagalog#Spanish|loanwords]] and [[Filipino name|names]] observing Spanish orthography rules.
* [[Finnish alphabet|Finnish]]. Carons in ''š'' and ''ž'' appear only in foreign proper names and [[loanword]]s, but may be substituted with ''sh'' or ''zh'' if and only if it is technically impossible to produce accented letters in the medium. Contrary to Estonian, ''š'' and ''ž'' are not considered distinct letters in Finnish.
* [[French alphabet|French]] uses five diacritics. The grave (''accent grave'') marks the sound {{IPA|/ɛ/}} when over an e, as in ''père'' ("father") or is used to distinguish words that are otherwise homographs such as ''a''/''à'' ("has"/"to") or ''ou''/''où'' ("or"/"where"). The [[acute accent|acute]] (''accent aigu'') is only used in "é", modifying the "e" to make the sound {{IPA|/e/}}, as in ''étoile'' ("star"). The [[circumflex]] (''accent circonflexe'') generally denotes that an S once followed the vowel in Old French or Latin, as in ''fête'' ("party"), the Old French being ''feste'' and the Latin being ''festum''. Whether the circumflex modifies the vowel's pronunciation depends on the dialect and the vowel. The [[cedilla]] (''cédille'') indicates that a normally hard "c" (before the vowels "a", "o", and "u") is to be pronounced {{IPA|/s/}}, as in ''ça'' ("that"). The diaeresis diacritic ({{langx|fr|tréma}}) indicates that two adjacent vowels that would normally be pronounced as one are to be pronounced separately, as in ''Noël'' ("Christmas").
* [[Galician language|Galician]] vowels can bear an acute (''á, é, í, ó, ú'') to indicate stress or difference between two otherwise same written words (''é'', 'is' vs. ''e'', 'and'), but the diaeresis is only used with ''ï'' and ''ü'' to show two separate vowel sounds in pronunciation. Only in foreign words may Galician use other diacritics such as ''ç'' (common during the Middle Ages), ''ê'', or ''à''.
* [[German alphabet|German]] uses the three umlauted characters ''ä'', ''ö'' and ''ü''. These diacritics indicate vowel changes. For instance, the word ''Ofen'' {{IPA|de|ˈoːfən|}} "oven" has the plural ''Öfen'' {{IPA|[ˈøːfən]}}. The mark originated as a superscript ''e''; a handwritten blackletter ''e'' resembles two parallel vertical lines, like a diaeresis. Due to this history, "ä", "ö" and "ü" can be written as "ae", "oe" and "ue" respectively, if the umlaut letters are not available.
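
The fallback spelling described for German, where "ä", "ö" and "ü" are written "ae", "oe" and "ue" when the umlaut letters are unavailable, can be sketched as a simple character mapping. This is an illustration only; the names <code>UMLAUT_FALLBACK</code> and <code>without_umlauts</code> are invented for the example, and the uppercase mappings are an added assumption not stated in the text.

```python
import unicodedata

# Hypothetical mapping table for the fallback spelling; uppercase forms are assumed.
UMLAUT_FALLBACK = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def without_umlauts(text):
    """Replace German umlaut letters with their two-letter fallback spellings."""
    # Compose first so that 'o' + combining diaeresis is treated like ö.
    return unicodedata.normalize("NFC", text).translate(UMLAUT_FALLBACK)


print(without_umlauts("Öfen"))   # Oefen
print(without_umlauts("Ofen"))   # Ofen
```

The mapping is one-way: "oe" in a fallback spelling cannot be turned back into "ö" reliably, because some German words contain a genuine "oe" or "ue" sequence.
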
Line 346 ⟶ 349:
{{blockquote|
 
{{langx|bo|ཧྐྵྨླྺྼྻྂ|label=none}}
{{vpad|1=2em}}
}}
Line 618 ⟶ 621:
 
==External links==
* [http://urtd.net/projects/cod/ Context of Diacritics {{!}} A research project] {{Webarchive|url=https://web.archive.org/web/20141012135832/http://urtd.net/projects/cod/ |date=2014-10-12 }}
* [http://diacritics.typo.cz/ Diacritics Project]
* [https://www.unicode.org/ Unicode]