Wiktionary:Beer parlour/2020/February
Fix languages' information
- Category:Latin language -> Module:languages/data2: Add varieties as varieties = {"Old Latin", "Classical Latin", "Late Latin", "Medieval Latin", "New Latin", "Church Latin"}.
- Category:Alemannic German language -> Module:languages/data3/g: Change otherNames into varieties because that's what it is.
- Category:East Central German language -> Module:languages/datax: Change otherNames into varieties because that's what it is.
- Category:Central Franconian language -> Module:languages/datax: Ripuarian (including Colognian = Kölsch) and Moselle Franconian aren't other names but are varieties.
- Category:German language: Put Indiana German and Texan German under the item American German as Indiana and Texas are part of (the USA and thus of) America.
- Category:Akan language: It shouldn't be * Fante | ** Fanti, but * Fante, Fanti as Fanti is no form of Fante but just another spelling of it. Additionally, instead of * Twi-Fante | * Twi | * Fante it might rather be * Twi-Fante | ** Twi | ** Fante, that is Twi and Fante as forms of Twi-Fante.
--B-Fahrer (talk) 11:27, 1 February 2020 (UTC)
French ipa accent
Most people happen to know that French is always accented on the last syllable. I presume that is the reason for not using any accent in the IPA for French. But why is it assumed? How is the phonetic script «International» if it assumes that all readers are familiar with the idiosyncrasies of a certain language? Does this omission occur with other languages too? Merci. sarri.greek (talk) 12:19, 1 February 2020 (UTC)
- French doesn't actually have word accent at all. If you say a French word in isolation, there's stress on the last syllable, but that's because French puts stress on the last syllable of a phrase, not necessarily the last syllable of a word. —Mahāgaja · talk 12:37, 1 February 2020 (UTC)
- d'ˈaccord, then. ˈmerci, Mahāgaja. --sarri.greek (talk) 13:14, 1 February 2020 (UTC)
- It's international because its use and nominally its interpretation is worldwide. It's not an Americanist phonetic alphabet, or the Uralic Phonetic Alphabet. You don't have to mark every little detail, because that's very noisy and rarely useful. You don't have to mark stress in any language if you don't want to. Certain Bengali–Assamese languages separate t̪ and t̺, but most other languages don't distinguish them, and using the wrong one is a slight accent, not a major misstep, so we don't want to mark them in English and French.
- I don't have a strong opinion on marking stress, but I don't think it's productive to argue that the IPA should be used the same way for every language. --Prosfilaes (talk) 22:55, 5 February 2020 (UTC)
- There's also notable secondary stress in French on the antepenultimate syllable, rarely discussed but it's not always used. E.g. photographie is pronounced /fɔ.ˌtɔ.ɡʁa.ˈfi/. --Anatoli T. (обсудить/вклад) 23:14, 5 February 2020 (UTC)
- Thank you all. Excuse my ignorance. If there is no accent, I tend to stress all syllables like a robot voice. The same goes for the IPA of a string of words, where stresses may fall in different places. In Arabic, I tend to utter long vowels as stressed. The IPA, I feel, should be like a music score with clear instructions. If phoneticians say it is not correct to add stresses, they may have good reason. But it is not helpful. sarri.greek (talk) 17:28, 8 February 2020 (UTC)
Removing headers in categories for individual characters.
Sarang (talk • contribs) has asked me to write a small bot task to change [[Category:Han rad sup]] to [[Category:Han rad sup| ]], and likewise for various other categories where the entries are just individual characters. That way the category-page would have just one list, with no header, instead of having hundreds of singleton lists, where the header for each list is identical to the sole element of the list. (See Category:Han character radicals for an example of what it would look like after the change.)
Are y'all on board with this?
—RuakhTALK 07:28, 5 February 2020 (UTC)
- d'accord (as initiator) sarang♥사랑 12:01, 5 February 2020 (UTC)
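The proposed change can be sketched in a few lines. This is only an illustration, not the bot that was actually written: the function name `blank_sort_key` is hypothetical, and it assumes the category link appears in exactly the plain form `[[Category:Name]]`, with no existing sort key and no extra whitespace.

```python
import re

def blank_sort_key(wikitext: str, category: str) -> str:
    """Give every plain [[Category:<category>]] link a single-space sort key.

    Links that already carry a sort key (e.g. [[Category:Name|x]]) are left
    untouched, because the pattern requires the closing brackets to follow
    the category name directly.
    """
    pattern = re.compile(r"\[\[Category:" + re.escape(category) + r"\]\]")
    # A callable replacement avoids any backslash handling in the new text.
    return pattern.sub(lambda m: "[[Category:" + category + "| ]]", wikitext)


page = "Some entry text\n[[Category:Han rad sup]]\n"
print(blank_sort_key(page, "Han rad sup"))
```

With a space as the sort key, every member sorts under the same (blank) heading, which is what collapses the hundreds of singleton lists into one.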
What is durably archived?
According to CFI, "'attested' means verified through ... 2. use in permanently recorded media". However, there is no specification for what constitutes durably archived/permanently recorded media. Can we come up with more explicit guidelines? Ultimateria (talk) 20:05, 6 February 2020 (UTC)
- In 2012, I indexed as many discussions as I could find at that time at WT:T:CFI#Discussions of durability; that also includes a 2015 list of types of durable media that entries cite. The main category of media we currently/de facto cite is media that is archived/conserved by libraries and museums: written and printed media including handwritten manuscripts, books, magazines, journals, and newspapers, including text found in photographs printed in such media, but also recorded media (by studios/countries major enough that libraries archive it) including songs, films, and television shows, from which we've cited both words spoken aloud and words shown onscreen; as well as other conserved media which contains text, including a bowl, a disk, a spear-shaft, a horn (which notably no longer exists), and paintings. We also cite monumental inscriptions (that those two words could also be cited from e.g. books is beside the point), including runestones, other stones, and various steles, as well as ancient graffitied words; these inscriptions etc are in almost all cases documented in other (paper) written/printed media. The oddest duck out is Usenet, which we have considered durable because its decentralized method of archiving famously (as Scientologists found out) makes deleting content difficult/impossible. - -sche (discuss) 20:47, 6 February 2020 (UTC)
- One of a few open questions is whether putting an otherwise non-durable cite in the Internet archive thereby makes the archived copy durably archived. DCDuring (talk) 22:50, 6 February 2020 (UTC)
- Not any more, because they deindex content if a domain-owner changes and he requests it, it has been a scandal some years ago, and because the feds take things down from it; not only, even the leftist hivemind does, and I stress does, although still rarely, without necessarily going into details (which is of course irrelevant if you do not want to archive quotes for the newest racist neologisms). Headquarters San Francisco, all connected to the same ephemeral soup (and that can be relevant for anything archived, that the culture sustaining such computer systems disappears). Out-law archives like archive.today are reliable, being out of influences. It should just be multiple archives. But another question would be why one would not say, have ten internet-archived cites instead of “three durable” ones, whether such can count lesser; taking into account usual link rot rates and combining with archive power, and taking into account that different types of vocabulary are normatively not expected to appear in durable places in the first place, so readers do not expect quotes to be of such quality either. Fay Freak (talk) 05:11, 7 February 2020 (UTC)
- But I can't just show a picture of a bowl with some words on it and use it as a cite (right?). I'd have to get it accepted in a museum, etc. And certainly some random mp3 or student film wouldn't be citeable (right?), so not all songs and movies are allowed as cites either. Where's the line? 76.100.241.89
I guess a big question I have is whether content posted primarily online is durable. What about the hundreds of online magazines, like HuffPost? Even if a print newspaper has free access to its articles, we don't know which ones actually appear in print.
I also worry there's almost nothing durable online to document many minority languages. I've added quotes from Jornalet, an Occitan newspaper that doesn't seem to have a paper edition. I wonder if I should stop adding quotes from it, or does that only matter if I try to attest an RFV-ed term? I went to several bookshops in Toulouse a few years ago, and not one of them sold books in the language. A simple online newspaper is a precious asset for documenting the modern language. Ultimateria (talk) 06:30, 7 February 2020 (UTC)
- I definitely think some sort of lenience should be accorded minority languages and dialects that have had no significant literary culture (as in a diglossic situation where the written language has historically not been the spoken minority language). Currently for example dialectal terms attested only in dialectal dictionaries or insufficiently attested outside of them are also missing the boat, which is causing me to hold back a bit on adding Dutch dialectal terms. CFI in general probably needs some sort of revision to allow more words in languages and lects that 1) lack widespread attestation in the sources we currently privilege as being durable enough and/or 2) are often attested as mentions rather than usages, but I am not sure how to do this without compromising our ability to weed out protologisms and other spurious/non-dictionary-worthy words. — Mnemosientje (t · c) 16:28, 7 February 2020 (UTC)
- The Lower Sorbian weekly newspaper Nowy Casnik has both an online edition and a paper edition. I've subscribed to the paper edition (as well as the online edition), despite wasting paper, not because I can actually read it (my Lower Sorbian isn't nearly good enough for that) but simply to encourage them to continue printing the paper edition so that I can use it as a source of citations here on Wiktionary. If they ever stop printing the paper edition and go to online-only, I'd be worried that the cites would no longer be accepted. —Mahāgaja · talk 08:58, 8 February 2020 (UTC)
Portuguese Old Portuguese
Many Old Portuguese pages such as quarta feira have an untemplatized pseudo-label "Portugal". What is this supposed to mean? DTLHS (talk) 16:37, 7 February 2020 (UTC)
- Looking at the synonyms at that entry, I think it is to contrast the variant of Old Portuguese spoken in what we now call Portugal with the variant of Old Portuguese spoken in Galicia, but I'm no expert. — Mnemosientje (t · c) 16:40, 7 February 2020 (UTC)
- You still explained it correctly nonetheless. Old Portuguese is the ancestor of both Portuguese and Galician, so the label is telling us quarta feira was used in Portugal when Old Portuguese was spoken while mercores in Galicia. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 00:11, 20 February 2020 (UTC)
Allowing the header "Compounds" for select languages
Currently there are approximately 13,000 entries containing the header "Compounds". The overwhelming majority of them are entries for Han characters; about 2,600 are in other scripts, and these are mostly Finnish entries. This header is not currently allowed by Entry Layout.
In Chinese, Japanese, Korean, Okinawan and Vietnamese single-character entries in the Han script, the 'compounds' section is used to link to multicharacter entries containing the respective Han characters. Terms under this section may include those that do not derive regularly from their parts (e.g. they may be phonetically represented loans or may have irregular readings). The exact implementation of the header differs per language; it may be L3 or L4.
In Finnish entries the "Compounds" sections contain compounds as they are familiar to English speakers; it is used to separate compounds from other derived terms. It is subordinate to a POS header. Finnish editors use this header consistently for this purpose.
So the question arises whether to explicitly allow the header "Compounds" and for what languages. There is also the matter of only permitting it for (some of) the above languages or any language whose editors support using it. Then there is the issue of what position is allowed for the header. Personally I would suggest to leave standardisation of the placement of this header to editors in the respective languages, but feel free to raise it here as well.
The change would require an update of Entry Layout, so if there is sufficient support, the matter would have to come to a vote. ←₰-→ Lingo Bingo Dingo (talk) 11:13, 8 February 2020 (UTC)
- In what way do compounds differ from derived terms, that makes a separate header necessary? —Rua (mew) 12:54, 8 February 2020 (UTC)
- Compounds are very numerous, so it makes sense to separate them into their own header; this is also the standard in Finnish Wiktionary and in every other Finnish dictionary that I've seen even list either of them. Derived terms can then be reserved for terms derived through some other means, like suffixes or abbreviations. After all, deriving terms using such means and forming compounds are two very different processes. — surjection ⟨?⟩ 13:43, 8 February 2020 (UTC)
- Then why not subdivide the Derived terms section? I don't think we should have both sections side by side. —Rua (mew) 13:57, 8 February 2020 (UTC)
- I'm personally fine with subdividing it too, but EL doesn't currently allow it all, which is what this entire discussion is for. — surjection ⟨?⟩ 13:59, 8 February 2020 (UTC)
- EL doesn't prohibit having separate lists under a heading (eg. light#Derived_terms_4). There isn't necessarily a need to have a separate header for each list. If you mean something different, then please elaborate. -Mike (talk) 07:55, 9 February 2020 (UTC)
- @Moverton: For example, some Chinese characters are used phonetically, rather than semantically to "derive" a term, as 斯 in 高斯. Thus "derived terms" may sound inaccurate. 恨国党非蠢即坏 (talk) 09:11, 9 February 2020 (UTC)
- I was talking about using "Compounds" as a heading. — surjection ⟨?⟩ 16:55, 9 February 2020 (UTC)
How to best include a word in a suffix category
I wanted to include alliterative in Category:English words suffixed with -ive and I followed the example at festive with an etymology section that is "Equivalent to {{affix|en|alliteration|-ive}}." Is there a better way to include all of English's -ive words into this category? Seems a little ham-fisted but it works. Thoughts? —Justin (koavf)❤T☮C☺M☯ 22:37, 8 February 2020 (UTC)
- Erutuon has added the etymology that does include -ive, but for the record, I'm against using suffix categories to contain words simply ending in a string of letters if it doesn't reflect their etymology. Ultimateria (talk) 17:18, 10 February 2020 (UTC)
- @Ultimateria: I agree. Alliterative belongs in the category because it was formed in English, but, for instance, native doesn't because it was formed in Latin (nātus, the supine stem of nāscor, plus -īvus) and it can't be re-coined from English components.
- As an alternative to putting words like native in the "suffixed with -ive" category, there could be a category, or an appendix page, for English words whose origin includes Latin -īvus or a derivative of it. Most such words can be found by searching for English lemmas ending in -ive in User:Dixtosa's Toolforge site (warning: the search takes a while). I think that could be an interesting list, and that many people less experienced in etymology will interpret "suffixed with -ive" in that way, though I would rather have Category:English words suffixed with -ive and such categories stick with the more correct meaning of the phrase in which the word has been formed in English. — Eru·tuon 20:55, 10 February 2020 (UTC)
- I agree that words that couldn't be "recoined" in English, like "native", don't belong in the category. I am inclined to allow something like "equivalent to {{af|en|foobar|-ive}}" to put entries into the category in at least some cases where a word was not necessarily entirely formed in English per se, especially if it was simply formed in Middle English from the Middle English forms of foobar and -ive, or if it's something like "Marxism", discussed a while ago, where French marxisme was clearly at least adapted to if not calqued with Marx + -ism (hence the differing capitalization and missing -e). (Whereas, something like "rapprochement", discussed two years ago, indeed does not involve -ment.) - -sche (discuss) 22:29, 10 February 2020 (UTC)
- I'd prefer to add |nocat=1}} to affix templates in entries like regularidade with "equivalent to X + Y". I've only done it a handful of times, but I'm even more inclined to remove [[Category:X words suffixed with -Y]] from the bottom of a page. I'd keep the categorization for calques though. Ultimateria (talk) 05:38, 11 February 2020 (UTC)
- I normally reserve these categories for synchronic surface etymology, i.e. either reflecting how the word is formed, how it is analysed as being formed, or the modern descendants of the components from which it is formed. I only include it for derivational processes that remain productive in the language, so old suffixes that are no longer recognisable as such would not get an affix category/template. I use the same logic for "Derived terms" as well, since it's effectively the inverse of etymology: if the term has/could have an affix template, it can also appear as a derived term in each of its components. For this reason, I do not think that the verb fell should appear as a derived term of fall. While indeed, the former was formed from the latter, an affix template could not possibly be used to derive it: the formation occurred in pre-Proto-Germanic times and is most definitely not a productive means of deriving verbs in modern English. —Rua (mew) 21:38, 11 February 2020 (UTC)
Capitalization in language names
A few language names are currently capitalized incorrectly:
- maj: Jalapa De Díaz Mazatec → Jalapa de Díaz Mazatec
- mig: San Miguel El Grande Mixtec → San Miguel el Grande Mixtec
- too: Xicotepec De Juárez Totonac → Xicotepec de Juárez Totonac
--Lvovmauro (talk) 04:04, 9 February 2020 (UTC)
- I've fixed maj; the other two languages have entries, so a rename will require someone to update about a hundred pages. - -sche (discuss) 05:59, 9 February 2020 (UTC)
- @-sche, Lvovmauro Fixed the other two. I looked for pages such as translation tables that contain the old names in their text by searching the Feb 1 dump, so if there are any translation tables or similar pages with the old names added since then, they won't have been corrected. This doesn't apply to lemmas and non-lemma forms in these languages or to categories, because I used up-to-date lists. Benwing2 (talk) 20:59, 16 February 2020 (UTC)
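For spotting similar names in bulk, a rough check along the following lines might help. It is a sketch only: the `PARTICLES` set and the `fix_particles` name are assumptions made for illustration, and any candidate it flags would still need a human to confirm the correct form.

```python
# Assumed, illustrative set of Spanish particles that stay lowercase inside a name.
PARTICLES = {"de", "del", "el", "la", "los", "las"}

def fix_particles(name: str) -> str:
    """Lowercase known particles everywhere except in the first word."""
    words = name.split(" ")
    fixed = [
        word.lower() if index > 0 and word.lower() in PARTICLES else word
        for index, word in enumerate(words)
    ]
    return " ".join(fixed)


for old in (
    "Jalapa De Díaz Mazatec",
    "San Miguel El Grande Mixtec",
    "Xicotepec De Juárez Totonac",
):
    print(old, "→", fix_particles(old))
```

The first word is deliberately left alone so that a name beginning with an article keeps its capital.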
Positioning of "&lit" entries
Where should "&lit" entries ("Used other than with a figurative or idiomatic meaning") be placed? Should they go at the start of the definitions or at the end? I don't have a strong view myself, but I have seen both styles used, and ideally we should make it consistent. Please express your opinion as to the best position. Mihia (talk) 11:32, 10 February 2020 (UTC)
- I usually move them to the bottom, because they are the most obvious and least interesting. (Otherwise why wouldn't we include &lits on almost any multi-word phrase? I imagine most can be attested in the non-idiomatic usage, e.g. cut the cheese.) Even better in most cases just omit them. Equinox ◑ 15:03, 10 February 2020 (UTC)
- Quiz question: what do you call the residue left after cutting the cheese? Here people line up to pass water to extinguish a fire. :) --Lambiam 20:20, 10 February 2020 (UTC)
- I think it's useful to have them when an idiom is also a common, but SOP, collocation, since people might expect to find the collocation defined in the entry. It might also be helpful to note in such entries how common the literal sense is relative to the idiom. There's nothing in the entry for cut the cheese, for instance, that tells non-native speakers that it's perfectly fine and understandable to use the expression in its literal sense. Conversely, Netflix and chill shows usage in the literal sense (which is placed first in the verb section), but there's nothing to indicate to a non-native speaker that most people will understand the phrase in the sexual sense. Andrew Sheedy (talk) 15:49, 10 February 2020 (UTC)
- For your first example, you could've fixed that by adding {{&lit}} (and I just did). For your second, I wonder if you're right — few would use it outside of that sense, but much of my US family would definitely not understand it as anything but literal. —Μετάknowledgediscuss/deeds 16:47, 10 February 2020 (UTC)
- Thanks, I guess I should have added it. It still isn't enough information in itself though. It just tells the user that the literal form exists, not how common it is relative to the idiom. A usage note would do a better job of that, especially since it could explain regional/generational differences. Andrew Sheedy (talk) 01:35, 11 February 2020 (UTC)
- Since it's 2016 again, here is a high-profile literal use (that drew some prescriptivist comment), so I strongly doubt the idiomatic sense is understood very widely. ←₰-→Lingo Bingo Dingo (talk) 09:38, 11 February 2020 (UTC)
- I agree with Eq — to the bottom they go. —Μετάknowledgediscuss/deeds 16:47, 10 February 2020 (UTC)
- It depends on the frequency (the same argument as at many other occasions where order is in question). Since cut the cheese is likely and meseems more likely to be meant literally (because people eat and cut cheese and talk about it in food-related content copiously; never encountered the idiom) it should go to the top. Another argument is the chronology. Fay Freak (talk) 20:04, 10 February 2020 (UTC)
- I agree in principle that &lit ought to come first when the literal sense is more common than the idiomatic sense, and vice versa mutatis mutandis, but proving which is more common can be difficult. Fay Freak never encountered the idiom cut the cheese, whereas when I was 8 years old that was the normal term for fart in my peer group. (I have no idea whether 8-year-olds still say "cut the cheese" now over 40 years later, or whether adults ever say it.) And still to this day my inner 8-year-old snickers when somebody uses the phrase in its literal sense, and I find myself deliberately saying "slice the cheese" so as to avoid the idiom. —Mahāgaja · talk 20:29, 10 February 2020 (UTC)
- I agree with Fay Freak that frequency is the ideal standard; skimming a few pages on Google Books often gives a decent indication. If that doesn't give a clear result, putting it at the bottom can be the alternative. ←₰-→Lingo Bingo Dingo (talk) 09:38, 11 February 2020 (UTC)
- In many of our English phrasal verb entries, many of the definitions that we have or formerly had should be subsumed under {{&lit}}, preferably with usage examples. The number of NISoP definitions advanced as idiomatic is strong evidence of a need to set {{&lit}} first, without regard to considerations of frequency. DCDuring (talk) 20:32, 11 February 2020 (UTC)
- In cases where there are multiple idiomatic meanings, which is not uncommon, including "&lit" in a by-frequency ordering scheme could potentially put it in the middle of the list. I'm not sure how that would look. Mihia (talk) 21:04, 11 February 2020 (UTC)
Do we want this? It's currently added by {{head|en|obsolete verb form}} (an "unrecognized POS"), and often kept out of its parent category and also Category:English past participles by |nocat=1 being used on the {{term-label|en|obsolete}} and {{past participle of}} context and "definition" templates of the few entries in the category. I added a few entries to the category on the model of the few that were already there, but then I started wondering if it was worthwhile. A lot more entries could go in it... or should we just leave them in Category:English obsolete terms? - -sche (discuss) 22:13, 10 February 2020 (UTC)
- I removed the 10-20 entries that were in the category. But it does occur to me that labelling all obsolete verb forms with {{term-label|en|obsolete}} would swamp Category:English obsolete terms. Archaic verb forms seem to be categorized into Category:English archaic third-person singular forms (1532 entries) and Category:English second-person singular forms (2207 entries) via specific definition-line templates, I suppose obsolete participles (etc) should go in their own category through their own template. - -sche (discuss) 18:47, 12 February 2020 (UTC)
- I've set up Template:en-obsolete past participle of and Category:English obsolete past participles on the model of Template:en-archaic third-person singular of and its category, leaving Category:English obsolete verb forms as a parent category like Category:English archaic verb forms. (I suppose obsolete simple past tense forms could have their own category or they and these could be merged into one category and template for "obsolete past tense form"s.) Entries can now use a "normal" POS, solving the "unrecognized POS" issue, and don't swamp Category:English obsolete terms. - -sche (discuss) 22:13, 12 February 2020 (UTC)
Deleting empty categories
We currently have 5,670 empty categories listed in Category:Empty categories. I'd like to delete some of them. I think User:Suzukaze-c at one point deleted all empty categories, and someone apparently then objected, because they were all restored; but if I'm somewhat selective, maybe this complaint won't occur. Here, for example, are all the empty topic categories that occur more than once:
600 :Mammalogy
406 :Birdwatching
375 :Horticulture
367 :Ornithology
305 :Entomology
276 :Herpetology
236 :Ichthyology
130 :Zoology
124 :Medicine
114 :Arachnology
89 :Malacology
78 :Timekeeping
70 :Air
53 :Mycology
35 :Bodily functions
31 :Mineralogy
24 :Lichenology
18 :Toxicology
16 :State capitals of Brazil
15 :Skeleton
14 :Municipalities of Distrito Federal, Brazil
13 :Arthropodology
12 :Kitchenware
12 :Animal tissues
10 :Cleaning
9 :Phycology
8 :Earth sciences
7 :Forms of government
6 :Polities
5 :Neurology
5 :Liquids
5 :Light sources
5 :Hymenopterans
5 :Hominids
5 :Combustion
5 :Bacteriology
4 :Time
4 :Lifeforms
4 :Electromagnetism
4 :Body parts
4 :Anatomy
3 :Weather
3 :Viruses
3 :Virology
3 :State capitals of the United States
3 :Regions of China
3 :Political subdivisions
3 :Mind
3 :List of sets
3 :Linguistics
3 :Light
3 :Industries
3 :Face
3 :Districts of the Philippines
3 :Districts of South Korea
3 :Cities in Pakistan
3 :Calendar terms
2 :Worms
2 :Western Sahara
2 :Water
2 :Vision
2 :Vertebrates
2 :United States
2 :Turtles
2 :Tupi mythology
2 :Travel
2 :Symbols
2 :Seasons
2 :Religion
2 :Provinces of France
2 :Proteales order plants
2 :Pome fruits
2 :Places
2 :Pigs
2 :Oaks
2 :Numbers
2 :Nonverbal communication
2 :Natural materials
2 :Metallurgy
2 :Letter names
2 :Fungi
2 :Four
2 :Foods
2 :Flowers
2 :Female animals
2 :Felids
2 :Fantasy
2 :Dogs
2 :Districts of Norway
2 :Districts of India
2 :Districts of China
2 :Districts of Canada
2 :Cutlery
2 :Countries
2 :Counties of the United States, USA
2 :Computing
2 :Chemistry
2 :Canids
2 :Boroughs in England
2 :Body
2 :Birds
2 :Beekeeping
2 :Ball games
2 :Arthropods
2 :Art
2 :Alcoholic beverages
Any objections to me deleting all of these empty categories (or, for that matter, all empty topic categories)? Note that every 3 days or so I do a bot run and automatically create all non-empty categories in Special:WantedCategories that can be defined using {{auto cat}} and aren't already created, so mistakenly deleted categories won't be empty for long.
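A tally like the one above could be produced roughly as follows. This is a sketch under stated assumptions, not the script that generated the list: it assumes topic categories are titled `Category:<language code>:<Topic>`, that the titles are already available as a plain list, and the function name `tally_topic_labels` is hypothetical.

```python
from collections import Counter

def tally_topic_labels(titles):
    """Count how often each topic label occurs across language codes.

    Only titles of the assumed form 'Category:<code>:<Topic>' are counted;
    anything else (e.g. 'Category:English lemmas') is skipped. Labels that
    occur only once are dropped, mirroring the list in the post.
    """
    counts = Counter()
    for title in titles:
        parts = title.split(":", 2)
        if len(parts) == 3 and parts[0] == "Category":
            counts[parts[2]] += 1
    return [(label, n) for label, n in counts.most_common() if n > 1]


sample = [
    "Category:aa:Horticulture",
    "Category:zza:Timekeeping",
    "Category:gem-pro:Mammalogy",
    "Category:en:Mammalogy",
    "Category:English lemmas",
    "Category:fr:Horticulture",
    "Category:de:Mammalogy",
]
for label, n in tally_topic_labels(sample):
    print(n, ":" + label)
```

A non-topic category whose name happens to contain a colon would be miscounted by this simple split, so the real input would need filtering first.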
Some other stats:
- 558 empty categories of the form "Foo terms derived from Bar" (and 9 "Terms derived from Bar")
- 152 empty categories of the form "Foo terms borrowed from Bar" (and 8 "Terms borrowed from Bar")
- 12 empty categories of the form "Foo terms inherited from Bar" (and 1 "Terms inherited from Bar")
- 103 empty categories of the form "Foo words prefixed with bar-"
- 175 empty categories of the form "Foo words suffixed with -bar"
- 32 empty categories of the form "Foo terms belonging to the root B A R" (of which 31 are Arabic and 1 is Maltese)
- 29 empty categories of the form "Foo form-of templates"
- 9 empty categories of the form "Foo inflection-table templates"
- 8 empty categories of the form "Foo entry templates"
- 4 empty categories of the form "Foo reference templates"
- 18 empty categories of the form "Foo lemmas"
- 6 empty categories of the form "Foo non-lemma forms"
- 7 empty categories of the form "Foo verb forms"
- 5 empty categories of the form "Foo noun forms"
- 7 empty categories of the form "Foo noun plural forms"
- 7 empty categories of the form "Foo adjective comparative forms"
- 4 empty categories of the form "Foo adjective superlative forms"
- 3 empty categories of the form "Foo adverb comparative forms"
- 4 empty categories of the form "Foo adverb superlative forms"
- 35 empty categories of the form "Foo rare forms"
- 6 empty categories of the form "Foo figures of speech"
- 19 empty categories of the form "Foo *** verbs", where *** is "causative", "frequentative", "perfective", "imperfective", "transitive", "intransitive", "deponent", "auxiliary", etc.
[etc.]
Any objection to me deleting all of the above empty categories? Benwing2 (talk) 03:42, 11 February 2020 (UTC)
- It was User:Wyang. I'm not an admin. —Suzukaze-c◇◇ 04:28, 11 February 2020 (UTC)
- Wiktionary:Grease pit/2019/May#Empty categories —Suzukaze-c◇◇ 04:28, 11 February 2020 (UTC)
- There's also Special:UnusedCategories, which is generated automatically from the database, though because it can only list 5,000 results, it's probably missing some of the members of Category:Empty categories. I'm in favor of deleting all empty non-maintenance categories, including all the ones you listed. Incidentally, we can keep empty maintenance categories out of Special:UnusedCategories with __EXPECTUNUSEDCATEGORY__, which I've added to Module:category tree/requests. — Eru·tuon 09:21, 11 February 2020 (UTC)
- Automatic creation of "wanted" categories and deletion of empty categories is very useful for temporarily useful categories such as those in Category:Requests for date by source. DCDuring (talk) 03:20, 13 February 2020 (UTC)
- I'm curious about how we have almost 6,000 empty categories to begin with. Equinox ◑ 21:09, 11 February 2020 (UTC)
- Many of the most numerous ones were categories that existed only because another category was placed in them by the category tree system. These are mostly categories for terms used in specific scientific fields, and therefore probably will never have entries in the majority of languages. For example, Category:gem-pro:Mammalogy is extremely unlikely to ever contain a single entry, and it was only created to hold Category:gem-pro:Mammals. For that reason, I removed them as parent categories. In general, to avoid cases like this, I think we should avoid subcategorising categories for things into scientific fields dedicated to studying those things. —Rua (mew) 21:47, 11 February 2020 (UTC)
- To respond to Benwing: fine by me. Canonicalization (talk) 22:17, 11 February 2020 (UTC)
- OK, I am soon going to delete all empty topical categories, of which there are currently 3905, from Category:aa:Horticulture to Category:zza:Timekeeping. After that I'll work on the other empty categories, taking care to exclude maintenance categories (if there are any listed under Category:Empty categories). Benwing2 (talk) 03:47, 12 February 2020 (UTC)
I have deleted all the topic categories as well as many other categories. I've taken care not to delete any maintenance categories. However, I notice that Special:UnusedCategories has a ton of request categories, e.g.:
Page 4464 Category:Requests concerning Atorada: Processing
Page 4465 Category:Requests concerning Aushiri: Processing
Page 4466 Category:Requests concerning Baniwa: Processing
Page 4488 Category:Requests for Brahmi script for Sogdian terms: Processing
Page 4489 Category:Requests for Canadian syllabics script for Cree terms: Processing
Page 4490 Category:Requests for Cuneiform script for Hurrian terms: Processing
Page 4525 Category:Requests for aspect in Russian entries: Processing
Page 4526 Category:Requests for attention concerning Aiwoo: Processing
Page 4527 Category:Requests for attention concerning Akkadian: Processing
Page 4528 Category:Requests for attention concerning Algerian Arabic: Processing
Page 4570 Category:Requests for audio pronunciation in Bella Coola entries: Processing
Page 4571 Category:Requests for audio pronunciation in Greek entries: Processing
Page 4572 Category:Requests for audio pronunciation in Vietnamese entries: Processing
Page 4573 Category:Requests for clarification of definitions in Alutor entries: Processing
Page 4574 Category:Requests for clarification of definitions in Bulgarian entries: Processing
Page 4575 Category:Requests for clarification of definitions in Cebuano entries: Processing
Page 4598 Category:Requests for cleanup in Albanian entries: Processing
Page 4599 Category:Requests for cleanup in Bulgarian entries: Processing
Page 4600 Category:Requests for cleanup in Cebuano entries: Processing
Page 4635 Category:Requests for date/Archbishop Newcome: Processing
Page 4636 Category:Requests for date/Black Eyed Peas: Processing
Page 4637 Category:Requests for date/Charles Churchill: Processing
Page 4661 Category:Requests for definitions in Bashkir entries: Processing
Page 4662 Category:Requests for definitions in Central Nahuatl entries: Processing
Page 4663 Category:Requests for definitions in Chuvash entries: Processing
Page 4694 Category:Requests for deletion in Amharic entries: Processing
Page 4695 Category:Requests for deletion in Armenian entries: Processing
Page 4696 Category:Requests for deletion in Balinese entries: Processing
Page 4740 Category:Requests for etymologies in Akkadian entries: Processing
Page 4741 Category:Requests for etymologies in Bactrian entries: Processing
Page 4742 Category:Requests for etymologies in Baluchi entries: Processing
Page 4771 Category:Requests for example sentences in Amharic: Processing
Page 4772 Category:Requests for example sentences in Arabic: Processing
Page 4773 Category:Requests for example sentences in Danish: Processing
Page 4782 Category:Requests for expansion of etymologies in Armenian entries: Processing
Page 4783 Category:Requests for expansion of etymologies in Azerbaijani entries: Processing
Page 4784 Category:Requests for expansion of etymologies in Bulgarian entries: Processing
Page 4798 Category:Requests for gender in Bodo (India) entries: Processing
Page 4799 Category:Requests for gender in Central Kurdish entries: Processing
Page 4800 Category:Requests for gender in Frankish entries: Processing
Page 4816 Category:Requests for images in Breton entries: Processing
Page 4817 Category:Requests for images in Cornish entries: Processing
Page 4818 Category:Requests for images in Finnish entries: Processing
Page 4832 Category:Requests for inflections in Arabic adjective entries: Processing
Page 4833 Category:Requests for inflections in Arabic noun entries: Processing
Page 4834 Category:Requests for inflections in Armenian noun entries: Processing
Page 4877 Category:Requests for native script for Abaza terms: Processing
Page 4878 Category:Requests for native script for Abkhaz terms: Processing
Page 4879 Category:Requests for native script for Adyghe terms: Processing
Page 4968 Category:Requests for pronunciation in Afrikaans entries: Processing
Page 4969 Category:Requests for pronunciation in Akkadian entries: Processing
Page 4970 Category:Requests for pronunciation in Albanian entries: Processing
Almost all of these are now handled by {{auto cat}}, meaning that my bot will automatically recreate them as necessary. For this reason, I don't see a reason to maintain them as empty categories, and I think we should delete them. This applies doubly so to categories like Category:Requests for date/Archbishop Newcome, which can multiply arbitrarily (we already have 1,696 of these categories). What do people think? Benwing2 (talk) 05:55, 13 February 2020 (UTC)
- I just see the deleting and resurrecting as extra work that's not strictly necessary since the categories can be kept out of Special:UnusedCategories with __EXPECTUNUSEDCATEGORY__. All of the categories handled by Module:category tree/requests should be out of the list the next time it's updated, and any other empty maintenance categories that might at some time be repopulated (aren't obsolete or incorrect) can be removed as well without deleting them. This is how Wikipedia handles maintenance categories; see for instance Category:Lang and lang-xx template errors, which is currently empty but is kept out of w:Special:UnusedCategories by w:Template:empty category. (A sort of preview of the next edition of Special:UnusedCategories can be gotten by opening the full list and running a JavaScript snippet in the console to hide redlinked categories and those with "request" in them: Array.from(document.getElementsByClassName('special')[0].getElementsByTagName('li')).filter(e => /request/i.test(e.textContent) || e.getElementsByClassName('new').length !== 0).forEach(e => e.style.display = 'none').) — Eru·tuon 07:00, 13 February 2020 (UTC)
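For anyone who wants to try the same preview logic outside the browser console, here is a minimal sketch of the filter as a plain function. It assumes each list item has already been reduced to a hypothetical { title, isRedlink } object; that shape and the function name are illustrative only and are not part of MediaWiki or of the snippet above.

```javascript
// Sketch only: mirrors the console snippet's test, but on plain objects
// instead of DOM nodes. The { title, isRedlink } shape is an assumption.
function previewUnusedCategories(entries) {
  // Keep an entry only if it is neither a "request" category nor a redlink,
  // i.e. the complement of what the console snippet hides.
  return entries.filter(
    (entry) => !/request/i.test(entry.title) && !entry.isRedlink
  );
}

// Made-up sample input using category names mentioned in this thread.
const sample = [
  { title: "Category:Requests for cleanup in Albanian entries", isRedlink: false },
  { title: "Category:gem-pro:Mammalogy", isRedlink: true },
  { title: "Category:English non-constituents", isRedlink: false },
];

const remaining = previewUnusedCategories(sample).map((entry) => entry.title);
console.log(remaining);
```

Only entries that are neither request categories nor redlinks survive the filter, which is the part of the list still worth reviewing by hand.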
Phrase v noun, adjective, etc
Time was when initialisms had "Initialism" as a heading; now we generally/always use the appropriate part of speech. What about phrases? Is there a preference for "Phrase" (e.g. and whatnot) as the heading, or "Adverb", etc. (e.g. one way or another)? — Saltmarsh. 11:57, 12 February 2020 (UTC)
- I do not know if we have a stated preference, but I prefer the more specific POS header, so that nice as pie and open and affirming are classified as adjectives, by a landslide as an adverb, and blue screen of death and moment in the sun as nouns. The more specific POS is also desirable for cases like but and ben, which can be used as a noun and as an adverb, and has different definitions depending on the POS. IMO the header “Phrase” should be used for phrases that can stand on their own, such as sayings and proverbs, or else as a last resort. --Lambiam 14:30, 12 February 2020 (UTC)
- I also prefer the more specific PoS headers. But sometimes we have multiword expressions that do not fit traditional parts of speech. In some cases they aren't even phrases. They are "non-constituents". See Category:English non-constituents and Category:English coordinates. There are undoubtedly more, but these categories have to be hard-coded. DCDuring (talk) 20:57, 12 February 2020 (UTC)
- Thank you both — that is the way I felt we were going. — Saltmarsh. 06:02, 13 February 2020 (UTC)
- There are a smallish number of entries remaining with "Abbreviation" or "Initialism" as a heading. They can be found at User:Erutuon/abbreviation headers. I've spent the last year or more trying to get that list down to 0, and have to admit that a few times if it didn't seem obvious what the SOP was, I just said "to hell with it" and called it a "Phrase" - especially for languages that I had no idea about. --AcpoKrane (talk) 10:20, 13 February 2020 (UTC)
"Written Standard Chinese"
If you will take a glance at the way that the zh-wp module displays on the China (中國/中国 (Zhōngguó)) page, you will see that eight Wikipedias are linked. This is fine. My only problem is, one of the Wikipedias is labeled as "Written Standard Chinese". List of Wikipedias calls that Wikipedia version 'Chinese'- and I changed it to Mandarin Chinese just now ([1]). Who's right? What language is that Wikipedia written in? I will post a similar question on the List of Wikipedia pages talk page. ([2]) --Geographyinitiative (talk) 00:14, 13 February 2020 (UTC)
I tried to find a sort of compromise position ("Chinese (Written vernacular Chinese, a form of Mandarin Chinese)") --Geographyinitiative (talk) 00:25, 13 February 2020 (UTC)
- It is very sad that there isn't a unified cross-wiki term for what that Wikipedia version's language is to be called. --Geographyinitiative (talk) 00:28, 13 February 2020 (UTC)
- @Geographyinitiative: The written form is w:Written vernacular Chinese. Since it is possible to read vernacular Chinese texts with almost any varieties of Chinese, the colloquial form is technically undetermined. But it is most likely w:Standard Chinese, on which the written form's vocabulary and grammar is based. 恨国党非蠢即坏 (talk) 10:05, 13 February 2020 (UTC)
- @恨国党非蠢即坏
"Since it is possible to read vernacular Chinese texts with almost any varieties of Chinese, the colloquial form is technically undetermined."
我們吃中飯.
Mandarin (so-called Standard Chinese): Wǒmen chī zhōngfàn. -Valid sentence
hypothetical Hokkien (Min Nan): Ngó͘--bûn khit / khek / khiak tiong pn̄g. - Not a valid sentence.
If so-called Standard Chinese (a standardized form of Mandarin Chinese) is the language of that Wikipedia, then it can be called Standard Chinese because all the Wikipedias are composed in written form- no need to say "Written Standard Chinese"- of course they are writing it- it's an encyclopedia.
So if the adjective is being used to describe Standard Chinese, then it's worthless, but if it's part of some kind of Orwellian proper noun like "Written vernacular Chinese", then it may be justified- but it's not being used that way.
The term Standard Chinese is a Mainland term for a specialized, official form of Mandarin.
The English language term that encompasses the language used on that Wikipedia is Mandarin Chinese. --Geographyinitiative (talk) 11:16, 13 February 2020 (UTC)
- @Geographyinitiative: I was just giving Wikipedia entry links. No point arguing for the names with me. 恨国党非蠢即坏 (talk) 14:51, 13 February 2020 (UTC)
- It is grammatically Mandarin with heavy literary influence (unless grammatical structures like 稱其爲XX are valid colloquial language now?). —Suzukaze-c◇◇ 16:18, 13 February 2020 (UTC)
- @Geographyinitiative: Millions of people in Hong Kong write "Chinese" (which is based on Mandarin) but read the text in Cantonese. You can say it's Mandarin, but every character can be read in Cantonese. So is it Mandarin or Cantonese? Different varieties of Chinese can do this to different extents. I remember being in Fujian and hearing people read "Chinese" texts (Written Standard Chinese) in Min Dong. — justin(r)leung { (t...) | c=› } 01:49, 14 February 2020 (UTC)
- @Justinrleung, Suzukaze-c, 恨国党非蠢即坏 This is a really interesting question to me. I think the inconsistency between us and Wikipedia shows that there is a major problem- we had to make up a term for a form of language that does not have a Wikipedia page and link to a page with a different name in order to describe the language on zh.wikipedia. I don't know how to solve this Gordian knot, but I think the inconsistency in naming that I'm pointing out here is one of the 'cracks' that's letting a little light through from the truth: is that Wikipedia version rightly called Chinese Wikipedia? Should they change the name? Why would we need to have different names for Wikipedia versions than what are used on Wikipedia? Why can't zh-wp just call zh "Chinese"? The reason we have an alternate name for zh.wikipedia is because the term 'Chinese Wikipedia' is so morally obscene that people that write a dictionary like us could never blithely accept it. --Geographyinitiative (talk) 00:24, 15 February 2020 (UTC)
- "Morally obscene"? How did I know you would arrive at something like this? Any angle to get a shot at a certain long-deceased equine, you go for it. What does surprise me is that you would think such transparently manipulative stunts would change anyone's mind. Chuck Entz (talk) 00:51, 15 February 2020 (UTC)
- Oh, fer Pete's sake. This again.
- This is akin to complaining that "1, 2, 3, 4" are called "Arabic numerals", or that "I, II, III, IV" are called "Roman numerals", even though we read it out in English as "one, two, three, four". Give it a rest. There are numerous editors here who are native speakers of various kinds of Chinese, and yet you, as apparently a native English speaker, are the one who keeps coming back to this topic, with strange arguments about morality that none of the rest of us can understand. Let it go. You're not making sense, and you don't have any lines in this play anyway. ‑‑ Eiríkr Útlendi │Tala við mig 23:47, 18 February 2020 (UTC)
Inclusion of reconstructed pronunciations of Proto-Indo-European words
While on the talk page of a user to check on an unrelated conflict, I became aware that an IP user (who has since been blocked) added some of Don Ringe's reconstructed pronunciations of Proto-Indo-European words, and that these were being reverted. I, for one, would very much like to have those reconstructed pronunciations, so I reverted some of the reverts and explained my reasoning.
One reason I support inclusion of these reconstructed pronunciations is that one has been included on the entry for *méh₂tēr for many years, apparently since this edit in 2014. I've always thought that having this reconstructed pronunciation there was like a refreshing drink of cold water. When researching PIE, one is often confronted with baffling jumbles of laryngeals and syllabic sonorants which seem unpronounceable by actual human beings. Seeing Ringe's [máx.tɛːr] reconstruction was refreshing, because it allows me to imagine what the word actually sounded like, without having to constantly refer back to the Laryngeal theory and PIE phonology articles, and without imagining that these ancient people sounded like dial-up modems.
When I mentioned the helpfulness of the *méh₂tēr reconstructed pronunciation, the user Rua deleted the reconstruction, despite its having been there for five years. I reverted the deletion, but my revert was reverted by the user Surjection, who suggested that I come here, which brings us to the present moment.
Why was the reconstructed pronunciation for *méh₂tēr allowed to exist for so many years, if there is supposedly some rule that we can't have it? Did people just forget that it was there?
I, for one, fully support the inclusion of reconstructed pronunciations on PIE entries. I suggest the following:
- We should include the reconstructed pronunciation, but put it under a heading that says Reconstructed pronunciation rather than merely Pronunciation
- There could be multiple reconstructions, if sourced by multiple linguists. Why not?
In anticipation of an objection: yes, I understand that these pronunciations are tentative and not secure. But, do you really want to tell me with a straight face that the verb conjugation tables we have on basically every verb entry are any more secure? Can you really claim that, for example, *ǵn̥h₃sḱóyh₁m̥ is so much more secure than Ringe's pronunciation reconstructions, such that the former should be allowed on the page for *ǵn̥h₃sḱéti, but that the latter should be deleted? Linguists can't even agree on verb endings! If we can have huge, precise verb conjugation tables, but we can't have reconstructed pronunciations, I have to ask: why the double standard?
Therefore, I suggest that the years-old *méh₂tēr pronunciation reconstruction be restored. Furthermore, I suggest that we allow properly-sourced pronunciation reconstructions of PIE entries to be added. BirdValiant (talk) 00:42, 14 February 2020 (UTC)
- Note: here are the rest of the Proto-Indo-European IPA transcriptions in {{IPA}} as of the first of the month, from this search on Templatehoard: *bʰréh₂tēr: {{IPA|ine-pro|[b̤ráx.tɛːr]}}; *h₂éwis: {{IPA|ine-pro|[xáwis]}}; *méh₂tēr: {{IPA|ine-pro|[máx.tɛːr]}}; *nisdós: {{IPA|ine-pro|[niz.dós]}}; *pénkʷe: {{IPA|ine-pro|[péŋ⁽ʷ⁾.kʷe]}}. — Eru·tuon 01:40, 14 February 2020 (UTC)
- I strongly agree. For noobs like me, the pronunciation is the only window into how these words might actually have sounded. I think it's pretty clear that if the word is reconstructed, the pronunciation must be as well, so I see no harm in having them in the entry. After all, it's the pronunciation that's ultimately being reconstructed, not the orthography! It seems entirely counterintuitive that we wouldn't allow pronunciation. Andrew Sheedy (talk) 01:42, 14 February 2020 (UTC)
- I agree too. I don't see any reason to not have reconstructed prons for reconstructed words. The whole point of the complex system of PIE orthography is to indicate pronunciation, but as IPA is familiar to many more people than PIE orthography, it can only be helpful to give IPA renderings, theoretical as they are. - Sonofcawdrey (talk) 07:56, 14 February 2020 (UTC)
- Oppose The pronunciation of PIE is far too disputed for inclusion. --{{victar|talk}} 08:03, 14 February 2020 (UTC)
- @Victar: Per my anticipated objection in my OP: Should we delete all the verb conjugation tables then, seeing as how many of the verb endings are also in dispute? BirdValiant (talk) 08:21, 14 February 2020 (UTC)
- This because that is not a good counter argument. I'm voting on pronunciations, not inflection tables. --{{victar|talk}} 08:25, 14 February 2020 (UTC)
- @Victar: And how is that an argument? If you think that a detail of the PIE reconstruction being disputed is sufficient to prevent inclusion, and if PIE verb conjugation is also disputed, then doesn't that mean that PIE verb conjugation should also be excluded? BirdValiant (talk) 08:33, 14 February 2020 (UTC)
- You're creating a false equivalence argument. All reconstructions can be disputed, thus is their nature, but we can decide what level of disputability we accept. I hold that PIE pronunciations exceed that level. --{{victar|talk}} 08:54, 14 February 2020 (UTC)
- If well-sourced and room is given for multiple interpretations, then I am inclined to think it could be a net positive, but I have my reservations. I don't agree with the idea of changing the header to "Reconstructed pronunciation", for one. If we were to use that header, we might as well use it for all pronunciations of dead languages, all the way up to Middle English, as all those pronunciations are technically speaking reconstructed as well. Instead of a different header, a generic disclaimer in the pronunciation section noting the controversial status of PIE pronunciations would be better, I think. Whatever we do, reliable sources are a sine qua non for this; as Victar noted, these pronunciations are very much disputed so in this case we can't include our own semi-original interpretations of others' research (as is the de-facto standard for e.g. Proto-Germanic pronunciations). — Mnemosientje (t · c) 08:51, 14 February 2020 (UTC)
- Oppose more or less for the same reasons Victar gave. There are too many open-ended questions regarding PIE pronunciation, not just for laryngeals but also for the velar-palatovelar distinction. The argument of sourcing makes theoretical sense, but in practice it means that we would end up with everyone and their dog's reconstructed pronunciation all on the same page. And if the page contains only e.g. Ringe's pronunciation, it would appear to give preference to Ringe simply because nobody else has published one yet. There is a reason that most linguists don't bother with describing PIE pronunciation in too much detail. —Rua (mew) 10:22, 14 February 2020 (UTC)
- I think it is rather ridiculous to present narrow transcriptions, and bad to present any phonetic transcriptions without a clear warning that these are merely informed – but largely speculative – guesses. So consider this an Oppose. --Lambiam 10:45, 14 February 2020 (UTC)
- Re: "... without a clear warning that these are merely informed – but largely speculative – guesses": Why not include them with a clear warning that these are merely informed – but largely speculative – guesses. (I do not know enough about PIE to prevent applying the same predicate to PIE itself, viz that PIE is a collection of "merely informed – but largely speculative – guesses".) --Dan Polansky (talk) 11:07, 14 February 2020 (UTC)
- We do have a rather clear idea of the set of phonemes of PIE if you are willing to consider them as abstract units. The problem is the phonetic interpretation of these units. Dramatic sound changes can occur in a relatively short period. For example, Old Persian *vr̥da- (“rose”) – which may be the etymon of our rose – changed by a regular process into Classical Persian گل (gul). Several millennia elapsed between the genesis of PIE and the appearance of written texts in an IE language. Much of what we know about the pronunciation of Classical Latin is through contemporary authors discussing proper pronunciation, and additionally by spelling mistakes showing that two words had similar pronunciations. --Lambiam 13:05, 14 February 2020 (UTC)
- Why don't we use phonemic transcriptions instead of phonetic ones, then, abandoning IPA [] in favor of IPA //? (I do realize that the examples given in this thread are narrow, e.g. [xáwis], and I accept the point that one should not be too specific in something which is rather uncertain anyway.) --Dan Polansky (talk) 20:04, 14 February 2020 (UTC)
- The trouble with giving only a phonemic transcription is that we'd be forced to only show sequences like *eh₂, because these are the underlying phonemes. IPA uses [] brackets for phonetic transcription, and // brackets for phonemic transcription. However, phonetic transcription is not required to be precise. My objection to only using a phonemic transcription is that readers who are not intimately familiar with both the PIE phonology and Laryngeal theory wiki articles would have no idea that *eh₂ probably always sounded like [a]. The bare h₁/h₂/h₃ notation also gives zero indication that *h₃ was almost certainly rounded and that it has o-coloring effects. Zero indication would be given that *s was probably pronounced like [z] in situations like *nisdós. Finally, readers are also expected to know that *y is really just IPA /j/. In short, the lack of any suggestion of pronunciation means that all PIE entries have a steep learning curve. Readers are forced to refer back and forth to other articles; if they don't, they're basically left looking at a bunch of hieroglyphics, with the impression that spoken PIE must've sounded like a dial-up modem. BirdValiant (talk) 22:09, 14 February 2020 (UTC)
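To make the point about hidden conventions concrete, here is a toy sketch of the four adjustments listed in the comment above, written as string rewrites. It is an illustration under loud assumptions, not a reconstruction method: the function name is made up, only the sequences *eh₂ and *eh₃ are colored (nothing is done when the laryngeal precedes the vowel), "voiced stop" is crudely approximated as b, d or g, and the laryngeal symbols themselves are left untouched because their phonetic value is exactly what is in dispute.

```javascript
// Toy rules taken from the points made in this thread; purely illustrative.
function roughPhonetic(pieForm) {
  return pieForm
    .normalize("NFD")                          // separate accents from base letters
    .replace(/^\*/, "")                        // drop the reconstruction asterisk
    .replace(/e(?=[\u0301\u0304]*h₂)/g, "a")   // *eh₂: a-coloring of the vowel
    .replace(/e(?=[\u0301\u0304]*h₃)/g, "o")   // *eh₃: o-coloring of the vowel
    .replace(/s(?=[bdg])/g, "z")               // *s voiced before b, d, g (assumed set)
    .replace(/y/g, "j")                        // *y is IPA /j/
    .normalize("NFC");                         // recombine accents
}

console.log(roughPhonetic("*méh₂tēr"));
console.log(roughPhonetic("*nisdós"));
```

Even this crude version shows how much a reader has to know before the notation yields anything pronounceable, and how many judgment calls any fuller version would have to make.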
- This discussion lacks statements of facts linked to their verification to form its basis. To begin with, how many different IPA PIE pronunciations can we link to reliable sources per PIE word? Two, five, ten? If pronunciation is disputed, what prevents us from listing multiple competing pronunciations, possibly with a template-generated footnote indicating that the pronunciations are disputed, and directing the reader to an appendix explaining more? See also Andrew Sheedy above: "After all, it's the pronunciation that's ultimately being reconstructed, not the orthography!" and the rest of his thoughtful post. --Dan Polansky (talk) 11:04, 14 February 2020 (UTC)
- I Oppose the practice of including IPA transcriptions for any proto-languages, not just PIE. —Mahāgaja · talk 13:06, 14 February 2020 (UTC)
- What about adding multiple transcriptions, each one conforming to a different author's opinion and clearly marked as such? —Suzukaze-c◇◇ 16:44, 14 February 2020 (UTC)
- The problem is that we can't include reconstructions from the vast majority of experts who had the sense not to attempt a reconstruction. A reconstruction is inherently an abstraction, an averaging of the differences between the daughter languages' reflexes. We can be pretty sure there was variation over time and over distance, but we don't know where or when or for how long anything was spoken. One can come up with lots and lots of plausible scenarios for the sequence and distribution of sound changes that might lead to the observed results.
- We don't know the extra-linguistic factors and accidents of history that could have completely changed how etyma turned out. Look at Category:English doublets: these are cases where different sequences of multiple sound changes and borrowings produced spectacularly different outcomes that happened to end up in English.
- Because of historical attestation, we know that palaver, parable, parabola, parlay, parley, parlor, parole, etc. all come from a dialectal variant of παραβολή (parabolḗ, “to put beside”) via Latin parabola, and that the biblical sense of a story told to convey religious concepts became a word for talking, and that the conquest of 1066 caused a French legal term to be adopted in English as parole, and the perception of French as prestigious and socially/culturally advanced led to a room in upper-class homes being called a parlor, which in turn was used by businesses trying to improve their image in terms like funeral parlor and tattoo parlor, and that other historical circumstances led to Africans adopting a Portuguese word as a term for meeting and negotiating, which was used humorously and dismissively in English as palaver, and the status of first Greek, then Latin as languages of science and mathematics led to the English technical term parabola.
- Now imagine that we have only a few of the words in languages still spoken today, and all memory of Ancient Greek and Latin, along with all etymology and all history have been lost- how do we know how the term that gave rise to Greek παραβολή (paravolí), parley and palaver was pronounced? Even if we knew about παραβολή (parabolḗ), how would we decide which of its pronunciations over the centuries and in various dialects to use? Chuck Entz (talk) 20:36, 14 February 2020 (UTC)
- Another issue is false precision. Even with disclaimers, we're still giving the strong impression that we know more than we do: the more specific and detailed something is, the more real it seems (in technical terms, we mistake precision for accuracy). That's why I tried to get a PIE color-template deleted. As I pointed out, the same PIE etymon has English blue and Latin flavus (“yellow”) as descendants, so whatever color we display would probably be wrong- but people would still tend to see that color in their mind's eye. It reminds me of the family in Kansas that sued a geolocation company after it listed GPS coordinates on their farm for all unresolved US IP addresses.
- There's a strong emotional attraction to making the past and the unknowable real, which is why we have to be very careful. An IP editor in France we block for translations in dead languages to things that didn't exist when they were spoken also loves adding pronunciations to proto-language entries. Sure, make-believe is fun, but we're a reference work. Chuck Entz (talk) 22:31, 14 February 2020 (UTC)
- @Mahagaja At one point I added pronunciations for Proto-Germanic, and that seems to have become the standard for that language. But I no longer do so, and would not be sad to see them go too. —Rua (mew) 19:41, 14 February 2020 (UTC)
- FWIW, here's a list of IPA templates in the Reconstruction namespace. — Eru·tuon 23:21, 14 February 2020 (UTC)
- I don't think removing the IPA from all proto-languages is a good idea. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 13:45, 15 February 2020 (UTC)
- What about adding multiple transcriptions, each one conforming to a different author's opinion and clearly marked as such? —Suzukaze-c◇◇ 16:44, 14 February 2020 (UTC)
- Oppose for much of the same reasons stated by Victar and Rua. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 19:32, 14 February 2020 (UTC)
- Why should we not provide multiple phonemic (not phonetic) transcriptions sourced to reliable sources, with a footnote that these are very uncertain? In reading Victar and Rua's posts, I have not found an answer to that question. --Dan Polansky (talk) 20:07, 14 February 2020 (UTC)
- What is being disputed is not so much specific proposed reconstructed pronunciations of individual words (at a level of precision that allows one to give IPA transcriptions), but the whole process, including both problems with cherry picking among reconstructed descendant languages and problems with the assumptions the process is based on (such as the content, extent and timing of various sound change laws, of which quite a few are needed). In many cases the core of the criticism is not even that these assumptions are wrong (and thus should be replaced by correct ones) but that they are based more on wishful thinking than any concrete evidence. We cannot capture that criticism by giving lists of contenders. The situation is not nearly as bad as with Altaic; I expect that in due time consensus will arise on the most plausible reconstruction, but this is not ready for prime time yet. --Lambiam 21:23, 14 February 2020 (UTC)
- Oppose. To answer Dan's question, they're not just uncertain, they're unknown, and the better scholars tend not to even try. It's very unlike some other reconstructed pronunciations that reflect a scholarly community's consensus. —Μετάknowledgediscuss/deeds 21:27, 14 February 2020 (UTC)
- I'll abstain from voting, since I have a marginal role in the project, but I wanted to point out that PIE pronunciations are quite speculative. On the surface, they may sound like a useful idea; however, the mess that they will bring is going to stultify any positives. Currently, Wiktionary provides reconstructions of early to middle PIE (with laryngeals, 3-partition of the velar stops, pre-simplification of thorn clusters, etc.). The pronunciations that Ringe proposes reflect the post-Anatolian split (with laryngeal coloring, Graeco-Aryan realization of the stops, and so on). It's quite misleading to apply Ringe's or anyone else's proposal in this context. This will automatically add prescriptive functions to the Wiktionary project, while it is supposed to be strictly descriptive. неактивен 22:43, 14 February 2020 (UTC)
- Oppose. Reconstructing the pronunciation is pretty ridiculous. Many forget about the fundamental principle of saving speech effort. This error leads to perfectionism. With verbs, it's certainly fun. But no less funny is that here play with similar roots of words. Participation in the empty summation of heterogeneous reconstructions and etymologies leads to typical graphomania. This is called a competition — "who will merge the roots more". Especially when the accentological component is not taken into account. I've noticed this more than once. At the same time, the Wiktionary is read by millions of people, who will be misled by many reconstructions. This is fraught with the formation of myths. —— Gnosandes (talk) 14:08, 15 February 2020 (UTC)
- So what I'm gathering is that we have the written form of a language that wasn't written, that we don't know what it sounded like and we don't know what it meant. I'm not sure this should be a part of Wiktionary, instead of being left to the abstract linguists who see some value in it.--Prosfilaes (talk) 02:23, 18 February 2020 (UTC)
- That is inaccurate. We use the comparative method to reconstruct character states of a common ancestor, including meaning and form. We don't strictly speaking "know", rather we "reconstruct": for that reason such entries are not in the main dictionary, and are instead treated in the Reconstruction namespace. When we reconstruct Proto-Bantu *c, we are certain that this was a phoneme, and we could denote it as /c/ in (broad) IPA — but this would be misleading, because it was probably [s] or [t͡ʃ], and we don't (yet) have a robust way of determining which is correct. In the mean time, we should avoid making it seem that we can comfortably reconstruct more than we can. In general, protolanguages are very important to our etymological work, and I encourage you to learn about the basics of historical linguistics rather than dismiss it out of hand. —Μετάknowledgediscuss/deeds 02:49, 18 February 2020 (UTC)
- As I said, this phoneme is written c, but we don't know what it sounded like, and as far as we know the language was unwritten. Above it was claimed we know a certain Proto-IE word stood for a color, but it could have been "yellow" or "blue" or something else. My statement may have been sarcastic, but I don't think it was inaccurate.
- We could do etymological work without ever mentioning protolanguages; just trace knife back to knífr and at knífr mention the oldest words in other languages it's believed to be cognate to. I don't dismiss historical linguistics, despite the tone of the message, but I do question its value at this level to Wiktionary. To pick one example: fraction says
- From Middle English fraccioun (“a breaking”), from Anglo-Norman, from Old French fraction, from Medieval Latin fractio (“a fragment, portion”), from earlier Latin fractio (“a breaking, a breaking into pieces”), from fractus (English fracture), past participle of frangere (“to break”) (whence English frangible), from Proto-Indo-European *bʰreg- (English break).
- (arithmetic) A ratio of two numbers, the numerator and the denominator, usually written one above the other and separated by a horizontal bar.
- How about this definition: "A member of the equivalence set of ordered pairs (see Axiom of Pairing) of an integer and a positive number, such that (a, b) == (c, d) iff ad = cb." Or "let S = {x ∈ (a,b), a ∈ ℤ, b ∈ ℕ+, and ~ = {(a, b), (c, d) ∈ S| ((a, b), (c, d)) iff ad = bc}. Then a fraction is an element of the equivalence set S/~." I think you should be able to say that's not the level we're going for without getting told to learn the basics of mathematics. But fraction certainly feels like it's half-assing the definition of fraction at the same time it's going into excessive detail about Proto-Indo-European.
- If we're going to drag in *bʰreg-, and refuse to give any explanation of what that might mean, I'm not sure we shouldn't include the "let S = {x ∈ ..."; more of our audience might understand it, and it would be an actual definition of the thing. I do honestly feel that *bʰreg- feels like there's a certain group of editors not worried about the general level of Wiktionary or the cost of dropping such items onto general-audience pages like fraction.--Prosfilaes (talk) 05:54, 18 February 2020 (UTC)
- Oh, and that supposed protoword for blue, Reconstruction:Proto-Indo-European/bʰlēw- redirects to Reconstruction:Proto-Indo-European/bʰleh₁-, which, I don't know. I definitely feel like the environment has changed; there are pages on Wiktionary where I don't understand things, like sininen, but it feels like it's no more complex than it needs to be and I could pick it up quickly were I actually learning Finnish. This stuff is just inside baseball.--Prosfilaes (talk) 06:04, 18 February 2020 (UTC)
- This is not a work founded on mathematics — that would be (one section of) Wikipedia. This is a work founded on linguistics. I wouldn't put PIE in fraction#Etymology myself, but it's going to have to go somewhere along that etymological chain, whether or not you're willing to make the effort to click the Wikipedia link and learn about it. (Also, on a practical level, it turns out that your idea of listing cognates in etymology sections might work if you only care about European languages, but is deeply impractical for most of the rest of the world.) —Μετάknowledgediscuss/deeds 06:34, 18 February 2020 (UTC)
- I disagree. A dictionary is not a work founded on linguistics. It's a work that uses linguistics to do a job. 90% of the words we have are linguistically uninteresting, and we could firmly half-ass it a lot more; calling a kangaroo "A group of Australian mammals" is probably already pushing the linguistically interesting part of the definition. On the flip side, whenever it presumes to give a definition, it's entangled with mathematics, biology, or whatever field that definition is in.--Prosfilaes (talk) 08:16, 18 February 2020 (UTC)
- Actually, we know a surprising amount. We know that there's "something" that became *b in Germanic and Balto-Slavic languages, *φ in Ancient Greek, *भ in Sanskrit, *f in Latin, etc. In most languages with a voiced-unvoiced distinction, the descendant form is voiced. In languages with an aspirated-unaspirated distinction, it's aspirated. And in pretty much all the daughter languages, it's a labial sound. We could call it "something #15" or something like that. We could say "whatever it is that ends up as Germanic *b, Greek *φ and Sanskrit *भ", but it's easier to refer to it as *bʰ. Likewise, we have sets of correspondences where certain languages have a different outcome that seems to result from certain types of neighboring sounds in the ancestral form, so it's important to show all the "somethings" together so we discuss the likely interaction between those "somethings".
- The result is something that looks like a morpheme or an entire word, but it's really a sort of index to what we've been able to figure out about that piece of the parent language. It's very useful in figuring out how the descendant forms got to be what they were, and we've even had cases where newly discovered words or even new languages have exactly the kind of sound that was predicted from the reconstructed parent form.
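The "index" reading of a reconstructed symbol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `CorrespondenceSet` and the helper `label_for` are hypothetical, and the only data used is the set of reflexes listed in this comment for *bʰ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CorrespondenceSet:
    """A reconstructed symbol treated as a label for a set of attested outcomes."""
    label: str                # conventional symbol, e.g. "*bʰ"
    reflexes: Dict[str, str]  # branch or language -> outcome in that branch

    def matches(self, observed: Dict[str, str]) -> bool:
        # True only if every observed outcome agrees with this set.
        return all(self.reflexes.get(lang) == sound
                   for lang, sound in observed.items())


# Reflexes as listed in the comment above; nothing here asserts a phonetic value.
BH = CorrespondenceSet(
    label="*bʰ",
    reflexes={
        "Germanic": "b",
        "Balto-Slavic": "b",
        "Ancient Greek": "φ",
        "Sanskrit": "भ",
        "Latin": "f",
    },
)


def label_for(observed: Dict[str, str],
              sets: List[CorrespondenceSet]) -> Optional[str]:
    """Return the label of the first set consistent with the observed outcomes."""
    if not observed:
        return None  # no evidence, so no label
    for candidate in sets:
        if candidate.matches(observed):
            return candidate.label
    return None


print(label_for({"Germanic": "b", "Latin": "f"}, [BH]))
```

The point of the sketch is that the label is a lookup key for a pattern of outcomes; the code never needs to know how the sound was actually pronounced.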
- The main drawback is that people try to read more into these reconstructions than they should, and there's a tendency to mistake this scientific and imperfect process for a magic wand that lets us look into the past and see wonders that have been lost. We have this romantic desire to make this past as real as possible, even when our picture is rather fuzzy and the past we're reconstructing probably isn't all that different from the present.
- It's a pet theory of mine that the excitement over the discovery of Proto-Indo-European in the 19th century played a big part in the development of Nazism in the 20th: the feeling was that the world which was discovered must have been magical in some way, and far better than the present. The Nazis projected their opinions about what was wrong with the present world and their fantasies what a perfect world must be like onto this reconstructed world and created their own version that confirmed all their prejudices. Chuck Entz (talk) 03:40, 18 February 2020 (UTC)
- Oppose. Agree with Chuck above and others. Oppose as an extremely error-prone exercise, although it is very interesting in case-by-case scenarios and can happen to be right or very close. It will make us believe we know what something sounded like, with no native speaker to dispute it. --Anatoli T. (обсудить/вклад) 11:11, 18 February 2020 (UTC)
- Having seen some of the above arguments, I can better appreciate the opposition to including pronunciation. I still think it would be useful to have, say, three representative pronunciation reconstructions, along with a disclaimer. But I will leave the final decision to those who are more knowledgeable than I am. Andrew Sheedy (talk) 22:57, 18 February 2020 (UTC)
I am not sure if it was said before, but the transcriptions given at the start of the discussion are 90% the exact same characters, except for the laryngeal, which is of course one of the deeper rooted problems of PIE phonology. Even if *h2 [*x] might resonate well with me, a broad transcription is almost useless. For reference there's an old Stackexchange question Do Linguists pronounce PIE roots?
Moratorium on Proto-Nuristani
I recommend a moratorium on the creation of links and entries for Proto-Nuristani. Proto-Nuristani is a feasibly reconstructible language, but the research is infantile and contradictory. It isn't even agreed upon where Nuristani sits within the Indo-Iranian language family.
Arguably, the largest effort towards standardization is the work of Richard Strand, but it is primarily unpublished and, in many ways, out of step with modern Indo-Iranic research. Until someone publishes a compendium of Proto-Nuristani reconstructions, I believe we should hold back on its reconstruction here on the project. This includes the deletion of some twenty entries, most of which were created by me. --{{victar|talk}} 08:01, 14 February 2020 (UTC)
I have created a new vote on how to increase admin accountability. Pinging everybody who voted last time we had a related vote: @Allahverdi Verdizade, Numberguy6, Lingo Bingo Dingo, TheDaveRoss, Andrew Sheedy, Mnemosientje, Vorziblix, Nardog, Donnanz/@So9q, Dan Polansky, Tom 144, Qehath, Victar, Vahagn Petrosyan, Fay Freak, Eirikr, Hazarasp/@Equinox, AryamanA, SemperBlotto, Jusjih, Robbie SWE, Saltmarsh, Billinghurst —Μετάknowledgediscuss/deeds 21:24, 14 February 2020 (UTC)
Chinese "Zhaoqing" dialect
(肇庆?) Does anyone know this dialect? There are some script errors related to this. 恨国党非蠢即坏 (talk) 16:07, 15 February 2020 (UTC)
Ordering of senses (again)
I know there have been discussions in the past about whether senses should be ordered chronologically or by (modern) frequency of use. Does anyone know where we presently stand with this? Is the issue still undecided? Mihia (talk) 18:42, 15 February 2020 (UTC)
- AFAIK there's still no agreed-upon rule. For my part, I used to favour chronological ordering, but have come to somewhat prefer ordering by commonality, and above all to prefer (and practice, in the highly polysemous entries I overhaul like get off to give a recent example) grouping "related" senses. - -sche (discuss) 20:28, 15 February 2020 (UTC)
- As we often have little direct factual evidence in support of either chronological, frequency, or likelihood-of-lookup ordering, there is lots of scope for more analytical-subjective ordering. Grouping of senses seems to be not-too-controversial (though some dislike subsenses). Where relevant, placing more physical or 'basic' senses first may be helpful in providing users with the appropriate basis for interpreting extended and more figurative, metaphorical definitions. DCDuring (talk) 19:43, 16 February 2020 (UTC)
Bulgarian as descendant of Old Church Slavonic
The w:Bulgarian language should be seen as the descendant of w:Old Church Slavonic as the latter is also referred to as "Old Bulgarian", it was also spoken in the territory of the later development of Bulgarian, they belong to the same subdivision of Eastern South Slavic languages and I presuppose that they would have enough relatable grammar and vocabulary. Even though Bulgarian has already been entered as descendant of Old Church Slavonic, this possibility of derivation is mostly not made use of. In line with the suggestion, derivations from Old Church Slavonic should be extended more thoroughly. HeliosX (talk) 21:32, 15 February 2020 (UTC)
- @HeliosX: I don't know if it's firmly established that Bulgarian is a descendant of Old Church Slavonic but I can't find the discussions. --Anatoli T. (обсудить/вклад) 21:53, 16 February 2020 (UTC)
- @HeliosX, Atitarev: Here's the one I remember. Canonicalization (talk) 08:35, 17 February 2020 (UTC)
- It is possible to set an etymology-only language as a parent of a language, so we could create an etymology-only code for Old Bulgarian and set it as the ancestor of Bulgarian. —Mahāgaja · talk 18:06, 17 February 2020 (UTC)
- @Mahagaja: But why? Etymology-only codes are not for synonyms. If you say there was an Old Bulgarian beside Old Church Slavonic you are not only making things impractical but faking them. The difference between what was written by the Preslav Literary School and what their authors spoke was at most the difference between how Ottomans wrote and the vulgar spoke Turkish, or the Vulgar Latin of the later Roman Republic and not the end of the Roman Empire (which does not justify a code for Vulgar Latin, unlike later stages). Not a language difference at all. The current splits for Slavic, and I do not consider Slavic microlanguages but perhaps it is even for them, are fine. There is Old East Slavic and Old Polish and Old Czech because this can be distinguished and there isn’t Old Slovak because nobody knows what that is, and the Serbo-Croatian corpus is not more different by age than by region; and in Slovene distinctions in itself and from Proto-Slavic are similarly but it appears to me even less pronounced. Only that “Old Church Slavonic” still pends renaming to “Church Slavonic”, since this is seen from outside and later Bulgaria as a separate artificial language without any cut in time, distinguished from evolved Bulgarian and the other region’s more or less evolved languages.
For those who do not have a purview of all that Slavic, languages which may appear a bare lot from afar in the Anglosphere, I try to explain: The Medieval state is perhaps to imagine as like in current France or Germany there are regions with dialects less or more heavy and thus in some regions distinct Romance resp. Germanic languages but people write a separate “Church French” from Paris so to say but a non-church-French is not to be sought in the center of Paris itself, and in Hannover and Bielefeld people really speak like in the books – unlike around 1900 though in these, but now there is no “Vulgar German” and nothing continued in parallel here; and now imagine the Swiss knowing only High Alemannic (as the Russians their native Old East Slavic) would begin only now to write Standard German, having learned it from Hannoverans and Bielefelders (as the Russians learned “Church Slavonic” or Standard Bulgarian from Bulgarians). And then the Standard German in the named two cities after new Kleinstaaterei might develop away over half a millennium and then Alemannic also is even more different from now (like modern Ukrainian and Russian from Old East Slavic) but you would not immediately realize that there really wasn’t anything but Standard German spoken in Bielefeld. In this fashion there wasn’t any “Old Bulgarian” parallel to Church Slavonic from which recent Bulgarian descends. Fay Freak (talk) 04:08, 18 February 2020 (UTC)
- @HeliosX: {{inh|bg|cu| works fine even under the current state of affairs (whatever it may be).
- Moreover, Old Church Slavonic (a standardized language) and Old Bulgarian (a stage in the evolution of a dialect continuum) are not exactly equivalent. One is a codification based partially on the other. Effectively, Modern Bulgarian and OCS are ausbau cousins. During the 18th and 19th centuries, there was a trend among the Bulgarian intelligentsia to use a language quite similar to OCS; however, after the establishment of the new Bulgarian state, a simpler version was chosen instead. The current language is a further simplified version of that already colloquialized version coined in the 1870s, so it cannot really claim descent from OCS. The same holds for Macedonian, Torlak/Shopian, Pomak, and all other varieties standardized during the 20th century on the basis of Balkan South Slavic. Неактивен 23:30, 21 February 2020 (UTC)
Help with wanted categories
Could some knowledgeable experts help with the following wanted categories?
Japanese terms spelled with FOO read as BAR
I gather these need to be defined with {{ja-readingcat}}, e.g. {{ja-readingcat|赤|あか|kun|nanori}} for Category:Japanese terms spelled with 赤 read as あか. Unfortunately I don't know how to enter the params like "kun" and "nanori". The full list is as follows:
- Category:Japanese terms spelled with 繰 read as くり
- Category:Japanese terms spelled with 還 read as かえ
- Category:Japanese terms spelled with 之 read as これ
- Category:Japanese terms spelled with 優 read as よし
- Category:Japanese terms spelled with 凍 read as こご
- Category:Japanese terms spelled with 分 read as ぷん
- Category:Japanese terms spelled with 司 read as まもる
- Category:Japanese terms spelled with 和 read as なり
- Category:Japanese terms spelled with 國 read as ごく
- Category:Japanese terms spelled with 堪 read as こら
- Category:Japanese terms spelled with 塊 read as かたまり
- Category:Japanese terms spelled with 太 read as おお
- Category:Japanese terms spelled with 嫡 read as てき
- Category:Japanese terms spelled with 已 read as や
- Category:Japanese terms spelled with 弥 read as ミ
- Category:Japanese terms spelled with 往 read as ゆ
- Category:Japanese terms spelled with 抉 read as くじ
- Category:Japanese terms spelled with 抉 read as こじ
- Category:Japanese terms spelled with 指 read as さし
- Category:Japanese terms spelled with 攻 read as せ
- Category:Japanese terms spelled with 放 read as ばな
- Category:Japanese terms spelled with 斗 read as ト
- Category:Japanese terms spelled with 斜 read as なな
- Category:Japanese terms spelled with 是 read as こ
- Category:Japanese terms spelled with 朕 read as わ
- Category:Japanese terms spelled with 殻 read as カラ
- Category:Japanese terms spelled with 灰 read as ハイ
- Category:Japanese terms spelled with 然 read as そ
- Category:Japanese terms spelled with 王 read as をー
- Category:Japanese terms spelled with 生 read as じょう
- Category:Japanese terms spelled with 痘 read as iiregular
- Category:Japanese terms spelled with 皇 read as すべら
- Category:Japanese terms spelled with 皇 read as すめ
- Category:Japanese terms spelled with 皇 read as すめら
- Category:Japanese terms spelled with 看 read as み
- Category:Japanese terms spelled with 空 read as あき
- Category:Japanese terms spelled with 紅 read as もみ
- Category:Japanese terms spelled with 絶 read as とだ
- Category:Japanese terms spelled with 維 read as これ
- Category:Japanese terms spelled with 耐 read as た
- Category:Japanese terms spelled with 臨 read as のぞ
- Category:Japanese terms spelled with 蚕 read as こ
- Category:Japanese terms spelled with 融 read as よう
- Category:Japanese terms spelled with 装 read as よそお
- Category:Japanese terms spelled with 誹 read as そし
- Category:Japanese terms spelled with 責 read as せ
- Category:Japanese terms spelled with 走 read as は
- Category:Japanese terms spelled with 返 read as かえし
- Category:Japanese terms spelled with 返 read as がえし
- Category:Japanese terms spelled with 錮 read as こ
- Category:Japanese terms spelled with 陽 read as び
- Category:Japanese terms spelled with 集 read as じゅう
- Category:Japanese terms spelled with 零 read as ゼロ
- Category:Japanese terms spelled with 音 read as おっと
- Category:Japanese terms spelled with 音 read as のん
- Category:Japanese terms spelled with 類 read as たぐ
- Category:Japanese terms spelled with 髄 read as すね
- Category:Japanese terms spelled with 髄 read as なずき
- Category:Japanese terms spelled with 龍 read as りょう
- @Benwing2: This information can be found at e.g. 赤#Readings for each character, and I would guess the same is true for Okinawan. A bot that could check the relevant template might be too much effort, though. —Μετάknowledgediscuss/deeds 18:19, 16 February 2020 (UTC)
- @Metaknowledge Thanks. There are enough such cases that I would rather do it by bot, esp. since there will be more in the future. Benwing2 (talk) 18:29, 16 February 2020 (UTC)
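As a rough illustration of what such a bot step might look like, here is a minimal Python sketch that turns a wanted-category title into a {{ja-readingcat}} call. It assumes the reading types (such as "kun" or "nanori") have already been collected from each character's Readings section into a lookup table; the function names and the `KNOWN_TYPES` table are hypothetical, and fetching the readings from the wiki is deliberately left out. The only data used is the 赤 example given above.

```python
import re
from typing import Dict, List, Optional, Tuple

# Title pattern taken from the wanted-category list above (Japanese only).
CATEGORY_RE = re.compile(
    r"^Category:Japanese terms spelled with (\S+) read as (\S+)$"
)

# Hypothetical lookup: (kanji, reading) -> reading types, as they would be
# gathered from the character entry's Readings section.
KNOWN_TYPES: Dict[Tuple[str, str], List[str]] = {
    ("赤", "あか"): ["kun", "nanori"],
}


def parse_category(title: str) -> Optional[Tuple[str, str]]:
    """Split a category title into (kanji, reading); None if it doesn't fit."""
    match = CATEGORY_RE.match(title.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def readingcat_wikitext(
    title: str, reading_types: Dict[Tuple[str, str], List[str]]
) -> Optional[str]:
    """Build the template call, or return None so a human can review the case."""
    parsed = parse_category(title)
    if parsed is None:
        return None
    types = reading_types.get(parsed)
    if not types:
        return None  # unknown reading type (or a typo in the category name)
    kanji, reading = parsed
    return "{{ja-readingcat|" + "|".join([kanji, reading, *types]) + "}}"


print(readingcat_wikitext(
    "Category:Japanese terms spelled with 赤 read as あか", KNOWN_TYPES
))
```

Returning None when no reading type is known keeps the sketch from guessing: titles that look like typos, or readings not listed on the character page, would fall through for manual review rather than getting a made-up parameter.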
Okinawan terms spelled with FOO read as BAR
Same as above but for Okinawan. The full list is as follows:
- Category:Okinawan terms spelled with 人 read as っちゅ
- Category:Okinawan terms spelled with 大 read as うふ
- Category:Okinawan terms spelled with 子 read as し
- Category:Okinawan terms spelled with 菓 read as くゎー
- Category:Okinawan terms spelled with 夜 read as ゆー
- Category:Okinawan terms spelled with 島 read as しま
- Category:Okinawan terms spelled with 押 read as う
- Category:Okinawan terms spelled with 月 read as ぐゎち
- Category:Okinawan terms spelled with 校 read as こー
- Category:Okinawan terms spelled with 油 read as ゆー
- Category:Okinawan terms spelled with 食 read as か
- Category:Okinawan terms spelled with 鹿 read as か
- Category:Okinawan terms spelled with 一 read as ちゅ
- Category:Okinawan terms spelled with 七 read as なな
- Category:Okinawan terms spelled with 三 read as さん
- Category:Okinawan terms spelled with 世 read as ゆー
- Category:Okinawan terms spelled with 京 read as ちょー
- Category:Okinawan terms spelled with 傷 read as きじ
- Category:Okinawan terms spelled with 元 read as むとぅ
- Category:Okinawan terms spelled with 元 read as むーとぅ
- Category:Okinawan terms spelled with 光 read as ふぃかり
- Category:Okinawan terms spelled with 児 read as く
- Category:Okinawan terms spelled with 写 read as さ
- Category:Okinawan terms spelled with 北 read as にし
- Category:Okinawan terms spelled with 南 read as ふぇー
- Category:Okinawan terms spelled with 古 read as く
- Category:Okinawan terms spelled with 唇 read as しば
- Category:Okinawan terms spelled with 唐 read as とー
- Category:Okinawan terms spelled with 喉 read as ぬーでぃー
- Category:Okinawan terms spelled with 声 read as くぃー
- Category:Okinawan terms spelled with 多 read as うふ
- Category:Okinawan terms spelled with 夜 read as ゆる
- Category:Okinawan terms spelled with 大 read as うー
- Category:Okinawan terms spelled with 女 read as ゐな
- Category:Okinawan terms spelled with 子 read as ぐ
- Category:Okinawan terms spelled with 学 read as がく
- Category:Okinawan terms spelled with 官 read as くゎん
- Category:Okinawan terms spelled with 宮 read as なー
- Category:Okinawan terms spelled with 察 read as さち
- Category:Okinawan terms spelled with 広 read as くゎん
- Category:Okinawan terms spelled with 御 read as う
- Category:Okinawan terms spelled with 愛 read as かな
- Category:Okinawan terms spelled with 戦 read as いくさ
- Category:Okinawan terms spelled with 新 read as みー
- Category:Okinawan terms spelled with 日 read as にち
- Category:Okinawan terms spelled with 暁 read as あかちち
- Category:Okinawan terms spelled with 書 read as か
- Category:Okinawan terms spelled with 書 read as し
- Category:Okinawan terms spelled with 本 read as ふん
- Category:Okinawan terms spelled with 本 read as むとぅ
- Category:Okinawan terms spelled with 東 read as あがり
- Category:Okinawan terms spelled with 東 read as とぅん
- Category:Okinawan terms spelled with 東 read as とー
- Category:Okinawan terms spelled with 枚 read as めー
- Category:Okinawan terms spelled with 桃 read as むむ
- Category:Okinawan terms spelled with 水 read as みじ
- Category:Okinawan terms spelled with 油 read as あんだ
- Category:Okinawan terms spelled with 泡 read as あー
- Category:Okinawan terms spelled with 泳 read as っゐー
- Category:Okinawan terms spelled with 清 read as ちゅら
- Category:Okinawan terms spelled with 物 read as むち
- Category:Okinawan terms spelled with 犬 read as いん
- Category:Okinawan terms spelled with 猫 read as まやー
- Category:Okinawan terms spelled with 生 read as う
- Category:Okinawan terms spelled with 生 read as っん
- Category:Okinawan terms spelled with 田 read as たー
- Category:Okinawan terms spelled with 由 read as ゆー
- Category:Okinawan terms spelled with 男 read as ゆきが
- Category:Okinawan terms spelled with 男 read as ゐきが
- Category:Okinawan terms spelled with 男 read as をぅとぅく
- Category:Okinawan terms spelled with 真 read as しん
- Category:Okinawan terms spelled with 神 read as かみ
- Category:Okinawan terms spelled with 線 read as しん
- Category:Okinawan terms spelled with 肝 read as ちむ
- Category:Okinawan terms spelled with 自 read as じ
- Category:Okinawan terms spelled with 花 read as はな
- Category:Okinawan terms spelled with 花 read as ふぁな
- Category:Okinawan terms spelled with 菓 read as くゎ
- Category:Okinawan terms spelled with 葉 read as は
- Category:Okinawan terms spelled with 西 read as しー
- Category:Okinawan terms spelled with 言 read as くとぅ
- Category:Okinawan terms spelled with 記 read as ち
- Category:Okinawan terms spelled with 話 read as はなし
- Category:Okinawan terms spelled with 話 read as ふぁー
- Category:Okinawan terms spelled with 警 read as きー
- Category:Okinawan terms spelled with 警 read as ちー
- Category:Okinawan terms spelled with 車 read as くるま
- Category:Okinawan terms spelled with 酒 read as さき
- Category:Okinawan terms spelled with 針 read as はーい
- Category:Okinawan terms spelled with 雨 read as あみ
- Category:Okinawan terms spelled with 雲 read as くむ
- Category:Okinawan terms spelled with 露 read as ちゆ
- Category:Okinawan terms spelled with 髪 read as かん
- Category:Okinawan terms spelled with 魂 read as たまし
- Category:Okinawan terms spelled with 魂 read as たましー
- Category:Okinawan terms spelled with 魚 read as いゆ
- Category:Okinawan terms spelled with 鶴 read as ちる
- Category:Okinawan terms spelled with 黒 read as くるー
- Category:Okinawan terms spelled with 鼠 read as えんちゅ
- Category:Okinawan terms spelled with 鼠 read as っゑんちゅ
- Category:Okinawan terms spelled with 鼠 read as ゑんちゅ
FOO order/family/genus plants
These need entries added to Module:category tree/topic cat/data/Plants, but with descriptions that only plant experts can fill in. Some of the categories below appear to be typos that need fixing.
- Category:kk:Balsaminaceae family plants
- Category:kk:Boraginaceae family plants
- Category:kk:Brassiaceae family plants
- Category:kk:Buxaceae family plants
- Category:kk:Colchicaceae family plants
- Category:kk:Convolvulaceae family plants
- Category:kk:Crassulaceae family plants
- Category:kk:Cucurbitaceae famiily plants
- Category:kk:Cucurbitaceae family plants
- Category:kk:Cyperaceae family plants
- Category:kk:Gentianaceae family plants
- Category:kk:Humulus genus plants
- Category:kk:Hypericaceae family plants
- Category:kk:Juglandaceae family plants
- Category:kk:Lamiaceae family plants
- Category:kk:Lauraceae family plants
- Category:kk:Moraceae family plants
- Category:kk:Orchid family plants
- Category:kk:Pedalium family plants
- Category:kk:Platanaceae family plants
- Category:kk:Rutaceae family plants
- Category:kk:Salicaceae family plants
- Category:my:Caesalpinia tribe plants
- Category:my:Myrtle order plants
@Chuck Entz, MiguelX413 Maybe you can help. Benwing2 (talk) 15:23, 16 February 2020 (UTC)
- Some of these are too narrow to be worth creating a category for, some have categories at a different taxonomic rank, and some use a taxonomic name where the existing category uses a common name, or phrases it differently. Here's how I would categorize them:
Category Given | Closest Existing category | Notes |
---|---|---|
Balsaminaceae family plants | Category:Ericales order plants | Too narrow |
Boraginaceae family plants | Category:Borage family plants | |
Brassiaceae family plants | Category:Crucifers | Brassicaceae is also misspelled |
Buxaceae family plants | Category:Buxales order plants | Too narrow |
Colchicaceae family plants | Category:Liliales order plants | Too narrow |
Convolvulaceae family plants | Category:Morning glory family plants | |
Crassulaceae family plants | Category:Saxifragales order plants | This family is distinctive/recognizable enough and probably has enough potential members to consider splitting off, though it might be better to call it "Stonecrop family plants" |
Cucurbitaceae famiily plants | Category:Gourd family plants | |
Cyperaceae family plants | Category:Sedges | |
Gentianaceae family plants | Category:Gentianales order plants | |
Humulus genus plants | Category:Rosales order plants | |
Hypericaceae family plants | Category:Malpighiales order plants | |
Juglandaceae family plants | Category:Fagales order plants | There might be enough potential members to justify a separate category, but I would call it "walnut family plants" |
Lamiaceae family plants | Category:Mint family plants | |
Lauraceae family plants | Category:Laurel family plants | |
Moraceae family plants | Category:Mulberry family plants | |
Orchid family plants | Category:Orchids | |
Pedalium family plants | Category:Lamiales order plants | |
Platanaceae family plants | Category:Proteales order plants | |
Rutaceae family plants | Category:Citrus subfamily plants | |
Salicaceae family plants | Category:Willows and poplars | |
Caesalpinia tribe plants | Category:Caesalpinia subfamily plants | |
Myrtle order plants | Category:Myrtle family plants | |
- I don't have time right now to explain further; more later. Chuck Entz (talk) 16:54, 16 February 2020 (UTC)
- @Chuck Entz Thanks. Can I go ahead and fix the pages in those categories in line with your suggestions? Benwing2 (talk) 18:03, 16 February 2020 (UTC)
- I'm sure he'd love it. FWIW, his proposals look very sensible to me. DCDuring (talk) 19:52, 16 February 2020 (UTC)
- @Chuck Entz, DCDuring Done. Benwing2 (talk) 20:17, 16 February 2020 (UTC)
- BTW I didn't try to split out the walnut or stonecrop family plants, having no idea in general which plants go into these families and which ones don't. Benwing2 (talk) 20:19, 16 February 2020 (UTC)
Mongolian 'ном' (book) declension table incorrect?
https://en.wiktionary.org/wiki/%D0%BD%D0%BE%D0%BC#Declension
Hi there. I found a table of nouns and their cases in John Gaunt's book 'Modern Mongolian: A Course Book', and noticed that his table differs from the declension table for 'ном' on Wiktionary. I'm not sure which one is more accurate or reliable, so any input would be greatly appreciated: https://en.wiktionary.org/wiki/%D0%BD%D0%BE%D0%BC#Declension
I also made a post on the Mongolian reddit to get feedback on the table from native Mongolian speakers. Feel free to let me know what you think :-) Here are the links to the image and reddit post:
https://i.imgur.com/RHl8UQs.png
or if that doesn't work, try:
https://imgur.com/gallery/yHTRRJR
Reddit post: https://www.reddit.com/r/mongolia/comments/f57r6g/how_accurate_is_this_table_of_mongolian/ CcfUk2018 (talk) 16:24, 18 February 2020 (UTC)
- @Crom daba (the only editor on Mongolian I can immediately think of). The differences I can see are the order of cases, the lack of comitative and directional in the book, and some of the case names. — surjection ⟨?⟩ 20:42, 20 February 2020 (UTC)
- This seems to be due to changes in T:mn-decl-noun introduced by @LibCae (who is maybe the only currently active editor in Mongolian).
- The purpose was to add a substantive genitive besides the genitive. I'm not sure what construction this refers to; maybe the so-called "oblique case" (the name used by Janhunen, IIRC) that occurs in compounds. The changes in the template need either to be synchronized with the entries or rolled back. Crom daba (talk) 15:22, 23 February 2020 (UTC)
User Holodwig21
I don't really understand where to file a complaint, so I write here that the user @Holodwig21 cancels edits without reaching a consensus/compromise. Isn't that a violation of the rules? ---- Gnosandes (talk) 19:12, 19 February 2020 (UTC)
- Am not that into PIE so I can't very well comment on the case, but if anyone's curious it's about this edit and this one it seems, and note the discussions on the respective talk pages of the entries in question. Btw: if there's a disagreement on matters of etymology generally WT:Etymology scriptorium is the place. Don't like to lecture but I have to note that it seems to me from the relevant talk pages like Holodwig has in good faith given an explanation of his motives, so a "call-out" like this seems counterproductive. — Mnemosientje (t · c) 08:04, 20 February 2020 (UTC)
Misleading "unclassified family" info in languages categories
For primary language families, the little info-table states the "Parent family" as "unclassified". Readers might think that families firmly established as primary families are unclassified and might have some other relations. I think we should have a way to indicate this in the Module:languages data, either to get rid of the row in the table or to show that it's a primary family. Julia ☺ ☆ 20:17, 19 February 2020 (UTC)
Stress related information for combining forms
Currently there's little if any information regarding the stress(es) of words when suffixed and prefixed. --Backinstadiums (talk) 18:19, 20 February 2020 (UTC)
Accents on PIE roots
From what I've learned about Proto-Indo-European entries, accents on vowels aren't included in roots: no root page has them here on Wiktionary, although this rule doesn't apply to fully-formed words. In a recent discussion, here, the user Gnosandes argued that WT:AINE says nothing against PIE roots carrying accents and does not prohibit them. I countered by quoting WT:AINE: "Fully-formed words (as opposed to roots) have an inherent accent, which should be present in the page name. You may add a word without an accent if you don't know where the accent should be placed, but when the accent placement becomes known the entry should be moved/renamed to reflect this." The first line says "Fully-formed words (as opposed to roots) have an inherent accent"; I understand this to state that roots don't carry an accent. Against this, Gnosandes argued, "This is not convincing. For it is not exactly spelled out." and later reiterated, "It doesn't say that roots can't have an accent. And WT:AINE doesn't forbid it."
So I'm asking whether people agree with my interpretation of what WT:AINE says about accents on roots and, if so, whether we could rewrite WT:AINE to say explicitly that we don't include accents on roots.
How it is now:
- Cite roots and stems (forms that are not fully inflected words) with a hyphen: *peḱ-.
- Fully-formed words (as opposed to roots) have an inherent accent, which should be present in the page name. You may add a word without an accent if you don't know where the accent should be placed, but when the accent placement becomes known the entry should be moved/renamed to reflect this.
How it would become:
- Cite roots and stems (forms that are not fully inflected words) with a hyphen: *peḱ-.
- Roots don't carry accents on vowels.
- Fully-formed words (as opposed to roots) have an inherent accent, which should be present in the page name. You may add a word without an accent if you don't know where the accent should be placed, but when the accent placement becomes known the entry should be moved/renamed to reflect this.
𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 19:03, 20 February 2020 (UTC)
- Support, Yes, I will support a clearer interpretation. But it is still unclear. ---- Gnosandes (talk) 20:23, 20 February 2020 (UTC)
- I'd prefer it if we didn't have accents at all on PIE page names, but rather treated them like macrons in Latin and Old English: present in the headword line, but stripped in links. —Mahāgaja · talk 22:25, 20 February 2020 (UTC)
- @Mahagaja: That is a good suggestion, but the case with both Latin and Old English is that both are mostly attested without diacritics. PIE isn't attested at all. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 23:12, 20 February 2020 (UTC)
- True, but we already do that for Proto-Slavic (i.e. diacritics in the headword line but not in the page name). PIE is "attested" in the work of historical linguists, whose use of accent marks is sporadic at best unless word accent is actually the topic under discussion. —Mahāgaja · talk 12:58, 21 February 2020 (UTC)
- @Mahagaja: I support including accents in the page name for Proto-Slavic. I really don't see any reason not to include them for reconstructed terms. What makes accents any different from any other phonological feature? Are we going to omit vowels next, or something? —Rua (mew) 18:53, 24 February 2020 (UTC)
- @Rua: The difference between accent and other phonological features for PIE at least is that the accent is (1) very often unknown/unknowable (e.g. for terms attested only in languages that have lost all trace of the accent, or for terms where the evidence from the daughter languages is contradictory) and (2) very often left unmarked by researchers, even when the accent is known, which means people looking up a PIE term that they've read in a journal article or something may have difficulty finding it here. Point (2) definitely also applies to PSl., though I'm not sure to what extent point (1) does. —Mahāgaja · talk 10:56, 25 February 2020 (UTC)
- Point 2 can very easily be remedied by redirects. Point 1 just means we provide the accent only when we know it, which we already do and hasn't been a problem. —Rua (mew) 11:41, 25 February 2020 (UTC)
- Maybe it hasn't been a problem for you, but providing the accent only when we know it has been a problem for me and others if we don't know whether or not the accent is known, or if different scholars say different things about where the accent fell. Providing the accent only when we know it also results in inconsistent page names, as some have accents and some don't, which leads to confusion. And using diacritic stripping for PIE obviates the need for a bunch of redirects. —Mahāgaja · talk 12:34, 25 February 2020 (UTC)
- Ugh, even though it should be obvious to anyone working in the field, if that helps deal with problematic users, go for it. --{{victar|talk}} 05:20, 22 February 2020 (UTC)
- @Mahagaja, Holodwig21 I agree with the suggestion that PIE page names should not have accents in them. It's already hard and error-prone enough to get the title right of PIE pages, and the accents make it significantly harder without really serving a disambiguation purpose that would justify their use in page names. Furthermore, different experts sometimes differ in where they place the accent, and in many cases it isn't clear at all. Benwing2 (talk) 06:31, 23 February 2020 (UTC)
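For readers unfamiliar with the diacritic-stripping idea raised in this thread (accent shown in the headword line, removed when forming the page name), here is a minimal Python sketch. It is not Wiktionary's actual module code. The function name `strip_accent` is hypothetical, and the rule that an acute on k or g marks a palatovelar (as in *peḱ-) rather than word accent is an assumption made for illustration.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent

# Assumption: an acute on these letters writes the palatovelars (ḱ, ǵ),
# so it is part of the consonant and must be kept.
PALATOVELAR_BASES = {"k", "g", "K", "G"}


def strip_accent(term: str) -> str:
    """Return `term` with accent-marking acutes removed (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", term)
    out = []
    base = ""  # most recent non-combining character
    for ch in decomposed:
        if unicodedata.combining(ch) == 0:
            base = ch
            out.append(ch)
        elif ch == ACUTE and base not in PALATOVELAR_BASES:
            # Drop the accent mark; other diacritics (e.g. the macron) stay.
            continue
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))
```

Under these assumptions, a headword such as *ph₂tḗr would map to the page name *ph₂tēr, while the root *peḱ- would be left unchanged.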
Permission to add brief noun-stem etymological information to Proto-Slavic entries
I am seeking permission to add brief notes on Proto-Slavic derivations to note the noun-stem of nouns, e.g. o-stem, u-stem, etc. Specifically, there are two things which are already done elsewhere and which I would like to continue.
- Add small italics after alternative forms, to note the source and noun-stem of those alternative forms. This is done on, e.g. Reconstruction:Proto-Slavic/ely, Reconstruction:Proto-Slavic/ěra, etc. I would also add the actual reference, as well. The purpose of this small additional information is so that the reader will know which other noun-stem variants there are for particular etymons, without actually having to click through to those entries.
- Add a small note in the etymology section, if a source indicates that a particular noun-stem is more likely. This was done on, e.g. Reconstruction:Proto-Slavic/dǫbъ. The purpose of this information is to inform the reader which noun-stem was more likely to exist, based on the available references.
I am seeking permission here because my attempt to add both instances of this etymological information to Reconstruction:Proto-Slavic/jьlъ was reverted by the admin Rua who has, for similar disagreements, reverted similar edits of mine and then protected the page. If Wiktionary also has something like w:WP:BRD and I am unable to do the B part without being reverted and blocked, well, I guess I will just ask for permission first. BirdValiant (talk) 04:44, 22 February 2020 (UTC)
- What you're adding belongs in the etymology section, not under References, which is strictly intended for references. --{{victar|talk}} 05:24, 22 February 2020 (UTC)
- @Victar: So, can I put the phrase "Vasmer lists the word as originally being a u-stem"? Or, perhaps (ripped off from the exactly-analogous Reconstruction:Proto-Slavic/dǫbъ page): "The etymon is attested both as an o-stem and as a u-stem, but per Vasmer, the original more likely was u-stem" ? (cited, of course)
- @BirdValiant: All you need to write is "perhaps originally a u-stem", or something like that, with a <ref> behind it. You don't have to mention the author by name, and I'd really rather you didn't. --{{victar|talk}} 21:20, 22 February 2020 (UTC)
- And what about the italic notes in the "Alternative forms" list? Is there a prohibition against doing that, despite the practice being found on numerous entries? If not: would it be better to use (Derksen, o-stem) or, if there is an inline citation, maybe just (o-stem) ? BirdValiant (talk) 06:33, 22 February 2020 (UTC)
- Yeah, no, don't do that. Again, just add a <ref> behind it. --{{victar|talk}} 21:20, 22 February 2020 (UTC)
- @Victar: Why not have under Alternative forms, e.g. *jìlъ (o-stem)<ref>? Is there a prohibition against that? Isn't it nice to know about the alternative noun-stem before clicking through? Doesn't that improve Wiktionary? BirdValiant (talk) 21:35, 22 February 2020 (UTC)
- That wouldn't be the worst, but I can't promise you won't be reverted. --{{victar|talk}} 22:41, 22 February 2020 (UTC)
- @BirdValiant: OK, looking at *jьlъ I can see some things you're doing wrong. 1. You shouldn't create reconstruction entries of alternative forms. In fact, don't even link to them. 2. Alternative forms is in the wrong place. 3. If you're saying it might be derived from a u-stem, that's an ancestral form, not an alternative form. That being said, there's no point labeling the alternative form as an o-stem. 4. What's your source for Ancient Greek ἰλύς (ilús, “mud, slime”) being a cognate? --{{victar|talk}} 01:34, 24 February 2020 (UTC)
- @Victar: Where are all these rules located? 1. Why can't there be reconstruction entries for alternative forms? If I hadn't done that for *jìlъ, w:Proto-Slavic#Nouns would still have a red-link. And why can't we link to them? 4. The source is directly in the Vasmer entry, emphasis mine: "Родственно лтш. īls "очень темный", греч. ἰ̄λύ̄ς "тина, грязь", εἰλύ ̇ μέλαν "очень темный" (Гесихий)". BirdValiant (talk) 01:45, 24 February 2020 (UTC)
- And why have you deleted the Verweij source? Now, between *jьlъ and *jìlъ, it is no longer referenced at all, meaning that the further evidence for the u-stem noun has been hidden. BirdValiant (talk) 01:55, 24 February 2020 (UTC)
- My mistake on removing the Verweij source. You asked about how we format entries and this is how it's done. --{{victar|talk}} 02:09, 24 February 2020 (UTC)
- @Victar: If you insist on deleting the *jilъ entry, then at the very least, its information should be included in *jьlъ. As stated in WT:PROTO, with my emphasis: "Note that in cases where different reliable references disagree over certain details in a reconstruction, both references should usually be provided, even if only to note in some cases, for the sake of avoiding ambiguity, that a certain older form is now considered defunct. Pages should generally be named after whichever form has the broadest support among contemporary experts in the field, without Wiktionary weighing in on which form is more or less accurate; variants and disputed forms can then be addressed in great detail within the text of the pages themselves." Whether to include the information onto *jьlъ or link to a separate *jilъ, either way I think is fine. But if the guideline says that the information can be expressed in great detail, and the information is being referenced by a reputable source, then I think that it would be a mistake to hide it.
- Also, I removed the AP-a specification for *jьlъ for now because, technically, neither Derksen nor Vasmer specify the accent paradigm for that variant. The word is absent from Olander's wordlist, so I am unable to find the exact AP for the *jьlъ form in any references that have been presented so far. BirdValiant (talk) 03:22, 24 February 2020 (UTC)
- @BirdValiant: You're now saying two conflicting things in the entry: a) that it's derived from a u-stem, b) that it could be reconstructed as a u-stem. --{{victar|talk}} 04:19, 24 February 2020 (UTC)
An indexing system for Japanese non-lemma forms
Many electronic dictionaries allow you to arrive at the lemma entry by searching for a non-lemma form. On Wiktionary, this is done by mass creating non-lemma forms, manually or by bots. But this needn't be the case. Inspired by the awesome User:Yair rand/FindTrans.js, the following approach immediately suggests itself:
- Make the inflection template track the lemma entry with non-lemma forms as keys. For example, 微笑む should be tracked with the keys index/infl/ja/微笑みます, index/infl/ja/微笑まない, etc.
- Write JavaScript so that whenever the user searches X, a list of pages that link to Template:tracking/index/infl/ja/X is fetched from the MediaWiki APIs and displayed on the search result page.
As a demonstration, I turned on tracking for kanji inflected forms in the OJAD and wrote User:Dine2016/FindLemma.js. Enabling this script makes the search result page of 微笑みます display the following message:
Aside from inflected forms, we can also apply this approach to all sorts of romanizations, so that searching the nonstandard "gakkou" may arrive at 学校.
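To make the two steps above concrete, here is a rough Python sketch of how the tracking key and the lookup request could be built. It is not the code of FindLemma.js, which is JavaScript. The helper names are hypothetical, and the use of the MediaWiki `list=embeddedin` query is an assumption based on tracking templates being transcluded; the real script may query differently.

```python
from urllib.parse import urlencode

# Assumption: the standard MediaWiki API endpoint for English Wiktionary.
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def tracking_key(lang: str, form: str) -> str:
    """Key under which a lemma entry is tracked for one non-lemma form."""
    return f"index/infl/{lang}/{form}"


def tracking_template(lang: str, form: str) -> str:
    """Title of the tracking template whose users are the candidate lemmas."""
    return "Template:tracking/" + tracking_key(lang, form)


def lemma_lookup_url(lang: str, form: str, limit: int = 50) -> str:
    """Build (but do not send) an API request listing pages that use the template."""
    params = {
        "action": "query",
        "list": "embeddedin",  # pages transcluding the given title
        "eititle": tracking_template(lang, form),
        "eilimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

For example, `tracking_template("ja", "微笑みます")` gives Template:tracking/index/infl/ja/微笑みます, and the pages returned by the corresponding query would be the lemma candidates (here, 微笑む) to show on the search result page. The same scheme would extend to romanizations by tracking keys such as "gakkou" under a separate prefix.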
(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 06:26, 22 February 2020 (UTC)
- It's a good idea. But I'd rather have the inflected forms be bot-created, to be honest. —Μετάknowledgediscuss/deeds 06:33, 22 February 2020 (UTC)
- @Metaknowledge: Yes, but we currently allow only one kind of romanization (Wiktionary Hepburn). The new approach would allow users to find entries by other romanizations as well. As for inflection, there are several grammatical theories with different notions of "words" and we haven't agreed on which set of inflected forms we should create. --Dine2016 (talk) 04:04, 23 February 2020 (UTC)
- Well, I wouldn't mind romanisation entries for gakkou, personally. And maybe we need a discussion about which inflected forms can be created. —Μετάknowledgediscuss/deeds 18:48, 23 February 2020 (UTC)
By the way, I wonder if we can make the message server-side rendered, on /wiki/X (whether it's an entry or a page-does-not-exist message) as well as on search result pages. --Dine2016 (talk) 05:35, 23 February 2020 (UTC)
Proposal to use ⟨ʕ⟩ ⟨ʔ⟩ instead of ⟨ʿ⟩ ⟨ʾ⟩ in Proto-Semitic transcription
The current de facto Wiktionary transcription system for Proto-Semitic is the traditional Semiticist system which uses the Hans Wehr apostrophes ⟨ʿ⟩ ⟨ʾ⟩ to represent the reconstructed phonemes */ʕ/ and */ʔ/. Much of the contemporary literature on Proto-Semitic has shifted towards the use of the IPA symbols ⟨ʕ⟩ ⟨ʔ⟩. The case for this is laid out here on Wiktionary:About Contemporary Arabic#Romanization_only by @M. I. Wright who argued:
- Use IPA ⟨ʕ⟩ ⟨ʔ⟩, not their Hans Wehr equivalents. Not only are the apostrophic Hans Wehr marks hard to see and distinguish from one another, but their use also sort of feeds the West-centric misconception that the glottal stop & pharyngeal approximant/fricative are "not really consonant sounds" not deserving of their own letters. Besides, ⟨ʕ⟩ ⟨ʔ⟩ are derived from the same apostrophes that Hans Wehr uses.
Marijn van Putten, an authority on Berber and Semitic historical linguistics, is in agreement with this line of argumentation and has been vocal about his opposition to the use of ⟨ʿ⟩ ⟨ʾ⟩, saying: “This is exactly why we shouldn't use things that look like diacritics to represent consonants. I compromise often, but I hate it. In my book I use ʔ and ʕ.”[3]
An exhaustive survey would be impossible, but here are just a few important publications and scholars who have opted in favor of ⟨ʕ⟩ ⟨ʔ⟩:
- John Huehnergard, the leading authority on comparative Semitic linguistics (See The Semitic Languages, 2019 [4])
- Benjamin Suchard (See The Development of the Biblical Hebrew Vowels, 2019[5])
- Lutz Edzard (See “Notes on the Emergence of New Semitic Roots in the Light of Compounding”, 2017[6])
- Marijn van Putten (See publications[7])
I propose that Wiktionary formally implements the full-sized symbols ⟨ʕ⟩ ⟨ʔ⟩ as the standard for Proto-Semitic transcription. If we can agree on this, I can be responsible for moving Proto-Semitic entries and fixing any links. @Fay Freak. Rhemmiel (talk) 07:26, 22 February 2020 (UTC)
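As a rough illustration of what the mechanical part of this change would involve, here is a short Python sketch of the character substitution. It is only a sketch: the function name is hypothetical, the sample strings in the usage note are illustrative, and a real move of entries would also have to handle page titles, links and redirects.

```python
# Map the half-ring modifier letters to the full-sized IPA symbols.
HALF_RING_TO_IPA = str.maketrans({
    "\u02bf": "\u0295",  # ʿ (modifier letter left half ring)  -> ʕ
    "\u02be": "\u0294",  # ʾ (modifier letter right half ring) -> ʔ
})


def to_full_sized(translit: str) -> str:
    """Replace ⟨ʿ⟩ ⟨ʾ⟩ with ⟨ʕ⟩ ⟨ʔ⟩ in a transcription string (hypothetical helper)."""
    return translit.translate(HALF_RING_TO_IPA)
```

Applied to strings such as *ʿayn- or *raʾš-, this would give *ʕayn- and *raʔš-. All other characters, including ⟨š⟩ and ⟨ś⟩, pass through untouched.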
- As someone with relatively poor eyesight, this would be helpful, but I worry that we are introducing inconsistency into how we handle Semitic romanisation. Of course, PSem is freer to diverge because it is limited to the realm of academia; it is more troubling to me that various Arabic languages are currently inconsistent in this respect. —Μετάknowledgediscuss/deeds 07:42, 22 February 2020 (UTC)
- Yeah, this is definitely something that needs to be worked out over at Wiktionary:About Contemporary Arabic (I would support the use of ⟨ʕ⟩ ⟨ʔ⟩ across the board, with ⟨ʾ⟩ reserved for word-initial elidible glottal stops as M. I. Wright proposed). On the other hand, I don’t think it's too pressing of an issue seeing as how these are really just variants of the same symbols. It’s a separate discussion, but what concerns me more is the inconsistency in our representation of the Semitic sibilant series. Namely that:
- Proto-Semitic */s/ is transcribed as ⟨š⟩, which represents /ʃ/ in our romanisation of the daughter languages
- Proto-Semitic */t͡s/ is transcribed as ⟨s⟩, which represents /s/ in our romanisation of the daughter languages
- Proto-Semitic */ɬ/ is transcribed as ⟨ś⟩, which represents /ɬ/ in our romanisation of the daughter languages (however, original Hebrew /ɬ/ is inconsistently transcribed as either ⟨ś⟩ or ⟨s⟩ in the descendants section of PS entries)
- Unfortunately, developments in Semitic historical linguistics have rendered the original Semiticist transcription system woefully inadequate and at times downright confusing, but many continue to use it for lack of an agreed-upon alternative. Rhemmiel (talk) 10:50, 22 February 2020 (UTC)
- @Rhemmiel, Metaknowledge: I would definitely not support ʕ and ʔ for Arabic transcriptions, if that's also what's being suggested. --{{victar|talk}} 01:40, 24 February 2020 (UTC)
- That was the proposal from the start. —M. I. Wright (talk, contribs) 06:31, 25 February 2020 (UTC)
- Perhaps your proposal, but this discussion was created to address Proto-Semitic entries. --{{victar|talk}} 17:30, 28 February 2020 (UTC)
- Oh, my bad. Most of the discussion around this that I've seen has been centered around Arabic, much of which Rhemmiel cited originally, so I missed the mention of PS in the header. —M. I. Wright (talk, contribs) 05:39, 3 March 2020 (UTC)
- I fain suffered Rhemmiel’s choices for uniformization, particularly ⟨ʕ⟩ ⟨ʔ⟩, not ever liking the ⟨ʿ⟩ ⟨ʾ⟩. One knows my stance that I would even be for writing them as ⟨c⟩ ⟨ↄ⟩ which are real letters that actually fit the character of the Latin alphabet and the former of which is in general use in the Cushitic Latin alphabets for this sound, but maybe this is for 2030. Fay Freak (talk) 14:06, 22 February 2020 (UTC)
- For my part, I love ⟨c⟩ [ʕ] in native-speaker-oriented alphabets — but given that transliterations are for a foreign audience, I find spellings like كَعْك (kack) a bit uncomfortable. That's why I'm for ⟨ʕ⟩ ⟨ʔ⟩. —M. I. Wright (talk, contribs) 19:38, 22 February 2020 (UTC)
- @Rhemmiel: Do you know any other options for ⟨ṯ̣⟩ / ⟨ṱ⟩ / ظ (ẓ)? Both are very stodgy. Depending on font and font size the first one you chose even looks identical to ⟨ṯ⟩ / ث (ṯ). Fay Freak (talk) 16:18, 22 February 2020 (UTC)
- Agreed 6000%, and it would do us well to get away from the Hans Wehr ⟨ẓ⟩ altogether if that's not too disagreeable... Jonas Sibony uses ⟨ŧ⟩ ⟨đ⟩ for the interdentals in his "Semitic cognates" Tweets, and these very readily take a combining underdot as in ⟨ŧ̣⟩ ⟨đ̣⟩. Maybe we could look into using those. —M. I. Wright (talk, contribs) 19:38, 22 February 2020 (UTC)
- @Fay Freak, Rhemmiel I am fine with using ⟨ʕ⟩ ⟨ʔ⟩. The symbols ⟨ʿ⟩ ⟨ʾ⟩ are hard to read, and even worse, in some fonts (e.g. the font used for editing module code), they are displayed backwards from what's expected. If there's general agreement, I can change the module and use a bot to fix all cases where manual Arabic transliteration occurs. Benwing2 (talk) 05:59, 23 February 2020 (UTC)
- @Fay Freak There’s no obvious alternative unfortunately, but I’ll start a discussion over at Wiktionary_talk:About_Proto-Semitic and make a list of all the different ways people have handled representing */θʼ/. @M. I. Wright I agree about ⟨ẓ⟩, and I think Jonas Sibony’s tweets tend to look a bit visually cluttered (especially the ⟨ŧ̣⟩), but aesthetic preferences should definitely take a backseat to display capabilities here. A lot of older literature also, for some inexplicable reason, uses ⟨ŧ⟩ to transliterate Hebrew/Aramaic ṭēt. @Benwing2 That would be wonderful! I’ve noticed them displaying backwards before when copy/pasting but could never figure out what was up. Rhemmiel (talk) 07:08, 23 February 2020 (UTC)
- @Atitarev What do you think of the above proposal to use ⟨ʕ⟩ ⟨ʔ⟩ in place of ⟨ʿ⟩ ⟨ʾ⟩? If you agree, I can implement it. Benwing2 (talk) 21:05, 23 February 2020 (UTC)
- @Benwing2: Not my preference for Arabic but if everyone else wants it, I won't oppose. I'm for consistency and following agreed rules. --Anatoli T. (обсудить/вклад) 21:08, 23 February 2020 (UTC)
- Definitely would not support for Arabic. --{{victar|talk}} 01:59, 24 February 2020 (UTC)
- @victar: We're gonna need a little more than that. Can you elaborate? You just responded to a whole lot of good arguments against the apostrophes and for ⟨ʕ⟩ ⟨ʔ⟩, and, personally, I can't see an argument for the apostrophes that invokes anything other than tradition/convention. Which isn't much good for an argument! It's OK to admit that convention can be wrong and that we can work toward fixing it. —M. I. Wright (talk, contribs) 01:37, 25 February 2020 (UTC)
- ʕ and ʔ are fine for academic use, but your average Arabic reader isn't going to understand them. There's no good argument I can see made for straying for their use in standardized Arabic transcriptions, and certainly not "because they're easier to read". --{{victar|talk}} 07:22, 25 February 2020 (UTC)
- How many "average Arabic readers" are going to be needing transcriptions from English Wiktionary? I can see an argument for compatibility with other reference works, but improving usability for people who are going to have trouble even reading the definitions isn't high on my priority list. Chuck Entz (talk) 07:40, 25 February 2020 (UTC)
- A lot, I would say. 1. en.Wikt is a great source for Arabic etymologies, and 2. anyone who speaks Arabic as a second language. Also, just to point out, many Arabic speakers commonly type online in Latin characters. If anything, employing academic characters ʕ and ʔ decreases usability simply because they aren't understood. And to your first point, ʿ and ʾ are the forms used in academic transcriptions of Arabic. --{{victar|talk}} 07:58, 25 February 2020 (UTC)
- @victar: If you would support the creation of Arabizi entries as Latin script alternative forms of contemporary Arabic lemmas—akin to how Gothic and Japanese are handled—then by all means I would back you. ⟨2⟩ and ⟨3⟩ are certainly preferable to ⟨ʾ⟩ and ⟨ʿ⟩ in that they are clearly meant to represent full consonant sounds and are widely understood by the “average Arabic reader.” But what we are discussing here is transliteration, not alternative orthographies used by speakers. Rhemmiel (talk) 11:29, 25 February 2020 (UTC)
- I would not and I'm aware of what's being discussed. --{{victar|talk}} 17:53, 28 February 2020 (UTC)
- @victar: I don’t mean to be presumptuous, forgive me if I came off that way. To be honest, using anything other than ⟨ʿ⟩ and ⟨ʾ⟩ for Arabic transliteration doesn’t feel completely right to me either. For a long time I wouldn't have considered anything else. That being said, it’s been my experience that colloquial Arabic contributors overwhelmingly tend towards the use of ⟨ʕ⟩ ⟨ʔ⟩ in manual transliteration. As part of a pattern, these manual transliterations must then be “fixed” and brought in alignment with the standardized scheme by editors whose focus is typically MSA. At a certain point, I have to ask myself what is more valuable: adhering to a set of academic standards out of convention, or listening to the practical concerns of Arabic speakers like @M. I. Wright, who are doing the important work of documenting colloquial Arabic dialects as living languages. I’m sure that as strange as transliteration with ⟨ʕ⟩ and ⟨ʔ⟩ might appear now, after a week or so of getting used to it, it wouldn't be anything out of the ordinary. Rhemmiel (talk) 01:58, 3 March 2020 (UTC)
- @Rhemmiel: And you'll have to forgive me if I find the statement, "colloquial Arabic contributors overwhelmingly tend towards the use of ⟨ʕ⟩ ⟨ʔ⟩", laughable. If you want to try and get over 2/3rds of editors to agree with you, go right ahead and start a vote on it. --{{victar|talk}} 04:28, 3 March 2020 (UTC)
- Yeah, I don't want my preference unduly elevated here just because I'm an Arabic speaker. I think it stands on its own on objective grounds of the sort Rhemmiel raised initially. When you say that the average reader isn't going to understand ⟨ʕ⟩ ⟨ʔ⟩, it's not like ⟨ʿ⟩ ⟨ʾ⟩ are actually any more helpful ― they just fool the reader into thinking they are. But I'd be fine with a vote, of course, and I get why it isn't likely to pass. No big deal if it doesn't. —M. I. Wright (talk, contribs) 18:55, 6 March 2020 (UTC)
Feedback requested on the best way to catalogue business-speak
There is a subset of English that I've usually referred to as org-speak but it could be called business-speak, garbage-language, etc. that relies on certain tropes like buzzwords, needless metaphors, flagrant BSing, etc. I think there's genuine value in somehow recording terms like these but I'm struggling with how best to do so... Is this appendix-only trivia? Would it be appropriate to have a context label and category? If so, what would that label be? Any thoughts? —Justin (koavf)❤T☮C☺M☯ 20:23, 22 February 2020 (UTC)
- I definitely think we should include any that are citable, and grouping them somehow would be ideal. Maybe corporate jargon could work as a label? Andrew Sheedy (talk) 21:22, 22 February 2020 (UTC)
- We try to have neutral labels and category names. DCDuring (talk) 03:19, 23 February 2020 (UTC)
- I didn't realize that wasn't neutral. I suggested it precisely because I thought it was both neutral and descriptive... Andrew Sheedy (talk) 22:30, 23 February 2020 (UTC)
- I seem to recall that some years back we removed "jargon" labels. Equinox ◑ 22:39, 23 February 2020 (UTC)
Merging city and town categories for non-English places where this distinction doesn't apply
@Donnanz I brought this up before somewhere (I can't find where). Before, I suggested a general merger of city and town categories into e.g. Category:Cities and towns in Nebraska, USA. Some people objected that there are legal distinctions between cities and towns. However, these distinctions only apply in certain countries. Others (e.g. Russia, Greece, France, etc.) don't make such distinctions, and so the determination of whether a given urban area is a city or town is fairly arbitrary. France doesn't even seem to make a clear distinction between city, town and village, and calls them all "commune", which approximately translates as "municipality" (but in many other places, a "municipality" is clearly distinct from and bigger than a city or town). My instinct is to group things according to the country's legal system; hence, England would make a three-way distinction between city, town and village, whereas Russia would make a two-way distinction between city-town and village, and France would combine them all as communes. Wikipedia's Category:Lists of cities by country does a pretty good job of indicating how to do these combinations. What do people think? Benwing2 (talk) 06:19, 23 February 2020 (UTC)
- @Benwing2: Going country by country, it can be difficult. My experience with Norwegian by is that it can be either a city or town, whereas storby is usually a city and landsby is a village. A Norwegian municipality is a kommune, which can include several towns and villages. Sometimes I take population into account, over 30,000 could be regarded as a city rather than a town, but that doesn't apply in Britain of course. I don't know of an easy answer, but I do have the feeling that in many countries putting them all into a "places in" category, as is being done now, is the fall-back solution. DonnanZ (talk) 13:17, 23 February 2020 (UTC)
- Is there really much uniformity in structure among US, UK, France, and other countries? All places have placenames that correspond to legally meaningful entities and other placenames that don't. It would seem that we need to follow local practice. This can be hard if boundaries are fluid or if there is not much information on the web. I also assume that most countries have places with informal names in common use. New York City has lots of "neighborhood" names, like Riverdale, City Island, South Bronx, Morrisania, Kingsbridge, Co-op City (all in the Bronx, but similar names apply in all five boroughs of NYC). (See w:List of Bronx neighborhoods.) I have the feeling that a category structure that attempts to include all the municipal government structures would have to have very vacuous names. DCDuring (talk) 17:16, 23 February 2020 (UTC)
- @DCDuring I am not at all wedded to following the legal government structures. In fact I'd rather that we use e.g. "neighbo(u)rhood" consistently in category names in place of barrio, barangay, etc. My main concern is that for many countries, the distinction between city and town is fairly arbitrary and has no basis in the language or legal system, and by having distinct categories for the two, we're introducing an artificial distinction that doesn't make a lot of sense. For this reason I originally proposed merging the two categories across the board, but some people didn't like that. What do you think? Benwing2 (talk) 21:10, 23 February 2020 (UTC)
- Put me in the "i-don't-like-it" category. If you don't follow the legal classification, you will end up either with a hard-to-maintain system that doesn't correspond to the daily experience of normal users and/or very big categories, e.g., Telugu toponyms. Among other things, one can probably find lists of all of the toponyms that correspond to given national or other legal terms that give would-be contributors bite-sized tasks to complete and a sense of what other lists would be worth locating and adding from. I'm not really sure what value any Wiktionary category system has without some reference to commonly used real-world categories. DCDuring (talk) 02:33, 24 February 2020 (UTC)
- @DCDuring I'm very confused. Up above you seemed to have expressed the opinion that we should not follow the legal structure of a given country ("I have the feeling that a category structure that attempts to include all the municipal government structures would have to have very vacuous names"), and now you appear to be saying the opposite. Can you clarify your opinion? Benwing2 (talk) 07:23, 24 February 2020 (UTC)
- BTW What do you think of my proposal to merge city and town categories into Category:Cities and towns in X for X = various countries? If you don't like it, what about the fact that some countries don't have the city vs. town distinction? Benwing2 (talk) 07:24, 24 February 2020 (UTC)
- I don't care that some countries don't have the distinction. IMO, we should have whatever legal distinctions they do have. If that messes up our category structure, so much the worse for universal categories in this realm. If you need every toponym to be in a category, Category:Toponyms is available, as are Category:Inhabited places, Category:Inhabited places with permanent residents, Category:Inhabited places with local governments.
- If category membership depends upon what English word is applied to a place, then, in principle, we should have evidence concerning what English word is applied. I don't think you would want to do that for every place to be categorized. A sounder scheme might be for each country (or other-named entity) to have its own toponym categories derived from the local language(s) with an English translation or Latin transcription of the local name(s) and English alias(es) like city, town, or village for the categories.
- Using categories built on local-language names has the advantage that one could take advantage of work done in local language pedias and wiktionaries, which is likely to be more complete than what we can achieve on our own. DCDuring (talk) 15:19, 24 February 2020 (UTC)
I'm happy with merged cities and towns across the board. I don't think there's much use in a category just for towns. Ultimateria (talk) 17:34, 24 February 2020 (UTC)
Geographyinitiative's initiative to deunify Chinese L2 or create a chaos
Geographyinitiative (talk • contribs) continues with his agenda despite knowing the community disagrees. The derived term section at 會#Pronunciation_2 consists entirely of unmarked various dialectal terms. See one of the previous talks here: Talk:吃飽. (Notifying Kc kennylau, Tooironic, Jamesjiao, Meihouwang, Suzukaze-c, Justinrleung, Hongthay, Mar vin kaiser, Dokurrat, Zcreator alt, Dine2016, Geographyinitiative): --Anatoli T. (обсудить/вклад) 11:01, 23 February 2020 (UTC)
- @Atitarev The chaos is inherent in a system which includes the languages of eight Wikipedia versions into one language header. --Geographyinitiative (talk) 11:19, 23 February 2020 (UTC)
- Hanyu Pinyin is great and important. But there's other stuff too! There is really no reason at all to ignore the other romanization systems from the other linguistic forms in the derived terms sections. The only reason to ignore them is because of the drive to eliminate non-Mandarin in the PRC. Tailo romanizations (very close to the POJ which is used in church publications in Taiwan and on the Min Nan Wikipedia) are now allowed on Taiwanese (ROC) passports. "the Ministry of Foreign Affairs announced that Taiwanese are now allowed to use the romanized spellings of their names in Hoklo, Hakka and Aboriginal languages for their passports." My agenda is to show readers how things are read in "some form of Chinese". If that agenda is "wrong" in this community, this dictionary will not make it in the long run because it is supporting old-school PRC/ROC objectives (which may now be somewhat obsolete and exist primarily as widespread prejudices against non-Mandarin speech) instead of documenting linguistic realities. My suggestion is rather than hating me just continue to make the dictionary better by improving coverage of these languages. --Geographyinitiative (talk) 11:23, 23 February 2020 (UTC) (modified again)
- You have ignored all calls to separate your lists by dialects and romanisations. It's not about what is great. No dialects are suppressed here and the coverage is only growing but you are the one who has been making a mess out of it. The discussion at Talk:吃飽 clearly shows that you are the only one creating the mess and insisting that it's right. The previous block apparently had no effect on you. --Anatoli T. (обсудить/вклад) 11:30, 23 February 2020 (UTC)
- "No dialects are suppressed here"- the fact that they are all under one header in which Mandarin is listed first is an unbelievable statement of oppression. Don't say they aren't suppressed: they are. --Geographyinitiative (talk) 11:34, 23 February 2020 (UTC)
- Min Nan: has own Wikipedia version
- Min Nan: has entries for Romanized forms
- Also Min Nan: Chinese
- --Geographyinitiative (talk) 11:37, 23 February 2020 (UTC)
- I'm sorry your eyes are hurt by the fact that these words and these romanizations exist, but they do and us ignoring it damages the dictionary. The mess you are talking about is people's native languages. --Geographyinitiative (talk) 11:39, 23 February 2020 (UTC)
- If we are going to throw all "Chineses" in one language header, there's no reason to ignore the romanizations of those languages. It would be like putting all CJKV character languages under one header and showing Ancient Chinese reconstructions as the pronunciations and considering Japanese and Korean transliterations too messy to display. --Geographyinitiative (talk) 11:43, 23 February 2020 (UTC)
- Hanyu Pinyin is part of Chinese, not the whole thing. --Geographyinitiative (talk) 11:46, 23 February 2020 (UTC)
- Attempts to pretend I'm the only one that sees a problem will not work my friend. --Geographyinitiative (talk) 11:52, 23 February 2020 (UTC)
- @Geographyinitiative: I am tempted to consider these posts themselves as disruptive. You need to establish consensus for a dramatic change and I recommend against spamming this page with a dozen comments in a row. —Justin (koavf)❤T☮C☺M☯ 12:08, 23 February 2020 (UTC)
Again, I think the long-term solution is to create separate L2 sections for the non-Mandarin topolects, so that each section can use its own romanization system. Doing so also encourages the addition of non-Mandarin examples (the examples on 會 are exclusively Mandarin even though the word isn't, which is a shame). The Chinese section can continue to hold non-Mandarin pronunciations because Written Standard Chinese can be read in any topolect. --Dine2016 (talk) 15:40, 23 February 2020 (UTC)
- Different L2 sections are not going to be created any time soon, unless there is a consensus to do that and there isn't one in sight. It is extremely confusing posting all sorts of romanisations without any markings. --Anatoli T. (обсудить/вклад)
Cool, romanizations in different dialects. They are totally unmarked, and therefore useless. —Suzukaze-c◇◇ 19:16, 23 February 2020 (UTC)
- Exactly. Even if he had a preference for Min Nan and they were all POJ, it would make more sense to users, if they knew who was the editor, LOL. Now, it's all different unmarked dialects, yes, totally useless. -Anatoli T. (обсудить/вклад) 20:53, 23 February 2020 (UTC)
If anyone wants to actually test this to see if it's coherent or better as an alternative, I suggest making a sandbox and showing the existing formatting and the proposed changes side-by-side. It's much better than bickering live in the dictionary itself. As someone who is pretty ignorant of Chinese and CJK(V) characters, I would be sympathetic to seeing headers that don't merge all Chinese languages with Mandarin or that can provide some intelligible way to understand how a certain character is used differently from Hakka to Wu to Yue, etc. —Justin (koavf)❤T☮C☺M☯ 21:38, 23 February 2020 (UTC)
- @Koavf: It is very confusing for users not familiar with the different romanisations to make sense of them. The possible options are discussed and presented at Talk:吃飽. Nothing is simple but Geographyinitiative is not really looking for a solution, he just continues to push what he likes. When he is called out on his actions, he starts shouting about the unfairness of the Chinese merger. -Anatoli T. (обсудить/вклад) 21:56, 23 February 2020 (UTC)
- That's the impression that I get but I'm willing to assume good enough faith that he genuinely wants this to be a better dictionary and wants to help display and collate information on Chinese languages in a way that is logical and helpful. The discussion at Talk:吃飽 is helpful (thanks) but I'd like Geo to actually make examples so that we can see them live. Otherwise, it is a lot of complaining with no pay-off. No one is going to be on your side if you're just revert-warring and posting a cascade of complaints. —Justin (koavf)❤T☮C☺M☯ 22:08, 23 February 2020 (UTC)
- @Koavf: I haven't reverted his edits yet and present them as they are. User:Dine2016 and User:Suzukaze-c are showing Geo how to do things (one of the ways) to just use {{qualifier}}. He likes the methods but he still goes ahead and uses his style. --Anatoli T. (обсудить/вклад) 22:18, 23 February 2020 (UTC)
- If a "Chinese" word has a known romanization, it should be displayed, even if there is no Hanyu Pinyin. It gives an English speaker a purchase on the approximate sound of a word. What argument can be had against that? The only argument I see is that Mandarin is the only legitimate form of Chinese deserving display of its romanization scheme. That argument is so heartbreaking and disgusting that I'm surprised there would be anyone to stand up for it. As for the argument "it's confusing"- well, you are throwing the languages of eight Wikipedia versions into one language header and giving Mandarin top billing. There's your problem in terms of causing confusion. Chinese language: "Chinese...is a group of languages". Don't pretend to stand on keeping the dictionary nice and tidy so you can stamp on non-Mandarin languages. Geographyinitiative (talk) 22:35, 23 February 2020 (UTC)
- Has there been a known instance of confusion caused by an edit like I made, or is there no evidence whatsoever of any confusion caused by adding these romanizations? Everyone understands that things besides Hanyu Pinyin are still part of "Chinese" and that they may show up sometimes. --Geographyinitiative (talk) 22:39, 23 February 2020 (UTC)
- From my point of view, this is all about demanding Hanyu Pinyin dominance and not about some kind of concern about the organization and tidiness of the dictionary. The goal is to make it seem like only Hanyu Pinyin is a useful or legitimate romanization for Chinese characters. That's a very limiting view of our past and our future. --Geographyinitiative (talk) 22:42, 23 February 2020 (UTC)
- Middle English has its own header on Wiktionary. Ever consider giving Classical Chinese its own header? Classical Chinese has its own Wikipedia version after all. But no: we are bent on demanding Mandarin and Hanyu Pinyin be put at the top of a "Chinese" header and thereby perpetuating the lie that Chinese is one language when Wikipedia says it isn't. Imagine for a moment if you might be on the wrong side: it might be true that Wikipedia's Chinese language article is right, and that this effort to create 'tidiness' is actually part of a systematic prejudice that demands suppression of non-Mandarin languages. What's more likely- that there's a tidiness problem or a prejudice problem? --Geographyinitiative (talk) 22:52, 23 February 2020 (UTC)
- Community consensus is not against the varieties of Chinese and their romanizations. It is in favor of them and of displaying their romanizations. --Geographyinitiative (talk) 22:54, 23 February 2020 (UTC)
- Frankly I'm not interested in all your messages attempting to pwn Mr. Titarev and your other opponents. In his original post in this thread, he brought up the template {{zh-der|會曉:ē-hiáu|會使:â̤-sāi|哪會;nai voi|會堪得:ē-kham-tit}}, which you added under the section 會#Pronunciation 2. Please explain how uninformed readers are supposed to figure out which romanization belongs to which language here. Without looking at the entries, I'm guessing maybe there are Min-Nan and Wu romanizations here, based on my limited knowledge of Chinese languages, but readers who are less informed than me won't get that far. If you want to present more romanizations, I think it's best to label the romanizations somehow. — Eru·tuon 23:19, 23 February 2020 (UTC)
- This is not about which romanization is better. This is about not further complicating display of huge blocks of text by randomly inserting unlabeled strings of characters that mean nothing to the vast majority of English speakers who are studying Chinese. We can, should, and do include all the information about these romanizations in the proper section of the entries, but we consistently use Hanyu pinyin elsewhere because it's easier for our readers to not have to learn a bunch of romanization systems for basic navigation.
- To most people, this text is just random gibberish that they have no context or references to explore further- it doesn't even link to anything. But then, you're only concerned with symbolism and ideology. As long as you get to slip in passive-aggressive little provocations to drive home your protests to your fellow editors, you don't care at all about whether it makes any sense whatsoever to our users. You talk about consensus, but you've repeatedly expounded at extreme length about how the current consensus is evil appeasement of sinister totalitarian forces. Chuck Entz (talk) 00:02, 24 February 2020 (UTC)
- @Geographyinitiative I'm afraid your edits don't actually support the Chinese varieties. The one romanization you pick for each of the terms seem to be picked at random. Why have one Hokkien romanization for 會曉, a Min Dong romanization for 會使 and a Hakka romanization for 哪會? Aren't you suppressing other varieties by doing so? 會曉 is also used in Min Dong, and 會使 is also used in Hokkien - what's the deal? Just trying to flaunt "diversity" and make the entry utterly useless? If someone is only familiar with Pinyin (or any other romanization), they'd be scratching their heads trying to figure out what on earth they're looking at. — justin(r)leung { (t...) | c=› } 00:27, 24 February 2020 (UTC)
- I don't want to get myself actually banned here, so if there are specific edits that need to be reverted, revert them and I will basically have no recourse except to fall in line. I have adapted to (while vigorously pointing out) the limitations of thought imposed on me in Wiktionary and I will adapt to incorporate any limitations imposed on me in this case as well. What I really want is to keep having access to edit this incredible (though I would say flawed) resource which is a very helpful exercise for me in my studies and other editing on Wikipedia. --Geographyinitiative (talk) 00:48, 24 February 2020 (UTC)
- @Geographyinitiative: Here's another alternative: set up User:Geographyinitiative/sandbox to compare and contrast what you think are the relative merits of your approach and the deficiencies of the other and make your case. —Justin (koavf)❤T☮C☺M☯ 00:54, 24 February 2020 (UTC)
I have created this vote to provide a standard for our chemical formula entries, and I would appreciate feedback to improve it. —Μετάknowledgediscuss/deeds 22:02, 23 February 2020 (UTC)
- I made an edit that I don't think is terribly controversial: revert if you disagree. I feel like the tone is a little... combative? Aggressive? But it does help to explain the problem and I think this is a real issue that does need to be resolved. —Justin (koavf)❤T☮C☺M☯ 00:18, 24 February 2020 (UTC)
- Feel free to edit for tone. I aim to write clearly and succinctly, but at the cost of sometimes seeming unintentionally abrasive. —Μετάknowledgediscuss/deeds 00:21, 24 February 2020 (UTC)
- Which is still better than unintentionally seeming intentionally abrasive. :) --Lambiam 22:06, 24 February 2020 (UTC)
Idioms within idioms
Currently, entries do not generally mark idioms that are used within other idioms, which hinders the understanding of embedded idiomatic expressions because the user gets no warning. For example, in ill at ease one can recognize at ease. --Backinstadiums (talk) 18:32, 24 February 2020 (UTC)
- I have linked “at ease” in the headword line, and additionally listed this as an antonym. I think most cases can be handled in a similar way, but each will have to be considered separately. --Lambiam 21:52, 24 February 2020 (UTC)
- Yes. If the "nested idiom" retains its sense in the longer idiom, it should be linked in the headword line, and whether or not it retains its sense, it should be linked in one of the "connected terms" sections: either synonyms / antonyms if appropriate, or related terms, or (at a minimum) see also. - -sche (discuss) 22:02, 24 February 2020 (UTC)
- It would be a worthwhile project to make the headword link to any component MWEs. Perhaps someone could produce a list of candidates for correction. It seems like a good programming puzzle to do it efficiently. DCDuring (talk) 04:57, 25 February 2020 (UTC)
- @DCDuring: Also for example the case where a meaning such as "disconcert" is shared by both throw and throw off (that is "throw (off)"); both entries must especially indicate they're essentially synonymous in such cases. --Backinstadiums (talk) 16:51, 26 February 2020 (UTC)
I've just realized that redirection is used in only one direction: in the balance shows hang in the balance, but searching for the latter doesn't show the idiomaticity of in the balance. --Backinstadiums (talk) 10:39, 3 March 2020 (UTC)
- Right. That's why we would need to modify the headword template for each case, as I have just modified it for [[hang in the balance]]. A list of possible offenders would enable us to be systematic about the changes. DCDuring (talk) 11:54, 3 March 2020 (UTC)
- @DCDuring: Can nonlinear pairs be automatically dealt with too, such as go to lengths? --Backinstadiums (talk) 16:05, 3 March 2020 (UTC)
- I wouldn't have called those 'nonlinear' pairs. What regular expression or computer algorithm would identify them? And then how do you think we should deal with them? That's above my paygrade. DCDuring (talk) 18:25, 3 March 2020 (UTC)
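As a rough illustration of the "list of candidates" idea raised above, here is a minimal sketch in Python. It assumes a plain list of entry titles is already available (for example from a dump); the function name `nested_mwes` and the sample titles are hypothetical. It only finds contiguous cases such as at ease inside ill at ease, not the non-contiguous pairs like go to lengths, which would need a different approach.

```python
def nested_mwes(titles):
    """Map each multiword title to the shorter multiword titles it contains.

    Only contiguous word sequences are checked, so "ill at ease" matches
    "at ease", but split or reordered idioms are not detected.
    """
    title_set = set(titles)
    result = {}
    for title in title_set:
        words = title.split()
        n = len(words)
        found = []
        # Proper sub-spans of at least two words (single words are not MWEs).
        for length in range(2, n):
            for start in range(n - length + 1):
                candidate = " ".join(words[start:start + length])
                if candidate in title_set:
                    found.append(candidate)
        if found:
            result[title] = found
    return result


# Hypothetical sample of entry titles, for illustration only.
sample = ["ill at ease", "at ease", "hang in the balance", "in the balance", "ease"]
candidates = nested_mwes(sample)
```

Because every sub-span is looked up in a set, the cost grows with the square of the number of words per title rather than with the number of title pairs, which should keep a full run over all multiword entries manageable.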
Categorization of Physical Traumas/Injuries?
Can someone familiar with medical stuff point out how to categorize terms such as injury, trauma, wound, sprain, graze, dislocation, fracture, rupture, bruise, cramp, etc.? Do they go under Cat:Medicine, Cat:Pathology or maybe Cat:Physiology? Shouldn't they have their own subcategory similar to {{cat:Diseases}}? Безименен 21:03, 24 February 2020 (UTC)
- If you want to sound medical, go with Category:Traumatology. But I think a Category:Injuries will be just fine. In either case this should be a subcat of Category:Pathology. Disclaimer: my familiarity with medical stuff is confined to the suffering end. --Lambiam 22:03, 24 February 2020 (UTC)
- @Lambiam: I was about to create Category:Injuries for Bulgarian, however, when I gave preview, Wiktionary informed me that the head category has not been created yet. Should I ask an admin for permission? Безименен 23:07, 24 February 2020 (UTC)
- OK, I've set up Category:Injuries. For Bulgarian, you would put entries into Category:bg:Injuries. - -sche (discuss) 05:43, 25 February 2020 (UTC)
- @-sche:, @Koavf Thanks. Безименен 10:12, 25 February 2020 (UTC)
Etymologies are not kept in the entries for the plurals
For example, slops shows the following two senses as if they shared a common etymology (unlike slop):
Plural of slop (“scraps that will be fed to animals, particularly to hogs”)
(nautical, dated) Clothing and bedding issued to sailors.
--Backinstadiums (talk) 12:37, 26 February 2020 (UTC)
- As it is, it all looks a bit confusing to me. -Mike (talk) 04:16, 27 February 2020 (UTC)
- You can split those etymology sections on the plural pages if you want. Better not to duplicate the ety text from the singular though: it is a maintenance headache to maintain two copies. Equinox ◑ 19:44, 27 February 2020 (UTC)
- I've split the senses by ety, but also tried to clean up where the senses are, since some were at slops but belonged at slop (AFAICT, since they can apparently also be used in the singular) or vice versa (if they are plural-only). I'm merely guessing as to where to put the footwear sense; please move it if necessary. - -sche (discuss) 19:05, 28 February 2020 (UTC)
- @-sche: Thanks. It is a general problem which I thought could be solved by keeping the etymologies in inflected forms such as the plural. Secondly, regarding this particular word, two genuine plural nouns for each etymology from the 14th century:
Leftover food, especially kitchen waste, that is fed to hogs [< Old English sloppe "dung" < Germanic]
Clothes and personal articles that are sold from the ship's store to sailors on a merchant ship [Probably < Middle Dutch]
Microsoft® Encarta® 2009
--Backinstadiums (talk) 19:23, 28 February 2020 (UTC)
- One of those is already listed in the entry, the other is not plural-only, AFAICT. - -sche (discuss) 19:29, 28 February 2020 (UTC)
- @-sche:
SLOP: (archaic) a loose smock or overalls [14th century. Probably < Middle Dutch] Microsoft® Encarta® 2009
--Backinstadiums (talk) 19:37, 28 February 2020 (UTC)
- (Other than possibly copyvio-ing a number of Encarta defs,) what of it? We have that basic sense at slop already. - -sche (discuss) 19:42, 28 February 2020 (UTC)
- @-sche: Apart from the genuine plural noun,
SLOP: 2. poor-quality unappetizing or watery food (often used in the plural) Microsoft® Encarta® 2009
Alternatively,
kitchen refuse; swill. [plural] badly cooked or unappetizing food or drink. [uncountable] Random House Learner's Dictionary of American English
--Backinstadiums (talk) 19:51, 28 February 2020 (UTC)
Additional interface for edit conflicts on talk pages
Sorry for writing this text in English. If you could help to translate it, it would be appreciated.
You might know the new interface for edit conflicts (currently a beta feature). Now, Wikimedia Germany is designing an additional interface to solve edit conflicts on talk pages. This interface is shown to you when you write on a discussion page and another person writes a discussion post in the same line and saves it before you do. With this additional editing conflict interface you can adjust the order of the comments and edit your comment. We are inviting everyone to have a look at the planned feature. Let us know what you think on our central feedback page! -- For the Technical Wishes Team: Max Klemm (WMDE) 14:14, 26 February 2020 (UTC)
Enhanced Password Reset now available on Wiktionary
Note: I am posting this on English Wiktionary, but I welcome this message being translated and shared on other wikis. Thank you!
Hello, everyone! The Community Tech team has released a new feature, which is called Enhanced Password Reset (EPR), to Wiktionary and Wikivoyage. With this feature, you can optionally select to require both username and email address to be submitted on Special:PasswordReset in order to generate password reset emails. This feature was developed by the Community Tech team, in response to the #3 wish in the 2019 Community Wishlist Survey. We decided to incrementally release the feature, so we released to Wiktionary and Wikivoyage first. The release to all other wikis will happen soon. In the meantime, we would love your feedback!
To enable the feature, go to the “Email options” section in “Preferences.” You can click on the checkbox that states, “Send password reset emails only when both email address and username are provided.” Once you click the checkbox and save, the preference is enabled. Please note that Password Reset Update is not a global preference by default. It is enabled per wiki. However, you can make it global in your global preferences. For more information on password resets and EPR, you can visit the Help:Reset_password page on MediaWiki. Thank you, and we look forward to checking out your feedback on the project talk page! --IFried (WMF) (talk) 19:51, 27 February 2020 (UTC)
Supplemental search in other namespaces after failed search
In Wiktionary:Tea_room/2020/February#Fourth_Point_of_Contact we have evidence of a use case in which a user failed to find in principal namespace something that could be found in Appendix namespace, within the text of an appendix. In the event of a failed search of principal namespace, should we try to offer users a supplemental search of, say, Appendix, Citations, and Thesaurus namespaces? Users familiar with these namespaces already can select them for a supplemental search, but it takes several keystrokes or mouseclicks, so even experienced users might benefit. I don't think that we would want to add other namespaces to the three, but Concordance might be worthwhile in principle. User space seems like too much.
If there are no objections, I will raise the "How can this be done?" question at GP. DCDuring (talk) 12:49, 29 February 2020 (UTC)
- I think all of those namespaces should be searchable as parts of the dictionary's proper content. Going from main to main+those and then all of the other meta stuff is a sensible flow. —Justin (koavf)❤T☮C☺M☯ 13:01, 29 February 2020 (UTC)
- Main, Appendix, Citations, Thesaurus, and Reconstruction too (it often contains terms which are not yet created) should be searched by default. Fay Freak (talk) 15:07, 29 February 2020 (UTC)
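One possible shape for such a supplemental search, sketched in Python against the standard MediaWiki search API (`action=query`, `list=search`, `srnamespace`). The namespace numbers below are assumptions about English Wiktionary's configuration and should be checked against the wiki's own namespace list; the function name `supplemental_search_url` is hypothetical. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

API = "https://en.wiktionary.org/w/api.php"

# Assumed namespace numbers for English Wiktionary; verify before relying on them.
SUPPLEMENTAL_NAMESPACES = {"Appendix": 100, "Thesaurus": 110, "Citations": 114}


def supplemental_search_url(term, namespaces=SUPPLEMENTAL_NAMESPACES):
    """Build a search API URL restricted to the supplemental namespaces."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        # srnamespace takes pipe-separated namespace numbers.
        "srnamespace": "|".join(str(n) for n in sorted(namespaces.values())),
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


url = supplemental_search_url("fourth point of contact")
```

A front end could call this only after a main-namespace search returns nothing, which matches the "failed search first, supplemental search second" flow proposed above.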
Link to a specific definition out of more than one
(I don't know if this is the right place for this question, but it seems the best fit.)
How can I make a link to one definition out of several that are given? Yid in English has 2 noun entries, one of which has two definitions. (I have no idea where "February" and "(plural Beer parlour/2020/Februarys)" came from.)
>>>>>>>>>>
Noun
February (plural Yidden or Yidn)
- (among Jews, informal) a Jew
- 2016, Mark Ledbetter, Victims and the Postmodern Narrative or Doing Violence to the Body, p. 84:
- Erdman's response is confessional, 'I am a Yid... I am! My father was a Yid. Please believe me!' (p. 211).
Noun
February (plural Februarys)
- (derogatory) a Jew
- 2014, Bob Cook, Disorderly Elements:
- “You're a fucking Jew, Grünbaum!” Frege yelled. “A miserable, fucking Yid.”
- (informal) a supporter or club member of Tottenham Hotspur F.C.
<<<<<<<<<<
I would like to link specifically to the first noun entry,
for something I am writing outside WMF space. But all my searches for a way to do this have been in vain. I have been editing Wikipedia for nearly 15 years, but Wiktionary only occasionally, and I don't know all the ins and outs and abbreviations that experienced editors use.
--Thnidu (talk) 03:17, 31 March 2020 (UTC)
- @Thnidu: Use {{senseid}}. Also, for simple questions, use the Information Desk. —Μετάknowledgediscuss/deeds 03:44, 31 March 2020 (UTC)
- (Also, it's not February any more... you should just use the + link at the top of the page next time you want to create a new discussion.) —Μετάknowledgediscuss/deeds 03:46, 31 March 2020 (UTC)
- @Metaknowledge: Thanks for your help.
- Re month: (facepalm).
- Re the + link at the top of the page: Oh. How am I supposed to know that? Wikipedia uses a button labeled "New section". See above, "all the ins and outs and abbreviations that experienced editors use". As I used to say at my job before last, "Word of mouth and telepathy make lousy documentation."
- Re {{senseid}}: Tried that, doesn't work:
{{senseid|en|among Jews}}{{lb|en|among Jews|informal}} a [[Jew]]
- displays as
- (with no bullet; where the heck did that come from?) which is reasonable, but the corresponding link
https://en.wiktionary.org/wiki/Yid#among_Jews does not go to that sense, but just to the top of the page. So I guess I will have to specify the location explicitly in text, because the link just doesn't do it.
- --Thnidu (talk) 05:45, 31 March 2020 (UTC)
- @Thnidu: At the top of WT:BP, there's a big link that says "Click here to start a new Beer parlour discussion." I don't see how it could be any clearer. While we're on the subject, of course your {{senseid}} link didn't work, because you didn't follow the documentation. Look at how the URL is formatted and copy it. —Μετάknowledgediscuss/deeds 05:50, 31 March 2020 (UTC)
- It would be nice if we could make the process much simpler, so less-dedicated contributors could add the anchors and create the links needed. In my imagination, but far beyond my programming skills, I could see highlighting a definition (including #), right-clicking, selecting "add senseid", selecting a short portion of the definition, selecting "preview senseid", optionally selecting "edit senseid", selecting "accept senseid"; finally one could click to "copy senseid link" for pasting into the appropriate source entry. It still isn't exactly intuitive, because the objective isn't very intuitive, but perhaps someone with a better imagination than mine could come up with more fundamental rethinking. DCDuring (talk) 17:44, 31 March 2020 (UTC)
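To make the point about the URL format concrete, here is a small Python sketch. It assumes, as the thread implies, that the anchor produced by {{senseid}} is not the bare id but the id combined with the language name; the exact separator is not stated in this discussion, so it is left as a required parameter to be copied from the template documentation. The function name `senseid_url` is hypothetical.

```python
from urllib.parse import quote


def senseid_url(page, language_name, sense_id, sep):
    """Build a link to a {{senseid}} anchor on an English Wiktionary page.

    `sep` is whatever separator the template documentation shows between
    the language name and the id; it is deliberately not hard-coded here.
    """
    # Spaces become underscores in both page titles and anchors.
    anchor = f"{language_name}{sep}{sense_id}".replace(" ", "_")
    title = page.replace(" ", "_")
    return f"https://en.wiktionary.org/wiki/{quote(title)}#{quote(anchor, safe=':')}"


# Illustration only: "-" is a placeholder separator, not a confirmed format.
example = senseid_url("Yid", "English", "among Jews", sep="-")
```

Whatever the separator turns out to be, the sketch shows why a link ending in only `#among_Jews` would land at the top of the page: the anchor it points to does not exist without the language part.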