Wiktionary:Beer parlour/2023/October

Newest and oldest pages Default Minimized

I am against the default minimized status for the "Newest and oldest pages" display in categories. When I go to Category:en:Places in Hubei, this "Newest and oldest pages" display was originally open by default, and I would automatically see "what's new". That's valuable to me if I want to be able to see what other people are doing in a category in the recent past. Now, with default minimization, I have to click. And when I do click, everything around the box gets shifted down. I can only interpret this as stupidity, or as a concerted effort to get rid of this box by making it unpalatable to use. Once you unminimize, the whole category and its contents are shifted down and you can't see shit, unlike before. Humbug. If you ruin the functionality of something, that doesn't prove that the thing that was ruined was not worthwhile before it was ruined. --Geographyinitiative (talk) 00:14, 1 October 2023 (UTC)[reply]

FWIW I would also prefer it be expanded by default. - -sche (discuss) 05:51, 1 October 2023 (UTC)[reply]

Same. Andrew Sheedy (talk) 06:08, 1 October 2023 (UTC)[reply]

Same. P U C – 12:46, 1 October 2023 (UTC)[reply]

Me too. I only rarely glance at it (usually at Category:English lemmas) but I see no reason to hide it on a desktop screen. Maybe on a tiny phone only. Equinox ◑ 13:15, 1 October 2023 (UTC)[reply]

It's been minimized? What skin of Wikt are you using? (Also, there should be a way to customize how many of the oldest and newest entries in a category you see. I don't think that non-maintenance categories need the oldest pages, and I would appreciate having a way to remove them.) CitationsFreak (talk) 15:11, 1 October 2023 (UTC)[reply]

After doing some testing, it affects all but MinervaNeue. This bug now affects me. CitationsFreak (talk) 19:46, 1 October 2023 (UTC)[reply]

See Wiktionary:Grease_pit#Category_columns_broken. Maybe we can get a setting? -- Sokkjō 17:03, 1 October 2023 (UTC)[reply]

Maybe just bypass {{autocat}}. Such templates are impossible for non-technomavens to customize in finite time. DCDuring (talk) 17:05, 1 October 2023 (UTC)[reply]

@Benwing2, This, that and the other, Theknightwho (Not sure they all watch this page.) I'm not sure the problem is solvable by changing things on this wiki. Isn't it part of how category subcategories and items are presented, i.e., following all top matter? DCDuring (talk) 17:39, 1 October 2023 (UTC)[reply]

@DCDuring @-sche @Geographyinitiative @Andrew Sheedy @PUC This change was made by User:This, that and the other in this diff on Sep 26: [1] The diff made various changes, and the change message links to the Grease Pit discussion mentioned by User:Sokkjo. The default-collapsible setting can easily be removed. It might well be possible to change things so the presence of the oldest entries can be removed by CSS as requested by User:CitationsFreak; I'm not familiar enough with CSS to know how to do that. Changing the number of entries displayed might not be possible; I'm not sure how the newest and oldest pages functionality is implemented but it might be part of MediaWiki itself. Benwing2 (talk) 19:35, 1 October 2023 (UTC)[reply]

It is possible to insert the oldest and/or newest on hand-crafted category pages and set the length, but they still drive down the listing of the items in the category.

This does seem like yet another example of overstandardization and failure to use the BP for complaints and for airing changes with contributor consequences. DCDuring (talk) 19:54, 1 October 2023 (UTC)[reply]

@DCDuring OK, you can also change the code to affect the display of the newest and oldest pages. But hand-crafted category pages are in general a bad idea, and they don't let anyone customize the display at a per-user level. It is better to use consistent code and make the display customizable per-user. Benwing2 (talk) 20:01, 1 October 2023 (UTC)[reply]

Why is it so much better if we get less functional pages? DCDuring (talk) 20:24, 1 October 2023 (UTC)[reply]

FYI, You can add document.getElementById("newest-and-oldest-pages").classList.remove("mw-collapsed"); to your common.js file to have it expand automatically. -- Sokkjō 20:36, 1 October 2023 (UTC)[reply]
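A slightly more defensive variant of that one-liner, as a minimal sketch: document.getElementById returns null on pages that have no such box, and the bare one-liner would then throw a TypeError. The helper name expandNewestOldestBox is hypothetical and not part of MediaWiki; the element id and class name are the ones quoted above. Timing relative to the script that does the collapsing is not addressed here.

```javascript
// Sketch for a user's common.js. `expandNewestOldestBox` is a hypothetical
// helper name; only the id and class name come from the comment above.
function expandNewestOldestBox(doc) {
  var box = doc.getElementById("newest-and-oldest-pages");
  if (box === null) {
    // No such box on this page (e.g. not a category page): do nothing.
    return false;
  }
  // Removing the class is what the one-liner above does to expand the box.
  box.classList.remove("mw-collapsed");
  return true;
}

// In a browser, run it against the real page.
if (typeof document !== "undefined") {
  expandNewestOldestBox(document);
}
```

Swapping classList.remove for classList.add gives the opposite behaviour (collapsed by default).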

I honestly have never used this info myself, so I'm surprised to find such strong support for its visibility! The reason I made it collapsed by default was due to an issue on certain category pages such as WT:RTINE, reported by Sokkjo and -sche, where the box was hanging down below the "Subcategories" heading and creating a large amount of white space down the right side of the entire category listing below the box itself. This was especially problematic in the "Vector 2022" skin.

I have tried to make it so that the collapsibility only affects Vector 2022 skin users, and the old functionality is restored for everyone else. This, that and the other (talk) 23:52, 1 October 2023 (UTC)[reply]

@Benwing2, This, that and the other, Theknightwho, Sokkjo Is there any skin for which the expanded newest/oldest pages do not push down the display of the subcategories and other category members? DCDuring (talk) 00:03, 2 October 2023 (UTC)[reply]

@DCDuring On category pages with short introductory sections, the expanded newest/oldest pages box must, out of necessity, either (a) push the category listing aside, creating blank space below the box down the right side of the category listing, or (b) push the category listing down, creating blank space to the left of the box below the introductory section. The behaviour (a) was what was in place before my change, and it is the behaviour I have now restored for all skins, except for Vector 2022 which retains behaviour (b). This, that and the other (talk) 00:12, 2 October 2023 (UTC)[reply]

@This, that and the other: One thing to note is that I recently changed the handling of float-right items so they don't push the left-aligned text down, since it was making the display of various categories look bad with the breadcrumbs and such below all the float-right boxes. A side effect of this is that the subcategories start above the bottom of the float-right boxes and don't go all the way to the right. Is there a way of causing the subcategory text to flow around the float-right boxes? Benwing2 (talk) 00:17, 2 October 2023 (UTC)[reply]

I strongly doubt that would be possible. This, that and the other (talk) 00:33, 2 October 2023 (UTC)[reply]

@This, that and the other Hmm, do you know why? It seems like normally with 'float: right;', the text should wrap around the box. Is there something about the newest/oldest box that is preventing this (like it is extremely long vertically)? Benwing2 (talk) 00:37, 2 October 2023 (UTC)[reply]

@Benwing2 It's to do with the fact that the subcategory layout is in a box with column layout applied. Floats and column layouts don't seem to get along very well. See User:This, that and the other/floating boxes and columns. This, that and the other (talk) 00:47, 2 October 2023 (UTC)[reply]

@This, that and the other Hmm. Can we then hack the subcategory layout using JS and make it a non-column layout? Benwing2 (talk) 00:48, 2 October 2023 (UTC)[reply]

You mean, make the columns ourselves using JS instead of relying on CSS? That sounds very complicated for relatively little gain... This, that and the other (talk) 00:51, 2 October 2023 (UTC)[reply]

@This, that and the other I mean, just change whatever inline CSS property of the div box is causing it to be 'column' to instead be 'non-column' (or whatever the correct value is). It should be a one-line fix. Benwing2 (talk) 00:55, 2 October 2023 (UTC)[reply]

@Benwing2 That would be a trivial change indeed, but it seems like a very odd ask. Do you really think a page like WT:RTINE would look better with the category contents listed all in one long column? This, that and the other (talk) 03:26, 2 October 2023 (UTC)[reply]

@This, that and the other I'm very confused. I thought turning it into non-column mode would make it wrap into however many columns will fit, wrapping around the image. I guess that's not how CSS columns work though? Benwing2 (talk) 03:28, 2 October 2023 (UTC)[reply]

.mw-category.mw-category-columns { -moz-column-gap: 2em; -webkit-column-gap: 2em; column-gap: 2em; column-width: 18em;} fixes it. -- Sokkjō 04:10, 2 October 2023 (UTC)[reply]

@Sokkjo You rock. Benwing2 (talk) 04:12, 2 October 2023 (UTC)[reply]

@Benwing2 I'm not entirely sure what you're talking about. Either you have columns or you don't. Sokkjo/Victar's proposed fix only makes the columns narrower (removing the upper limit of 3 columns); it doesn't make the text wrap.

Perhaps we have different understandings of what is meant by "wrapping"; I thought you meant something like what Microsoft Word does: [2]. This is something akin to our wrap verb sense 5, although the gloss misses the nuance. This, that and the other (talk) 04:56, 2 October 2023 (UTC)[reply]

@This, that and the other Yes, what your image shows is what I'm looking to do. Maybe CSS doesn't support this. Benwing2 (talk) 05:21, 2 October 2023 (UTC)[reply]

I don't think it does. Finding the right words to Google is a challenge though. This, that and the other (talk) 08:11, 2 October 2023 (UTC)[reply]

I see what you're asking now. I believe you can do that with flex, but #newest-and-oldest-pages would have to be loaded inside .mw-category.mw-category-columns, which I'm not sure can be done, right? -- Sokkjō 21:45, 2 October 2023 (UTC)[reply]

(@TTO) FWIW, I thought your initial solution [to the issue of categories showing only one column] of making {{-}} the last part of the autocat boilerplate worked well, so that the list of entries in a category displayed as two columns regardless of whether the box was collapsed or not. Admittedly, it makes for a bit of whitespace between the "boilerplate" and the entries in the category when "newest pages" is expanded, but IMO that's tolerable. Perhaps we could consider ideas to make things more compact, e.g. maybe present the "newest pages" list on one wrapping line ("item 1 · item 2 · item 3 · etc") rather than putting each page on its own line? Or maybe that's a bad idea. - -sche (discuss) 04:12, 2 October 2023 (UTC)[reply]

@-sche the {{-}} looks okay on WT:RTINE but on a page like Category:French internet slang - which is much more representative of our category system - it would make for an incredibly inefficient use of vertical space.

I'd certainly be up for making the newest/oldest pages list more compact. This, that and the other (talk) 04:58, 2 October 2023 (UTC)[reply]

@Benwing2, This, that and the other, Theknightwho, Sokkjo If the oldest/newest boxes were taken out of {{autocat}}, wouldn't it be possible for users to customize whether or not each appeared at all and whether they were expanded or not by default. Would it be possible then to customize the number of columns? Could one then also change the number of items displayed? DCDuring (talk) 14:35, 2 October 2023 (UTC)[reply]

@DCDuring: Right now, I have document.getElementById("newest-and-oldest-pages").classList.add("mw-collapsed"); in my common.js to collapse it by default. If you want to increase the number of columns, you could add .mw-category.mw-category-columns { -moz-column-count: 4; -webkit-column-count: 4; column-count: 4} to your common.css file. -- Sokkjō 17:49, 2 October 2023 (UTC)[reply]

Thanks. Are you saying one can't increase the number of items in these newest/oldest boxes (which is what I expect) or that you haven't yet discovered how? DCDuring (talk) 17:57, 2 October 2023 (UTC)[reply]

Oh that's what you meant: #recent-additions ol li:nth-of-type(1n+6), #oldest-pages ol li:nth-of-type(1n+6) { display:none;} We could also add columns to that box, but it becomes a bit unruly with reconstructions: #recent-additions ol, #oldest-pages ol {-moz-column-count: 2; -webkit-column-count: 2; column-count: 2;}. -- Sokkjō 18:21, 2 October 2023 (UTC)[reply]

You fully addressed two of the three issues I had in your previous answer. I don't understand exactly how to use the JS snippet #recent-additions ol li:nth-of-type(1n+6), #oldest-pages ol li:nth-of-type(1n+6) { display:none;} when I insert it into my custom JS. How do I specify the number of items in a one-column newest/oldest box? Also, how do I specify that I only want the newest items? DCDuring (talk) 23:43, 2 October 2023 (UTC)[reply]

The above code goes into your User:DCDuring/common.css. :nth-of-type(1n+6) limits the list to 5. :nth-of-type(1n+7) would be 6 and :nth-of-type(1n+5) would be 4. If you want to hide the oldest pages altogether, you could add #oldest-pages { display:none;}. -- Sokkjō 23:59, 2 October 2023 (UTC)[reply]
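The arithmetic behind those numbers can be checked with a small sketch. In :nth-of-type(an+b) the selector matches the positions a·n+b for n = 0, 1, 2, ..., so 1n+6 matches the 6th item and every one after it, and hiding those leaves 5 visible. The function names and the list length of 10 are assumptions made only for this illustration; this is plain JavaScript that mimics the selector, not something to paste into common.css.

```javascript
// 1-based positions matched by :nth-of-type(an+b) among `count` siblings,
// taking n = 0, 1, 2, ... Only a > 0 is handled in this sketch.
function matchedPositions(a, b, count) {
  var positions = [];
  for (var pos = 1; pos <= count; pos++) {
    var diff = pos - b;
    // pos matches when pos = a*n + b for some whole number n >= 0.
    if (diff >= 0 && diff % a === 0) {
      positions.push(pos);
    }
  }
  return positions;
}

// Items still visible when everything matched by 1n+b is hidden.
function visibleItems(b, count) {
  return count - matchedPositions(1, b, count).length;
}

// Assuming a hypothetical list of 10 entries:
console.log(visibleItems(6, 10)); // 5
console.log(visibleItems(7, 10)); // 6
console.log(visibleItems(5, 10)); // 4
```

In other words, to keep N items the selector needs 1n+(N+1).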

You can see how much help I need if I can't tell the difference between JS and CSS (and ignored 1=css in {{code}} above). Thanks for your patience. DCDuring (talk) 00:50, 3 October 2023 (UTC)[reply]

@Geographyinitiative Do any of these proposals or possibilities address your original expressed concerns? It seems to me that not just the oldest/newest box, but also the subcategories display pushes down the beginning of the display of category members. Also, is there really a need to simultaneously display the oldest/newest box and the first few category members? DCDuring (talk) 14:35, 2 October 2023 (UTC)[reply]

Out of curiosity, how many people use the "oldest pages" part of this? For my part, I only use "newest pages". If no-one uses "oldest pages", maybe we could collapse that half, if we are trying to keep things as compact as possible...? - -sche (discuss) 19:01, 2 October 2023 (UTC)[reply]

Probably the "oldest" list has very little actual usefulness because it doesn't actually list the pages that have been in the category the longest. The database doesn't track that information, but tracks "category link updates", which include page creation but also other unspecified events. New entries will certainly be on the "newest" list, but some old entries get bumped up to the "newest" list randomly and I guess the longer the page has been around, the more likely it'll get bumped up. — Eru·tuon 19:35, 2 October 2023 (UTC)[reply]

I use it for cleanup cats like "Requests for quotations" to figure out which pages have had the cleanup request the longest. CitationsFreak (talk) 21:41, 2 October 2023 (UTC)[reply]

It is easy, even for a naif like me, to insert one of these tables of any size, for any category, on any page (including user pages and subpages). See, for example, the CSS at the top of Category:Entries using missing taxonomic name (genus).

This can be useful for specific projects. One could have tables for two categories side-by-side on a user page. I'd be surprised if there weren't other useful parameter settings to add more data per item. DCDuring (talk) 00:50, 3 October 2023 (UTC)[reply]

I apologize if my initial comment was rude, but I'm kind of a "low information editor"; that is to say, I know 0 about the system, 0 about programming, 0 about coding, etc. All I know is, there's a functionality that I, as a common rube, can use. For instance, if an IP makes a low quality entry in a category I work on, I can help them fix up the entry. If Wpi31, RcAlex or LlweyanII or someone makes a new sophisticated entry, maybe I can add something to it, or a variant form or something. This way I can kind of keep in contact with what's happening in my area of interest and engage with people and try to encourage growth and cooperation. So I apologize if I offended anyone; my car was having problems. --Geographyinitiative (talk) 14:46, 3 October 2023 (UTC)[reply]

Should appendix-only words in a language be added to that language's POS categories by default?

This is related to the ongoing vote to modify POS templates so that they work correctly on Appendix pages. I noticed, for example, that Appendix:Minecraft/illager - by virtue of using {{blend|en}} and {{en-noun}} - is added to Category:English nouns, etc. Do we want that? Or do we want e.g. category assignments to be suppressed by default on non-mainspace pages?

Thanks,

Chernorizets (talk) 20:51, 1 October 2023 (UTC)[reply]

I would argue that these words should be put in a new subcategory of Cat:English nouns called Cat:English appendix-only nouns, or perhaps a less Wiktionary-jargon-based name like Cat:English nouns in the Appendix, or Cat:English nouns belonging to fictional universes, since that seems to describe most of the appendix-only terms in practice. This, that and the other (talk) 00:08, 2 October 2023 (UTC)[reply]

I feel like they should be suppressed on the non-snowclone appendix pages. If they were real lemmas, then why aren't they in the main space in the first place? CitationsFreak (talk) 20:14, 3 October 2023 (UTC)[reply]

@CitationsFreak that's sort of my feeling too. How do you propose we proceed from here? Chernorizets (talk) 20:35, 3 October 2023 (UTC)[reply]

I also agree; if these would fail RFV (ATTEST, FICTION, etc) in mainspace, they shouldn't be put into the categories through the backdoor of appendix-space. One solution would be to not have these "like mainspace but not" separate pages for entries that don't meet CFI; if someone wants to make a single appendix page formatted like a glossary that's one thing, but pseudo-entries with POS templates (and hence this bad categorization) are another. - -sche (discuss) 02:48, 4 October 2023 (UTC)[reply]

Proto-Sino-Tibetan and STEDT

Previous discussion: Wiktionary talk:About Proto-Sino-Tibetan#Concerns about STEDT

As mentioned in the previous discussion, our existing Proto-Sino-Tibetan coverage is based on STEDT, which has a number of problems with its reconstructions and methodology. Since there aren't really any other decent PST reconstructions (other than STEDT's, or others that do not take newer/broader evidence into account), I think the best approach would be what @Vampyricon proposed: removing all PST entries and only allowing Proto-Tibeto-Burman reconstructions, with Sinitic ones listed as cognates. How should we carry out this change? @Justinrleung, RcAlex36 I'm not sure if there are other editors I should ping – wpi (talk) 08:48, 2 October 2023 (UTC)[reply]

Support RcAlex36 (talk) 12:18, 2 October 2023 (UTC)[reply]

@wpi: How certain is Tibeto-Burman as a valid branch? And what would be the primary sources for its reconstruction? I seem to remember that the internal classification of Sino-Tibetan was a topic of debate in recent literature, but that might be dated info now. Thadh (talk) 12:26, 2 October 2023 (UTC)[reply]

@Thadh @Wpi Yes, I have heard that the only evidence for having Proto-Tibeto-Burman as a valid branch is a 6th vowel in Sinitic that supposedly merged in PTB, but that more recent investigations have shown that Old Tibetan and Old Burmese both preserve evidence of this 6th vowel still being separate in PTB. This came up some months ago in WT:RFM#Renaming Proto-Mon-Khmer to Proto-Austroasiatic. Benwing2 (talk) 19:09, 2 October 2023 (UTC)[reply]

@Thadh, Benwing2: I'm aware of the fact that PTB does not merge the 6th vowel, which makes PTB less valid. There are two cases, depending on whether Sinitic is considered to be a first-level branch of ST or not: a) the more traditional view that PST splits into PTB and PSinitic, under which it is straightforward to see that the proposed change is valid, since the PTB reconstructions we have are based on this view; b) the view that Sinitic is not a first-level branch within ST (whose proponents often prefer the name Trans-Himalayan), under which the PTB reconstructions would just be PST without considering Sinitic evidence. Either way, PTB does not take Sinitic into account, so Sinitic forms can't be listed in the descendants section; it's only a matter of terminology rather than classification. I'm not necessarily asking for a name change here, just for removing the bogus ones (which happen to be called PST) taken from STEDT, while STEDT's other ones (which are based on 20th century sources, so they are called PTB) are mostly fine, though to reflect such a change I would rather have the reconstructions presented as PTB and not PST. – wpi (talk) 01:02, 3 October 2023 (UTC)[reply]

Support - As mentioned, there are two main theories about the top-level structure of PST (or perhaps more accurately, one theory and its detractors), which I think correspond to the two ways to go about this:

1. If PST does bifurcate, then moving all STEDT entries to Reconstruction:Proto-Tibeto-Burman/item would be sufficient.

2. If there is not enough evidence to show that PST does bifurcate, then I don't know if providing STEDT's reconstruction would even be a good idea.

It seems obvious to me that scenario (2) requires more work: One should go into every entry that mentions a "PST reconstruction" and write a whole paragraph about its etymology. "Compare Sinitic X, Tibetic Y, Burmese Z, Tangut, etc. etc." Because of this, I think going about it as if (1) were true probably gets us the most with the least amount of effort, even though I think (2) is true. On a related note, I believe some "PST" entries actually have a PST reconstruction by Coblin, which is only Proto-Sino-Tibetan in the strictest sense: It accounts for only Sinitic and Tibetic forms. I don't know if we should do something like Old Chinese reconstructions and provide both on equal footing, or if we shouldn't provide Coblin's at all. Vampyricon (talk) 02:32, 12 October 2023 (UTC)[reply]

Persian nested translations - split or labelled?

(Notifying Ariamihr, Benwing2, Dijan, Mazsch, Qehath, Rodrigo5260, ZxxZxxZ, Sameerhameedy, Saranamd): Hi.

Persian pronunciations, vocalisation and transliterations are increasingly split but it's not consistent or even agreed on.

Thanks to @Sameerhameedy's work, the modules can now produce classical, Dari and (modern) Iranian pronunciations, vocalisation and transliterations, which are similar but there are substantial differences as well.

Should Persian be split into fa-cls, prs and (the current default) fa?

An example of would-be nested Persian translations at horse#Translations, already used by Sameerhameedy but with default codes:

Persian: اَسْب (fa) (asb)
Classical Persian: اَسْپ (asp)

Dari: اَسْپ (asp)

An alternative would be to just use {{qualifier}}:

Persian: اَسْب (fa) (asb), اَسْپ (asp) (Dari, classical)

Do you support the split, nesting and new language codes for Classical Persian and Dari?

Support

Support. Hi @Atitarev, I similarly proposed using the different etym codes for Classical and Dari in translations; courtesy of User:Theknightwho, we can now have different translations per etym code. This would be a convenient way of getting auto-translation that is correct for the particular lect. Benwing2 (talk) 03:50, 4 October 2023 (UTC)[reply]

Thanks, @Benwing2. The auto-transliterations for Persian are not enabled yet, even though the quality status is about the same as for Urdu; the challenge is to separate Persian by variety and apply different methods, even for simple examples like (daqiqe) (modern Persian) vs (daqīqa) (Dari or classical). Anatoli T. (обсудить/вклад) 03:58, 4 October 2023 (UTC)[reply]

@Benwing2, @Sameerhameedy: I have just corrected the labels in the vote: "Classical Persian" (fa-cls) and Dari (prs). Modern Persian (the default) stays unnested and uses "fa", if that's OK; it is to be used by default or if the variety is unknown. Anatoli T. (обсудить/вклад) 04:10, 4 October 2023 (UTC)[reply]

Support --Anatoli T. (обсудить/вклад) 03:59, 4 October 2023 (UTC)[reply]
Support --Rodrigo5260 (talk) 04:09, 4 October 2023 (UTC)[reply]
Support But I think all dialects (Classical, Dari, Iranian) should all be nested under "Persian". I think nesting Classical and Dari under Iranian Persian is weird. It gives the impression that Dari and Classical are variations of Iranian Persian. IMO it should look something like this:

Persian:
Classical Persian: خودرو (xwadraw)

Dari Persian: خودرو (xudraw, xudrō), موتر (mōtar)

Iranian Persian: خودرو (xodrow), ماشین (mâšin)

Putting all varieties under the language name is also the standard for Punjabi FYI. - سَمِیر | Sameer (مشارکت‌ها • کتی من گپ بزن) 04:35, 4 October 2023 (UTC)[reply]

Hi @Sameerhameedy:

Punjabi nesting is manual, using the same language code "pa". Since it's not automated, you will see all kinds, one-liners with qualifiers.

Punjabi:
Gurmukhi: ਪੰਜਾਬੀ (pa) f (pañjābī)

Shahmukhi: پَنْجَابِی f (panjābī)

Do you want to enable nesting for "fa"? So, if "fa" is the only one available in a translation, it will still be (automatically, by the translation adder tool)

Persian:
Iranian Persian: خودْرُو (fa) (xodrow), ماشین (fa) (mâšin)

Anatoli T. (обсудить/вклад) 04:48, 4 October 2023 (UTC)[reply]

@Atitarev Why is "fa-ira" not available, so it can link to Persian Wiktionary? Is it not possible to have "prs" "fa-ira" and "fa-cls" all link to Persian Wiktionary? It's not like Dari and Classical Persian would ever get separate wiktionaries... If the problem is conversion, almost everything is Dari or Iranian Persian; and every Dari translation has the qualifier "Dari". We can safely convert everything without the qualifier. سَمِیر | Sameer (مشارکت‌ها • کتی من گپ بزن) 05:03, 4 October 2023 (UTC)[reply]

@Sameerhameedy: We can probably make the code "fa-ira" work but link it to the "fa" Wiktionary. Same with other varieties, since there won't be any Dari or Classical Persian Wiktionaries. As for nesting, if you insist on "Persian/Iranian Persian", this can be specific to "fa-ira" or to "fa".

"fa-ira", "fa-cls" or "prs" are not yet available for translations.

@Benwing2: Do you think it's a good idea to nest Iranian Persian per Sameer? Anatoli T. (обсудить/вклад) 05:08, 4 October 2023 (UTC)[reply]

By the way, the Mongolian and Serbo-Croatian nestings are also manual, using the same "mn" and "sh" codes. In both cases, the scripts are completely different. Malay nestings are occasional too. If we formalise the nesting, then whoever accesses MediaWiki:Gadget-TranslationAdder.js with "fa" will receive the nesting but it's hard to achieve as it is, using the same language code. Anatoli T. (обсудить/вклад) 05:03, 4 October 2023 (UTC)[reply]

Support But mainly @Sameerhameedy's ideas. I'd also like to thank him for the immense work that has gone into the Persian IPA module.--Saranamd (talk) 17:08, 4 October 2023 (UTC)[reply]
@Saranamd: I second that. Great work! @Sameerhameedy, also on the Urdu modules, among other things. Anatoli T. (обсудить/вклад) 23:56, 4 October 2023 (UTC)[reply]

Oppose

Abstain

Anatoli T. (обсудить/вклад) 03:46, 4 October 2023 (UTC)[reply]

[Note: This vote is not about splitting Persian into multiple L2 headers, we still use ==Persian==, it's about using additional codes for transliterations, translations, etymologies and vocalisation, which differ.]

@Benwing2: It seems that at least the two active Persian editors favour nesting the Persian translations under Persian/Iranian Persian. Is it feasible and practical to use the current "fa" code for that, or do we need the new code "fa-ira"? If we leave things the way they are, then translation nesting will need to be done manually, which is inefficient, inconsistent and error-prone.

I have tried before requesting new codes for Mongolian - differing by scripts, so that we have automatic nesting for Mongolian/Cyrillic and Mongolian/Mongolian and it was rejected. Maybe we need to revisit that discussion to automate nesting for Mongolian, Serbo-Croatian, Malay, Punjabi, etc. since the nesting is already working for Chinese and Norwegian but these use different language codes.

@-sche might remember the discussion. FYI, @Sameerhameedy, @Saranamd. Anatoli T. (обсудить/вклад) 00:41, 5 October 2023 (UTC)[reply]

Classification of anime-adjacent terms

Chinese has {{lb|zh|ACG}} (cf. ACG) which categorizes into Category:Chinese fandom slang.

But this sort of anime-inspired culture isn't limited to Greater China but is really found all over the world. cf. English -san, which currently doesn't really explain the cultural context of this and similar Japanese borrowings, or Korean 순애 (sunae) which is not exactly "Internet slang" either.

The closest thing we have is Category:Japanese fiction, but the label is inaccurate because even if the subculture was inspired by certain forms of Japanese media, it has now taken on a life of its own in certain countries.

Thoughts? I feel there should be a specific label that categorizes either into fandom slang, or into a new category of its own. "ACG" is a good term but it's too China-specific imho.--Saranamd (talk) 17:14, 4 October 2023 (UTC)[reply]

Category:English fandom slang ? -kun is glossed as "anime and manga fandom" but that isn't a recognised category so there is no generated page for it. Equinox ◑ 17:21, 4 October 2023 (UTC)[reply]

Multiword terms (namely "acabar de") in Iberian Romance

I was in the middle of adding conjugation tables to some Galician verbs and ended up coming across this interesting thing in acabar. One of its senses says "(acabar de facer) to have just done (something)". I don't know a lot about Asturian, but it does look awfully similar to the "acabar de" in Catalan and Spanish, which do have a page just for them. Shouldn't the Asturian version be in there too? Could it be there are other cases like this around our dictionary? Am I missing some Wiktionary policy concerning Asturian? MedK1 (talk) 10:41, 5 October 2023 (UTC)

On closer inspection, it seems that this meaning is included for Portuguese and Galician as well, but in the original page. Perhaps the question shouldn't be "why isn't Asturian in a separate page" but rather, "why are Catalan and Spanish separate". MedK1 (talk) 11:31, 5 October 2023 (UTC)[reply]

Ordering of descendants within a "Descendants" section in an entry

As far as I can tell, it's an unwritten but largely followed convention to order descendants within a "Descendants" section alphabetically by language name. What I'm curious about is whether that still applies when there's a mixture of direct descendants, languages that borrowed the term, and/or languages that derived from the term.

What I've seen in more than one entry is that direct descendants come first (and are alphabetized by language), then borrowings & derivations (also alphabetized by language). Is that the intended order, or should we simply alphabetize by language regardless of direct vs. indirect descent?

Thanks,

Chernorizets (talk) 10:47, 6 October 2023 (UTC)[reply]

I consistently order them as inheritances first, and borrowings after (both in alphabetical orders), with derivations being placed among them. I think that this is a pretty widespread convention. Thadh (talk) 10:54, 6 October 2023 (UTC)[reply]

Checking WT:Descendants, there is no written rule about exact ordering - I was always under the impression it should be alphabetical, sometimes split into type if the list is particularly long. Perhaps having an unwritten rule is better here to give more freedom, unless people feel there should at least be a guideline (as opposed to a hard set rule). Vininn126 (talk) 10:57, 6 October 2023 (UTC)[reply]

Considering the unwritten rule is to do inherited first then borrowed/calqued et cetera, I propose we add a line to WT:Descendants saying "In general direct descendants should be listed alphabetically first followed by borrowed and other descendants alphabetically." We might should also add a note on organizing inherited terms by sub-family as well. Vininn126 (talk) 13:03, 6 October 2023 (UTC)[reply]

Agreed. Ultimateria (talk) 16:30, 6 October 2023 (UTC)[reply]

Ditto. Does this require a vote, or just an admin doing the edit? Chernorizets (talk) 02:22, 8 October 2023 (UTC)[reply]

If it’s a change to “Wiktionary:Entry layout” it requires a formal vote. — Sgconlaw (talk) 05:31, 8 October 2023 (UTC)[reply]

@Chernorizets I always list direct descendants first (in alphabetical order), and then borrowings (also in alphabetical order). I never mix them up. ZomBear (talk) 13:01, 6 October 2023 (UTC)[reply]

I'm aware of two unwritten rules and I hope this can be codified and documented (and existing entries sorted by a bot according to the agreed-upon rules):

1. Direct descendants before borrowings (as others have pointed out)
2. Sort by "main component" of language name (not sure how to describe this but to give an example: Ottoman Turkish is sorted under "T" and not "O"). (Also, it's implicit but we should make it explicit that sorting is by language name and not language code.)

The first one is followed widely, the second one not so much. I think this should be voted upon and codified. tbm (talk) 07:41, 18 October 2023 (UTC)[reply]
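
The two rules listed above could be expressed as a single sort key. The following is only a minimal sketch in Python, not an existing bot or module: the `Descendant` record, the `QUALIFIERS` set, and the `main_component` heuristic are all hypothetical names invented for illustration, and how multi-word language names should really be handled is exactly what still has to be agreed upon.

```python
from dataclasses import dataclass

# Hypothetical list of leading qualifiers to skip when looking for the
# "main component" of a language name; the real list would need agreement.
QUALIFIERS = {"Old", "Middle", "Ottoman"}


@dataclass
class Descendant:
    language: str    # language name, not language code
    inherited: bool  # True for direct descendants, False for borrowings etc.


def main_component(language: str) -> str:
    """Strip leading qualifiers, but always keep at least the last word."""
    words = language.split()
    while len(words) > 1 and words[0] in QUALIFIERS:
        words = words[1:]
    return " ".join(words)


def sort_key(d: Descendant):
    # False sorts before True, so inherited terms come first.
    return (
        not d.inherited,
        main_component(d.language).casefold(),
        d.language.casefold(),
    )


def order_descendants(items):
    """Return the descendants sorted by the two rules."""
    return sorted(items, key=sort_key)
```

With this key, an inherited Ottoman Turkish term would be placed under "T" among the direct descendants, and every borrowing would follow the whole inherited block. Sorting by the full name instead would only require dropping the `main_component` call.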

If anyone has the motivation to bring this to vote, that'd be fine by me. Vininn126 (talk) 08:19, 18 October 2023 (UTC)[reply]

@Thadh @Vininn126 @Tbm @Ultimateria: I'm planning to bring this to a vote in January, since I'm already in the middle of another vote, and then it's holiday season. If any of you would like the vote to occur sooner, feel free to take it off my hands. Chernorizets (talk) 08:28, 21 November 2023 (UTC)[reply]

I don't have time to write something at the moment but I'm happy to read a draft. Thanks for working on this! tbm (talk) 11:01, 21 November 2023 (UTC)[reply]

@Thadh @Vininn126 @Tbm @Ultimateria - I've gone ahead and created the vote at Wiktionary:Votes/2023-11/Ordering_of_descendants_in_mainspace_entries. It starts in a week (unless we want to start it sooner). Feedback on the wording is, as always, welcome.

My proposal also indicates what I believe should be the sort order for languages with multi-word names like Old English or Ottoman Turkish, based on 1) what I've seen more often than not in practice, and 2) trying to keep things simple considering we have 4300+ languages. That said, I'm less invested in that portion of the proposal than the idea of listing direct descendants first, so I'm open to changing it if you have better ideas. Chernorizets (talk) 02:53, 28 November 2023 (UTC)[reply]

Blocked User:Dan Polansky requesting talk page access

Passing on a message received from DP in my e-mail. I don't really want to do a "unilateral" unblocking so will leave others to decide. (The phrasing "your user account Benwing" must mean Wiktionary's user account, because obviously it isn't mine.)

can you please unblock my user talk page in the English Wiktionary?

1. I would like to talk to editors about the possibility of me editing again.

2. I would like to make a proposal of possible constraints on my editing so that concerns others have can effectively be addressed. The proposal I have in mind radically constrains my editing ability, while allowing some of it.

3. My talk page shall only be used for the purpose of unblock discussion and no other purpose (especially no comments on Wiktionary events, discussions and users allowed), until and unless the decision to unblock me is made.

4. I recognize the statements I made that I was blocked for by your user account Benwing as problematic and unacceptable in the English Wiktionary. I plan to make a more detailed statement concerning this on my user talk page.

Thank you,

Equinox ◑ 09:07, 8 October 2023 (UTC)[reply]

@Dan Polansky I Support unblock on the talk page to discuss the possibility of unblock. I would just say that you ought to try to be polite. --Geographyinitiative (talk) 09:24, 8 October 2023 (UTC)[reply]

Sounds reasonable, I Support removing the block from the user talk page. Thadh (talk) 11:40, 8 October 2023 (UTC)[reply]

Oppose on the basis of Dan’s consistent abuse of procedure and the extensive personal attacks he has previously written on his talk page. We shouldn’t be entertaining this at all. Theknightwho (talk) 13:10, 8 October 2023 (UTC)[reply]

@Theknightwho: We can always block them again if it turns out that they abuse this opportunity. Frankly, if Dan is restricted to editing mainspace/reconstruction only and stops writing these repulsive messages on his talkpages, I don't really see what harm he could do. Thadh (talk) 13:41, 8 October 2023 (UTC)[reply]

@Thadh We gave him chance after chance after chance - he's the embodiment of give them an inch and they'll take a mile. Theknightwho (talk) 14:05, 8 October 2023 (UTC)[reply]

Oh hell no.

Oppose per the extremely-convincing arguments laid out by TKW, Sameer, and AG202 (among others!). Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 13:51, 28 October 2023 (UTC)[reply]

Oppose. Giving Dan Polansky talk page access is like giving an alcoholic a drink. He may be sincere for the moment in trying to limit himself, but he's too invested in trying to talk his way into and out of things to hold it for long. The fact that he accidentally sent Equinox an email intended for @Benwing2 doesn't inspire confidence, either. Chuck Entz (talk) 15:23, 8 October 2023 (UTC)[reply]

(DP has clarified that he originally sent the mail to Benwing, got no reply, and then tried me, forgetting to edit that line.) Equinox ◑ 14:46, 10 October 2023 (UTC)[reply]

Oppose, I agree with Chuck. Also, I don't think someone can be racist on accident. And seeing the thread of him calling Metaknowledge a "threat to anglo-saxon culture" kinda confirms that for me; I mean it wasn't even like a snarky heat of the moment comment either, he wrote a whole essay about them... There is no possible way in my mind he wrote that accidentally. And that's not even mentioning that he had been writing essays attacking users for at least a few months by that point. He kept going even after numerous prior blocks! He's never changed his behavior after being blocked before, not sure I believe he'd do so now. - سَمِیر | Sameer (^{مشارکت‌ها} • ^{کتی من گپ بزن}) 19:59, 8 October 2023 (UTC)[reply]

Support, but just the request for unblocking. If it's clear he's a changed man, then unblock him. If we feel that he hasn't changed, keep him blocked. CitationsFreak (talk) 22:10, 8 October 2023 (UTC)[reply]

Oppose. It is clear he can’t have changed in 8 months. 8 years is possible, and even that is not likely, given how people usually keep their identity rather than reject it to develop themselves. Fay Freak (talk) 03:10, 9 October 2023 (UTC)[reply]

Oppose. — Fenakhay ^{(حيطي · مساهماتي)} 22:17, 8 October 2023 (UTC)[reply]

@Chuck Entz Just FYI he wrote to me a month or so ago with the same message, which I didn't respond to. I think this message was intended for Equinox, but he copy-pasted the message he sent to me and forgot to edit it. Benwing2 (talk) 23:02, 8 October 2023 (UTC)[reply]

Support कालमैत्री (talk) 03:12, 9 October 2023 (UTC)[reply]

Support. I'm a big fan of second chances and he can always be blocked again. At least let him make his case. This isn't even about unblocking him completely, it's just about letting him argue why he should be given greater editing freedom again. If we don't find that he makes a convincing case, we can simply end it there. Andrew Sheedy (talk) 05:02, 9 October 2023 (UTC)[reply]

~~Weak support per this. I've been reading up on the guy. Afaik, from what I read, he makes plenty of useful contributions to Thesaurus entries when unblocked. It'd be a shame to lose those. Though, I feel like it'd be naïve (of me!) to just supp outright considering his past. MedK1 (talk) 14:41, 9 October 2023 (UTC)~~[reply]

A lot of his thesaurus work is actually very sketchy and has to be completely redone or cleaned up. So, no, not that helpful. Vininn126 (talk) 14:45, 9 October 2023 (UTC)[reply]

Agreed, and trying to get him to change the way he does things is a Sisyphean task. Theknightwho (talk) 15:34, 10 October 2023 (UTC)[reply]

Oppose, based on personal preference, not any particular principle. DCDuring (talk) 14:33, 9 October 2023 (UTC)[reply]

Oppose. User has had too many chances. fool me once... Vininn126 (talk) 14:45, 9 October 2023 (UTC)[reply]

Strong oppose. Are we forgetting that he's been openly racist? Look at this extremely long block list! How many "second chances" does a person need? Seriously. Even when we have open racists, we still want to give them another shot. I'm disappointed but not surprised at all. I also don't like the wording, "as problematic and unacceptable in the English Wiktionary", as if it's only a problem here. He only seeks attention and chaos, and I would like us to stop giving it to him. (This also goes into our leniency issue and having to clean up messes afterwards) CC: @MedK1 since you said you've been reading up into it. AG202 (talk) 15:20, 9 October 2023 (UTC)[reply]

Oppose He's been getting banned over and over since 2011?? I had no idea, damn. Yeah, that's got me convinced for sure. Somebody else said the thesaurus work was actually not high-quality at all, so it definitely dismantles any reasoning I had to support the unban... MedK1 (talk) 16:35, 9 October 2023 (UTC)[reply]

To be fair, he was the one who set up the thesaurus in the first place and deserves credit for the early work populating it to where people could see what it was good for. The problem is that it's grown beyond his original vision, and he hasn't. He still thinks of it as something new that he can keep reinventing, and has tried expanding it in ways that aren't compatible with the current setup. Chuck Entz (talk) 18:04, 9 October 2023 (UTC)[reply]

@AG202 @MedK1 Being racist is of no consequence to me, unless it creates disruption in discussions and leads to poor entries. He is only asking about the possibility of editing again and seems to show reform over his offensive remarks, and that's not harmful to anyone.

Also, for information, Wonderfool had been blocked since 2004. कालमैत्री (talk) 16:52, 9 October 2023 (UTC)[reply]

@कालमैत्री Dan is extremely disruptive in discussions he participates in, and he has continued in that behaviour on other wikis since his block here. He will try to turn any kind of discussion on his talkpage into a circus. Theknightwho (talk) 15:34, 10 October 2023 (UTC)[reply]

I don't think it's fair to compare Wonderfool to Dan, considering WF's blocks were mainly for vandalism and block evasion. Most of Dan's blocks were for obstructionism, intimidation/bullying & harassment, and messing with votes in discussions (deleting votes he didn't agree with for example).

Also, two users here have attested that his entries regularly need to be fixed, so if entry quality is really all you care about then there's not much there. In fact (using Google Translate) it appears that just two days ago he got into a dispute on his talk page on Czech Wiktionary for using his own entry layout instead of the standard one. Though Google Translate may be wrong so take that with a grain of salt. سَمِیر | Sameer (^{مشارکت‌ها} • ^{کتی من گپ بزن}) 21:07, 10 October 2023 (UTC)[reply]

Oppose--Saranamd (talk) 16:01, 9 October 2023 (UTC)[reply]

Support. Until Polansky is free, nobody is free. Allahverdi Verdizade (talk) 14:43, 10 October 2023 (UTC)[reply]

Oppose Although I'm a big believer in second chances, from my experience people's fundamental nature doesn't change quickly, and Dan has been given a ton of second and third chances already. Maybe after several years but not now. Benwing2 (talk) 07:04, 11 October 2023 (UTC)[reply]

Alright. I just got an email reply back from Dan relating to this subject, to see if he had changed for the better. I emailed him: "You see, people oppose the unblocking due to you making personal attacks, trying to have it your way, and your weird rants on Asians and trans people. If you were unbanned, would you do those things? (Also, do you still hold negative views on Asians and trans people?)". This is his response:

"My request was about having the option of asking for unblock on my talk page, not about being immediately unblocked in general. That request would then possibly be rejected based on whether my proposed conditions would be found helpful. This two-step process would be so that I am allowed to respond to further queries and counter-proposals. It would work like on other wikis, where even the blocked person usually has access to user talk page and can request unblock, publicly recognize mistakes as part of that, describe plans for improvement or problem avoidance, etc. I saw it work elsewhere, and I thought it could or should work in the English Wiktionary as well.

My plan was to propose, once I would be unblocked, a one year block (or longer as desired) from Wiktionary namespace and further fairly stringent restrictions, making me pretty much a discussion and debate non-participant and editor only, so that whatever bad tendencies and bad communication ideas I had would be effectively blocked. If I am not even allowed to use my user talk page to make a request for unblock--as if that alone was dangerous and to be prevented at any and all costs--I do not know what else to do at this point. I expressly specified that the user talk page can be used for the purpose of the request only. I was trying to figure out a way how to provide gain for me as a content editor (no longer a policy making participant and process watchdog) as well as for the project and its users, especially users of Czech content to be expanded. According to established negotiation theory, solutions can often be found if parties strive to find practical solutions for actual problems using their imagination and are willing to make compromise if required." CitationsFreak (talk) 23:15, 11 October 2023 (UTC)[reply]

This honestly changes nothing for me, if not making it worse. AG202 (talk) 23:28, 11 October 2023 (UTC)[reply]

Yeah, it just proves the point that Dan hasn't changed at all. Theknightwho (talk) 01:59, 12 October 2023 (UTC)[reply]

My question (not even an issue) around this, is that people can pop out of the blue and decide to impose a bunch of rules (ostensibly on themselves?). I mean, if the rule is only on yourself (like if I, Equinox, said "I won't edit Wiktionary while drunk") then that's your own issue, and doesn't need to be mentioned to others; if it is a rule that imposes something on others (like, say, "whenever I, Equinox, post on RFV, somebody must revert it and spank me on my talk page") then I am causing trouble for other people, who otherwise wouldn't have the trouble. I've seen this quite often: we had that kid recently who devised for himself a whole schedule of punishments and durations... You just have to say, well, why, on what authority? The murderer doesn't hang himself. Equinox ◑ 05:42, 12 October 2023 (UTC)[reply]

@Equinox: I think in this case Dan is asking for a partial block from these namespaces - which is a one-time event. Thadh (talk) 07:48, 12 October 2023 (UTC)[reply]

How much energy do we want to devote to imposing, administering, and litigating this custom treatment? As long as there are available channels of communication, like unblocked e-mail, we can expect DP to use them in exercise of his considerable talents at litigating/sophistry. If the main standard for behavior is "Does it help build Wiktionary?", whatever positive contributions DP might make must be weighed against the risk of high cost for the custom treatment he asks for. DCDuring (talk) 12:54, 12 October 2023 (UTC)[reply]

Support. Mostly because I am happy to be allowed to vote again. P. Sovjunk (talk) 15:30, 12 October 2023 (UTC)[reply]

Weak support. DP wrote: "It would work like on other wikis, where even the blocked person usually has access to user talk page and can request unblock". I can confirm that on Wikipedia blocks usually (by default) allow blocked users to still post to their own Talk page. Is that not the same on Wiktionary? If not, rather than seeing this as a DP-specific issue, perhaps it can be seen more broadly as an option to improve the way blocking is implemented on Wiktionary.
I am uneasy about the assessments of how useful DP's contributions are (or have been): AFAIK, blocking is not to be implemented because of (say) mediocre contributions. Blocking based on contributions could only be where the contributions, on average, ultimately lead to worse articles. In any case, if the current block was not instituted based on contribution quality, then contribution quality should not be used to perpetuate the block.
DP foreshadowed that "I would like to make a proposal of possible constraints [...]. The proposal I have in mind radically constrains my editing ability [...]." We are not at the stage of needing to accept or decline a finalised proposal. The proposal could be put forward, critiqued, amended, and eventually accepted or rejected — or even just ignored.
DP also wrote, "My talk page shall only be used for the purpose of unblock discussion and no other purpose". If we don't want to police this, we don't have to police this. DP is inviting us to enforce this standard, but we are not obliged to do so. (It is certainly not a requirement on Wikipedia — only formal unblock requests lodged on a user's Talk page are supposed to stay on-topic, but AFAIK Wikipedia does not impose any specific prohibition on the blocked user's Talk page being used for something else.)
—DIV (1.129.104.79 08:06, 27 October 2023 (UTC))[reply]

The reason that DP has no talk page access is that he previously used it to insult (and also praise) certain Wiktionary editors on it. He also wrote essays on various things on Wiktionary that could be improved.

Also, we did not block him for making bad edits, it's just that if he's been nothing but bad/mediocre on Wiktionary, why unblock him? CitationsFreak (talk) 18:31, 28 October 2023 (UTC)[reply]

I'm aware DP wasn't blocked for bad edits. Actually, to be very clear, I was not alleging that DP made bad edits, or good edits, or mediocre edits. The point I was making there is that I feel that arguments about lifting a block should focus on issues related to the block being initiated in the first place.
Let's suppose that (hypothetically!!!) DP makes a lot of typographical errors: but if it wasn't a factor in deciding to implement the current block, then it shouldn't be a topic of discussion now. The reason I adopt this stance is that otherwise it creates an injustice where — to continue the hypothetical scenario — one editor (DP) is singled out to be blocked effectively for having poor typing skills, whilst thousands of other editors with poor typing skills remain unblocked. (Effectively in the situation where the perpetuation of the block were based on poor typing skills, rather than being justified purely on the basis of the original issues from a few months/years(?) ago.)

I am ~~vaguely~~ aware of reports of DP making insulting or abusive remarks on Wiktionary, and perhaps that was a good reason to block access to his own Talk page. Nevertheless, I'm still (weakly) in favour of now lifting the block on DP accessing his own Talk page, and see what he writes there. Maybe it'll be a solution proposal that almost everyone can accept. Maybe it'll be a solution proposal that almost nobody can accept. Maybe it'll be something objectionable, or maybe we'll all want to give him a hug. It's not issuing a full pardon — it's a rather minimal step that seems proportionate to the circumstances.

—DIV

P.S. In the meanwhile, today I coincidentally stumbled on a nugget from DP that — IMHO — is actually quite a useful little contribution. I don't want to get into a debate about that; it's just a striking coincidence to encounter that by chance.

(1.145.44.122)

Weak abstain. P U C – 19:07, 28 October 2023 (UTC)[reply]

Rather late here but Oppose per Chuck and AG202, for the record. —Al-Muqanna المقنع (talk) 11:22, 21 November 2023 (UTC)[reply]

Inline Wikipedia template

I've occasionally seen Wikipedia links manually inserted at the end of definition lines to lead readers to WP articles that are relevant to a specific sense (see e.g. here). I would like to suggest creating an inline Wikipedia template to standardize these links. This would be particularly useful in proper name entries with many definition lines (such as Hudson). Here are a few possible versions that I could think of:

example

Lorem ipsum dolor sit amet.^{(Wikipedia article)}
Lorem ipsum dolor sit amet.^{(See also example on Wikipedia.)}
Lorem ipsum dolor sit amet.^WP

Another (perhaps more noticeable) solution could be placing the link in a new line similarly to the semantic relation templates, although this takes up a bit more space:

example

Lorem ipsum dolor sit amet.
Wikipedia article: example

(Note that {{pedia}} and {{R:wp}} already exist, but their formatting is not appropriate for definition lines.)

Thoughts? Einstein2 (talk) 14:46, 8 October 2023 (UTC)[reply]

I am not in love with any of the three examples. Were we to do this kind of thing, I think there are a few design considerations:

1. Don't take up more space than absolutely necessary, either
   1.1. horizontally or
   1.2. vertically;
2. Make it clear to even new users that the link is to Wikipedia;
3. Don't give commands to users.

IMHO, design 1 fails consideration 2; design 2 fails 3 and 1.1; design 3 fails 2; and "Another" fails 1.2.

Something that just had the WP article title (possibly shortened) followed by "at/@ Wikipedia" would be better. If wording in the definition were close to the title of the WP article, then a template that had "Wikipedia" in small superscript over the piped link to the WP article might be OK.

I wonder whether any in-definition links to WP are desirable. I'd be surprised to find that the vast majority of users come to us for anything other than definitions and translations. Anything that risked distracting users from getting to understandable definitions and translations seems undesirable to me. DCDuring (talk) 18:50, 8 October 2023 (UTC)[reply]

How do designs 1 and 3 fail to make clear that the link goes to Wikipedia when one explicitly says "Wikipedia article" and the other one literally contains the Wikipedia logo? I don't see consideration #3 as a problem either, personally: we even have a section on pages with that name. I can't even really say it's a command per se, it's seriously just wiki jargon. MedK1 (talk) 20:37, 8 October 2023 (UTC)[reply]

@Einstein2, MedK1: I thought the idea was that the article title appeared instead of the words "Wikipedia article". We try to exclude wikijargon from definitions and from user-facing pages in general. DCDuring (talk) 14:28, 9 October 2023 (UTC)[reply]

If that's how you saw it, then I can see where you were coming from; I interpreted it as the text staying the same while clicking it would take you to different places. Wikipedia article and Wikipedia article for example. MedK1 (talk) 14:34, 9 October 2023 (UTC)[reply]

@MedK1 FYI, Appendix:Glossary has a lot of inline links to Wikipedia. They are typically formatted like this: (See also {{pedia|Voice (grammar)}}) which produces:

(See also Voice (grammar) on Wikipedia.)

We have an awful lot of Wikipedia linking templates already (see Category:Interwiki templates) but I think something called e.g. {{seepedia}} that mimics the way that Appendix:Glossary does its links could work. (I also don't see why using the imperative form is a problem.) Benwing2 (talk) 23:09, 8 October 2023 (UTC)[reply]

So {{seepedia}} would prefix {{pedia}} with "See " or "see "?

Something in the imperative implies that the user is better off to click through than not to. Not everybody filters out the mostly academic convention of "see" and "see also". DCDuring (talk) 14:28, 9 October 2023 (UTC)[reply]

Funny you should mention these links - I was actually thinking about going through the entries that use them and converting them to standard {{wikipedia}} or {{pedia}} links. Can you identify an entry where the placement of these links as right-floating boxes or in the "further reading" section is unsatisfactory? This, that and the other (talk)

On a related note, a simple search for "wikipedia.org" turns up a number of html links to Wikipedia that I've been cleaning up. The really dumb ones are in the |authorlink= parameter, which would be converted to html by the template if just the article name was used. I'm not really sure what to do about the ones that link to old revisions, though. Then there are the hits for "m.wikipedia.org" which force the mobile view on anyone who clicks the link. Chuck Entz (talk) 01:16, 9 October 2023 (UTC)[reply]
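
The kind of cleanup described above could be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not the script anyone here actually uses: the function name `to_wikilink` and the `WP_URL` pattern are hypothetical, the sketch only looks at English Wikipedia article URLs (desktop or mobile), and it deliberately leaves anything with a query string, such as old-revision links, untouched for manual review.

```python
import re
from urllib.parse import unquote

# Assumed scope: plain article URLs on en.wikipedia.org or en.m.wikipedia.org.
# The title may not contain whitespace, "]", "|", "?" or "#".
WP_URL = re.compile(r"https?://en\.(?:m\.)?wikipedia\.org/wiki/([^\s\]|?#]+)")


def to_wikilink(url: str) -> str:
    """Return [[w:Title]] for a plain article URL, else the URL unchanged."""
    match = WP_URL.fullmatch(url)
    if match is None:
        # Old-revision links (?oldid=...) and other forms are left as they are.
        return url
    title = unquote(match.group(1)).replace("_", " ")
    return f"[[w:{title}]]"
```

Because the mobile and desktop hosts map to the same wikilink, a conversion like this would also remove the forced mobile view as a side effect.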

Yeah, I think we should normally use the {{wikipedia}}/{{pedia}} templates; creating a template for definition-line 'see Wikipedia' links would result in people trying to add such links systematically to as many definitions as possible, whereas I think very few definitions should actually contain wikipedia links, and such links (where they are useful) are almost always better worked into the prose, like "# Aesthetically pleasing due to being functional, in architect {{w|Richard Foobar's philosophy of design}}" rather than e.g. "# Aesthetically pleasing due to being functional, in architect Richard Foobar's design philosophy. See [[w:Richard Foobar's philosophy of design]]." If we're just talking about a link at e.g. truck to say "Wikipedia has a longer encyclopedia article about this kind of truck", even though we're able to adequately define that sense of truck in our entry (whereas we wouldn't have an entry for a SOP phrase like "Richard Foobar's philosophy of design" to explain that, only Wikipedia would), my initial inclination is that the {{wikipedia}}/{{pedia}} templates are better. - -sche (discuss) 17:14, 9 October 2023 (UTC)[reply]

Ideally the definition lines should be as clutter-free as possible, so templates should be kept to an absolute minimum, in practice {{lb}}, {{senseid}} and {{gl}}. Usually it's clear from the context which link belongs to which sense, so I don't see which problem this proposal wants to solve? – Jberkel 14:51, 9 October 2023 (UTC)[reply]

I agree with most of you that {{wp}}/{{pedia}} works perfectly fine in a majority of entries. However, there are at least two cases where I think a definition-line template would be certainly beneficial:

Proper noun entries with dozens of senses (mostly placenames); e.g. York, Jackson, Newport, Chester, Hamilton. Currently, most of these entries only link to a WP disambiguation page so readers must browse through another huge list in order to find the article for a particular sense. Displaying a box template for each sense would be infeasible.
Specialized senses of frequent words with many definitions; e.g. cross (verb 1.4), shoot (verb 9), brake (noun 5.3, 5.4), selection (noun 6, 7, 9, 10), bounce (noun 6), casting (noun 4). Each example contains a plain WP link that looks similar to the proposed template. Replacing these with the existing template formats would make it nearly impossible for readers to match the sense to the WP template (located somewhere else in the entry), depriving them of the opportunity to find more information about a given sense. Einstein2 (talk) 13:51, 11 October 2023 (UTC)[reply]

I think interwiki box templates should be avoided generally, but especially in long entries. For me, their main value is as marker of taxonomic name entries that need cleanup. The existing {{pedia}} template works adequately on definition lines, as Ioaxxere demonstrates below. In the old days [[w:WPEntryName]] was considered adequate, but doesn't pass the bathing-suit competition nowadays and isn't nice to newbies. DCDuring (talk)

I support the idea of using {{pedia}}. Here's an example of what that might look like on the vector entry:

(molecular biology) A DNA molecule used to carry genetic information from one organism into another. See also: Vector (molecular biology) on Wikipedia.

Ioaxxere (talk) 19:46, 12 October 2023 (UTC)[reply]

Should all Kyakala terms be converted to traditional characters

This is a Tungusic language written with Chinese characters. However, which type of script should be used for the entries? Currently the entries are in Simplified characters. Mahogany115 (talk) 10:10, 9 October 2023 (UTC)[reply]

Kyakala has only ever been written in those simplified characters (I don't think even IPA has been used before it went extinct), so it should not be converted.--Saranamd (talk) 16:02, 9 October 2023 (UTC)[reply]

Opportunities open for the Affiliations Committee, Ombuds commission, and the Case Review Committee

You can find this message translated into additional languages on Meta-wiki.


Hi everyone! The Affiliations Committee (AffCom), Ombuds commission (OC), and the Case Review Committee (CRC) are looking for new members. These volunteer groups provide important structural and oversight support for the community and movement. People are encouraged to nominate themselves or encourage others they feel would contribute to these groups to apply. There is more information about the roles of the groups, the skills needed, and the opportunity to apply on the Meta-wiki page.

On behalf of the Committee Support team,

~ Keegan (WMF) (talk) 16:41, 9 October 2023 (UTC)[reply]

non-integrated topical categories and what to do about them

Solomonfromfinland and other users have been busy creating and populating random non-integrated topical categories. I found all of them that are prefixed with en:. The following are my suggestions for how to handle them. Note that "keep" implies integrating them into the topic cat system.

Non-integrated category	Disposition
Category:en:3D printing	Keep?
Category:en:Accessory clouds	Delete? Seems too specific.
Category:en:Allotropes	Delete? Seems too general. An allotrope is "Any form of an element that has a distinctly different molecular structure to another form of the same element, with different physical properties and often different chemical properties." Note, we do have CAT:en:Isotopes.
Category:en:Allotropes of carbon	Delete? See above.
Category:en:American Revolution	Keep?
Category:en:Atomic bombings of Hiroshima and Nagasaki	Keep?
Category:en:Australian nicknames for people	Delete and move to a POS category 'Australian English nicknames' or similar.
Category:en:Belarusian demonyms	Keep? We could potentially have a ton of such categories, and indeed there are several such categories among this list, e.g. CAT:en:British demonyms, CAT:en:Demonyms for Americans, CAT:en:Greek demonyms. Existing integrated categories are Category:Armenian demonyms and Category:Latvian demonyms.
Category:en:Bestiality	Keep?
Category:en:Big Bang	Currently in RFD.
Category:en:British demonyms	Keep? See Category:en:Belarusian demonyms above.
Category:en:Brothers Grimm	Keep?
Category:en:CGS units	Keep? We currently have Category:SI units.
Category:en:Chaos theory	Keep?
Category:en:Child abuse	Keep?
Category:en:Circulatory system	Keep?
Category:en:Cities in the Roman Empire	Questionable. There was a BP discussion about this which generally concluded we shouldn't have categories like this for former polities.
Category:en:Colonialism	Keep? Although several terms in this category don't seem to belong, e.g. Pilgrim, Plymouth Rock, Roanoke, Thirteen Colonies and similar terms related to the colonization of the US; maybe we need a separate Category:en:Colonization.
Category:en:Copernican Revolution	Not sure. This category has terms like Newton's first law (and second and third laws), and various terms related to Kepler, Galileo and Tycho Brahe in addition to terms related to Copernicus.
Category:en:Counties of Australia	Keep, I think, but the individual entries (only 4 of them) need to be in subcategories like Category:en:Counties of Victoria (only some Australian states and territories have counties).
Category:en:Cryogenics	Keep?
Category:en:Demonyms for Americans	Keep but rename? See discussion above under Category:en:Belarusian demonyms. Either call it Category:en:American demonyms or Category:en:United States demonyms.
Category:en:Deuterium	Keep? But some entries are very questionable, like brown dwarf and hydrogen bomb.
Category:en:Differential equations	Keep?
Category:en:Districts of Kerala	Keep?
Category:en:Doujin	Delete? I think this is too specific.
Category:en:Dracula	Keep maybe? We have Category:en:Benedict Cumberbatch so it's hard to argue that we shouldn't have a category for Dracula.
Category:en:Element nomenclature	Delete and merge terms into Category:en:Chemical elements.
Category:en:Eponymous political ideologies	??? This has several such terms like Putinism, Stalinism, Trumpism, Bidenism, etc. but seems rather specific; maybe rename to 'Eponymous ideologies' and/or make it a POS category?
Category:en:Eschatology	Keep?
Category:en:Exotic atoms	Delete? I think this is too specific.
Category:en:Federal territories in Malaysia	Delete. Redundant to Category:en:Federal territories of Malaysia (note "of").
Category:en:Fictional materials	Keep?
Category:en:Fractals	Keep?
Category:en:Futurology	Delete; a mishmash of randomness.
Category:en:Genetic engineering	Keep?
Category:en:Ghosts	Keep? But make a set category and remove the non-set entries.
Category:en:Graphics	Delete, redundant to its parent Category:en:Visualization; or make a set category 'Types of graphs' or similar.
Category:en:Greek	Keep? We do have Category:German, Category:English, etc. for various languages, although I don't like this naming convention.
Category:en:Greek demonyms	Keep? See Category:en:Belarusian demonyms above.
Category:en:Groups of people	Delete. Too vague.
Category:en:Harry Potter appendices	Terminate with extreme prejudice. Instead, there should be an Appendix page listing the Harry Potter-related appendices.
Category:en:Hell	Keep?
Category:en:High-speed rail	Keep?
Category:en:Hindu mantras	Keep?
Category:en:History of science	Delete. On the surface seems reasonable but in reality it's a grab bag of randomness.
Category:en:Hybrids	Rename to 'Hybrid animals'.
Category:en:Hypothetical chemical elements	Delete. Come on now. Also delete Category:en:Obsolete element names, Category:en:Rejected element names and Category:en:Supposed chemical elements.
Category:en:Immortality	Delete. On the surface seems reasonable but in reality it's a grab bag of randomness.
Category:en:Infinity	Keep?
Category:en:Interstellar travel	Delete, redundant to Category:en:Space travel.
Category:en:Isaac Newton	Keep?
Category:en:Italian	Keep? See comment at Category:en:Greek.
Category:en:Jupiter	Rename to 'Jupiter (planet)'; cf. Category:en:Mars (planet).
Category:en:Kuala Lumpur	Keep.
Category:en:Macroenzymes	Delete. Too specific and has only two items in it.
Category:en:Maglev	Delete. See Category:en:High-speed rail above, which is enough IMO.
Category:en:Malaysian politics	Keep?
Category:en:Manifolds	Keep?
Category:en:Medical body positions	Rename to 'Body positions'.
Category:en:Megastructures	Delete. A bunch of sci-fi shit, better placed in other categories.
Category:en:Metallocene	I think this is too specific, but maybe should be kept, but renamed 'en:Metallocenes' as it's a set category.
Category:en:Metric prefixes	Keep?
Category:en:Metric system	Keep?
Category:en:Mikhail Gorbachev	Keep?
Category:en:Mobility aids	Keep? Maybe?
Category:en:Moby-Dick	Keep?
Category:en:Monarchism	Delete, redundant to 'Monarchy'.
Category:en:Multiracial	Delete, already in RFD.
Category:en:Muppets	Keep? Names of Muppets. Ugh, though.
Category:en:Napoleonic Wars	Keep?
Category:en:Native American languages	Delete. Only one term in it and not a family.
Category:en:Nikola Tesla	Keep?
Category:en:Non-Euclidean geometry	Keep?
Category:en:Obsolete element names	Delete. See Category:en:Hypothetical chemical elements above.
Category:en:Obsolete scientific theories	Delete.
Category:en:Overpopulation	Keep? Maybe?
Category:en:Paradoxes	Keep?
Category:en:Philosophers	Keep?
Category:en:Physical constants	Keep?
Category:en:Physical quantities	Keep?
Category:en:Plant milk	Keep but rename to 'Types of plant milk' or similar as it's a set category.
Category:en:Pluto	Keep but rename to 'Pluto (planet)'; cf. Category:Mars (planet) and Category:en:Jupiter above.
Category:en:Political parties	Keep?
Category:en:Polynomials	Keep? But remove the non-set terms.
Category:en:Prefectures of France	Keep?
Category:en:Provinces of Ukraine	Keep?
Category:en:Racist names for continents	Shoot on sight. Jesus.
Category:en:Racist names for countries	Nuke it.
Category:en:Racist names for places	Blow it to high heaven.
Category:en:Radar	Keep?
Category:en:Ramadan	Keep?
Category:en:Rape	??? A pretty awful category currently populated with various types of rape and related crimes.
Category:en:Rejected element names	Delete. See Category:en:Hypothetical chemical elements above.
Category:en:Rotation	Delete, I think this is too random.
Category:en:Rudyard Kipling	Keep?
Category:en:Shapes in non-Euclidean geometry	Delete? See Category:en:Non-Euclidean geometry above, which is maybe enough.
Category:en:Shapeshifters	Delete. I think Category:en:Characters from folklore and Category:en:Mythological creatures are enough.
Category:en:Ships with acronyms (fandom)	??? We already have a ton of categories related to "ships" (in the fandom sense); do we really need this and the three below?
Category:en:Ships with idiosyncratic names (fandom)	See above.
Category:en:Ships with initialisms (fandom)	See above.
Category:en:Ships with portmanteau names (fandom)	See above.
Category:en:Siblings	??? Has 69 entries in it but fairly random.
Category:en:Slavic paganism	Keep?
Category:en:Space colonization	Keep?
Category:en:Specialists	Delete. Only two items, redundant to Category:en:Healthcare occupations.
Category:en:Sphere	Rename to 'Spherical geometry'?
Category:en:Sports nicknames	??? These are names of sports teams. Not sure these names even meet CFI.
Category:en:State nicknames of the United States	Keep?
Category:en:Station codes	??? There was a BP discussion about this.
Category:en:Steel	Keep?
Category:en:Superconductivity	Keep?
Category:en:Supernovas	??? Possibly redundant to Category:en:Stars and Category:en:Astrophysics.
Category:en:Supposed chemical elements	Delete. See Category:en:Hypothetical chemical elements above.
Category:en:Systematic element names	Keep?
Category:en:Tesselation	??? Seems questionable.
Category:en:Topological spaces	Delete, redundant to Category:en:Manifolds above.
Category:en:Tritium	Keep?
Category:en:Trojan War	Keep?
Category:en:Types of chemical element	Keep? But rename to 'Types of chemical elements'.
Category:en:Ukrainian politics	Keep?
Category:en:Units of energy	Delete; redundant to Category:en:SI units and Category:en:Physical quantities.
Category:en:Utopian and dystopian fiction	Keep?
Category:en:Vampires	Keep?
Category:en:Vietnam War	Keep?
Category:en:Vulgar names for cities	Fuck me. Delete this.
Category:en:Vulgar names for places	Likewise.
Category:en:Werewolves	Keep?
Category:en:Whaling	??? Maybe? Currently a random collection of terms though.
Category:en:Zombies	Keep?

User:-sche, you might have thoughts about this in particular. Benwing2 (talk) 03:53, 10 October 2023 (UTC)[reply]

On a related note, maybe we need a better terminology for the top-level language-independent categories like Category:Mathematics and Category:English. We have confusing differences like Category:English (a top-level topic category) vs. Category:English language (a language POS category), Category:Extinct languages vs. Category:All extinct languages, etc. Benwing2 (talk) 04:17, 10 October 2023 (UTC)[reply]

Yeah, this is one reason I keep coming back to the idea that maybe we should "prefix" all the categories (so all the topic categories get "topic:" added to their name, and all the different types of sets get their own wording like "Named [foobars]" vs "Types of [foobars]", or maybe even "set:Named ..." and "set:Types of ..."); then we'd have "Category:topic:English" for the top-level category — assuming it's intended to be a topic category, anyway; it currently contains a mix of types of English and terms which have some sort of connection to the topic of language and spelling as used by English speakers. - -sche (discuss) 05:40, 10 October 2023 (UTC)[reply]

Category:en:Atomic bombings of Hiroshima and Nagasaki may be too specific. Most of the entries in the category are a few steps removed from being about the bombings, e.g. trinitite, Manhattan Project, VP Day; if the off-topic entries are removed I wonder if enough relevant entries remain to support a category. (We don't seem to have a corresponding Japanese category, which might have been expected to contain more entries.) - -sche (discuss) 05:40, 10 October 2023 (UTC)[reply]

@-sche I've been meaning to make a longer posting about sets vs. names vs. types. I added the functionality to support a type param in the topic definitions, with the values "topic", "set", "name" and "type", with the intention that "set" categories should be converted into either "name" or "type" categories. You can actually put a comma-separated list in type, something like type = "type,topic" for Category:Beards and type = "topic,name,type" for Category:Flags. These are indicative of categories that should be split. One issue I'm running into is that not all set categories can be easily assigned to the name vs. type distinction, or at least it isn't clear which one should be used. For example, Category:en:Heraldic charges contains terms for shapes found on coats of arms, and Category:en:Heraldic tinctures contains terms for colors found on coats of arms (heraldry has its own lingo for colors and shapes). Maybe these are "types" but then the default definition becomes e.g. "English terms for types of heraldic tinctures" which reads weirdly to me and suggests that it should contain generic types of colors rather than specific actual colors. Similarly for all the animal categories, which frequently contain genuses and species; you could argue the genuses are types, but the species seem rather specific for that. I wonder if we should use the terminology "class" instead of "type"; this terminology gets a bit technical but "class" is opposed to "instance", and the class-instance distinction maps well onto what we're trying to accomplish by splitting "types" and "names". Note also that some categories currently have 'type' or a synonym in their name, e.g. Category:Types of planets, Category:Bicycle types, Category:Literary genres (= "types of literature"), Category:Manga genres (= "types of manga", etc.), Category:Musical genres, Category:Film genres, Category:Video game genres, Category:Forms of government (= "types of government"), Category:Forms of discrimination (= "types of discrimination"). Benwing2 (talk) 08:29, 10 October 2023 (UTC)[reply]
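
As a rough illustration of the comma-separated type field described above, here is a minimal Python sketch. It is not the actual Lua module code; the function names `parse_types` and `needs_attention` are hypothetical, and the rule that a bare "set" still needs conversion is inferred from the stated intention, not from the implementation.

```python
# Hypothetical sketch; the real logic lives in Lua modules and may differ.
VALID_TYPES = {"topic", "set", "name", "type"}


def parse_types(type_field):
    """Split a comma-separated type value such as "topic,name,type"."""
    types = [t.strip() for t in type_field.split(",") if t.strip()]
    unknown = [t for t in types if t not in VALID_TYPES]
    if unknown:
        raise ValueError("unknown type(s): " + ", ".join(unknown))
    return types


def needs_attention(type_field):
    """Flag categories that still need work.

    Several types suggest the category should be split; a bare "set" is
    assumed to be awaiting conversion into "name" or "type".
    """
    types = parse_types(type_field)
    return len(types) > 1 or types == ["set"]
```

Under these assumptions, `needs_attention("type,topic")` flags Category:Beards as a split candidate, while a plain `"topic"` category is left alone.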

Maybe even more to the point: Category:en:Diacritical marks, e.g. grave accent, oxia, candrabindu, etc. The current definition says "names of diacritical marks". Are these names or types? Benwing2 (talk) 09:39, 10 October 2023 (UTC)[reply]

Oof, yes, that's tricky. I wonder if we should use 'word-then-colon' style prefixes in the actual names of the categories, and add verbiage after "set:" whenever it was necessary to distinguish "specific foobars" from "types of foobars", so "Category:en:set:Types of wars" (civil war, war of attrition, war of conquest, police action, armed conflict, ...), "Category:en:set:Named wars" or "Category:en:set:Individual wars" or "Category:en:set:Specific wars" or whatever (World War I, World War II, Vietnam War, ...), but then some things could just have "set:" without the extra verbiage if we aren't going to split the category, so "Category:en:set:Heraldic tinctures". Technically, we could split individual tinctures like gules and argent and vair off from types of tinctures, but the latter category could only ever contain a handful of entries—metal, colour, stain, fur—so I'm not sure it'd be reasonable to split. If we want to include placenames in the naming scheme, they could also just have "set:" like "Category:en:set:Towns in the United Kingdom". And either {{autocat}} could know to continue to display "English names of [towns in the United Kingdom, etc]" or "English terms for [heraldic tinctures, etc]" when the category name is just "set:" and not "set:Types of..." or "set:Named...", or we could store the information somewhere (the way we currently have Module:category tree/topic cat/data/History etc) that tells it to do that. (This proposal should still work even if we decide to leave placenames — and other names? — where they are.) - -sche (discuss) 22:00, 10 October 2023 (UTC)[reply]

@-sche Thanks for your input. All of this makes sense and I think I will proceed for now in trying to clarify the nature of each existing category. Since we are likely to have categories where the type (topic, set, name, type) isn't derivable from the category name (cf. CAT:en:Literary genres, which is probably clearer than CAT:en:Types of literature), I think it makes sense to continue to require that the type field be specified in the definition of each category. The way the code works currently, if the description has the value of "default", it will automatically display "names of", "types of", "terms for various" or "terms related to" + the topic name. You can also customize just the topic name by specifying the description as e.g. "=the {{w|Barbie}} fashion doll produced by Mattel" for CAT:en:Barbie, which will automatically get converted to "{{{langname}}} terms related to the {{w|Barbie}} fashion doll produced by Mattel" since CAT:Barbie is specified using type = "topic". Benwing2 (talk) 22:57, 10 October 2023 (UTC)[reply]
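
The description behaviour just described can be sketched roughly as follows. This is an illustrative Python approximation, not the actual module code: the function name `describe` is hypothetical, only the pairing of "topic" with "terms related to" is confirmed by the Barbie example, and the other three pairings, the exact wording around each phrase, and the handling of fully custom descriptions are assumptions.

```python
# Hypothetical sketch of the behaviour described above; not the real module.
# Only "topic" -> "terms related to" is confirmed by the Barbie example;
# the other three pairings are assumed from the order of the lists.
DEFAULT_PREFIX = {
    "name": "names of",
    "type": "types of",
    "set": "terms for various",
    "topic": "terms related to",
}


def describe(langname, topic, cat_type, description="default"):
    """Build a category description from its type and description fields."""
    prefix = DEFAULT_PREFIX[cat_type]
    if description == "default":
        subject = topic            # use the topic name unchanged
    elif description.startswith("="):
        subject = description[1:]  # "=..." customizes only the topic name
    else:
        return description         # assumption: other text is used verbatim
    return f"{langname} {prefix} {subject}"
```

For instance, `describe("English", "Barbie", "topic", "=the {{w|Barbie}} fashion doll produced by Mattel")` yields the "English terms related to the {{w|Barbie}} fashion doll produced by Mattel" wording quoted above.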

Agree with the list, if only so as not to obstruct, though the astrophysical stuff is above my paygrade. If something intelligent is wrongly deleted, I can still create an integrated category.

Keeping Category:en:State nicknames of the United States but deleting the toponymical slurs (“racist names of …”) is contradictory though. If we have ethnic slurs it is very much logical to also have the analogon for toponyms instead of demonyms, but with a better name. Fay Freak (talk) 07:57, 10 October 2023 (UTC)[reply]

@Fay Freak The main difference is that the state nicknames are well attested and reasonably common, whereas the various racist toponyms are typically rare Usenet-only terms. For the ones that aren't, CAT:English ethnic slurs is probably good enough. Benwing2 (talk) 08:12, 10 October 2023 (UTC)[reply]

Category:en:Allotropes of carbon makes sense - these are all arrangements of carbon atoms and there are many of them; the other allotrope category feels messy, and I would split it by element but that would mean the categories are too small except for carbon. The element names one can be merged into Category:en:Chemical elements. The terms in Category:en:Futurology are all reasonably closely related to futurology, but yes it's a bit of a mish-mash right now - I'm on the fence about whether to delete it or not. Category:en:Tesselation should be renamed to the correct spelling Category:en:Tessellation, if not merged into a less specific category. – wpi (talk) 10:23, 10 October 2023 (UTC)[reply]

Presumably each category to be deleted will be run through WT:RFDO?

Could we have any simple objective criteria for determining:

when a category should/could be deleted without being run through WT:RFDO and
what should be done with the members.

Categories that are "too small" and whose members are includable or included in other categories (that capture the essence of the to-be-deleted category [Can this be objectified?]) would be examples. A procedure for categorizing the members would be desirable. This may not be possible, but perhaps we could at least lay down practical guidelines. DCDuring (talk) 13:10, 10 October 2023 (UTC)[reply]

@DCDuring I did it this way because having 136 or so separate entries in WT:RFDO would be overwhelming, but I agree that it would be useful to have some clear criteria for how to decide when a category can be deleted without going through WT:RFDO. Hopefully we can get some consensus on which categories to keep and which to delete through the BP process. I can RFD some of the groups of categories together that I think should be deleted (e.g. the chemical-element-related categories and the "racist/vulgar terms for" categories) but I'm not sure about the one-offs. As for the members of the categories I'm proposing to delete, I can add a column indicating approximately where to move them to, although there will inevitably be some per-category judgment needed for certain entries. Benwing2 (talk) 23:03, 10 October 2023 (UTC)[reply]

It's a good way to start, but there are only about 30 items which you think should be deleted. I don't think that would overwhelm WT:RFDO. Maybe some can be speedied. There also seem to be items for WT:RFC and WT:RFM. Grouping would be nice to help focus, yet generalize, a deletion discussion. BP is the place for policy. If we can tease out some policies or, more likely, guidelines from reviewing a large number of cases, that seems to me like a good use of BP.

BTW, the category system and the use of {{autocat}} do not seem transparent to less-frequent contributors or to aging ones like me. As you know, the result of excessive template opacity is that folks are more likely to hard-code (if they know how) instead of using templates. I believe that those who have trouble and can't work around are likely to fail to contribute or even abandon enwikt as users. DCDuring (talk) 15:31, 11 October 2023 (UTC)[reply]

This is a good point, DCDuring. If you follow the instructions in the edit notice at Cat:en:Something and add {{auto cat}} to the page, you see this message:

	The automatically-generated contents of this category has errors.
	The label given to the `{{topic cat}}` template is not valid. You may have mistyped it, or it simply has not been created yet. To add a new label, please consult the documentation of the template.

This is doubly problematic: I didn't give a label to {{topic cat}}, I just added {{auto cat}}! And the words "documentation of the template" don't link to the documentation; you're expected to know that you need to click the template name.

And then once you get to the documentation of Template:topic cat/documentation, the page very much assumes you know Lua and how our modules are organised.

Perhaps we don't want the bar to adding topic categories to be too low, but if seasoned contributors like DCDuring are uncomfortable with the process, we have a problem. This, that and the other (talk) 22:20, 11 October 2023 (UTC)[reply]

lmao I'm the panphobic "free speech extremist" your mother warned you about, but "Racist names for continents", "Racist names for countries", and "Racist names for places" can surely only have been planted by the CIA or MI5 to try to discredit us. Hilarious. Equinox ◑ 15:33, 12 October 2023 (UTC)[reply]

Category:Racist Harry Potter ships with portmanteau names Jberkel 23:02, 12 October 2023 (UTC)[reply]

Deprecating parameter ref= in Template:zh-x

{{zh-x}} has a |ref=, which uses plain text formatting and does not go through Module:quote etc, meaning a lot of those features are lacking; furthermore sometimes a plain Youtube link or a simple Wikipedia link is used, making it extremely unhelpful to readers and editors. ~~Well at the very least there's a link so one could verify the quotation, so I guess I shouldn't complain about it~~.

Also, {{zh-x}} could either be a usage example or a quotation, depending on whether |ref= is present or not - this applies to the distinction between #: vs #* (which creates a special case and makes bot jobs more complicated - sometimes the bot edits have to be reverted), and the categorisation into Category:LANG terms with usage examples and Category:LANG terms with quotations. Some of us have been using the {{quote-*}} templates with {{zh-x}} nested under it, but this fails to categorise properly because of the supposed need of |ref= being present. Note that |ref= also accepts a number of predefined titles (see Module:zh-usex/data), e.g. |ref=Analects automatically outputs the reference for Analects.
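
To make the special case concrete, here is a minimal Python sketch of the dual behaviour described above. It is only an illustration: the function name `classify_zh_x` is hypothetical, the real logic is in a Lua module, and `LANG` is filled in with whatever language name the category uses.

```python
# Hypothetical sketch of the behaviour described above; not the real Lua code.
def classify_zh_x(lang, ref=None):
    """Return (expected list marker, category) for a {{zh-x}} call.

    With |ref= present the example counts as a quotation ("#*");
    without it, it counts as a usage example ("#:").
    """
    if ref:
        return "#*", f"Category:{lang} terms with quotations"
    return "#:", f"Category:{lang} terms with usage examples"
```

Under this model, a {{zh-x}} nested inside a {{quote-*}} template has no |ref= of its own, so `classify_zh_x("LANG")` still returns the usage-example category, which is the miscategorisation noted above.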

Proposal: I suggest that we should convert all of the |ref= usages to use the relevant {{quote-*}} templates and to create a {{zh-q}} that categorises into the quotation categories (a la {{zh-co}} for collocations); the titles at Module:zh-usex/data can be replaced by {{RQ:<title>}} templates, depending on the frequency of usage (I doubt all of them are used that many times such that a template is needed). (Notifying Atitarev, Tooironic, Fish bowl, Justinrleung, Mar vin kaiser, RcAlex36, The dog2, Frigoris, 沈澄心, 恨国党非蠢即坏, Michael Ly, ND381): – wpi (talk) 15:14, 10 October 2023 (UTC)[reply]

Support. — justin(r)leung { (t...) | c=› } 15:41, 10 October 2023 (UTC)[reply]

Support. The dog2 (talk) 16:53, 10 October 2023 (UTC)[reply]

Support. ND381 (talk) 18:48, 10 October 2023 (UTC)[reply]

Support. I also note that there's {{Q}}, but I've looked at it a few times and never figured out how to adapt zh-usex/data to it. —Fish bowl (talk) 19:40, 10 October 2023 (UTC)[reply]

It seems we might want to use {{Q}} in certain places (e.g. specifying the chapter numbers), but I reckon the {{RQ}} ones would be more convenient because of how Chinese formats the quotations, which is not compatible with |quote= in {{Q}}. Adapting {{Q}} wholesale would necessitate the use of {{zh-q}} anyways, so {{RQ}} (which have much greater customisability) would probably be more convenient. – wpi (talk) 03:11, 11 October 2023 (UTC)[reply]

Comment: I've created {{zh-q}} accordingly. – wpi (talk) 02:54, 11 October 2023 (UTC)[reply]

Belter Creole

Someone decided to bypass the normal language creation process and created 363 lemmas for "Belter Creole", assigning an arbitrary code 'art-blt' that isn't recognized by Wiktionary and coding everything entirely manually. See Category:Belter Creole lemmas. We have three choices IMO:

Normalize this appendix-only constructed language by assigning a code to "Belter Creole" (IMO 'art-bel' is better than 'art-blt') and do a bot cleanup to fix the lemmas to use standard templates.
Delete the entries wholesale.
Move all the entries into the userspace of the editor who created them, and delete the categories.

Wikipedia has an article on Belter Creole that claims it's used by The Expanse (TV series), a Syfy show I've never heard of:

Belter Creole, also simply known as Belter, is a constructed language developed by the linguist and polyglot Nick Farmer for The Expanse television series. In the universe, it was spoken by Belters, inhabitants of the asteroid belt and outer planets of the Solar System.

It should be noted, however, that the Wikipedia article was almost entirely written by the same person who created all the Belter Creole lemmas on Wiktionary, so I'm somewhat skeptical of the notability of this conlang. Benwing2 (talk) 09:00, 13 October 2023 (UTC)[reply]

I would go with option 1, since the entries seem to be pretty good quality already. Ioaxxere (talk) 05:09, 14 October 2023 (UTC)[reply]

@Benwing2 I thought we weren't in a hurry to expand our coverage of conlangs? As I recall, some people were against Interslavic being included on Wiktionary, let alone something that was invented for a TV show. Have we changed our mind? Chernorizets (talk) 05:30, 14 October 2023 (UTC)[reply]

Nvm, I hadn't noticed it's appendix-only. Can conlangs on Wikt claim that their lemmas are "derived from" a natural language? I see categories like "Belter Creole terms derived from Italian" and it just strikes me as a bit odd. Chernorizets (talk) 05:36, 14 October 2023 (UTC)[reply]

@Chernorizets (e/c) No. It was not my intent to express an opinion in favor of keeping these entries, I'm just giving options. Personally I am leaning towards not keeping them (i.e. option #3), but I'd like to get a consensus in favor of one of the options before doing anything. In response to your question about deriving conlang terms from natural languages, this is frequently done e.g. with Esperanto, where many of the terms have a clear origin in some European language. I agree that this seems a bit strange, though. Benwing2 (talk) 05:40, 14 October 2023 (UTC)[reply]

@Benwing2 I don't know OTOH how much work Option 1 is, but I'm assuming this appendix-only language isn't hurting anybody, so that would be the option I choose. That's also because I don't have a good grasp on what our criteria (if any) for appendix-only languages are. I've seen appendix-only terms that are for e.g. video game-specific lingo in English, so right now my level of understanding is "appendix = Wild West". Chernorizets (talk) 05:48, 14 October 2023 (UTC)[reply]

@Chernorizets The difference between single appendix pages that list things like in-game terminology for a given video game and a full appendix-only conlang is that the latter has a bunch of categories in mainspace. (Lemmas for the former currently end up mixed in with normal English lemmas, but there is consensus to change that based on a recent discussion I think in the BP, which would just involve changing {{head}} to e.g. categorize lemmas for appendix terms into 'English appendix-only lemmas'; whereas for the conlang categories it's more difficult to see how to make them identify themselves as appendix-only.) Benwing2 (talk) 07:34, 14 October 2023 (UTC)[reply]

I find Wiktionary's war on conlangs an utter nuisance, so I'll go for option 1, which is the least shitty of all. Nothing is gained and much is lost by deleting them. Steinbach (talk) 21:49, 16 October 2023 (UTC)[reply]

Agreed. I vote for option 1 as well. The Expanse is a relatively popular show (and its attention to detail in other areas suggests to me that the conlang is probably better quality than most). It doesn't hurt us to have this, so I don't see why it would be a problem. Andrew Sheedy (talk) 05:11, 19 October 2023 (UTC)[reply]

OK, I added code 'art-bel' and cleaned up the lemmas. This took rather more work than I thought it would, so I would definitely be opposed to repeating this process, and it should not set a precedent for sneaking conlangs into Wiktionary. Benwing2 (talk) 02:58, 27 October 2023 (UTC)[reply]

@Benwing2 kudos! Chernorizets (talk) 03:30, 27 October 2023 (UTC)[reply]

bad reversion by User:DCDuring

My use of {{auto cat}} on Category:Taxonomic hypernym templates was reverted by User:DCDuring claiming it was "destructive". Now, User:DCDuring may not like {{auto cat}} for category definitions but it is the standard way of doing things, and manually entering a bunch of wikicode is not. Rather than get in an edit war I'm bringing this issue here to clarify that {{auto cat}} is correct for categories and we should not be manually coding category definitions. Benwing2 (talk) 18:29, 13 October 2023 (UTC)[reply]

I fail to see how the manual input is more effective. This comes across as fear or hatred of automation for the sake of it. Vininn126 (talk) 18:32, 13 October 2023 (UTC)[reply]

I dislike losing functionality and don't get the complexity of the category scheme as it applies to taxonomic entries. Before I reverted the change, there was an absence of proper display of a table of new additions to the category and of the alphabetical index. As I am essentially the sole user and maintainer of such categories, I don't see the point of uniformitarian Lua imperialism. DCDuring (talk) 18:37, 13 October 2023 (UTC)[reply]

The alphabetical index worked under Lua. (Also, not sure that you're the only person who uses this cat.) CitationsFreak (talk) 14:42, 14 October 2023 (UTC)[reply]

As did the display of new additions to the category. I have therefore reinstated {{auto cat}}, given that @DCDuring's complaint doesn't seem to have any substance. If the description needs adding to, it can be edited in the proper place. Theknightwho (talk) 19:29, 14 October 2023 (UTC)[reply]

If changes of the {{autocat}} variety never impaired functionality, w:I, for one, welcome our new insect overlords, but they do - and often. The approach seems to be if the change works for test cases of the change advocates' choosing, usually involving neighborhoods adjacent to their existing empire, then it is imposed on all. DCDuring (talk) 18:45, 13 October 2023 (UTC)[reply]

Can you give a fer-instance of how {{auto cat}} was causing a problem? —Justin (koavf)❤T☮C☺M☯ 19:10, 13 October 2023 (UTC)[reply]

Beyond my paygrade to trace why it didn't work when I originally removed it. Perhaps the required higher-level categories weren't created, not that I even know for sure that such are required. Perhaps someone had fiddled with something somewhere with the customary unawareness of bad consequences caused by ambitions exceeding skill level.

All that I know is that Benwing replaced the previous category page content with {{autocat}} on October 2. On the lucky 13th it wasn't working and I couldn't create subcategories without error messages, so I went back to the last thing that was working. Now it works, but not as well as before because I can't adjust the number of new additions displayed and remove the display of the oldest ones. I have already expressed my complaint about the opacity, complexity, and poor documentation for the {{autocat}} system. Not to mention that the topic categories don't have criteria for their creation, structure, and membership. Some of the categories being added now that Benwing2 objects to do not violate any written policy or guidelines, so contributors have no guidance and there is no convenient, principled way to remove them, other than by individual RFDO. DCDuring (talk) 21:08, 14 October 2023 (UTC)[reply]

What does it take to
1. suppress the oldest items display
2. expand the new items display
3. make the categories display as two columns?

It seems like a rectal tonsillectomy just to change the text. Techno control-freak imperialism? DCDuring (talk) 00:31, 15 October 2023 (UTC)[reply]

@DCDuring I'm working on the documentation. The poscatboiler documentation is in decent shape but I haven't done a lot of work on the topic cat documentation. You can always press the "Edit category" button and it will bring you to the module code for the category in question; you should be able to search for the category name and change the text if you want to. As for changing the layout of the newest and oldest items, the problem is that different people have different ideas for how many items to display, so IMO it's not a good idea to hardcode this on a per-category basis like you have been doing. Probably the best thing is for this to be customizable with CSS; maybe User:Sokkjo/User:Victar or User:This, that and the other or someone who knows CSS well can give you the right CSS instructions for how to do this. As for displaying the categories in two columns, we had a discussion about this recently; the problem seems to be that either you have to push the start of the categories down below all the stuff on the right side, leaving a bunch of whitespace above the categories, or you'll get whitespace on the right side below the right-aligned elements. Ideally the categories would wrap around the right-aligned elements but I'm not sure that's possible (regardless of whether we use {{auto cat}} or hard-code the wikitext). Benwing2 (talk) 00:44, 15 October 2023 (UTC)[reply]

BTW I am going to bring up another BP discussion soon with my ideas for guidelines for what sorts of topic categories are reasonable and which ones are not. Benwing2 (talk) 00:45, 15 October 2023 (UTC)[reply]

@DCDuring are these changes something that you believe should be in place for all Wiktionary readers and editors, noting that they may have varying reasons for visiting the category page and a multitude of screen sizes and device types? Or are you requesting ways to make this display more convenient for your own personal needs, so that the technical changes are applied in your personal CSS only? Either one can be taken care of, it's just important to distinguish the two. This, that and the other (talk) 01:32, 15 October 2023 (UTC)[reply]

Excellent question. I am skeptical about the value of oldest additions to the category for anyone. I wonder whether the oldest/newest display has to be on every page of the category-member display for anyone.

I suppose that the default closing of the newest/oldest display should take care of the needs of new/occasional users. Frankly, I had simply forgotten that I had the option of closing that display by default, which made me temporarily (I hope) an example of a naive user with respect to this matter. The oldest/newest display seems to narrow the effective space available for the columns of category members, so having it closed by default, which @Geographyinitiative had complained about, makes for a better experience for new/occasional users. DCDuring (talk) 13:56, 15 October 2023 (UTC)[reply]

I'm not sure what the discussion here is about, but I will say that I have never actually understood what the "oldest" part actually means. I guess I could use it to find pages in a category that have been neglected, but I've never tried to edit that way- I could try it a little right now. However, I have used the "newest" section very recently for at least two things- on the "en:Places in China/Taiwan" categories when someone added the Davis Line to the category, and before that when LlywenyllII added Slender West Lake. In both cases I expanded my own knowledge and helped grow Wiktionary's grasp of the entry. I would not have opened the box on the off chance something new might be there- these categories go long periods without new entries. --Geographyinitiative (talk) 14:05, 15 October 2023 (UTC)[reply]

They're the pages which have been in the category the longest. I agree that they don't really seem to have much value. Theknightwho (talk) 14:06, 15 October 2023 (UTC)[reply]

Okay, I tried to use the "oldest" part in Category:en:Places in China, just to see if it would be useful. I added some changes on YRD, South Manchuria, Si, Ningpo, and PRD, which are/were listed as having the "oldest" edits. After my edits today, I have refreshed the Category:en:Places in China category page in my browser numerous times, and after my edits & after the refresh, sometimes you can STILL see these five entries as the "oldest" (1 through 5). So the "oldest" index isn't snappy about updating after a new edit, and it also leaves some entries on there even after a recent edit (today). I remember having this same problem long ago when I tried to use this "oldest" list. So the technical functionality of that list may be weak in some way that I don't understand- it doesn't update as quickly as the "newest" list for sure. But in defense of the "oldest" list, I would say that the theory of its potential usefulness is strong. You have to imagine that the "newest" entries and the "oldest" edited entries are hotspots for entries that are malformed in some way. UPDATE: Several hours later, I am refreshing the category and there is still no update on the "oldest" list. Note that in my experience, the "newest" list usually updates pretty fast. --Geographyinitiative (talk) 14:55, 15 October 2023 (UTC) (Modified)[reply]

@Theknightwho: "They're the pages which have been in the category the longest"- how can that be true? That would seem to make the category eternally static (if those ten were never removed). Is that true? Impossible. Ningpo was created in 2015, and it's in the "oldest" list for Category:en:Places in China. What is the actual function of the "oldest" list? I would really like to explore that. Is the functionality of the "oldest" list broken somehow? If it comes down to it, I wouldn't throw a good system (the "newest" list) out with a broken functionality. But I want to find out what the original function of the "oldest" list is and see if I can use it somehow. I would urge others to avoid eliminating the user friendly feature that is the "newest" list. It's straightforward and helpful. And maybe the "oldest" list is just broken, and I refuse to judge the "oldest" list as bad just because it is broken at this time. --Geographyinitiative (talk) 19:43, 15 October 2023 (UTC)[reply]

I like to have information on the oldest entries for maintenance categories, so that I can eliminate the longest-standing maintenance items first. For instance, it used to be that Category:Requests for review of French translations had a table displaying the oldest additions to the category (i.e. entries that had been in the category the longest), which I tried to deal with first. Unfortunately, that was changed a few years ago, so that it now only displays oldest pages according to their last edit, which is useless for what I want. Andrew Sheedy (talk) 21:30, 15 October 2023 (UTC)[reply]

@DCDuring perhaps we can organise something where the decision to collapse/expand the box is remembered, so that collapsing the box will keep it collapsed on every category you visit, until you expand it again. What do you think of that? This, that and the other (talk) 00:38, 16 October 2023 (UTC)[reply]
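
The idea of remembering the collapse/expand choice could be sketched roughly as follows. This is only an illustration under assumptions: the storage key and the function names are hypothetical, nothing here is an existing Wiktionary gadget or setting, and how the toggle is wired to the actual box depends on markup not discussed in this thread.

```javascript
// Hypothetical storage key; not an existing Wiktionary setting.
const STORAGE_KEY = 'newestOldestBoxCollapsed';

// Read the remembered state. `storage` is any object with getItem/setItem,
// such as window.localStorage. Returns `fallback` when nothing is stored yet.
function loadCollapsed(storage, fallback) {
  const value = storage.getItem(STORAGE_KEY);
  return value === null ? fallback : value === '1';
}

// Remember the user's latest choice as '1' (collapsed) or '0' (expanded).
function saveCollapsed(storage, collapsed) {
  storage.setItem(STORAGE_KEY, collapsed ? '1' : '0');
}

// In-memory stand-in for localStorage, so the sketch can run outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}
```

In a browser one would presumably call `loadCollapsed(window.localStorage, true)` when a category page loads and `saveCollapsed(window.localStorage, ...)` each time the box is toggled, so that the choice carries over to every category visited afterwards.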

It would be nice, but not really essential for me. If it is expanded and annoys me, I'm now not likely to forget how to collapse it. But you may recall that @Geographyinitiative was peeved about it, too. DCDuring (talk) 01:47, 16 October 2023 (UTC)[reply]

I agree with User:Andrew Sheedy that the "oldest" display is useful for not-too-frequently reviewed maintenance categories that are emptied. It is much harder to see the point for categories that are not routinely emptied and tend to grow. It would be more interesting for such categories to see both the newest additions and the newest removals, not that I would find that very useful at this time. I empty the maintenance categories I tend usually daily or more frequently, so even the newest additions display is not useful unless I am away for a long time. DCDuring (talk) 02:11, 16 October 2023 (UTC)[reply]

Category:Geographic lines or Similar

Hey- Wiktionary:Scope mentions under Place names that "Cultural and geographical regions and dividing lines" are in scope. I think there is an unexplored potential category, both on Wikipedia and Wiktionary, for geographic lines that have names- this would include: Mason-Dixon Line, Davis Line, International Date Line, Surovikin Line, Maginot Line, Line of Actual Control, Iron Curtain, nine-dash line, Military Demarcation Line, McMahon Line and similar. Idk how to implement this or what it should be called, but this area of geography would be well-served by having its own independent category. --Geographyinitiative (talk) 23:31, 14 October 2023 (UTC)[reply]

I agree that these items should be categorized, but after a moment of pondering, I'm not entirely certain on whether all these examples belong together, or what the best terms would be. The majority of these are some form of political boundary, separating the area of control of one nation or region from that of another. They are not merely lines, they are borders between discrete areas; and it may be misleading to simply call them geographic, as they relate to divisions between man-made concepts and many are purely imaginary, rather than objectively representing physical geography, like a river or a mountain range.

The Maginot Line differs from the majority of your examples, as it refers to a man-made physical obstacle, akin to the Great Wall of China or Hadrian's Wall. The International Date Line is also somewhat unique, since it doesn't separate two entities -- technically, it borders itself. But it is not purely derived from a single geographic line of latitude or longitude, so it could still be argued that it belongs with the other borders, rather than together with the prime meridian or the Tropic of Capricorn.

My opinion would be that most of these examples would fit under a category termed Category:Political boundaries or something similar, perhaps with a further division into "natural", "man-made", and "intangible" to separate rivers from walls from arbitrary scribbles on a map? Qwertygiy (talk) 20:04, 15 October 2023 (UTC)[reply]

@Qwertygiy I would like to go ahead and create 'Category:en:Political boundaries' under 'Category:en:Places' or similar. If you would like to do it, go ahead, otherwise I will try to do it myself in the coming days. --Geographyinitiative (talk) 20:48, 15 October 2023 (UTC)[reply]

Unfortunately, I'm not at all familiar with the process of adding a new topic to the Template:auto cat system, and by the discussion at #non-integrated topical categories and what to do about them above, it doesn't look like something I have time to figure out at the moment. But don't let my ignorance stand in the way of your progress. Qwertygiy (talk) 21:56, 15 October 2023 (UTC)[reply]

@Qwertygiy @Geographyinitiative I would call these "demarcation lines" or "political demarcation lines", and Wikipedia has a corresponding entry Demarcation line that lists many of these lines as examples. If you agree with this I'll go ahead and create the category. Benwing2 (talk) 22:26, 15 October 2023 (UTC)[reply]

Yes, do that if you want to. Sounds great! Thanks! Geographyinitiative (talk) 22:33, 15 October 2023 (UTC)[reply]

Definitely looks sharper than "intangible political boundary". Works for me. Qwertygiy (talk) 00:00, 16 October 2023 (UTC)[reply]

Jarawa orthography: should Devanagari be used over scientific phonetic spelling?

The Jarawa are a people indigenous to the Andaman Islands in India. Currently, all Jarawa terms on WT are written in IPA format (e.g. ŋeŋeŋe, ʈʰehuːʈʰu), with triangular colons, superscript symbols and all. The Central Institute of Indian Languages, part of the Indian Ministry of Education, uses a Devanagari orthography for Jarawa, but the majority of accessible language documentation uses IPA (phonetic) transcription. Should all Jarawa lemmas be moved to corresponding CIIL Devanagari spellings?

Reasons for: Some users have stated that any proposed form of orthography is preferable to spelling terms with the IPA. Until just yesterday, there existed several Naxi terms spelled in the IPA with Sinological tone numerals, such as kʰɯ³³ (now kee, using the standard Latin orthography). When there exists a standard script or scripts, spelling terms using the IPA may serve only to confuse readers. The Devanagari system has some official status as opposed to the phonetic system, and thus should be used in its place.

Reasons against: The Jarawa only made amicable contact with the outside world in the 1990s. As such, the small body of work on their language has been constantly growing as more is learned and old theories are disproved. @Soap mentions that in just the last decade, the phonology of Jarawa shown in its WP article has roughly doubled in size as new research is published. We still are unsure of the phonemic status of several sounds (e.g. [ɛ]), and a Devanagari script assumes a level of phonetic understanding which we have not yet reached in the literature. The phonetic spellings are likely to be what inquiring users first encounter and search on WT.

Other information: Around 1% of Jarawa speakers are literate in their language; I can't seem to find information on if they read and write using Devanagari.

I am in opposition of moving Jarawa terms to corresponding Devanagari spellings. What will the consensus be? What reasons did I miss on either side? ~ Blansheflur ｡･: talk :: contributions :･ﾟ❀,｡ 08:10, 15 October 2023 (UTC)[reply]

Support changing the Jarawa orthography on Wiktionary very strongly. I honestly don't really care to what, as long as it's not IPA - IPA is not an orthographical system, it is a scholarly transcription.

There is already an orthography - it may not be great, but in the lack of a better one, it's still better than IPA. If we can find anything that is more suitable for the language and proposed elsewhere than on our website - fine by me. But we shouldn't have entries written in IPA. Thadh (talk) 08:31, 15 October 2023 (UTC)[reply]

Oppose per above and I'll provide an expanded reasoning. First, to clarify, this discussion first began on Discord, so I'll link the old and updated phonologies of Jarawa here for convenience. Saying it doubled in size was a slight exaggeration but going from 16 to 28 consonants in ten years is a pretty big change, and shows that our understanding of the language may still be in flux. I suspect the Devanagari orthography created by the government of India is based on the smaller phonology reported in the 2012 study, and perhaps on an even earlier study than that. As such, if we use that, we would not be able to represent the pronunciations of the words unless we manually coded the pronunciations in IPA into every single entry. At the very least, we should be able to see this proposed Devanagari orthography before we start writing words in it. All the best, —Soap— 09:10, 15 October 2023 (UTC)[reply]

I don't find the "automatic IPA" argument convincing at all - English can't have an automatic IPA either, doesn't mean we should now write English in IPA or enPR. Thadh (talk) 09:19, 15 October 2023 (UTC)[reply]

Oppose--Saranamd (talk) 09:51, 15 October 2023 (UTC)[reply]

Comment. Just like we have Japanese entries such as herikoputā (herikoputā) defined as Rōmaji transcription of ヘリコプター, we could have Jarawa entries such as ङेङेङे (or whatever the Devanagari rendering is) defined as Devanagari rendering of ŋeŋeŋe. --Lambiam 12:56, 15 October 2023 (UTC)[reply]

Oppose. We should use whatever is most common in sources. There are African languages that use an official orthography very similar to the IPA; "IPA is not an orthographical system" seems a strange reason to prefer a system that apparently no one uses. Benwing2 (talk) 22:19, 15 October 2023 (UTC)[reply]

Nobody uses IPA, either. It is not a writing system for speakers or anyone really, it is a scholarly transcription. Hosting a language at a scholarly transcription would be akin to making a database of chemical compounds and hosting them at their molecular formulae rather than systematic names. Thadh (talk) 10:25, 16 October 2023 (UTC)[reply]

Incommensurable chemistry analogy. Quite normal to do so with languages that are more prominently written by scholars than the actual community. Correct for Modern South Arabian languages: The speakers write in Arabic script if asked to, but if some literary culture developed then they would not keep such script as it is a kludgy approximation. Devanāgarī is better, but does it still obfuscate information unless diacritics have been developed, such as in the last hundred years for Kurdish, which will then be unencoded? Soap’s comment suggests that the known Devanāgarī spellings have not even been created on the basis of sufficient information about the phonology and are thus even less promising. In any case IPA will stand the test of time better. Surely in a decade there will be new Devanāgarī orthographies and all extant ones ditched. But we will follow some scholars rather than literary usage with either option, as a “writing system for speakers” cannot be achieved at all. Even if someone from the Andamans wrote a novel in any of the Ongan languages it would only be there to bait scholars rather than be read by an organic community, artificial like most conlangs by internet warriors. An undisputed orthography requires a certain material culture upon a sufficient number of speakers, which Lower Sorbian speakers in 21st century Germany can have but in harsh Yemen is paradisiacal and on the tropical Andamans, I imagine, unmotivated, a first world problem. Fay Freak (talk) 10:58, 16 October 2023 (UTC)[reply]

Oppose — Fenakhay ^{(حيطي · مساهماتي)} 02:17, 16 October 2023 (UTC)[reply]

Oppose weakly. The fact that any Devanagari transcription is likely to change as our understanding of the language (and literacy among the ethnic group in question) increases does not make current spellings invalid. After all, we recognise shew and synge as obsolete spellings for show and sing, respectively. However, Jarawa written in Devanagari is currently hardly a thing. By far the most common context in which Jarawa is read and written is in scientific papers. This is where Jarawa words are found, and in these forms these words come to us. IPA "not being a real alphabet" is a bogus argument. Most true alphabets were originally phone{t/m}ic, or intended to be so. IPA can be read, learned and used like any other script, its being more precise than Latin or Devanagari is after all only an advantage. That being said: if literacy in Jarawa develops and a written standard is established, we can still switch to Devanagari (and keep the IPA lemmas as alternative spellings). Steinbach (talk) 21:44, 16 October 2023 (UTC)[reply]

Comment: with a tiny minority language like this, is there a risk of our decision in any way influencing what does become adopted as the standard? Given the influence of English in India, is it possible that Wiktionary might have that sway? If so, that should impact our decision, I think. Andrew Sheedy (talk) 22:27, 16 October 2023 (UTC)[reply]

Thanks for bringing this up; I hadn't considered this. I think not: the literature will have more sway, and the literature is more likely to take other literature into account rather than Wiktionary. I think this question might be best answered by someone who lives/has lived in India and not America, though. ~ Blansheflur ｡･: talk :: contributions :･ﾟ❀,｡ 05:32, 17 October 2023 (UTC)[reply]

Oppose —Aryaman^A ^{(मुझसे बात करें • योगदान)} 05:26, 22 October 2023 (UTC)[reply]

Delete Category:Noxilo language

Okay, this is a bit ironic since I have complained about en.Wiktionary's intolerance of conlangs on more than one occasion. But I think the Noxilo language category should be deleted. The Wikipedia article on Noxilo was deleted in early 2014 on notability grounds. Our Appendix was deleted later that year for the same reason. The category is empty; any lemmas have long been deleted. Regardless of my personal opinion, I see no reason for keeping this cat. I already requested its deletion at this page, but Theknightwho remarked that, since this amounts to the abolition of an entire language, the case had to be discussed here. So here it is: any objections? Steinbach (talk) 22:02, 16 October 2023 (UTC)[reply]

Support. Theknightwho (talk) 22:11, 16 October 2023 (UTC)[reply]

Support Benwing2 (talk) 22:45, 16 October 2023 (UTC)[reply]

Support CitationsFreak (talk) 02:36, 17 October 2023 (UTC)[reply]

Support tbm (talk) 07:49, 18 October 2023 (UTC)[reply]

Support Chernorizets (talk) 20:38, 22 October 2023 (UTC)[reply]

Deleted. Benwing2 (talk) 06:39, 1 November 2023 (UTC)[reply]

Review and comment on the 2024 Wikimedia Foundation Board of Trustees selection rules package

You can find this message translated into additional languages on Meta-wiki.

Dear all,

Please review and comment on the Wikimedia Foundation Board of Trustees selection rules package from now until 29 October 2023. The selection rules package was based on older versions by the Elections Committee and will be used in the 2024 Board of Trustees selection. Providing your comments now will help them provide a smoother, better Board selection process. More on the Meta-wiki page.

Best,

Katie Chan
Chair of the Elections Committee

01:13, 17 October 2023 (UTC)

Adding edit/history links to category entries

I’ve created a script that displays edit/history links next to category entries, which I’ve found useful when doing systematic manual editing. If anyone else would find this useful, feel free to add this to your common.js page (on a new line):

importScript("User:Theknightwho/scripts/extraCategoryLinks.js");

Theknightwho (talk) 04:09, 18 October 2023 (UTC)[reply]

How does one comment something out in common.js? It looks useful and I'd like it to be handy, but I'd need it rarely. DCDuring (talk) 22:13, 18 October 2023 (UTC)[reply]

At Category:English entries with language name categories using raw markup the first page shows the edit/history link before the title of each entry. On subsequent pages the links follow the entry title. I don't expect it to be an actual problem, but .... DCDuring (talk) 22:27, 18 October 2023 (UTC)[reply]

@DCDuring Double slash // at the start of the line it's on should comment it out.
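
For illustration, a commented-out version of the line given earlier in this thread would look like this in common.js (a minimal sketch; removing the two leading slashes re-enables the script):

```javascript
// The leading double slash turns the whole line into a comment, so the script is not loaded:
// importScript("User:Theknightwho/scripts/extraCategoryLinks.js");
```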

The links all display after the titles for me - what browser are you using? Theknightwho (talk) 23:33, 18 October 2023 (UTC)[reply]

Thanks. FF 118.0.2. Win 64. I just looked again. It happens on every page, but I can see it flash to the right, then appear on the left. DCDuring (talk) 00:00, 19 October 2023 (UTC)[reply]

@DCDuring Hmm - I've just checked FF and it works okay. The fact it flashes to the right then the left makes me think it might be caused by some other script you have that runs after it, but I'm really not sure - sorry. Theknightwho (talk) 00:23, 19 October 2023 (UTC)[reply]

@Theknightwho: I remember him saying he had the gadget enabled that puts the tables of contents on the right instead of the left. Chuck Entz (talk) 02:40, 19 October 2023 (UTC)[reply]

@Theknightwho: I think it is the OrangeLinks gadget!? (Uncertainty due to possible latency delays.) I have about 6 gadgets running and three things in custom JS. DCDuring (talk) 12:08, 19 October 2023 (UTC)[reply]

Conclusive after ABAB test. DCDuring (talk) 17:26, 19 October 2023 (UTC)[reply]

@DCDuring It works okay for me with the orange link gadget enabled. What do you mean by ABAB test? Theknightwho (talk) 17:29, 19 October 2023 (UTC)[reply]

ON,OFF,ON,OFF; ON=Orange links enabled. I use Vector (2010). DCDuring (talk) 19:47, 19 October 2023 (UTC)[reply]

It works properly with Orange links enabled and the Beta feature "Paragraph-based edit conflict" disabled. DCDuring (talk) 19:57, 19 October 2023 (UTC)[reply]

@DCDuring Hmm - I have that enabled as well. Are there any other scripts you're using? I suspect it's to do with the order in which things are being run - I wonder if there's some way to get this script to run after whichever one is causing the problem. Theknightwho (talk) 20:46, 19 October 2023 (UTC)[reply]

I was getting annoyed at the Beta feature anyway, so I'll keep it off. If there is a recurrence I'll let you know on your talk page. It is a time saver and runs locally until I click a link, right? DCDuring (talk) 22:36, 19 October 2023 (UTC)[reply]

@DCDuring Yes. Theknightwho (talk) 01:33, 20 October 2023 (UTC)[reply]

Palatalisation of Polish consonants followed by i

According to w:Polish orthography:

Except in the cases mentioned in the previous paragraph [here is meant the digraphs ci, dzi, ni, si, zi as well as some cases of foreign words], the letter ⟨i⟩ if followed by another vowel in the same word usually represents /j/, but it also has the palatalizing effect on the previous consonant. For example, pies ("dog") is pronounced [pʲjɛs] (/pjɛs/).

If we compare the IPA for ćwierć with that from Polish wiktionary, we see a discrepancy. Why do we not indicate the palatalisation of a previous consonant (aside from aforementioned digraphs) caused by ⟨i⟩? Would it be possible to update Module:pl-IPA or have we chosen against this level of detail? Helrasincke (talk) 13:05, 19 October 2023 (UTC)[reply]

@Helrasincke This is a highly phonetic transcription (indicated with []). As you can see, we have a phonemic transcription (indicated with //). Furthermore, there is a new Module:zlw-lch-IPA which will change things further yet. It could be possible to give both a phonemic and phonetic transcription, but it might get cluttered very fast. Vininn126 (talk) 13:10, 19 October 2023 (UTC)[reply]

Sanskrit inflection tables

I've noticed that there are multiple Sanskrit infl. tables which want to do the same thing but have somewhat different designs and syntax. For instance for adjectives there is both {{sa-decl-adj}} (as on पुरोहित (purohita)) and {{sa-decl-adj-mfn}} (as on एक (eka)). Which one should be used? And shouldn't we phase one out in favour of the other for consistency? ᛙᛆᚱᛐᛁᚿᛌᛆᛌ ᛭ Proto-Norsing ᛭ Ask me anything 13:58, 19 October 2023 (UTC)[reply]

Norwegian Nynorsk using "as above" (referring to Norwegian Bokmål entries)

I noticed that the senses of a lot of Norwegian Nynorsk entries use a comment (or qualifier or gloss) "as above" to refer to the gloss found in the Norwegian Bokmål entry. One example is krampe. Is this normal/acceptable? I thought each language (even if related) should stand on its own. I count over 600 of these "as above" comments. If someone were to export Norwegian Nynorsk entries and display them, these "as above" entries wouldn't be very useful. tbm (talk) 08:42, 20 October 2023 (UTC)[reply]

Definitely not normal. Vininn126 (talk) 08:44, 20 October 2023 (UTC)[reply]

No, these should be fixed. — SURJECTION ^{/ T / C / L /} 09:10, 20 October 2023 (UTC)[reply]

I remember there was a discussion about it. Anyway, I also think it's wrong, because otherwise we could use 'as above' in many Scandinavian languages when referring between Norwegian, Danish or Swedish. And it's not such a good idea, especially if the meaning we refer to suddenly changes for any reason. Tollef Salemann (talk) 09:36, 20 October 2023 (UTC)[reply]

I concur, this doesn't sound proper at all. Separate language, separate section, separate definitions. Each language should stand alone as if there are no other entries on the page. With bot scraping and machine learning, that's exactly how it would appear. Qwertygiy (talk) 15:38, 20 October 2023 (UTC)[reply]

I agree with everyone else. (Sometimes the gloss is comical, like on skandinav, where just saying "a Scandinavian" would be enough and "(as above)" has added nothing.) On a related note, there are entries that say "all senses", like апетит is defined as "appetite (in all senses)", which I think is similarly sub-optimal because if English appetite gains a new sense in e.g. coding, no-one is going to know to update the Bulgarian entry (to "in all senses except coding"?) — just spell out briefly which senses. - -sche (discuss) 04:45, 21 October 2023 (UTC)[reply]

Update: fixed both krampe and skandinav. Tollef Salemann (talk) 08:30, 21 October 2023 (UTC)[reply]

I think it is also a bad idea to clarify the senses by just adding numbers. I remember some entry in some language referring to an English word's "sense 1 and 3", while somebody had changed the order of the senses in the English entry, as was obvious from the many newly added senses. Tollef Salemann (talk) 08:34, 21 October 2023 (UTC)[reply]

Template:tlb

There are enough CSS classes in place now that we can easily display term labels on their own line. This could help with them getting noticed better:

.ns-0 .headword ~ .usage-label-term, .ns-118 .headword ~ .usage-label-term { display: block; }

This changes e.g. (absent-minded)

absent-minded (comparative more absent-minded, superlative most absent-minded) (possessional)

to

absent-minded (comparative more absent-minded, superlative most absent-minded)

(possessional)

I think this is a lot better and helps them get noticed, but I'll bring it up here to see how others think. You can test it by adding the CSS snippet above to your common.css. The CSS could also be made more robust by adding an extra span for headwords that encompasses the entire output of Template:head. — SURJECTION ^{/ T / C / L /} 11:40, 20 October 2023 (UTC)[reply]

For what it's worth, if we do make this visual change, it may also be considered whether we should move the {{tlb}} invocations to their own lines in the wikitext as well. — SURJECTION ^{/ T / C / L /} 11:49, 20 October 2023 (UTC)[reply]

I would prefer allowing {{tlb}} (or maybe a modified {{lb}}?) to be added to lines rather than making the headword span multiple lines - I personally find it more difficult to see where the headword ends and the senses begin when the headword is multiline. Thadh (talk) 12:51, 20 October 2023 (UTC)[reply]

The issue is that there may be multiple definition lines and {{tlb}} would apply to all of those. It's also possible from a technical standpoint to add more spacing, if it makes it clearer. — SURJECTION ^{/ T / C / L /} 13:21, 20 October 2023 (UTC)[reply]

No objection to Surjection’s suggestion. In a sense OED does this too (although it doesn’t have the same format of heading for verbs as we do). — Sgconlaw (talk) 04:45, 21 October 2023 (UTC)[reply]

I'm of two minds about this. I agree with User:Surjection that {{tlb}} labels are easily missed when placed to the right of the inflections, but I haven't decided whether I like the two-line placement. Maybe theoretically we could place the labels between the headword itself and the inflections, although as it stands that would require a lot of recoding of the language-specific headword templates. Benwing2 (talk) 04:09, 22 October 2023 (UTC)[reply]

I recommend trialing it by adding the CSS to your common.css or as a userstyle. It can now be expressed much more simply as

.headword-line + .usage-label-term { display: block; }

It looked a bit weird at first, but I quickly got used to it. — SURJECTION ^{/ T / C / L /} 08:44, 22 October 2023 (UTC)[reply]

Looks good to me. I usually only notice these templates after spending some time on a page. Ultimateria (talk) 17:45, 22 October 2023 (UTC)[reply]

I've implemented this now. I can revert it if there are more objections posted here. — SURJECTION ^{/ T / C / L /} 11:17, 28 October 2023 (UTC)[reply]

I think this change looks rather ugly in places like yliche#Etymology 2, where there's little to no header content otherwise. --{{victar|talk}} 06:16, 9 November 2023 (UTC)[reply]

I have seen another editor also complain about this. In my opinion it is actually not that bad, and it's actually better for consistency to always display the tlb in the same position. But even if that wasn't the case, CSS cannot tell line lengths apart in any meaningful way here, so there is no way to group them on the same line only in cases like this. — SURJECTION ^{/ T / C / L /} 07:06, 9 November 2023 (UTC)[reply]

I guess I would then argue that the ugliness of this change outweighs its suggested benefit. 🤷 --{{victar|talk}} 07:42, 9 November 2023 (UTC)[reply]

Funny, the example yliche is the only place where it looks good in my view. In all other places, where there is considerable inflection information, it looks ugly. One should add some other decoration separating the labels from the header and making them closer to the senses, but I think of nothing concrete yet. It is not solely the horizontal nor vertical position. Maybe even brackets should be avoided for something else, they are only imitating some dense print lexica for sure. Fay Freak (talk) 21:00, 9 November 2023 (UTC)[reply]

De-crattificationary actions

Our favourite oft-drunken editor User:Equinox has mentioned a few times that User:SemperBlotto is deceased. If that is indeed true, he should probably have his 'crattage removed. If it is a false statement, Equinox should get spanked for being rude, or as a stricter punishment, be burdened himself with bureaucrat powers. Make sense? P. Sovjunk (talk) 19:30, 20 October 2023 (UTC)[reply]

In the US that (not necessarily the spanking) would be unconstitutional as "cruel and unusual punishment". DCDuring (talk) 20:19, 20 October 2023 (UTC)[reply]

I live down the road from Equinox, outside the US jurisdiction, so would be happy to deliver the spanking. P. Sovjunk (talk) 20:22, 20 October 2023 (UTC)[reply]

I have heard that you can't remove the 'crat role; this came up when we realized that we have no 'crats willing to desysop the inactive admins. CitationsFreak (talk) 03:03, 21 October 2023 (UTC)[reply]

@CitationsFreak To remove the bureaucrat role you have to contact a Wikimedia Foundation steward. However, if this is true (may he rest in peace if so ...), I would agree with removing his bureaucrat powers as a security precaution. Benwing2 (talk) 04:17, 22 October 2023 (UTC)[reply]

Unfortunately I have dealt with similar cases on other wikis. The Wikipedia guideline w:Wikipedia:Deceased Wikipedians/Guidelines is pretty reasonable if a sysop or 'crat has to make any decisions in the absence of local procedures. Vriullop (talk) 07:29, 23 October 2023 (UTC)[reply]

Step 1 of that page is to find an obituary, which I'm unable to do, so we may have to wait until Aug 13 2024 (2 years after last contribution). Benwing2 (talk) 01:08, 24 October 2023 (UTC)[reply]

"catastroph-e" with its final eta

I am interested in that class of English words that derive from Greek, where our usual silent 'e' is pronounced.
That is atypical and it is not a terribly extensive list.
I was expecting that to already exist as a Category.
But evidently it does not.
Is that a worthwhile category? Varlaam (talk) 21:39, 20 October 2023 (UTC)[reply]

I think it would not be necessary to have such a category. It seems too specific. — Sgconlaw (talk) 04:38, 21 October 2023 (UTC)[reply]

At best I could see a rhyme-related category for "English words ending with vocalized -e" or something of that nature, not specifically related to any Greek origin or influence. Then you could cross-reference categories if you wanted just the Greek words. Qwertygiy (talk) 12:19, 21 October 2023 (UTC)[reply]

I see neither rhyme nor reason in when eta-descended final ⟨e⟩s are not silent. Compare anemone /əˈnɛm.ə.ni/ from ἀνεμώνη (anemṓnē) with hydrocele /ˈhaɪdɹəsiːl/ from ὑδροκήλη (hudrokḗlē). --Lambiam 19:26, 21 October 2023 (UTC)[reply]

As mentioned on that page, hydrocele may be influenced by French. Likewise, the modern disyllabic pronunciation of Irene may have been influenced by French Irène.--Urszag (talk) 19:35, 21 October 2023 (UTC)[reply]

Errors in China-related Place Names

Check out Citations:Yunnan (Yunnan/Yunan error), Citations:Hebei/Citations:Hopeh/Citations:Hopei (Hebei/Hubei, Hopeh/Hupeh, Hopei/Hupei errors), Citations:Guangzhou (Guangzhou/Guangdong error), Citations:Shanxi (Shaanxi/Shanxi error), Citations:Xingjiang/Citations:Singkiang (Xingjiang/Xinjiang, Singkiang/Sinkiang errors), Citations:He'nan (Henan/He'nan error), Citations:Hu'nan (Hunan/Hu'nan error), etc. Are there other errors like this that you can think of, or remember seeing? List some and I will see if Wiktionary can cite them. Also, any feedback on the handling of any of these is appreciated. --Geographyinitiative (talk) 12:20, 22 October 2023 (UTC)[reply]

Took me a bit to realise you meant misspellings. I suppose it might be better if the citations for misspellings are placed on the citation page of the misspelling itself instead of the correct spelling. – wpi (talk) 14:54, 22 October 2023 (UTC)[reply]

Do typeface names meet WT:CFI?

I found Arial, Futura, Gill Sans, Helvetica, and Times New Roman. Do these meet the criteria for inclusion? 2607:FB91:322:8B42:3D0D:6D6B:B537:6564 00:16, 23 October 2023 (UTC)[reply]

These are names of specific entities, for which we have no clear guidelines: “many should be excluded while some should be included, but there is no agreement on precise, all-encompassing rules for deciding which are which”. A criterion that I think is somewhat workable (but for which there is no consensus) is the following:

If a term can be attested through uses in which the meaning of the term is not explained or defined by the context, but assumed by the author to be understood as is by the reader, these uses count as attestations towards the minimum number required for inclusion in Wiktionary.

For example, a use as in “The court order was typeset in the staid Times New Roman font” should not count, because the context “typeset in ... font” reveals the occupant of the slot is a typeface. But in “Thus becomes reality the fantasy of every AI system-glitch-free world domination, coded in Comic Sans, powered by the potent energy of fear, nightmare, and terror” nothing reveals the nature of Comic Sans – if you didn’t know it is a typeface, you’re not any wiser now, and you may have to look it up in a dictionary. Such “unexplained” uses imply that the term has entered the lexicon of the language. --Lambiam 19:03, 24 October 2023 (UTC)[reply]

They seem WT:BRAND-like to me, with the way that some of them have entered popular culture. —Fish bowl (talk) 03:19, 2 November 2023 (UTC)[reply]

I agree that they need to comply with WT:BRAND. — Sgconlaw (talk) 05:32, 2 November 2023 (UTC)[reply]

Chemical formulae

Note: apparently I'm asking about systematic (IUPAC) names rather than formulas.

Quoth WT:CHEM:

All chemical formulae are Translingual. To be included, chemical formulae must be attested in publications that (1) are not written for a scientific or technical audience; (2) don't make clear that they're formulae by e.g. explicitly discussing chemical formulae or by listing their component parts; and (3) do not otherwise explain the meaning of the formula.

Yet while spelunking in Category:English terms spelled with numbers, one can relatively easily find beauties like: 5-methoxy-dimethyltryptamine, 1-methyl-4-phenylpyridinium, etc. Even without numbers, there are things like N-benzyl-N-isopropylpivalamide.

So, which is right - the CFI policy, or the IRL practice that's seemingly already in place?

Thanks,

Chernorizets (talk) 05:06, 23 October 2023 (UTC)[reply]

Those are not chemical formulas. Those would be things like "CO2" or "H2O". (Also, see talk:N-benzyl-N-isopropylpivalamide) CitationsFreak (talk) 05:09, 23 October 2023 (UTC)[reply]

@CitationsFreak as a non-specialist, I don't have an intuitive grasp of what Wiktionary means by "chemical formula". I guess IUPAC names are something else, which I'm fine with, but 1) it would be nice to say as much in the policy, and 2) it would be even nicer to spell out the criteria for inclusion of IUPAC names. I know enough to know there are many of them, that they can get arbitrarily long/complex, and that a good number are of very narrow technical scope. So I wonder whether we should have clear criteria for inclusion for those as well. Chernorizets (talk) 05:15, 23 October 2023 (UTC)[reply]

@Chernorizets: See Wiktionary:Beer parlour/2019/January#Straw polls on criteria for including chemical formulas, as valid as before. My argument was that all kinds of formula projections at w:de:Summenformel are what we intend not to include. A maiore ad minus, the possibility of encoding such a formula in one line in some particular projections doesn’t make it inclusionworthy. But English-language speakers may be embarrassed to recognize all these projections by name, hence the opaque CFI formulation, which however tries to state additional criteria making formulae sufficiently inclusionworthy. Fay Freak (talk) 05:51, 23 October 2023 (UTC)[reply]

@Fay Freak I've come to realize the examples I provided in my original message are not formulas per se, but the so-called systematic names of chemical compounds. I find it odd that formulas would have criteria for inclusion, but systematic names would not. The only thread I've found specifically about the names vs formulas is this one from 2008, as well as some individual talk pages, but there doesn't appear to be strong consensus either way.

Put plainly, I think everything on Wiktionary should satisfy some criteria for inclusion, and as far as I can tell, most things do. IUPAC names appear to live in some sort of gray area, and I wonder if we can make it less gray. Chernorizets (talk) 06:12, 23 October 2023 (UTC)[reply]

@Chernorizets: It is easy to find reasons (to rationalize ex post, for bad tongues, nonetheless correctly) however. Like we include linguistics and we cannot stretch it too far. So we cannot include coordinates but named places, not 52° 1′ N, 8° 32′ O but Sparrenburg, not even most named streets, there have recently been formed criteria for them. In practice IUPAC names are no gray area, due to editors handling of IUPAC names and at the same time authoring our rules with the understanding that they are allowed. There is also an a minore ad maius argument here: butanol is a word so butan-2-ol is. Sure, not everything in WT:CFI is clear for the newcomer without an additional session of thinking. Interpretation should not be a dirty word: when it is possible, instead of controversy there are sometimes only particular satisfying approaches within it. Fay Freak (talk) 06:34, 23 October 2023 (UTC)[reply]

Don't forget Appendix:Protologisms/Long words/Titin, and English titin. Chuck Entz (talk) 06:51, 23 October 2023 (UTC)[reply]

Labels: "figurative" or "figuratively"?

There have been changes back and forth to "Module:labels", and some discussion at "Module talk:labels#“Figurative” and “figuratively”, on whether adverbs (for example, attributively and figuratively) or adjectives (attributive and figurative) should be used in labels. I have no strong feelings about either, but if I had to choose I'd say we might as well settle on adjectives since we already use labels like humorous, informal, and poetic rather than humorously, informally, and poetically. In any case I think the matter should be settled definitively. (Pinging @Benwing2, Einstein2, Theknightwho who participated in the earlier discussion.) — Sgconlaw (talk) 20:11, 23 October 2023 (UTC)[reply]

Agreed that adjectives make the most sense and are most common and that we should be consistent. —Justin (koavf)❤T☮C☺M☯ 20:15, 23 October 2023 (UTC)[reply]

@Sgconlaw My instinct is, as I mentioned in the label discussion, to allow both and not change them. IMO it violates the principle of least surprise to "correct" 'figurative' to 'figuratively' or vice-versa. It's more common to say 'literally' rather than 'literal', but 'figurative' or 'figuratively' may both make sense depending on the context. It's similar to the issue of adjective vs. noun demonyms, e.g. 'Tuscany' vs. 'Tuscan'; if you "correct" one to the other, regardless of which way you go, you end up messing up a significant portion of the labels. Benwing2 (talk) 00:32, 24 October 2023 (UTC)[reply]

Yeah, I say either stick with the adjective or else have both like had been implemented at various times. (We don't say "datedly", "obsoletely", "Britishly"...) - -sche (discuss) 21:15, 23 October 2023 (UTC)[reply]

I also prefer adjectives. Theknightwho (talk) 22:09, 23 October 2023 (UTC)[reply]

I don't see a great value to consistency here (assuming they're treated the same on the back end, and only differ in what's shown to the reader of the page). So I neither oppose nor support standardizing to adjective-only.--Urszag (talk) 01:02, 24 October 2023 (UTC)[reply]

I support keeping both separate and categorizing them together, due to the edge cases of editors using it in phrases like ”used figuratively”, which reoccur in other contexts that aren’t easily replaceable, but only displaying the adjective is also good, in view of the analogies. Fay Freak (talk) 05:12, 24 October 2023 (UTC)[reply]

@Fay Freak: I'm not sure how often that happens, but a label like "used figuratively" should really just be replaced with "figurative" (or "figuratively"). — Sgconlaw (talk) 06:37, 24 October 2023 (UTC)[reply]

That’s what I say. Fay Freak (talk) 06:50, 24 October 2023 (UTC)[reply]

Nouns and adjectives are simpler, and adequate. We put "medicine", not "medically"; "noun", not "used nominally". Equinox ◑ 05:18, 24 October 2023 (UTC)[reply]

Agree with using figurative, per everyone above. CitationsFreak (talk) 16:20, 24 October 2023 (UTC)[reply]

OK, I changed "figuratively" back to "figurative"; "used figuratively" finds only a very few entries using that phrase in labels (it's somewhat more common in usage notes, etymologies, etc, but even then only 209 of our 7,586,039 entries use the phrase), so I'm manually changing those few entries to just "figurative". "used literally" is even less common, used in just 41 entries and mostly not in labels. I note that having this display as an adverb also made for weird things like "figuratively, colloquial" (surely the labels should agree in POS) and "rare, figuratively, also impersonal". I am not opposed to having both labels if there are really labels that use "figuratively" that can't be rephrased to use "figurative", but I am sceptical. - -sche (discuss) 21:52, 24 October 2023 (UTC)[reply]

Oppose "figurative". We tend to use adverbs in cases where the label refers to a previous definition, like in (especially) and (specifically). Thus comparing it with something like (humorous) or (obsolete) is invalid. This convention, in my view, actually makes it easier to understand labels. Ioaxxere (talk) 02:55, 25 October 2023 (UTC)[reply]

@Ioaxxere: I agree that especially and specifically seem more natural than their adjective counterparts, but perhaps these two should be treated as the only exceptions. I can’t picture how figurative and other adjective forms of labels would be inappropriate in other contexts. — Sgconlaw (talk) 04:46, 25 October 2023 (UTC)[reply]

@Sgconlaw I feel the same way about "literally"; "literal" as a qualifier sounds unnatural to me. That's why I suggest allowing both "figurative" and "figuratively"; "figurative" in most cases, but "figuratively" when paired with "literally". Benwing2 (talk) 06:55, 25 October 2023 (UTC)[reply]

Isn't it possible to have one be an alias of the other like we do with other labels? Why would we not allow both? Is it to save time on processing the code of the page? —Soap— 07:14, 25 October 2023 (UTC)[reply]

@Soap Normally, aliases display in a particular way regardless of which alias was used, so it will display as either 'figurative' or 'figuratively' if one is an alias of the other. But it's easy to make the code allow both and have them display as written but function equivalently, which is what I'm proposing. I'm not really sure why people are insisting that we make 'figuratively' display as 'figurative'. Benwing2 (talk) 07:22, 25 October 2023 (UTC)[reply]
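To make the two behaviours in the comment above concrete, here is a minimal sketch in Python. It is not the actual Module:labels code (which is written in Lua); the data layout, the `display_as_written` flag, and the `resolve_label` function are all hypothetical names invented for illustration. It contrasts an ordinary alias, which always displays in the canonical form, with an alias that displays as written while pointing at the same glossary entry.

```python
# Hypothetical label table, NOT the real Module:labels data.
# Each canonical label lists its aliases and the glossary anchor it links to.
LABELS = {
    "figurative": {
        "glossary": "figurative",
        "aliases": ["figuratively"],
        # Assumed flag: aliases keep the wording the editor typed.
        "display_as_written": True,
    },
    "Texas": {
        "glossary": "Texas",
        "aliases": ["Texan"],
        # Ordinary alias: always rendered in the canonical form.
        "display_as_written": False,
    },
}

# Reverse index from every alias to its canonical label.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, data in LABELS.items()
    for alias in data["aliases"]
}


def resolve_label(written):
    """Return (display_text, glossary_anchor) for a label as typed by an editor."""
    canonical = ALIAS_TO_CANONICAL.get(written, written)
    data = LABELS.get(canonical)
    if data is None:
        # Unknown labels are shown verbatim and link nowhere.
        return written, None
    display = written if data["display_as_written"] else canonical
    return display, data["glossary"]


if __name__ == "__main__":
    for label in ("figurative", "figuratively", "Texan"):
        print(label, "->", resolve_label(label))
```

Under these assumptions, "figurative" and "figuratively" both resolve to the same glossary anchor but keep their own wording, whereas "Texan" is rendered as "Texas".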

My apologies. I've noticed that before (e.g. with country labels) but wasn't thinking clearly. Yes, I would be in favor of allowing both figurative and figuratively so long as the literally label exists, and if they can be made to feed into the same category, that would be ideal. Thanks, —Soap— 08:37, 25 October 2023 (UTC)[reply]

@Soap Yes, it's easy to make this happen, although AFAIK the 'figurative(ly)' labels don't currently categorize; but they would link to the same glossary entry. Benwing2 (talk) 09:06, 25 October 2023 (UTC)[reply]

Okay thank you. I agree with everything you've said in this thread so far. Having two labels that both link to the same place is nothing new. Making (figurative) the default label is in line with how most of our other labels appear, but allowing (figuratively) as an alias would maintain the assonance with the (literally) label. —Soap— 09:35, 25 October 2023 (UTC)[reply]

My main concern about having both is that it creates inconsistency and difference without distinction — people who think labels should be adjectives will write "figurative", people who think they should be adverbs will write "figuratively", and users will have to figure out whether the fact that some entries or senses use one and some use another means anything. And people will use either in situations where we might(?) even agree only the other should be used, e.g. "rare, figuratively, dated" where surely the labels should agree. (Of course, at present we're forcing припе́чь (pripéčʹ) to render {{lb|ru|rare|figurative|also|_|impersonal}} as "rare, figuratively, also impersonal", which is worse, but I'm glad to see only one user wants that.) But I don't oppose having both. Maybe we should even allow different POS in other labels, I dunno, but contra Benwing's Tuscan / Tuscany example, we currently don't — AFAICT almost all placenames alias-ize their noun and adjective forms together rather than having each form display as itself: Texan is an alias of Texas, Southern Brazilian is an alias of South Brazil, Ostrobothnian is an alias of Ostrobothnia, Croatian of Croatia, etc. - -sche (discuss) 21:40, 26 October 2023 (UTC)[reply]

@-sche The problem with the aliasing of Tuscan/Tuscany and such is we aren't consistent with whether we use nouns or adjectives, and the use of nouns conflicts with the use of adjectives like rare. So we ended up with a bunch of solecisms like 'Tuscany Italian' and such when people would write 'Tuscan|_|Italian' or something (this isn't a good example; I can't recall the examples I ran into that motivated this but there were several). Benwing2 (talk) 22:33, 26 October 2023 (UTC)[reply]

Britain slang —Soap— 21:20, 27 October 2023 (UTC)[reply]

Labels: is "vulgar" or "obscene" better in the context of Latin?

The above reminded me of an unrelated label issue that I had been thinking about recently. I think "vulgar" is not ideal as a label for Latin words due to the (unfortunately still somewhat current) use of the adjective "vulgar" to refer to popular or late Latin vocabulary rather than vocabulary that was unwelcome in polite company. I think "obscene" is clearer. Examples of affected words would be glūbō, mentulātus. I guess there isn't so much ambiguity to the category title "Latin vulgarities", so I feel like maybe that could stay the same but the per-page labels could be changed to show "obscene": how does that sound? Is it too inconsistent? I feel like I've seen this discussed before but I forgot when or how it went. Urszag (talk) 00:59, 24 October 2023 (UTC)[reply]

@Urszag Do "vulgar" and "obscene" mean the same thing? "Obscene" feels stronger. Benwing2 (talk) 01:09, 24 October 2023 (UTC)[reply]

Well, maybe "obscene" would be too strong for a few words like meio. As it is, there's no way to use the word "obscene" in cases where it would fit better.--Urszag (talk) 02:58, 24 October 2023 (UTC)[reply]

“Vulgar” is also too strong there. I’d think it is colloquial and consider the label “vulgar” confused. Fay Freak (talk) 05:09, 24 October 2023 (UTC)[reply]

I think the word obscene has such a wide scope that it shouldn't be used as a label. Is the word pee obscene? What it refers to is, so how it could not be? To me obscene describes the meaning of a word, while vulgar describes the speech register. Piss is a vulgar synonym of the word pee, but both are obscene because they have the same meaning. I could easily change my mind, though, and since as I understand it this proposal only applies to Latin I can see how readers would be confused by the existence of a whole language called Vulgar Latin. —Soap— 06:29, 25 October 2023 (UTC)[reply]

As somewhat of a counterargument to myself, to show I'm not 100% on this, the author of the same comic strip I linked above also drew this, where obscene describes words that need to be censored with dashes or asterisks. Nobody censors pee (or urine) ... if we find ourselves in such a situation that we can't even say those words we rephrase the whole sentence. I still think obscene isn't quite right as a label but it's a weak argument. —Soap— 06:37, 25 October 2023 (UTC)[reply]

@Benwing2, evidently "vulgar" and "obscene" are not the same: "obscene" always refers to something objectionable, but "vulgar" is ambiguous because it can be used either as a synonym of "obscene" or to refer to "common usage" and particularly "usage by common folk". At least, that's what the Wiktionary definitions say! Perhaps for this reason "obscene" may be perceived as a stronger marker than "vulgar". —DIV (1.129.104.79 07:30, 27 October 2023 (UTC))[reply]

Ryukyuan languages lacking data in Module:Hrkt-translit

Ryukyuan languages have newly created templates {{LANG-head}}/{{LANG-kanji}} (e.g. {{ryu-head}} and {{ryu-kanji}}) which use Module:Hrkt-translit as the backend for generating automatic romanisations, but some languages are lacking the data module (cf. Module:Hrkt-translit/data/ryu for Okinawan), so they are generating the wrong transliterations. Some people (myself included) have been "helpfully" replacing plain {{head}} invocations with the language-specific ones, but weren't aware that the new templates do not work properly.

This concerns Kikai [kzg], Okinoerabu [okn], Northern Amami [ryn], Southern Amami [ams], Tokunoshima [tkn], Yonaguni [yoi], and Yoron [yox]. Either someone should add the data modules (or provide a phonology/romanisation scheme - I don't mind doing the coding) of these languages, or we should revert the replacements to the versions with {{head}}. – wpi (talk) 18:21, 25 October 2023 (UTC)[reply]

@Wpi Some of the existing transliteration data modules are incomplete, too, I think. Theknightwho (talk) 18:26, 25 October 2023 (UTC)[reply]

@Chuterix you might be interested in this, seeing that you added 蓼#Toku-no-Shima which does not generate correct romaji. Do you know any decent reference that explains the kana orthography/romanisation schemes? – wpi (talk) 14:30, 4 November 2023 (UTC)[reply]

ｪｨ is not used in any sources, so this transcription of [ɨ] is subject to debate. Sources such as Ishigaki Hogen Jiten and Amami Hogen Kiso goi no Kenkyu transcribe ɨ as イゥ. Chuterix (talk) 14:34, 4 November 2023 (UTC)[reply]

FYI: October Updates from Unicode

https://mailchi.mp/136f1b5d79de/testing-rickys-template-6259062 —Justin (koavf)❤T☮C☺M☯ 17:18, 26 October 2023 (UTC)[reply]

Pre-consonantic x

AFAIK a syllable-final x in Spanish is pronounced like /s/ in normal speech, being pronounced as spelled only in carefully articulated speech; I remember even reading that in a dictionary by the RAE that my school had. Rodrigo5260 (talk) 13:56, 27 October 2023 (UTC)[reply]

Hate speech in wiktionary google results, what to do?

The top Google result for boycott wiktionary apparently contains inappropriate hate speech from one of the entry quotations [3]. I don't know whether I should report the result to Google or do something about the entry in Wiktionary. Are there any Wiktionary guidelines for quotations like this? Considering the author's background, the quotation is clearly satire, but Poe's law should be considered here. NinuKinuski (talk) 16:55, 27 October 2023 (UTC)[reply]

I think it's a Google thing, since it should display a definition. (It does on my machine). CitationsFreak (talk) 17:37, 27 October 2023 (UTC)[reply]

I think we need better quotes. The ones in the entry aren't typical of the bulk of actual usage, and make the entry look POV. Chuck Entz (talk) 17:52, 27 October 2023 (UTC)[reply]

I mean if you search for boycott hate speech is likely to come up. It has to do with a strong feeling. But -sche has replaced the quote with another two that should also be clearer without already knowing lots of political context, though the previous one was also good to read due to its short apodictic sentences—which made it easier to be offensive. Fay Freak (talk) 19:45, 27 October 2023 (UTC)[reply]

We should have a policy that mundane words should not use extremist quotations like this. Even the quotation as such could have been simply "The boycott will destroy us". It didn't need the off-topic jeremiad and conspiracy-mongering. —Justin (koavf)❤T☮C☺M☯ 19:52, 27 October 2023 (UTC)[reply]

No, it couldn’t have been. It would not have illustrated the setup in which “boycott” was used—the boycott of the agricultural production of a whole country, as I have understood, pretty sure on topic. And not every generalization is a conspiracy theory. Fay Freak (talk) 20:06, 27 October 2023 (UTC)[reply]

I didn't write that every one is: stop putting words in my mouth. There was no need for that quotation at all and certainly not all of it. —Justin (koavf)❤T☮C☺M☯ 20:09, 27 October 2023 (UTC)[reply]

Agree we should try to avoid citations that say contentious things about real-world entities, where feasible, whether those are brands, real individual people, or countries at war. Equinox ◑ 09:25, 28 October 2023 (UTC)[reply]

Arrogant/condescending edit from an otherwise helpful editor

@Vininn126, Thadh for viz. I had been trying to figure out the etymology of Bulgarian бахър (bahǎr) - a word that originated in (likely student) slang, and is now a common colloquialism meaning "freezing cold". People generally assume it's a Turkish borrowing, but it's not in the Bulgarian reference dictionaries we commonly use on Wiktionary (especially {{R:bg:BER}}), so it was unclear what the corresponding Turkish word would be.

To figure things out, I pursued three independent routes:

I asked for help on the #turkic channel on our Discord server, in case it would ring any bells for editors familiar enough with Turkish.
I reached out to the Institute for Bulgarian Language (IBL) at the Bulgarian Academy of Sciences using their free service to ask questions about Bulgarian, including about the etymology of words.
I searched on Google Books for matches in more specialized dictionaries, and in Google Scholar for articles that may have dealt with this etymology.

In the end, the IBL came up with a reply, and their hypothesis was that бахър (bahǎr) was linked to a particular sense of Turkish bağır (a squeeze, a clenching). The entire reply is on the article's Talk page. I couldn't find that sense for myself, but I also don't know Turkish, so I'm useless with Turkish monolingual dictionaries. I used the sense that is attested on Wikt - "chest", "bosom", "upper body" - and adapted the IBL's response to that. In retrospect, maybe that was a bad idea.

In the Discord channel, @Vahagn Petrosyan came up with a different etymology - Turkish bahar (“spice”) - based on a few considerations:

as it turns out, бахър (bahǎr) has a variant spelling of бахар (bahar). Per Vahagn, there's precedent in other languages to have a semantic shift from "spice" to "cold", e.g. French piquant. бахар (bahar) is also Bulgarian for allspice or pimiento - a direct loan from the Turkish word.
there is at least one Bulgarian dictionary of Turkish loans ({{R:bg:Kristeva}}) that connects the spice sense to the "cold" sense by attributing them both to Turkish bahar.

Vahagn was very skeptical of the IBL hypothesis. I included both etymologies in the entry, because they both come from people who do this for a living, which I'm not an example of. On Discord, Vahagn qualified the IBL etymology as "folk etymology" and "amateur", and described the other alternative as "transparent". In the end, he unilaterally edited the Bulgarian entry, with the edit message:

sorry, Chernorizets, good effort but the Institute's amateur etymology can't be mentioned for the reasons I gave on Discord

Vahagn might be right, but this is a slang word, and my contention was that we have at best educated guesses about how it came to be. Instead of leaving it at that, and against my reservations, he just did what he wanted, and in a way I consider disrespectful. I thought Vahagn had been very helpful with digging up reference sources and other useful info, so this conclusion felt disappointing.

In situations like this, when someone feels convinced they are correct and another person has reservations, what's the best way to proceed? I won't edit-war and I won't keep arguing about this one word, but this seems like the kind of thing that might happen again.

Thanks,

Chernorizets (talk) 09:03, 28 October 2023 (UTC)[reply]

I'm inclined to agree with Vahagn. I saw the Discord conversation myself, and he was not disrespectful towards you, though perhaps he was towards the IBL; everyone in the channel agreed that their etymology is bad. Usually we should include multiple etymologies from multiple sources, but in my opinion there are times when we have to ignore one. Vininn126 (talk) 09:39, 28 October 2023 (UTC)[reply]

As Vininn126 interpreted: we have more standing to be arrogant against publishing academics, for the very reason they are paid and we are not, and as a volunteer and amateur you are obviously not at fault and blamed for ignorance, which due to the nature of dictionary work we have to admit here day by day. I do not follow the Discord and got chuffed today for Vahagn adding a fourth etymology: of all the etymologies the IBL was the least credible, and it badgered me a bit when you replaced your superior own etymology with that which cannot well be imagined.

To be fair and clear it appeared to me:

4th place: IBL etymology бахъ́р (bahǎ́r, “freezing cold”) from باغر (bağır, “bosom”) or rather, as you mended the connection, a made-up word—this is the most infuriating, when etymological references derive from words that demonstrably do not exist, professors do this even for our capital cities, transparent cap even for our schizoposters. One would need to ask whether language institutes have taken a place of institutionalized soothsaying. Saying random stuff looking like an etymology? The very stuff that we decided to remove even from WT:ES, to say nothing about the mainspace? I don’t know Vahagn’s reasoning in this case but going beyond it there is this remarkable comparison.
3rd place: I related to باقر (bakır, “copper”) or rather Kipchak cognates of it since this copper word has meant tinsel on a Christmas tree in some languages and Russian снег (sneg, “snow”) also means (a kind of) tinsel so I imagined the reverse development
2nd place: Your own with the dating of certain cults which is more religion than I can understand but affording more thought than a suspected trainee at the IBL would deign
1st place: бахар (bahar, “allspice”) from بهار (bahar, “spice”) in view of French piquant (“spicy; ice-cold”) Fay Freak (talk) 12:28, 28 October 2023 (UTC)[reply]

I too agree with Vahag on this. We should avoid endorsing ipse dixits, that is, believing in something only because of the position of whoever said it. People who do this for a living aren't necessarily always the oracle of truth. I can understand that for you this can be a sad surprise; Vahag knows this well, having had to deal with Armenian etymology. I too, judging from its name, would trust the Bulgarian Language Institute more, but judging from the etymology they proposed I can't help but be very much disappointed in their etymology mailing system. Underhandedly distorting the semantics of a foreign term is, as has been said, very unprofessional, and the semantic gap stays notable even after this. The other problems (phonology, and notably chronology) which have been pointed out on the Discord don't seem to have been addressed by the IBL. Their answer sounds hasty and dismissive, which is not a bad thing per se, as it allows them to reply to more questions, though we must take this into consideration. Vahag's proposal (which is indeed supported by the source you named here, without the valuable French parallel) is practically flawless and, looking back at it, quite evident. He may have sounded arrogant to you, but I'm sure that wasn't his intention; he was simply trying to be informal and straightforward. Usually a solution between disagreeing authorities is listing the multiple hypotheses, though in cases like this I believe it would only be dispersive. That said, I'm glad you, @Chernorizets, have brought this interesting word to attention, and this matter to a discussion room rather than reverting. Thank you. Catonif (talk) 16:41, 28 October 2023 (UTC)[reply]

The comparison with French piquant is unconvincing. The literal sense of this term is “stinging”. What we call “a stinging wasp” the French call une guêpe piquante.^[4] The French Wikipedia gives the caractère piquant (stinging nature) as the reason little love is lost on the stinging nettle.^[5] The senses of the term piquant that are relevant here, one indicating pungency, the other biting cold, were not transferred from one to the other; both stem from the basic sense “stinging” applied to different sensations. Allspice, the spice sense of бахар, is not pungent at all. (We do list a sense 2 “pimento, pimiento”, but I think this is mistaken. It is the dried fruit of Pimenta dioica which is sometimes confusingly called pimento but has nothing to do with peppery pimiento.) --Lambiam 20:55, 28 October 2023 (UTC)[reply]

@Lambiam personally, I would've expected a connection with the other sense of Turkish bahar - spring (the season). Turkish ilkbahar corresponds to a holiday - Илък Бахар - celebrated by Bulgarian Turks on March 9 to mark the beginning of spring, when it's still relatively cold (e.g. here). I don't know how much weight to put on the French example, seeing as semantic shift processes in French need not have analogs in Bulgarian. Chernorizets (talk) 22:09, 28 October 2023 (UTC)[reply]

English IPA vowels

I apologize if there is a better place for this discussion that I failed to find.

So… The IPA notation for English vowels that Wiktionary currently uses is deeply flawed. It has always left me with more confusion than insight—and I'm a native speaker who doesn't actually need these transcriptions, so I can only imagine the headache this causes those who do. I'll mainly refer you to this video by Dr Geoff Lindsey, as it gives a far more extensive and detailed explanation than I could write here. The video mainly covers BrEng but broadly applies to AmEng as well.

One problem not mentioned in the video: The RP-ification of semivowel glides makes it impossible to tell whether the fleece and goose vowels are actually monophthongal when discussing specific dialects.

It also causes a lot of messy, inconsistently overcomplicated transcriptions. Here are some examples I found, just searching off the top of my head, and how I would fix them:

Entry	Current transcription	Problem	Proposed fix
studying	/ˈstʌdiːɪŋ/	"studding"?	/ˈstʌdijɪŋ/
Rhymes:English/eɪɪŋ		That's a Roman numeral…	/-ejɪŋ/
Himalayas	/ˌhɪm.əˈleɪ.əz/, /hɪˈmɑl.ə.jəz/	In the first transcription, the /j/ is subsumed into the /eɪ/ notation and thus arbitrarily erased.	/ˌhɪm.əˈlej.əz/, …
paella	(UK) /paɪˈ(j)ɛlə/, /pɑːˈɛ.(j)ə/ (US) /pɑːˈeɪ.(j)ə/, /paɪˈ(j)eɪ.(j)ə/	Duplicate /(j)/s were written in the following syllables, since it is not clear that the /ɪ/ represents this.	(UK) /pɑjˈɛlə/, /pɑːˈɛjə/ (US) /pɑːˈej.ə/, /pajˈej.ə/
Io	/ˈaɪoʊ/	But not /ˈaɪ.(j)oʊ/ as above? I beg you to try pronouncing the symbols /ɪ ʊ/ literally, so that the word becomes a tetraphthong with no semivowels.	/ˈajow/

Additionally, as far as AmEng goes, /ʌ/ refers to a vowel that doesn't exist. This encourages the pop linguistic myth that "schwa is never stressed"—as well as a seemingly even more arbitrary convention in /ə ˈɜ/, which seems like a distinction for narrow transcriptions at best. Personally, since the /ʌ/ notation made me mistake my schwa for an open-mid back vowel, I was confused about that entire quadrant of the IPA vowel chart for years. However, a lot of other American pronunciation resources are catching on and using /ə/ in all instances.

Entry	Current transcription	Proposed fix
above	(General American) /əˈbʌv/	(General American) /əˈbəv/
Russia	(UK, US) /ˈɹʌʃə/	(UK) /ˈɹʌʃə/ (US) /ˈɹəʃə/
burger	(US) /ˈbɝɡɚ/	(US) /ˈbɚɡɚ/

Overall, the current notation is being stretched beyond its applicability (to the bygone RP) and beyond its capabilities. Readers and editors are made to hallucinate phonemes that don't exist in their dialects, to mistake glides for length and stress for vowel quality, to doubt that a phoneme exists or else erroneously double it up. The very people who ought to benefit from the IPA are learning to use its symbols (ː, ə and ʌ, etc.) incorrectly. I regrettably can't reach any other conclusion: the notation is false and misinformative, and has little place in a reference work such as this. I think we need a far deeper investigation into the best system for transcribing English into the IPA. AgentMuffin4 (talk) 02:17, 29 October 2023 (UTC)[reply]

@AgentMuffin4: IMO, /ˌhɪm.əˈleɪ.əz/ is correct and consistent with how other dictionaries note English pronunciation, and /ˌhɪm.əˈlej.əz/ incorrect. English diphthongs do not end in a /j/ or /w/, but a significantly lower and more centralized variant vowel. Try listening to English my vs. French maille, which does end in /j/, and you'll see the difference. Benwing2 (talk) 04:48, 29 October 2023 (UTC)[reply]

Discussion about /ʌ/ ~ /ə/

This has come up before. In the last thread, I stridently opposed the attempt to transcribe the vowel of STRUT with a /ə/, because that is not the vowel we pronounce. We pronounce it as IPA [ʌ]. There is no stressed schwa in English. The situation is very similar in German, where the schwa vowel /ə/ occurs only in unstressed syllables, despite being spelled e in nearly all cases. Nobody that I know of has argued that German's schwa is just an allophone of /ɛ/, or that we should transcribe words such as beginnen as /bɛ'gɪn.ɛn/. —Soap— 15:47, 29 October 2023 (UTC)[reply]

"There is no stressed schwa" is factually wrong for GenAm at least. /ʌ/ is a relict from the past. Vininn126 (talk) 16:00, 29 October 2023 (UTC)[reply]

Yeah, I and many other GenAm speakers pronounce them exactly the same. The video sent by AgentMuffin4 explains it very well. AG202 (talk) 16:13, 29 October 2023 (UTC)[reply]

Schwa /ə/ and STRUT /ʌ/ vowels in EVERY English accent (almost) is also a good video to watch. AG202 (talk) 21:52, 29 October 2023 (UTC)[reply]

Yes, I remember that video being linked in the discussion a year ago. I only need to play the first 13 seconds to hear him say that the supposed American English schwa varies a bit depending on stress. So, does this mean that the American English schwa has allophones? Would it be right to state that the stressed allophone of the schwa vowel is [ʌ]? —Soap— 22:07, 29 October 2023 (UTC)[reply]

The video I just sent was not linked iirc. But I'd recommend that you actually watch both videos thoroughly and listen to the examples provided of GenAm speakers as they do not distinguish them in terms of actual quality (not just stress). I personally do not hear [ʌ], especially when compared to the RP speakers who actually do distinguish /ʌ/ vs /ə/ AG202 (talk) 22:11, 29 October 2023 (UTC)[reply]

Looking at Wiktionary:Beer parlour/2022/November § ʌ in American English pronunciations, there were really only one or two people who were strongly opposed, yet the discussion just died (per usual). The change should have been made by now. If for some reason it requires a full vote, then it should be brought up, but otherwise there looks to already be a clear consensus in favor of deprecating /ʌ/ for /ə/ in GenAm. Same thing applies to Wiktionary:Beer parlour/2022/November § /ɝ/ vs /ɚ/ in GenAm. AG202 (talk) 16:29, 29 October 2023 (UTC)[reply]

Watching the Geoff Lindsey video as well, we're clearly in the minority for using /ʌ/, as almost every resource uses /ə/, minus a few exceptions, one of which (Longman) openly admits that they're the same vowel. We have got to get with the times. AG202 (talk) 16:49, 29 October 2023 (UTC)[reply]

I am strongly opposed to this, and I've explained my reasoning. It's wrong, and will lead readers to believe that Americans are pronouncing words with a literal stressed schwa phone, which I've never heard. (If you said [pət] out loud without any context, how would the listener know whether you meant putt or put?) We'd have to say that the schwa, unlike all other vowels, has two distinct allophones, one for stressed and one for unstressed syllables. English is very similar to German, and we are perfectly content to analyze German with a schwa vowel that occurs only in reduced syllables. Why should English be different? I made the same argument in November, and nobody answered me then either (search the thread for entdecken).

If the community decides to go forward with this based on the November vote, I'd rather us just do it and not have yet another Beer Parlour thread for it. But this is NOT in any way a weakening of my position from November. Best regards, —Soap— 16:51, 29 October 2023 (UTC)[reply]

Yes, I know that you were strongly opposed and still are now. However, the evidence already strongly points towards the two vowels being the exact same with just stress being the differing factor. I don't know German phonetics so I'm not going to comment too much on it, but listening to the audio file at entdecken, it's clear that the first two vowels have a completely different quality than the schwa, even without stress, so it's not the same situation as GenAm where the vowel quality is actually the same.

Also, as stated, MW, OED, and many many other dictionaries & guides for GenAm have already made the switch, and I'd be very wary to state that they're all objectively wrong. If you watch the video provided again, I'd be really really surprised to hear you say that the GenAm speakers there are actually making a vowel quality difference between /ʌ/ & /ə/. (Especially compared to the actual example of /ʌ/ vs /ə/ in RP provided) AG202 (talk) 17:01, 29 October 2023 (UTC)[reply]

You're linking to a word that has two /ʌ/ vowels. Are you saying that you want to also change unstressed /ʌ/ into schwa? This would imply that the prefix en-, in one pronunciation, would be homophonous with un-. Are there people who merge enwrap and unwrap? That would be quite confusing. But I've made every possible point already in the thread from a year ago. I just want to hear from the people who support the merger ...

does this vowel have allophones that vary by stress, or is it the same IPA cardinal value in all positions? if the latter,

what is the actual IPA value of the vowel you're hearing? Is it [ʌ], [ə], or something in between?

what should we do with the unstressed schwas that are not from underlying /ʌ/, as in the prefix en- as above? should those pronunciations be removed so that the AmEng schwa vowel can be paired exclusively with RP /ʌ/?

Best regards, —Soap— 17:13, 29 October 2023 (UTC)[reply]

As sche mentioned, if I were to say "enwrap" it'd be with /ɪ/. It's also interesting that you bring up the point about Americans pronouncing words, as honestly I've seen this transcription of /ʌ/ lead to more confusion when Americans are learning other languages. Ex: with Korean, which actually has something closer to a true /ʌ/, Americans see that undone supposedly has that vowel and then try to replicate it in Korean, only for it to be entirely wrong. Not saying that phonemic transcriptions are going to match between languages, but I've never seen such stark mismatches as with /ʌ/. AG202 (talk) 21:57, 29 October 2023 (UTC)[reply]

But put is pronounced with /ʊ/. No one is questioning that one. As to enwrap/unwrap, stress is one of the deciding factors in those words, not quality. Vininn126 (talk) 17:40, 29 October 2023 (UTC)[reply]

Soap's allegation is that the sequence [pət], with a phonetic mid-central vowel, might sound for some English speakers closer to the vowel that they use for the phoneme "/ʊ/" FOOT (which is often centralized) rather than the vowel they use for the phoneme "/ʌ/" STRUT (which, as implausible as it might seem to some participants in this conversation, is actually pronounced by some English speakers with a more open quality than the official position of [ə] on the IPA vowel chart, which places the phone [ə] at a higher position than [ɛ] and [ɔ]). I know that for me, my STRUT vowel is fairly open, and it's actually about as back as my LOT/CAUGHT vowel /ɑ/; although that might be fronted somewhat relative to phonetic [ɑ], /ɑ/ is one of the backest vowels in my inventory (more back than my /u/), so I'm not actually very convinced by the argument that the symbol [ʌ], representing an open-mid back vowel (that is, a vowel that sounds something like a closer version of [ɑ]), is terribly misleading and inaccurate compared to [ə].--Urszag (talk) 19:25, 29 October 2023 (UTC)[reply]

The focus is less on where exactly it falls on the vowel chart, but more on IF /ʌ/ & /ə/ have an actual distinction in vowel quality in GenAm. (Also, tbh the IPA vowel chart is far from always precisely or officially followed in many languages, especially when it comes to phonemic transcription.) AG202 (talk) 21:31, 29 October 2023 (UTC)[reply]

If we're going to be transcribing it, we need to pick a value. I repeat my question from above ... what is the actual IPA value of the vowel you're hearing? Is it [ʌ], [ə], or something in between? If it's [ə], are you claiming that someone saying [pət] the ball would be understood as saying "putt the ball"? Thanks, —Soap— 21:58, 29 October 2023 (UTC)[reply]

Yes. Or at least both of the vowels in "above" and the vowel in "putt" sound exactly the same to me. I can say that it's not [ʌ], but I'm not sure about its exact location compared to [ə]. Nonetheless, I'd rather just follow current practice and use /ə/, and deal with the phonetic transcription later, as that's not the more pertinent issue at hand right now. AG202 (talk) 22:15, 29 October 2023 (UTC)[reply]

If they actually put stress on the word at a syllable level, then yeah, they could most definitely be understood as saying "putt". The difference between "put" and "putt" here would be the stress, not the vowel (since a stressed "put" would have the foot vowel instead). MedK1 (talk) 20:08, 2 November 2023 (UTC)[reply]

Someone brought up en- vs un- last time, and I would love audio examples of this alleged pronunciation of en- with /ə/ but un- with /ʌ/, because people have turned out to have a number of misconceptions in these discussions, from not noticing /ʊ/ (I'm not sure if Soap is saying put has a schwa, but in the past Gilgamesh claimed that not only /ʊl/ words but also /oʊl/ words all had /ə/, not only for him but for most Americans! which is plainly wrong), to misidentifying which words have /ʌ/ vs /ə/ (because, due to not actually distinguishing those vowels themselves, they can't tell that their stress difference isn't the same as the vowel-quality difference speakers who actually distinguish /ʌ/-vs-/ə/ have, and so misjudge which words have one or the other), to thinking most Americans merge /ɑ/ and /ɔ/ to /ɒ/. It's evident that psychology influences what people hear, so until there's evidence, I am sceptical of the claim that en- has a schwa but un- has /ʌ/ for any group of Americans, let alone for GenAm overall; as far as I've ever heard and as far as I can find in any dictionary (Cambridge, Collins, Dictionary.com, Merriam-Webster, ...), en- and un- are distinct because en- doesn't have a schwa. - -sche (discuss) 20:53, 29 October 2023 (UTC)[reply]

Can we please make sure to distinguish between slashes and brackets in this discussion? I'd guess that Gilgamesh was actually talking about the phone [ə], not the phoneme /ə/, and likewise about the phone [ɒ], but I'm not entirely sure. As for en- vs. un-, another prefix it may be useful to consider is con-. Consider the pair intention and contention; both have (or rather, can have) a fully unstressed first syllable, and the first syllable of contention surely doesn't have /ɪ/ (or I don't think I've seen anyone argue for that...). Then the hypothesis that en-= /ən/ predicts that this pair should match aside from the initial consonant (and whatever allophonic effects it has on the following vowel), whereas the hypothesis that en-=/ɪn/ predicts that they should have distinct vowel sounds in the first syllable.--Urszag (talk) 21:29, 29 October 2023 (UTC)[reply]

Gilgamesh thought most people in America merged /ʊl/ and /oʊl/ to /l̩/, the same way they merge horse-hoarse. Mahagaja and I did find a few mentions in literature that a few speakers do merge those sounds; apparently he did actually merge them, and simply couldn't hear that most people don't, in the same way speakers with the cot-caught merger don't notice other people pronouncing them distinctly. That's why I think it's important to look at scholarly works that measure the vowels people say; our lay assessments of "I think I say /.../" can be mistaken or unawarely idiosyncratic.

If some speaker (with {{a|weak vowel merger}}?) pronounced in- with an actual schwa, that would indeed not be distinct from the pronunciations of un- that Lindsey measures as objectively having /ə/. But our transcriptions of {{a|weak vowel merger}}, like e.g. (pin-pen merger), are separate from {{a|GenAm}}, so I wouldn't consider them when deciding how to notate GenAm in-, let alone un-. In GenAm, as far as I've seen in any of the references that have been brought up in discussions of this, in- and un- are distinguished by in- being /ɪ/. (References vary between using the traditional /ʌ/ notation for un-, using /ə/, or using /ʌ/ but noting that it means /ə/.) - -sche (discuss) 03:08, 30 October 2023 (UTC)[reply]

[Audio clip (US)]

I forgot to record a word specifically starting with the prefix "en-", but here's an audio clip of me saying "affectionate, onion, intelligence, intention, contention, undone, blowgun, bank run, staple gun". I think it's definitely possible that my pronunciations of "intelligence" and "intention" start with [ɪ], but it could be [ɘ], and I don't feel like there's as big a difference between the sound of "in-" there and the sound I use for unstressed "-ence" and "-on" as there is between the sound of "un-", "run" and "gun" and the sound of unstressed "on" and "-ence". So I would say that I have a weak vowel merger with neutralization of /ə/ and /ɪ/. As far as I know, though, there's no party in favor of transcribing intention and intelligence with /ɪ/ in the last syllable.--Urszag (talk) 23:29, 30 October 2023 (UTC)[reply]

I can definitely hear that your intelligence & intention have /ɪ/ in initial syllables and sound different than the schwa in onion & others. In terms of /ʌ/, I can hear a difference, but going back to other points brought up, this is part of why I think it's better to look at concise general studies of GenAm rather than trying to analyze each person's pronunciation. I'm sure I have my own quirks that wouldn't align completely with GenAm, but I wouldn't try to put them online anywhere marked as GenAm unless it's been shown repeatedly that they're a feature of the accent across the board. AG202 (talk) 23:47, 30 October 2023 (UTC)[reply]

For the record, I for my part only support changing /ʌ/ to /ə/ or /ɝ/ to /ɚ/ if we do both. I'm actually fine with keeping the traditional notation for historical continuity (indeed, that's what I'm weakly inclined to do). If we update our notations to reflect modern speech, I'm down with that too, as long as we update both — but every reference work I've come across in all of these discussions either supports both changes, or keeps using the older notation for both; as far as I've seen so far, the only people who think that just one of the two changes happened are a few users of this site. - -sche (discuss) 09:04, 30 October 2023 (UTC)[reply]

I'd argue that tradition for tradition's sake is a dangerous thing and can lead to misinformation, or at least outdated information. I also support transcribing the central r vowel as /ɚ/. Vininn126 (talk) 09:42, 30 October 2023 (UTC)[reply]

Agreed. Andrew Sheedy (talk) 11:42, 30 October 2023 (UTC)[reply]

I agree with User:-sche here. Benwing2 (talk) 22:06, 30 October 2023 (UTC)[reply]

I prefer keeping both /ʌ/ and /ɝ/ over getting rid of both. Replacing them with /ə/ and /ɚ/ is certainly a defensible choice, but I don't think it simplifies things as much as it is alleged to, since it just shifts the complicated part from the choice of vowel symbols to the choice of where to use secondary stress marks. Before any large-scale replacement, I would want to see a system set up for dealing with that (e.g. "/ʌnˈduː/" ought to be replaced with /ˌənˈduː/ not just /ənˈduː/). I would also want this to be decided by a formal vote.--Urszag (talk) 23:29, 30 October 2023 (UTC)[reply]
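To make the secondary-stress concern above concrete, here is a rough Python sketch of what a naive bot replacement would have to do. It is not an agreed procedure, and the function name replace_strut is made up for illustration. It assumes syllables are delimited only by the stress marks and dots already present in the transcription, so it cannot see an unstressed /ʌ/ that sits inside a longer undelimited chunk; such entries would still need manual review.

```python
import re

STRESS_MARKS = "ˈˌ"

def replace_strut(ipa: str) -> str:
    """Swap /ʌ/ for /ə/, adding secondary stress where /ʌ/ had no stress mark.

    Hypothetical helper: syllable chunks are inferred purely from the
    stress marks and dots already written in the transcription.
    """
    body = ipa.strip("/")
    # No stress marks at all (e.g. a monosyllable): plain symbol swap.
    if not any(mark in body for mark in STRESS_MARKS):
        return "/" + body.replace("ʌ", "ə") + "/"
    # A chunk is an optional stress mark, some segments, and an optional dot.
    chunks = re.findall(r"[ˈˌ]?[^ˈˌ.]+\.?", body)
    out = []
    for chunk in chunks:
        if "ʌ" in chunk:
            chunk = chunk.replace("ʌ", "ə")
            # The old symbol implied a full vowel; keep that information
            # by marking secondary stress if no mark was present.
            if chunk[0] not in STRESS_MARKS:
                chunk = "ˌ" + chunk
        out.append(chunk)
    return "/" + "".join(out) + "/"

print(replace_strut("/ʌnˈduː/"))
print(replace_strut("/əˈbʌv/"))
```

On the examples from this thread, /ʌnˈduː/ comes out as /ˌənˈduː/ rather than /ənˈduː/, while /əˈbʌv/ simply becomes /əˈbəv/.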

@Urszag, AG202

[Audio clip: emburden, enqueue, etc.]

This includes the same words as above plus some words in en-/em- and un- for comparison. Benwing2 (talk) 00:13, 31 October 2023 (UTC)[reply]

@Urszag I think the best "system for replacement" would be my writing an {{en-IPA}} that abstracts out some of this detail. There's still the issue of whether words like undone and unburden have secondary stress; as planned, my module can infer the position of primary stress in some circumstances but will require all secondary stresses to be notated explicitly. Benwing2 (talk) 00:21, 31 October 2023 (UTC)[reply]

Is secondary stress contrastive in English? I agree it's a good idea to transcribe, but we might end up having to use [] as opposed to //. Vininn126 (talk) 08:48, 31 October 2023 (UTC)[reply]

@Vininn126 I think it's best to assume secondary stress is phonemic in English, since it's consistent in many words but unpredictable. A possible minimal pair that comes to mind is Reagan vs. raygun (assuming an analysis where /ʌ/ = /ə/). It's clearly tied up with vowel reduction, though. Benwing2 (talk) 08:56, 31 October 2023 (UTC)[reply]

That is indeed a good pair! I am satisfied calling it phonemic then. Vininn126 (talk) 09:08, 31 October 2023 (UTC)[reply]

@Benwing2 I'm already working on an English pronunciation module, just as an FYI. Theknightwho (talk) 13:44, 31 October 2023 (UTC)[reply]

@Theknightwho Cool. What is the current state? Can you post some code? Benwing2 (talk) 22:30, 31 October 2023 (UTC)[reply]

@Benwing2 Sure - it's Module:User:Theknightwho/en-pron, which is an adapted version of the Espeak-ng text-to-speech engine. It uses just over 6,000 rules to construct a phoneme array, which can then be turned into IPA as appropriate. Accents are not a problem, because it takes into account all the various distinctions, which can then be merged as appropriate for a given accent (e.g. parm-palm, cot-caught etc). It handles primary and secondary stress, though it will need manual respelling in some situations. Theknightwho (talk) 22:44, 31 October 2023 (UTC)[reply]
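For readers following along, the "keep every distinction, then merge per accent" idea can be sketched in a few lines of Python. This is only an illustration of the general approach and not the module's actual data structures: the lexical-set keys, the accent labels and the render function are all assumptions made up for the example.

```python
# Hypothetical maximal-distinction phoneme keys, named after lexical sets.
# Each accent table maps a key to the symbol used for that accent; keys
# that share a symbol are thereby merged.
ACCENT_MAPS = {
    "RP": {"LOT": "ɒ", "THOUGHT": "ɔː"},
    "cot_caught_merged": {"LOT": "ɑ", "THOUGHT": "ɑ"},
}

def render(phonemes, accent):
    """Turn an accent-neutral phoneme array into a slash-delimited string."""
    table = ACCENT_MAPS[accent]
    # Anything not in the table (plain consonants here) passes through as-is.
    return "/" + "".join(table.get(p, p) for p in phonemes) + "/"

cot = ["k", "LOT", "t"]
caught = ["k", "THOUGHT", "t"]

for accent in ACCENT_MAPS:
    print(accent, render(cot, accent), render(caught, accent))
```

The point of the design is that the phoneme array is stored once, with the distinction intact, and a merger is just a table in which two keys map to the same symbol.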

@Theknightwho Can you include some information on how it works from the end user's perspective? What is the structure of the respelling, for example? Also are you planning on working on this in the near future? Benwing2 (talk) 22:48, 31 October 2023 (UTC)[reply]

@Benwing2 I've been working on it in the last few days, as I needed a break from the other stuff. I haven't decided on the respelling rules yet, but the ruleset is expansive enough that it's likely to be possible with intuition in most cases. That being said, we could allow for explicit phoneme inputs (which is what E-speak actually uses for some words - see this list). I intend to have a "verb" flag, too, since that can affect stress. Theknightwho (talk) 22:55, 31 October 2023 (UTC)[reply]

@Theknightwho All right, sounds good. I would not recommend a respelling that looks remotely like the phoneme respelling in the list you linked to. Instead it should be something that uses an approximation of normal English rules. I wrote the following in WT:Grease pit/2022/November:

I am planning on starting on an English pronunciation module soon. I have written such modules for several languages. My plan is to use an approximation of actual English spelling as the respelling, with some diacritics added to denote things like primary and secondary stress. This is what I do currently e.g. for Portuguese (see {{pt-IPA}}), and it's what I did for German (the German module is not yet released but you can see lots of testcases along with the current output at Module:User:Benwing2/de-pron/testcases/prefixes, Module:User:Benwing2/de-pron/testcases/suffixes and Module:User:Benwing2/de-pron/testcases/misc). I don't want to use IPA for the respelling because that is likely to lead to inconsistencies, as noted by User:Urszag. The first thing I will do is create a page with a bunch of testcases, similar to the German testcases just mentioned; this should help refine the particular respelling conventions. Note also that the module will not allow all possible respellings, but will throw errors for certain combinations, e.g. the use of 'ough' in respelling, as well as short vowels followed by single consonants (except in certain circumstances) is likely to result in an error since these combinations are too ambiguous to reliably interpret. Hence the beginning of the Gettysburg address might be respelled something like 'fore score and sevvan years ago, our fahdhars braught forth on this continènt a new naition, canceeved in Libbarty, and déddicàited to the pròppazítion that aull men are creeáited eequal'. Here, 'a' is used for schwa; acute accents indicate primary stress when it doesn't follow the standard third-from-the-end rule or can be inferred from the occurrences of schwas (which never bear stress); grave accents indicate secondary stress; the module knows how to handle certain common words like 'the', 'a' 'is', 'are', etc.; the module knows about final -tion, -y and various other suffixes; etc. 
The respelling should be abstract enough that all major English accents can be generated from it. Benwing2 (talk) 04:36, 20 November 2022 (UTC)[reply]

Here I give an example of respelling; with 6,000 rules you could almost certainly do better, although at a certain point the rules get so complex that it would be difficult for native speakers to reliably produce the respelling. I would definitely also recommend creating a large set of testcases using your chosen respelling, as I did for German; this will help you work the kinks out of the respelling system, while giving you a set of regression tests to test changes on. Benwing2 (talk) 23:21, 31 October 2023 (UTC)[reply]
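As a sketch of what such a regression set could look like in practice (offline, in Python rather than on a testcases page), the harness below compares a converter's output against stored expectations and reports anything that changed. The names run_testcases and toy_convert are hypothetical, and the toy converter is just a lookup table standing in for a real respelling engine.

```python
from typing import Callable, Dict, List, Tuple

def run_testcases(
    convert: Callable[[str], str],
    cases: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (input, expected, actual) for every case that does not match."""
    failures = []
    for respelling, expected in cases.items():
        actual = convert(respelling)
        if actual != expected:
            failures.append((respelling, expected, actual))
    return failures

# Toy stand-in converter: a lookup table, NOT a real respelling engine.
TOY_TABLE = {"cot": "/kɑt/", "caught": "/kɔt/"}

def toy_convert(respelling: str) -> str:
    return TOY_TABLE.get(respelling, "?")

# The second expectation is deliberately wrong to show a reported failure.
cases = {"cot": "/kɑt/", "caught": "/kɑt/"}
for respelling, expected, actual in run_testcases(toy_convert, cases):
    print(f"{respelling}: expected {expected}, got {actual}")
```

Rerunning the same case set after every rule change is what turns the testcases into regression tests: an empty failure list means nothing that used to work has broken.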

@Benwing2 Absolutely - I've got 106,000 testcases from the Current British English Pronunciation Dictionary, though obviously that's limited to a specific accent (and I'll need to ensure that all the phonemes line up). It'll be too big to test using the sandbox, so I'll do testing offline.

Many of the rules are clearly aimed at individual roots, though I think it's using the logic that we tend to pronounce novel words by analogy to existing ones, and I've been impressed with how closely it matches my own intuition when I feed it made-up words. Theknightwho (talk) 23:34, 31 October 2023 (UTC)[reply]

FWIW, I think we should be marking secondary stress independent of what vowel we notate strut words as having.
It's not a perfect minimal pair because the first consonant also differs, but on stackexchange someone noted that the second syllables of the nouns (I have a) permit and (I have a) Kermit differ in having secondary stress (permit) vs zero stress (Kermit) and are not perfect rhymes as a result, which is true at least in my experience. This paper mentions that "Sven Mattys (2000) examined whether native American English speakers are capable of distinguishing between primary and secondary stress within words without accompanying lexical information" by isolating e.g. the prose- part of audio of prosecutor (in which the first syllable has primary stress) and prosecution (in which it has secondary stress) and asking native speakers to identify which word was being said; he found that they were able to do so, "which he believes is evidence that speakers do have the capability to distinguish between primary and secondary stress in isolation." (Some would argue the difference is just "stress [meaning primary stress]" vs "no stress [meaning either no stress, or secondary stress]", but the rest of Torresquintero's paper is devoted to showing that secondary stress is different from no stress both objectively / measurably and also perceptibly / in a way users of language notice and use to identify words.) - -sche (discuss) 17:29, 31 October 2023 (UTC)[reply]

@-sche For me (GA speaker), "permit" (as a noun) and "Kermit" are perfect rhymes. What is your accent? Benwing2 (talk) 22:32, 31 October 2023 (UTC)[reply]

Changing Latin verb definitions to use "to ..." instead of "I ..."

We lemmatize Latin verbs at the first-person singular present indicative rather than the infinitive, consistent with several other dictionaries. IMO this is fine, but we also currently define such terms using "I ..." instead of "to ...". See for example inficio, a verb I picked at random:

I dip, I dunk, I submerge.
I color, I dye, I imbue, I stain, I tinge.
I corrupt, I poison, I spoil, I taint, I infect.

While this is technically correct, IMO it gives the wrong impression. Since these are lemmas, we should define them like this:

to dip, to dunk, to submerge
to color, to dye, to imbue, to stain, to tinge
to corrupt, to poison, to spoil, to taint, to infect

This is consistent with Wiktionary practice for many other languages that use something other than the infinitive as the verbal lemma form, e.g. Bulgarian, Macedonian, Hebrew, Arabic, Sanskrit, even Ancient Greek (see for example ἀγορεύω, ἀγανακτέω, ἀγαπάω, etc., although the last strangely has one of its definitions using "I ...").

Changing this by bot should not be hard; essentially, "I am" -> "to be" and otherwise "I" -> "to" in definition lines, along with "my" -> "one's" and a few other substitutions. Benwing2 (talk) 03:46, 29 October 2023 (UTC)[reply]
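A minimal Python sketch of the substitutions described above, for illustration only: the rule table and function names are assumptions and this is not the actual bot code. It touches only plain definition lines (those starting with "# "), leaving usage examples ("#:") and quotations ("#*") alone, and it applies "I am" before bare "I" so the longer match wins.

```python
import re

# Hypothetical rule table; order matters ("I am" must run before bare "I").
SUBSTITUTIONS = [
    (re.compile(r"\bI am\b"), "to be"),
    (re.compile(r"\bI\b"), "to"),
    (re.compile(r"\b[Mm]y\b"), "one's"),
]

def convert_definition(line: str) -> str:
    """Rewrite a first-person gloss as a to-infinitive gloss."""
    for pattern, replacement in SUBSTITUTIONS:
        line = pattern.sub(replacement, line)
    return line

def convert_entry(wikitext: str) -> str:
    """Apply the rewrite to definition lines only."""
    out = []
    for line in wikitext.splitlines():
        if line.startswith("# "):
            line = convert_definition(line)
        out.append(line)
    return "\n".join(out)

sample = "# I dip, I dunk, I submerge.\n#: I dip the cloth in the dye."
print(convert_entry(sample))
```

Blind word-boundary matching will also hit things like "I'm" or a Roman numeral "I", so a real run would presumably need an exception list and a manual pass over anything unusual.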

C. None of the above. The "to" could be misunderstood as indicating that the lemma is the infinitive. It would be better to leave out both the pronoun and "to":

In the rare cases where the English infinitive is different from the first person singular, use the infinitive form without "to":

be

Chuck Entz (talk) 04:18, 29 October 2023 (UTC)[reply]

@Chuck Entz I see this will not be so easy as I thought. I strongly disagree with omitting the 'to' because the resulting bare infinitive can easily be interpreted as a noun in many cases (e.g. dip, color, dye, stain, tinge, poison, taint, etc.). Benwing2 (talk) 04:42, 29 October 2023 (UTC)[reply]

Also, this is inconsistent with the handling of most existing languages. Benwing2 (talk) 04:43, 29 October 2023 (UTC)[reply]

The Latin verb sections are under the "Verb" subheader, so the only confusion is if you don't read that. CitationsFreak (talk) 06:04, 29 October 2023 (UTC)[reply]

@CitationsFreak True but easily missed in my experience. Benwing2 (talk) 06:25, 29 October 2023 (UTC)[reply]

This is done with many languages that don't lemmatise at the infinitive, including Afar, Yukaghir, Witotoan, Hellenic... The verb being similar to the noun is not an argument; that's just an issue with English. Adjectives and nouns can also be homonymous, as can adjectives and adverbs. If the user isn't sure, he'll have to look up a bit and see the massive POS header. Thadh (talk) 19:07, 29 October 2023 (UTC)[reply]

What do you mean by Hellenic? Recent additions to Category:Ancient Greek verbs, which seem to be mostly the work of @Mahagaja, use the to-infinitive. There probably are verb entries which don't follow that format, but I believe it's been suggested before that they should, so I don't think they can be taken as evidence of anything. P U C – 19:27, 29 October 2023 (UTC)[reply]

γρικώ. Thadh (talk) 19:33, 29 October 2023 (UTC)[reply]

Yeah... Alongside ασημώνω (asimóno) created by the same editor, or πλιατσικολογώ (pliatsikologó), created by another one... P U C – 19:37, 29 October 2023 (UTC)[reply]

So what, we now have to run a bot to determine which is slightly more used? My point is that a bunch of languages already use the bare lemma form instead of the to-infinitive, and it works for them. Thadh (talk) 19:40, 29 October 2023 (UTC)[reply]

No, I'm just saying that given the number of counterexamples, your original statement ("This is done with many languages that don't lemmatise at the infinitive, including [...] Hellenic") doesn't appear to be true for that language family.

I would also suggest that, regardless of what form is currently used the most frequently, Hellenic entries be made to use the to-infinitive. That's probably something for another day, though. P U C – 19:59, 29 October 2023 (UTC)[reply]

@Chuck Entz Hungarian lemmatizes at the third person present singular and it has not caused any problems using "to". This is a clear case of too literal thinking. Vininn126 (talk) 19:24, 29 October 2023 (UTC)[reply]

Support, per the principle of "lemma = lemma". P U C – 11:28, 29 October 2023 (UTC)[reply]

Support Chuck Entz’s concern seems for something virtual, since in spite of the infinitive-marker the translation is still unmarked enough to just be the most unmarked translation for a citation form, the form usually understood by users of language teaching materials as unmarked, only marked to dispel confusion with nouns, and we can’t really do something not done elsewhere at all yet. Fay Freak (talk) 12:00, 29 October 2023 (UTC)[reply]

Support I've brought this up once on Discord. It makes Wiktionary look amateurish with people not knowing how a dictionary should be written. As PUC said, lemma = lemma. — Fenakhay ^{(حيطي · مساهماتي)} 12:48, 29 October 2023 (UTC)[reply]

Support - using "I" feels like hypercorrect nonsense. Theknightwho (talk) 17:57, 29 October 2023 (UTC)[reply]

Support Vininn126 (talk) 18:07, 29 October 2023 (UTC)[reply]

Oppose using to-forms,

Support using bare forms; See above. "to" is not part of the lemma, otherwise we should start hosting verbs at something that includes these. Thadh (talk) 19:13, 29 October 2023 (UTC)[reply]

Ingrian aijoittaa, a verb entry fairly recently created by you, includes the to. Why is this different? P U C – 19:18, 29 October 2023 (UTC)[reply]

As with Murui Huitoto jɨkade. The languages cited by Thadh as examples are those where he himself created the entries, and even then he's not consistent. Benwing2 (talk) 19:22, 29 October 2023 (UTC)[reply]

Ah, it seems I was mistaken with Huitoto. But the fact I give examples of languages I edit is because I edit them and thus remember them; I don't remember every language out there and how they work. Greek is an example of a language I do remember. Thadh (talk) 19:30, 29 October 2023 (UTC)[reply]

BTW, just to make this clear: I am consistent within a language, I just forgot which languages I handle in which way. Thadh (talk) 19:44, 29 October 2023 (UTC)[reply]

@Thadh Doesn't that suggest that there's no real value in omitting "to"? Theknightwho (talk) 12:19, 31 October 2023 (UTC)[reply]

@Theknightwho: Murui is a language that is syntactically very un-European, so its third person can actually serve the function of what in most European languages would be the (to)-infinitive, which I apparently used as an argument to keep the to-forms in definitions. I do think it's disingenuous to translate lemmas with unrelated nonlemmas; it's like translating a Russian infinitive by an English gerund. Thadh (talk) 12:56, 31 October 2023 (UTC)[reply]

@PUC: because that is the infinitive. Thadh (talk) 19:29, 29 October 2023 (UTC)[reply]

I don't understand. So in your view, languages which lemmatize at the infinitive can be translated with a to-infinitive in English, but languages which don't should use the bare infinitive? Why? P U C – 19:34, 29 October 2023 (UTC)[reply]

@PUC: because that's not just an infinitive, it's a lemma form. Just like I think collective nouns should be given in the plural and non-collective in the singular - some terms can be translated with a form of the lemma, others cannot. Using the lemma would work regardless. Thadh (talk) 19:38, 29 October 2023 (UTC)[reply]

Well, it is actually citation form = citation form, not lemma = lemma, as lemma is the term for the page title. One does cite verbs with to, and speakers even call that the infinitive, though it be difficult to understand from a Germanicist comparative perspective. Fay Freak (talk) 19:30, 29 October 2023 (UTC)[reply]

Support -- I've always found it confusing that the current standard is to translate these as "to INFINITIVE" in etymology sections, but not in the entries themselves.--Urszag (talk) 19:28, 29 October 2023 (UTC)[reply]

Support getting rid of "I". I prefer the "to" because so many English verbs are homographs of nouns, so using "to" makes it clearer at a glance that we're talking about the verb. (I use "to" in Ancient Greek entries partially for that reason and partially because I tend to copy-paste the English glosses from {{R:Middle Liddell}}, which uses "to".) However, I don't feel very strongly about it, and will not object if amō is glossed “love” rather than “to love”, as long as it isn't glossed “I love”. —Mahāgaja · talk 19:36, 29 October 2023 (UTC)[reply]

Just to be clear, my actual view is the flip side of this: I prefer the bare infinitive, but I don't feel that strongly about it. I just wanted to present the option so that people are aware of it and can choose from a full list of choices.

That said, if the definition uses the "to" infinitive, is there anything in the headword that says it's the 1st person singular present indicative rather than one of the infinitives? Chuck Entz (talk) 20:13, 29 October 2023 (UTC)[reply]

Support. The headword is a convention, a title of the story that is told by the contents of the entry. Some of these contents may not even relate to the headword, for example the information on the inflection and etymology of the suppletive forms. --Vahag (talk) 21:20, 29 October 2023 (UTC)[reply]

Abstain. The handful of modern Latin dictionaries I could find with preview available on Google Books do seem to translate Latin first-person lemmas to English to-infinitives. It is perhaps confusing to be glossing a non-infinitive form as an infinitive when Latin also has an actual infinitive, but clearly other people find it confusing to be glossing a citation form with "I...". - -sche (discuss) 21:25, 29 October 2023 (UTC)[reply]

Support — — Salt marsh ^🢃 12:00, 30 October 2023 (UTC)[reply]

I don't have a strong feeling either way, but I note that Latin is not the only language for which our entries use "I ..." glosses for verbs. I ran the numbers:

Latin (81% of verb entries use "I ..." glosses - note that form-of entries do not use "I ..." glosses because they do not have glosses at all, so the "true" percentage would be much higher)
Proto-Italic (95% of verb entries use "I ..." glosses). If Latin is changed it seems obvious that Proto-Italic should be changed too.
Ancient Greek (16% of verb entries use "I ..." glosses). Given the very close grammatical similarities concerning verb forms and lemmatisation, I feel like Latin and Ancient Greek should be kept in sync. Whatever we do for Latin should also apply to Ancient Greek.
Albanian (50% of verb entries use "I ..." glosses)
Aromanian (74% of verb entries use "I ..." glosses)
Megleno-Romanian (97% of verb entries use "I ..." glosses)
Greek (8% of verb entries use "I ..." glosses)
Mohawk (65% of verb entries use "I ..." glosses)

Should we also make changes for any of these languages, for the same reasons? This, that and the other (talk) 00:42, 31 October 2023 (UTC)[reply]
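The comment does not say how the percentages above were computed. As a rough sketch only, a count of this kind could look like the following, assuming a hypothetical mapping from entry titles to gloss strings; the function name and the sample data are illustrative and not taken from any actual script.

```python
import re

# A gloss counts as an "I ..." gloss if it starts with the pronoun "I"
# followed by whitespace, so glosses such as "ice" or "in" do not match.
I_GLOSS = re.compile(r"^I\s+\S")

def share_of_i_glosses(glosses_by_entry):
    """Return the fraction of entries that have at least one "I ..." gloss.

    glosses_by_entry maps an entry title to its list of gloss strings.
    Entries without glosses (e.g. form-of entries) still count in the total,
    which is why such a figure understates the share among real lemmas.
    """
    if not glosses_by_entry:
        return 0.0
    hits = sum(
        1
        for glosses in glosses_by_entry.values()
        if any(I_GLOSS.match(g.strip()) for g in glosses)
    )
    return hits / len(glosses_by_entry)

# Made-up sample data, for illustration only.
sample = {
    "amo": ["I love"],
    "scribo": ["I write", "I draw"],
    "amare": [],  # form-of entry with no gloss of its own
    "pluit": ["it is raining"],
}
print(f"{share_of_i_glosses(sample):.0%}")
```

Dropping the gloss-less entries from the denominator would give the "true" percentage mentioned for Latin.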

@This, that and the other IMO, yes we should, for all of these (which look to be mostly a collection of "Greek-adjacent" languages). Benwing2 (talk) 00:54, 31 October 2023 (UTC)[reply]

Actually I can't speak for Mohawk as I know nothing of this language, but yes at least for the remainder. Benwing2 (talk) 00:56, 31 October 2023 (UTC)[reply]

Most (I think) Greek entries are the bare lemma, but a change to "to" would seem fine - uniformity makes sense! — Salt marsh ^🢃 11:56, 31 October 2023 (UTC)[reply]

As said above, I support this for Greek and Ancient Greek too. P U C – 12:22, 31 October 2023 (UTC)[reply]

Wiktionary:Style guide § Verbs states: ‘Definitions of verbs should begin with “to”.’ I don't see an argument for making an exception for verbs whose lemma form is not an infinitive. --Lambiam 07:52, 1 November 2023 (UTC)[reply]

@Lambiam: Style Guide is not a policy, it's just thoughts from different users on one page. Thadh (talk) 07:04, 2 November 2023 (UTC)[reply]

Our style guide is not a haphazard collection of random thoughts. It helps to ensure a consistency that makes the use of Wiktionary a smoother experience. The sentence has been in the style guide from its inception in 2009. All guidelines (and policies, for that matter) should be applied with common sense, but if one is advocating for a systematic exception, I feel one should present an argument for this deviation of the standing guideline. --Lambiam 08:29, 2 November 2023 (UTC)[reply]

Support, only if it's made clear that it's the first-person singular active present form. Our Latin entries are linked from a ton of Romance language entries, which do lemmatize at the infinitive, and to be honest, it's already confusing trying to piece forms together if you don't have experience with how Latin entries are lemmatized. I feel that it could get even more confusing with a change to "to ...". AG202 (talk) 12:22, 1 November 2023 (UTC)[reply]

@AG202 Can you give an example of how you think the text should look? One possibility is to display something like lemmatized at first singular present indicative at the beginning of the inflection section of the headword, which would then look like:

(lemmatized at first singular present indicative; present infinitive īnficere, perfect active īnfēcī, supine īnfectum)

My main concerns are that this might be perceived as verbose, and that people might not completely understand what lemmatized means. Maybe just first singular present indicative is enough, I don't know. Benwing2 (talk) 07:17, 2 November 2023 (UTC)[reply]

Definitely no need for linguistics jargon front and centre like that. How about this for a crazy idea:

first person singular present indicative īnficiō (present infinitive īnficere, perfect active īnfēcī, supine īnfectum)

I know no other language does this... yet... (Note I left out "active" because "active" is also missing from "present infinitive", although it is strangely included in "perfect active".)

Alternatively, we could just defer to the conjugation table provided lower on the page, where the lemma form is the very first entry. This, that and the other (talk) 11:03, 2 November 2023 (UTC)[reply]

@Benwing2 I could go with something like this. AG202 (talk) 16:20, 8 November 2023 (UTC)[reply]

@AG202 What about just this (including the headword itself in the example):

īnficiō first person singular present indicative (present infinitive īnficere, perfect active īnfēcī, supine īnfectum)

It seems strange to put the headword in the middle of the text (if that was the intention rather than repeating the headword). Benwing2 (talk) 21:36, 8 November 2023 (UTC)[reply]

@Benwing2 Yes, you're right, this would be better. AG202 (talk) 14:12, 10 November 2023 (UTC)[reply]

Oppose to-forms,

Support bare forms. There's no way they're getting confused at all; it's not like "Verb" in big bold letters above is the easiest thing in the world to miss. MedK1 (talk) 19:56, 2 November 2023 (UTC)[reply]

@MedK1 Are you aware that we have to use "to" forms regardless when mentioning Latin verbs in etymologies (since otherwise there's no indication that a word in -o is a verb)? Also, I definitely get confused when I see bare verb definitions, since they look like noun definitions and most verb definitions use "to". Benwing2 (talk) 20:24, 2 November 2023 (UTC)[reply]

I am, which is why I'm only for bare forms in lemma definitions where it's clear they're verbs at least. If you add the "to" there though, it makes it seem like the word in -ō is an infinitive. MedK1 (talk) 20:40, 2 November 2023 (UTC)[reply]

Honestly, I feel like the option with the best result would be to "just" lemmatize the verbs at the infinitive. But then that's a lot of work... MedK1 (talk) 20:42, 2 November 2023 (UTC)[reply]

@Benwing2: That's not true, you can give {{der|LANG|la|TERM|pos=verb}}. Thadh (talk) 20:41, 2 November 2023 (UTC)[reply]

I suppose you could, but that seems a lot of extra work just to be contrary. Benwing2 (talk) 21:00, 2 November 2023 (UTC)[reply]

Support. One need look no further than words like aestuo (“I rise in billows”), scintillo (“I scintillate”), fluctuo (“I undulate”), fragro (“I am redolent”), fulmino (“I fulminate”), tono (“I thunder”), fulgeo (“I blaze”), fremo (“I buzz, complain loudly”), I mean come on... DJ K-Çel (contribs ~ talk) 20:50, 2 November 2023 (UTC)[reply]

Support. The citation/lemma form should not be taken as a reference to a specific inflected form but rather as a conventional label for the lemma as an abstract whole, subsuming all inflected forms under it. For that reason, citation forms should be defined with citation forms. — Vorziblix (talk · contribs) 16:10, 8 November 2023 (UTC)[reply]

Support changing 'I X' to 'to X'. In general, there are some languages where adjectives and verbs are not distinguished (or more likely there is a very small set of adjectives), and the disambiguation supplied by prefixing 'to' is helpful. For example, the difference between meanings 'clean' (English adjective) and 'to clean' (English verb) is quite significant. If the words with the meanings are both verbs, then I'd rather distinguish 'clean' and 'to clean' than 'to be clean' and 'clean'. I do note though that for this particular example, the Khmer words are translated as 'to be clean' (Khmer adjective) and 'to clean' (Khmer verb) (2 out of 3 - the third translation is 'to clean up'). WT:About Khmer is unhelpful on the parts of speech, as is typical of the 'about' pages. --RichardW57m (talk) 14:04, 10 November 2023 (UTC)[reply]

Done. I used to-forms as I count 14 in favor of to-forms, 3 in favor of bare forms, and no one in favor of the status quo. I followed the suggestion of User:AG202 and User:This, that and the other to put an indication of the actual POS of the lemma in the headword. This is actually a general feature of Module:headword now, so we could add similar indications to other languages where the verb lemma isn't the infinitive or where the noun lemma isn't the nominative singular (e.g. Old French, where it's the objective singular, and Sanskrit, where it's the root). If there is agreement to do this, and someone makes a list of the languages in question and which form is being lemmatized, I can make the relevant changes. Benwing2 (talk) 04:49, 14 November 2023 (UTC)[reply]
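For readers wondering what such a gloss change involves, here is a minimal sketch of rewriting "I X" glosses as "to X". It is not the code that was actually run; the function name and the small table of irregular forms are assumptions made for illustration.

```python
import re

# Illustrative only: first-person forms whose infinitive differs.
FIRST_PERSON_TO_INFINITIVE = {"am": "be"}

def to_infinitive_gloss(gloss):
    """Rewrite an "I X" gloss as "to X"; leave any other gloss unchanged."""
    match = re.match(r"^I\s+(\S+)(.*)$", gloss)
    if not match:
        # Glosses such as "it is raining" do not start with "I" and are kept.
        return gloss
    first_word, rest = match.groups()
    first_word = FIRST_PERSON_TO_INFINITIVE.get(first_word, first_word)
    return f"to {first_word}{rest}"

for gloss in ["I love", "I am redolent", "I buzz, complain loudly", "it is raining"]:
    print(gloss, "->", to_infinitive_gloss(gloss))
```

A blanket replacement like this only touches the first verb of a gloss, so multi-part glosses and impersonal verbs would still need checking by hand.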

@Benwing2 may I suggest two followups: (1) doing the same for Proto-Italic reconstructions, and (2) doing the same for glosses for Latin terms in {{m}}, {{der}} and like templates? Personally I think the same change should be done for Ancient Greek and Greek entries too, because of the clear favouring of "to" forms in these entries already, but I'll leave you to be the judge of whether such a thing is supported. This, that and the other (talk) 05:24, 14 November 2023 (UTC)[reply]

@This, that and the other: Done for Proto-Italic. I agree with changing this for glosses of Latin terms as well but this will take a bit more work as the templates can potentially be anywhere. Benwing2 (talk) 05:57, 14 November 2023 (UTC)[reply]

For Pali and Prakrit:

Nouns: lemma form is stem
Adjectives: lemma form is masculine stem
Verb: lemma form is present active 3rd person singular

I think the lemma form for Sanskrit is actually the present active 3rd person singular where it exists; however, there may sometimes be good reason to use the present middle 3s, present passive middle 3s or, very rarely, the non-standard present passive active 3s.

For Hebrew verbs, the lemma form is, to use the terminology for Ivrit, the past active 3sm. RichardW57m (talk) 09:57, 14 November 2023 (UTC)[reply]

Support —Caoimhin ceallach (talk) 17:34, 16 November 2023 (UTC)[reply]

I say, it'd be better to have the actual infinitive forms hold the whole article with a declension table instead of having e.g. "scribo" hold that. I disagree with changing "I write" to "write" or "to write", it's simply inaccurate... 82calamities (talk) 14:56, 16 February 2024 (UTC)[reply]

I agree with @82calamities on the 2nd part of his comment. My good mentor Saltmarsh will forgive me for my

Oppose. How can one support an inaccurate translation? It is especially confusing when the language does have an infinitive (with t=to love, a very different meaning from t=I love). Couldn't there be at least a note such as "This language lemmatises its verbs in the 1st person of present tense"? Something like the wikt:fr:amo#Latin note and their wikt:fr:Modèle:convention latine, wikt:fr:Modèle:convention verbe grc, also wikt:fr:Modèle:convention arabe. Thank you. ‑‑Sarri.greek ^♫ I 16:05, 16 February 2024 (UTC)[reply]

@Sarri.greek I added exactly such a note, but User:Catonif proceeded to delete it, complaining that it "wildly changes the look" of headwords. Benwing2 (talk) 18:19, 16 February 2024 (UTC)[reply]

a.. Perhaps will like at a Usage.section or Under the head? a nice thin little line? @Catonif? We have to say 'something' in greek and latin... and...

In this language verbs are presented in 1st person of present tense. Impersonals, in 3rd person.

ω @Benwing2? ‑‑Sarri.greek ^♫ I 18:37, 16 February 2024 (UTC)[reply]

To clarify my summary: "wildly changing the look" of something is not bad per se, but big changes need to undergo proper discussion and gain the community's consensus, which had not happened. I hope you didn't take the revert personally. Catonif (talk) 21:48, 16 February 2024 (UTC)[reply]

@Catonif I didn't take it personally but (a) I see it as counterproductive as we need to be conveying this info one way or another and you proposed no alternatives; (b) several people did express support for this sort of change; (c) I don't see it as such a big change as you're presenting it to be. Benwing2 (talk) 00:33, 17 February 2024 (UTC)[reply]

I agree with this suggestion. I get that it’s common practice for dictionaries to list the first principal part as the lemma, but to me, it seems like an incongruity between the listed form of the verb and its translation just makes the language more inaccessible to those not already familiar with the conventions (which is a reputation Latin already kinda has).

I think we should aim to reflect the language as it was understood by native speakers. Do we know if this was convention in antiquity? Listing the second principal part (i.e. the infinitive) as the lemma isn’t unprecedented in modern Latin grammars, see Dumesnil’s 1819 thesaurus for example. Asticky (talk) 19:47, 16 February 2024 (UTC)[reply]

@Asticky I've tried.

- Nicodene (talk) 21:04, 16 February 2024 (UTC)[reply]

Support using 'to X' as the translation by an English verb of other languages' verbs, i.e. citation form by citation form. I believe it is also possible that some languages' stative verbs might best be translated by English adjectives. --RichardW57m (talk) 14:36, 30 May 2024 (UTC)[reply]

@RichardW57m: And Welsh verbnouns by English gerunds. 0DF (talk) 16:08, 30 May 2024 (UTC)[reply]

@Benwing2: This change has introduced errors into pluit (“it is raining”) and presumably other impersonal verbs. In the quest for consistency for consistency's sake, adverse consequences seem not to have been foreseen. 0DF (talk) 10:15, 14 November 2023 (UTC)[reply]

Well spotted. Fixed. Not sure why people around here seem to have bad words to say about consistency... This, that and the other (talk) 11:41, 14 November 2023 (UTC)[reply]

@This, that and the other It's always the same two people - I wouldn't worry about it. Theknightwho (talk) 21:01, 14 November 2023 (UTC)[reply]

Please excuse the >6-month delay in my response.
@Theknightwho: Très drôle, mon camarade.
@This, that and the other: Consistency is a virtue, other things being equal; however, it is a lesser virtue than that of accuracy and, perhaps, lesser than the virtue of ready apprehensibility, too. (It is also possible to see the value of consistency intralingually, even if not translingually.) The focus of my criticism was not consistency as such, but the enforcement of it by means of automatic replace-all commands without giving due consideration to the possibility that such a process would (and indeed to the actuality that such a process did) generate errors.
0DF (talk) 14:16, 30 May 2024 (UTC)[reply]

@0DF: I don't see a new error.

However, "pluit third-singular present indicative (present infinitive pluere, perfect active pluit or plūvit); third conjugation, impersonal, no passive, no supine stem" looks like nonsense as the headword of a lemma. We want to convey something like, "pluit (Headword form is third-singular present indicative) (present infinitive pluere, perfect active pluit or plūvit); third conjugation, impersonal, no passive, no supine stem", where I've italicised the possible new text. There may be a better way of putting it - e.g. just chopping out "present indicative". --RichardW57m (talk) 09:18, 15 November 2023 (UTC)[reply]

@RichardW57m: Please excuse the >6-month delay in my response. The changes to pluit's headword line were caused, not by changes to [[pluit]] itself, but by changes to Module:la-headword. When I saw [[pluit]], this was the current version of the module, which caused [[pluit]] to display:

pluit first-singular present indicative (present infinitive pluere, perfect active pluit or plūvit); third conjugation, impersonal, no passive, no supine stem

You saw [[pluit]] when this was the current version of the module, which caused [[pluit]] to display:

pluit third-singular present indicative (present infinitive pluere, perfect active pluit or plūvit); third conjugation, impersonal, no passive, no supine stem

Since then, all lemma glosses were removed, such that [[pluit]] currently displays:

pluit (present infinitive pluere, perfect active pluit or plūvit); third conjugation, impersonal, no passive, no supine stem

Which is tantamount to ignoring the original motivating concern, voiced by AG202 and This, that and the other, that it may be confusing to have different lemmatisation conventions without parsing those lemmata. 0DF (talk) 14:16, 30 May 2024 (UTC)[reply]
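The three displays quoted above differ only in the label placed after the headword. The following sketch shows that logic under stated assumptions: it is not the code of Module:la-headword (which is written in Lua), and the function and parameter names are hypothetical.

```python
def headword_line(lemma, principal_parts, impersonal=False, show_label=True):
    """Build a plain-text headword line for a Latin verb.

    principal_parts is a list of (label, form) pairs, for example
    [("present infinitive", "pluere")].
    """
    # Impersonal verbs are lemmatised at the third-person singular,
    # all other verbs at the first-person singular.
    person = "third" if impersonal else "first"
    label = f"{person}-singular present indicative"
    parts = ", ".join(f"{name} {form}" for name, form in principal_parts)
    pieces = [lemma]
    if show_label:
        pieces.append(label)
    pieces.append(f"({parts})")
    return " ".join(pieces)

pluit_parts = [("present infinitive", "pluere"), ("perfect active", "pluit or plūvit")]
# With the label, and with the impersonal flag set:
print(headword_line("pluit", pluit_parts, impersonal=True))
# With the label removed, as in the current display:
print(headword_line("pluit", pluit_parts, impersonal=True, show_label=False))
```

Calling the function without `impersonal=True` reproduces the erroneous "first-singular" label that was first reported for pluit.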

@0DF Yes, I tried to add this but User:Catonif made a unilateral decision to remove it, ignoring those concerns. Benwing2 (talk) 22:43, 30 May 2024 (UTC)[reply]

@Catonif, Benwing2 I would like to see this added. Is there really anyone opposed? I think Catonif's objections were to doing it without discussion, not to the actual implementation. Andrew Sheedy (talk) 15:14, 31 May 2024 (UTC)[reply]

Yes, I am opposed due to the sheer amount of space that would take up. A tooltip on the other hand… Nicodene (talk) 20:34, 31 May 2024 (UTC)[reply]

Sorry for answering late. For anyone interested or confused about this event, I have an explanation at User talk:Catonif#changes to Module:el-translit, from About this incident with Latin onwards. I'd like people to understand the revert wasn't any more unilateral nor concern-ignoring than the edit itself was. Catonif (talk) 10:24, 2 June 2024 (UTC)[reply]

@Benwing2, Andrew Sheedy, Nicodene, Catonif: I have started a discussion about this specific issue in Wiktionary:Beer parlour/2024/July#Parsing the principal parts in the headword lines of Latin verbs. 0DF (talk) 03:18, 1 July 2024 (UTC)[reply]

Ordering of etymologies when some are for non-lemmas and some are for lemmas

Hi folks,

It's not infrequently the case in Bulgarian (and I suspect other languages) that the same "word" can have multiple etymologies, some of which are non-lemma forms of other lemmas, and some of which are lemmas in their own right. Do we as Wiktionary have a preference about whether lemmas come first (vertically above) non-lemmas? Is that open to judgment - e.g. if you have a non-lemma form of a common word and a rare lemma, would we still place the lemma first? To give a specific example - Bulgarian костура (kostura). It corresponds to an archaic dialectal lemma for "knife", and a non-lemma for a common type of fish.

Thanks,

Chernorizets (talk) 09:46, 30 October 2023 (UTC)[reply]

In my experience I have only seen lemmas first and then non-lemmas and this is my practice. I do see some value in having a more common non-lemma first above a rare/regional/archaic lemma. Wiktionary:Etymology#Inflected_forms makes no statement about ordering. Vininn126 (talk) 09:50, 30 October 2023 (UTC)[reply]

If a term is particularly rarely encountered and a non-lemma form particularly often, I put the latter first: أنفاق. It’s as with the senses under one part of speech, which are not necessarily sorted by frequency but some order by which they are best understood, and when a reader opens a non-lemma page he may be more likely to want to find out about the rarer lemma there than when he opens the lemma form of the more frequent term directly. Various factors are weighed. Fay Freak (talk) 11:14, 30 October 2023 (UTC)[reply]

I prefer having the lemma form first, and the non-lemma form underneath in all situations. Looks cleaner that way, and gets the information the reader is most likely to want up front. CitationsFreak (talk) 21:49, 30 October 2023 (UTC)[reply]

@Chernorizets I agree with User:Vininn126 and User:CitationsFreak about putting the lemma first always. The only exceptions I normally make are with lemmas derived from non-lemmas, e.g. lemmatized participles or pluralia tantum that are etymologically derived from a plural of a still-extant singular. In those cases I put the non-lemma first, followed by the derived lemma (both in the same etymology section if there's more than one etymology section). Benwing2 (talk) 22:04, 30 October 2023 (UTC)[reply]

@Benwing2 do you make exceptions if the lemma is far less common than the non-lemma form? I was trying to look at it from the POV of someone looking up a word they saw, e.g. a learner. If what they're looking up is most likely a non-lemma form of a word in common circulation, rather than a rare lemma, wouldn't we want to feature the non-lemma more prominently by putting it first? Chernorizets (talk) 23:04, 30 October 2023 (UTC)[reply]

@Chernorizets I generally don't make such exceptions, because it feels cleaner to me to always list the lemmas first, although like Vininn126 I could see people arguing the other direction. My thoughts are that if we're consistently putting the lemmas first, a site user will learn to look down past the rare/archaic lemmas for any non-lemma forms (although I suppose it depends on how much they know the language; they might have so little knowledge that they can't distinguish lemmas from non-lemma forms). Also it simplifies bot creation of non-lemma forms (e.g. most of the non-lemma entries in Russian were created by a bot script of mine, which consistently puts the entries below lemmas except for the exceptions I've noted above). Benwing2 (talk) 23:18, 30 October 2023 (UTC)[reply]

For concrete examples of this that I have dealt with, see: Chishan, Da'an, Lianjiang, Lienchiang, Nankang, Qishan, Xinhua, etc. I know I have encountered this problem before, but I forget where. Xingjiang is the closest thing I can think of to a "far less common lemma" in Etymology 1 (it may not even meet WT:ATTEST) with a "much more common non-lemma"- actually a misspelling- in Etymology 2. But I think there could be justifications to put the non-lemma form first, for various reasons including how common it is, some kind of natural order between etymologies, etc. --Geographyinitiative (talk) 09:48, 31 October 2023 (UTC) Modified[reply]

If an entry has multiple etymology sections, then I place those that contain a lemma higher than those that don't. If an etymology section contains lemma and non-lemma headwords, and the lemma sense is clearly derived from the non-lemma, I put the non-lemma first (e.g. часом#Ukrainian). Voltaigne (talk) 23:22, 30 October 2023 (UTC)[reply]
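The two rules just described can be written down as a small sorting sketch. The class and field names are hypothetical, and the fallback of putting lemmas first inside a section with no derived lemma is an assumption, not something stated in the comment.

```python
from dataclasses import dataclass, field

@dataclass
class Headword:
    gloss: str
    is_lemma: bool
    derived_from_nonlemma: bool = False  # e.g. an adverb from a case form

@dataclass
class EtymologySection:
    headwords: list = field(default_factory=list)

    def has_lemma(self):
        return any(h.is_lemma for h in self.headwords)

def order_sections(sections):
    """Sections containing a lemma come first; sorted() is stable, so ties keep their order."""
    return sorted(sections, key=lambda s: not s.has_lemma())

def order_headwords(section):
    """Put the non-lemma first only when a lemma in the section derives from it."""
    derived = any(h.is_lemma and h.derived_from_nonlemma for h in section.headwords)
    if derived:
        return sorted(section.headwords, key=lambda h: h.is_lemma)
    # Assumption: otherwise lemmas precede non-lemmas within the section.
    return sorted(section.headwords, key=lambda h: not h.is_lemma)

# Modelled loosely on the костура example from the start of this thread.
fish_form = EtymologySection([Headword("form of a fish name", is_lemma=False)])
knife = EtymologySection([Headword("knife (archaic, dialectal)", is_lemma=True)])
ordered = order_sections([fish_form, knife])
print([s.headwords[0].gloss for s in ordered])
```

A frequency-based ordering, as argued for elsewhere in this thread, would need a different sort key, such as an estimated usage count per section.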

Right, this is my practice too, as I mentioned above in the context of participles and pluralia tantum. Benwing2 (talk) 23:28, 30 October 2023 (UTC)[reply]

@Voltaigne funnily часом#Russian further up the page opts for the exact opposite order. What I've taken from the discussion so far is that different editors have different practices/preferences. @Benwing2 is this the kind of thing that's worth codifying via a vote, or should we simply continue to let editors exercise their discretion? I can get behind your idea that if there were a consistent pattern on Wikt, users would eventually learn to expect it, but 1) as far as I can tell, no such consistency exists today, and it would likely be time-consuming to chase down and fix the inconsistencies, and 2) I doubt the majority of Wikt users use the site regularly enough to spot something like this (but I don't have data to prove it). Chernorizets (talk) 09:35, 31 October 2023 (UTC)[reply]

@Chernorizets The issue with часом in Russian being in the wrong order may not be intentional; I bet the non-lemma form was created by my bot, which wasn't smart enough to know to order it before the lemma. If you want to create a vote, go ahead and do it but if you don't have the time, don't sweat it; a lot of similar things fall in the "best practice" category and aren't codified. Benwing2 (talk) 22:23, 31 October 2023 (UTC)[reply]

I would say there is a preference for lemmas before non-lemmas, but that is also consistent with placing our guess at the form most commonly looked up first, in the delusion that Wiktionary ought to be useful. Of course, there are some lemmas that are impossible or rare as words, but one might hope for users to more commonly look up lemmas than words. So, it may make more sense for some mere forms to precede mere words.

Rearranging terms may have deleterious consequences - sometimes editors have distinguished entries by relative position rather than by using the {{senseid}} machinery. --RichardW57m (talk) 14:32, 10 November 2023 (UTC)[reply]

@RichardW57m Yes, I have seen that, e.g. manually specified fragments, something like {{m|en|set#Noun 2}}; I've also seen {{sense|2}} etc. attached to synonyms. But IMO these are very bad practices, and they shouldn't prevent us from rearranging or adding sections or definitions. Benwing2 (talk) 07:20, 12 November 2023 (UTC)[reply]

@Benwing2 I think I might end up creating a vote with a goal of providing some recommendations around the ordering of etymologies. I use the term "recommendation" on purpose because, based on this discussion and my own intuition, "thou shalt's" don't make a lot of sense here. However, capturing the community's take on what constitutes a best practice might be worth it. Chernorizets (talk) 07:27, 12 November 2023 (UTC)[reply]

How in a proposed new regime would drawing#English be handled? It now has two etymologies (though the need for two is not obvious to me). We have templated such ing-forms to show, in a nod to centuries of Latin-derived grammatical thinking, that they are both gerunds and participles, as if that mattered to most dictionary users. But we have not gone through the noun definitions in these entries to remove "action of" and other gerund-type definitions. Nor have we followed the other course of adding 'noun' definitions that correspond to every attested sense of the verb. Nor do we have any criterion for determining which definitions should be "understood" to be included under the verb lemma, eg, draw#Verb and which require or merit a noun definition.

I'd be surprised if this was the only example of an underlying issue that would seem to make uniformity premature. In the world of templates and format rules, perhaps, things are relatively simple, but definitions and the arrangement of them usually lag far behind and usually require individual attention, which is usually not promptly forthcoming. It may be that there are languages that do not have any similar underlying issues, but I don't think English is one of them. DCDuring (talk) 14:26, 12 November 2023 (UTC)[reply]

@DCDuring It's not clear how any of that is relevant to the order of the etymologies. Please stay on-topic, instead of shoehorning your pet bugbears into every discussion. Theknightwho (talk) 14:50, 12 November 2023 (UTC)[reply]

I'm sorry that you don't see the relevance of the sequence of definitions to the sequence of etymology sections. I am probably not the best person to help you with that. I'd really like to hear some substantive response to my initial question. DCDuring (talk) 16:15, 12 November 2023 (UTC)[reply]

@DCDuring None of what you wrote related to the sequence of the definitions: you first talked about removing senses, then adding them, then raised the question of how we decide whether certain senses should be considered nouns or verbs. If you want to talk about that, start a separate thread, because all you seem to be doing here is using vague concerns that aren't relevant to push back against what you see as unnecessary uniformity, which just clogs everything up. Theknightwho (talk) 16:22, 12 November 2023 (UTC)[reply]

I thought the implications would be clear. It concerns me that they aren't.

If senses are removed they no longer have any effect on the ordering of the etymologies. Conversely, added senses would add more weight to the desirability of placing the etymology in which they occur ahead of another etymology. In the case of ing-forms, it is easy to argue that the commonly used senses are 'really' the inflected form of the verb. But if we decide that we need to duplicate reworded definitions of all the verb senses under the noun PoS and its distinct etymology section, then that section should appear before the mere inflected-form definition and etymology. DCDuring (talk) 16:33, 12 November 2023 (UTC)[reply]

@DCDuring It wasn't clear because the example you gave only gives the inflection under the verb sense, so the number of senses we have under the noun etymology is of no relevance even if we accept the idea that etymologies with more senses should go before those with fewer. This thread is also about the ordering of lemmas versus non-lemmas, which is a wholly separate issue to anything you're talking about. Come on. Theknightwho (talk) 16:43, 12 November 2023 (UTC)[reply]

You could consider being less dismissive.

Why shouldn't we use 'more-common definitions/usage' as an etymology-ordering criterion? We seem to prefer it for ordering definitions within etymologies.

I use the ing-form example as a case where simple-minded etymology-ordering rules come a cropper. Had we ever resolved whether and what ing-form definitions should be explicitly presented as noun definitions rather than implicitly referred to the verb entry, a simple-minded solution could probably work. We haven't and are not likely to in the next weeks and months.

In ing-form entries the verb-form definition usually can be considered to include implicitly definitions corresponding to every definition of the verb, most relevantly, the most common ones. Where it is separate from a noun ("gerund") etymology, the verb-form etymology section virtually never includes any substantive definition. But, if the noun (gerund) definitions, if any, are limited to those that do not directly correspond to verb definitions, then the verb-form ety. section with its redirection to the lemma should typically be presented ahead, despite the absence of definitions other than the form-of definition. DCDuring (talk) 17:12, 12 November 2023 (UTC)[reply]

@DCDuring "But, if the noun (gerund) definitions, if any, are limited to those that do not directly correspond to verb definitions, then the verb-form ety. section with its redirection to the lemma should typically be presented ahead"

Two questions:

Can you give an example of a gerund which doesn't correspond to a verb definition? This doesn't seem to make any sense, and the way you've phrased your comment suggests you don't understand the difference between gerunds and nouns.
Where did you get the idea that the verb should go first in such cases?

Theknightwho (talk) 17:23, 12 November 2023 (UTC)[reply]

@Theknightwho Where do we document our distinction between English gerunds and nouns? Do we allow gerunds to take articles? I think I've seen it claimed that English gerunds don't.

I would say that the sense 'picture' of drawing is an unpredictable derivation of draw - plenty of precedent, but no guarantee. But perhaps you would relegate that sense to the grammar of English. Part of the cure would be to merge the etymologies - a division by etymology should sunder participle and gerund, not gerund and noun! --RichardW57m (talk) 11:49, 13 November 2023 (UTC)[reply]

@DCDuring my intent is to enhance Wiktionary:Entry layout#Etymology with a few recommendations on how to order multiple etymologies, not normative requirements. We have 4300+ languages so I fully expect the recommendations to not always be applicable. The value of providing some guidance, in my mind, is to make it easier for newcomers to do a right thing, and to reduce contention between editors who believe that Wiktionary as a whole prefers ordering X vs. ordering Y. I've been in such arguments myself.

As to your particular example, I think it will be better answered by the proposed addition to EL I'll be crafting. Chernorizets (talk) 23:14, 12 November 2023 (UTC)[reply]

I fear that the guidance may not be specific enough to guide contributors in realistic problematic cases if it is intended to cover all languages. In my example case, I'm not sure we can get a consensus among English-language contributors. DCDuring (talk) 23:53, 12 November 2023 (UTC)[reply]

@DCDuring, Benwing2, Thadh, Vininn126: the vote (tagged "premature" per custom) has been created at: Wiktionary:Votes/2023-11/Ordering_of_etymologies_within_an_entry. Feel free to provide improvement suggestions on the Talk page. Thanks! Chernorizets (talk) 00:08, 13 November 2023 (UTC)[reply]

@Chernorizets FYI as this vote is written, I would vote against it as it suggests ordering by commonness, with the lemma-first principle being an afterthought. I believe the lemma-first principle should be the overriding concern except in the case where a lemma is derived from a non-lemma; ordering non-lemmas before lemmas merely due to commonness should happen, if at all, only in extreme cases, e.g. the lemma is obsolete or extremely rare. Benwing2 (talk) 01:17, 17 November 2023 (UTC)[reply]

@Benwing2 there are two recommendations made by the proposed text, the second of which is to order etymologies lemma-first by default. I don't see it as an afterthought, and the exceptions listed are the ones that we've discussed in this thread: lemmas derived from non-lemma forms, and rare/obsolete lemmas coming before common non-lemma forms. Personally, I see commonness as more intuitively justifiable and explainable than "lemma-ness", which is why it's listed first. We order word senses by commonness, and in English we also order parts of speech by commonness when a word can be used as e.g. both a noun and a verb. I see etymology ordering as another instance of this pattern.

If you have suggestions on improving the wording to emphasize that certain orderings should be rare, feel free to post those suggestions on the vote's Talk page. An "oppose" vote is, obviously, perfectly valid as well. Chernorizets (talk) 01:54, 17 November 2023 (UTC)[reply]

@Benwing2 FYI I've reworded that paragraph a bit, both for succinctness and (hopefully) clarity. Chernorizets (talk) 08:07, 17 November 2023 (UTC)[reply]

@Chernorizets Thanks, this is much closer to my preferred position. I don't have such a strong view on ordering of etymology sections but I think if all of them represent lemmas, ordering by commonness is very reasonable. An example that I can think of is "flag". The obvious meaning, the one that comes to mind first, is "a piece of cloth that flies in the air", with derived senses like a marker of various sorts. The next thing that comes to mind is "to get tired", usually in the negative (unflagging devotion, etc.), followed by specialized usages (flagstone, sweet flag). Similarly for "bay" the first thing that comes to mind is a "body of water" followed by bay leaves, bay horses, bay windows, etc. in some order. Wiktionary indeed puts the senses of "flag" in the order that I suggested, but for "bay" puts "bay leaf"/"bay tree" first, and "body of water" second. Maybe there is some rationale for this (it seems that "bay leaf" comes from an obsolete meaning "berry" that has a Germanic origin, while all the other etymologies derive from French; maybe the intention was to put the native-origin etymologies first?), but I would not object if someone switched the first two etymology sections. Benwing2 (talk) 09:55, 17 November 2023 (UTC)[reply]

If we strive to reduce duplication of content, then putting not-so-common noun "lemma" content of ing-forms before the inflected form is a bad idea when we show the gerund (noun) as having a different etymology from the inflected ing-form (formerly present participle), as we often do. Contributors may still add noun definitions that essentially duplicate verb entry content, but they may find it harder to ignore the verb page if it precedes the usually less-common ing-form definitions that do not directly correspond to verb definitions.

I don't know whether there are similar cases in other languages. If this phenomenon is unique to English ing-form entries that display two etymologies, then simply unifying the etymologies, as suggested above, would address my objection. I haven't counted how many two-etymology ing-form entries there are.

My objection to ordering less-common lemma definitions before more common inflected form definitions in the case of ing-forms within the same etymology section remains. DCDuring (talk) 13:52, 17 November 2023 (UTC)[reply]

Manual reordering is fine, if one can locate and fix bad cross-references. But bots should not be allowed to attempt such reorderings. --RichardW57m (talk) 11:14, 13 November 2023 (UTC)[reply]

There you go, down the slippery slope. I'll join you. DCDuring (talk) 13:52, 17 November 2023 (UTC)[reply]

A vote has now started to add guidance to WT:EL on ordering etymologies within an entry: Wiktionary:Votes/2023-11/Ordering of etymologies within an entry. The vote closes at 23:59 UTC on Dec. 19. Chernorizets (talk) 00:09, 20 November 2023 (UTC)[reply]

@Chernorizets: Just to be clear, in the case of часом (časom) and others like it, the above vote, as it is now, doesn't address them as they're under the same etymology section. If we'd like to address those cases, we'd need to update POS-ordering guidelines as well. Specific language is important as it limits unintended interpretations later. AG202 (talk) 01:31, 20 November 2023 (UTC)[reply]

@AG202 correct, the vote is about ordering of etymology sections, not parts of speech within the same etymology section. I've tried to make that clear, by using the "multiple etymologies" wording already in that section which refers to multiple etymology sections. If you have any suggestions on how to make that even clearer, please share them on the talk page. Chernorizets (talk) 01:41, 20 November 2023 (UTC)[reply]

@AG202 I've added a clarification note to the vote page, and I've echoed it on the vote's Talk page, to make sure there's no confusion. Chernorizets (talk) 01:51, 20 November 2023 (UTC)[reply]

Thank you! That makes things a lot clearer. AG202 (talk) 14:10, 20 November 2023 (UTC)[reply]