Wiktionary:Beer parlour/2021/April

Wikidata Lexicographical event is ongoing

Wikidata:Wikidata:Events/30 lexic-o-days 2021 is ongoing till April 15. Vis M (talk) 08:09, 1 April 2021 (UTC)[reply]

Interesting, thanks. However an earlier heads-up (from the organizers of the event) would have been nice. Many sessions have already happened, and most of them unrecorded. There's still not much overlap of contributors and dialogue across projects. – Jberkel 09:03, 1 April 2021 (UTC)[reply]

`{{bahuvrihi}}`

We already have a category Category:Bahuvrihi compounds by language. I would like a template like {{bahuvrihi}} which would categorise appropriately and automatically. If nobody opposes this template by the end of April, I'll create this by moving Template:bahuvrihi/sandbox to Template:bahuvrihi. I'm ready to implement this manually. I'm also thinking of its shortcut to be {{bv}} because bahuvrihi is from Sanskrit bahu + vrīhi. Also, I'd prefer to have it categorise as 'lang bahuvrīhi [IAST of बहुव्रीहि] compounds' instead of 'lang bahuvrihi [English of बहुव्रीहि] compounds' because {{vrddhi}} categorises as 'lang vṛddhi [IAST of वृद्धि] derivatives' and not 'lang vrddhi/vriddhi [English of वृद्धि] derivatives'. Thanks and regards. 🔥शब्दशोधक🔥 09:25, 2 April 2021 (UTC)[reply]

If you look at Category:Types of compound words by language, you'll see there are lots of different types of compound that we categorize for. Rather than having a dedicated template for bahuvrihis, which would open the door to dedicated templates for each of the other types, maybe it would work better to have a parameter such as |type= that could be added to {{compound}} (and {{affix}}, which also categorizes words as compounds) to subcategorize entries by the type of compound they are. —Mahāgaja · talk 10:08, 2 April 2021 (UTC)[reply]

@Mahagaja: Yes, that would certainly be better. But the problem is, that I'm, till now, not familiar with module editing, and find it rather complicated and difficult. Hence, I pass this task to those who might be able to do so. (Notifying Dixtosa, Kc kennylau, Rua, Ruakh, ZxxZxxZ, Erutuon, Jberkel, JohnC5, Benwing2): 🔥शब्दशोधक🔥 11:57, 2 April 2021 (UTC)[reply]

@Mahagaja, शब्दशोधक I'll add this. Benwing2 (talk) 18:05, 3 April 2021 (UTC)[reply]

@Mahagaja, शब्दशोधक Implemented. Use |type=bahuvrihi (or |type=bahu, or |type=bv) in either {{affix}} (short form {{af}}) or {{compound}} (short form {{com}}) to add prefix text indicating the type of compound and categorize appropriately for the compound type, in addition to any other categorization done. You can use |type= even with no affixes or compound parts, to get just the text and single type-specific category. The standard params are supported: |nocap=1 to make the initial text letter lowercase, |notext=1 to suppress the extra initial text entirely, |nocat=1 to disable all categorization. The types supported are in Module:compound but include bahuvrihi/bahu/bv, karmadharaya/karma/kd, tatpurusa/tat/tp, dvandva/dva, alliterative/all, rhyming/rhy, synonymous/syn, antonymous/ant. Benwing2 (talk) 02:42, 4 April 2021 (UTC)[reply]
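
The alias handling described above can be sketched roughly as follows. This is only an illustrative Python sketch, not the actual code of Module:compound (which is a Lua module); the names `ALIASES`, `CANONICAL` and `type_category` are hypothetical. The category pattern "LANG TYPE compounds" is the one mentioned in this thread for bahuvrihi, tatpurusa and karmadharaya, and is assumed here to apply to the other types as well.

```python
# Hypothetical sketch only: this is NOT the real Module:compound implementation.
# Canonical type names and their short forms, as listed in the post above.
ALIASES = {
    "bahuvrihi": ("bahu", "bv"),
    "karmadharaya": ("karma", "kd"),
    "tatpurusa": ("tat", "tp"),
    "dvandva": ("dva",),
    "alliterative": ("all",),
    "rhyming": ("rhy",),
    "synonymous": ("syn",),
    "antonymous": ("ant",),
}

# Lookup table from every accepted spelling to its canonical type name.
CANONICAL = {name: name for name in ALIASES}
for name, shorts in ALIASES.items():
    for short in shorts:
        CANONICAL[short] = name


def type_category(lang_name, type_param, nocat=False):
    """Return the type-specific category for a |type= value.

    Returns None when nocat is set, mirroring |nocat=1.
    The "LANG TYPE compounds" pattern is an assumption for types
    other than bahuvrihi, tatpurusa and karmadharaya.
    """
    canonical = CANONICAL.get(type_param)
    if canonical is None:
        raise ValueError(f"unknown compound type: {type_param!r}")
    if nocat:
        return None
    return f"Category:{lang_name} {canonical} compounds"
```

For instance, `type_category("Hindi", "bv")` and `type_category("Hindi", "bahuvrihi")` resolve to the same category, which is the point of accepting the short forms.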

@Benwing2: Thanks a lot!! But you should change 'bahuvrīhi' which appears, to simply 'bahuvrihi' because if one is 'bahuvrīhi' then some of the others should be 'tatpuruṣa', 'karmadhāraya', etc. Also, please see to CAT:Sanskrit vrddhi derivatives and CAT:Sanskrit vrddhi derivatives with a -य extension. Can you do something so that {{auto cat}} works on them? I've changed vṛddhi to simply vrddhi because if that has diacritics, then even karmadhāraya, tatpuruṣa, and bahuvrīhi should have (in cats). 🔥शब्दशोधक🔥 03:43, 4 April 2021 (UTC)[reply]

I don't support your change of vṛddhi to vrddhi in the template I created {{vrddhi}} and was never pinged on any suggestion for that change. --{{victar|talk}} 15:40, 4 April 2021 (UTC)[reply]

@Victar, शब्दशोधक Blah, Senator No strikes again. The way I have set this up, the categories are named Category:Hindi bahuvrihi compounds etc. without any diacritics, but the template itself displays the diacritics in the generated text: bahuvrīhi, tatpuruṣa, karmadhāraya. The categories like Category:Hindi bahuvrihi compounds, Category:Hindi tatpurusa compounds, Category:Hindi karmadharaya compounds long predate Victar's creation of Category:Sanskrit vṛddhi derivatives, which is the odd man out with its diacritic in it. I strongly believe it's a bad idea to include such diacritics, since they're very hard for most people to type. Since SodhakSH seems to agree, I am going to rename the categories appropriately while leaving the accents in the text generated by {{vrddhi}} (note that the template itself, thankfully, does not have a diacritic in its name). Benwing2 (talk) 15:53, 4 April 2021 (UTC)[reply]

@Benwing2: I'm OK with the category name changing, so long as vṛddhi is kept in the etymology, so it sounds like we're on the same page. @SodhakSH changed both. Is it really necessary to resort to name calling though, Benwing? @Chuck Entz, Surjection --{{victar|talk}} 16:06, 4 April 2021 (UTC)[reply]

@Victar OK, good. As for "name calling", it's just that your attitude in several instances recently has come across as if you are asserting ownership and liberum veto power over anything you have worked on. This may well not be your intention but it comes across that way, and has been grating on me. Benwing2 (talk) 16:47, 4 April 2021 (UTC)[reply]

(edit conflict) @Benwing2: SodhakSH's edit was a partially bad edit made with no discussion of the change, and I stand by my reversion of it. To show continued interest in pages, templates, and modules one creates is what any good editor would do. Wanting to first engage in a discussion before a decision on a change is made is not liberum veto, nor should it be an invitation for personal attacks. --{{victar|talk}} 17:17, 4 April 2021 (UTC)[reply]

On a related topic, I notice we have both Category:Bahuvrihi compounds by language and Category:Exocentric compounds by language, as well as Category:Tatpurusa compounds by language and Category:Endocentric compounds by language. AFAIK, bahuvrihi and exocentric are more or less synonymous, likewise tatpurusa and endocentric. IMO we should merge these pairs. I gather that technically bahuvrihi is maybe a type of exocentric compound and tatpurusa is maybe a subset of endocentric compounds, but the difference seems small enough as to be not worth making. I have no particular opinion on which terminology to keep. Normally we would prefer endocentric/exocentric as more transparent, but the Sanskrit-based terms are quite well established. Benwing2 (talk) 17:01, 4 April 2021 (UTC)[reply]

I strongly object to the use of the Sanskrit-derived terms in category names for ANY language when an English term is available. IOW, exocentric and endocentric would be preferred over bahuvrihi and tatpurusa, respectively, if they are indeed synonyms. If bahuvrihi and tatpurusa are hyponyms of the corresponding English terms then they can be subcategories of the categories that use the English term. If they cut across category names based on English terms, I would argue that they don't belong in the same category hierarchy. DCDuring (talk) 18:20, 4 April 2021 (UTC)[reply]

@DCDuring Wikipedia's page on English compound asserts, for example, that exocentric = bahuvrihi, whereas Wiktionary puts bahuvrihi as a subclass of exocentric. Bahuvrihi implies (although doesn't completely assert) that bahuvrihi is a subclass of exocentric, and that bahuvrihi = "possessive compound" (a term I've never heard). My point is that there are some subtle distinctions here that probably aren't worth making, and are unlikely to be correctly distinguished. I am inclined to use the English terminology, i.e.:

(1) Categorize under exocentric and move bahuvrihi -> exocentric;

(2) Categorize under endocentric and move tatpurusa -> determinative (a subclass of endocentric) and likewise karmadharaya -> descriptive (also a subclass of endocentric);

(3) Move dvandva -> coordinative (this type of compound doesn't really exist in English).

Benwing2 (talk) 19:09, 4 April 2021 (UTC)[reply]

That all sounds sensible. ~~While we're at it, do you know of a good synonym for dvandva (eg, secretary-treasurer, tractor-trailer, Minneapolis-Saint Paul, hawk-owl, Rodham Clinton?, Buda pest?)?~~ DCDuring (talk) 22:39, 4 April 2021 (UTC)[reply]

@DCDuring Most of those examples aren't really dvandvas, except maybe Minneapolis-Saint Paul, Budapest and tractor-trailer. In a dvandva, the referent is not the same as either part of the compound. A secretary-treasurer is in fact a secretary who is also a treasurer (hence it's a karmadharaya compound); for it to be a dvandva, the meaning of secretary-treasurer would have to be something like "executive board". Terms like Bennifer and Spederline would be dvandvas if written out rather than blended. Benwing2 (talk) 00:27, 5 April 2021 (UTC)[reply]

A Rodham Clinton is not the same as a Clinton or a Rodham. I take your point about hawk-owl and secretary-treasurer. Thanks. DCDuring (talk) 04:49, 5 April 2021 (UTC)[reply]

The "-est" and "-eth" archaic forms of English verbs

Hi! Considering the purpose of Wiktionary to be a descriptive dictionary of both archaic and modern uses, I think that there should be an option to put the "-est" and "-eth", such as "takest", "taketh", and "tookest" in the main entry "take". Looking at other languages, they also put their archaic forms in the main entry, so I think that English should too. Thoughts? --Mar vin kaiser (talk) 01:04, 3 April 2021 (UTC)[reply]

This would need to go in an inflection table. The headword line is abused enough for inflections as it is. —Rua (mew) 16:24, 4 April 2021 (UTC)[reply]

@Mar vin kaiser, Rua I agree with Rua here. I would go further and say we don't need to include inflection tables listing these archaic forms. It would add a lot of unnecessary content to every English verb page if we were to include them, and it opens up a can of worms. For example, both thou tookest and thou tookst are probably attested, but for less common verbs, these forms will be only theoretical, and for verbs invented in the last few hundred years, it would look ridiculous to add such forms (thou textest, he texteth?). Benwing2 (talk) 16:53, 4 April 2021 (UTC)[reply]

@Rua, Benwing2: Look at the entry for abandon, there's already an inflection table there, though labelled as Conjugation. --Mar vin kaiser (talk) 21:53, 4 April 2021 (UTC)[reply]

@Rua: I think for me the solution there is to only include attestable forms? Technically, we already differentiate attestable forms and invented forms, like tooketh labelled as a hypercorrection, and the attestable forms simply labelled as archaic like tookest, makest, drinketh, etc. --Mar vin kaiser (talk) 22:01, 4 April 2021 (UTC)[reply]

@Mar vin kaiser I know about {{en-conj-simple}} but I don't think it needs to exist. There was a previous discussion, in fact, about this, which I think led to the conclusion that it should be deleted. Benwing2 (talk) 00:29, 5 April 2021 (UTC)[reply]

@Benwing2: Well, my other suggestion is to do what I did in the entry for take. I put them in the usage notes, which I copied from what was done in the entry for do. --Mar vin kaiser (talk) 01:30, 5 April 2021 (UTC)[reply]

for all I care / for all someone cares / for all one cares

We have for all I care as a lemma, but of course the "I" can be freely substituted (for all you care, for all Alice cares, for all the city cares, etc.). My reading of WT:English#Phrases is that this should be moved to for all someone cares (with for all I care as a redirect). But I wanted to double check. My main concern is that there are some incoming links for some terms that specifically relate to the first-person form, e.g. meinetwegen, 愛……不……, and the entry already has some translations specific to the first-person form. So maybe if it has value as a translation hub, it can be kept, but marked as something like {{form of|en|for all someone cares}}? Alternatively, as far as one knows is an example of an entry that started in the first person form and just kept the original translation table after it was moved. Not sure if that was intentional or an oversight.

Also, any reason to prefer "one" vs. "someone"? WT:English#Phrases just says to prefer "one" for verb phrases that are usually reflexive. So I assumed "someone" was the default for all other cases. But then I saw we have in one's opinion, in one's book, to one's liking, etc. But also in someone's eyes, in someone's light, at someone's disposal. I could imagine there being a rule-of-thumb at play that, for prepositional phrases, we use "one" if the position is usually filled by a first-person pronoun and "someone" otherwise, though ngrams suggest that "to my liking" is less common than "to {your,his,their} liking"...

Other examples of entries that use non-obligatory first person pronouns: as far as I'm concerned, in my opinion (though in one's opinion also exists), over my dead body. Colin M (talk) 03:06, 3 April 2021 (UTC)[reply]

I think this should be moved to for all someone cares, which can also be attested in (at least) the past tense for all someone cared. The caring agent need not be a pronoun, it can also be used as in for everything my mother cared and so on. --Lambiam 14:31, 4 April 2021 (UTC)[reply]

@Lambiam, Colin M My preference is to use one in most cases and reserve someone for a non-reflexive object. Another way of putting this is to use one whenever it works, and someone when one doesn't work. For example lock someone up and throw away the key (you cannot use one here because it refers to a different entity than the subject) but look as if one has been dragged through a hedge backwards, call it as one sees it. IMO at one's disposal sounds better than at someone's disposal. Benwing2 (talk) 18:53, 4 April 2021 (UTC)[reply]

That sounds reasonable. The only minor downside I can see with one is that it's probably more likely to be misconstrued as something other than a placeholder. e.g. when someone sees for all of one, they might first read it as for all of 1. On a related note, while looking into the one/someone question, I found a BP thread from a couple years ago proposing some special visual treatment for placeholders. It struck me as a great idea. It's a shame nothing seems to have come of it. Colin M (talk) 23:49, 4 April 2021 (UTC)[reply]

@Benwing2, Colin M — In my opinion it is not a matter of what sounds better. There is an essential grammatical difference, which can be illustrated with a few examples. One can stand on one’s toes to reach higher or look taller. One can also stand on someone’s toes – perhaps with the same goal – but this is not only rude but an entirely different act. One can stand on someone’s shoulders, like in order to see farther by standing on the shoulders of giants, but one cannot stand on one’s shoulders. (The latter is, however, not so much a grammatical as an anatomical impossibility.) Conversely, one can do one’s best, but not someone’s best; “my aunt did her uncle’s best” is nonsensical. The point is that someone is a placeholder for just any noun phrase having a person (or persons) as its referent, whereas in these various idioms the indefinite personal pronoun one is a grammatical placeholder for personal pronouns only; there is only one way to fill the slot in “my aunt did ... best”. Since one is itself a noun phrase having a person as its referent, it can, like any noun phrase, (grammatically speaking) always fill the slot of the placeholder someone, but the converse is not possible. We should therefore reserve the use of one as placeholder for slots that require a personal pronoun, with one’s reserved for the corresponding possessives. --Lambiam 21:33, 6 April 2021 (UTC)[reply]

I think this is equivalent to the policy of using one for verb phrases that are usually reflexive and someone in all other cases. (At least in all the examples above, the reason the slot needs to be filled with a personal pronoun is that the slot corresponds to the subject of the verb, and it would be weird to say something like "my aunt did my aunt's best".) Colin M (talk) 02:45, 8 April 2021 (UTC)[reply]

As I understand the term reflexive verb, it is a verb whose object is a reflexive pronoun, as seen in “I can’t guarantee that my uncle will behave himself”. The verb to do is not reflexive, the object her best is not a pronoun, and English has no separate reflexive possessive pronouns but uses the general possessive pronouns, so the use of the attribute “reflexive” in the guideline may not be entirely clear. In any case, yes, I think that following the guideline is better than the advice to use one’s in a case like at (some)one’s disposal, since the possessing referent in a sentence such as “Three brand new Baby Rolls-Royces were at His Majesty’s disposal”^[1] is clearly not the subject. By using one(’s), important information gets lost. --Lambiam 12:10, 8 April 2021 (UTC)[reply]

Previously-deleted entries

What is our standard for recreating entries that were previously deleted out of process (i.e. did not go up for a vote at RFD)? Do we have to make an undeletion request? Imetsia (talk) 17:20, 4 April 2021 (UTC)[reply]

@Imetsia What was the term deleted out of process? IMO it is not necessary for all terms to go through RFD (we do have the concept of speedy deletion) but if there is any controversy about a deleted term, it should be undeleted (no need to submit an undeletion request, just restore it) and submitted to RFD. Benwing2 (talk) 18:47, 4 April 2021 (UTC)[reply]

@Benwing2: I was thinking howl at the moon and possibly blind rage (which I'm less convinced about). Imetsia (talk) 18:55, 4 April 2021 (UTC)[reply]

If it was clearly deleted out of process, then I don't see why any admin couldn't undelete it. DCDuring (talk) 18:51, 4 April 2021 (UTC)[reply]

@Imetsia Both were deleted long ago, and probably not because of SOP but because the content was garbage. In these cases, I would not bother undeleting but just create them anew. Benwing2 (talk) 19:11, 4 April 2021 (UTC)[reply]

@Imetsia: They weren't deleted out of process, they were speedied as useless, which is allowed by the rules. To be truly out of process, they would have to be deleted while the rfd or rfv process was already underway, and even then speedying would still sometimes be completely legitimate. To put it another way: if they would qualify for deletion if tagged with {{delete}}, then they can be speedied for the same reasons they would be tagged for. After all, it would be silly for an admin to first tag it, then delete it, or to wait for someone else to tag it. For those without admin privileges who can't see the deleted content, the howl at the moon entry consisted literally and entirely of "Howl at the moon (VERB)" followed by "1.To howl at the moon." The content for blind rage consisted entirely of the sentence "To be so enraged as to have the mind blinded of the destruction of your rage." To fix either of those you would have to basically recreate the entry from scratch, anyway, so why leave it cluttering up the dictionary?

The following deletion reasons are almost always without prejudice regarding later recreation, because they're about the specifics of the deleted content, not anything inherent in the page name itself:

Attack page or other personally identifying info
Copyright violation
Created in error (assuming the error wouldn't be covered under a deletion reason like "bad entry title")
Incomprehensible, meaningless or empty
No usable content given.

I would add to that SemperBlotto's "just too many errors", which is a variant of "no usable content given".

Even "vandalism" might just mean that someone created a page with objectionable content, but a page without the content would be just fine as an actual entry. Also, the rules that governed deletion a long time ago were in some cases quite different from the current ones.

The general rule of thumb for entries that weren't deleted through rfd or rfv: if it would be an appropriate entry aside from the fact that it was previously deleted, then the deletion usually doesn't matter. It's always a good idea to investigate the deletion to make sure that there wasn't some reason you weren't aware of to not recreate, but that's just common sense. Chuck Entz (talk) 20:57, 4 April 2021 (UTC)[reply]

clarity and clarity-adding equivalents in other languages

In this edit @2603:9000:8B0C:A8D6:71A7:D3FA:80CE:8D90 (no idea if pinging ip's works) thinks that the word "hard" is unclear in "erect, hard, having a hard-on". Do you also think so? They also think the Greek is unnecessary; yet the Greek equivalent is given in {{R:TLL}} and {{R:OLD}}, presumably because the two words were felt to be equivalent by the speakers in something like regional semantic cross-pollination, while English lacks such a word. Is this reasonable? These two dictionaries often give such Greek equivalents, as does Forcellini, L&S and basically the entire Latin lexicographic tradition. Is it allowed or desirable to add clarifying equivalents in other languages to definitions in general? As glosses, or in some other way if there's any. Ditto for similar but not quite equivalent English phrases, like reach for the stars in ad astra for instance. Brutal Russian (talk) 02:39, 5 April 2021 (UTC)[reply]

@Brutal Russian In that definition line the meaning of hard is perfectly clear. It may be redundant, but that is not a big issue. I agree that the link to the Ancient Greek is not desirable on the definition line, but if it really is a plausible case of cross pollination it is relevant to mention in the etymology. ←₰-→ Lingo Bingo Dingo (talk) 14:05, 5 April 2021 (UTC)[reply]

Clearly name categories for specific Xs, terms about X, types of X, terms used in X

A shortcoming with our categories which was brought up at RFD is that we don't always distinguish, in a way users understand and maintain, categories for 1) a set of names of Xs (e.g. prisons: Alcatraz, Rikers Island; or stars: Betelgeuse, etc), 2) terms relating to topic X (warden; or corona), 3) types of X (jail, penitentiary; or red dwarf), and 4) terms used in the slang of people who are in or study X (tree jumper). The first three are lumped together in Category:en:Prison, and the fourth is only separated into Category:English prison slang thanks to Colin M's efforts; for e.g. military terms, all four are lumped in Category:en:Military (even as some are also in Category:English military slang).
In the past, one idea was to use names like "Category:en:Set:Prisons" for type 1 (or "Category:en:Names of prisons"?), "Category:en:Topic:Prison" for 2, and "Category:en:Types of prisons" for 3 iff we care to separate it from 2. Type 4 is "Category:English prison slang". Not all of these will exist for all Xs ("English star slang" doesn't make sense, though "Astronomy jargon" might). Some of these duplicate work the Thesaurus could do (types 1 and 3 could be "hyponyms" of prison), but that's already the case with the current categories. Does anyone have better names for any of these types of categories, or an argument against categorizing any of these types? - -sche (discuss) 05:15, 5 April 2021 (UTC)[reply]
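
To make the four naming patterns floated above concrete, here is a small Python sketch. It is purely illustrative: the function name `category_names` and its parameters are hypothetical, and the patterns are just the ones quoted in this post (including the proposed, not yet adopted, "Set:" and "Topic:" prefixes).

```python
# Hypothetical helper illustrating the naming patterns proposed above.
# Nothing here reflects an adopted Wiktionary convention.
def category_names(lang_code, lang_name, singular, plural):
    """Build the four category names for a subject such as prison/prisons."""
    return {
        # 1) names of specific Xs (Alcatraz, Rikers Island)
        "set": f"Category:{lang_code}:Set:{plural.capitalize()}",
        # 2) terms relating to the topic X (warden)
        "topic": f"Category:{lang_code}:Topic:{singular.capitalize()}",
        # 3) types of X (jail, penitentiary)
        "types": f"Category:{lang_code}:Types of {plural.lower()}",
        # 4) slang used by people in or studying X (tree jumper)
        "slang": f"Category:{lang_name} {singular.lower()} slang",
    }
```

Calling `category_names("en", "English", "prison", "prisons")` yields the four names used as examples in the post, which shows that the scheme needs both the singular and the plural form of the subject as input.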

The main issue with categories is the tension between classification, which involves having a separate place for everything and placing everything in the exact place it belongs, and linking, which involves bringing things together that have something in common. The name "category" implies the first, but the implementation of categories in Mediawiki software is all about the latter. These are navigational tools for the purpose of finding other entries that have the same thing in common. That said, linking to everything at once makes for a tangled mess no one can use, so you have to have a structure.

On the one hand, you can have a completely logical system of categories that only have one or two items each- which is useless for navigation. On the other hand, you can have a disorganized mess that no one understands enough to navigate or to decide where to add new items.

Then there's the matter of connections that cut across the structure: do we keep equine and equid separate because one is an adjective and the other is a noun? It's true that an adjective isn't the name for something, but the two are more close to each other semantically than to anything with the same part of speech. Not only that, but adjectives can be used like nouns and nouns can be used like adjectives. Reality is full of complex webs of many-to-many connections, with the nature of the connections varying between one part of the system and another. Being completely rigorous and rigid in following the structure can deprive us of the richness of these interconnections. Ignoring the structure can deprive us of the ability to find anything. There again- the tension between making sense of things and finding connections.

I've worked with category structures as much as anyone, but I'm not sure how we can resolve this inherent contradiction. Chuck Entz (talk) 08:15, 5 April 2021 (UTC)[reply]

If someone wanted to make separate categories for 1, 2, 3, and 4, there's nothing preventing them from doing so, and we have examples of categories of each of those types. For 3), a category can be specified as a set category by adding Category:List of sets as a parent in the metadata module. And 1 is just a special case of set category (Category:Names). Spot-checking a few English set cats, it seems editors are generally pretty good about following the set criterion, though it varies a lot from cat to cat. Category:en:Cattle, for example, despite having a special warning at the top of the category page in large font, currently includes non-cattle-denoting terms like calving jack, cowhand, cowpat, and moly cow. That's potentially an argument for a more explicit naming scheme (e.g. Category:en:set:Cattle) - though actually, digging deeper into the edit histories, I think it might just be the case that these entries were categorized at a time when Cattle was a topic cat?

~~That said, I'm not sure if there's a process in place (in terms of naming) for the situation where someone wants to have a Cattle set category and topic category. I have yet to see such a pair.~~ Edit: found one. Category:en:Currencies is a set cat for terms for currencies and Category:en:Currency is a sibling topic cat. Discussed in this BP thread. So in this instance, I guess we've followed Wikipedia's convention of using plural for the set cat and singular for the topic cat. But it's definitely not true that all plural categories are set rather than topic: e.g. Category:en:Children, Category:en:Restaurants, Category:en:Sex positions are all listed as topic rather than set cats.

My feeling is that, as a dictionary, our focus should be more on categorizing by the properties of words qua words rather than by the properties of their referents. So in the example above, if I were to choose one of those to factor out of the mass of "prison terms", it would definitely be 4, since it's a property of the word (specifically, where is the word used and by which speakers).

If someone wants to create categories to disentangle 1-3, I wouldn't object, but I think in many cases it's not adding a lot of marginal value. When it comes to organizing specific named prisons, we're probably not going to be able to do better than Wikipedia's existing Category:Prisons category. When it comes to organizing different words that mean "prison" (or a more specific type of prison), Thesaurus:prison is going to be more reader-friendly in a lot of ways. Though that's just the prison example - the situation may well be different for some other domains. Colin M (talk) 15:50, 5 April 2021 (UTC)[reply]

Some related previous discussions that people may wish to check out:

2017 category renaming proposal (I think this is what -sche was referring to above regarding the idea of the "Category:en:Set:Prisons" style naming scheme)
2019 discussion of ambiguity between 1 and 3 above (synonyms or subtypes of X vs. names of specific Xs).
short 2020 thread about distinction between set and topic cats. I think it's very revealing that even an extremely experienced and technically proficient editor was confused by this distinction. I took it for granted because I came from Wikipedia where the set/topic distinction is a very salient and well-understood part of their categorization scheme. But we could definitely do a better job of documenting the distinction in our category descriptions and at Wiktionary:Categorization which currently has 0 mentions of set categories. (Assuming that the topic/set distinction is one we think is worth observing.)

Colin M (talk) 17:02, 5 April 2021 (UTC)[reply]

Universal Code of Conduct – 2021 consultations

Universal Code of Conduct Phase 2

The Universal Code of Conduct (UCoC) provides a universal baseline of acceptable behavior for the entire Wikimedia movement and all its projects. The project is currently in Phase 2, outlining clear enforcement pathways. You can read more about the whole project on its project page.

Drafting Committee: Call for applications

The Wikimedia Foundation is recruiting volunteers to join a committee to draft how to make the code enforceable. Volunteers on the committee will commit between 2 and 6 hours per week from late April through July and again in October and November. It is important that the committee be diverse and inclusive, and have a range of experiences, including both experienced users and newcomers, and those who have received or responded to, as well as those who have been falsely accused of harassment.

To apply and learn more about the process, see Universal Code of Conduct/Drafting committee.

2021 community consultations: Notice and call for volunteers / translators

From 5 April – 5 May 2021 there will be conversations on many Wikimedia projects about how to enforce the UCoC. We are looking for volunteers to translate key material, as well as to help host consultations on their own languages or projects using suggested key questions. If you are interested in volunteering for either of these roles, please contact us in whatever language you are most comfortable.

To learn more about this work and other conversations taking place, see Universal Code of Conduct/2021 consultations.

-- Xeno (WMF) (talk) 20:45, 5 April 2021 (UTC)

Invitation to m:Talk:Universal Code of Conduct/2021 consultations/Discussion

I am interested in hearing the input of Wiktionary users about the application of the Universal Code of Conduct, especially from the perspective of interactions on Wiktionary. Xeno (WMF) (talk) 00:15, 18 April 2021 (UTC)[reply]

how to handle modernized editions of Middle English texts

How should we handle "modernized" editions of Middle English texts that respell things to modern English norms but largely retain the grammar and lexis of Middle English? As ==Middle English==, as ==English==, as unincludable...?
A decade ago, I suggested they could be seen as "translations" of Middle English into English, like translations of German would still be English even if they used many German loans; Doremítzwr embraced this idea as it allowed for citing old terms "in" English, but I've long doubted it was the right approach. At Talk:meedfully, User:Hazarasp suggested treating the spellings as ==Middle English== {{modernised spelling of}}s. This would fit with how we allow entries for modernized/"normalized" spellings of e.g. Norse (indeed, with Norse we lemmatize those and have manuscript spellings on the side). (Other past discussions: frain, quemful, etc.) - -sche (discuss) 03:20, 7 April 2021 (UTC)[reply]

Why do you think the modernised spellings are necessary to be shown? Since the original language was not written using these, and besides the fact that Middle English has a goodly attested corpus, I do not think those deserve entries. Just my tuppence, however. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 22:23, 8 April 2021 (UTC)[reply]

All words. IMO, particularly those in formats that people read.--Prosfilaes (talk) 02:01, 11 April 2021 (UTC)[reply]

Uh, because people might come across them and want to know what they mean? Our inclusion policies should be based around the reality of how people use dictionaries, not arbitrary ideological stances. Hazarasp (parlement · werkis) 05:05, 11 April 2021 (UTC)[reply]

something wrong with the display of certain pages

Hello,

The display of templates seems partly inefficient if the page is too big (the "lua error" says "not enough memory"). Quite a strange thing since these pages are mainly text without memory-consuming elements such as images or videos. 193.54.167.180 09:48, 7 April 2021 (UTC)[reply]

See WT:Lua memory errors. The memory-consuming elements are the modules we use to display thousands of different languages correctly. —Mahāgaja · talk 10:03, 7 April 2021 (UTC)[reply]

It is not completely clear from that description what the factual problem is. Might more timely garbage collection solve the problem, or would there still not be enough memory? 50 MB is a lot; I guess this is data rather than code. Does all that data really need to be loaded simultaneously for the display of a given page, such as that for a? BTW, a kludgy workaround for users is to click [ edit ] next to the L2 language heading and look at the preview. --Lambiam 11:32, 8 April 2021 (UTC)[reply]

We can't expect any random user to know all that. If we want to cater to more than veteran Wiktionary editors, this problem should be solved once and for all. If we can't increase the memory ceiling, we'll have to split those pages (as we already do with heavy translation tables). MuDavid 栘𩿠 (talk) 02:01, 9 April 2021 (UTC)[reply]

Or perhaps write better Lua code, or get a better Lua implementation, depending on where the shoe pinches. --Lambiam 09:54, 9 April 2021 (UTC)[reply]

@MuDavid: Yes, I wholly agree that we should do something about this. We can easily halve an overly long page in twain, i.e., by dividing the number of entries by two. The namespace of the two new pages would reflect this development. In the top, of course, we need to show a pointing message using {{selfref}}. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 20:44, 9 April 2021 (UTC)[reply]

I've been editing [[a]] to use fewer modules. Sometimes, removing one or more instances of a template that seems "heavy" actually temporarily increases memory usage (until more modules are removed); OTOH, removing seemingly "light" templates like {{gloss}} and {{n-g}} seemed to markedly (and seemingly durably, after further changes and seeing / accounting for some randomness in garbage collection, etc) decrease memory usage. I do think we can cut the module usage down to where the whole page will display. - -sche (discuss) 21:24, 9 April 2021 (UTC)[reply]

Still Lua memory errors, starting in Scottish Gaelic. Maybe we can cut module usage until the whole page displays, but the current way of handling it is not durable (it's only a matter of time until more languages are added), a pain where the sun doesn't shine (unless you love this type of editing; do you?), and a disservice to the users. MuDavid 栘𩿠 (talk) 02:11, 12 April 2021 (UTC)[reply]

collapsed/minimized language headers

I am bringing this up again for our attention. On mobile pages, language headers are overtly expanded which makes the pages very long to scroll through.

A 2014 change in Phabricator caused Wiktionary headers to always be expanded. This differs from other Wikimedia projects, whose headers are collapsed, making it easy to go to the header you want. (Shared by someone from the tech community.)

I was not sure how much community agreement the change has.

One suggested solution is to set the headers to collapsed again.

A second proposal is to only collapse entries with more than, say, 5 language headers. Then, shorter pages with less than 5 headers will not be collapsed. This behavior is like the _TOC_ box which only appears when there are about 5 or more language headers in desktop view.

I wonder how many of us write on mobile, but an increasing number of people use mobile view to visit Wiktionary. This means overly long pages which are difficult to read are driving away readers and potential contributors. So it is quite an important issue. 119.56.98.229 04:54, 4 April 2021 (UTC)[reply]
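The second proposal above (collapse only pages with many language headers) can be sketched roughly as follows. This is a minimal illustration, not how the mobile front end is actually implemented: the function name `should_collapse` and the regular expression are hypothetical, and because the proposal leaves the boundary case of exactly 5 headers open, the sketch assumes "5 or more", matching the comparison with the table of contents.

```python
import re

# Matches a level-2 heading such as "==English==" but not "===Noun===".
L2_HEADING = re.compile(r"^==([^=].*?)==\s*$", re.MULTILINE)


def should_collapse(wikitext: str, threshold: int = 5) -> bool:
    """Return True when the page has at least `threshold` language headers.

    The threshold of 5 is the figure floated in the discussion; treating
    exactly 5 as "collapse" is an assumption of this sketch.
    """
    return len(L2_HEADING.findall(wikitext)) >= threshold
```

A page with only two or three language sections would then stay expanded, while a page like [[a]] would load collapsed.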

I was thinking how many pages and how many visitors are affected by this issue. We can have a look at the most visited pages, have a look using mobile view and imagine what it will look like on a phone/tablet. There should be a link to most visited at Special:Statistics. 119.56.96.203 06:41, 8 April 2021 (UTC)[reply]

Mobile page has been bad on WMF for a long time. Many current readers and editors still use desktop for full functionality, but it is a reality that more people are accessing the web through mobile. The subpar mobile experience has become such an impediment to wikiwork that complaints are filed with User_talk:Jimbo Wales. 119.56.97.153 19:00, 19 April 2021 (UTC)[reply]

Agreed that the mobile user experience is pretty awful. Sometimes pages load on mobile with all the L2 headers collapsed (user talk pages), which is preferable; sometimes they don't (our forum pages, like WT:TEA), and the page can quickly become unusably long. I don't understand why the behavior is different; it comes across as shoddy programming. ‑‑ Eiríkr Útlendi │^{Tala við mig} 18:46, 20 April 2021 (UTC)[reply]

@Eirikr: Someone from Phabricator, which is a WMF volunteer tech working group, complained that it is too much work on Wiktionary to open up the collapsed headers when there are only a few headers on a non-talk page. So all the headers are now expanded by default to meet that guy's requirement. 119.56.111.132 13:45, 20 June 2021 (UTC)[reply]

meta:Tech/Archives/2021 See April. — This unsigned comment was added by 119.56.111.132 (talk) at 13:55, 20 June 2021 (UTC).[reply]

Huh. So I was right -- it is shoddy programming. What's worse, it's intentional. :(

@Anon, thanks for the link. ‑‑ Eiríkr Útlendi │^{Tala við mig} 19:08, 21 June 2021 (UTC)[reply]

I've brought this back up for further consideration over at Wiktionary:Grease_pit/2021/June#Experience_on_mobile. ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:56, 21 June 2021 (UTC)[reply]

Oirat language code

Kalmyk is a major variety of Oirat, so Oirat is a hypernym of Kalmyk. Currently, Oirat and Kalmyk are encoded under the same ISO 639-3 code of xal. However, on Wiktionary the xal code corresponds to Kalmyk and not Oirat, so templates that use language codes display Kalmyk instead of Oirat. How can I get templates to display Oirat and not Kalmyk? RcAlex36 (talk) 17:44, 8 April 2021 (UTC)[reply]

We need to decide how we want to handle those languages before any change can be made. I expect @Allahverdi Verdizade, Victar will have useful thoughts on this. —Μετάknowledge^{discuss/deeds} 18:28, 8 April 2021 (UTC)[reply]

Don't know too much about Mongolian lects, but just glancing at it, Kalmyk should probably be an etymology-only code Oirat xal. --{{victar|talk}} 18:34, 8 April 2021 (UTC)[reply]

Changes to RFV

Both WT:RFVE and WT:RFVN are so big (in bytes) that it literally takes a lot of time to load from a mobile (maybe also any device). I have a few suggestions regarding what can be done for this.

1. RFV discussions be held on the entry's talk page only, with language name and starting month already mentioned in the discussion heading and categorised as an rfv discussion (perhaps with the help of a new template like {{rfv-discussion}} to be used when starting a new one).

2. RFV discussions be classified by month - like WT:RFE (and transcluding all the months' discussions on WT:RFVE and WT:RFVN) - and continue what we do right now. As compared to the first option, this would save time of changing many things and creating new template.

3. RFV discussions on category talk pages of the category containing rfv-tagged entries of a language. For example, RFV discussions of Sanskrit be held on Category talk:Requests for verification in Sanskrit entries with completed discussions being periodically archived into Category talk:Requests for verification in Sanskrit entries/archive 1 and so on. This is effectively splitting RFV by each language.

Please give your ideas on this, thanks. 🔥शब्दशोधक🔥 15:39, 9 April 2021 (UTC)[reply]

We have actually already agreed to split RFVN, but no one has taken the bull by the horns and actually effected the split. —Mahāgaja · talk 16:09, 9 April 2021 (UTC)[reply]

I actually like the idea of having monthly subpages that are then transcluded onto one main page, like with WT:RFE, better than the previously-agreed but unimplemented "by language type... ish" split. Subpages could continue to be transcluded onto the main page as long as there were still requests on them. I think Commons does something similar for their equivalent "Categories for Discussion" (etc) pages, transcluding/listing all days (pages being divided by day) with open requests onto one page. - -sche (discuss) 02:33, 10 April 2021 (UTC)[reply]

1 seems the most elegant to me. This is how requested move discussions and RFCs are handled on Wikipedia. A bot picks up transclusions of the corresponding template and automatically links to the discussion at a central listing of ongoing discussions (e.g. Wikipedia:Requested moves/Current discussions). That would require some work to implement, but it would presumably pay dividends over time by making it easier to close discussions (you would just need to edit the talk page rather than cutting out the discussion from the RFV page and copying it to the entry talk page). Colin M (talk) 16:08, 11 April 2021 (UTC)[reply]

RFV seriously needs to be split, in whatever possible way. @Metaknowledge's suggestion

The real solution is to close old RFVs, which is a task that often needs to be left to specialists. Let's make an effort to clean it out, by doing what we can and pinging knowledgeable people for what we can't, and then reassess.

is definitely the real solution - RFV keeps periodically updating with old requests out and new requests in, and I'm trying to do that, but that is sort-of impossible at the moment. Even some temporary solution like holding such discussions on talk pages would do for now. Thanks. 🔥शब्दशोधक🔥 04:47, 12 April 2021 (UTC)[reply]

@SodhakSH I will try to get to splitting WT:RFVN tomorrow. BTW when pinging me make sure to ping User:Benwing2 even for deletion requests; I normally only see User:Benwing pings if I log onto that account (or if I happen to pay attention to Wiktionary-related email). Benwing2 (talk) 04:08, 20 April 2021 (UTC)[reply]

Dingsidang

Wiktionary's entry for Dingsidang, the name of a town of 38,445 persons in central China, was created in June 2018 by @PerfectlyOutOfSync (who does not seem to be active today). There is very little evidence for this word in the English speaking world, let alone three use-citations from durably archived sources. Beside that entry, the times the word has ever been mentioned are in @Lieutenant of Melkor's (@CaradhrasAiguo) edit in December 2012, someone on Baidu Baike [2] ("外文名称 Hubei province Xishui County, Dingsidang Town"), me, basically copying The Lieutenant, in January 2018 [3] and in addresses [4][5] and databases. This word does NOT appear in GEOnet. The closest thing I could find to a citation was this: [6]. If anyone can find examples of this word usable for citations/quotations on Wiktionary, I request that you help us with that. I am not yet smart enough to find them.
The English language loan word 'Dingsidang' is clearly in violation of the CFI. However, the original word 丁司垱 is almost certainly citable in Chinese character media. (Here are some websites using the word indicating a potential for discovery in durably archived sources: [7] [8] [9]).
I propose that despite these facts, Wiktionary should keep the English entry 'Dingsidang' as a translation term for 丁司垱, on the basis that 丁司垱 is obviously citable in Chinese characters and that Dingsidang is the English translation term.
Is my proposal a violation of the CFI rules as they stand? (I think it is.) If my proposal is a violation, has a similar proposal been made and voted down in the past? I would like to refine the concept presented above and perhaps put it to a vote written as a general principle for translations into English.

--Geographyinitiative (talk) 19:11, 9 April 2021 (UTC)[reply]

Just English being the lingua franca of the world does not warrant English entries for all towns and thorps of the world. We should not keep this entry unless we find any attestation of the town’s name in English-language literature. If Google Books be not kind to you, then you should look for old (19th-20th c.) English books about (that region of) China. By the way, I found this Dutch attestation: [10]. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 20:19, 10 April 2021 (UTC)[reply]

Proposing consistent wording for etymology templets

I do hereby propose that the modules of the templets {{inherited}} and {{borrowed}} be edited in such wise as they generate the words “Inherited/Borrowed from […] ”, in line with the wording of its other sister templets like {{learned borrowing}}, {{calque}}, {{transliteration}}, etc. The parameters |nocap= & |notext= could be brought to use for this. I had originally discussed this with User:Benwing2 here. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 21:27, 9 April 2021 (UTC)[reply]

I propose that this motion be rejected because the templates are consistent in not displaying any texts, which would eventually be manually overwritten anyhow. Fay Freak (talk) 23:05, 9 April 2021 (UTC)[reply]

Oppose, it's a (imho) useless addition that goes against the workflow of most editors and I - and probably the majority of editors - personally never use "Inherited from", rather just "F/from". Also, the amount of work needed to change all the existing etymologies is unnecessarily enormous for such a small change. Thadh (talk) 23:20, 9 April 2021 (UTC)[reply]

@Thadh: You need not worry about “the amount of work needed […] ” inasmuch as some bot-owner (Ben or Eru) would be willing to do this if the proposal be accepted. And regarding “[A]ddition that goes against the workflow of most editors […] ”, bear in mind that “From […] ” ought to be used only when you are employing {{der}} (which is used when the etymon is not derived directly, as in inherited, loaned, or calqued terms; or when you are not sure about the exact mode of origin of the etymon). -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 19:57, 10 April 2021 (UTC)[reply]

@Inqilābī: "From" (or <) almost exclusively means "Inherited from", unless the context makes clear we are talking about a root derivation or a part of the term [Even WT:Etymology, with which I disagree on a wide array of matters, says direct descent should be marked with "from"], so what you're saying is simply incorrect. And about your statement that "[I] need not worry about 'the amount of work needed'": We have over six and a half million pages, of which I would guess over a half use the templates {{inh}}, {{der}} or {{bor}}. Even if we assume a tenth of a second is needed for the conversion, the overall update will take three days non-stop, and I'm not even talking about entries in CAT:E. And a last point (prolly should've given it at the bottom of this discussion, but oh well): the etymologies shouldn't be standardised. It's a reflection of the individualities of editors, and that's exactly why no policy is and should be made on whether to denote descent with "from" or "<". We're not machines, we live by our own sense of aesthetics, and these may differ, but that difference's what makes us human. The reason for adding automatic vocabulary with templates like {{calque}} is so that editors don't make spelling mistakes and that the text links to the glossary, nothing else. Thadh (talk) 20:25, 10 April 2021 (UTC)[reply]
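The back-of-the-envelope estimate in the comment above can be checked with a few lines of arithmetic. The inputs are the figures guessed in the discussion (6.5 million pages, half of them affected, a tenth of a second per conversion), not measured values, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimated_days(total_pages: int, fraction_affected: float,
                   seconds_per_edit: float) -> float:
    """Return the non-stop running time, in days, of a bot run."""
    total_seconds = total_pages * fraction_affected * seconds_per_edit
    return total_seconds / SECONDS_PER_DAY


# 6.5 million pages, half using {{inh}}, {{der}} or {{bor}}, 0.1 s each.
days = estimated_days(6_500_000, 0.5, 0.1)
print(f"{days:.2f} days")  # prints "3.76 days"
```

Under those assumptions the run comes to 325,000 seconds, a little under four days, which is in line with the "three days non-stop" ballpark given above.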

I know that writing “From […] ” generally refers to inheritance, but it could also be confused with {{der}}, which is widely used in our entries (either correctly to show indirect derivation or incorrectly when the editor is not sure which other templet to use). What I intend is to separate inheritances from {{der}}-using etymologies in our presentation of etymology. And I wholly disagree with your statement about the choice of editors: know well that editors do not own this project; WT is subject to standardisations, or else our bot-owner would not be drudging to clean up stuffs. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 20:59, 10 April 2021 (UTC)[reply]

As noted by Benwing2, {{bor}} had its "Borrowing from" text removed. It was actually by a vote back in 2017 that passed by a high margin, so reversing it probably requires another vote. I voted against "Borrowing from" last time, but am willing to reconsider if compelling arguments are presented and if an expert bot editor volunteers to fix the etymologies (an enormous task). I looked back at the vote and someone said that 90% of cases of {{bor}} used the template with the "Borrowing from" text (which at the time meant neither |nocap=1 nor |notext=1 was present). If the "Borrowing from" would still be appropriate in most instances, maybe changing back would be a good idea. — Eru·tuon 00:34, 10 April 2021 (UTC)[reply]

@Erutuon: Let’s use “Borrowed from” instead of “Borrowing from”. As we do not say “Inheritance from” but “inherited from”… -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 19:57, 10 April 2021 (UTC)[reply]

@Inqilābī: Substitute "Borrowed from" into my comment if you like. My comment was about any text versus no text, not one particular text. — Eru·tuon 19:20, 17 April 2021 (UTC)[reply]

Now let me explain why I am for it. This would standardise the etymology statements. Earlier there was no need for’t when we had {{etyl}}—when all kinds of etymologies were treated as one—but now that we have sundry etymology templets, it is high time that we had consistent etymology statements for all kinds of word origins. Since the default setting for the templets {{lbor}}, {{slbor}}, {{ubor}}, {{clq}}, {{translit}}, etc. are full statements, it would be inconsistent to have no etymology statement for {{inh}} and {{bor}}. As I said earlier that now we distinguish between the kinds of etymologies—today being not the days of employing {{etyl}} (which but functioned similarly to {{der}})—we cannot do without full etymology statements for inherited or borrowed terms. For, categorization is not the only thing we work on; presentation is equally important, after all, we contribute to this lexicographic project with an end to presenting the work to our readers, whom we do not expect to click the Edit button so as to find out which templets have been employed (as I would do in my earlier days here as a reader because I was not knowledgeable about linguistic diversity and language families back then and so I had to do that to see if the word were inherited or borrowed or whatever: and behold that mine was a living example to show how problematic it is to forgo full etymology statements!), or to scroll the page down to check the categories, not to mention all readers do not use a gadget other than mobile. And of course expecting interested editors to write that part manually is not the solution to this. This is my justification for this proposal. If everyone be still not convinced therewith, I would be starting a vote to decide on this. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 19:57, 10 April 2021 (UTC)[reply]
Support - see my previous idea on this, and also Eiríkr Útlendi's observation - "From" is in fact used indiscriminately with bor and inh, and the reader is expected to guess. This is one reason I have to manually check every single Romance etymology section I come across, whether it uses the correct template (this website has a very entrenched confusion between inherited and borrowed forms in Romance, including in the =Descendants= section). That idea was in order to circumvent the issue; if the issue can be fixed at its root, then this is preferable. In fact, I want to propose a more automated solution which would remove the need for bots, but I don't know if it's technically possible: if the template has the cap-initial "From" directly preceding, remove that and replace with "Borrowed from"; if a newline, insert "Borrowed from"; otherwise do nothing - the template is used in the running text of etymologies often enough. This wouldn't interfere with the editors' current habits, it would be user-friendly, and there would always be a parameter to force the text on or off when necessary. Brutal Russian (talk) 01:32, 11 April 2021 (UTC)[reply]

Templates and modules can only replace themselves with text. They can't directly write anything anywhere outside of their footprint in the wikitext. If they're called/invoked by another template or module, that limitation applies to the footprint of the calling/invoking template or module, and so on. The only way to do what you want would require enclosing the text you want to change within the template in a parameter. I suppose it might be possible to do it via Javascript, but that means running the code every time the page is viewed- having code rewriting the wikitext and saving the change would be opening a Pandora's box of interactions between code and its inputs that might be very unpredictable if it went wrong. Chuck Entz (talk) 01:56, 11 April 2021 (UTC)[reply]

Oppose per previous vote. P U C – 19:17, 13 April 2021 (UTC)[reply]

Support. Imetsia (talk) 19:56, 13 April 2021 (UTC)[reply]

Strong support; I agree with everything Inqilābī said. My only concern is that, what about those Hindi (and other, like एक) words inherited from A.Old Hindi, B.Sauraseni Apabhramsa, C.Sauraseni Prakrit and finally D.Sanskrit? Will |notext= be used there? 🔥शब्दशोधक🔥 05:45, 16 April 2021 (UTC)[reply]

@SodhakSH: I have also said about bringing back |notext= (and |nocap=) in my original post. However, thanks for reminding me about {{inh}}; I really forgot that this templet is employed manifold times in an etymology section, therefor my proposal for having the full text for {{inh}} would mean many an instance of using |notext=, which would be a tiring handly job. While the proposal about {{bor}} is alright because it is employed only once in an etymology section, we need to do something about {{inh}}. @Benwing2, Erutuon, given this predicament for {{inh}}, should we have the reverse of |notext=, that is some “|text=” to display the text, or a bot can just simply insert the text “Inherited from […] ”? Ideas? -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 07:59, 17 April 2021 (UTC)[reply]

@Inqilābī: I see no advantage to a |text= parameter, whether it's boolean or contains the actual text to put before the link. Typing inherited from before {{inh}} is much clearer and less strange than typing |text=1|nocap=1 or |text=inherited from inside {{inh}}. — Eru·tuon 19:20, 17 April 2021 (UTC)[reply]

@Erutuon: All right, I got your point. So, if I am to start a vote to decide on my proposal, then would it be fine for the vote to have the following two parts?

1) The template {{bor}} will display the text “Borrowed from […] ”; the parameters |nocap= & |notext= will be brought back for this.

2) A bot operation will insert the text “Inherited from […] ” before occurrences of {{inh}} that are only in the beginning of the etymology, and not in any subsequent occurrences of the template. [Also, this is an exception where the bot will not insert the text.]

-_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 21:54, 17 April 2021 (UTC)[reply]
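Part 2 of the proposal above might look something like the following in a bot script. This is only a sketch under stated assumptions: the function name `add_inherited_text` is hypothetical, the input is assumed to be the body of a single etymology section, and replacing an existing leading "From" (rather than only inserting text before a bare template) is an extra assumption not spelled out in the proposal.

```python
import re

# An etymology that opens with {{inh|...}} or {{inherited|...}}, optionally
# preceded by a capitalised "From " (handling "From" is an assumption here).
LEADING_INH = re.compile(r"\A(\s*)(?:From\s+)?(\{\{(?:inh|inherited)\|)")


def add_inherited_text(etymology: str) -> str:
    """Prefix 'Inherited from ' to an {{inh}} that opens the etymology.

    Later occurrences of the template, and etymologies that start with
    anything else (e.g. "Partially from ..."), are left untouched.
    """
    return LEADING_INH.sub(r"\1Inherited from \2", etymology, count=1)
```

For instance, `From {{inh|en|enm|foo}}, from {{inh|en|ang|foo}}.` would become `Inherited from {{inh|en|enm|foo}}, from {{inh|en|ang|foo}}.`, with the second template left alone.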

Support for this proposal, not for the other one in which it'll be cumbersome to place |notext= after every first {{inh}}. 🔥शब्दशोधक🔥 01:46, 18 April 2021 (UTC)[reply]

The proposal sounds like it might work (though I'm not convinced yet), but I suggest first finding out what text is most common already and what text can be most easily inserted. (That would require looking through the wikitext dump.) You should also confirm that a bot operator would be willing to commit to do the work if the vote succeeds. Not that it's required by policy but it's a good idea, and people will be less likely to vote against it simply because they think nobody will do it. I don't want to take it on myself. (To me personally it seems a lot of work for not a great deal of benefit, and I haven't done such a big and complex task before.) — Eru·tuon 22:59, 17 April 2021 (UTC)[reply]

It'd be ideal if the template {{inherited}} can somehow detect if it has been used once in a particular L3 section. If it is possible, then the first time, it should be Inherited from […] and after that, from […] (nocap-ed by default, with a |cap= parameter for making the "f" capital). 🔥शब्दशोधक🔥 10:51, 17 April 2021 (UTC)[reply]

@शब्दशोधक: Templates can't infallibly detect which section they're used in. (A module function that implements a template could get the page content and parse it and deduce which template instances might be its own based on the parameters, but that is not an acceptable solution because it would reduce performance. And it wouldn't be infallible because there can be multiple template instances on a page with the same parameters.) — Eru·tuon 19:20, 17 April 2021 (UTC)[reply]

Oppose, I prefer the flexibility of the current setup. —Mahāgaja · talk 10:54, 17 April 2021 (UTC)[reply]
Oppose, for the same reasons as Mahagaja. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 20:26, 17 April 2021 (UTC)[reply]
Oppose; I've been thinking about this, and I think it would require setting notext= or nocap= too much of the time, or (more likely) people would often fail to do this and we'd have etymologies capitalized like "Partially Inherited from Middle High German foo (Inherited from Old High German foo) and partially Borrowed from Middle Low German foo". For the individual languages that have borrowed so many terms from the ancestor languages they've inherited terms from that it's unclear without checking the wikicode whether "From x" means "borrowed" or "inherited", it's probably better to chide users to take care to spell out "Borrowed from {{bor}}" (vs "Inherited from {{inh}}"). For most pairs of languages, it is obvious whether "From x" means "borrowed" or "inherited" even if a user neglects to spell that out. - -sche (discuss) 06:38, 18 April 2021 (UTC)[reply]
@-sche: For inheritances from multiple sources as you told above, I think the bot could take a special note, for example, from the wording “partial”. Nevertheless, I do agree that implementing my (revised) proposal is going to be quite tough; in that case I like your suggestion of making it compulsory for editors to spell out the whole statement. But again many editors are not willing to do that: therefor, should I merely start a vote to make writing that part mandatory (in case no bot-runner is able to implement my proposal)? -_⸘- dictātor·mundī 20:59, 18 April 2021 (UTC)[reply]

Part 2

It's starting to look more and more like my {{bor+}} solution has been the optimal one all along: Template:inh+ and Template:bor+. Why doesn't someone simply implement that? Brutal Russian (talk) 10:59, 19 April 2021 (UTC)[reply]

@Brutal Russian: I am willing to drop my own proposal inasmuch as yours looks much better. There would be no need of adding the parameter |notext= to {{inh}} all the time after the first instance of inheritance in an etymology. Anyway, having said that, I strongly feel that {{inh}} should not be abused (that is to say, when the editor wilfully wants to not show the full statement), & therefor, I would also like to make it mandatory for all to use {{inh+}} in the first instance of inheritance. And as for borrowings, I do not feel we need to have {{bor+}} because a word is loaned directly only once in an etymology, but the templet {{bor}} itself could just display the full statement. Also, I agree with @Eirikr’s earlier post that {{com}} could display the full statement as well. -_⸘- dictātor·mundī 00:04, 20 April 2021 (UTC)[reply]

@Inqilābī I'm happy that you like my proposal, and I agree that a full statement somewhere in the text of the etymology should be mandatory. However I have confusenings about the whole "loaned only once" and "instances of inheritance", which I've expressed once already without receiving a clear answer - technically a word is inherited only once as well; practically, it's desirable not to have to go on a trip down the Middle English memory lane every time you simply want to see English words inherited from Old English, and this is even more true with borrowings. I personally don't see a difference between these two templates in this regard; moreover there's the issue of consistency. I have an improved idea: telegraph to the user whether a text is included or not using "+" and "-" in the template's name. Current templates will receive aliases accordingly - if {{com}} doesn't show it, it gets the alias {{com-}} - and down the line the current ones can be retired if necessary - the botter's task will be made much easier in that case. That manual writing of notext and nolb in the myriad other noes detracts significantly from the website's current usability score. Brutal Russian (talk) 02:25, 20 April 2021 (UTC)[reply]

@Inqilābī, Erutuon, Brutal Russian (e/c) I would support {{inh+}} and {{bor+}}. They are easy enough to implement, certainly, and I could write a bot script to convert e.g. instances of 'From {{bor|...}}' and 'From {{inh|...}}' to use {{inh+}} and {{bor+}}. My instinct is to leave the generic {{inh}} and {{bor}} alone; there was a reason they were changed not to include any text (it was cumbersome to use |notext= and the fact that {{bor}} said "Borrowing from" and not "Borrowed from" wasn't helpful, either). Also, I'm not sure what is complex about this bot proposal or the previous one to add "Inherited from ..." text before {{inh}} at the beginning of an etymology section. As for making the use of {{inh+}} and {{bor+}} mandatory at the beginning of an etymology section, that's probably not possible, as existing editors will do what they like regardless. As to whether these are superfluous, it's true that for English it's usually clear if you know something about the linguistic history of English, but for many languages, it's far from clear. For example, the large majority of words in Spanish come from Latin, but some are inherited, some are borrowed, some are semi-learned borrowings, etc. Often it's not even possible to tell which is the case without some sophisticated linguistic analysis and/or sleuthing around to see what the earliest attestation of a word is. So having "Inherited from" or "Borrowed from" displayed is quite important. The beauty of {{inh+}} and {{bor+}} is that no one is compelled to use them and they don't add any cumbersomeness to any existing templates. Benwing2 (talk) 02:41, 20 April 2021 (UTC)[reply]
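The conversion described above, turning 'From {{bor|...}}' and 'From {{inh|...}}' into {{bor+}} and {{inh+}}, could be sketched as a single regular-expression substitution. This is an illustration only, not the actual bot script: the function name is hypothetical, and it assumes that the "+" templates generate the "Borrowed from" / "Inherited from" text themselves, so the literal "From" is dropped. Only the short names {{inh}} and {{bor}} and a capitalised "From" are handled.

```python
import re

# A capitalised "From" directly followed by {{inh|...}} or {{bor|...}}.
FROM_TEMPLATE = re.compile(r"\bFrom\s+\{\{(inh|bor)\|")


def use_plus_templates(wikitext: str) -> str:
    """Rewrite 'From {{inh|' and 'From {{bor|' to '{{inh+|' and '{{bor+|'.

    Assumes {{inh+}} and {{bor+}} display the leading text themselves.
    Lower-case "from" and other templates such as {{der}} are untouched.
    """
    return FROM_TEMPLATE.sub(lambda m: "{{" + m.group(1) + "+|", wikitext)
```

Because lower-case "from {{inh|...}}" in the middle of a chain is not matched, only the opening statement of an etymology would normally change; restricting the run to particular languages, as suggested later in the thread, would need an additional filter.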

@Brutal Russian Not sure I agree with deprecating e.g. {{com}} in favor of {{com-}}. That seems a counterintuitive name as well as being longer. Benwing2 (talk) 02:43, 20 April 2021 (UTC)[reply]

@Benwing2 I thought of it as a solution to the consistency problem: currently some templates ({{lbor}}, {{calque}}) show the text and others don't. But I suppose some of these can be bot-normalised and others will require simply adding the + when the user doesn't get the expected result. Brutal Russian (talk) 03:15, 20 April 2021 (UTC)[reply]

@Benwing2: "I could write a bot script to convert e.g. instances", would highly object to that. --{{victar|talk}} 04:09, 20 April 2021 (UTC)[reply]

@Victar Maybe you see now why I complain about your tendency to object to everything? First of all, what are your reasons for objecting? Secondly, it was just a suggestion in response to a request from User:Inqilābī. However, thirdly, now that you've made me think about it more I'm more convinced it is the correct thing to do for languages that are descendants of well-known classical languages and frequently borrow from those same languages (e.g. Spanish, Hindi, Modern Greek). I can leave alone whatever are your pet languages and make the changes where it matters the most (maybe only for terms derived from those classical languages, i.e. where there is ambiguity). Benwing2 (talk) 04:19, 20 April 2021 (UTC)[reply]

@Benwing2: If you haven't noticed, this whole discussion is full of objections. I'm not objecting to {{bor+}}, however the discussion has yet to garner any wide support. I'm only objecting to your bot suggestion before you implement it without consensus first, as you oft do, and then cry foul when someone calls you out on it. --{{victar|talk}} 04:44, 20 April 2021 (UTC)[reply]

@Victar: I understand that you want to guard against people taking initiative without consulting others and seeking consensus; but in our case I believe there's nothing to guard against. My suggestion has been specifically to introduce a convenience option without affecting existing usage. A mandatory requirement to use the template or give some other etymological clarification would affect the latter, but this is a separate issue. Not surprisingly, nobody in this discussion objects to having their life made easier with an optional template; the bot job that @Benwing2 is proposing would only result in replacing "From" with "Borrowed from" (or "Inherited"), merely clarifying already-extant wordings. It would also make the template's introduction known to the editors; but it would not in principle affect any existing consensus, being merely equivalent to manually adding a clarifying "Borrowed/Inherited" to all the already-existing ambiguous etymologies, which surely nobody can object to. Brutal Russian (talk) 15:17, 21 April 2021 (UTC)[reply]

@Benwing2: Regarding my proposal of making the use of {{inh+}} and {{bor+}} mandatory whenever required, you need not worry whether that is possible or not because I would like to start a vote so as to have this determined. However, your idea of implementing the proposal for descendants of classical languages is kind of a good compromise. (But do note these categories: CAT:English learned borrowings from Middle English & CAT:English learned borrowings from Old English; so whether these can be superfluous for a language whose ancestor is not a superstrate language notwithstanding the learned loans, is more of a subjective thing). My concern is that {{der}} is being used side by side with other templates (and here I do not mean those instances where {{der}} substitutes for indirect borrowings, but only those cases where it actually can be confusing, as in derivations from a component word) in our etymologies, and I would like to have them somehow visually differentiated. That’s why I am bent upon making the use of {{inh+}} & {{bor+}} mandatory whenever it’s required. Otherwise, we would have to go for {{der+}} for the sake of the said differentiation! -_⸘- dictātor·mundī 07:32, 20 April 2021 (UTC)[reply]

@Inqilābī, Benwing2: In the case of borrowings from a parent language, the optimal solution here is using {{lbor}}. It's currently very rarely used, as I remarked in a previous discussion; but what principled reason might there be to avoid its use? If there's a possibility of replacing them in the course of implementing {{bor+}}, I say there's every reason to do it. (I must confess I was unable to understand what was meant by "so whether these can be superfluous for a language whose ancestor is not a superstrate language notwithstanding the learned loans, is more of a subjective thing", because a superstrate language is by definition a language that is not an ancestor, but also in general. I suspect you mean the same thing I discuss in the February discussion - in this case I think that if a template exists and gives clearer and more correct information, we should embrace its use). Brutal Russian (talk) 15:22, 21 April 2021 (UTC)[reply]

@Brutal Russian: What Benwing & I were discussing about implementing the proposal for “descendants of classical languages” was with reference to {{inh+}}, that is to say, such languages have the greatest ambiguity owing to the lack of full etymology statements. And in those quoted words of mine, I was justifying the necessity of using {{inh+}} for descendants of non-superstrate languages as well. Your idea of replacing all instances of {{bor}} with {{lbor}} is quite problematic given that we also have {{slbor}} (semi-learned loans)! We have to fix such etymologies manually (that is what I actually do: I use the more specific {{lbor}}, {{slbor}}, {{translit}}, etc. whenever applicable). -_⸘- dictātor·mundī 19:04, 21 April 2021 (UTC)[reply]

@Inqilābī: Could you illustrate what a descendant of non-superstrate languages is? If the descendant is a word, then the opposite of a borrowing from a superstrate language is an inherited word, but then no justification is needed. If the descendant is a language, I'm even more confused. Now when it comes to learned vs semi-learned, I think the problem is exactly in this distinction. What are the criteria for deciding for one or the other? If it's date of borrowing, then this is a very unfortunate naming scheme. If semi-learned means "borrowing inherited from an earlier stage of the language", then this is a whole new level of problematic >:3 Brutal Russian (talk) 19:29, 21 April 2021 (UTC)[reply]

@Brutal Russian: By “descendant” I meant a descendant language. And by a non-superstrate language I mean languages like the Germanic languages (Old English, Old High German, etc.) whose vocabulary was not drawn upon by their descendant languages, unlike what happened in the Romance and the Indo-Aryan family (where Latin and Sanskrit respectively are the superstrate languages). Semi-learned borrowings are distinguishable from learned borrowings, and you would have to use etymological dictionaries or historical linguistic treatises for reference. (However I have seen User:Benwing2 making automated corrections of etymologies.) -_⸘- dictātor·mundī 22:11, 21 April 2021 (UTC)[reply]

@Inqilābī: Yes, I see - I don't believe this is a possible use of the term: a superstrate language is necessarily a living language that exists in the same area and bears a higher prestige. For late Old English/early Middle English that was Norman, and for modern Welsh that's English. Latin was the superstrate on Gaulish; now French is the superstrate that's been murdering Occitan. I doubt that even Sanskrit can be properly called a superstrate, since being limited in its sphere of use it doesn't encroach on the everyday prestige of local varieties. I will wonder about (semi-)learned borrowings in a separate thread; for now it will be enough to say that correcting the etymology of a Latin borrowing into Spanish from {{bor}} to {{lbor}} is the right course of action regardless, because the difference between these two is categorical; it can be manually changed to {{slbor}} down the line. Brutal Russian (talk) 23:19, 21 April 2021 (UTC)[reply]

As a matter of fact, superstrates are not necessarily synchronic. The international scientific vocabulary, for example, is widely recognised as a superstratum. -_⸘- dictātor·mundī 23:44, 21 April 2021 (UTC)[reply]

@Inqilābī: Would you care pointing me to a website or publication that regards scientific vocabulary as a superstratum? It's not even a language! (and it is synchronic!) Brutal Russian (talk) 22:36, 22 April 2021 (UTC)[reply]

@Brutal Russian: Here. Pali and Sanskrit as liturgical languages were superstrates on Southeast Asian (& South Asian) languages. Both of these languages were not spoken natively by anyone (yes, even Sanskrit— the Indo-Aryan speakers spoke Old Indo-Aryan lects that were closely related to Sanskrit, which was nothing but a standardised, literary OIA language). I am not super-knowledgeable about Pali, but Sanskrit for sure was not contemporaneous with the South & Southeast Asian languages upon which it was a superstratum. And also, the international scientific vocabulary example was taken from w:Superstratum. -_⸘- dictātor·mundī 08:15, 25 April 2021 (UTC)[reply]

@Inqilābī: Thanks for the example, but the authors were using the word in an improper, non-technical sense, which they signalled by including it in double quotes and appending a clarifying word: these languages weren't a superstrate, but playing a "superstrate" role. Talk about caution! The Wikipedia passage has no citations, and it contradicts the definition that the article starts with, namely that a superstratum is a language. I'm still convinced that this is not a possible use of the term. Also, I'm not sure in what sense Sanskrit could have been non-contemporaneous with the languages it influenced. As far as I'm aware, it continued to be in active oral use as a literary, religious (and possibly court?) language from the time it appeared, and is actively spoken to this day. Besides, Dravidian is so chock-full of Sanskrit loans that it's only conceivable that these were transmitted through direct contact. Brutal Russian (talk) 00:34, 26 April 2021 (UTC)[reply]

@Brutal Russian: Yes, Sanskrit literature did thrive for centuries after the emergence of Middle Indo-Aryan languages, but I am not sure if it continued to be spoken afterwards: and today it is just spoken as a learned, nationalistic affectation by a few people, who want to revive the language. And borrowings from Sanskrit in Dravidian languages are obviously learned loans; besides these, though, there are ancient borrowings from Old Indo-Aryan that were naturalised in Dravidian. -_⸘- dictātor·mundī 08:31, 26 April 2021 (UTC)[reply]

I struck my votes... first, please make it clear what would happen, now that the proposal keeps changing. Making {{inh+}} and {{bor+}} seems the best, so far, but you never know if a better proposal comes the next hour. I'd request a vote to be created so as to have things cleared up: once the vote has started, its wording can't be changed. Thanks and regards. 🔥शब्दशोधक🔥 10:29, 20 April 2021 (UTC)[reply]
This is what discussions are for! Be forbearant: haste makes waste. -_⸘- dictātor·mundī 08:27, 21 April 2021 (UTC)[reply]

Oppose. — Mnemosientje (t · c) 10:00, 21 April 2021 (UTC)[reply]

List Norwegian Bokmål as a descendant of Danish on etymology pages

I believe that all Norwegian Bokmål entries should be listed as inherited from Danish, instead of directly from Old Norse, and that this should also be on descendant tables.

For instance, on the Old Norse entry illr, instead of the current

Icelandic: illur
Faroese: illur
Norwegian Bokmål: ille
Old Swedish: ilder, īller
- Swedish: illa
Danish: ilde
→ Middle English: ille
- English: ill

I believe it should be

Icelandic: illur
Faroese: illur
Old Swedish: ilder, īller
- Swedish: illa
Danish: ilde
- Norwegian Bokmål: ille
→ Middle English: ille
- English: ill

There are good reasons for this. First off, "Norwegian Bokmål" is not a direct descendant of Old West Norse, through Old and Middle Norwegian, no, it has very East Norse, and specifically Danish features.

I think one of the best examples is the personal pronouns. Personal pronouns, unlike common vocabulary, are very rarely borrowed, and are commonly considered to be a good way to show group belonging when applying the comparative method. Bokmål's personal pronouns are East Norse, and thus can not be descended from Old Norwegian.

For example, the first-person singular jeg is an East Norse variant, with the earlier e- broken into ja- (jak). The Old Norwegian form was not broken, rather starting with e- (ek), as we see in Nynorsk eg. Another example is the first-person plural pronoun, Bokmål vi. Quite early in the development of Old Norwegian (but after the separation of Old Icelandic), due to influence from the first-person plural ending -um, the West Norse vér (the East Norse form was wíR) changed to mér, which is reflected in Nynorsk me. In Bokmål however, the pronoun is vi, which can not be descended from West Norse vér, due to the i, and likewise can not be descended from Old Norwegian mér, since it starts with a v-, and not an m-. Danish however, has vi, which fits perfectly.

The reason for this is quite clear when we look at the history of Bokmål; Bokmål was not historically a spoken properly Norwegian dialect, no, its origin is simply in the standard Danish written language during the Kingdom of Denmark-Norway, and so is the majority of its vocabulary. In fact, prior to 1907 the spelling of Bokmål was identical to that of standard Danish at the time. However, over the years some wealthy Norwegians started speaking Danish as it was written, and sometimes snuck in Norwegian words. Since these wealthy Norwegians were not very good at pronouncing Danish, and since Danish (at least in its spelled form) is still very similar to the earlier properly Norwegian dialects, they started dropping sounds and changing them to match Norwegian. Thus, after 1907, spelling reforms have been conducted, to make Bokmål spelled more similarly to this spoken variant, moving away from Danish. However, it is still very clearly East Norse, and has as its basis Danish.

Because of all of this, I believe that showing it as a direct descendant of Old Norse, like in the example above, is highly misleading. Therefore, my suggestion is that it should be changed to my version, and that this should be consistent in all Old Norse pages. I also believe this applies on Bokmål pages, where instead of the Etymology header saying: "From {{inh|nb|non}}." it should say "From {{inh|nb|da}}, from {{inh|nb|non}}." In the rare case that the Bokmål word has snuck in from native Norwegian, this should of course be noted.
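
To make the proposed etymology wording concrete, here is a rough sketch of the rewrite for a single entry. It is only an illustration under stated assumptions: the helper name `insert_danish_step` is hypothetical, the Danish term has to be supplied by an editor (it cannot be derived from the Old Norse form), and entries whose etymology does not start with the plain "From {{inh|nb|non|" pattern are returned untouched so that exceptions can be handled by hand.

```python
def insert_danish_step(etymology: str, danish_term: str) -> str:
    """Insert a Danish step before a direct Old Norse inheritance.

    Only etymologies that begin with 'From {{inh|nb|non|' are changed;
    anything else is returned as-is for manual review.
    """
    prefix = "From {{inh|nb|non|"
    if not etymology.startswith(prefix):
        return etymology
    # Keep the original Old Norse template, now introduced by a lowercase 'from'.
    rest = etymology[len("From "):]
    return "From {{inh|nb|da|" + danish_term + "}}, from " + rest
```

Using the illr example from above, "From {{inh|nb|non|illr}}." with the Danish term ilde would become "From {{inh|nb|da|ilde}}, from {{inh|nb|non|illr}}."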

I also think that the Wiktionary:About_Norwegian page should be edited with this information, albeit formulated much more concisely.

I look forward to hearing what other people think about this. Mårtensås (talk) 21:27, 10 April 2021 (UTC)[reply]

I just now noticed that Norwegian Bokmål is set as a descendant of Middle Norwegian, not as one of Danish in Module:languages/data2. This must of course also be changed. Mårtensås (talk) 21:29, 10 April 2021 (UTC)[reply]

Support. I also always had the same impression on Bokmål. I thought that Norwegian Bokmål was shown as a “Norwegian” language for the sake of simplicity. You are the right person to propose the right thing. This relationship between Danish and Norwegian Bokmål is quite reminiscent of that between Dutch and Afrikaans. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 22:22, 10 April 2021 (UTC) P.S. I am not knowledgeable about Norwegian Bokmål, though. @Mårtensås, I am assuming that Bokmål is not the case of a Norwegian register with a Danish superstratum, and having a West Norse substrate? Asking just to be sure about this. -_⸘- inqilābī ^{‹inqilāb·zinda·bād›} 22:29, 10 April 2021 (UTC)[reply]

I'm not myself going to vote on this issue, since I'm not an editor of Norwegian or any continental Scandinavian language for that matter, but @Inqilābī the situation of Norwegian and Danish is nothing like Afrikaans and Dutch: Bokmål is a mixed language, of which there is serious debate whether it's Danish influenced by Norwegian or Norwegian standardised as, spelled like and influenced by Danish. Afrikaans isn't in any sense of the term a mixed language: it's just a descendant of Early Modern Dutch that split off in a colony and developed differently. Thadh (talk) 22:51, 10 April 2021 (UTC)[reply]

And that in many ways continued to be strongly influenced by Dutch up to the mid-twentieth century. ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 08:21, 11 April 2021 (UTC)[reply]

Support ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 08:21, 11 April 2021 (UTC)[reply]

I'mma ping a few active (North) Germanic editors: @Supevan, Gamren, Glades12, Ofkosinn, Krun, Mulder1982, Rua, Mnemosientje --Thadh (talk) 11:50, 11 April 2021 (UTC)[reply]

@Njardarlogar, Donnanz ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 12:05, 11 April 2021 (UTC)[reply]

Support I see no principled etymological or grammatical difference between Bokmål and Danish. Try to write down any rural oral Norwegian conversation, song or storytelling and you will see that Bokmål is completely useless. How are we supposed to categorise the etymology of any Norwegian word if no clear differentiation is made between the derivations of Bokmål and Nynorsk forms? Tollef Salemann (talk) 09:08, 27 February 2023 (UTC)[reply]

Abstain. I'm not completely against it, but I'd prefer combining Norwegian Bokmål and Norwegian Nynorsk into just Norwegian and have the various different forms marked as being one or the other. I'm leaning toward BM and NN simply being two (highly divergent, probably) orthographies for the same language. Mulder1982 (talk) 12:50, 11 April 2021 (UTC)[reply]

No, we have been down that road before, Bokmål (nb) and Nynorsk (nn) are separate languages, even though they have common words, inflections often differ. There is a third (unofficial) version, Riksmål, which is closer to Danish. I have previously thought that Bokmål is a compromise between Danish, which was official in Norway until independence and for some years afterwards, and Nynorsk. If I remember correctly Bokmål was previously called Riksmål, but has diverged from it. DonnanZ (talk) 13:37, 11 April 2021 (UTC)[reply]

@Mulder1982: While, as mentioned, this has been debated before, I will just point out that from a linguistic perspective, treating Norwegian Bokmål and Nynorsk as the same language while treating Danish, Norwegian and Swedish as separate languages would be rather arbitrary. Contemporary Bokmål is not much closer, if at all, to Nynorsk than either Danish or Swedish. --Njardarlogar (talk) 16:18, 12 April 2021 (UTC)[reply]

Well, the difference of course being that Bokmål and Nynorsk are not languages but orthographies, and so are Riksmål and Samnorsk. At best, they are codifications but ultimately they are one language: Norwegian. And yeah, I saw the debate on merging the two and I was really surprised that it did not go through. Mulder1982 (talk) 20:36, 12 April 2021 (UTC)[reply]

Abstain. I don't have much of an opinion about this. Norwegian is my native language, but I am not much of a historian and don't really have too much knowledge about this subject. But I do have a question / concern: as an active contributor who relies solely on Norwegian dictionaries, I have to say these dictionaries do not reflect what you are presenting here. The Bokmål Dictionary (Bokmålsordboka) and the Norwegian Academy Dictionary (Det Norske Akademis Ordbok - NAOB) state that most Nordic words just descend directly from Old Norse. For example, the word for "ball", which in Norwegian is ball, comes from Old Norse bǫllr, and so on. Danish is not mentioned here, though according to this it should say Norwegian ball > Danish bold > Old Danish ball > Old Norse bǫllr? The only time these dictionaries mention Danish is when a word presumably comes directly from Modern Danish to Modern Norwegian, such as (Modern) Norwegian kroppert from (Modern) Danish kroppert according to NAOB, or from Old Danish to Modern Norwegian/Middle Norwegian, such as (Modern) Norwegian kummen from (Old) Danish kummen (which is equivalent to (Modern) Danish kommen, which the (Modern) Norwegian word does not stem from).

I guess my question is, as an active Norwegian contributor who is not much of a historian, should I always assume a word comes directly from Danish according to this suggestion, as opposed to from Old Norse? Even though dictionaries will only state Old Norse in their etymologies, and what about exceptions, how am I supposed to know? Just curious how it would work in practice. Supevan (talk) 14:07, 11 April 2021 (UTC)[reply]

It's a good question, and not one that I can fully answer. The proposal is focused on the nature of Bokmål as a written norm that has become increasingly spoken. For a lot of words you can suppose that a lot of cognate lexemes existed in the speech of Norwegians, independent of Danish, and those lexemes would have descended from Old Norse. But the written language of the urban upper classes was modelled on Danish. Perhaps a toned down version of the proposal, where only the forms that are better explained as inherited from Danish are given as descendants from Danish, is preferable. ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 14:16, 11 April 2021 (UTC)[reply]

Oppose. It's bad enough that we treat them as two languages, but we shouldn't remove them even further from each other. It's also totally wrong to claim that Bokmål as a whole is derived from Danish. It contains many inherited words that clearly do not come from Danish. bein, øy, vite, haug just to name a few. —Rua (mew) 14:35, 11 April 2021 (UTC)[reply]

@Rua Yes, these exist in Bokmål, but they are certainly not inherited. They can not be inherited, since Bokmål, as I said above, has its origin in the standard Danish written language. Rather, they were borrowed into Bokmål from Nynorsk, due to the discontinued Samnorsk policy, which aimed to merge Bokmål and Nynorsk. Influence from native Norwegian does not, however, change the origin of Bokmål, which again is in standard Danish. Mårtensås (talk) 18:34, 11 April 2021 (UTC)[reply]

Two of your examples are wrong. Bein is a Samnorsk alternative to Bokmål ben, while vite is a modified version of Bokmål/Riksmål vide. The Nynorsk vite, meanwhile, is an e-infinitive form of vita or veta (the a-infinitive was often used as a split infinitive in Nynorsk before 2005, like the å-infinitive in the Trøndersk dialect or -a in Totning). Bokmål has no such stuff because Bokmål is derived from Danish. Furthermore, Bokmål has a Danish declension of bein (see beinene). Tollef Salemann (talk) 08:56, 27 February 2023 (UTC)[reply]

Oppose: Extra complication for little benefit. --{{senator_no|talk}} 19:39, 11 April 2021 (UTC)[reply]

Language codes in names of reference templates

I want to create some new templates and I'm unsure what the currently adopted views on this are. Example: {{R:la:OHCGL}}. Do I only need to use them if I'm guessing the acronym in the name could come up as the name of another reference down the line (wow, arbitrary)? Do I skip them if I use the format "Author YEAR"? Or is it desirable to always use them - in that case I consider it a necessity to create such... longcuts! to all the old templates as well so that the user doesn't get pissed off on a regular basis unless they remember exactly what arbitrary selection of templates for a given language uses the language code. Brutal Russian (talk) 01:40, 11 April 2021 (UTC)[reply]

Most dictionaries and other frequently cited books will have an acronym commonly in use by academics, whilst journal papers are chiefly referenced by the author name and year. --{{victar|talk}} 03:55, 11 April 2021 (UTC)[reply]

@victar This is also useful to me, but what's the stance on using language codes (R:la:)? Brutal Russian (talk) 11:23, 11 April 2021 (UTC)[reply]

Per what -sche was saying, I add a language code to every template applicable. I don't use English language sources, so the question of adding en: or not doesn't apply in my case. --{{victar|talk}} 03:37, 12 April 2021 (UTC)[reply]

@victar Sorry for pinging again, but this leaves me a bit unsure: aren't the codes to specify which language the reference *describes*/ is used as a reference for, and not which language the reference itself is in? Brutal Russian (talk) 18:16, 12 April 2021 (UTC)[reply]

You're correct, it's the former. --{{victar|talk}} 18:38, 12 April 2021 (UTC)[reply]

I don't know that there's a consistently-followed policy, in part because many reference templates were named 12+ years ago, before we even consistently used "R:"! I'd consider it common sense that if there is a naming conflict, i.e. there are two different "Smith:1996"s, disambiguation should be used. The more likely it is that a name has multiple referents, the better it is to include a language code, e.g. for short/common last names + year like "Lee:2002" or "Costa:2003" or short acronyms like "GG" or "CD" ("Century Dictionary"? "Cambridge Dictionary"?). It's probably best if we start to always include a language code (e.g. all Armenian templates include language codes, even ones that would probably be adequately distinct by name alone), though English could continue to be an exception, since it's the language of the Wiktionary and the most-covered and probably most-edited-in language on this Wiktionary, and has other privileges too like being sorted ahead of other languages in entries. I think it'd be fine to create redirects from the language-code-containing versions of existing templates, like T:R:de:Duden, for consistency. - -sche (discuss) 14:36, 11 April 2021 (UTC)[reply]
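
As a rough illustration of the "disambiguate on conflict" rule of thumb described above, the following sketch proposes a language-coded name only when a short name is shared by more than one reference. Everything here is hypothetical: the function name, the input format (pairs of language code and short name), and the sample data are assumptions made for the example, not existing templates or policy.

```python
from collections import Counter


def template_names(refs):
    """Return proposed template names for (lang_code, short_name) pairs.

    A language code is included only when the short name collides with
    another reference in the list; unique names stay as plain 'R:name'.
    """
    counts = Counter(short for _, short in refs)
    return [
        f"R:{lang}:{short}" if counts[short] > 1 else f"R:{short}"
        for lang, short in refs
    ]
```

With the made-up input `[("en", "CD"), ("pt", "CD"), ("de", "Duden")]` this yields `R:en:CD`, `R:pt:CD` and `R:Duden`. Always including the code, as suggested later in the same comment, would simply mean using the first branch unconditionally.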

Two years ago we held the vote Wiktionary:Votes/2019-06/Language code into reference template names on whether the language code should always be incorporated into the names of reference templates. It failed. The de facto situation is that if you make a new reference template, you can decide whether or not you want to include the language code. Personally, I prefer to include the code, provided the reference covers just a single language. Some references of course cover multiple languages, in which case I omit any code. But that's just my personal preference, and others doubtless do it differently. —Mahāgaja · talk 21:07, 11 April 2021 (UTC)[reply]

Senks yuo for the replies, they gave my existential anxiety over this a calming pat. Brutal Russian (talk) 18:13, 12 April 2021 (UTC)[reply]

I think we should always include language codes (if applicable). If a reference was created with a language code, it should not be renamed, like it happened here (Special:Diff/61682157/61698999) (reason: "Like another temaplate"?). – Jberkel 08:53, 15 April 2021 (UTC)[reply]

I restored the "pt:" to that template; it shouldn't have been removed, especially given how likely it is for that particular name to refer to something else as well. (I left a redirect so the existing links continue to work.) - -sche (discuss) 21:16, 16 April 2021 (UTC)[reply]

Add xno to Module:languages/data3/x

wikidata:Q35214 EdwardAlexanderCrowley (talk) 11:02, 11 April 2021 (UTC)[reply]

We treat Anglo-Norman as an etymology-only variety of Old French, so xno is at Module:etymology languages/data. —Mahāgaja · talk 14:29, 11 April 2021 (UTC)[reply]
This reminds me that I or someone should resume work on adding little --comment pointers in the places where intentionally unincluded codes would go, to discourage re-addition by unaware users without discussion (as happened with Twi/Fante) and pre-answer queries like this. - -sche (discuss) 14:41, 11 April 2021 (UTC)[reply]

According to nl:Categorie:Woorden_in_het_Anglo-Normandisch, there are some unique words. EdwardAlexanderCrowley (talk) 16:54, 11 April 2021 (UTC)[reply]

The category Category:Anglo-Norman_language failed RFD, though. I'm not interested in Anglo-Norman, but I believe there're some words from Norse. EdwardAlexanderCrowley (talk) 17:02, 11 April 2021 (UTC)[reply]
The existence of some words in Anglo-Norman that aren't in other dialects of Old French doesn't prove that AN is a separate language. The various national varieties of English have words not present in other dialects too, but they're all still English. —Mahāgaja · talk 21:08, 11 April 2021 (UTC)[reply]
And those distinctly Anglo-Norman words can be categorized in Category:Anglo-Norman Old French, to which the nl.Wikt category could be linked. :) - -sche (discuss) 00:31, 12 April 2021 (UTC)[reply]
Thanks, I know where it should be. EdwardAlexanderCrowley (talk) 02:20, 12 April 2021 (UTC)[reply]

Line numbering coming soon to all wikis

From April 15, you can enable line numbering in some wikitext editors - for now in the template namespace, coming to more namespaces soon. This will make it easier to detect line breaks and to refer to a particular line in discussions. These numbers will be shown if you enable the syntax highlighting feature (CodeMirror extension), which is supported in the 2010 and 2017 wikitext editors.

More information can be found on this project page. Everyone is invited to test the feature, and to give feedback on this talk page.

-- Johanna Strodt (WMDE) 15:08, 12 April 2021 (UTC)[reply]

"Related terms" confuses editors

One of the very commonest user errors is to add related topics under "Related terms", e.g. souvlaki under kebab. But it's only meant for etymological relations. Can we do anything about this, e.g. pick a better name for the subheading? Equinox ◑ 00:15, 14 April 2021 (UTC)[reply]

It's longer, but also more explicit -- what about Etymologically related terms? Or Derivationally related terms? ‑‑ Eiríkr Útlendi │^{Tala við mig} 01:03, 14 April 2021 (UTC)[reply]

I don't like any of those suggestions. It's not really a problem, merely a convention, and all dictionaries have those. The only true solution is an editing interface where people have a list of fields they can fill, and each field (e.g. 'Related terms') can have a brief description accompanying it. —Μετάknowledge^{discuss/deeds} 05:30, 14 April 2021 (UTC)[reply]

Fair enough, re: not liking the suggestions. I'm not overly fond of them myself. :D

Re: editing interface, that sounds like a terminology tool we looked at years ago, something back-ended by MediaWiki software but using a forms-based input interface much like you describe. Let me see if I can find that... Meh, not seeing it, it might not exist anymore. At any rate, while I do think it would have the potential to be a major improvement, I don't anticipate any such UI for the EN Wiktionary any time soon. ‑‑ Eiríkr Útlendi │^{Tala við mig} 00:23, 15 April 2021 (UTC)[reply]

I once suggested making Related terms a subsection of Etymology, which would make it clearer, but that got shot down as many people feel Etymology sections are already too bloated for the top of the page. OTOH someone suggested moving Etymology to the bottom, in which case perhaps my suggestion would be more palatable. —Mahāgaja · talk 07:31, 14 April 2021 (UTC)[reply]

Renaming this section to Paronyms. Which is however more restrictive, since “related terms” has been used for terms which have influenced the term semantically or have been influenced by it in contradistinction. Fay Freak (talk) 11:40, 14 April 2021 (UTC)[reply]

I think this would be a bad move. 99.99% of readers will not know what "paronym" means. I dislike headings like "Hypernyms" and "Hyponyms" for the same reason, though at least in those cases a perceptive reader might be able to guess what they mean based on the listed terms. Colin M (talk) 14:53, 14 April 2021 (UTC)[reply]

I also can't agree to using "paronyms" in any heading. It might be the technically fitting term, but it's also extremely rare and unlikely to be understood by our users. ‑‑ Eiríkr Útlendi │^{Tala við mig} 00:23, 15 April 2021 (UTC)[reply]

When we do use lesser-known terms, one option would be to have a dotted underline and pop-up explanation on hover, like some sites do to explain acronyms (though I don't know how well this works on tiny mobile screens). Equinox ◑ 15:08, 14 April 2021 (UTC)[reply]

We're a dictionary. If they don't know what it means, they can look it up. —Mahāgaja · talk 16:06, 14 April 2021 (UTC)[reply]

But it's a convenience feature. People could also "look up" all the linked words even if we removed all the links, but it would be far more tedious. Equinox ◑ 16:51, 14 April 2021 (UTC)[reply]

Agreed that we shouldn't use obscure technical jargon in our headings. Making users manually look up terms just to understand the UI is appallingly bad design. ‑‑ Eiríkr Útlendi │^{Tala við mig} 00:23, 15 April 2021 (UTC)[reply]

I think you misunderstood what I wrote. Equinox ◑ 01:58, 15 April 2021 (UTC)[reply]

@Equinox: That's what I get for multitasking -- I conflated a couple things in my response to you. Apologies for my confusion.

Re-reading and responding to what you yourself wrote, and un-conflating my response, if we are to use any less-common terms in our headings, we should make it as easy as possible for users to understand what those terms mean. The <abbr> element might be one approach, but this doesn't render at all on mobile, and if the word within the tags is also a link, the tooltip pop-up doesn't work in a regular browser either. I have a quick-and-dirty mock-up at [[User:Eirikr/Scratchpad]] for those interested.

Given that, perhaps linking through to either a full entry or to a glossary or appendix entry might be a better way to do this.

That would seem to lead us back to the idea of using templates for our headers, much as the FR Wikt does, but I dimly recall there's general opposition to this idea? Maybe opinions have changed / are changing? ‑‑ Eiríkr Útlendi │^{Tala við mig} 04:16, 15 April 2021 (UTC)[reply]

This is not workable because it increases the Lua memory usage and page-loading times, and makes the source code less readable without gain. The only advantage of templatizing all headers would be the hypothetical option to manipulate all headers modularly, without changing individual entry source texts. Fay Freak (talk) 11:52, 15 April 2021 (UTC)[reply]

@Fay Freak: Your concern about Lua memory is appropriate, but there's no reason these header templates would have to use Lua. Templates that don't use Lua should have zero impact on Lua memory usage (unless the back-end devs have done something really dumb).

Arguably, the question raised in this thread moves the ability to change all headers in a uniform fashion from the "hypothetical" to more of an actual concrete use case. And depending on how the templates are designed, they need not be all that difficult to read. We've already grown accustomed to arcana like {{lb}} or {{rfe}}, and I hazard that most of us no longer bat an eye when skimming over these in the editor view. </devil's advocate> ‑‑ Eiríkr Útlendi │^{Tala við mig} 22:11, 15 April 2021 (UTC)[reply]

What about "Etymological cognates"?--Tibidibi (talk) 02:10, 15 April 2021 (UTC)[reply]

Technically this is not different from what the cognates in etymology sections are. Fay Freak (talk) 11:53, 15 April 2021 (UTC)[reply]

Indeed, which has been a problem all along: cognates vs related terms. This is another reason these two sections need to be somehow together - this will also be a hint to the editors. Although I do think some name like Intra-language cognates would still be nice. Needs more imagination and less traditional terminology (even less if it's Greek). Brutal Russian (talk) 13:08, 15 April 2021 (UTC)[reply]

This is definitely an issue. I'm not opposed to renaming the header "Etymologically related terms" or something. If we put it under the Etymology header (if we move the Etymology header down below the definitions), people are liable to be confused about whether any (etymologically) related terms go in the list (seems plausible!) or only terms in the same language (what's actually the case), though. I might question the utility of the header at all: could we present the information in some other way? (Another thing that confuses many readers is our use of {{sense}} to briefly summarize what definition antonyms pertain to; users perennially think it is supposed to summarize the definition of the antonym itself; moving antonyms under senses is resolving that.) - -sche (discuss) 21:25, 16 April 2021 (UTC)[reply]

simple.wiktionary uses "Related words", which is cute Yellow is the colour (talk) 21:40, 18 April 2021 (UTC)[reply]

Editing on Wiktionary is very structured, which sets a very high bar to entry for editors. Does anyone want to try WMF's Growth tools project to see if editors can be guided? The French WT uses it, if I remember correctly. 119.56.101.73 17:06, 30 April 2021 (UTC)[reply]

Using references in Usage templates

So given that these templates are rarely seen around here (which I daresay is unfortunate), I have disquisitions to perform in their relation. Firstly, the name implies that they belong in the Usage section, which I understand isn't true - the language that seems to use them the most, Hungarian, places them anywhere. Comment on this if necessary. Next, Latin entries suffer from an excessive outpouring of creative writing skills on the part of some editors: cue obex. I'm about to finally start fixing this with templates, and it seems like a good idea to include references - even I sometimes wonder where what I've written even came from. Additionally, a reference can be used as a place to tuck some of those essays into. Yet I've never seen this used in practice, and it requires adding a reference list. Is this an ok approach? Is there another language that I should take a look at, one that uses Usage templates in an exemplary way? Brutal Russian (talk) 19:16, 14 April 2021 (UTC)[reply]

Tangut transliteration

Do we have a standardizable way to transliterate the Tangut language? The only source I can find here is from this old discussion on a user page, which didn't have sufficient information: https://en.wiktionary.org/wiki/User_talk:Octahedron80/archive_1#Tangut_transliterations

The STEDT database, which is frequently consulted for Sino-Tibetan languages, "mostly" follows the transliteration scheme of Prof. Gong Hwang-cherng (see STEDT page). Other online databases such as this one, 古今文字集成 also use the Gong scheme.

In the very small number of Tangut entries, there seems to be a mix of different transliteration schemes. Some entries use the Gong scheme while others use a different one with numeric superscripts. Any source on what that could be?

Also, we need more Tangut experts. --Frigoris (talk) 13:51, 16 April 2021 (UTC)[reply]

Alternative forms heading placement

Recently @J3133 kindly corrected my edit by moving this header above Etymology, citing {{WT:EL}}. I had seen that article suggesting this order, but as far as I can see it doesn't mandate an order, and merely suggests one. Now I've been consistently going against that suggestion for one major reason: if there's more than 1 lemma to a language, placing the header at the top will by default apply those alternative forms to all the lemmas. Of course this is a problem, because more often than not this is incorrect. If we place the header at the very top for some lemmas, and in some other place for others, this will lead to uncertainty and confusion. Additionally, confusion arises out of the fact that "Also see" finds itself at the top of the page as well - I've wondered on numerous occasions why some alternative forms are tucked away in there. The only viable approach is to place Alternative forms consistently somewhere below the topmost header of the lemma, which is currently Etymology. In my experience it finds a perfect place there, reflecting eg etymological spellings, as well as the phonetic developments of the immediately following Pronunciation heading; and it leaves no doubt about which lemma the forms belong to. If we end up moving Etymology down, I'm not sure whether it should be kept near one or the other of these two headings, but in any case it needs to be below whatever topmost heading we decide on. I believe the problem I'm describing is a simple oversight and I don't foresee anyone disagreeing with this approach; besides, it doesn't go against the status quo, but only improves on it - so I expect this won't require a vote to implement. Brutal Russian (talk) 22:19, 19 April 2021 (UTC)[reply]

@Brutal Russian: I have discussed this elsewhere: see Special:Permalink/62146346#diff. -_⸘- dictātor·mundī 06:35, 20 April 2021 (UTC) Also see the format in this entry. -_⸘- dictātor·mundī 06:41, 20 April 2021 (UTC)[reply]

@Inqilābī: Thank you, at first I thought this wasn't exactly what I was talking about, but in fact it exemplifies the problem perfectly. Suppose pila Etymology 2 ends in the level 4 heading Alternative forms (because there are no descendants or else). In the current approach it can also be a level 3 heading of Etymology 3! and the only clue the reader has as to whether it belongs to the former or the latter lemma is the slight difference in font size that most people won't associate with levels even after months, years of using the website! It was this exact page that made me realise just how much of a need there was for placing the header in a fixed, unambiguous location. Hell, it's possible to leave open the level 4 header option for it, as long as the level 3 header is placed below what's currently the topmost lemma header for the vast majority of lemmas, Etymology. I think this is plain old common sense, don't you? Brutal Russian (talk) 14:34, 20 April 2021 (UTC)[reply]

I too have long been bothered by our modal layout -- for instance, the POS header is either L3 or L4 depending on the number of etyms, but the etymology header is always L3. Why would we do this? Why not always have the POSes at L4, consistently across all entries?

Anyway, as you note, this modal approach invites confusion as well when it comes to alternative forms. Do they apply to all etymologies? Only some? Is this section at the top, or somewhere else? Unnecessary confusion. For Japanese in particular, etymologies and alternative forms are usually tied to the pronunciation of a given term, and any given pronunciation might have multiple spellings, while any given spelling might have multiple pronunciations. Learning to read and write this language is a doozy, right up there with the borderline-insane English spelling conventions.

Establishing a consistent place to put alternative forms would be helpful, both to readers and to editors. From my perspective, I would argue for placing these under the etymology headings (inasmuch as this is still our current criterion for splitting up an entry). ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:18, 20 April 2021 (UTC)[reply]

@Eirikr, Inqilābī: Using 'Etymology' to separate unrelated homographs is less than ideal - do you have an alternative suggestion? Perhaps 'Group' would be better. I'd like the term 'sense', but we use that for 'meaning', which comes below the POS in the hierarchy. RichardW57 (talk) 07:14, 5 May 2021 (UTC)[reply]

For Japanese at least, using etymology as an entry's organizing criterion works pretty well -- especially for any term lemmatized at a kanji spelling, as kanji often have multiple readings (distinct pronunciations), each with distinct derivations. Grouping all of these together would be a nightmare, and I can't come up with a good visualization for how etymologies and pronunciations would be presented in a combined layout if each sense line has its own derivation and pronunciation. As an extreme example, see Japanese 柄. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:53, 5 May 2021 (UTC)[reply]

As an organising principle, etymology works. But the heading gets silly if you have a couple of unknown etymologies. I think an appropriately vague set of headings would be 'Notion 1' and so on. 'Etymology' would then be an optional subordinate entry under that. If we formalised this, I think we'd want to allow 'Other notions' for unrelated entries, such as non-lemmas in inflected languages. RichardW57 (talk) 18:59, 5 May 2021 (UTC)[reply]

@RichardW57, changing structure and heading terminology is likely to encounter some resistance, and should probably be discussed in a more focused fashion than this current thread. If this is something you want to pursue, I'd recommend starting a new Beer Parlour thread. ‑‑ Eiríkr Útlendi │^{Tala við mig} 21:09, 5 May 2021 (UTC)[reply]

⇒ @RichardW57 "Notion" (much less "Sense") doesn't intuitively seem like a desirable entry-organizing principle for a dictionary, which deals in headwords/lemmas, each with possibly multiple senses. "Notion" doesn't seem like a linguistic (or properly definable) term at all, and it's not apparent how or why it should be understood to be different from "sense". So if I was going for a generic concept-with-associated-term to organize entries by, I'd pick the actual thing we organize them by: headwords, these being individual word entries, defined practically as whatever ends up being treated as a single word.—This recalls the problem of breaking up one headword as multiple POS, which I think must be solved together with/as part of the topmost header/grouping problem. Brutal Russian (talk) 20:41, 6 May 2021 (UTC)[reply]

@Brutal Russian Yes, 'notion' does pretty much mean 'sense'. However, Wiktionary has chosen to use sense for the smallest element of a word's meaning. Being vaguer, 'notion' seems appropriate for the collection of related meanings. These collections are what one sees being given distinguishing numbers in many dictionaries. 'Headword' as a heading doesn't work. And, of course, we already have the headword itself at the top of the page. And indeed this is related to the discussion you reference, which is caused by the phenomenon of zero-derivation. RichardW57 (talk) 21:31, 6 May 2021 (UTC)[reply]

@RichardW57 Ok, I think we need to take terminology seriously now, seeing as we're a dictionary. "Notion" is not a linguistic concept but a cognitive one; it means "conception, idea". Words are linguistic signs, and like other signs, don't have notions, ideas or conceptions - people do; signs refer to notions via denotation. Dictionaries list lexemes, which are represented by headwords/lemmas, i.e. citation forms. On Wiktionary, these latter ones appear immediately below the Part of Speech section - each Verb, Noun, Phrase etc is immediately followed by the headword line. What's found at the very top of the page is the name of the page - {{also see}} is used to suggest page names similar to the one the user is viewing. The current approach of dividing an entry by Etymology works well because the existence of syntax allows one Etymology-lexeme to appear as different parts of speech, but this creates the problem of distinguishing them that I refer to above. When you say "zero-derivation", the underlying assumption is that different entities are being created. My current thinking is that this assumption is the core of the problem. If we treat these as one entity and relegate the job of syntax to syntax, the problem of POS multiplication should disappear. But major word classes should still be distinguished for languages that clearly distinguish verbs from adjectives, for example; for those that do not, they shouldn't. Brutal Russian (talk) 11:48, 7 May 2021 (UTC)[reply]

For navigation, I think there's a lot to be said for adding section numbers to the section headers in the body, rather than just the table of contents. RichardW57 (talk) 07:14, 5 May 2021 (UTC)[reply]

I agree in principle, and this would improve page navigation and usability. But I suspect some of the reason for why we don't is that it becomes a data-maintenance mess -- any manually-applied numbering scheme is prone to disorder, as multiple disparate editors add and remove and reorder sections. If we could find a way to have those section numbers applied automatically (perhaps using CSS even?), we might be onto something tenable. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:53, 5 May 2021 (UTC)[reply]

I agree that the numbering would have to be automatic. RichardW57 (talk) 18:59, 5 May 2021 (UTC)[reply]
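To make the "automatic" part of the above concrete, here is a minimal sketch of how hierarchical section numbers could be derived purely from heading levels, so that nothing is stored in the entry source and nothing goes stale when editors reorder sections. It is only an illustration: the function name `number_headings` and the `(level, title)` input format are assumptions made for this example, not an existing MediaWiki or Wiktionary feature, and a real implementation would more likely live in CSS or site JavaScript.

```python
def number_headings(headings):
    """Return headings prefixed with automatically computed section numbers.

    `headings` is a list of (level, title) pairs, e.g. (2, "English") for an
    L2 header. This input format is hypothetical, chosen for the sketch.
    """
    if not headings:
        return []
    base = min(level for level, _ in headings)  # shallowest level is depth 1
    counters = []  # counters[i] holds the current number at depth i + 1
    numbered = []
    for level, title in headings:
        depth = level - base + 1
        del counters[depth:]            # leaving a subsection resets deeper counters
        while len(counters) < depth:    # a skipped level shows up as 0
            counters.append(0)
        counters[depth - 1] += 1
        numbered.append(".".join(str(n) for n in counters) + " " + title)
    return numbered


entry = [
    (2, "English"),
    (3, "Etymology 1"),
    (4, "Noun"),
    (3, "Etymology 2"),
    (4, "Verb"),
    (4, "Alternative forms"),
    (2, "Latin"),
]
for line in number_headings(entry):
    print(line)
```

Because the numbers are recomputed from the heading structure on every render, adding, removing or reordering a section never requires touching any other section.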

@Brutal Russian: As I understand it, {{Also see}} isn't related to the entries on the page. Rather, it allows one to enter a word as well as the user's keyboards permit, and then get to the relevant page in one more click. For example, to help the less savvy find Sanskrit देव (deva), one might add the Devanagari there to the list for the page deva. Perhaps you'd like a lead such as 'similar spellings'? RichardW57 (talk) 07:14, 5 May 2021 (UTC)[reply]

"placing the header at the top will by default apply those alternative forms to all the lemmas."-- I have used the alternative forms header in exactly this manner- see my edit here: [11]. --Geographyinitiative (talk) 18:22, 5 May 2021 (UTC)[reply]

⇒ @RichardW57, Geographyinitiative I wasn't doubting the purpose of "also see" which is how you described it; but I was pointing out that in single-language entries, a bunch of "also see" forms at the top immediately followed by a bunch of Alternative forms inevitably confuse some readers as to the purpose of and difference between them, and even those who thought they understood what goes where start doubting themselves. Now that you mention it, however, it occurs to me there's yet another extremely similarly-named level 4 section See also for stashing semantically-or-something-like-that-related words. In that light I'd say changing the text of {{also see}} to You might be looking for would be an improvement; and seeing as "Related terms" confuses editors, changing the name of See also in concert with it - one to "Semantically related", the other to "Etymologically related" - might be a good idea. Honestly I'd love to see some sort of auto-generated semantic/etymological net, possibly stashed onto a single page like we do with WT:WSI, as I hinted at in a previous discussion: cognates vs related terms. Brutal Russian (talk) 20:41, 6 May 2021 (UTC)[reply]

My interpretation of "also see" at the top of the page for English entries is that words that use the identical letters (or for Chinese characters, characters that have some core connection- simplified, traditional, variant etc) are put at the top of the page, regardless of capitalization, apostrophes, dashes and the like, which many non-elite educated native English speakers have historically considered irrelevant and remove when they use the word (case in point: Ronald Reagan's quote on the Xian page). Therefore, the "also see" function gives them a chance to realize that they may be missing material on the page they are on that is relevant to their information search. The alternative forms section or sections on a page tell you about etymologically identical and hypothetically identically pronounced words, or words that are VERY closely connected somehow, but they are not necessarily in the "also see" section because they may be spelled with different letters. On the Jixi page, you can see one alternative forms section covering both the Heilongjiang and Anhui locations, but then the Heilongjiang location has its own independent alternative forms section for alternative forms only known (so far at least) to apply to that location and not to the Anhui location. --Geographyinitiative (talk) 21:36, 6 May 2021 (UTC)[reply]

pos= argument glosses format

What grammatical glossing format do you prefer in {{m}} etc.? Capital glosses separated by hyphens (quid nōmen tibi est) or periods? Lowercase separated by periods (barbā tenus sapiēns)? Or do you think glosses actually hurt readability and most readers won't be able to figure them out? If so, I don't see why not introduce a separate page where these will be standardised and their meaning explained. I think glossing is immensely useful - it's a bog standard tool for linguists not without reason, since combining lowercase translations with lowercase spelled-out glosses would make the text difficult to read, not to mention the space. I don't believe writing them in full will be useful to as many people as will find them distracting and superfluous. Therefore in my opinion it's glosses or nothing, although in those cases where only one or two words need to be glossed as part of a running text, I do sometimes find myself spelling it out. Brutal Russian (talk) 23:30, 19 April 2021 (UTC)[reply]

@Brutal Russian When you say "gloss" I see you are referring to using pos= to indicate the exact grammatical use, similar to interlinear glosses. Normally in linguistic texts such grammatical glosses occur only in interlinear format to avoid distraction. I personally think inline grammatical glosses hurt readability and should be used sparingly and only when they really contribute a lot. In a case like barbā tenus sapiēns, if you look up tenus you can see immediately it's a postposition and similarly that sapiēns is an adjective. barbā is the only possible candidate worthy of grammatical glossing and even then you can figure it out easily from the page barba that it must be ablative singular. I prefer to say in etymologies 'Literally, "wise as far as the beard"' or use lit= to indicate the absolutely literal translation in conjunction with {{m}}, e.g. barbā tenus sapiēns (literally “wise as far as the beard”). I think only when you get languages significantly more obscure and grammatically complicated than Latin (maybe e.g. Navajo or Sanskrit) should you resort to grammatical glosses on every word, and there I'd prefer lowercase, spelled out. Note that in practice, even Navajo etymologies don't include grammatical glosses, see e.g. áłtsé góneʼ yéigo daʼahijoogą́ą́ʼ, awáalya yaa áhályání or (a real doozy) chidí naaʼnaʼí beeʼeldǫǫh bikááʼ dah naaznilígíí. Benwing2 (talk) 04:38, 20 April 2021 (UTC)[reply]

Miscategorisation of Dutch hyphenated words as multiword terms

The current categorisation of Dutch terms that contain hyphens, such as Oost-Indiëvaarder, puts them in Category:Dutch multiword terms because hyphens are considered word separators. This produces the absurd situation where the compound zee-eend is considered a multiword term, but its superseded spelling variant zeeëend is not. I think that hyphens should not be considered as word separators at all for Dutch. Any comments? @Thadh, DrJos, Morgengave, Mnemosientje, Rua, MuDavid, Lambiam, Alexis Jazz
@Mahagaja, Jberkel Perhaps a similar change is needed for German as well? ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 18:52, 20 April 2021 (UTC)[reply]

It should clearly not be automatic but be considered on a case-by-case basis. If the hyphen is there merely to avoid adjacent letters being interpreted as a digraph (cacao-overeenkomst but visovereenkomst), it is simply one word. If the structure of a compound A-BC is A-B + C (Tweede-Kamerlid), then the suggestion that it is a multiword term is misleading. Other cases (schrijver-dichter, dertien-in-een-dozijn, Elzas-Lotharingen) look to me as multiword as writer-poet, dime a dozen and Alsace-Lorraine. --Lambiam 19:56, 20 April 2021 (UTC)[reply]

Sure, the category can be added manually whenever that is found appropriate. I'd consider dertien-in-een-dozijn a multiword term, but I'm not sure I'd consider the other two multiword terms. In Dutch grammar they are formally considered samenstellingen met gelijkwaardige elementen, whatever that is worth. In matters such as stress they resemble univerbations. ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 16:36, 21 April 2021 (UTC)[reply]
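For anyone who wants to see the proposal spelled out, the following is a rough sketch of the logic discussed above: hyphens stop counting as word separators for Dutch, and the category is still added by hand where it is judged appropriate. Everything named here (`DEFAULT_SEPARATORS`, `SEPARATORS_BY_LANGUAGE`, `MANUAL_MULTIWORD`, `is_multiword`) is hypothetical and exists only for this example; the real categorisation is done in Wiktionary's Lua modules, whose actual configuration is not reproduced here.

```python
# Assumed default behaviour: both spaces and hyphens separate words.
DEFAULT_SEPARATORS = frozenset(" -")

# Proposed per-language override: for Dutch ("nl"), only spaces separate words.
SEPARATORS_BY_LANGUAGE = {"nl": frozenset(" ")}

# Terms categorised by hand after case-by-case review (illustrative entry only).
MANUAL_MULTIWORD = {("nl", "dertien-in-een-dozijn")}


def is_multiword(term, lang):
    """Return True if `term` should be categorised as a multiword term."""
    if (lang, term) in MANUAL_MULTIWORD:
        return True  # a manual decision overrides the automatic rule
    separators = SEPARATORS_BY_LANGUAGE.get(lang, DEFAULT_SEPARATORS)
    return any(char in separators for char in term)


for term in ("zee-eend", "zeeëend", "Oost-Indiëvaarder", "dertien-in-een-dozijn"):
    print(term, is_multiword(term, "nl"))
```

With this arrangement zee-eend and zeeëend are treated alike, while terms such as dertien-in-een-dozijn only land in the category through an explicit editorial choice.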

I'm having a déjà vu but I'm not sure why. Alexis Jazz (talk) 20:41, 20 April 2021 (UTC)[reply]

@Lingo Bingo Dingo: I had the same idea in March, only to find out it's not new. –Austronesier (talk) 16:22, 21 April 2021 (UTC)[reply]

I see. Has that been implemented yet? ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 16:36, 21 April 2021 (UTC)[reply]

Suggested Values

From April 29, it will be possible to suggest values for parameters in templates. Suggested values can be added to TemplateData and will then be shown as a drop-down list in VisualEditor. This allows template users to quickly select an appropriate value. This way, it prevents potential errors and reduces the effort needed to fill the template with values. It will still be possible to fill in values other than the suggested ones.

More information, including the supported parameter types and how to create suggested values: [1] [2]. Everyone is invited to test the feature, and to give feedback on this talk page.

Timur Vorkul (WMDE) 14:08, 22 April 2021 (UTC)[reply]
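As a rough illustration of the announcement above, this sketch builds a TemplateData-style description for a made-up template and prints it as JSON. The template, its `pos` parameter and the listed values are all hypothetical, and the key name `suggestedvalues` is an assumption based on the TemplateData documentation; please check the linked pages for the exact format and the supported parameter types before relying on it.

```python
import json

# Hypothetical template description; only the overall shape matters here.
template_data = {
    "description": "Example template with a part-of-speech parameter",
    "params": {
        "pos": {
            "label": "Part of speech",
            "type": "string",
            # Assumed key name: values offered in the VisualEditor drop-down.
            # Editors would still be free to type any other value.
            "suggestedvalues": ["noun", "verb", "adjective"],
        }
    },
}

rendered = json.dumps(template_data, indent=2, ensure_ascii=False)
print(rendered)
```

The printed JSON is the kind of text that would be placed in a template's TemplateData block on its documentation page.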

Allow Statistics section in WT:MOS

This is already used in surname entries like Bonacci. I would also welcome revisions to my edit to Fibonacci. EdwardAlexanderCrowley (talk) 03:55, 28 April 2021 (UTC)[reply]

Lightening entry A's burden

What I propose at Talk:A#Galizionario-Style_Breakdown_Needed:

Having pages like Wiktionary:About Rohingya provide links to entries like A with templates like {{list:Latin script letters/rhg}}.

In that way, the entry A might not have to be burdened with entries on letters for different languages.

Thanks for considering. --Apisite (talk) 08:56, 29 April 2021 (UTC)[reply]

There was a proposal to do something like that, but there were too many options and the whole thing fizzled out. If we focus on Lua memory limitations, we might get away with something like:

Main space entries for non-translingual letters of alphabetic writing systems (and nouns denoting them) that can be encoded as a single Unicode character shall contain no module invocations, whether direct or via templates. Entries for letters may be stored elsewhere. Letters of alphabetic writing systems may be treated as translingual.

We may have to expand that restriction later, but that should do to get started. It won't cure the problem of the ugliness of pages of redirects, but it should cure the Lua problem for letters. Unfortunately, it's short items in general that have the problem. We already hit memory limits on 'mi' and 'se'. Perhaps the real answer is that there should be a power to reserve pages for English, with other languages being restricted to soft redirects, so a#Pali could only contain a soft redirect to something, e.g. a/Pali. I'm not sure how to word that, but I think something like this is the only real solution in at least the short term. RichardW57m (talk) 13:24, 10 May 2021 (UTC)[reply]

The vote Wiktionary:Votes/2020-07/Removing letter entries except Translingual (simplified from the original proposal) still exists and we could start it at any time if there seems enough support for it. Thadh (talk) 13:31, 10 May 2021 (UTC)[reply]

Definitions/Example-sentences for Unicode Symbols in entries

Unicode symbols are mainspace entries, which means the search bar can "hit" a Unicode entry, as we all know. A long time ago, they were moved over from the Appendix namespace.

At the top of these pages there are infoboxes for the Unicode symbol(s). Do they function merely as an illustration (like pictures in word entries) or as part of defining the symbol? Laying out the actual usages of the symbol falls under defining these symbols.

The reason I am raising this is that, many times, only the most prominent Unicode symbol is defined and given examples, such as its use as punctuation. The other symbols on the page are often not mentioned other than in infoboxes. If they are merely illustrations, fine. But if they are part of the entry, their use even in more technical contexts, in mathematics and so on, should be listed, in the spirit of being a comprehensive dictionary.

I do suggest an exclusion for emojis, however. They are pictorial, expressive Unicode, and may have ambiguous agreed-upon meanings. There's already an Emojipedia, which really is an "emojitionary". Thoughts? 119.56.101.73 17:22, 30 April 2021 (UTC)[reply]