Add lexeme language codes ha-arab, sux-latn, sux-xsux, gsg, tlh-piqd, tlh-latn
Closed, Resolved, Public

Description

This ticket is to add language codes for the representations of lemmata and forms in a number of languages:

  • Hausa in the Arabic script (Ajami), since Hausa is one of the focus languages of the Abstract Wikipedia;
  • [ ] Manchu, as a language written in the Mongolian script whose code is less controversial than mvf;
  • Sumerian (in its cuneiform and Latin-script representations), because we have L1 and it, out of all lexemes, should at least be modeled correctly;
  • German Sign Language, by analogy with the American equivalent having one and the British one being requested in another ticket; and
  • Klingon (in its pIqaD and Latin forms), so that those processing lexemes in that language can better process it.

The first three have some publicly available resources listed on this page, while the last two at least have non-endangered communities using them.

@Amire80 @jhsoby What do you say?

Event Timeline

Mahir256 renamed this task from Add lexeme language codes ha-arab, mnc, gsg, tlh-piqd, tlh-latn, sux-latn, sux-xsux to Add lexeme language codes ha-arab, mnc, sux-latn, sux-xsux, gsg, tlh-piqd, tlh-latn. May 11 2021, 5:21 AM
Mahir256 moved this task from Backlog to Wikidata (lexemes) on the Language codes board.

Hausa in the Arabic script (Ajami), since Hausa is one of the focus languages of the Abstract Wikipedia;

OK.

Manchu, as a language written in the Mongolian script whose code is less controversial than mvf;

OK, and mvf is OK, too. Is there such a task for mvf? I thought it's added already.

Sumerian (in its cuneiform and Latin-script representations), because we have L1 and it, out of all lexemes, should at least be modeled correctly;

OK.

German Sign Language, by analogy with the American equivalent having one and the British one being requested in another ticket; and

OK.

Klingon (in its pIqaD and Latin forms), so that those processing lexemes in that language can better process it.

Bzzt. Not sure. I'd like a second opinion. The -latn part is probably OK, but I'd love a few clarifications:

OK, and mvf is OK, too. Is there such a task for mvf? I thought it's added already.

For a reminder of the controversies involving a code for Mongolian in its native script, see T215032 and (described as having a negative effect on one participant) T137810.

Bzzt. Not sure. I'd like a second opinion. The -latn part is probably OK, but I'd love a few clarifications:

Absent other circumstances (some fan schism or something, I don't know), my contention is yes.

  • Aren't there any copyright issues?

In this case, this Wiktionary category and its equivalents for other languages would need to be nuked (it's not as if the Star Trek franchise made Klingon CC-BY-SA either).

  • What's the actual need for it?

There are terms in the language, for which the creation of lexemes (beyond the seven present) may well be possible, and when included they ought to be coded properly.

@Amire80: Still ok with mvf?

I'm OK with it, I really don't see any reasonable objections.

I'd just like to publicly say that if mvf gets added, the data I've added for Mongolian in Mongolian script should not be changed, because I did not mean Peripheral Mongolian.

If anything, the data I added is more likely to be Halh Mongolian (which can be written in Mongolian script too), but I have not been able to find a single source I can use to determine whether a word is Halh or Peripheral Mongolian, and therefore can only say that it's some form of Mongolian in Mongolian script.

Manchu, as a language written in the Mongolian script whose code is less controversial than mvf;

OK, and mvf is OK, too. Is there such a task for mvf? I thought it's added already.

Don't add mnc, as it will likely be added to core MediaWiki soon.

Esc3300 updated the task description.

Change 698594 had a related patch set uploaded (by Mbch331; author: Mbch331):

[mediawiki/extensions/WikibaseLexeme@master] Add lexeme language codes gsg, ha-arab, mvf, sux-latn, sux-xsux, tlh-latn, tlh-piqd

https://gerrit.wikimedia.org/r/698594

Change 698594 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Add lexeme language codes bfi, enm, gsg, ha-arab, mvf, pwn, sux-latn, sux-xsux, tlh-latn, tlh-piqd

https://gerrit.wikimedia.org/r/698594

Works on test for: bfi, enm, gsg, ha-arab, mvf, pwn, sux-latn, sux-xsux, tlh-latn, tlh-piqd
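
For anyone processing these codes downstream, here is a minimal sketch of splitting each one into its language part and its optional four-letter script part (for example, `ha-arab` is Hausa in the Arabic script). The helper names `split_code` and `group_by_language` are hypothetical and are not part of WikibaseLexeme. The sketch only assumes the `language` or `language-script` shape seen in the codes listed above.

```python
# Codes reported as working on test in this ticket.
CODES = [
    "bfi", "enm", "gsg", "ha-arab", "mvf", "pwn",
    "sux-latn", "sux-xsux", "tlh-latn", "tlh-piqd",
]


def split_code(code):
    """Split a code into (language, script); script is None if absent.

    Hypothetical helper: assumes at most one hyphen-separated suffix,
    which must look like a four-letter script subtag.
    """
    language, _, script = code.partition("-")
    if not script:
        return language, None
    if len(script) != 4 or not script.isalpha():
        raise ValueError(f"unexpected script subtag in {code!r}")
    return language, script


def group_by_language(codes):
    """Map each language part to the list of script parts seen for it."""
    grouped = {}
    for code in codes:
        language, script = split_code(code)
        grouped.setdefault(language, []).append(script)
    return grouped


if __name__ == "__main__":
    for language, scripts in group_by_language(CODES).items():
        print(language, scripts)
```

Grouping this way makes it easy to see which languages in the list have more than one script representation (here, `sux` and `tlh`).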

Nikki renamed this task from Add lexeme language codes ha-arab, mnc, sux-latn, sux-xsux, gsg, tlh-piqd, tlh-latn to Add lexeme language codes ha-arab, sux-latn, sux-xsux, gsg, tlh-piqd, tlh-latn. Jun 25 2021, 5:24 AM
Nikki updated the task description.