Wiktionary:Beer parlour/2013/October

Now that we're using Lua, pretty much all of these templates can be deleted, right? -WF

If there were a system to actually keep the Lua module updated and loaded with all labels. One consequence of using Lua is that our topical label system needs to become a completely closed one to prevent the Lua module from indefinitely expanding. Perhaps Moore's Law will make that possible, albeit at the cost of excluding users of older computers from effective use of Wiktionary as Lua modules get really big. The data for labels (Module:labels/data) is now about 78K, much smaller than the nearly 800K of the data for Language, but not small. DCDuring TALK 11:35, 3 October 2013 (UTC)
Yes these should be deleted. I've been deleting them when I feel like it, but there are a lot so it has taken some time. There are also still some transclusions and several templates have redirects that have transclusions themselves. These redirects, if still needed/in use, should be migrated to the module as well in the form of "aliases". —CodeCat 12:29, 3 October 2013 (UTC)[reply]
Anything with zero transclusions can go immediately, per CodeCat above. Mglovesfun (talk) 15:39, 8 October 2013 (UTC)[reply]
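
The "aliases" idea mentioned above (folding template redirects into the label data instead of keeping them as separate pages) can be sketched roughly as follows. This is only an illustration in Python, not the actual contents of Module:labels/data, which is written in Lua; the label names, aliases, and function names are all hypothetical.

```python
# Hypothetical label data: canonical label -> display text plus aliases.
# These entries are made up for illustration and are not taken from
# Module:labels/data.
LABELS = {
    "botany": {"display": "botany", "aliases": ["bot", "Botany"]},
    "informal": {"display": "informal", "aliases": ["inf"]},
}


def build_alias_index(labels):
    """Map every canonical name and every alias to its canonical label."""
    index = {}
    for canonical, data in labels.items():
        index[canonical] = canonical
        for alias in data.get("aliases", []):
            index[alias] = canonical
    return index


ALIAS_INDEX = build_alias_index(LABELS)


def resolve_label(name):
    """Return the display text for a label or alias, or None if unknown."""
    canonical = ALIAS_INDEX.get(name)
    if canonical is None:
        return None
    return LABELS[canonical]["display"]
```

With a table like this, a former redirect such as a hypothetical "bot" would need only one extra alias entry rather than a template page of its own.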

Abbreviations, acronyms and initialisms in languages other than English

Regarding the 'decision' (or rather the discussion that was favorable to it) to change the abbreviation, acronym and initialism headers to a 'part of speech' header, are there any languages that we want to exempt from this? It feels a bit awkward for me to change the header and templates in languages I don't speak, such as, say, Greek. Mglovesfun (talk) 12:21, 4 October 2013 (UTC)

Changing the header to a more appropriate part of speech would need some knowledge of the language. —CodeCat 13:29, 4 October 2013 (UTC)[reply]

Use of usex lang equals en to format example sentences

I oppose edits that turn the simply formatted English ''Example sentence.'' into {{usex|lang=en|Example sentence.}}. I wonder how many editors feel the same. Example edits: diff, diff, diff. --Dan Polansky (talk) 16:36, 4 October 2013 (UTC)[reply]

I'd certainly like to hear what advantage this brings to human contributors and users. Does it bring advantage to folks who repackage Wiktionary? Does it make life easier for Wikidata? DCDuring TALK 16:57, 4 October 2013 (UTC)
I support such edits. It clearly marks such sentences as example sentences. —CodeCat 17:45, 4 October 2013 (UTC)[reply]
Should we clearly mark definitions with {{def|lang=en|A domestic animal that meows and looks like a cat.}}? --Dan Polansky (talk) 17:47, 4 October 2013 (UTC)[reply]
I did propose something not so different a while ago, if you remember. —CodeCat 18:07, 4 October 2013 (UTC)[reply]
I support these edits. Templatising usexes (and anything else) makes them more flexible. Suppose we ever want to create categories listing every entry with usexes in a given language. If every usex is templatised, this will be as easy as adding a line to the template or the module. Suppose we decide that every usex should have a green background, or that they shouldn’t be italicised anymore, or that the module should automatically wikify the words (but keep them black)? All this will be infinitely easier to do if the usexes are templatised. — Ungoliant (Falai) 18:05, 4 October 2013 (UTC)[reply]
  • Right now definitions, citations, and usexes are findable from the dumps. Why design for the conjectural when we have deficient content and so many actual problems now and the nature of future problems and solutions is likely to change? Is that really the only advantage?
I find it hard to get excited about the advantages of redundant formatting for deficient content. DCDuring TALK 19:42, 4 October 2013 (UTC)[reply]
They are absolutely not (but I invite you to try). People have been using ad-hoc formatting for years. DTLHS (talk) 20:24, 4 October 2013 (UTC)[reply]
And the sooner this day comes the better. DTLHS (talk) 21:06, 4 October 2013 (UTC)[reply]
Just to be clear, are you saying that editing the source without GUI should ideally no longer be possible? --Dan Polansky (talk) 21:08, 4 October 2013 (UTC)[reply]
No, just the part about getting rid of wiki markup and being able to automatically validate pages. People can edit however they wish. DTLHS (talk) 21:10, 4 October 2013 (UTC)[reply]
Should the new markup be an XML dialect, using all the XML syntax conventions? (As that is what the sentence from 18:36, 4 October 2013 with which you seem to agree says.) --Dan Polansky (talk) 21:20, 4 October 2013 (UTC)[reply]
Does the convention of :''Example sentence.'' prevent automatic validation of pages? --Dan Polansky (talk) 21:21, 4 October 2013 (UTC)[reply]
No, but there is no "convention" (for usage examples especially). We can only validate after the fact because any live validation (as an extreme example, imagine wrapping each page in a single template that would parse the page with Lua) would be too slow. I'm not saying XML is the answer, but whatever it is it needs to be fast and usable across platforms. DTLHS (talk) 21:30, 4 October 2013 (UTC)[reply]
No convention? There is Wiktionary:ELE#Example_sentences, driven by Wiktionary:Votes/2007-07/Layout of example sentences. Come to think of it, the use of usex template violates WT:ELE. --Dan Polansky (talk) 05:33, 5 October 2013 (UTC)[reply]
  • I have never visited this Beer parlour before, and I'm only here because Dan Polansky raised issues with my use of usex. My main, almost sole apart from tidying, activity in the Wiktionary (which I see as having enormous but mostly unrealized potential) is adding quotations. However I noticed some time ago that many examples used usex and so, since it seemed purposeful, I expanded my tidying to add usex to examples that came up on my edit screen. Most of the discussion above is beyond me but I would like to make a couple of remarks (requests?). If usex is encouraged as a result of the Beer parlour discussion, then it could well be expanded to format multiple phrasal examples and to provide for comments separate from the actual example. Also, it would be great if usex could be extended to work within double octothorp context, because the idea of subsenses seems to be very practical for words with a lot of senses. ReidAA (talk) 08:43, 5 October 2013 (UTC)[reply]
  • Re: multiple phrasal examples: No, it's better for those to remain separate. It's "usex", not "sequence_of_usexes"!
    Re: comments: if we decide to make {{usex}} mandatory, then yeah, we'll need to support comments, but if we just make it encouraged, then I think that cases with comments might be better handled using normal wikitext. (Comments on usexes are quite rare IME, and I'm not sure a single standard formatting would make sense for all of them.)
    Re: "Also, it would be great if usex could be extended to work within double octothorp context": You mean, it is great that {{usex}} was extended, about a week ago, to work in that context. :-)
    RuakhTALK 20:55, 5 October 2013 (UTC)[reply]
  • I think that citations should be forbidden in the entries and should be kept inside the Citations namespace by policy. {{usex}} should be strictly for usage examples, i.e. simple made-up sentences or phrases that illustrate how the word is used. Citations are there simply to verify that the meaning/spelling/form is attested, nothing else. --Ivan Štambuk (talk) 21:47, 6 October 2013 (UTC)[reply]
    • I kind of agree with this. However, some citations can be good usage examples too, and then we should treat them as such. Also I think it's a bit strange the way we make quotations collapsible but not usexes. Usexes should be collapsible as well, it would be helpful on many of the Latvian entries that Pereru creates, which often have so many (good, but still) usage examples that they swamp the definitions somewhat. I noticed on Wikipedia they have a system with bottom-of-the-page navigation boxes where they are all collapsed except if there is only one, then it's shown expanded. We can do something similar with usexes: make all usexes but the first one collapsed by default. So if a single definition has more than one usex, the second and onwards are hidden and a button is shown to display them all. —CodeCat 21:57, 6 October 2013 (UTC)[reply]
      My primary concern is the waste of database dump space and page load times. Meanings/forms/spellings that have citations verifying them should have some kind of mark "this meaning is attested, click here to see citations". Perhaps a simple Verified floating after all of the context labels would do (with sense ID somewhere embedded so that the two could be correlated). This would be good for tracking the coverage of attested forms/spellings/meanings which is the long-term goal. Perhaps a tool could be developed to facilitate just that, e.g. loading a bunch of unattested meanings, searching WS for their spellings, and through one mouse click connecting them, generating entries inside the Citations namespace.
      Regarding collapsibility of usexes - if it could be done, that would be great. But considering how they are formatted (through enumerated indentation) that could introduce some technical difficulties. I'd like to see such boxes expanded if they contain a small number of usexes (e.g. 5), and collapsed otherwise. --Ivan Štambuk (talk) 22:10, 6 October 2013 (UTC)[reply]
I oppose collapsible usexes. Unlike citations, they are meant to be short and visible, immediately exemplifying senses. --Anatoli (обсудить/вклад) 04:03, 7 October 2013 (UTC)[reply]
I agree with Anatoli. Usage examples are often much more useful to help someone understand how a word is used than the definition, the grammar tags, or the usage note, let alone the citations, which may be there to support a particular element of challenged attestation rather than clarify usage. DCDuring TALK 04:09, 7 October 2013 (UTC)[reply]
I agree (w/Anatoli and DCDuring). When you encounter a usage of a word with a lot of definitions, and you want to look it up, it's often easiest to look at usage examples to see which one seems to match the original context, and then look at the corresponding definition, rather than trying to slog through all the definitions, no matter how well-written. (And when a word doesn't have a lot of definitions, there's obviously no point to collapsing the usexes.) —RuakhTALK 04:39, 7 October 2013 (UTC)[reply]
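
For readers wondering what the disputed edits amount to mechanically, here is a minimal Python sketch of turning an italicised usage-example line into the {{usex|lang=en|...}} form quoted at the top of this thread. It assumes, as a simplification, that usage examples sit on lines beginning with "#:" (or "##:" under a subsense) and are wrapped in a single pair of italic markers; the function name is hypothetical, and lines that might contain template-breaking characters are deliberately left alone.

```python
import re

# Assumed layout: "#: ''Example sentence.''" (any number of leading "#").
# The (?!') guard skips lines whose text starts with bold markup.
USEX_LINE = re.compile(r"^(#+:)\s*''(?!')(.+)''\s*$")


def templatise_usex(line, lang="en"):
    """Rewrite one italicised usage-example line to use {{usex}}.

    Lines that do not match the assumed layout are returned unchanged,
    as are lines whose text contains "|", "=" or "{{", since those would
    need escaping inside a template argument.
    """
    match = USEX_LINE.match(line)
    if match is None:
        return line
    prefix, text = match.groups()
    if "|" in text or "=" in text or "{{" in text:
        return line
    return prefix + " {{usex|lang=" + lang + "|" + text + "}}"
```

For example, `templatise_usex("#: ''The cat meows.''")` gives `#: {{usex|lang=en|The cat meows.}}`, while a definition line such as `# A domestic animal.` passes through untouched. Whether such a conversion is desirable is exactly what the thread above disagrees about; the sketch only shows that it is a line-level rewrite.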
Notifications inform you of new activity that affects you -- and let you take quick action.

Greetings!

Notifications will inform users about new activity that affects them on this wiki in a unified way: for example, this new tool will let you know when you have new talk page messages, edit reverts, mentions or links -- and is designed to augment (rather than replace) the watchlist. The Wikimedia Foundation's editor engagement team developed this tool (code-named 'Echo') earlier this year, to help users contribute more productively to MediaWiki projects.

We're now getting ready to bring Notifications to almost all other Wikimedia sites, and are aiming for a 22 October deployment, as outlined in this release plan. It is important that notifications is translated for all of the languages we serve.

There are three major points of translation needed to be either done or checked:

Please let us know if you have any questions, suggestions or comments about this new tool. For more information, visit this project hub and this help page. Keegan (WMF) (talk) 18:27, 4 October 2013 (UTC)[reply]

(via the Global message delivery system) (wrong page? You can fix it.)

Obsolete forms heading

I have created the vote Wiktionary:Votes/pl-2013-10/Obsolete forms heading.

Let us postpone the vote as much as the discussion needs. --Dan Polansky (talk) 07:02, 5 October 2013 (UTC)[reply]

Probable or "relevant" etymologies

I am a bit uncertain what to do when there is an obvious possible compound etymology for a word. As a random example, take the Nynorsk word sørover which is most likely sør (south) + over (-ward). My dictionary does not provide an etymology for this particular word, however. It would be wrong to write that sørover = sør + over, since I cannot be absolutely certain that this is the case (the word could come directly from Old Norse, which would be a highly similar etymology, but not the same). But it also feels wrong to leave out this probable etymology, since a reader who is not familiar with Norwegian will not spot this logical/relevant etymology which has a high chance of being correct.

Could I write something like

  • Perhaps from sør (south) + over (-ward).
  • Compare sør (south) + over (-ward).

? --Njardarlogar (talk) 12:49, 6 October 2013 (UTC)[reply]

I think we call this "surface etymology". It's the way that native speakers will analyse the word, even though its origin might be older. I think the surface etymology should always be provided, alongside the diachronic (historical) etymology if it's known. —CodeCat 12:55, 6 October 2013 (UTC)[reply]
Me?thinks this works as long as the af*fix*at*ion or com*bin?at*ion is an in*stance of a pro*duct*ive or, at least, re*cent-ly pro*duct???ive pro*cess in the langu*age in quest*ion. DCDuring TALK 13:31, 6 October 2013 (UTC)[reply]

How should such information be worded and formatted, however? --Njardarlogar (talk) 11:52, 13 October 2013 (UTC)[reply]

What about "Can be constructed from x + y" or "Can be derived from x + y" ? --Njardarlogar (talk) 09:34, 21 October 2013 (UTC)[reply]

I've usually added things like that as "Analogous to sør (south) +‎ over (-ward)." (Analogous to {{compound|lang=nn|sør|t1=south|over|t2=-ward}}.) --WikiTiki89 15:02, 21 October 2013 (UTC)[reply]
I've used "equivalent to". —CodeCat 15:44, 21 October 2013 (UTC)[reply]
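
The wording options above all wrap the same {{compound}} call, so the choice of phrase can be kept separate from the template arguments. A small Python sketch, assuming the parameter layout shown in WikiTiki89's example (lang=, then each part followed by an optional tN= gloss); the function name and the default wording are hypothetical.

```python
def surface_etymology(lang, parts, wording="Equivalent to"):
    """Build a surface-etymology line around a {{compound}} call.

    `parts` is a list of (term, gloss) pairs; a gloss of None is omitted.
    The argument order mirrors the example quoted in the thread:
    {{compound|lang=nn|sør|t1=south|over|t2=-ward}}.
    """
    args = ["compound", "lang=" + lang]
    for position, (term, gloss) in enumerate(parts, start=1):
        args.append(term)
        if gloss:
            args.append("t" + str(position) + "=" + gloss)
    return wording + " {{" + "|".join(args) + "}}."
```

Calling `surface_etymology("nn", [("sør", "south"), ("over", "-ward")])` yields `Equivalent to {{compound|lang=nn|sør|t1=south|over|t2=-ward}}.`, and passing `wording="Analogous to"` gives the other variant mentioned above.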

Speak up about the trademark registration of the Community logo.

Should there be a Template:inherited?

We have {{borrowed}} already. People have objected in the past that we are mixing inherited words with borrowed words, which results in categories like Category:Mandarin terms derived from Proto-Indo-European. So should there be a {{inherited}} template, which is used specifically when a term is directly inherited by passing it from generation to generation of speakers? —CodeCat 01:16, 10 October 2013 (UTC)

I'd prefer a single template {{etyl}} for both purposes. It's obvious from the context whether the word was borrowed or inherited. Separate categorization schemes for borrowed/inherited words is probably not worth the effort. --Ivan Štambuk (talk) 01:26, 10 October 2013 (UTC)[reply]
Indeed. I would also prefer to expand etyl to include the term being referenced, the "type" of the etymology (borrowing, grammatical derivation, inherited...), and optional date and reference parameters. DTLHS (talk) 01:30, 10 October 2013 (UTC)[reply]
Sorry if this comes across as hijacking your thread- I will start a new one if you prefer. DTLHS (talk) 01:45, 10 October 2013 (UTC)[reply]
I think there's an argument for all three: the status quo; deleting {{borrowed}}; and keeping {{borrowed}} and creating {{inherited}}. Mglovesfun (talk) 14:17, 10 October 2013 (UTC)

Raw bolded headwords

nanodiodes was created just now. I find it rather worrying that entries like this are still being created. It makes our efforts to fix old entries feel like mopping the floor with the tap still running. And I thought Semper had been notified of this already? —CodeCat 14:06, 12 October 2013 (UTC)[reply]

  • Wiktionary:Entry layout explained, in its "A very simple example" section, says that the headword should be "the inflection word itself (using the correct Part of Speech template or the word in bold letters)". And quite right too. What would be the point of using a template that does nothing? SemperBlotto (talk) 14:11, 12 October 2013 (UTC)
    What indeed? If more thought was given to the actual user benefits of templates, modules, CSS, JS, individual design features and the overall architecture of the system, there might be greater harmony between editors and our systems mavens. If there were more magic and less arbitrary and keystroke-adding uniformity, our little world might be a better place. DCDuring TALK 14:37, 12 October 2013 (UTC)[reply]
  • If you've been creating Latin-script entries without raw bolded headwords, I find that rather worrying. It reduces the effectiveness of our attempts to fix existing entries to use raw bolded headwords. I thought you'd already been notified of this?

    O.K., not really. But you see the problem: your comment presupposes that your way is right and Semper's is wrong, even though you know that there are editors who feel the reverse. That's not very constructive; I can't even tell what you're looking to discuss?

    RuakhTALK 17:38, 12 October 2013 (UTC)[reply]
    • I thought there was already a consensus to abandon this old practice. A lot of editors seem to be working on that assumption, by replacing bolded headwords with templates. I don't think it's productive if lack of consensus means one group of editors is going to leave extra work that another group of editors then feels a need to fix. That's bound to cause friction. We need to agree on this. —CodeCat 18:06, 12 October 2013 (UTC)[reply]
      • If old is bad, I may be Satan himself. DCDuring TALK 18:39, 12 October 2013 (UTC)[reply]
        • I'm not saying it is bad. We have to ask if it is, and at least agree that it is or isn't, or agree that we don't agree and form a consensus anyway. Still, what we do have to consider is that this practice was established in the first place because there was nothing else, and because we didn't have the same overview of language support issues that we do today, nor the technical infrastructure to support it. So what it really comes down to is, given the choice that we have now, would we start off by following the now-older practice of omitting all language and CSS tagging from headwords, or the newer practice of including it? I don't really see that many reasons to use the original practice, other than to make it easier for current editors to keep doing what they did then, which is really nothing more than an argument against change as a whole on principle (and therefore suspect). Having to type more doesn't seem like a very good argument against the new practice either, because by that same token we could decide to write uselessly short definitions and omit all non-essential information such as etymologies. My stance on that is: if you don't want to do what is necessary to make a dictionary, don't make a dictionary. For me the primary reason to support the new practice is that it conveys useful information to browsers and editors/bots alike, and is therefore desirable. —CodeCat 19:24, 12 October 2013 (UTC)
WT:ELE for practical reasons cannot keep up with correct formatting. The way votes are structured to need at least 70% approval, WT:ELE will always reflect what someone wrote some time ago and not current editing. Mglovesfun (talk) 20:40, 13 October 2013 (UTC)
The votes do not need at least 70% approval; people repeatedly suggested a 2/3 threshold, and at least one recent vote was closed with that threshold. (I am saying that in order to prevent the above post from serving later to show that we hold votes against the 70% threshold.) --Dan Polansky (talk) 21:09, 13 October 2013 (UTC)
  • O.K., I'm just going to pretend we're discussing the question "Which is a better headword-line syntax for English form-of entries: {{head|en}}, or '''nanodiodes'''?", since that's the only constructive way that I see to interpret this section. Under that pretense . . . My preference, like CodeCat's, is for {{head|en}}; it's more explicit, and it's not more typing IMHO. And I think it's obvious that we want language-tagging for non-English headwords (right?), so it seems simplest to use the same notation for English ones as well.
    Some editors clearly prefer to keep using the '''nanodiodes''' notation. To them I ask: do you prefer that others use your notation as well? Do you mind if others manually replace your notation with {{head}}? How about if they use bots and/or AWB to do so? Do you mind if entry-creation tools, like green-links in headword-templates, use {{head}}?
    RuakhTALK 23:15, 13 October 2013 (UTC)[reply]
I don't understand the point of the template uniformity. English is clearly the default language and Latin the default script. All other languages (and the Translinguals) and scripts would need to be marked, but not English and Latin. Why do we require it?
Of all the languages here English is the one that needs the simplest interface to elicit new contributions and corrections from monolingual native speakers. Monolingual native speakers in many other languages can and should be contributing on their own native language wiki, which should also have the most transparent interface. Unless we can develop and maintain a bulletproof form-type interface, which seems extremely unlikely, the simple wiki interface seems highly desirable where possible.
Is it now true that {{head}} will virtually never have to be changed? If not, then every time we add a few hundred thousand transclusions of it, we also increase the time it takes for the changes to propagate through all the entries that transclude it. DCDuring TALK 01:00, 14 October 2013 (UTC)[reply]
It takes longer to propagate changes through untemplatised headwords, since these need to be updated manually or by bot. — Ungoliant (Falai) 01:08, 14 October 2013 (UTC)[reply]
That's not true. Adding a diacritic-removal rule for Zulu entails a change that must propagate to every entry that uses Module:languages, including every entry that uses {{head}}; but it does not entail a manual or bottic change to every entry that uses formatting similar to what {{head}} would have produced. —RuakhTALK 06:28, 14 October 2013 (UTC)[reply]
That’s what I’m trying to say. — Ungoliant (Falai) 07:32, 14 October 2013 (UTC)[reply]
Then why did you say the opposite? DCDuring said something correct, and you replied by saying something wrong. If what you had meant to say was the right thing, then your comment was unnecessary to begin with . . . —RuakhTALK 15:06, 14 October 2013 (UTC)[reply]
I’m not following you then. If we want a change like italicising instead of boldening headwords, or increasing their font size or whatever, headwords with {{head}} will require the template being edited once and MediaWiki will automatically update the entries soon enough. The same change will require bots to update manually boldened headwords. — Ungoliant (Falai) 19:06, 14 October 2013 (UTC)[reply]
The styling of {{head}} and similar templates (at least the ones that are up to date) means that these templates never need to be edited for style changes. This only has to be done in MediaWiki:Common.css, and that's not a template so no changes need to "propagate". The browsers of users just need to download the updated style sheet, which is not something we can control. —CodeCat 20:35, 14 October 2013 (UTC)[reply]
@Ungoliant, CodeCat: The point is, we have never changed the actual formatting of form-of headword-lines, and AFAICT there is no desire to do so, whereas {{head}} has already been modified at least thirteen times this month: four plus six plus three. On average, that's almost one change every day. DCDuring is simply pointing out that these changes have a cost, and that that cost is roughly proportional to the number of pages using these modules. —RuakhTALK 21:47, 14 October 2013 (UTC)[reply]
Why don't you understand the point of template uniformity? Do you want things to be even more confusing and unintuitive for new users? I find "always use a template" more intuitive than "use a template except for English under certain circumstances" because it is just what seems like an arbitrary exception. Also, '''word''' and {{head|en}} do quite different things. The former doesn't add a category while the latter does, and the former wraps the text in plain bold tags while the latter tags it specifically as a headword. So to treat them as equivalent doesn't make sense, because they're not. —CodeCat 01:13, 14 October 2013 (UTC)[reply]
Actually, {{head|en}} does not add a category, since it doesn't know the POS. (And rightly not: we use form-of templates to categorize non-lemmata.) But I agree with the rest of your comment. —RuakhTALK 06:30, 14 October 2013 (UTC)[reply]
Not all form-of templates categorise. {{inflection of}} requires a categorising headword template because it doesn't add any categories itself, as do {{feminine of}} and many others. {{plural of}} and {{comparative of}} do categorise, though. I've been wondering how to remedy that, at least while assuming that it's confusing for editors when they never know when to add categories where. When I create new template infrastructures I tend to favour using the headword template to add categories; none of the Dutch form-of templates add any categories for example. This might be something we can extend to other templates and languages, or at least encourage. It would make the use of templates like {{plural of}} a bit simpler; currently any adjective plurals that use that template need a nocat=1 parameter and a categorising headword, which is a bit unwieldy. —CodeCat 20:40, 14 October 2013 (UTC)[reply]
  • For obvious reasons, I support complete elimination of manual bolding and italicization through wiki syntax everywhere (including citations/usexes) and replacing it with appropriate templates. --Ivan Štambuk (talk) 23:11, 14 October 2013 (UTC)[reply]
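
To make concrete what "replacing bolded headwords with templates" involves, here is a minimal Python sketch of the kind of substitution a bot or semi-automated tool might perform. It is not taken from any existing Wiktionary bot: the function name and the sample entry text are hypothetical, and it handles only the narrow case discussed above, a headword line that consists solely of the page title in bold.

```python
def replace_raw_headword(wikitext, title, lang="en"):
    """Replace a line that is exactly '''title''' with {{head|lang}}.

    Only whole-line matches are touched, so bold text inside definitions
    or usage examples is left as it is.
    """
    raw = "'''" + title + "'''"
    template = "{{head|" + lang + "}}"
    lines = []
    for line in wikitext.split("\n"):
        lines.append(template if line.strip() == raw else line)
    return "\n".join(lines)


# Hypothetical entry text for illustration only.
SAMPLE = (
    "==English==\n"
    "\n"
    "===Noun===\n"
    "'''nanodiodes'''\n"
    "\n"
    "# {{plural of|nanodiode|lang=en}}"
)
```

Running `replace_raw_headword(SAMPLE, "nanodiodes")` swaps the fourth line for `{{head|en}}` and leaves everything else unchanged. As noted in the thread, the two notations are not strictly equivalent in what they output, so a substitution like this changes the rendered markup and adds a transclusion to every page it touches.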

So I've started a manual of style. Lots of input please. Rationale: there are lots of issues with a common practice where WT:ELE doesn't mention them or does mention them but gives no ruling on them. Therefore an extensive manual of style is possible without contradicting or duplicating WT:ELE. Mglovesfun (talk) 12:43, 13 October 2013 (UTC)[reply]

Policies on formatting should be in WT:ELE. There should be no WT:Manual of style, IMHO. --Dan Polansky (talk) 13:48, 13 October 2013 (UTC)[reply]
I would prefer to see things centralised in ELE too. Also, there is not necessarily any consensus on any of it. I used to format every sense line like a sentence, but someone used to change them back, so I stopped bothering either way. Equinox 13:50, 13 October 2013 (UTC)[reply]
@Dan Polansky I agree, the problem is reality. Because of the voting process, it could take years (or decades even) to get everything into WT:ELE. This is a better real-world solution. Mglovesfun (talk) 14:22, 13 October 2013 (UTC)[reply]
Same here, Dan. --Lo Ximiendo (talk) 14:37, 13 October 2013 (UTC)[reply]
@Mglovesfun: The voting process is fast enough. There really is no consensus, quite often. In any case, you can edit Wiktionary:Entry layout explained/Editable to your heart's content.
Moreover, I am not sure what problem you are trying to solve. For instance, you write in the manual of style that "Translations should always be inside the templates {{temp|trans-top}} {{temp|trans-mid}} and {{temp|trans-bottom}}": has this ever been a problem or a point of contention? If not, why is the overwhelming common practice in mainspace entries not sufficient as a guide? --Dan Polansky (talk) 14:41, 13 October 2013 (UTC)[reply]
Just because something isn't controversial doesn't mean it isn't worth writing down anywhere. Mglovesfun (talk) 14:45, 13 October 2013 (UTC)[reply]
The fewer policy and guide pages a new editor has to read, the better. If a thing does not tend to be got wrong by people, it can safely remain unspecified. --Dan Polansky (talk) 14:49, 13 October 2013 (UTC)[reply]
In most instances I would agree with that concept, Dan, but when it comes to people who edit reference books as a hobby that might not be the case... - [The]DaveRoss 15:59, 13 October 2013 (UTC)[reply]
I am not sure what you mean. Do you mean that people who edit reference works as a hobby are less inclined to pick a common practice? But then the precondition would not be met: "a thing does not tend to be got wrong by people". In any case, I usually do not oppose specifying things that can safely remain unspecified (as is apparent by the likes of Wiktionary:Votes/pl-2013-09/Wikisaurus and attestation); I am just saying that there is no pressing problem, and that, therefore, neither I nor other people feel motivated to solve it. --Dan Polansky (talk) 16:11, 13 October 2013 (UTC)[reply]
I think Dan your argument is heavily flawed; we can take uncontroversial material out of a without deleting the whole thing. Even then it would be worth noting consensus on uncontroversial issues somewhere, though I guess to please you we'd have to put a notice at the top saying please don't read this unless you're sure you want to. We don't have to force new editors to read anything, generally they pick up on the existing format, the de facto WT:ELE and don't read the policies (either not at all, or not until they've already learned how to edit from the main namespace). I don't think your argument is outright false so much as that it draws a bad conclusion from the evidence and is not relevant to this discussion. Mglovesfun (talk) 20:36, 13 October 2013 (UTC)
here is an example of inconsistent yet correct formatting, where the first definition doesn't have a full stop and the second one does. Mglovesfun (talk) 09:20, 14 October 2013 (UTC)[reply]
It seems that this already exists as Wiktionary:Style guide. Mglovesfun (talk) 16:35, 21 November 2013 (UTC)[reply]
There are also pages like Wiktionary:Tutorial (Keep in mind) and Help:Writing definitions. I, for one, have never read any of these, although I have improved them occasionally, and have no idea how they relate to our policies and guidelines. Michael Z. 2013-11-21 18:00 z

Is wiktionary losing its openness?

Well, of course you all think "no" but for the first time after seven years of editing here, I am concerned by my recent experience. Recently, I stumbled upon some protected templates that appeared to be malfunctioning and all my efforts to fix them were thwarted by one individual that seems to be their guardian.

That wouldn't be a problem if his answers to my requests were relevant, but they are not in my opinion. Lastly, when I asked for creation.js to be fixed because it was putting words in a category that was deleted following an RFD, I was told that this RFD was four years old. Is that really relevant? Is that a valid objection?

So, I am concerned that a script that is used so broadly is in the hands of one individual who both refuses to make obvious fixes and refuses to unlock the page so that I can make the change myself. This behavior is not in the spirit of a wiki where changes are welcome, especially when they are in conformance with what was previously discussed and what is actually being done. — Xavier, 01:24, 14 October 2013 (UTC)

I think you misunderstood: the template is locked because of its importance and because it would be a tempting target for vandals, not to give anyone a monopoly on editing it. You asked if it could be changed, and CodeCat offered the opinion that it wasn't a good idea. Since she's not in charge of it, there's nothing that says she has to make a change she disagrees with. It's true that there aren't many people who know enough to make changes without risking damage, but that just means you need to ask around. First make sure you have a consensus in favor of the change, then ask for help at the Grease Pit. Chuck Entz (talk) 01:39, 14 October 2013 (UTC)[reply]
This is pretty much how it is, yes. You can ask others if it can be changed, as long as they're pointed to the prior discussion. I do disagree with your statement that the templates are "malfunctioning" and need to be "fixed". That's very much your own point of view. —CodeCat 01:48, 14 October 2013 (UTC)[reply]
@Chuck Entz, FYI: This is a client-side script rather than a template, so in addition to concerns about vandalism, it's also protected for more serious security reasons. (A mere template, I might briefly unprotect if a specific non-administrator needed to make a specific edit; a client-side script, never.) —RuakhTALK 06:05, 14 October 2013 (UTC)
There are only a handful of editors wanting to get involved; it's a lot of effort! We can't force people to participate against their will. Mglovesfun (talk) 09:22, 14 October 2013 (UTC)[reply]
But it is possible to drive people away by making it unrewarding to participate, thereby making a low level of participation inevitable. DCDuring TALK 13:07, 14 October 2013 (UTC)[reply]
DC, you guys are kinda already doing that with actions (such as reverting good-faith edits without much explanation) that would be construed as "biting the newcomers", or really biting the non-admins. And Mglovesfun, maybe we should postpone the important decisions until we start getting a broader base of editors participating. The fact that a dozen or so editors provide most of the deletion nominations and votes, most of the patrolling, and most of the reverts is not tenable for attracting and keeping new editors. And Meta, since there are more than 24 editors on this project, it is minority rule. Purplebackpack89 (Notes Taken) (Locker) 15:56, 14 October 2013 (UTC)[reply]
Since we don't make mistakes – at least not meaningful, big ones – the result we've achieved must be what we want. DCDuring TALK 17:23, 14 October 2013 (UTC)[reply]
DCD, please try to transmute the constant nagging, wearying sarcasm into votes/actions for change. Equinox 18:35, 14 October 2013 (UTC)[reply]
If I knew where to begin, I would. I keep hoping that someone more aligned with prevailing opinion would come up with a proposal. I promise not to give it the kiss of death by openly supporting it. DCDuring TALK 18:56, 14 October 2013 (UTC)[reply]
Being aligned with prevailing opinion is hardly a prerequisite for starting or participating in a discussion... Purplebackpack89 (Notes Taken) (Locker) 22:15, 14 October 2013 (UTC)[reply]
  • Xavier, if you want a sysop flag to edit javascript and protected templates, just nominate yourself and you'll probably get it. Making suggestions will get you nowhere, you need to be bold and implement the needed changes yourself. Same goes for policies - if they are missing or are defective, write them or change them yourself. --Ivan Štambuk (talk) 22:58, 14 October 2013 (UTC)[reply]
    Well, "making suggestions will get you nowhere, go ask for adminship" is apparently the sad truth. But what you are suggesting is exactly what I am regretting here: standard editors must discuss the smallest change at great length with little hope of success, while admins do whatever they want. That's basically what MGlovesfun said on another of my requests: the current state wasn't discussed before but there is no need to question it now, and this is unlikely to change (IIRC).
    Personally, vandal blocking is not something I enjoy and in 7+ years of editing here, I have only needed admin rights a handful of times when templates needed to be changed (and until now, this went smoothly). So, I guess my admin rights would quickly be revoked for inactivity. I very much prefer that standard editors be treated with more respect, and this is why I have opened this discussion.
    Chuck, I understand very well why this template is locked and I accept it; that's not my point. Purplebackpack, I have nothing against guidelines or policies written by a handful of editors as long as the process is public and open to everyone. Here the issue is not about current WT policies: I was facing an editor with admin rights who had apparently taken over one of the Wiktionary gadgets and who was using it to put French plural nouns in a category she had decided on by herself, contrary to what was voted before, and who was refusing to change anything in her code. This, added to MGlovesfun's remark, made me worried about the openness of Wiktionary. — Xavier, 00:13, 15 October 2013 (UTC)
In reply to "But it is possible to drive people away by making it unrewarding to participate, thereby making a low level of participation inevitable": I think your emphasis is wrong; perhaps participation is naturally unrewarding. We would need to find a way to make something inherently unrewarding rewarding. Mglovesfun (talk) 13:36, 18 October 2013 (UTC)

Using furigana

This section mostly relates to Japanese entries, but the same issue touches on any language that uses Han characters. If desired, this section could be moved to Wiktionary_talk:About_Japanese, but it has to be posted here first or else nobody will notice it (I've tried in the past.)

This site has never used furigana (AKA ruby) very much, despite the presence of support for furigana through {{JAruby}} or {{furigana}}. The lack of adoption may be due to the fact that those templates are extremely cumbersome. Yet with Lua and some pattern-matching magic, it becomes very easy, and I've written some code (namely add_ruby in Module:ja) that can add furigana automatically when provided the term plus the term written entirely in kana.

Now that furigana is easy to use, it might deserve a(nother) look. It is easier to read than a parallel line of kana which may not match up exactly, and it is the way the rest of the world does it. Running Japanese text written purely in kana is actually more difficult to read than running Japanese with kanji, even for native speakers. Adding a second line of pure kana is rare in dictionaries or any other publications. I think a good use for it would be in usage examples and I've written {{ja-usex}} to demonstrate but furigana could be used in headwords and links as well.

This format for quotes and examples would be a departure from the current format specified in WT:AJ, but that format was established in order to be as close as possible to that of WT:ELE and WT:QUOTE (see Wiktionary:Votes/2007-08/Layout_of_example_sentences_for_Japanese_entries and Wiktionary:Beer_parlour/2007/August#Example_sentence_format_for_WT:AJ), and I believe that using furigana is no further from WT:ELE and may be closer. At the time, there was no Scribunto and adding furigana was so impractical that it wasn't a real option.

Does anybody have an opinion on using furigana in (a) quotes and examples (b) links (c) headwords? As far as headwords go, adding furigana would require no change in how the entries are written. --Haplology (talk) 18:47, 14 October 2013 (UTC)[reply]

I really like the idea of using furigana in quotes and examples (at the very least). We tried to make a template for it (see example at Template talk:furigana), but it was too difficult to use. If Lua could make this a reality, I’m all for it! —Stephen (Talk) 19:04, 14 October 2013 (UTC)[reply]
I'm not sure what way Lua could help. It could add the small characters, but it would be hard for it to figure out which to add, because it has no way of knowing how a kanji is pronounced. A database of kanji characters would become very large, and even then I think many kanji can be pronounced more than one way. So it would be so unpredictable that you'd end up having to specify the pronunciation characters manually everywhere anyway. And that would make Lua somewhat less useful in doing this, I think. Unless someone knows something that I don't. How is Ruby implemented in Wikicode/HTML? —CodeCat 22:55, 14 October 2013 (UTC)[reply]
The ruby is implemented with the <ruby> tag and associated tags in HTML such as at w:Ruby character#HTML_markup. add_ruby in Module:ja adds the ruby HTML tags directly. --Haplology (talk) 11:46, 15 October 2013 (UTC)[reply]
I admit that I don't know much about this, but what Haplology wrote is, "I've written some code (namely add_ruby in Module:ja) that can add furigana automatically when provided the term plus the term written entirely in kana". So a database isn't necessarily needed. The goal here isn't to eliminate the need for an editor to provide kana, it's just to eliminate the need for an editor to arrange the entire usex in kanji-kana pairs, which is more difficult than it might sound, because it makes the wikitext complicated and untypeable and inscrutable, so it's hard to keep track of where you are, and proofreading is impossible. (I know because I once tried something similar for an Ancient Greek usex — marking up each source-word with an on-hover transliteration and gloss — and it was incredibly difficult.) A quick glance at Haplology's code suggests that it works by taking advantage of the fact that even when Japanese is written with kanji, it still uses kana for word-endings, so the Lua function can use these kana to perform alignment between the two versions. (It seems like this could have problems when the same sequence of kana appears both as a word-ending and inside a nearby word, but perhaps Haplology has addressed that somehow?) —RuakhTALK 23:18, 14 October 2013 (UTC)[reply]
Great idea (in general), if it's easy to use and reliable. Chinese sometimes also uses "ruby" - using pinyin or Zhuyin fuhao (bopomofo) (Taiwan). In fact, a similar approach could probably work for other languages needing transliteration. It would be hard to do for right-to-left languages, or they would have to use a word-by-word method rather than full sentences. --Anatoli (обсудить/вклад) 02:07, 15 October 2013 (UTC)
{{furigana|[[好]]|[[hǎo]]}}{{furigana|[[主意]]!|[[zhǔyì]]!}}{{Hani|}}
(Good idea!)--Anatoli (обсудить/вклад) 02:15, 15 October 2013 (UTC)[reply]

{{furigana|транс|trans}}{{furigana|ли|li}}{{furigana|те|te}}{{furigana|ра́|rá}}{{furigana|ци|cji}}{{furigana|я|a}}   {{furigana|this  |/ðɪs/}} {{furigana|has  |/hæz/}} {{furigana|o|/ʌ}}{{furigana|ther  |ðɚ/}} {{furigana|po|/pɒ}}{{furigana|si|sᵻ}}{{furigana|bil|bɪl}}{{furigana|i|ə}}{{furigana|ties|tiz/}} but spacing is a problem. Chuck Entz (talk) 04:03, 15 October 2013 (UTC)[reply]

I'm glad this is a popular idea. Unless some opposition emerges I'd like to convert the examples to furigana format, then. To explain how the code works, it replaces every unbroken segment of kanji with a pattern like this: (.+). So for example, if you have a sentence like "太陽はすごい黄色いよ", the first step is to turn that into the pattern (.+)はすごい(.+)いよ, which is applied to the string of kana, "たいようはすごいきいろいよ". That returns each segment of kana in the second string that was taken up by a segment of kanji in the first, in this case たいよう and きいろ, in two separate variables. The final step is to use the ruby tag to stick the kana on top of their respective kanji segments and replace the segments of kanji in the term with those. This should grab the right kana every time no matter where the kanji or kana appear. The only drawback here is that the kana don't match up exactly to each kanji individually (rather, the block of kana matches up with its corresponding block of kanji), but in practice with Japanese at least they match up pretty well, and actual Japanese texts often don't match up the readings to each kanji exactly either. The same method should work for other languages, but spacing might be more of a problem. In my ja-usex template, I put spaces in the kana-only parameter, but that's in order to make the romanization work, and it has nothing to do with furigana.
Is there any support for putting furigana/ruby on headwords? (Assuming that Module:headword will allow it...) It would just be a matter of plugging in the same code over at Module:ja-headword, so from the technical side it presents no challenge, and everything can be done there. Unlike ja-usex where the line of kana is deleted, in the headword we'd keep the line of kana as a link in parentheses as it is now, since that's a page in its own right. Nothing would be deleted and nobody would have to change their editing behavior. The question is whether it would improve the headword to have furigana on top. --Haplology (talk) 06:18, 15 October 2013 (UTC)[reply]
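The pattern-matching steps described in the comment above can be sketched roughly as follows. This is an illustrative Python sketch, not the Lua code in Module:ja; the function name is reused only for readability, the kanji test (one Unicode range plus 々) is a simplifying assumption, and the markup omits any extra tags the real module may emit.

```python
import re

# Assumption: "kanji" means the CJK Unified Ideographs block plus the iteration mark.
KANJI_RUN = re.compile(r"[\u4e00-\u9fff\u3005]+")


def add_ruby(term: str, kana: str) -> str:
    """Annotate each unbroken run of kanji in `term` with its reading from `kana`."""
    # Step 1: turn the term into a pattern where every kanji run becomes (.+)
    # and every kana stretch is kept as literal text.
    pieces = []
    last = 0
    for run in KANJI_RUN.finditer(term):
        pieces.append(re.escape(term[last:run.start()]))
        pieces.append("(.+)")
        last = run.end()
    pieces.append(re.escape(term[last:]))

    # Step 2: apply the pattern to the all-kana string to capture the readings.
    match = re.fullmatch("".join(pieces), kana)
    if match is None:
        return term  # the two strings do not line up; leave the term unannotated

    # Step 3: wrap each kanji run together with its captured reading.
    readings = iter(match.groups())
    return KANJI_RUN.sub(
        lambda run: f"<ruby>{run.group(0)}<rt>{next(readings)}</rt></ruby>", term
    )


print(add_ruby("太陽はすごい黄色いよ", "たいようはすごいきいろいよ"))
```

Because the captures are greedy, a kana sequence that occurs both inside a reading and as a word ending could in principle be split differently from what the editor intended, which is the concern Ruakh raises earlier in this thread.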
Things go wrong in headwords but I don't think that's a problem with Module:headword as such. It appears that the ruby tag wraps the text into a table, which interferes with the tagging that is applied for headwords and the wiki doesn't like that: [1] —CodeCat 13:11, 15 October 2013 (UTC)
Are you sure that it's related to the ruby tag? {{furigana}} uses a table, and in fact it does not use a ruby tag. I didn't write Template:furigana and don't propose using it (no offense to the authors.) My code doesn't use any tables, and none show up when I try expanding this: {{head|ja|head={{ja-r|最大|さいだい}}}} at Special:ExpandTemplates.
Putting that aside for a minute, how about linking like 猿(さる)も木(き)から落(お)ちる (saru mo ki kara ochiru)? This is as opposed to 猿も木から落ちる (さるもきからおちる, saru mo ki kara ochiru). I think it might be appropriate for long terms like those in the "Derived terms" section of 最大. Haplology (talk) 13:38, 15 October 2013 (UTC)
(edit conflict) The listings of pinyin, kana and romaji in parentheses could be replaced by the headword form with them over it in ruby format. The nice thing about ruby is that it emphasizes the secondary nature of the other scripts and breaks them up into functional units, so there's less temptation to create unnecessary entries for the long blocks in the alternative scripts. The only serious drawback is the reduced readability of the smaller font-size. Chuck Entz (talk) 13:48, 15 October 2013 (UTC)[reply]
Well, it's obviously not viable for Latin or anything else. But for Kana and Zhuyin, it's ok, barely. -- Liliana 15:38, 15 October 2013 (UTC)[reply]
I am in favour of using native language conventions and typography. Ruby text looks good, and simpler and clearer, even for a non-Japanese reader. There might be issues with the added line spacing, but it can probably be ameliorated in headwords and usage examples.
The legibility problem would be mitigated by letting our site display text in the browser default font-size (16px), rather than reducing it to 80% (=13px), allowing ruby text to keep the traditional 50% size, or allowing it to be bigger still. (Small fonts were a design fad meant to make websites look more sophisticated, which is going away now that people read the web on mobile devices.)
One might also create a simple gadget for WT:PREFS that displays the ruby inline with parentheses. Haplology’s example above would become 猿(さる)も木(き)から落(お)ちる (saru mo ki kara ochiru). This is the identical HTML, with some CSS based on W3’s example. (I guess this won’t affect Firefox users without the ruby extension.)
Shouldn’t the <rp> elements in {{JAruby}} contain whitespace outside the brackets, to make the inline version more readable? For example: 猿 (さる) も木 (き) から落 (お) ちる (saru mo ki kara ochiru).
Can we remove the <rb> tags from {{JAruby}}? In HTML5 the rb element is “entirely obsolete, and must not be used by authors.”[3] The spec recommends that “Providing the ruby base directly inside the ruby element is sufficient; the rb element is unnecessary. Omit it altogether.”[4] Michael Z. 2013-10-15 18:04 z
I agree that the font size makes the ruby difficult to read and they should be enlarged. I have no objections to leaving out the <rb> tags--I was just going by Wikipedia's page and the code in JAruby, and hadn't researched it very deeply. Just to make sure there's no confusion, I didn't write JAruby, and all of my templates rely on the function add_ruby in Module:ja. It's getting late here so I can't do anything more until tomorrow. I can revise the HTML at that time, or if you would like to edit the code please feel free. It's at the bottom of the module. Haplology (talk) 18:39, 15 October 2013 (UTC)[reply]
I will remove the rb elements. A number of ruby tutorials I’ve seen support this coding practice, too. If anyone notices a rendering problem in their browser, let me know.
Incidentally, Firefox only displays the ruby text inline with parentheses, which is okay. There is a Firefox ruby add-on, which provides the traditional ruby display, but disables any style sheet control over its display. Safari and Chrome seem to support ruby display and basic CSS for it. I cannot test MSIE, but I understand that it has supported ruby since the Palaeolithic. Michael Z. 2013-10-15 18:52 z

As Chuck Entz suggested above, this (ðɪs) has (hæz) other (ˈʌðɚ) possibilities (ˌpɒsᵻˈbɪlətiz). Although the spec says ruby annotations are “primarily used in East Asian typography,” this is merely a typographical technique for text annotation, and nothing prevents us from making use of it. Templates could be used for specific purposes, but simple ruby is not that hard to hard-code in HTML. Michael Z. 2013-10-15 19:40 z

For the fallback rendering in Firefox and other non-supporting browsers, ruby needs the rp elements, which do complicate the code enough that a template would be preferable. Michael Z. 2013-10-15 19:48 z
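To make the role of the rp elements concrete, here is a minimal Python sketch of the markup under discussion: no rb element, and rp elements carrying the parentheses (with whitespace outside the brackets, as suggested above). The helper names `ruby` and `fallback_text` are hypothetical, and stripping tags is only an approximation of what a browser without ruby support displays.

```python
import re
from html import escape


def ruby(base: str, reading: str) -> str:
    """Return ruby markup with <rp> fallback parentheses and no <rb> element."""
    return (
        f"<ruby>{escape(base)}"
        f"<rp> (</rp><rt>{escape(reading)}</rt><rp>) </rp>"
        "</ruby>"
    )


def fallback_text(markup: str) -> str:
    """Approximate a non-supporting browser: tags are ignored, text content is kept."""
    return re.sub(r"<[^>]+>", "", markup)


phrase = ruby("猿", "さる") + "も" + ruby("木", "き") + "から" + ruby("落", "お") + "ちる"
print(phrase)
print(fallback_text(phrase))
```

A browser that supports ruby hides the rp contents and places the rt text above the base, so the parentheses appear only in the fallback rendering.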


Conversation from user talk:Mzajac#ruby size:

Hi there, I notice that ruby is a bit bigger. It's easier to read but now it matches up with the text less than it did before and hangs off the edges of words, and worse yet it sometimes pushes the kanji below apart, so we end up with text with gaps such as here. Is there some way to address this? Thanks --Haplology (talk) 12:30, 18 October 2013 (UTC)

I was afraid that this might be an issue.
Ruby text is traditionally 50% of the font size, and this is the value that the HTML specification recommends (or at least implies, with the CSS3 ruby module’s default stylesheet). Although long ruby text may still hang and push the text apart, increasing its size also increases the occurrence and magnitude of this effect, and clashes with conventional typesetting standards.
There is some new CSS3 like ruby-merge: collapse[5] that may help with this, but I think there is little or no browser support for it.
I am resetting rt size to 50%, for now.[6] Possible solutions for ruby readability:
  1. Live with the small ruby font-size, and let readers use their browser zoom, user style sheets in the browser or Wiktionary, or their magnifying spectacles to read ruby text
  2. Increase the size of ruby text and live with the text layout problems
  3. Increase the font-size of Asian text in our style sheet to make ruby text more readable (Japanese and Chinese are already 14px, 110% of the base 13px)
  4. Globally reset our base font-size from 13px to 16px and stop using a mix of font-sizes for different languages (16px is the recommended value in the HTML spec and the default in web browsers)
The first three don’t show respect for our readers, for traditional Asian typesetting standards, or for the integrity of the typefaces used to display Wiktionary. They will also have no effect when our text is reused without our style sheets. Michael Z. 2013-10-18 17:01 z
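As a rough worked example of the sizes involved, the sketch below computes the ruby text size for the base sizes mentioned in this thread (13px, the 110% Asian-text size, and 16px), assuming the traditional 50% ruby ratio. The function name and parameters are illustrative only, and real browsers may round differently.

```python
def ruby_px(base_px: float, text_scale: float = 1.0, ruby_scale: float = 0.5) -> float:
    """Size of ruby text in px, given the base font size and the two scale factors."""
    return round(base_px * text_scale * ruby_scale, 2)


# Current base size, current size for Japanese/Chinese text, and the browser default.
for label, size in [
    ("13px base", ruby_px(13)),
    ("13px base at 110%", ruby_px(13, 1.1)),
    ("16px base", ruby_px(16)),
]:
    print(f"{label}: ruby text is about {size}px")
```

Under these assumptions, moving from a 13px base to a 16px base raises the ruby text from about 6.5px to 8px without changing the 50% ratio.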

Recently Ivan has been nominating everything he can find as OR, and even created Template:Original research, believing it to be a valid policy on Wiktionary. But last month's discussions have shown that there is no policy and not even a clear consensus on the issue. So now, he is trying to get information deleted based on his own personal opinion of what information should be in Wiktionary, rather than based on what Wiktionary users have actually agreed on (and they haven't agreed on anything so far). And then he tries to force the issue by deleting everything while the discussion is still going, which feels to me like he's saying "I'm going to go ahead, and I will edit war with you for all eternity until you back down, and then I'll take the outcome as consensus". I am very worried about the heavy-handed way he acts, not just in this case but also in past cases. For example, his rhetoric regarding the equal treatment of alternative forms also bothered me a lot because he tried to discredit the opposition rather than the argument. He has done the same thing during discussions about SC, by making so many counterattacks on people whose opinion on the issue differed that it made any discussion impossible. I do not believe that Ivan is acting in good faith; he doesn't seem at all interested in forming a consensus or in acting on it, and I would like to bring this issue to a wider audience here. —CodeCat 13:50, 15 October 2013 (UTC)

The entries still have to go through RfD, don't they? We usually don't develop good policies from first principles, but rather from a contributor trying to apply our existing policies to achieve an objective. I don't see why there should be an edit war. RfD and RfDO seem like the appropriate venues.
If someone feels strongly about something, one can expect persistence. Many of us have styles of argumentation (e.g., sarcasm, stonewalling, lawyering) that others find objectionable. Thicker skin is a partial solution to the more abrasive styles.
I don't enjoy being on the other side of a battle discussion with Ivan, but that wouldn't make me accuse him of not acting in good faith. DCDuring TALK 14:38, 15 October 2013 (UTC)[reply]
Ivan used RFV instead, which really just confuses me because there are no agreed-upon criteria that can be applied to reconstructed terms. On what grounds would an RFV pass or fail? Clearly we disagree on those grounds, but he is acting as though that means his grounds are automatically valid and so he skirts around any consensus building whatsoever. He then goes ahead and treats the term as "RFV failed" and starts moving, deleting and orphaning pages, regardless of whether anyone else actually agrees with him on that. He also then creates a template purely to support his own POV and then starts tagging things with it, as if the use of such a template were backed by any kind of policy or consensual agreement. That is the part that I consider bad faith. I feel somewhat at a loss as to how to tell him to stop what he's doing and talk things through more thoroughly first, because either I let him go ahead or have to be prepared to discuss with him endlessly. It's his sheer stamina in arguing his own point of view that I find hard to deal with. It's as if he just completely dominates all other viewpoints off the map and is then satisfied that everyone agrees with him because nobody has the energy to continue anymore... —CodeCat 16:01, 15 October 2013 (UTC)[reply]
Nope, that's not bad faith. Bad faith is lying that I started RfV (I didn't). Bad faith is lying that I deleted a page (I didn't, I've moved it to a spelling that is citable with a reference to save it). Bad faith is you orphaning the citable form *źombos in favor of the uncitable form *źambas, e.g. on the entry for OCS зѫбъ (zǫbŭ), where the former had stood for six years[7]. (I'm surprised you didn't do it in one of your illegal bot runs, so that it doesn't appear in the watchlist at all). Bad faith is abusing the absence of policy to promote your guesswork by declaring any RfV/RfD discussions on them pointless. You're just another POV-pusher that is using personal attacks and logical fallacies to promote their agenda. --Ivan Štambuk (talk) 20:23, 15 October 2013 (UTC)

Replies:

  1. I don't "believe" that the newly created {{Original research}} is a valid Wiktionary policy, and the reason I created it is to properly tag entries that do contain original research (OR). You repeatedly restored and defended appendices containing reconstructions that appear to be OR by you (such as Reconstruction:Proto-Balto-Slavic/źambas), so there is a need to tag them. NOR (No Original Research) is one of the pillars of Wikipedia, and the only other sister project that explicitly doesn't embrace it is Wikiversity with its research groups. Since you resorted to defending their existence not by providing citations from reputable scholars, i.e. etymologists in the field, but by providing explanations for various sound changes[8], it is necessary to make it clear to the reader that this and other reconstructions are not the work of a real linguist, but of a Wiktionarian. The message in the template says "This entry contains OR, please provide inline citations proving otherwise so that this notice could be removed". It's important to separate real etymologies, which are established by the scientific community, from the guesswork done by User:Xyz. If there is no clear separation between the two, there is no way that Wiktionary could ever serve as a reliable etymological source.
  2. There is indeed no consensus or policy on OR. But AFAICS you're the only one actually guessing reconstructions without checking any sources, with multiple users objecting to such practice. Just because there isn't a policy (yet) regulating what you've been doing doesn't mean it's OK. You're abusing the lack of regulation to further such practice and dismiss any dissenting opinion. Furthermore, it appears to me that this is your preferred state of affairs, since nobody can tell you that what you're doing is wrong, and (as you rightly point out) the fact that a reconstruction cannot be cited verbatim is irrelevant for forming consensus on its deletion, as opposed to normal entries where the lack of attestation immediately causes their deletion. This enables you to create such appendices without challenge. But that's fine by me - just don't link to them in the etymologies of entries in the main namespace, or in other etymological appendices that are compiled on the basis of existing scholarship. If nobody can see them no harm is done. But the moment you do this it becomes a problem. I strongly suggest you don't change cited reconstructions to your guesswork any more.
  3. I haven't "deleted" anything; I've simply moved the page to a spelling that is citable. No content was lost. What you did is 1) moved it back 2) removed inline citations for a citable form to make it appear that both are equally valid 3) provided in the etymology section inline citations for the sound changes involved in reconstructing the *źambas form from two authors who happen to support completely different theories on the reconstruction of the Proto-Balto-Slavic protolanguage, to bestow an aura of legitimacy on the *źambas form. Cherry-picking evidence when it suits you, ignoring counter-evidence that doesn't suit you. You're misrepresenting the sources and pushing a particular protolanguage form on the basis of your personal preferences. I personally don't have a problem with either form, and as I've pointed out, if evidence is presented for both they would have to be reconciled in a neutral page name because of NPOV.
  4. Having British spellings redirect to American spellings or vice versa is wrong. It's also wrong to have Glagolitic Old Church Slavonic entries redirecting to Cyrillic forms, or ignoring non-canonical Church Slavonic altogether, which were your suggestions. You're against French feminine nouns redirecting to masculine counterparts apparently for feminist reasons, yet you see no problems with declaring large chunks of people's cultural heritage as variant forms, for practical editing reasons. That strikes me as a bit hypocritical. Yes, I was a bit too abrasive against users from hrwiki when the SC unification was discussed years ago, but my accusations have proven correct - the project is run by thinly-veiled far-right propagandists, and there is an ongoing discussion on Meta to shut it down because it's such an unfixable embarrassment. You can say it was "for the greater cause". I also think that in the future, as basic English-language entries become so polished that most edits will be done not in the definition lines but in translations, *nyms, etymologies and similar, it will become technically, even manually, feasible to provide an equal treatment for all spelling variants. But that's all irrelevant with regard to this discussion.
  5. Your whole reply above is largely a personal attack. I don't think you have any interest in forming a consensus on this matter at all. You just want to push a particular POV. I just want 1) to have current scholarship be properly represented, meaning there is no reason to prefer uncited *źambas to citable *źombos 2) to eliminate (or reduce to a bare minimum) any kind of guesswork in etymologies. It's sad that many protolanguages don't have a complete reconstruction. Indo-Iranian is not opposed by anyone (as opposed to Balto-Slavic, which gets occasionally attacked for political reasons), and yet you won't find a clean reference for Proto-Indo-Iranian anywhere. Such is reality - if the scientific community ignores some part of historical linguistics (because nobody is interested, or whatever), it's not up to us to fill in the blanks. Why is it a problem to have Proto-Slavic *zǫbъ, Lithuanian žam̃bas and Latvian zobs not having a Proto-Balto-Slavic stage in their etymological derivational chain? If for more than a century no Indo-Europeanist has bothered to make such a reconstruction, it's probably irrelevant, unnotable and we shouldn't list it. --Ivan Štambuk (talk) 16:56, 15 October 2013 (UTC)
You're doing it again. A huge wall of text, which I expect will scare off a lot of people who might have something to say on this matter. And it seems like you're just repeating the same things over again, like personal attacks against me ("I don't think you have interest in forming a consensus") or the "thinly-veiled far-right propagandists", or the assumption that theory-backed reconstructions (which you call "OR") are inherently less valid than verbatim citations without motivation, and that this therefore gives you free rein to do as you see fit despite the complaints of others such as me. So I really don't see much point in arguing on this as I don't want to argue on that level. I'm not playing your game anymore. The purpose of this discussion is to allow others to voice their opinions on the matter, not for us to discuss the same things all over again and let you dominate all other points of view out of existence. —CodeCat 17:22, 15 October 2013 (UTC)[reply]
Well, I do genuinely believe that you have no intent of forming a consensus, and are just abusing the lack of strict policies to push your reconstructions everywhere. Am I not allowed to state my opinion? How is that a personal attack? If you really want to push your "theory-backed reconstructions", submit a paper to a journal. Or open a blog, post them on Cybalist etc. Excuse me for pointing out that replacing citable reconstructions by experts in the field who have studied those protolanguages for decades with your own reconstructions is nutty. Your theories are your personal theories, nothing else. --Ivan Štambuk (talk) 01:54, 16 October 2013 (UTC)
Hey guys, I have been absent from Wiktionary for a while, but I do want to start getting involved again. One thing I have always wanted to bring up to the BP was that I think we should have a more Wikipedia-like policy on reconstructed entries. The reason why we allow original research in the main namespace is that relying on others to publish something about it first could keep us from adding many new words. We have a controlled method for doing this kind of research and reviewing it (RFD, RFV, etc.), but this does not apply at all to reconstructed terms. For one thing, new words are not being added to proto-languages so we do not have a naturally growing lexicon to keep up with. In fact, the only way a proto-language's lexicon can grow is through research that we are not qualified to do ourselves (or at least we do not have a reliable process through which to do it). Thus, all we can know about proto-languages is from linguistic publications and so we must cite them in our reconstructions. That is not to say that we can't do any reconstruction ourselves, but only that we have to cite all the information we used to perform these reconstructions (the sound-shift we followed, etc.). I am also not saying that we should go around deleting these pages, but that we should go around trying to find citations that support us. --WikiTiki89 17:37, 15 October 2013 (UTC)
Yes, this is my view on the matter as well. And I recall that Ivan suggested something similar not too long ago at the end of the prior discussions: Wiktionary:Beer parlour/2013/September#Etymology policy, original research, aliaque Wiktionarii conturbata, Wiktionary:Beer parlour/2013/September#Original research at Wiktionary. But we have already discussed that quite a bit, and I would prefer it if we can keep this focused on Ivan's conduct so that the topic doesn't stray too much from what I feel is quite a problem. Is it ok if I split this discussion off into a separate section? —CodeCat 17:46, 15 October 2013 (UTC)[reply]
I think it is unnecessary to warn users about Original Research, and I think it is wrong to go around deleting pages. I think instead we should create a category for uncited words and then go through them and find citations. --WikiTiki89 17:58, 15 October 2013 (UTC)[reply]
Why don't you try doing that on Wikipedia, adding uncitable reconstructions to articles? You'd get reverted, and eventually blocked if you persisted in adding such OR. So all you kids come here and contaminate legitimate etymologies with your made-up guesswork. Everything worthwhile in Proto-Indo-European and its descendants has already been reconstructed and discussed to death in the 19th century, theories have been updated in the 20th century, and if in the year 2013 you still cannot find a source supporting a particular reconstruction, then it's probably not worth having it at all. Thousands of papers are published each year suggesting new reconstructions, and if User:Xyz's reconstruction cannot be corroborated from a published work, it's either 1) of no practical use, 2) unreconstructable using established methods, or 3) both. --Ivan Štambuk (talk) 20:12, 15 October 2013 (UTC)[reply]
@Ivan - I'm not in a position to comment on the subject of the discussion itself, but I find it necessary to point out that there is nothing constructive to be gained from making a random accusatory statement like, "You're against French feminine nouns redirecting to masculine counterparts apparently for feminist reasons." What does a sociopolitical ideology have to do with grammatical gender, and why are you assuming that this is what motivates CodeCat's opposition to the redirection proposal in question? -Cloudcuckoolander (talk) 23:22, 15 October 2013 (UTC)[reply]
Well that is how I construed this comment. Instead of treating feminine forms as variants by using a soft-redirect template (i.e. like it's done in all of the dictionaries, not only for French but pretty much all Indo-European languages), users are required to duplicate definitions, usexes and so on simply because it "feels wrong". And this with English language not even having a feminine gender. It seems like unnecessarily imposing somebody's sociopolitical ideology (as you call it). Note however that I don't care at all as to the outcome of that discussion - the only reason why I was mentioning it in the first place is to show hypocrisy in the treatment of alternative forms, in which I was called for making personal attacks, (in case of British/American spellings, but there are some other cases such as (Old) Church Slavonic which boil down to the same argument, not sure what exactly was meant but at any case it doesn't matter). My arguments for full treatment in all of these cases were to ensure equal treatment of every variant/spelling/script, which were frowned upon for being impractical (too much maintenance for mirroring changes). Perhaps someone found themselves offended, but they shouldn't have done such controversial changes without a vote. --Ivan Štambuk (talk) 01:38, 16 October 2013 (UTC)[reply]
"I just dislike the idea of treating feminine nouns as secondary to masculine ones?" There's many reasons someone could think that. I don't see cause for ascribing the specific basis you have to it. It doesn't aid discussion to read something into a comment, as seems to have been done in this instance. -Cloudcuckoolander (talk) 03:07, 16 October 2013 (UTC)[reply]
I can't really think of any other reason than feminism/gender equality and similar. (Not that there's anything wrong with it.) I just found the argument silly - it's not our fault that the derivational morphology of Indo-European languages reflects ancient patriarchal society, and that masculine is the unmarked gender. Fun fact: according to one theory: the original Indo-European feminine marker (< *-eh₂) was in fact taken from the word for "woman" *gʷén-eh₂. So every feminine noun is not only a feminine-form-of, but on a deeper level, a woman-form-of. --Ivan Štambuk (talk) 03:44, 16 October 2013 (UTC)[reply]
He's still trying to force his POV through and present it as consensus: diff. When is this going to stop? It doesn't look like Ivan will stop on his own. —CodeCat 17:07, 21 October 2013 (UTC)[reply]
Sorry, you're the only one fabricating etymologies around here, without checking any of the sources. --Ivan Štambuk (talk)

Purplebackpack89 Rollback request

Here's why:

  • Almost 1,000 edits
  • Rollback on a number of other projects

Purplebackpack89 (Notes Taken) (Locker) 22:25, 16 October 2013 (UTC)[reply]

Added rollbacker rights. —Stephen (Talk) 00:50, 18 October 2013 (UTC)[reply]

Transliterations for inflected forms in headwords?

The normal practice so far has been to show a transliteration for the headword, but not for any inflected forms. {{head}} never had any support for inflection transliterations and most language-specific templates have never supported it either. However, User:Atitarev asked to add it to the Russian templates, so I'm wondering if this is something that really should be widely supported. From a technical point of view it's not a huge problem to add this support; it would just make Module:headword a bit more complex because it would have to be able to deal with link + transliteration pairs as well as just links alone. But it would make the headword itself look a bit odd. Here's an example, first with the transliteration in plain text and then with it in bold:

па́дать (pádatʹ) impf (perfective упа́сть (upástʹ) or пасть (pastʹ))
па́дать (pádatʹ) impf (perfective упа́сть (upástʹ) or пасть (pastʹ))

It looks a bit visually cluttered to me, especially with the brackets-within-brackets, so I'm not sure if it's really such a great idea to do this. —CodeCat 22:25, 17 October 2013 (UTC)[reply]
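
To illustrate what "link + transliteration pairs as well as just links alone" could mean in practice, here is a minimal Python sketch. It is not the Lua code of Module:headword; the function names `format_inflections` and `format_headword` and the data layout are assumptions made purely for this example.

```python
def format_inflections(label, forms):
    """Format one inflection group, e.g. 'perfective X (x) or Y (y)'.

    Each item in `forms` is either a plain term (str) or a
    (term, transliteration) pair; both shapes are accepted.
    """
    parts = []
    for form in forms:
        if isinstance(form, tuple):
            term, translit = form
            parts.append(f"{term} ({translit})")
        else:
            # Plain link without a transliteration.
            parts.append(form)
    return f"{label} " + " or ".join(parts)


def format_headword(head, translit, gender, groups):
    """Build a plain-text headword line (no bolding, no links)."""
    line = f"{head} ({translit}) {gender}"
    if groups:
        inner = ", ".join(format_inflections(label, forms) for label, forms in groups)
        line += f" ({inner})"
    return line


print(format_headword("па́дать", "pádatʹ", "impf",
                      [("perfective", [("упа́сть", "upástʹ"), ("пасть", "pastʹ")])]))
```

The nested brackets in the printed line are exactly the visual clutter being discussed; the sketch only shows that accepting both shapes of input is a small change.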

OK, kill "impftr" and "pftr". I've checked with User:Stephen G. Brown, he has no objections. We'll have to make sure that the opposite forms have word stresses and get created. --Anatoli (обсудить/вклад) 01:46, 18 October 2013 (UTC)[reply]
That's fine, but I still wonder what people think about this more generally. —CodeCat 01:57, 18 October 2013 (UTC)[reply]


Yes, we should probably include transliterations everywhere. But the proposed form is not good enough yet.
The transliterations of the inflected forms and their brackets should not be bold. (Come to think of it, the inflected forms probably shouldn’t be bold either.)
The nested brackets are bad. Conventional form is to use square brackets inside round brackets, but that would still be overly cluttered. We should use commas and parentheses, or parentheses and commas. This might require amending WT:ELE.
The linked bullet is confusing. A bullet is a list-item marker or a separator. Here it incorrectly implies that pádatʹ is not related to падать. Also, it is apparently meant to appear after every headword and not after every transliteration, so why should a reader conclude that it relates to transliterations? Notice that impf also conveys additional information, but in a completely different way. Michael Z. 2013-10-18 02:04 z
Maybe a dash as a separator:
падать (padatʹ) impf – perfective упасть (upástʹ) or пасть (pastʹ)
  1. to fall, drop, sink, decline


These long headword lines might be clearer if every inflection item was on its own line.
падать (padatʹ) impf
perfective упасть (upástʹ) or пасть (pastʹ)
  1. to fall, drop, sink, decline


take
takes – third-person singular simple present
taking – present participle
took – simple past
taken – past participle
  1. (transitive) To grasp with the hands.


 Michael Z. 2013-10-18 02:17 z
I support keeping the transliterations everywhere and making the inflected forms (and their transliterations) unbolded by default. Using dash as a separator is a good idea. I oppose removing the link to the corresponding transliteration appendix, and support using a better way for linking to it. My suggestion is using a superscripted question mark immediately after the transliteration:
падать (padatʹ)? impf
--Z 06:59, 18 October 2013 (UTC)[reply]
It seems a bit much to pepper every single mention of a term with little question marks, when a reader can learn what this is in 30 seconds.
How about an unobtrusive link for the whole transliterated text? Keep the text black, but the word could show the tooltip and link to the help text. We could show the same kind of dotted underline that we do for abbreviations like impf, but that might interfere with transliteration schemes that have diacritics below. I believe we can accomplish this using CSS in the stylesheet, but I can’t just demo it with code on the page. Michael Z. 2013-10-18 16:36 z

Vote on X-SAMPA

There was a consensus to remove X-SAMPA in a recent Beer Parlour discussion. Could someone draw up a vote on the matter, please? I would do it myself but I'm trying to spend less time on the Internet. I will review the vote once it's written, if someone is kind enough to initiate one. Mglovesfun (talk) 13:41, 18 October 2013 (UTC)[reply]

Multiple headwords and transliterations

Recently I added head2=, head3= (etc) parameters to {{head}}, which allow you to specify more than one headword. This can be useful in cases where diacritics can be added to the base word in more than one way. When I added it, I did overlook one detail and that's transliterations. As a result, if a transliteration is added, it only transliterates the first headword:

пе́рвый or второ́й or тре́тий (pérvyj or vtorój or trétij)

This is probably not something we want to keep this way, and the most obvious solution to me is to add corresponding tr2=, tr3= parameters for each headword. But how do we display these transliterations? Do we group each of them after the corresponding headword?

пе́рвый (pérvyj) or второ́й (vtorój) or тре́тий (trétij)

Or do we put them all at the end?

пе́рвый or второ́й or тре́тий (pérvyj or vtorój or trétij)

Which is better? —CodeCat 18:47, 19 October 2013 (UTC)[reply]
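
For comparison, the two display options can be sketched in a few lines of Python. This is only an illustration of the two layouts, not the actual {{head}} implementation; the function names `grouped` and `trailing` are made up for the example, and it assumes every headword has a matching transliteration (head2= with tr2=, and so on).

```python
def grouped(heads, translits, sep=" or "):
    """Option 1: each transliteration directly follows its headword."""
    return sep.join(f"{h} ({t})" for h, t in zip(heads, translits))


def trailing(heads, translits, sep=" or "):
    """Option 2: all headwords first, all transliterations at the end."""
    return f"{sep.join(heads)} ({sep.join(translits)})"


heads = ["пе́рвый", "второ́й", "тре́тий"]
translits = ["pérvyj", "vtorój", "trétij"]
print(grouped(heads, translits))
print(trailing(heads, translits))
```

Either layout needs the same input, so the choice is purely about presentation and can be changed later without touching the entries.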

Personally, I like the second option. --WikiTiki89 19:05, 19 October 2013 (UTC)[reply]
First option. Everywhere else, romanization is displayed right next to the romanized word. There’s no reason to write “or” four times for only three terms. Michael Z. 2013-10-19 19:52 z
I like the second option better. I think the first option would make for a more confusing presentation when there are also inflected forms on the line. —RuakhTALK 19:58, 19 October 2013 (UTC)[reply]
Here is what that would look like with the first option:
пе́рвый (pérvyj) or второ́й (vtorój) or тре́тий (trétij) m (genitive пе́рвого, plural пе́рвые)
and with the second:
пе́рвый or второ́й or тре́тий (pérvyj or vtorój or trétij) m (genitive пе́рвого, plural пе́рвые)
CodeCat 20:26, 19 October 2013 (UTC)[reply]
Just for fun, here’s a version with ruby text.
пе́рвый (pérvyj) or второ́й (vtorój) or тре́тий (trétij) m (genitive пе́рвого (pérvogo) , plural пе́рвые (pérvye) )
Can we see a real example from an existing entry? What are the use cases for multiple headwords? Michael Z. 2013-10-19 22:38 z
  • I still support that 1) transliterations should go under their own header ===Transliterations=== 2) there should be multiple transliteration schemes provided (scholarly, "dumbed-down IPA" as they are used now), now easily automatible through Lua 3) they shouldn't be displayed at all inside {{term}}, {{l}} and others, maybe as popups if necessary. --Ivan Štambuk (talk) 23:25, 19 October 2013 (UTC)[reply]
    Surprisingly, I have thought about it, and I would support all three of Ivan's statements, except that I think that transliterations should be included in {{term}}, etc. for scripts that generally have poor font support (e.g. Samaritan or anything outside the Basic Multilingual Plane). --WikiTiki89 23:38, 19 October 2013 (UTC)[reply]
That idea has merit. I did up some examples in 2008, and батяр#Romanization still remains. I don’t recall any objections to the idea then. Today this could be automated with Lua, depending on the transliteration schemes.
Conceivably, there could also be Cyrillizations and other international conversions of terms. We should confirm that this is for orthographic conversions and conventional romanizations, and not pronunciations which already have a header (the distinction is not always clear cut). Is the correct title the restricted Romanization(s), the more general Transliteration(s), or perhaps the all-encompassing Conversion(s).
Do we include non-standard wiki-transliteration schemes? Since we would be opening it up to multiple conversions, do we endeavour to prevent hobbyists from adding another dozen schemes of their own? I would like to limit this to published national and international standards. Michael Z. 2013-10-20 19:17 z
I do think we should keep the transliterations in links, inflection tables and such. They are too helpful to get rid of. And it doesn't really answer the original question of how to display transliterations in headwords, because I think the transliteration should be kept here too for the same reason. —CodeCat 19:25, 20 October 2013 (UTC)[reply]

More things that should have been done long ago (sorry for diverting the discussion, feel free to add a separate section, but it seems relevant to the topic on how to reduce the visual clutter in the headword line):

  • Inflections in the headword line are useless if there is an inflection table present.
  • The sole reason why "principal parts" are listed in the headword line is 1) to mimic paper dictionaries, which cannot provide full inflection due to space constraints 2) inertia with English being the project's main language, and English entries can have the full inflection specified in the headword line.
  • The principal parts, if necessary to be displayed at all, should be displayed in the inflection table, e.g. within the title of the collapsible box.
  • For every language, every single inflectional class and all its subtypes should be labeled, using e.g. numbers and letters. Many languages don't have those and use some vague descriptive names, or have some general inflectional classes with many unpredictable subtypes. All such classes should be documented in appendices, and linked to in the head word line e.g. pattern 3-b.
  • Once we get rid of inflection and transliteration from the headword line, there will be space to put pronunciation on the right, both IPA and the play button.
  • headword line should be hyphenated, and grammatical information (gender, plurality etc.) as well as additional forms (e.g. feminine forms, perfective/imperfective pairs such as for Russian above etc.) should come in the next line below, and would include a part of speech which should be eliminated as a section name. --Ivan Štambuk (talk) 20:33, 21 October 2013 (UTC)[reply]

Hi, I merged Wiktionary:Index to templates into Wiktionary:Templates, as there was plenty of repetition and redundancy. It could still be improved and polished, for sure, but I hope you approve. --ElisaVan (talk) 12:13, 21 October 2013 (UTC)[reply]

Language(s) with incomplete Unicode support

I have recently started adding some Judeo-Arabic words and I ran into a problem. Judeo-Arabic has two essential diacritics, a single dot above (for letters ג, ד, ט, כ/ך, צ/ץ, ת) and a double dot above (for the letter ה). The single dot can be represented with U+05C4 (HEBREW MARK UPPER DOT) and has decent font support. The double dot, however, does not exist in Unicode at all. For example, אכׄוה (ʾixwa(t), brothers), the plural of אךׄ (ʾax, brother), should have two dots on its last letter ה, similar to its Arabic equivalent إِخْوَة (ʔiḵwa) , but there is no way to represent this in Unicode (although it seems to have been proposed at some point). I am sure this applies to other minor languages. My questions are:

  • Have we encountered this kind of problem before?
  • How have we solved it or how can we solve it?
  • Should entries be created without the dots?

--WikiTiki89 16:33, 22 October 2013 (UTC)[reply]

For this particular case, could alternatives be used like a diaeresis? —CodeCat 17:02, 22 October 2013 (UTC)[reply]
It comes out like this: אכׄוה̈. But, I just looked at a couple older manuscripts (this one and this one) and neither of them use the double dots, leaving the ה as it is (but they do use the single dots). More modern printed versions such as this one use it consistently. --WikiTiki89 17:20, 22 October 2013 (UTC)[reply]
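
For anyone who wants to check what the two encodings actually contain, here is a small Python sketch using the standard `unicodedata` module. It builds the word with U+05C4 on the kaf and, as the stand-in suggested above, U+0308 (COMBINING DIAERESIS) on the final he; the helper name `describe` is invented for the example, and whether the diaeresis renders acceptably over a Hebrew letter depends entirely on the font.

```python
import unicodedata

UPPER_DOT = "\u05C4"  # HEBREW MARK UPPER DOT (the single dot)
DIAERESIS = "\u0308"  # COMBINING DIAERESIS (stand-in for the missing double dot)

# alef, kaf + upper dot, vav, he + diaeresis
word = "\u05D0\u05DB" + UPPER_DOT + "\u05D5\u05D4" + DIAERESIS


def describe(text):
    """Return (code point, character name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


for code, name in describe(word):
    print(code, name)
```

Note that the stand-in comes from the generic combining-marks block rather than the Hebrew block, so searches and sorting may treat it differently from a future dedicated character.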
South Picene, Middle Persian and Egyptian have Latin-script entries due to incomplete Unicode support of their native characters. — Ungoliant (Falai) 00:25, 23 October 2013 (UTC)[reply]
So do both Tocharian languages. —Aɴɢʀ (talk) 20:06, 23 October 2013 (UTC)[reply]
Why are Middle Persian and Egyptian not completely supported by Unicode? -- Liliana 20:15, 23 October 2013 (UTC)[reply]
See this and this. — Ungoliant (Falai) 01:58, 24 October 2013 (UTC)[reply]

Converting WT:RFV, WT:RFD, and WT:RFDO to monthly subpages

I think that in order to avoid clutter, it's time to convert WT:RFV, WT:RFD, and WT:RFDO to monthly subpages, as we did with the WT:BP and WT:GP. And we should also require that every issue on these pages be closed within a decided-upon number of months (I suggest 2). What do you guys think? --WikiTiki89 21:42, 23 October 2013 (UTC)[reply]

And what happens when two months have passed and there is no consensus? DTLHS (talk) 21:48, 23 October 2013 (UTC)[reply]
At RFD and RFDO, no consensus defaults to keep. RFV isn't about consensus; either a term gets verified or it doesn't, and if it doesn't it gets deleted. —Aɴɢʀ (talk) 22:19, 23 October 2013 (UTC)[reply]
  • Initial oppose, unless discussion convinces me otherwise. --Dan Polansky (talk) 22:05, 23 October 2013 (UTC)[reply]
  • I do support dividing them into monthly subpages. But DTLHS does bring up a valid point. We shouldn't have a time limit for discussions; they should go on for as long as needed until either consensus is reached, or we decide there is none. So we would need a somewhat different practice from what we have for the BP. I think it may work if we make it a goal to delete the pages once all the discussions for that month are archived. That way it becomes immediately clear when a discussion is overdue for closure, because its monthly page will still exist. We could also go for a completely different solution and hold the discussions on the talk pages instead, and only link to them or transclude them from the main page. That way, the discussions are self-archiving just like the BP and GP now are. —CodeCat 22:24, 23 October 2013 (UTC)[reply]
  • Support: RfD, RfV, and RfD/O are too long and too unnavigable at present. Purplebackpack89 (Notes Taken) (Locker) 23:03, 23 October 2013 (UTC)[reply]

Can some of these be broken up into categories? WT:RFV/en, WT:Requests for deletion/Categories, WT:Requests for deletion/Templates Michael Z. 2013-10-24 01:11 z

Here is what I was thinking about the time limit:
  • If a consensus is reached, good.
  • If the discussion stalls then at the end of the time period:
    • (for RFD/RFDO) it should be closed as no consensus.
    • (for RFV) the term should be deleted since it couldn't be verified in two months.
  • If the discussion is still going on at that point, it should be moved to the talk page of the term.
I don't think we should move all the discussions to the talk pages because having a central page on everyone's watchlist tends to draw more people into the discussion (which is what we want) and this wouldn't happen if the discussions were all on separate pages. --WikiTiki89 01:50, 24 October 2013 (UTC)[reply]
I just realised there is a rather serious technical barrier for this. The {{rfv}} and {{rfd}} templates both link to the discussion pages. If we split them by month, then these templates need a way to know what monthly subpage to link to. I can't think of any other way to do that other than to include the time of the request on the entry being submitted. Wikipedia handles such things by bot, but is that feasible for us? —CodeCat 02:06, 24 October 2013 (UTC)[reply]
Can't we use the {{CURRENTMONTH}} magic word for this? Never mind, unless we wanted to subst it. DTLHS (talk) 02:10, 24 October 2013 (UTC)[reply]
(after edit conflict) We could require {{rfd}} et al. to be substed and produce something like {{RFD|October|2013}}. And then fix all of the accidentally unsubsted ones by bot. --WikiTiki89 02:16, 24 October 2013 (UTC)[reply]
The templates can keep linking to the hub page (WT:Requests for deletion); the link will lead to the correct subsection since all the monthly subpages will be transcluded there. — Ungoliant (Falai) 02:23, 24 October 2013 (UTC)[reply]
I was thinking that, and it certainly would work, but I think it would be better for them to link to the subpage so that no one would ever have to load the entire hub page. We could combine the two ideas: {{rfd}} would link to the hub when it is not substed or given a date, and to the subpage once it is substed or given a date. --WikiTiki89 02:27, 24 October 2013 (UTC)[reply]
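
The fallback logic suggested here (hub page when no date is known, monthly subpage once a date is present) can be sketched as follows. This is a Python illustration rather than template code; the function name `rfd_link_target` is made up, and the subpage naming "Wiktionary:Requests for deletion/2013/October" is an assumption modelled on the existing Beer parlour subpages, not an agreed convention.

```python
HUB = "Wiktionary:Requests for deletion"


def rfd_link_target(entry, month=None, year=None, hub=HUB):
    """Return the discussion link for `entry`.

    With both month and year (i.e. the template was substed or given a
    date), link to the monthly subpage; otherwise fall back to the hub,
    where all monthly subpages would be transcluded.
    """
    if month and year:
        page = f"{hub}/{year}/{month}"
    else:
        page = hub
    # The section anchor is the entry title in both cases.
    return f"{page}#{entry}"


print(rfd_link_target("foo"))
print(rfd_link_target("foo", month="October", year=2013))
```

Because the undated form still resolves to a working link, a bot could add the missing dates at its own pace without breaking anything in the meantime.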
What will be the incentive for closing discussions? Currently we have the pages being too big. — Ungoliant (Falai) 02:15, 24 October 2013 (UTC)[reply]
Someone will just have to do it. I was even thinking that maybe we should lock down the current pages until all expired discussions are closed, but I'm not sure that's a good idea. --WikiTiki89 02:18, 24 October 2013 (UTC)[reply]
I oppose. The nature of RFV and RFD is that threads are moved off the central pages once they're closed, which is supposed to be after 1-2 months. The pages are currently very large because many threads remain unclosed. If we split the pages into subpages, people still have to close and move threads, plus they have to delete subpages once they're empty (and keep track of old but non-empty subpages)... i.e. subpagination would create more work. Furthermore, it would reduce the pressure to close old threads. You propose to address this by requiring that threads be closed after some number of months... but per the KISS principle, it seems to me we could solve the problem of the pages being too large just by instituting a requirement that old threads be closed, without splitting RFV and RFD onto subpages. In other words, I think subpagination would be unnecessary at best and detrimental at worst.
I would, however, support converting RFC to use BP-style subpages; see the subsection below. - -sche (discuss) 04:24, 24 October 2013 (UTC)[reply]
It's supposed to make things more systematic. The idea is that you can just go through and close everything on an expired month's subpage, rather than selective closing as we do now. --WikiTiki89 04:44, 24 October 2013 (UTC)[reply]

Converting RFC to use monthly subpages / automating RFC's use of subpages

What do people think of converting RFC to use monthly subpages like the BP does?
Whereas RFV and RFD threads are supposed to be closed after a fixed number of months, and can be (by deleting anything that hasn't been verified by then, or keeping anything for which a consensus to delete hasn't developed by then), RFC threads stay open until someone cleans up the entries, and there's no way to 'force' that to happen by a fixed time. (Except, perhaps, to throw out all the babies with the bathwater and delete any "unclean" entries, but I find that idea unpalatable.) As a result, WT:RFC is huge. It's so huge that, for a year now, I've been manually moving almost half a decade of still-open cleanup requests into yearly subpages like this just to make the main page semi-manageable. In other words, WT:RFC already uses subpages; why don't we automate its use of them? - -sche (discuss) 04:24, 24 October 2013 (UTC)[reply]

I seem to have the opposite point of view about this. I feel like having subpages for RFC will make it easier for people to forget about RFC requests. And, unlike RFV/RFD/RFDO, RFC requests can't be forcibly closed, and so we can't just go through and close everything from a given month. --WikiTiki89 04:35, 24 October 2013 (UTC)[reply]

(After edit conflict. I seem to have been adding this section at the same time as -sche was adding the RFC one.)

I'd also like to suggest the same for the Tea Room, although the discussion needs to be separate because it functions differently. Tea Room discussions don't need to be "closed" in the same sense as RFV/RFD/RFDO, making them more like Beer Parlour discussions. Therefore, the Tea Room should be able to function the exact same way as the Beer Parlour and Grease Pit. The only technical issue is the same as the template linking issue above, in the rare case that the {{rft}} template (which is also poorly named: Request For Tea?) is actually used. --WikiTiki89 04:28, 24 October 2013 (UTC)[reply]

No more Mr Bad Guy

Some of you may have noticed that I have not blocked anyone, deleted any nonsense, or reverted any vandalism since yesterday morning.

This is because I am resigning as a sysop forthwith - I just can't be bothered any more.

I shall continue to work on technical words in English, and general words in most of the Romance languages, but I won't be patrolling Recent Changes or doing any other sysop-type activities.

I shall continue to operate the bot "SemperBlottoBot", but will now be unable to delete any incorrect entries that it might create. Get a sysop to block it if you think that is best.

Good luck. SemperBlotto (talk) 07:09, 24 October 2013 (UTC)[reply]

p.s. Somebody needs to remove my Bureaucrat status (I don't seem to be able to do it myself). SemperBlotto (talk) 07:15, 24 October 2013 (UTC)[reply]

Uh-oh. DCDuring TALK 12:23, 24 October 2013 (UTC)[reply]
It is best if you retain the status of Bureaucrat. Then, if there is a major vandal attack, or if you need to clean up the bot's mess, you can make yourself a Sysop temporarily. Toasted Almonds (talk) 14:25, 24 October 2013 (UTC)[reply]
Eep. :-/
Re: removing your Bureaucrat status: see Wiktionary:Votes/2012-11/Bureaucrats and de-privving.
RuakhTALK 14:40, 24 October 2013 (UTC)[reply]
  • Too bad. By the way, I don't think any of the truly productive editors here think of you as the bad guy, or if, then in the best sense. Receiving all the misguided complaints on your user talk page all the time must be quite defatiguing, I figure. As I said before, I for one greatly appreciate the ginormous labor you have done for Wiktionary. I wonder what the final straw that broke the camel's back was, if any. --Dan Polansky (talk) 17:48, 24 October 2013 (UTC)[reply]
    User:CodeCat blocked SB the other day [9] because he deleted a userpage of User:Casicastiel (newly registered account, no edits) who immediately complained about it. --Ivan Štambuk (talk) 20:15, 24 October 2013 (UTC)[reply]
    ... ...
    It's HIS fault?
    What must we do to get rid of this person? -- Liliana 20:56, 24 October 2013 (UTC)[reply]
    (edit conflict) To be clear, I was never, at any point, protesting the block; my comment on the Information Desk was intended to suggest that my creation of a userpage as my first edit may have been out of line. I was well aware and accepting of the reasoning behind the block. I was not trying to seek retaliatory action and I apologize if I provoked a conflict. It was not my intention. — E | talk 21:02, 24 October 2013 (UTC)[reply]
    Well, it can hardly be called your fault. To say it in German (fellows can translate if there's a good English equivalent), you were "der Funke, der das Feuer entfacht hat." -- Liliana 21:14, 24 October 2013 (UTC)[reply]
  • I sincerely don't understand what your comment above is trying to say, so if this is the "clarified" version, I can only imagine how impenetrable it must have been before. (Also, please stop referring to CodeCat using masculine pronouns. It's rude and inappropriate.) —RuakhTALK 02:39, 25 October 2013 (UTC)[reply]
It’s sad to see you break your long‐lived administration. It’s true that you were a tougher administrator than most, but personally I was never unhappy about that, it’s just that sometimes it can be excessive. I’m glad that you are going to continue your Romance additions, and I’ll be sure to credit your entries if I borrow from them. —Æ&Œ (talk) 22:26, 24 October 2013 (UTC)[reply]
  • The block log says this: "21:34, 22 October 2013 CodeCat (Talk | contribs) blocked SemperBlotto (Talk | contribs) with an expiry time of 15 minutes (account creation disabled) (Wasting people's time with User:Casicastiel, stop blocking people for no reason)". The idea that SemperBlotto is wasting people's time is ridiculous. CodeCat behaves like the owner of the place, with too few people protesting. Blocking the most productive editor and admin, even if only for 15 minutes, is a damn rude slap in the face.

    There are good reasons to believe that CodeCat is a male who asks others to refer to them as "she" in order to misuse the stereotype that females are fragile, while showing an aggressive male behavior. I for one find this objectionable and ask CodeCat to stop feigning being a female.

    On yet another note, "E | talk" is a horrible signature. For one thing, signatures should match user names. For another thing, signatures should create the appearance of being names, which "E" is not. --Dan Polansky (talk) 09:38, 25 October 2013 (UTC)[reply]

    For all I know E might be a valid name in China or South East Asia. Otherwise, I agree with your points (that's a first!) -- Liliana 09:49, 25 October 2013 (UTC)[reply]
    E has been the name I've gone by online for nearly two years. (Also, I'm pretty white.) — E | talk 10:39, 25 October 2013 (UTC)[reply]
    Re: "E has been the name I've gone by online ...": Which is why you have registered the user name User:Casicastiel, right? --Dan Polansky (talk) 12:29, 25 October 2013 (UTC)[reply]
    "E" has been taken since 2007, and we seemingly have no equivalent to w:WP:USURP over here. Any other unreasonable complaints? Keφr 16:04, 25 October 2013 (UTC)[reply]
Without getting into the fray, I want to thank SemperB for the work he's done, which for all this time has let me and all of the other editors focus on the more rewarding effort of adding and improving entries and spend less time on the tedious and emotionally draining task of cleaning up vandalism. Haplology (talk) 11:13, 25 October 2013 (UTC)[reply]
@Haplology: Word. --Dan Polansky (talk) 12:29, 25 October 2013 (UTC)[reply]
@Dan - Your comments expressing doubt over CodeCat's self-identified gender are utterly beyond the pale. I've seen a lot of unpleasant, uncivil comments in my time at Wiktionary, but this stands out as one of the worst. -Cloudcuckoolander (talk) 11:08, 25 October 2013 (UTC)[reply]
Maybe you can clarify which particular statements are beyond the pale--I made more of them--and what it is that makes them beyond the pale? By beyond the pale I understand something like "outside the bounds of morality, good behaviour or judgement in civilised company". Or instead of clarifying, maybe you can provide a reference to an online article that clarifies to me why the statements are immoral or uncivil or whatever they are? If, in the physical world, a male-looking individual insists on being addressed as "she", do you deem it my moral duty to address that individual as "she"? --Dan Polansky (talk) 12:29, 25 October 2013 (UTC)[reply]
In general, I take it to indicate incivility in an obscure reference to Ireland. In general, however much you may doubt someone's stated gender, it would be considered rude to comment publicly about it. Especially as your claimed justification is they are 'too manly' in online social interactions on a wiki-based dictionary. Hardly an environment associated with machismo, neh? If someone chooses to identify as a member of some group and it has no relevance to the situation, let them do so without challenge. (For the same reason one, when face-to-face, pretends not to notice involuntary smells or sounds; there is no practical benefit to the conversation in doing so, and plenty of potential harms.) - Amgine/ t·e 14:08, 25 October 2013 (UTC)[reply]
Re: "your claimed justification is they are 'too manly'": No, you're misreading; the "aggressive male behavior" part was part of his accusation, not part of his justification. His claimed justification was "good reasons". (I doubt you care what those "good reasons" are — I sure don't — but if you do, then please ask; don't invent.) —RuakhTALK 14:37, 25 October 2013 (UTC)[reply]
(edit conflict) It's deeply problematic to suggest that an editor must be one gender when they have stated they are the other. Stating that someone has displayed "an aggressive male behavior", and therefore must be "feigning" being a woman, constitutes a personal attack against CodeCat by implying that A) she's a liar and B) that she's too "aggressive" to be considered a woman. If you want to call someone out because you think they're being overly bold or domineering in their approach to editing, that's fine, but gender shouldn't enter the picture. Overzealousness is an unfavourable trait in anyone, regardless of gender or other distinctions. -Cloudcuckoolander (talk) 14:45, 25 October 2013 (UTC)[reply]
Just to note, every time I see someone mentioning that CodeCat is a she, it is not CodeCat who does it. I have never found a single instance of CodeCat identifying as female anywhere, actually. It seems that for CodeCat h..self, gender is simply an irrelevant issue most of the time. Which is how it should be, I think. (As for myself, you can refer to me as "it". I will not care.)
I think we should go back to talking about whether the blocks were reasonable. I think at least one unreasonable block has been given, but I am unsure which one it was.
Also, @User:Dan Polansky: "CodeCat behaves like the owner of the place, with too few people protesting." Uhm. Pot. Kettle. Hello. Keφr 16:04, 25 October 2013 (UTC)[reply]
If CodeCat never asked others to refer to them as “she”, then I don’t need to hear Dan’s alleged “good reasons” to conclude that he’s deluded. ~ Röbin Liönheart (talk) 23:38, 28 October 2013 (UTC)[reply]
East-coast liberal pinkos, please understand that many of us cannot fathom the notion that you can pick your own gender. Stop imposing your minority views on us and do not call us rude/uncivil just because we cannot bring ourselves to call a male "she". --Vahag (talk) 15:36, 25 October 2013 (UTC)[reply]
(edit conflict) And there goes the last shred of faith I had in this project. -Cloudcuckoolander (talk) 16:17, 25 October 2013 (UTC)[reply]
Cloudcuckoolander, why are you so intolerant? --WikiTiki89 16:22, 25 October 2013 (UTC)[reply]
Oh, right, the classic spin: "calling out incivility and intolerance makes you uncivil and intolerant." I'm done here. -Cloudcuckoolander (talk) 16:35, 25 October 2013 (UTC)[reply]
Tolerance is enduring things you dislike. Vahag was expressing dislike, you were expressing intolerance. --WikiTiki89 16:57, 25 October 2013 (UTC)[reply]
Since when has calling people "east-coast liberal pinkos" and making a comment which apparently dismisses the existence of transgendered people been consistent with "tolerance?" It doesn't matter what personal viewpoints people hold in private, but the moment said viewpoints impact how they edit Wiktionary or how they interact with other editors, it becomes problematic.
My loss of faith in this community is prompted by its general indifference toward the concept of civility. People say discourteous and tendentious things, and are seldom called out on it, much less held to any sort of account. -Cloudcuckoolander (talk) 17:17, 25 October 2013 (UTC)[reply]
Part of being civil is being tolerant to other people's incivility. The reason I called you out on it was because I was hoping you'd see this, since you are more likely to change your mind than Vahag. --WikiTiki89 17:29, 25 October 2013 (UTC)[reply]
@Cloudcuckoolander...I’m assuming you’re referring to Vahag when you say "there goes the last shred of faith". Remember that sarcasm and irony are difficult to convey in this medium. Vahag’s comment was made in jest. He’s the nicest guy you could ever hope to meet, and the most tolerant and laid back. But he does like to crack wise from time to time. —Stephen (Talk) 12:13, 26 October 2013 (UTC)[reply]
I personally hate that toxic dogpiling "callout culture" of Tumblr etc. People have the right to be offensive jerks, even though I wish they would refrain. (Not that I've never done it myself.) Equinox 17:38, 25 October 2013 (UTC)[reply]
What I consider "toxic" is the level of incivility displayed in this thread and this community's apparent indifference toward and/or acceptance of it. It only serves to divide the community and alienate contributors, which is completely self-defeating for a volunteer-driven effort. I should think that the best interests of the project — i.e. maintaining harmony (as much as is realistically possible) and retaining editors — would outweigh someone's "right" to engage in petty name-calling. I stand corrected. We part ways here, Wiktionary. It's been an interesting three years. -Cloudcuckoolander (talk) 17:54, 25 October 2013 (UTC)[reply]
There are many things that I could have objected to on this thread, but the only one I chose to object to is your loss of faith due to something as petty as this. Is it really so hard for you to ignore idiotic conversations that you have to leave because of them? --WikiTiki89 19:07, 25 October 2013 (UTC)[reply]
I'm astonished that you've been here three years and have never noticed that before, and that you've never noticed that Vahag can do whatever she wants with complete impunity. Just last year it was decided that vandalizing mainspace was not a good enough reason to desysop her. —Aɴɢʀ (talk) 18:06, 25 October 2013 (UTC)[reply]
I'm a he. I can show you my penis. --Vahag (talk) 18:24, 25 October 2013 (UTC)[reply]
Sorry, Vahag, but you can't pick your own gender. You said so yourself. —Aɴɢʀ (talk) 19:37, 25 October 2013 (UTC)[reply]
Just the one? I recently read a delightful comic by a cis gay man who likes trans men, "Orientation Police":
"Ready to take my cock?"
"Yes, sir!"
"Well, they're on that shelf. Go pick the one you want."
~ Röbin Liönheart (talk) 23:51, 28 October 2013 (UTC)[reply]
If you believe it's uncivil for Cloudcuckoolander to call out Dan Polansky, then it's uncivil for you to call out Cloudcuckoolander. I'm sure all our female contributors here appreciate knowing that if they voice an opinion they'll be accused of being male and faking their gender. Tolerance means taking people who are jerks to other people and throwing them out, so the people who aren't jerks don't leave. And no, you don't get to pick what makes people too uncomfortable to stick around.--Prosfilaes (talk) 23:33, 26 October 2013 (UTC)[reply]
I have to differ here: WikiTiki was right in saying "tolerance is enduring things you dislike"; see tolerant. Throwing people out might give us a "friendlier" environment, but only by stifling dissent; that is censorship, not tolerance. Equinox 23:48, 26 October 2013 (UTC)[reply]
P.S. INB4 "why do you block people then?": I see the use of blocks as stopping people from actively disrupting the project, which would be spamming, inserting false definitions, etc. but not merely expressing unpopular or obnoxious opinions. Oh, and just in case I seemed to be taking a side that I'm actually not, I probably regret the loss of Cloud as much as anyone. He/she [not sure] certainly taught me a few things about attestation of neologisms that seem impossible to cite at first glance. Equinox 23:57, 26 October 2013 (UTC)[reply]
No, tolerating intolerance is not tolerance. And being an enabler of rudeness is not part of being civil. Quite the opposite. ~ Röbin Liönheart (talk) 23:23, 28 October 2013 (UTC)[reply]
Calling a woman "he" is not rude. It's like mistakenly calling a captain general. Just take the compliment and move on. --Vahag (talk) 07:09, 29 October 2013 (UTC)[reply]
Not unlike calling an Armenian Turkish... — This unsigned comment was added by Chuck Entz (talk • contribs).
In before someone mentions The Soldier Formerly Known As Bradley Manning: can we go back to a relevant topic? Keφr 16:04, 25 October 2013 (UTC)[reply]
  • I claim victory. Hoorah for me! -WF
So, what's our conclusion? What do we do? We can't just lose our top notch vandal fighter like that and sit there and say "hmm... uh... well...". This calls for revenge! -- Liliana 20:50, 29 October 2013 (UTC)[reply]
We should start doing what we admins were supposed to do right from the beginning — patrolling. I know I have been more active during the last week. The retirement of S-o, may the peace be upon him, may turn out to be a beneficial thing in the end. --Vahag (talk) 21:14, 29 October 2013 (UTC)[reply]

This discussion is simply horrible. It turned into a bullying board. Disgusting! --Anatoli (обсудить/вклад) 21:45, 29 October 2013 (UTC)[reply]

In the past, we would occasionally pledge to do more patrolling. After a brief spurt of honoring the pledge, we would lapse into our prior behavior. Why? One reason is that SB was always the backstop if we failed to do the job. This time it seems that will not be the case. If we don't patrol, the Goths, Huns, Vandals, and other barbarians will overrun Rome and end civilization as we know it. DCDuring TALK 21:56, 29 October 2013 (UTC)[reply]
IMO, patrolling should apply to discussion boards as well, such as this one. Call me a bad guy but bullying, racism, sexism, homophobia should not be allowed. I don't know anything about the conflict between SemperBlotto and CodeCat but this discussion, meant to defend SemperBlotto sparked a disgusting bullying rage against CodeCat. Some people here must now feel very proud about bullying her out of the project. --Anatoli (обсудить/вклад) 02:13, 30 October 2013 (UTC)[reply]
There might have been more defense of CodeCat if there were not also a lot of dissatisfaction about CodeCat's lower-key bullying behavior with respect to templates, etc: live by the sword, die by the sword. DCDuring TALK 03:16, 30 October 2013 (UTC)[reply]
Except CodeCat wasn't exactly living by the sword. If someone steals your Halloween candy, you don't shoot them in the face. --WikiTiki89 03:23, 30 October 2013 (UTC)[reply]
That's exactly right. If some of her complicated edits were disliked by a group, she did revert; she listens to reason. Besides, she did a lot of cleanup work, making sure the new templates were used. In any case, there was nothing like bullying; bullying happened here. --Anatoli (обсудить/вклад) 04:03, 30 October 2013 (UTC)[reply]
CodeCat submitted almost nothing to a vote (especially not the master plan), claimed consensus for drastic changes from consent to minor ones (eg, improving the performance of {{context}}), and ignored clear lack of consensus, taking advantage of the fact that to really confront them ran the risk of leaving us stranded in medias res. Obviously I for one don't like the specific changes that cause me extra typing and loss of functionality. But I would have swallowed that whole if there were a clear consensus instead of a steady course of bullying by Luacization. DCDuring TALK 15:32, 30 October 2013 (UTC)[reply]
And I really don't need people to bully me and mislead me for my own good for their version of the good of the project. I get that kind of thing from my government and my public utilities. Other folks have different ideas about the good of the project. DCDuring TALK 15:36, 30 October 2013 (UTC)[reply]
I'm not disagreeing, but as I said, if someone steals your Halloween candy, you don't shoot them in the face. It's just an inappropriate response. --WikiTiki89 15:42, 30 October 2013 (UTC)[reply]
No one shot (blocked) CodeCat in the face or asked them to leave the project. --Vahag (talk) 15:54, 30 October 2013 (UTC)[reply]
And what makes you think shooting in the face refers to blocking? In fact, I think a temporary block may have been an appropriate response to repeated unwanted edits. --WikiTiki89 15:58, 30 October 2013 (UTC)[reply]
Blocking an admin is just an ineffective show of pique.
The risk of driving away a technical contributor who has made massive changes that they and few others understand is a great risk to the project, however deserving they may be of stern measures. That, IMO, is why CodeCat's behavior amounts to bullying. DCDuring TALK 16:40, 30 October 2013 (UTC)[reply]
I don't think that CodeCat saw it as bullying. I think that CodeCat had only good intentions. Now good intentions, of course, do not justify disruptive actions, which is why CodeCat should have been blocked, which would have sent the message "we don't like what you are doing with the templates" rather than "we hate you and you are a fraud". --WikiTiki89 17:05, 30 October 2013 (UTC)[reply]
There was enough evidence for me to believe that there was a serious risk of CodeCat storming off, leaving a worse mess behind. In a volunteer project that is always a risk. The difference is that a technical contributor has much more ability to do harm than a content contributor, which is why they need to behave better than content contributors and generally have, with only one exception that I am aware of - who stormed off leaving something of a mess behind.
We have enough trouble when technical contributors simply diminish their participation without doing so in a fit of pique or worse. Their efforts gradually drift out of alignment with MW software or with the technical contributions of others. If they are not around to maintain their code, especially if it is not well documented and designed AND simple, someone must step in, grasp the logic, and fix it. DCDuring TALK 17:23, 30 October 2013 (UTC)[reply]
Once again, I am not disagreeing with you about that. I just think we handled it wrong, and as a result of the way we handled it, CodeCat did storm off (at least for now). Also, don't forget there are other editors (such as me and Ruakh, and most likely others that I'm not aware of) that are perfectly capable of understanding CodeCat's Lua and template code and changing it to behave the way the community wants it to. --WikiTiki89 17:50, 30 October 2013 (UTC)[reply]
I think CodeCat's Achillean sulking (lurking?) withdrawal demonstrates the vulnerability point. An ill-behaved technical contributor is not an asset. We are better off not to be pushing the envelope to the point of dependence on such. There was only one similar instance of storming off and one instance of driving off a technical contributor in my tenure. The technical detritus of the contributor who stormed off remains. DCDuring TALK 22:07, 30 October 2013 (UTC)[reply]
I agree. It was so bad that I resorted to throwing in an outrageously over-the-line response in a desperate attempt to make it stop. I hope Vahag can forgive me for deliberately hitting a very painful raw nerve on a subject that has literally led to war and unspeakable atrocities in his part of the world.
This has been the culmination of an epidemic of nastiness that has brought several people whom I really admire and respect to grappling and gouging with each other in the mud when they should have been listening to each other. Yes, CodeCat tried to ram things through without paying enough attention to the consequences or to the resentment that was building up- but only because she thought she was doing the right thing, and that people would come around once everything was in place and working. Ivan had some legitimate concerns, but got too caught up in what he saw as fighting for the integrity of the project, and started to take it much too personally. SemperBlotto really was overdoing it on some of the more innocent new user pages, but CodeCat was the wrong person to deal with it, given the issues between the two. And that's just scratching the surface of what went wrong.
I don't have time to say more this morning, but we need to realize that a huge amount of damage has already been done, and if we don't resolve this, it could easily destroy the project. Chuck Entz (talk) 14:08, 30 October 2013 (UTC)[reply]
I am not offended. --Vahag (talk) 15:54, 30 October 2013 (UTC)[reply]

language of the month

Over at Wikcionario, we have a feature called ‘Lengua del mes’ (tongue of the month), and it’s similar to word of the day. A language of the month can promote interest in a language; I have been including more Romanian entries there, for example. Does this feature sound good to you lot‽ —Æ&Œ (talk) 00:13, 25 October 2013 (UTC)[reply]

What would being the language of the month actually mean for a language? It gets featured on the front page, but then what? —CodeCat 00:18, 25 October 2013 (UTC)[reply]
It would encourage users to learn the language, and encourage us to expand our information on it.
I would also like to suggest that each tongue be elected by the community, should we make this a feature.
Does this answer your query, sir? --Æ&Œ (talk) 00:25, 25 October 2013 (UTC)[reply]
Support. Good idea. I would include encouragement to verify entries, templates, quality and clarity, other things - grammar, transliteration, translations. I even expect some competition between editors because there are too many languages, perhaps (mini-)votes would be required. --Anatoli (обсудить/вклад) 00:36, 25 October 2013 (UTC)[reply]
Support. They have a voting system, where people propose languages and the one with most votes is chosen as next month’s language of the month ([10]). — Ungoliant (Falai) 00:52, 25 October 2013 (UTC)[reply]
Well, that's one opinion. That's why it's good to have a vote, so that the sexiest language is chosen. I wouldn't use this as a criterion, as sexy languages get more attention than others. I agree that we should address languages with little coverage but those, too, should be prioritised. For example, Lao or Burmese are official languages with millions of speakers but the coverage is rather low, and transliteration is awkward and difficult. Our experts could help by adding automatic transliteration and combining efforts with those who know it or have interest. --Anatoli (обсудить/вклад) 01:21, 25 October 2013 (UTC)[reply]
I support it if the language is Japanese. To be serious though, it sounds like a good idea and it can't do any harm. Haplology (talk) 08:46, 25 October 2013 (UTC)[reply]
Should there be rules as to which languages are eligible? For example, I don't think we should feature English, French, Spanish, or any language that people generally know a lot about. --WikiTiki89 16:47, 25 October 2013 (UTC)[reply]
Why not? People are much more likely to come to Wiktionary for major languages than obscure ones IME, and improving our obscure/dialectal coverage for Spanish is at least as helpful, I'd say. —Μετάknowledgediscuss/deeds 17:25, 26 October 2013 (UTC)[reply]
Then maybe we could have a dialect as the language of the month? --WikiTiki89 19:32, 26 October 2013 (UTC)[reply]

Small, strange languages which have good coverage in Wiktionary (e.g. Finnish) should not be ignored either! --Hekaheka (talk) 17:03, 25 October 2013 (UTC)[reply]

Sounds like a nice idea, but I would like to see some more definitive details of what it entails, and how it encourages learning or editing the language. (Also, who will maintain it, and how? With the possible exception of Stephen G, no one person speaks all that many languages.) Equinox 00:30, 27 October 2013 (UTC)[reply]

Some thoughts:

  • You can't force anyone into editing a particular language or topic. Since this is not a policy-related vote, it makes no sense to abstain or oppose. Perhaps this should instead be a page where people would express interest in particular goals, such as "Add 5k complete entries for the underrepresented language X, develop PoS templates and appendices, an About: page documenting it, and a script to generate inflected forms" or "Semi-automatically import and verify as many entries as possible from a certain copyright-expired dictionary" or "Add missing translations in the language X for all of the meanings of the top 10k English words", or "Add as many Wikisaurus entries as possible in some topic/word list" etc. Everyone interested could sign up by voting yes, and if the number of interested editors surpasses some predetermined limit (e.g. three) the project would start, last for some fixed amount of time (or until it's completed) and then shut down.
  • (Relatively) Obscure languages and topics would benefit much more from a focused group effort than those already well represented online. This is for two reasons: 1) More "value" is added by adding obscure content that cannot (easily) be looked up elsewhere. One of the biggest advantages of Wikipedia over other encyclopedias is that it contains countless articles on topics that don't even have, and will probably never have, coverage in the respective native language. 2) There could be multiple editors interested in the same obscure topic/language, and while not individually comfortable in editing it, they would all be encouraged to do it collectively by ensuring that someone else would double-check it. --Ivan Štambuk (talk) 03:28, 27 October 2013 (UTC)[reply]

Here’s the language of the month page, for a start. Ideally, the topic being highlighted would make people interested in researching into it. —Æ&Œ (talk) 05:37, 27 October 2013 (UTC)[reply]

Good start, my comments:
  • We should write something else for justification than "interesting". That lets one think that the languages could be classified according to their interestingness which, of course, isn't the case. Perhaps we should omit the adjectives altogether and just state that this is the language which we focus on this month.
  • The whole process still requires some planning. What are we actually aiming at with a particular choice? In order to avoid creating a pile of crap by editors who don't know the language, each language of the month would benefit from a native moderator/moderators who would commit themselves to monitoring and correcting the contributions during that month. The moderators for their part would need a tool which tracks the edits in that language. Conversely, a language should not be nominated the language of the month unless there's at least one committed and qualified moderator. --Hekaheka (talk) 08:40, 27 October 2013 (UTC)[reply]
Requiring that moderators be native speakers will severely lessen the amount of languages that could be focused on. — Ungoliant (Falai) 09:20, 27 October 2013 (UTC)[reply]
I understand that, but I see no point in a joint effort to increase amount of content without somebody knowing that it is correct. We could allow level 4 speaker if no natives are available. With some languages, like Latin, it's a must! --Hekaheka (talk) 10:03, 27 October 2013 (UTC)[reply]
Since there were no other nominations, I have created Wiktionary:Language of the Month/2013/November for Emilian. Please expand. DTLHS (talk) 06:31, 1 November 2013 (UTC)[reply]

12 December will mark the 11th year of Wiktionary's existence (although, to be fair, the domain was not established until slightly later in the month.) Last year the WMF noticed it was our 10th year a few hours before the day ended, and needless to say there was no official mention/celebration of the anniversary.

At the end of the first year of the second decade, is anyone interested in doing something? - Amgine/ t·e 13:49, 25 October 2013 (UTC)[reply]

The 10th anniversary "celebration" (WMF emissary visiting the IRC channel, which is mainly populated by people who are not active on Wiktionary) must have been amusingly anticlimactic. But what would you suggest? (BTW, 12 December is a Thursday, which is the busiest day in my academic timetable.) Equinox 00:28, 27 October 2013 (UTC)[reply]

Idea for a Halloween competition

It's much too early for Christmas and the idea is much more suited to Halloween.

I thought I would publish a list of the characteristics of each of our regular contributors:- what they do well, what they don't, approximate signal-to-noise ratios etc. You have to guess which one is which.

Unfortunately, I won't be able to tell you if you are correct - not if I want to stay as a contributor anyway. SemperBlotto (talk) 16:19, 25 October 2013 (UTC)[reply]

So a bit like apple bobbing, but for apples of discord. Equinox 16:23, 25 October 2013 (UTC)[reply]
Sounds like it could be fun, but may turn ugly. Either way, can't wait! --WikiTiki89 16:50, 25 October 2013 (UTC)[reply]
Oh yes, go ahead! -- Liliana 19:36, 25 October 2013 (UTC)[reply]
Characters could be grouped under appropriately spooky metaphors, e.g. someone rarely seen editing is the ghost; a dry old pedant is the skeleton; an unthinking follower of the herd is the zombie; etc. etc. IGMC. Equinox 17:27, 26 October 2013 (UTC)[reply]
support  This could get ugly. Michael Z. 2013-10-27 15:53 z
I don't believe S-o, may the peace be upon him, knows the community well-enough to write characteristics. I doubt he even knows my name. --Vahag (talk) 22:13, 27 October 2013 (UTC)[reply]
Never heard of you. Do you work here? SemperBlotto (talk) 22:16, 27 October 2013 (UTC)[reply]
։( Yes, Lord Knaggs, for 5 years now. --Vahag (talk) 22:22, 27 October 2013 (UTC)[reply]
  • I've chickened-out of this, but I will be running a Christmas competition this year. Details will arrive mid-November, to be followed by a trial run before a start in early December. I bet you're on the edges of your seats. SemperBlotto (talk) 22:15, 31 October 2013 (UTC)[reply]

Preloaded templates for parts of speech are missing/broken

Some of the preloaded templates at the link "use the preload templates" on the "create new page" are missing or broken:

Template:third-person singular of

https://en.wiktionary.org/w/index.php?title=Template:third-person_singular_of&action=edit&redlink=1

Template:past of

https://en.wiktionary.org/w/index.php?title=Template:past_of&action=edit&redlink=1

The history at each of these two links leaves the impression that things have changed, but what's there right now is broken. -- Dough34 (talk) 17:52, 26 October 2013 (UTC)[reply]

Done Fixed. Thank you for pointing it out! —RuakhTALK 18:24, 26 October 2013 (UTC)[reply]

Delinking certain user pages

I have noticed that the usefulness of Special:WantedPages is somewhat diminished by the presence of many pages in User and Concordance space that have numerous redlinks for pages that we almost certainly do not "want". (Such pages may also exist in Wiktionary or Appendix space, but those usually have more value.) I am thinking of many of the English frequency and concordance lists that contain many, many links to uppercase forms (probably mostly sentence-initial and proper-name-component uses) of words we have as lower-case lemmas or to lowercase forms of terms we have uppercase entries for.

Rather than delete the pages - which may still have some use - I was thinking that we could disable linking by inserting "<nowiki>" ... "</nowiki>" appropriately. A user who wished to use the page would have the option of removing the "nowikis" and presumably restoring them when finished with using the linked pages. A notification box could draw users' attention to the possibility and also categorize the page to increase the visibility of the pages to newer users. If the delinking inconvenienced anyone significantly it could be reversed "permanently" by simply removing the "nowikis" and perhaps inserting a notification box asserting that the page was in regular use.

Another approach would be to combine some pages that are very duplicative, such as the individual chapter concordances for Moby Dick.

For inactive users, this seems like a community decision. Does it need a vote? For user pages of active users, ordinary politeness - though not any rules or policies (?) - would say one should at least notify the user.

I intend to disable in this way some of my own user pages and consolidate some others that may contain needless duplication. DCDuring TALK 18:28, 26 October 2013 (UTC)[reply]
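
As a rough illustration of the delinking idea above, here is a minimal Python sketch, assuming the whole page body is wrapped in a single pair of nowiki tags. The function names `delink` and `relink` are hypothetical, and the sketch assumes the page does not already contain nowiki tags of its own; a real bot would need to handle that case.

```python
NOWIKI_OPEN = "<nowiki>"
NOWIKI_CLOSE = "</nowiki>"


def is_delinked(wikitext: str) -> bool:
    """True if the whole page is already wrapped in one nowiki pair."""
    return wikitext.startswith(NOWIKI_OPEN) and wikitext.endswith(NOWIKI_CLOSE)


def delink(wikitext: str) -> str:
    """Wrap the page in nowiki tags so its [[links]] no longer count as links.

    Assumption: the page contains no nowiki tags of its own.
    """
    if is_delinked(wikitext):
        return wikitext  # already switched off; do nothing
    return f"{NOWIKI_OPEN}{wikitext}{NOWIKI_CLOSE}"


def relink(wikitext: str) -> str:
    """Undo delink(): strip the outer nowiki pair if it is present."""
    if not is_delinked(wikitext):
        return wikitext
    return wikitext[len(NOWIKI_OPEN):-len(NOWIKI_CLOSE)]
```

Both functions are idempotent, so a user could switch a page's links on, work with it, and switch them off again without the tags piling up.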

Is it possible to limit which namespaces are counted in Special:WantedPages? --WikiTiki89 19:20, 26 October 2013 (UTC)[reply]
User:DTLHS/WantedPages does that with a run based on the latest dump (10/17). DCDuring TALK 00:49, 27 October 2013 (UTC)[reply]
I had asked the question at Bugzilla and did not elicit a response. It took a long time for them to decide to do the Wanted pages run we now have once a month. It had been disabled for four years. The other special pages haven't been run since September 10. I'm not sure I'd want to even ask them. But I can't imagine that technically it could not be done.
But it would also not be too hard to do such a run on a dump. Dumps are produced at least monthly. In addition, one could do more: the run could be done by language, which would dramatically improve usefulness. We could also occasionally do many other runs, such as for redlinked (or bluelinked!) terms only linked to once, which might help us identify terms inappropriate for use in a definition (ie, misspelled or too rare), or find all English rare, obsolete, and archaic terms used in definitions. Wiktionary is really not very large (3.3 GB) compared to the amount of RAM one can fairly affordably have in a tower PC, say, 8-32 GB, which can be managed by 64-bit hardware and software, so some of these runs could probably be done rather quickly. DCDuring TALK 22:19, 26 October 2013 (UTC)[reply]
Indeed- it would be trivial to restrict it to namespace 0 (but you would probably only get 1 run- so any other information would be lost). It would be very difficult to do by language, unless you had a very limited definition of "link" (no templates). DTLHS (talk) 00:37, 27 October 2013 (UTC)[reply]
Couldn't "we" do something based on redlinks from {{l}} and its relatives? It wouldn't be the WantedPages list of anyone's dreams, but it would be a good supplement to the various requested entries pages. DCDuring TALK 00:49, 27 October 2013 (UTC)[reply]
Semi-relatedly(?), there's now a category that appears under "pages to be deleted", which itself isn't up for deletion. Confusing. Equinox 22:22, 26 October 2013 (UTC)[reply]
I don't understand. Is it related to this (which I also do not yet understand)? DCDuring TALK 22:47, 26 October 2013 (UTC)[reply]
DCDuring, please don't harm the project in order to get Special:WantedPages to look more like you want it to look. Mglovesfun (talk) 23:44, 26 October 2013 (UTC)[reply]
Please be specific about how the specific proposal I am making might harm the project. I'm expressing my views here because:
  1. I think it would help for the stated reasons,
  2. I don't think there is any likely harm,
  3. I don't think that there is any possible harm that is not completely reversible,
  4. I know others may be able to identify specific problems or alternative approaches, and
  5. I think that naysayers need to be tolerated.
-- Please make specific suggestions or criticisms. DCDuring TALK 00:26, 27 October 2013 (UTC)[reply]
I had noticed the same thing as DCDuring, that old userspace lists of "wanted" words swamp Special:WantedPages with Uppercase Variants That We Actually Don't Want. I would support delinking those userspace lists. Many of the older ones can even be sent to RFD and deleted: they're impossibly out of date, everything of value has been extracted from them. - -sche (discuss) 01:35, 27 October 2013 (UTC)[reply]
Deletion is certainly a possibility with some of them, but there might be some value to be extracted. Delinking and categorizing them as up for deletion might encourage someone to do something productive with them.
As a specific example of the nature of the problem see Special:WhatLinksHere/It. Specifically, there are 85 links (of the 126 total links) from the individual chapter subpages of the Moby Dick concordance. There may be a few terms not yet included from that concordance, so it may still have some value, but effectively switching off the links when they are not in use eliminates a lot of clutter. Many of the other pages linking to It are similarly unhelpful. Perhaps more selective delinking within the pages would be more helpful, though much more time-consuming. DCDuring TALK 03:40, 27 October 2013 (UTC)[reply]
I am in the process of dealing with the Moby Dick issue via bot (converting the instances to piped lowercase). It will take a while. SemperBlotto (talk) 07:46, 27 October 2013 (UTC)[reply]
p.s. Could someone postprocess the generated list to produce a list containing the mainspace entries only? SemperBlotto (talk) 07:50, 27 October 2013 (UTC)[reply]
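A minimal Python sketch of such a postprocessing step, assuming the generated list is available as plain page titles. The prefix tuple is a hypothetical, incomplete list for illustration; a real run would take the full set of namespace names from the wiki's own configuration.

```python
# Hypothetical, incomplete list of namespace prefixes (an assumption for
# this sketch, not the wiki's authoritative namespace list).
NAMESPACE_PREFIXES = (
    "User:", "User talk:", "Talk:", "Wiktionary:", "Appendix:",
    "Template:", "Category:", "Module:", "Citations:", "Wikisaurus:",
)

def mainspace_only(titles):
    """Keep only titles that do not start with a known namespace prefix."""
    return [title for title in titles if not title.startswith(NAMESPACE_PREFIXES)]

# Hypothetical sample titles.
titles = ["It", "User:Example/wanted list", "Appendix:Example concordance", "water"]
print(mainspace_only(titles))
```

Matching on an explicit prefix list rather than on "contains a colon" keeps mainspace entries whose titles happen to include a colon.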
BTW, it might be handy to include at the top of each altered subpage a link to the page version before the piped-link edits, including my recent manual ones. DCDuring TALK 12:44, 27 October 2013 (UTC)[reply]

Proposal: All script errors are bugs.

I hereby propose the following policy:

All script errors are bugs.

"Script error" should never occur in a reader-facing page such as an entry, an appendix, a Wikisaurus page, or the description of a content category. When it does occur, that indicates that there is a bug either in a Scribunto module, or in a template that calls the module. (It is not necessarily a major bug — for example, if it's triggered by a very rare and easily fixed mistake, it may be only a minor bug — but it is a bug nonetheless.)

Exceptions:

  • Scribunto modules are never supposed to be called directly from reader-facing pages; rather, they are supposed to be called via templates. When a reader-facing page calls a Scribunto module directly, the bug is in that page, not in the module.
  • Certain templates are never supposed to be called directly from reader-facing pages, either. Of course, we can't just deprecate a template and instantly decide that all mainspace occurrences are bugs; but if a template has genuinely never been used in entries, and never intended for use in entries, then it is not necessarily a bug if a direct call to it results in a script error.

Rationale:

  • Readers will think that a script error is a bug, and it diminishes their trust in the project. (Plenty of studies show that trust doesn't necessarily follow strict rules of logic, but rather, that it sort of "bleeds" from one area to another. The less professional a site's appearance, the less readers will trust its content, even when there's no logical connection between technical proficiency and accuracy of information.) Given this, it doesn't make sense for us to view script errors as anything but bugs.
  • Casual editors, if they get a script error when editing a page, are very likely to be unwilling or unable to debug the problem. They may save the page anyway; or they may try to work around the problem by choosing some inferior approach (e.g. manual formatting); or they may discard their edits. None of these is a desirable outcome; and editors who encounter this conundrum once are likely not to come again. Even the more serious contributors are likely to scale back their editing if they encounter this problem too often. It may have a less obvious/immediate/disastrous effect than, say, issuing a block to our most dedicated recent-changes patroller, but it will still drive people away over time.

RuakhTALK 22:58, 27 October 2013 (UTC)[reply]

Sounds right.
Have we established any standard ways for our scripts to reveal errors like incorrect or missing input? Are there any other categories of errors? Michael Z. 2013-10-27 23:13 z
So what would be the preferred mode of failing input validation, as opposed to error()? Hidden categories and displaying nothing? --Ivan Štambuk (talk) 23:43, 27 October 2013 (UTC)[reply]
It would depend on what one's views were about the kind of qualifications we should be requiring of those contributing to Wiktionary. For "Script error" to be useful, what would a new-to-Wiktionary contributor's capabilities have to be? Clairvoyance would seem to be one sufficient skill in the face of such an opaque message. Stubborn persistence might be sufficient, leading the person to click on links, find places to ask questions, wait for a sufficiently constructive reply, etc. I'm not sure that experience at other wikis would be much help.
If we drive away everyone who does not have the willingness and ability to cope with our amateurish error handling, we will not win many new contributors to Wiktionary. As it is, I feel that language is changing faster than we can make and maintain changes to Wiktionary. We need more contributors, not fewer, if only to help clear up the various backlogs. DCDuring TALK 00:24, 28 October 2013 (UTC)[reply]
I think blank display and hidden categories would probably be best. If Lua can distinguish between the viewing and editing screens, we could perhaps input a link to directions for the template. That way readers wouldn't immediately know something was wrong, but editors would, and new editors (or old editors who haven't been around lately) wouldn't have to search for directions on how to fix the mistake. -Atelaes λάλει ἐμοί 00:33, 28 October 2013 (UTC)[reply]
@Ivan, after e/c: A hidden category, yes, and perhaps an error indicator of some sort; it depends on the type of error (for example, bad input is different from missing input), and on how gracefully we can handle it. When we do present an error message, we should aim for gentle and professional. —RuakhTALK 00:50, 28 October 2013 (UTC)[reply]
On fr.wikt we use test pages to show how errors are handled (and to test template rendering). For example fr:Discussion module:section/test. Of course we need to think of every misuse possible that can't be considered bugs (invalid input, missing input...). In the end, we should never show a Script Error in a page even if the template is misused. Dakdada (talk) 12:33, 28 October 2013 (UTC)[reply]
Technical question: Can Lua catch errors called with error()? --WikiTiki89 04:23, 28 October 2013 (UTC)[reply]
Yes, using the pcall function, but I'm not sure it's really worth it, usually. In general, you can't really handle an error unless you understand it, which means that you generally have to handle the error at the point where you call the function that throws the error, which means that instead of writing the function to throw an error and make everyone call it with pcall, you might as well just modify the function to behave the same way that pcall would have, namely, return multiple values and indicate whether there's an error. Lua doesn't have the rich exception-handling you might be used to from languages like Java.
But this isn't to say that pcall is never useful. For example, it might be useful in top-level entry-point functions — the functions invoked directly from templates — where it can sort of be a last-ditch stopgap to prevent the default scary 'Script error'. I'm not sure that all such functions would benefit from it, but if we find that a certain module is causing more script errors than we'd like, then this could be a not-too-terrible patch.
RuakhTALK 06:25, 28 October 2013 (UTC)[reply]
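The "last-ditch stopgap" pattern described above can be sketched as follows. This is a Python analogy using try/except, not Lua and not Scribunto code: `format_label`, `entry_point`, the message wording, and the category name are all hypothetical and chosen only for illustration.

```python
def format_label(args):
    """Inner helper that raises on bad input (analogous to calling error())."""
    if "lang" not in args:
        raise ValueError("missing required parameter: lang")
    return f"({args['lang']})"

def entry_point(args):
    """Top-level function of the kind a template would invoke.

    Any exception is caught here as a last resort, so the page shows a
    gentle message and a tracking category instead of a raw error.
    """
    try:
        return format_label(args)
    except Exception as exc:
        # Hypothetical category name, used only for this sketch.
        return f"[template problem: {exc}][[Category:Hypothetical tracking category]]"

print(entry_point({"lang": "en"}))
print(entry_point({}))
```

Only the outermost function catches; inner helpers still raise, or could instead return a value plus an error indicator as suggested above, so the catch-all stays a patch rather than the main error-handling strategy.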
That top level use is exactly what I was thinking. Otherwise, I'm not really an exception kind of guy and would prefer never to use them (except that Python basically forces you to). --WikiTiki89 14:13, 28 October 2013 (UTC)[reply]
Would it be possible and sensible to provide a link to the documentation for the template as part of an error message? Possibly just for residual errors, those not handled by specific messages? DCDuring TALK 12:29, 28 October 2013 (UTC)[reply]
(Ivan's comment):
  • Hidden categories are annoying. I often don't even want to waste time guessing which particular instance of {{term}} invocation of a dozen or so on a page generated a hidden cleanup category.
  • Can be confusing. You save/preview your edit, and nothing happens. Where is the error? You need to go to the documentation page which is often non-existing or painfully out of date. More clicks and seconds wasted figuring out what used to take a single mouse click on red letters.
  • If breaking changes are introduced in modules we won't be able to catch them anymore in the script errors category (which kind of hurries fixes), and instead a bunch of content would "disappear" because the module would no longer output anything in case of a faultily introduced input validation error.
  • How would you force this? What if I prefer error() and don't add hidden categories and return empty string in modules? Should the module author be punished, or such module be forbidden to be invoked from mainspace templates due to a violation of this policy? If the latter is the case, I'd prefer if this be voted in a context of some more comprehensive module writing policy, which also requires e.g. mandatory unit tests and documentation unless the module is trivial (e.g. < 100 LOC).
--Ivan Štambuk (talk) 03:35, 31 October 2013 (UTC)[reply]
Most of your comment seems to be a reply to Atelaes's suggestion ("blank display and hidden categories"), but in that case I'm not sure why you've posted it so far from his comment? Regarding your last point ("How would you [en]force this?"), this is no different from any other policy. An editor who "prefers" to misformat entries will not long remain an editor, and similarly with an editor who "prefers" to introduce bugs. —RuakhTALK 06:23, 31 October 2013 (UTC)[reply]
Regarding the last point, basic tests should be mandatory, even for small modules, for every change (the "preview" changes is really useful and simple to use): even simple fixes can bring unsuspected errors. It may take a bit more time to set up, but it is also much more efficient to avoid bugs. A guideline to set up such tests would be welcome. Dakdada (talk) 10:19, 31 October 2013 (UTC)[reply]
Not to mention that unit tests are also usage examples (two birds with one stone). --WikiTiki89 11:59, 31 October 2013 (UTC)[reply]

Google Translate speech synthesis in the Pronunciation section?

A new user, Neitrāls vārds, has just suggested on my talk page that Latvian entries without a pronunciation sound file could use Google Translate's speech synthesis feature to provide what he considered to be "reasonable approximations" of normal Latvian pronunciation. I am a bit skeptical, but I wondered what you guys would think. (As an example, he/she did a template and added it to the word šķīsts, using Google Translate; what do you guys think of the idea?) Neitrāls vārds also expressed some concerns about whether or not Google Translate synthesized pronunciations are free or in some sense copyrighted (see his/her text on my talk page). Personally, I'm not happy with the pronunciation quality, but maybe it's better to have some hearable pronunciation rather than nothing, or only IPA transcriptions. What do you guys think? --Pereru (talk) 18:43, 29 October 2013 (UTC)[reply]

Very bad idea. It lowers the quality and reliability of our dictionary. --Vahag (talk) 22:00, 29 October 2013 (UTC)[reply]
I don’t think it is acceptable. In šķīsts, all I heard (after some difficulty trying to find the sound file) was "skiists". —Stephen (Talk) 00:06, 30 October 2013 (UTC)[reply]
OK, to reiterate what I already wrote on my talk page the synthesized speech is actually pretty good (well, that's my personal opinion as a native speaker, of course.) Aside from the so-called "wide e" issue it actually delivers what I would call a pretty "true to life" pronunciation, and in the case of words such as visšķīstākais for which I'm actually having trouble deciding what the correct IPA should be, the synthesized speech actually provides some sort of a "yardstick," for example, in the (hopefully) unlikely event my IPA is wrong, someone could actually correct it based on the speech synthesis. Neitrāls vārds (talk) 00:22, 30 October 2013 (UTC)[reply]
@Stephen, OK, the šķīsts pronunciation was not exactly perfect but the only problem I heard was the pause between ⟨šķīst⟩ and the final ⟨s⟩ - it should've been pronounced all together - otherwise it was pretty close to [ʃtʃiːsts] which is exactly the way it should be pronounced. Neitrāls vārds (talk) 00:27, 30 October 2013 (UTC)[reply]
I think you have to be attuned to the sounds of a language before you can hear them reliably on a synthesizer. Latvians can understand synthesized Latvian easily, but it does not offer much benefit to the rest of us. Our ears are attuned to other languages, and that’s probably what we hear when the synthesizer speaks. —Stephen (Talk) 00:43, 30 October 2013 (UTC)[reply]
Idk... I checked out a sentence in Portuguese, mind you it was "synthesized" just like Latvian (one could argue that the voice was more agreeable, but that's beside the point!) luckily it was Brazilian accent and I could actually understand it quite well! And, mind you, there's a reason why Portuguese is the last on my Babel box (save for Japanese - which I do not actually speak except for being able to read the most simple of sentences) - it's because I barely even understand spoken Portuguese (OK, Brazilian accent helps,) but with Google speech-synthesizing the Portuguese text I'd put in the input box, I could actually understand what is being said without even looking at the text. I left a link to a Portuguese sentence in Google Translate on User:Pereru's talk page, but I'm sure you could try for yourself with some language you can vaguely understand (if it has a more or less "true to speech" orthography,) and check it out for yourself, or maybe not... My point being that if speech synthesis of Portuguese (if with Brazilian accent) is good enough for me (a person barely capable of understanding spoken Portuguese) then the synthesis is pretty good and you cannot write it off on me being a native speaker (I barely even speak Portuguese.) Idk... with this example of Portuguese (the language from my Babel that I'm least likely to comprehend) I'm trying to say that perhaps it does not have to do anything with "native speaker intuition," "filling in the blanks" or whatever was implied in the previous comment - the pronunciation is actually pretty good (aside from some issues.) Neitrāls vārds (talk) 01:43, 30 October 2013 (UTC)[reply]
Well, with šķīsts, all I heard was "skiists". The initial šķ was too quick and unclear. The synthesized voice did not benefit me. I think if you have a microphone attached to your computer, you could pronounce the word (use Audacity software) and upload it almost as fast as you could add the Google synthesized pronunciation, and the results would be vastly superior and easier to use. —Stephen (Talk) 02:05, 30 October 2013 (UTC)[reply]
We shouldn't use Google Translate for ANY type of reference. It's a very bad idea. It doesn't mean that learners can't use it if they want to but they can figure this out themselves or we can give them informal advice rather than link in the main space. If you want to help, record voice files and upload them. --Anatoli (обсудить/вклад) 02:06, 30 October 2013 (UTC)[reply]
@Stephen idk about "skiists" all I heard was ⟨ščīsc⟩ (respelling it in LV alphabet which is prob. pretty much the same thing as Czech alphabet) anyways ščīsc is exactly what it should sound like, aside from the stupid pause between ⟨ščīs⟩ and ⟨c⟩. Anyways, for example, ķermenis has a really good pron. synth.! @Anatoli, why shouldn't we use it? I'm all for redirecting people interested (in, say, Latvian pronunciation) to Google Translate if they choose so... Should the template I used on some entries (šķīsts#Pronunciation, ķermenis#Pronunciation, etc.) not direct to the mpeg files directly but to the Google Translate interface instead? I actually debated this idea... I'm not sure how it could be implemented... Neitrāls vārds (talk) 02:33, 30 October 2013 (UTC)[reply]
I'm very sceptical of using automatically synthesised pronunciations, though not necessarily opposed to it. I would have thought the purpose of audio was to demonstrate how a native or near-native speaker pronounced the word, by having a native or near-native speaker pronounce it. Synthesised audio seems no better than IPA, at least for users who can both see and hear. For visually-impaired users, whether or not Google-synthesised audio is useful depends on whether Google's synthesiser is better or worse than the one the user's screen-reader would use to pronounce IPA. That said, re Stephen's comment that "you have to be attuned to the sounds of a language before you can hear them reliably on a synthesizer": you have to be attuned before you can hear human-produced pronunciations, too. Consider this discussion of the pronunciation file for the Dutch word enig: native Dutch speaker CodeCat thought the file sounded fine, the rest of us heard "penis". - -sche (discuss) 05:38, 30 October 2013 (UTC)[reply]
All in all, I tend to agree with -sche. I do think that adding speech synthesized forms adds something, it's better than IPA; but what it adds is indeed very little, and I'm not sure it is going to help non-native speakers very much. It is also true that native speakers do tend to have their ears more attuned to the sound of their language, so they can understand it in situations where the signal is very bad (e.g., on old-fashioned radio, where all the higher frequencies above e.g. 3,000 Hz are cut off). There in fact are linguistic studies showing how much better native speakers are at detecting small cues in pronunciation that identify words, and how they then "think" they heard "all the word" instead of those cues -- it's what happens when we are very used to something: it's so familiar, we recognize it even when most of the input is missing (think of old, bad pictures of family members you know well -- you can tell it's grandpa even if most of his facial features are impossible to resolve). In fact, even with pronunciation files recorded by native speakers, non-natives may often find the pronunciation difficult to follow -- if you've ever tried to learn Danish and encountered their "soft d", you may know what I mean... which is why people who record pronunciation files are advised to articulate naturally, but clearly: don't talk too fast, or we non-natives won't get the hang of it... (The pronunciation of Dutch enig above also sounded OK to my ears, but that's because I'm living in the Netherlands and I hear Dutch all the time around me, so I've gotten very much used to it; my pronunciation is still non-native, but my ears have learned a lot already). --Pereru (talk) 08:12, 30 October 2013 (UTC)[reply]
I tried out some of the other languages on Google's speech synthesis and it seems that for most of the major languages (English, French, Italian, Spanish, Portuguese, Dutch, and German from the ones I tested) the speech sounds almost as natural as a native speaker, while for the others (Latvian, Catalan, Russian, and Afrikaans from the ones I tested) the speech sounds like it is put together directly from the IPA transcription and therefore sounds much more like a robot and not very smooth (in Russian, I even noticed it often placed the stress incorrectly). A good way to explain it is the "smooth" languages sound like Siri, while the "not-so-smooth" languages sound like Stephen Hawking. Therefore I would support including synthesized speech in entries only for those languages that have the smoother synthesis algorithms, which ironically does not include Latvian. --WikiTiki89 13:51, 30 October 2013 (UTC)[reply]

OK, I have been convinced (which was kind of the reason for my wavering and why I decided to ask for opinions) that making inline links to media provided by a third party a core component on a (potentially) large(-ish) number of entries might not be exactly ideal. I'm however more concerned about any possible "legal ramifications" (copyright, etc.) Yet at the same time the "external verifiability" of everything written on a wiki, aka, references is kind of the main purpose of any wiki project, so instead I'm relegating this to some type of references template. I'm thinking of a cute little toggle button [quotations ▼] like the {{quote-book}} gives out. I probably shouldn't say this but personally I do not trust my IPA skills 100%, thus, for example, in my case I could add at least some type of external verification for what I've written in those IPA brackets, for peace of mind (not that it's that super-mega-important, of course.)

(As far as the quality of the speech - I still like it!) And that Dutch word does sound weird, I wonder where the p came from, sounds pīnah - [pi:nax]. Could also be "peanut" if the [x] wasn't so audible.

And I was wondering... Does anyone know where the [quotations ▼] toggle function in template:quote-book comes from? There's nothing in its markup and the two templates it transcludes don't have anything of that sort either. Does the wiki software inject it itself in response to some div class? Neitrāls vārds (talk) 23:54, 30 October 2013 (UTC)[reply]

See §§"Hidden Quotes" and "Visibility toggling" in MediaWiki:Common.js. It's not a built-in feature of the software, no; rather, it's something we added ourselves (where by "we" I mostly mean Conrad.Irwin (talkcontribs)). And it's not triggered by anything in {{quote-book}}; rather, it's done for any <li> whose immediate parent is an <ol> and one of whose immediate children is a <ul> (that is: it's done for any ordered list element that directly contains an unordered list). —RuakhTALK 01:22, 31 October 2013 (UTC)[reply]
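The selection rule just described (a list item whose immediate parent is an ordered list and which directly contains an unordered list) can be sketched as below. This is Python over a small well-formed HTML fragment, not the JavaScript in MediaWiki:Common.js; the function name `toggle_candidates` and the sample markup are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def toggle_candidates(html_fragment):
    """Return <li> elements that sit directly in an <ol> and directly hold a <ul>."""
    root = ET.fromstring(html_fragment)
    found = []
    for ol in root.iter("ol"):
        for li in ol.findall("li"):        # direct <li> children only
            if li.find("ul") is not None:  # a direct <ul> child exists
                found.append(li)
    return found

# Hypothetical fragment: one definition with a quotation list, one without,
# and a bulleted list whose item has a nested bulleted list.
fragment = (
    "<div>"
    "<ol><li>def one<ul><li>quote</li></ul></li><li>def two</li></ol>"
    "<ul><li>plain<ul><li>nested</li></ul></li></ul>"
    "</div>"
)
for li in toggle_candidates(fragment):
    print(li.text)
```

Under this rule the item in the purely bulleted list is not selected, since its parent is a `<ul>` rather than an `<ol>`.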
Hmm, I just tried all kinds of hash and asterisk combinations but it didn't appear. Either way pronunciations are usually bulleted (unordered?) lists, so if that requires an ordered list being there, I need to look into some other toggle templates. Anyway, thanks! Neitrāls vārds (talk) 02:51, 31 October 2013 (UTC)[reply]
  • But what about Latvian tones? We don't even mark them in entries... I'd be much happier if you compiled a mapping table of characters (or character sequences) to IPA, so that we can write a module generating IPA pronunciations. --Ivan Štambuk (talk) 03:18, 31 October 2013 (UTC)[reply]

A mapping table of chars/char sequences to IPA is an interesting idea and it shouldn't be that hard to do, although... OK, to my surprise there actually appears to be no separate article for IPA for Latvian on en.wiki (neither is there one on lv.wiki). In the event I do make one, I'd probably first make an article "IPA for Latvian" on Wikipedia.
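To make the mapping-table idea concrete, here is a minimal Python sketch of a longest-match converter. The table entries are illustrative only: they are read off the transcription [ʃtʃiːsts] given for šķīsts earlier in this thread, not taken from a vetted Latvian table, and tones are not handled at all. The names `GRAPHEME_TO_IPA` and `to_ipa` are hypothetical.

```python
# Illustrative entries only, derived from the thread's transcription of
# šķīsts as [ʃtʃiːsts]; a real table would need review by Latvian editors.
GRAPHEME_TO_IPA = {"š": "ʃ", "ķ": "tʃ", "ī": "iː", "s": "s", "t": "t"}

def to_ipa(word, table=GRAPHEME_TO_IPA):
    """Convert a word by greedy longest match against the table.

    Longer character sequences are tried first, so multi-character keys
    would take priority over single characters if the table had any.
    """
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(word):
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            out.append(word[i])  # pass unmapped characters through unchanged
            i += 1
    return "".join(out)

print(to_ipa("šķīsts"))
```

Passing unmapped characters through unchanged makes gaps in the table visible in the output instead of silently dropping letters.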

The tones might not be predictable always but there are some instances where they can be predicted (imo, anyhow) with some certainty, e.g., voiced consonants in clusters with unvoiced consonants (leading to them becoming unvoiced) usually produce a broken tone, e.g., auksts would have a level tone ['au~ksts] but augsts would have a broken tone ['au^ksts], lauzts would have a broken tone ['lau^sts] but, say, lauska would have level ['lau~ska] (although in lauska the cluster might not be long enough.) Then again there is saukts which has a broken tone although the k is not voiced, sīksts also having a broken tone. But then again with logs and loks this holds true again - logs with broken and loks with level, then there's loks ("spring onion") with a falling tone but I'm barely able to tell it apart from the broken, admittedly in Rīga Latvian the falling tone has been syncretized with the broken tone.

Actually the pattern could be that long consonant clusters after a vowel usually lead to a broken intonation unless there is a homophonous pair (say, auksts/augsts) in which case the one with a voiced consonant will take on a broken intonation and the unvoiced one a level intonation.

And I know that all this tone stuff is involved in the (somewhat controversial) Balto-Slavic discussion (which I'm perfectly OK with by the way), but the different tones in Latvian (at least imo) are very arbitrary, often used simply to distinguish between (or "among" since sometimes there's 3 of them) homophones, not unlike the different pronunciations of the English word "use" as either [juːs] as a noun or [juːz] as a verb, or "conduct" as ['kʌndəkt] as a noun or [kən'dʌkt] as a verb. Coincidentally Eastern Latvian dialect (Latgalian) which in some aspects preserves similarities to Lithuanian (the arguably go-to language if one wants to observe archaic Baltic features) actually uses the broken tone exclusively (and from the spoken Lithuanian I've heard so does Lithuanian) so the level accent of more westward varieties of Latvian (including Standard Latvian) might actually be an innovation then put to use to distinguish between homophones similar to English "a use/to use." This of course borders on OR on my part...

Anyways, I checked out zaļa zāle ("green grass") and the tone should be broken and it was, then koncertzāle ("concert hall") and the tone is actually different than what I heard for "grass," I'm not sure if it is as level as it should be though. And then Nomierinošas zāles ("sedating medication") and the tone actually sounds falling! Granted my ability to distinguish a falling tone from a broken one is limited. So, it actually sounds as if it's pronouncing the correct tones (or maybe it's just me.) OK, I tried auksts and lauska and it pronounces them both with a broken intonation (it should've been level). So as far as intonation (and "wide e's") it's pretty safe to say that it pronounces them with Eastern Latvian accent. In turn the way it pronounces ⟨l⟩ - more frontal (tongue touching the incisors instead of molars touching) is more of a Western Latvian thing (practiced by some news anchors, which annoys me to no extent, lol!) So, yeah, a fusion of some Eastern Latvian traits (only "narrow e" and only broken tone) and some Western Latvian (⟨l⟩ realized with tongue touching incisors instead of "dark l" with molars touching.)

OK, I think I went way overboard with details here, but I think I might be able to reuse some of this if I do in fact end up making an article for "IPA for LV"! :) Neitrāls vārds (talk) 21:23, 31 October 2013 (UTC)[reply]

No More Mr Good Guy

Some of you may have noticed that I have not been blocked by anyone, had any nonsense deleted, or had any vandalism reverted since the weekend.

This is because I am resigning as a Wonderfool forthwith - I just can't be bothered any more.

I shall continue to work on un-technical words in English, and generate words in a couple of Romance languages, but I won't be needing patrolling in Recent Changes or wasting any other sysop-type resources.

I shall continue to feed bots, but will now be unable to create any incorrect bot entries that would end up deleted. Get a bureaucrat to promote me to admin if you think that is best.

Good luck. --ElisaVan (talk) 14:21, 30 October 2013 (UTC)[reply]

I wasn't around in your heyday, but I've heard a lot about you and I'm a big fan. It's good that you're back. It's good that you've changed. I don't think it would hurt to make you an autopatroller, but I think it would be "best" if you were never promoted to admin again. --WikiTiki89 14:45, 30 October 2013 (UTC)[reply]
Yeah and I've been kinda busy and not made many irate comments on talk pages. Sorry about that. Mglovesfun (talk) 14:56, 30 October 2013 (UTC)[reply]
And when was the last time since I added a new translation to water? -- Liliana 15:01, 30 October 2013 (UTC)[reply]
WF's track record is still a problem IMO, but the ElisaVan avatar has been a model citizen so far, even doing some of the housekeeping. DCDuring TALK 15:21, 30 October 2013 (UTC)[reply]
Oppose whitelisting. He is still making some mistakes (like forgetting to add parameters to {{es-verb}} and adding content of questionable veracity), and I’ve caught him feeding BuchmeierBot with entries with incorrect conjugation in the past.
I doubt he really is resigning as Wonderfool. More likely he is trying to gain our trust so he gets to delete the main page again. He is not stupid, he knows how to trick us; he managed to delete the MP as recently as 2010, despite everyone knowing who he was. — Ungoliant (Falai) 12:29, 31 October 2013 (UTC)[reply]
Which is why he should never be made admin. But whitelisting is a good idea because he won't waste our time patrolling changes, but we can still patrol him if necessary on his contribs page. --WikiTiki89 12:40, 31 October 2013 (UTC)[reply]

What can change the nature of Wonderfool? --Æ&Œ (talk) 12:33, 31 October 2013 (UTC)[reply]

... and who would want to? SemperBlotto (talk) 22:17, 31 October 2013 (UTC)[reply]
Zen-like. This is my favourite AEOE comment. Equinox 03:24, 2 November 2013 (UTC)[reply]