This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.
Word of the Day
That'd be nice to see.
- I agree with anon; word of the day, or rather entry of the day (or week) would be a good feature. Jon Harald Søby 18:46, 17 October 2005 (UTC)
- See Wiktionary:Words in the News - should really be a proper appendix, linked from the front page. But I don't remember to update it very often. SemperBlotto 18:54, 17 October 2005 (UTC)
- There was a conversation about this going on in the discussion page of Wikipedia. Here's what we've come up with:
"Wikipedia has many more regular contributors than we have. There is nobody with the time to do the work every day - the closest we have is Wiktionary:Words in the News SemperBlotto 19:15, 27 December 2005 (UTC)
- I don't understand... Wikipedia may have more contributors than Wiktionary, but Wikiquote is also a sister project, yet it still has a quote of the day feature. Does it really take that much effort to say "The word of the day for December 27, 2005 is ......."? --Thebends 14:54, 29 December 2005 (UTC)
- Are you volunteering to maintain it? When we had this feature before the problem was lack of maintenance. Eclecticology 06:01, 28 December 2005 (UTC)
- I wasn't volunteering, but I could do it if no one else was willing. Honestly, anyone could do it though, it wouldn't be too hard. This is all you'd need:
- It just seems so appropriate that Wiktionary have a word of the day feature because most other internet dictionaries have it. And after all, it would be a seemingly very simple thing to have. All it needs is one, dedicated person. I wasn't here when we had this feature, so I don't know the story of its downfall... --Thebends 14:54, 29 December 2005 (UTC)
- When we had it before, I was the one who pulled the plug. I don't think that anyone then or now is opposed to the Word of the Day idea. The idea has also been suggested by a few others in the interim. It was stopped before because the same word of the day would often stay unchanged for a week or more. If you go back far enough in the Main Page history you should be able to see what happened.
- If you want to carry on with this it would be tremendously helpful if you became a registered user. If security is a concern please remember that it is easier to trace you through an IP number than through a name that may have nothing to do with who you really are. A big advantage is that it becomes easier to communicate through user talk pages.
- Your example looks nice enough, but if you want it to appear on the Main Page it may need to be trimmed down in size. Perhaps too the idea could be developed on a template page that could be linked from the Main Page. Eclecticology 09:21, 29 December 2005 (UTC)
- If the actual word and definition are to appear on the Main Page, then it can only be edited by an Administrator - who already have enough to do. SemperBlotto 11:55, 29 December 2005 (UTC)
- Nope. It'd be in templates, so anyone could do it. Works perfectly on most Wikipedias. Jon Harald Søby 11:56, 29 December 2005 (UTC)
- Fyi, we have the word of the day feature on the French Wiktionary. In order to avoid having the same word for a week, we have a table that defines one word a day. Sometimes, we have the same word appearing in two consecutive months, but that's better than having it in two consecutive days. Kipmaster 12:09, 29 December 2005 (UTC)
- Thanks for all the comments. I'm a little confused as to why the Administrators would have to be in on this, though. Is it to prevent inappropriate words? Also, where can I register? --Thebends 14:54, 29 December 2005 (UTC)
- I just got an account. It is --Thebends 14:46, 29 December 2005 (UTC)
Thanks for signing up. The reason that admins get involved is because the Main Page needs to be protected. It's the first thing that anybody sees including the potential vandals; that also makes it a prime target for them. I've noticed that Uncle G has devised a bot that automatically empties the Sandbox at the same time every day. I wonder whether something similar could be used to change the Word of the Day on a regular basis. That way a stack of words could be made available ahead of time, and there would be no need for the person in charge of the project to make himself available at the same time every day. Eclecticology 16:51, 29 December 2005 (UTC)
- You would only need to edit the main page once to implement this. Just use {{Word of the day/{{CURRENTDAY}}-{{CURRENTMONTH}}-{{CURRENTYEAR}}}}. Jon Harald Søby 16:53, 29 December 2005 (UTC)
- Eclecticology, your idea of a "stack" of words to update the page automatically is great! How would I be able to contact Uncle G, though? Jon, if you look at the example I posted earlier, you'll see that I've already got that under control, thanks for the advice though. --Thebends 17:11, 29 December 2005 (UTC)
- I have what is known as a talk page. ☺ Uncle G 06:07, 9 January 2006 (UTC)
- Yeah, I know, but Ec didn't seem to understand it… Jon Harald Søby 17:14, 29 December 2005 (UTC)
- I think his idea was similar to what Kipmaster said earlier. If you check out the French Wiktionary, you'll see that they have a table of words that makes sure the same word will not be used for weeks at a time. Check it out: . --Thebends 17:22, 29 December 2005 (UTC)"
- I think a Word of the Day feature is almost a necessity for Wiktionary. What's holding us back? --Thebends 23:29, 3 January 2006 (UTC)
- A reasonably-sized pool of feature-quality articles to draw upon. Please construct a list of a year's worth. Uncle G 06:26, 9 January 2006 (UTC)
Working model
Based on the template by Thebends and comments here, I've begun a working model on my user page which will automatically update to a new page daily. I have started a list of words which will appear, a single word sense for each day. The list is easily editable, and I can improve its usability further if the concept is accepted by the community.
Due to the very reasonable concerns regarding maintainability, I would suggest a group of volunteers work to create at least one month's worth of entries ahead of time, with a goal of having the next month ready by, at the latest, the 15th of each month. Word choices can be modified at any time, of course, so pertinent terms may be selected based on news events, calendar events, and so on. (It may also be a place to showcase examples of excellent entries.) - Amgine/talk 21:12, 6 January 2006 (UTC)
- I think that that mechanism is not the best idea, for the simple reason that it requires a new template for each day and thus that editors keep actively feeding it new per-day templates indefinitely. If editors ever stop feeding it, the main page ends up with a dangling hyperlink. A better mechanism would be to employ a fixed pool of words, and cycle through the pool continually. Editors can then replace entries in the pool as and when. If the pool is large enough, say 366 entries, it won't matter that there is the potential for repetition. Even if editors stop feeding the mechanism, it will only repeat once per year. You can use a template naming scheme using just {{CURRENTDAY}} on its own to get a pool of 31 entries. You can use a template naming scheme using just {{CURRENTMONTH}}-{{CURRENTDAY}} on their own (without the year) to get a pool of 366 entries. You don't really need the help of User:Uncle G's 'bot for this. Uncle G 06:26, 9 January 2006 (UTC)
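As a rough illustration of the two naming schemes compared above, the following Python sketch counts how many distinct template pages each scheme would require over four years. The function names are hypothetical, and the exact formatting of the date parts (for example zero-padding of the month) is an assumption that may not match what the MediaWiki variables actually emit.

```python
from datetime import date, timedelta


def per_day_template(d: date) -> str:
    """Per-day scheme: the year is part of the name, so every date needs a new page."""
    return f"Word of the day/{d.day}-{d.month}-{d.year}"


def pool_template(d: date) -> str:
    """Fixed-pool scheme: month and day only, so the names repeat every year."""
    return f"Word of the day/{d.month}-{d.day}"


def day_only_template(d: date) -> str:
    """Smallest pool: day of month only, so the names repeat every month."""
    return f"Word of the day/{d.day}"


def distinct_names(scheme, start: date, days: int) -> int:
    """Count the distinct template names a scheme produces over a run of days."""
    return len({scheme(start + timedelta(days=i)) for i in range(days)})


# Four calendar years starting with a leap year: 366 + 3 * 365 = 1461 days.
START = date(2004, 1, 1)
DAYS = 1461

if __name__ == "__main__":
    for scheme in (per_day_template, pool_template, day_only_template):
        print(scheme.__name__, distinct_names(scheme, START, DAYS))
```

Under these assumptions the per-day scheme needs one page per date indefinitely, while the two pool schemes top out at 366 and 31 pages, which is the maintenance difference the comment above describes.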
Ultimate Wiktionary
previous material archived --Stranger 08:53, 25 September 2005 (UTC)
I've created a Project page for Ultimate Wiktionary Wiktionary:Project - Ultimate Wiktionary. If anyone knows exactly where the information about this project is, please put the links on this page.--Richardb 05:14, 5 Jun 2005 (UTC)
- Update from Wikimania: work is ongoing, though on a slightly later timescale; MW 1.6 has since been released with some success; and Ec and Gerard met in person at Wikimania and enjoyed some positive discussions about unifying wiktionary languages. I hope one of them will write more about their discussion. Meanwhile, it would be great to see more active discussion of what a unified interface might look like (as people are actively working on potential implementations). Please help spread the word. +sj + 16:49, 23 August 2005 (UTC)
- Sj, is the "slightly later timescale" published anywhere? --Connel MacKenzie [+] (contribs) 06:13, 22 September 2005 (UTC)
Formatting help
It requires some studying and practice to learn the formatting rules. Therefore most casual/passing contributors will always create cleanup burdens unless better guidance can be found. I propose a top link above the text box that once clicked will roll out a template in the editing box. Editors will just replace the "insert text" instructions in between the wiki format code. The template will be basic and cover the needs of novice contributors.
Lotsofissues 27 June 2005 12:41 (UTC)
- I think this idea got pushed aside by the decapitation issues. This is a good idea. A certain sysop has been honing his Javascript and maybe would be willing to try making this work, perhaps as buttons underneath the special characters box (so the different common templates can each be added: En:Noun, En:Verb... Italian, Kanji, etc.) --Connel MacKenzie 01:48, 15 July 2005 (UTC)
- Looking at what Wikinews did, I have added ten links on the "Create new page" page. These are (hopefully) good starting points for the most common English inflections. In light of the recent vandalism, and the likelihood that these would be prime targets, the template pages have been protected. Suggestions for corrections to them belong on their Template talk: pages. Suggestions for new inflections or other languages belong on MediaWiki talk:Nogomatch. Note that all related templates start with "new"...e.g. Template:new en noun. --Connel MacKenzie 17:37, 6 August 2005 (UTC)
- Comments on this are appreciated from the community at large. Do all the regulars here *only* create entries from red-links, and never from the [Go] button? Is that why only newbies seem to notice this? (Paul, do you have improvements you'd make to the plural default template:new en plural?) --Connel MacKenzie 20:27, 9 August 2005 (UTC)
- I would still appreciate comments on these "helper" templates. The impetus to just do these came about on special request from visiting Italian Wiktionarians who needed help adapting to en: while stopping by on IRC. I went back later and found this conversation. But honestly, I expected this experiment to get a whole bunch of edits to the "official" templates themselves. For the general "blank" template, I would think we should have "Basic", "Intermediate" and "Advanced"...with "Basic" being just ==English== ===Noun=== #hrunk. The "advanced" should really be a full-on multiple-etymology monster. But with no feedback, it is hard to guess if I went in the right direction with these. Also, should I be modifying MediaWiki:Monobook.js to change {{PAGENAME}} to {{subst:PAGENAME}}?
- See: MediaWiki:Nogomatch, - User:Connel MacKenzie/monobook.js, - Template:new en basic, - Template:new en advanced, - Template:new en adj, - Template:new en compar, - Template:new en superl, - Template:new en adv, - Template:new en noun, - Template:new en plural, - Template:new en verb infl, - Template:new en verb third, - Template:new en verb pres part, - Template:new en verb past
- --Connel MacKenzie 20:10, 23 August 2005 (UTC)
- As of two or three days ago, MediaWiki pages no longer parse HTML at all...wikitext only. So, this has again gotten a significant redesign. Now all we need are help links on the edit page... --Connel MacKenzie 15:07, 27 August 2005 (UTC)
- I arrived at some of these pages from the list of all templates, and must admit that my first reaction was, "What is this nonsense? What's he trying to do now?" For now I'm willing to maintain an open mind on the matter, and wait to see what happens. The MediaWiki:Nogomatch page still allows a person to create an article in the normal manner without leading us to a page named "$1". You will probably need to do more documentation, and a bit of a selling job to get the oldtimers to use this, especially since, as you suggested to Paul, most of us get to new pages by following red links. Eclecticology 08:36:46, 2005-08-30 (UTC)
- I would like to offer kudos on this particular implementation of this extension. I have, effectively, left the Wikinews project over the extension being forced on us by its creator because it is implemented in a very unwiki way. However, this use of the extension allows users to choose whether or not to use it, and just how much "help" it will give. I'm very impressed, and have brought others over to Wiktionary just to show it to them. Unfortunately, as I'm not a regular contributor here, I can't comment on your templates, but the Mediawiki:Nogomatch is an excellent implementation. - User:Amgine/talk 05:54, 1 September 2005 (UTC)
- Just a minor clarification: I don't expect any of the "old-timers" to use these templates. I myself generally only use them when adding a batch of words from my Project Gutenberg frequency lists. I would appreciate some constructive criticism about the content of the templates themselves...keeping in mind that the only people likely to see these, are total novices to Wiktionary. --Connel MacKenzie 06:25, 1 September 2005 (UTC)
- Preliminary Notes. I think this is brilliant. Thanks again to Connel for his dedication.
- The single most important thing that helped me understand wiki was the simple ==English==, ===Noun===, #hrunk template found on ELE. I have copy and pasted it many, many times. Admittedly, I found "hrunk" a bit odd, and would have preferred something like "[insert real word here]".
- I have noticed a few entries beginning with '''{{PAGENAME}}''' and am wondering if that's a shadow of the template not being used correctly or if that's the standard we are "supposed" to be using instead of '''word'''.
- I also found the red links on Mediawiki:Nogomatch to be a bit confusing, but that may be just because I'm a newbie. I (at first) didn't even try to click on them. Now that I've tried and know they work regardless, I'll try to review them in more detail. Cheers, --Stranger 14:51, 1 September 2005 (UTC)
- Thank you. {{PAGENAME}} is a wiki variable that has the side-effect of displaying correctly. For my own javascript, I automate the substitution of the correct term in the variable's place. On days when wiki searching is permitted, I search for that string in the main namespace - simply editing those articles automatically "corrects" that (for me, with that javascript in place.) I am very leery of putting that (or similar) code in the site-wide javascript until this is better tested and there is wider community support of this proof-of-concept. --Connel MacKenzie 17:18, 1 September 2005 (UTC)
- Javascript-talk is over my head, my friend. Oh, I also wanted to say that I was delighted to see you included the anon's suggestion of a "basic" "intermediate" and "advanced" template. I thought that was a really good idea. Cheers, --Stranger 17:57, 1 September 2005 (UTC)
- Aside: You can also see my comments on ELE on that discussion page. --Stranger 03:38, 4 September 2005 (UTC)
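The substitution of the entry title for the {{PAGENAME}} variable, mentioned in the thread above, can be illustrated with a small sketch. This is not the javascript referred to in the discussion; it is a hypothetical Python function, written only to show the kind of replacement involved, and the sample entry text is made up from the "hrunk" template quoted above.

```python
def fill_pagename(wikitext: str, title: str) -> str:
    """Replace the literal {{PAGENAME}} variable with the entry title.

    Hypothetical helper for illustration: it performs a plain string
    replacement and leaves any other template (such as one already
    written with subst:) untouched.
    """
    return wikitext.replace("{{PAGENAME}}", title)


# A made-up entry in the style of the "basic" template discussed above.
SAMPLE = "==English==\n===Noun===\n'''{{PAGENAME}}'''\n# hrunk"

if __name__ == "__main__":
    print(fill_pagename(SAMPLE, "word"))
```

Whether this should happen when the page is saved (as with subst:) or afterwards by a script is exactly the open question raised in the thread; the sketch takes no position on that.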
make image uploads admin-approved?
I hate to suggest this, but due to the recent rash of "ass pus" vandalism, would it be a bad idea to make it so that uploaded images have to be approved by a sysop? I know that this might be a burden to legit users, but honestly, I don't really see a need for images (Wiktionary isn't an encyclopedia).
Just a thought. --Ixfd64 07:02, 1 August 2005 (UTC)
P.S. I kinda miss the "ass pus" vandal's old style - they're much easier to revert.
- In fact we could switch off image uploading entirely. There is another project - WikiCommons - for this purpose. We should migrate all our images and sound files over there. Referring to them is not harder than it is now. When a file is not found on a certain project it will be looked for on Commons. It makes a lot of sense to put all images, sound files, etc together in one place. It avoids a lot of duplication of effort and storage as they can be used and reused by all the projects.
The next step could be to not allow images from external sites anymore. Of course, as we take measures to counter vandalism, some vandals may get more sophisticated. It can't be helped. Polyglot 08:03, 1 August 2005 (UTC)
- As long as turning image uploading here off does not prevent us from referencing files on commons, I agree then that we should have the local-upload capability turned off. Commons is a more logical place to have lots more sets of eyes watching. For audio files, most of us have been using commons anyway. --Connel MacKenzie 08:24, 1 August 2005 (UTC)
- It won't. Wikinews was configured that way for a long while, until it sorted out a fair use policy and local uploads could be enabled. All images had to be from Commons. Whether that configuration makes sense for Wiktionary depends on whether Wiktionary would ever want anything other than public-domain or GFDL media. Wikipedia and Wikinews certainly do, since they make use of logos, publicity photographs, screenshots, and the like. Wiktionary might not.
The proper place to discuss Wiktionary's image policy is the discussion page of Wiktionary's image policy, of course. Uncle G 17:14, 2 August 2005 (UTC)
- Given commons.wikimedia.org's recent propensity for deleting our front-page logos, I am not sure we are entrusting the right people to guard over our media files. Perhaps we should have our primary copy locally, and optionally a second copy loaded on commons for anyone else to use? I am no longer convinced that "the commons" is a safe repository for us. --Connel MacKenzie 15:24, 8 August 2005 (UTC)
- Could we have our logos locally, so we don't need to trust Commons for them and the rest of the images and sound files on Commons.
- The REAL QUESTION is though, whether I can go and ask a developer to switch off uploading of images by non-sysops? Could we have a vote on that? Or at least a handful of people in support and not more than, say 2, against? It makes far more sense for most material to be on commons, where it can be shared between projects. It's time to end the cat and mouse game with our dearest persistent contributor of unsavory vandalism. Polyglot 12:32, 7 September 2005 (UTC)
- Supported. Having everything in one place (i.e. the Commons) makes sense to me. I hope they have, as has been said, more eyes to look out for vandalism. It's frustrating being alone trying to combat it. --Stranger 22:12, 7 September 2005 (UTC)
- Support. Definitely - what kind of images could we possibly ever need which are so hard to get by, that there cannot be any GFDL/PD versions? (and hence forbidden on commons) \Mike 16:18, 25 September 2005 (UTC)
- Support. Don't know how I missed this earlier. --Connel MacKenzie 01:25, 22 October 2005 (UTC)
- Support. Commons is the better place for images and the like. The images that have come through Wiktionary lately, whether intentional vandalism or not, have been copyvio or irrelevant to the project. --Dvortygirl 17:37, 23 October 2005 (UTC)
- Support. Polyglot 20:03, 23 October 2005 (UTC)
- Whatever is decided, would you update Wiktionary:Pictures accordingly. It's rather a mess right now. Cheers, --Stranger 17:38, 9 September 2005 (UTC)
Should this be restarted as a new formal vote, or is the unanimous support sufficient? --Connel MacKenzie T C 20:37, 16 December 2005 (UTC)
Wantedpages
I happened to look at Special:Wantedpages and noticed that the first 500+ were short strings of hiragana text. Some of these are actual Japanese words, but many are not. They are high on the Wanted ranking because they have all got multiple links. On inspection it seems that they are there because at some stage a dictionary of hanzi/kanji (Chinese characters) has been loaded in, and the Japanese readings of those characters have been created as internal links. I guess this was done by some bot at load time.
I think having all those Japanese readings as links, implying that they are free-standing words (most are not) is a bad thing. In particular, it is clogging up the Special:Wantedpages with extraneous words and preventing the genuinely needed words from being listed at the top.
Can I suggest that some house-keeping person remove the link tags from the Japanese readings in the Chinese characters entries. If at some later time actual examples of the usage of those hanzi/kanji are added, links can be created to actual words, e.g. in a Japanese-English or Chinese-English dictionary. --JimBreen 00:17, 11 August 2005 (UTC)
- That special page is not updated regularly; it is only updated when the #mediawiki-tech technicians run the linkUpdates.php script. I've asked a couple of times for an update, but they have not done one in a while. They also respond with silence to my requests for a new (or even better: scheduled weekly!) xml dump. :-(
- I shall ask them again. If anyone knows of more expedient ways to reach the Wiki*-boxen-admins, please forward the request on. --Connel MacKenzie 01:37, 11 August 2005 (UTC)
- I've also asked several times on several IRC channels. I've also asked the people there what the proper way to ask is. Usually I am thoroughly ignored. If I keep at it I get fobbed off, told I'm on the wrong channel, etc. If I still persist, I'm told it can't be done now because it has to be done when the wiki is pretty quiet. If I ask to have it done in the near future when things are pretty quiet, things go silent again and when I check weeks later nothing has been done. I think we should make a push to be heard by the people who can do this. We need a channel for requesting updates and we need to know what that channel is. Then we need to have our wishes respected by those who are capable of putting them into action. — Hippietrail 05:49, 11 August 2005 (UTC)
- Perhaps we should be making the requests on mediawiki or foundationwiki or some other non-irc channel place? --Connel MacKenzie 05:36, 12 August 2005 (UTC)
UPDATE: Silently, inexplicably, after almost two and a half months, the special pages were updated between last night and this morning. (Muke noticed first, I think.) I hope this is on a weekly or biweekly update schedule now, since we still have no official statement about who it is that is supposed to be keeping these up to date. --Connel MacKenzie 15:46, 13 August 2005 (UTC)
AND the special pages that show Broken and Double Redirects have been refreshed - there are LOTS! SemperBlotto 16:10, 13 August 2005 (UTC)
- UPDATE: As per the mailing list message (anyone here on the mailing list?) Brion refreshed the xml database dumps last night. So I shall now refresh my todo list, my Project Gutenberg list, protected pages lists, etc... --Connel MacKenzie 13:07, 14 August 2005 (UTC)
Can I put in another plug for stripping out the Japanese fragments from Special:Wantedpages? There are over 500 of them and I don't think they are wanted. --JimBreen 07:14, 21 August 2005 (UTC)
- It's a wiki. You can enter 500 stubs? Or even better, find the "bad" template that references them, and de-wikify them. --Connel MacKenzie 07:22, 21 August 2005 (UTC)
- If I knew how to do the latter I would. They appear in the hanzi/kanji pages as "すい (sui)", etc. i.e. with double square brackets around the すい. It's odd because neither the Chinese (Pinyin) reading nor the Korean reading (in Hangul) is linked. --JimBreen 02:37, 22 August 2005 (UTC)
- I certainly am misunderstanding the question. The term すい seems to have a definition, so it shouldn't be on Special:Wantedpages (and does not seem to be.) For each and every term on that list, you can open that term in a new tab, then click on "What links here" from the toolbox on the left side of the screen.
- Perhaps it would help if I firmly understood what a "reading" is. Are you saying the information in the linking articles is correct, but that they should not be wikilinked? --Connel MacKenzie 01:26, 29 August 2005 (UTC)
- Answering your question first, yes that is exactly what I mean - they should not be wikilinked. For what I mean by a "reading", look at [光]. You'll see three Japanese readings for 光: an "On" (Chinese derived) pronunciation of こう (kō), and two Kun (native Japanese) readings of ひかる (hikaru) and ひかり (hikari), all of which are wikilinked. The problems I have are:
- the こう (kō), relates only to parts of words, e.g. 光学 (kōgaku: optics). The こう wikilink actually goes to a totally different word (Japanese is rich with homophones, and has a squillion words pronounced こう.)
- the ひかる and ひかり are in this case actual (single) Japanese words, but for many kanji they are not. For example there are over 30 kanji with the Kun reading of ひく (hiku). Just wikilinking to ひく has a 97% chance of pointing at the wrong one. (Also in the case of ひかる, only the ひか portion is the reading of the kanji. The る (ru) is an inflectional verb ending and only written in kana - a practice called "okurigana".)
- All these links should be blown away, and where appropriate rebuilt pointing at the right things when and if the kanji entries are rebuilt with real information. --JimBreen 04:36, 5 September 2005 (UTC)
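The de-linking requested above could be sketched roughly as follows. This is a hypothetical Python helper, not an existing bot: it assumes a "reading" link is a wikilink whose target consists only of hiragana, and it would de-link genuine hiragana words too, so it would need to be restricted to the reading lines of the hanzi/kanji entries.

```python
import re

# Assumption: a reading link is [[...]] containing only hiragana
# (U+3041 to U+3096) or the prolonged sound mark (U+30FC).
KANA_LINK = re.compile(r"\[\[([\u3041-\u3096\u30FC]+)\]\]")


def delink_kana(wikitext: str) -> str:
    """Strip the double square brackets from hiragana-only wikilinks.

    Links containing kanji, Latin letters, or a pipe are left alone.
    """
    return KANA_LINK.sub(r"\1", wikitext)


if __name__ == "__main__":
    print(delink_kana("[[すい]] (sui)"))
    print(delink_kana("[[ひかる]] (hikaru), [[光学]] (kōgaku)"))
```

The romanized form in parentheses is untouched, so the reading information stays in the entry and only the misleading link is dropped.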
- I'm sorry that this has been ignored so far. Much effort is currently being diverted to rather pointless bickering. On each of these entries, clicking the "What links here" link of the left column's toolbox will display the pages that link incorrectly. Perhaps everyone could help out de-linking them. (Also, I think on Special:Wantedpages each term also has a (links) link that will also display them. Using tabbed browsing helps immensely with this type of task.) --Connel MacKenzie 12:23, 12 September 2005 (UTC)
Irregular verbs category
I was looking at the Category:English_irregular_verbs. There is also the Irregular Verbs appendix.
I think it's fair to say that it is a bit of a mess.
- Some irregular forms appear in the category in place of the infinitive (withstood, sprung), but most verbs are only listed in the infinitive. In some cases, everything is listed (catch/caught, go/goes/going/gone)
- Several verbs or forms are listed as irregular that have no irregularity (randomise, employ), or conform to the "-ch" + "s" -> "-ches" rule (match and inch are listed, but not patch, pitch, pinch, hitch etc).
- Some verbs are listed because they have an archaic form or UK alternative spelling that is irregular (chide, abide) but most are not (geld, smite, strive, dwell, slay)
- Many irregular verbs are missing from the category altogether (rend, rid), particularly compounds (mislay, overhear).
So, my questions are:
- Should only the infinitive be in the category, or irregular forms also?
- Can we confirm that match, inch etc are not to be considered irregular?
- Is a verb to be considered irregular if it has any irregular form (or alternative spelling), or only if it is irregular in an extant form in at least one major dialect?
- There is also the more complete and consistent Irregular Verbs appendix. What is the difference in purpose between these two pages?
Allan 07:30:00 11 Aug 2005 (UTC)
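One way to make the "-ch" + "s" -> "-ches" point above concrete is to check a verb's forms against the ordinary spelling rules. The Python sketch below is a deliberately simplified assumption of those rules (it ignores consonant doubling, as in stop/stopped, and other special cases), and its function names are hypothetical.

```python
VOWELS = "aeiou"


def regular_third_person(verb: str) -> str:
    """Third-person singular under the usual spelling rules (simplified)."""
    if verb.endswith(("s", "x", "z", "ch", "sh")):
        return verb + "es"
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in VOWELS:
        return verb[:-1] + "ies"
    return verb + "s"


def regular_past(verb: str) -> str:
    """Past tense under the usual spelling rules (no consonant doubling)."""
    if verb.endswith("e"):
        return verb + "d"
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in VOWELS:
        return verb[:-1] + "ied"
    return verb + "ed"


def looks_regular(verb: str, third: str, past: str) -> bool:
    """True if both attested forms match what the simplified rules predict."""
    return third == regular_third_person(verb) and past == regular_past(verb)


if __name__ == "__main__":
    print(looks_regular("match", "matches", "matched"))
    print(looks_regular("catch", "catches", "caught"))
```

By this test match, inch, employ and randomise come out as regular, while catch does not, which agrees with the objections raised in the list above.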
- Perhaps you could sign up and take the lead in sorting this out. :-) Personally I tend to ignore this category as being of limited use, but I don't complain if others want to use it.
- What would you suggest?
- Personally, I don't consider them irregular.
- What would you suggest?
- The "Appendix" is older, and predates categories. It is top down in that it lists things whether or not we have an article, and thus can be useful as a want list. The category is bottom up in that it is generated from tags on the article. In theory, a category could replace the corresponding list, but this may be unwise except where the list is close-ended and all items have articles. Eclecticology 14:48:24, 2005-08-11 (UTC)
- I started going alphabetically through a list of irregular verbs out of one of my dictionaries in order to add them to Category:English irregular verbs. I also format existing articles, mainly to add the Template:irregverb I created (give me some feedback! So far I've done A to D). I also planned to use Template:regverb and Template:sb. To 1 to 3:
- It would be nice to have a category for infinitives only. But a category with all irregular forms might be interesting as well and certainly won't do any harm (One point to think about here: Only irregular forms, or all forms? Ex. gr: Should sawed be listed? It is the past tense and past participle of saw, but the latter can be sawn as well.) By the way: Given a category with subcategories, is it possible to display an alphabetical list consisting of the entries of both the category and its subcategories (maybe even sub-, subsub-, and so forth categories)? Otherwise we would have to tag the infinitives twice.
- Yes.
- I would tag verbs whose irregular forms are obsolete. Ncik 18:26, 11 August 2005 (UTC)
- Thanks to whoever fixed my link to the category. Newbie as I am, I'm going to go away and try to work out why it needed a leading colon. (Ncik: Nice work on A to D).
- I think the consensus we're getting to is this. The existing "Irregular Verbs" category for the infinitives to belong to, and an "Irregular Verb Forms" category for those articles (e.g. swept) which do little more than point to the main headword.
- This should answer the "sawed" question, and the double-tagging of the infinitives: Sawn, since it has its own article, belongs to the "Irregular Forms" category. Sawed doesn't - to do so would create a nasty precedent of creating articles for every regular verb form (and, by extension, plural). Saw belongs to Category:English irregular verbs under the "any one irregular form" rule.
- This opens another can of worms, though. Passed redirects (through a mechanism I don't yet understand) to Pass, but Dreamed has its own article. I suggest that the regular participle Dreamed should redirect directly to Dream, but Dreamt should get its own article (which appears in the "Irregular Verb Forms" category). This is the convention in most paper dictionaries, I think.
- Allan 14:21:00 12 Aug 2005 (UTC)
- Allan, Passed and passed both were/are redirects, ultimately to pass. The Wiki* software does not follow redirects recursively anymore (due to nasty vandals.) In practice, passed should not be a redirect; it should have its own complete (brief) article. --Connel MacKenzie 01:30, 13 August 2005 (UTC)
- pass is in serious need of major reformatting - moving that particular example's discussion to the Tea Room. --Connel MacKenzie 01:52, 13 August 2005 (UTC)
Especially British ?
I am making an assumption (perhaps incorrect) that the majority of the contributors to Wiktionary are American. I am not an American, and nor am I British. Now that that's clear, I have a question for someone who has oodles more knowledge about dictionary writing than me.
Question: Is it correct to use "esp. British"? Looking at the entry for flat, which also means something along the lines of apartment, you can see that flat is listed against the apartment sense as especially British usage. But surely this isn't correct? What it should be labelled as is "esp. non-North American". Given the preponderance of American contributors to this dictionary, is it not going to be the case that anything that is not American English will be labelled as British? I think this style of writing should be made clear before the dictionary gets riddled with this type of error. Looking in Wikipedia for "flat" at least gives a taste of the widespread usage of the word.
- You could make things easier by letting us know what country you are writing from, so that we can have a perspective on the term from that country. The expression "especially British" says nothing about what happens in other countries, but neither does it exclude the use in other countries. Your suggestion of "esp. non-North American" may be even more misleading since it makes assumptions about a sweeping range of other countries. If, for example, you are giving the South African perspective it would not be safe to presume that the term is used the same way in New Zealand or Trinidad.
- I can add that in Canada usage is similar to that in the United States, but in some parts of the country "flat" is used to distinguish a separate apartment in a private home from one in an apartment tower. Eclecticology 09:27:01, 2005-08-13 (UTC)
- I'm American and I'm actually in agreement with you. Aside from the North Americans, the English speakers I've run into living in Taiwan use the word "flat" rather than "apartment". The distinction between American and British English is prevalent in dictionaries, but it doesn't do the rest of the world justice. Davilla 07:15, 14 August 2005 (UTC)
- There will always be a need for "esp British", just as there will be one for "esp Australian" (where, uniquely, chuck=chicken and arvo=afternoon). "UK" is a fairly widely accepted abbreviation for "Commonwealth English", and doesn't imply an anglo-centric view of the world, so maybe "esp UK" has the meaning you're searching for. However, it could just be that the writer didn't want to generalise outside their own dialect; if you know better, feel free to modify the article. Such is the wonderful world of wikis. Allan 15:40, 14 August 2005 (UTC)
- chuck never equals chicken in Australia. chuck means throw or vomit. The word you are thinking of is chook which sounds like chuck to North American ears and perhaps others. — Hippietrail 20:20, 14 August 2005 (UTC)
As the only American in the top 10 en.wiktionary contributors, I'm a bit offended that anyone might suggest that en.wiktionary is anything but extraordinarily British-centric.
Even though I am the most prolific American contributor, I have only approximately the same number of contributions as the top contributors from Belgium and Canada, perhaps 1/2 as many entries as the top contributor from Australia. And of course, only a fraction as many as either of the two top British contributors! About the same as Number 10, about 1/2 as many as number 5, and less than 1/3rd as number 3 are how my "prolific" contributions compare to the top British contributors'.
It is reasonable to assume that {{UK}} refers to Commonwealth English. I have made that assumption in almost every instance that I've used that template. The nice thing about it being a template is that uppity British can simply change the text of the template to say "Commonwealth English" at any point. I did not call it "UK" in an attempt to offend anyone (I really did not); it is just understood at face value to mean CW. If it is somehow ambiguous where you are from, then please, correct the text of the template. --Connel MacKenzie 16:28, 14 August 2005 (UTC)
- As further evidence of how astronomically British the en.wiktionary.org is, one need only look to center/centre, where very well intentioned contributors have done some absurd things. (Myself included there.) Notably, the removal (a while ago) of the meaning of centre for US happened rather silently, leaving all further conversations on the topic bordering on nonsensical. In America, "centre" is used, but not to mean a center of something, but only as a hoity-toity marketing gimmick to sound exotic (generally ignored as stupid.) The term is also restricted to shopping malls. (Note: not shopping centers.) The other meanings of the spelling centre generally do not exist in America. In an effort to unify meanings, this distinction is nowhere in sight now. I don't know if the distinction was only discussed here, or in WT:TR or on talk:center or talk:centre, or perhaps center or centre themselves. Maybe it was only on talk:theatre - who knows? Perhaps even someone's talk page?
- Wherever it was, it is not present now. It didn't make it into the conversation, so what I assumed other people could see, actually was not there.
- I suppose if every American started using Wiktionary, and each picked a single word to watch over and check for Britishification every day, American English might be represented well here. Right now, to call en.wikt: anything other than British would be misleading.
- It seems then that the "centre/center" example only suggests that these issues need to be viewed with more subtlety than has been the case. There is certainly nothing wrong with pointing out the differences, but a crusade to ensure that either POV is represented does not strike me as productive. Eclecticology 08:18:09, 2005-08-26 (UTC)
- Connel, could you please fix anything that I did wrong at center/centre? Did I forget to transfer something while trying to sanitize it? All I want to achieve is to not have the same content twice. Both versions will evolve on their own and that's bad. If a meaning applies to both spellings, put it at the common entry, if something applies only to center or only to centre, put it where it belongs. Polyglot 15:44, 26 August 2005 (UTC)
Um, before I edit either of those, I'd like to understand how ON EARTH defining terms accurately by region could be considered "POV"??? I've been the one here trying to ensure that both regions' meanings are explained clearly...neither pro US nor pro UK. That is the most NPOV tack possible! In fact, that is the only way to define the terms neutrally. The past five months or so have seen this issue several times, with that as the only amenable solution. --Connel MacKenzie 12:52, 12 September 2005 (UTC)
- I don't fully understand the POV remark Eclecticology made either. I believe it has something to do with your comment that you seemed to want every US citizen to watch over pages, in order to avoid them becoming too British. We need to find a middle ground somewhere, a compromise we can all live with. I agree that, if there is a difference then both (or even all four or ten) possible regional meanings have to be added. And if those differing meanings can be linked to a specific spelling, then they should also be on different pages. Probably this will also affect the translations list in that case.
- I would say, please go ahead and make the changes you think are necessary to reflect actual usage as well as possible. Thanks. Polyglot 14:01, 12 September 2005 (UTC)
- I noticed that in knock up, someone had tagged the "get pregnant" meaning as "North American Slang". Leaving aside the territory from Mexico to Panama (and the Caribbean?), I take it this implies that the usage is Canadian as well. I can't confirm or deny that, but I'm more interested in the general problem. What we need here is an explicit policy for default assumptions. From what I've seen, our implicit policy is something like
- If no regional tag is given, we assume the usage is universal in English.
- The UK tag appears to mean "The Commonwealth (possibly excepting Canada), possibly Ireland, and in general places that tend to speak British English"
- The US tag appears to mean "The United States (and possibly Canada) and presumably their possessions and territories"
- Tags for specific countries (e.g., Australia) imply that the usage is principally seen in the specified country (or countries) and not current elsewhere.
- I think we're close here, but there appear to be a couple of gray/grey areas. I think the general rules are fairly clear
- No regionality tag implies that the term is universal.
- If there is a specific regionality tag, the implication is that the term is not current outside that region.
- So what regions should we define? The short answer is: Whatever we need. There is no need for regions to form a nice nested structure. In particular, while countries are sometimes convenient, actual usage regions may lie within a given country, or lie in parts of several, or comprise one or more countries, perhaps along with parts of others.
- The actual list of regions should include:
- A general "British" designation. I would argue against calling it either "UK" or "Commonwealth". I don't like "UK" because it encourages people to generalize from a documented usage within the UK to a presumed usage everywhere. While it would be quite tedious to verify that a putative pan-British usage really is used everywhere, one should feel obliged at least to spot-check. E.g., if it's current in England, New Zealand, Pakistan and South Africa, but not in the US, it's most likely pan-British. I don't like "Commonwealth" because it implies that the Commonwealth is more linguistically uniform than it actually is. In particular, Canada throws a bit of a spanner in the works.
- Possibly a separate "UK" designation, though it doesn't seem likely that a usage would be current in all of England, Wales, Scotland and Northern Ireland without also being current in a few other places.
- Any country with a significant English-speaking population is potentially a region, though it's an open question whether they all would actually be used.
- Generally recognized designations such as "Southern US", "Scottish Highlands" and whatever else, regardless of whether they lie within or across national borders. This bit promises to be sticky.
- (Basically an extension of the previous item) Collections of countries which mainly share at least some specific usages, e.g. US and Canada (which is what "North American" was aiming at).
- Hmm ... thinking over how this might work in practice, I think we need one more piece. If I find a quote from, say, an English source and have reason to believe it's not universal, I would tend to tag it "England". However, by the reasoning above, this implies that it isn't heard elsewhere. The ambiguity is over whether the term actually appears not to be heard elsewhere, or whether we just don't know whether it is or not. Unfortunately, the best way to determine this is to poll native speakers, which is not practical for us. So instead, we need a way to distinguish "English and maybe elsewhere" from "English but probably not elsewhere", the latter really meaning "If you hear it elsewhere, it's probably someone English or someone trying to sound English."
- This is really a case of a more general problem we have: how to distinguish "most likely not" from "don't know" or even "pretty sure, but don't have definite citations to back it up." This has already caused endless grief on RFD, where the analogous question is "Is the term used at all?" as opposed to "Where is the term used?", but the same problem is lurking with any piece of information we present at all — pronunciation, etymology, spelling ... anything.
- I think similar solutions will work here, too. E.g., Introduce a tag template for "regionality uncertain", together with a category to aid in finding entries that need to be researched further.
- Actually, there's a finer-grained solution: Carry a specific list of both "observed in" and "probably not used in" regions. E.g., "Observed in UK, Australia, probably not used in US". Strictly speaking, this is all we really know. The rest is inference, though often reasonable inference. -dmh 04:19, 10 October 2005 (UTC)
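As an illustration only, the "observed in" / "probably not used in" idea above could be modelled roughly as follows. This is a minimal sketch in Python, not an existing Wiktionary template or tool: the class name `RegionalUsage`, its field names, and the label wording are all hypothetical. Regions in neither list come out as "unknown", which is the "don't know" case the proposal wants to keep separate from "most likely not".

```python
from dataclasses import dataclass, field


@dataclass
class RegionalUsage:
    """Hypothetical record of where one sense of a term has been seen."""
    observed_in: set = field(default_factory=set)
    probably_not_in: set = field(default_factory=set)

    def status(self, region):
        """Return what is actually known about usage in one region."""
        if region in self.observed_in:
            return "observed"
        if region in self.probably_not_in:
            return "probably not used"
        return "unknown"  # no evidence either way

    def label(self):
        """Build a display label; an empty label means no regional tag."""
        parts = []
        if self.observed_in:
            parts.append("Observed in " + ", ".join(sorted(self.observed_in)))
        if self.probably_not_in:
            parts.append(
                "probably not used in " + ", ".join(sorted(self.probably_not_in))
            )
        return ", ".join(parts)


usage = RegionalUsage(observed_in={"UK", "Australia"}, probably_not_in={"US"})
print(usage.label())           # Observed in Australia, UK, probably not used in US
print(usage.status("Canada"))  # unknown
```

Under the default assumption discussed above, a record with both sets empty yields an empty label, i.e. no regional tag at all.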
Templates within templates — second take
All my attempts to incorporate Template:audio into Template:irregverb have failed (See the top of my user page). What is the problem, and how can I fix it? Ncik 17:39, 21 August 2005 (UTC)
- Yes. Request deletion of the redundant template:irregverb. Pronunciation sections are for individual articles, not all commingled onto a single page (even though I've expressed in the past that I think many things should be combined.) The pronunciation section covers the audio files as only a portion of what belongs in that section. Adding the same information in again a second time, causes inconsistencies. Note also that other Wiktionarians have devised what they prefer as an acceptable article format. Being stubborn about your proposed inferior format will garner you more tepid responses, such as mine. --Connel MacKenzie 22:44, 21 August 2005 (UTC)
Anyone else? Ncik 21:48, 22 August 2005 (UTC)
- I agree with Connel. SemperBlotto 13:51, 23 August 2005 (UTC)
- Re: your comment on dig. Is this the same query, Ncik? I'm just trying to follow the discussion and want to be on the same page. --Stranger 03:47, 31 August 2005 (UTC)
- The problem here is that people (this now includes me) don't want information about inflected forms (be it pronunciation, homophones, spelling variants, etc.) included on the page of the 'basic form'. My comment on dig was purely concerned with layout: I consider the en-infl-foo templates clumsy and too linear in appearance, and my (ir)regverb template a much better solution. I suggest we continue this discussion here. Ncik 13:42, 31 August 2005 (UTC)
- I think the discussion should remain here, in the Beer Parlour. There are many talk pages - too many for people to monitor all the time. The BP is monitored more regularly. I would hate to have people think you were trying to sneak behind their backs by suddenly moving the conversation to a less noticeable location.
- Let's continue here then. Ncik 09:10, 2 September 2005 (UTC)
- I also noticed otaku where you used the colourful column template for a noun. I think the typical procedure here is to try to gain a consensus before making drastic changes.
- See WT:BP#Irregular verbs category. And even if I hadn't mentioned the templates there: This is the world of wikis, after all. Ncik 09:10, 2 September 2005 (UTC)
- I guess I'm still trying to figure out your basic query, Ncik. Are you trying to garner support for your new template? Or do you want to talk about inflections being included on the "basic form" (which I'm not sure what you mean by that)? Or both?
- If you like my templates, use them. If you don't like them, tell me what you don't like about them. There is no question that we want to give inflected forms on the page of the basic form (ie the infinitive for verbs, singular for nouns, positive for adjectives,... you get the idea) of a word. The problem was that I wanted to include information about these inflected forms on the page of the basic form. A poorly considered practice I have to admit. Ncik 09:10, 2 September 2005 (UTC)
- I'm new here - so I could be all wet. You've been around longer than I, so please be nice if I've got something wrong. Peace. --Stranger 18:44, 31 August 2005 (UTC)
- I would request that dig not be edited while we talk about it. It's nice to have an example. --Stranger 12:06, 4 September 2005 (UTC)
- Strike that. I made a copy of the page for my own uses. --Stranger 13:06, 4 September 2005 (UTC)
I like Uncle G's inflection templates. I dislike your templates because they are an intentional duplication of effort. By adding a different template scheme, you are intentionally making the articles themselves harder to parse programmatically. He devised his templates long before you; you could have offered your constructive criticism to them, rather than pretend that you've come up with something "new." Oh wait, you did criticize them, but your critiques were discarded as incorrect, weren't they? So now you "invent" these ones...
About dig, I cleaned that up on a separate cleanup pass, not related to this, without realizing this was now the "sample" for discussion. Not that there is anything being discussed - even Ncik sees now that his premise for forking them was incorrect. Based on his comment above, it is clear these are about ready for deletion. --Connel MacKenzie 14:29, 4 September 2005 (UTC)
- Sorry, y'all. I didn't realise this had such a long history: Wiktionary_talk:Entry_layout_explained#Vandalism_in_progress. My apologies to the community for whatever part I may have played in stirring something up best left alone. --Stranger 13:32, 5 September 2005 (UTC)
Missing Webster words
Does anybody know why a significant number of words, which have entries in Webster 1913, have no article here? senile, distill, seemingly and affiliation, for example (although I added affiliation yesterday). I've looked at a random sample of requested articles and many have perfectly good Webster entries which didn't seem to get imported. Allan 14:20:00 2005-08-22 UTC
- Whilst Webster 1913 is useful as a source of content to top up an article with etymologies and other information that is unlikely to have dated, some editors have regarded a mass-import from Webster 1913 as a bad idea. See Wiktionary talk:Webster's Dictionary (1913) and Wiktionary:FAQ#Writing_definitions. Uncle G 16:22:08, 2005-08-22 (UTC)
- Thanks - I didn't realise that the import had been halted or the reasons why. Allan 20:00:00 2005-08-22 UTC
- If the import has been halted it's only because nobody has been doing it. The Webster 1913 material is a good sound basis for starting many of these words, and is to be encouraged. Eclecticology 08:01:01, 2005-08-26 (UTC)
- Yep ... on further investigation (and taking soundings from other editors), the "anything is better than nothing" argument seems a strong one. Unfortunately, some previous imports came in unwikified, which dumped the difficult part of the import on the rest of the community and gave the idea a bad name. I'll take a look at ways we can use Webster's to fill in gaps without leaving a long list of rfc articles. allan 06:32 2005-8-31 UTC
- Has the option of importing to another namespace been considered? -- Nick1nildram 12:36, 4 September 2005 (UTC)
- The mass import of Webster 1913 material was halted because people — both frequent contributors and people wandering by — looked at the definitions and complained. The Webster 1913 definitions fall into four rough classes:
- Reasonably good, and non-trivial (e.g. vestal, vedette). Fairly rare.
- Good, but trivial (e.g. vestigial). Somewhat less rare.
- Quaintly "musty" in smell and/or missing significant modern definitions (e.g. vestige, verbena, veil, uproot, troll, treat, treadmill, thrifty ...) There are quite a few of these, things that make you go "huh?" and which have led newcomers to wonder just what we were up to.
- Downright bad (e.g. toupee, toot, and ones that I can recall actually generating complaint like punk, the verb senses of slight, and several of the original definitions of a — "barbarous corruption" seems a bit POV to some folks)
- Note that many of the articles here have been cleaned up. Check the first history entry to see what I'm talking about. For example, troll has had both etymology and the modern internet sense added, and punk has been completely reworked. Note that the original script seems not to have brought the etymological information along.
- At that point, we instituted the {{webster}} template specifically to flag such entries. Unfortunately, the original script was written before the template. If someone really wants to do something useful with scripts and Webster 1913 entries, it would be great to tag any entry whose history consists solely of User:Poccil and the conversion script with the webster template. This would make it much easier to clean some of the cobwebs off the entries that are already there. Another good project would be to bring in etymological information for articles that lack it, so long as the origin of the information is tagged in case of errors.
- Really the major complaint about the original Webster import was that it was done wholesale, without editorial review, and without any easy way to tell what had happened. If we can get the original importations which have not yet been worked over tagged for what they are, and if all new importations are also tagged, I don't have a great problem with resuming the process — as was said at the time by myself and others. A separate namespace has indeed been suggested and I'd be happy with that.
- I do, however, strongly dispute the notion that the importation as it happened was a good sound basis and was halted only because no one was doing it. Given the quality of the results — and with all due respect to User:Poccil for making the effort in good faith — I'd say it was more of a stopgap, of decreasing value as our entry count heads toward 100,000 and more contributors, bless them, join in. It was stopped because people actively complained, and for good reason, and not "only because no one had been doing it". Frankly, Ec, you ought to know better on that one.
- Myself, at this point I'd really rather see people put in new definitions for missing words and be done with it. -dmh 04:24, 8 September 2005 (UTC)
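As an illustration only, the tagging rule suggested above (flag any entry whose history consists solely of the importer and the conversion script) might look something like this. It is a rough sketch in Python over made-up data, not a working bot: the account name "Conversion script", the function names, and the placeholder entry titles are assumptions, and fetching real page histories is deliberately left out.

```python
# Accounts whose edits count as "untouched import" (names are assumptions).
IMPORT_ONLY_EDITORS = {"Poccil", "Conversion script"}


def needs_webster_tag(history_editors):
    """True when every revision in the history came from an import-only editor."""
    editors = set(history_editors)
    # An empty history gives no evidence, so it is not tagged.
    return bool(editors) and editors <= IMPORT_ONLY_EDITORS


def entries_to_tag(histories):
    """Return the titles, sorted, that have never been edited by anyone else."""
    return sorted(
        title for title, editors in histories.items() if needs_webster_tag(editors)
    )


# Placeholder titles and histories, purely hypothetical.
sample = {
    "entry-a": ["Poccil", "Conversion script"],
    "entry-b": ["Poccil", "SomeOtherEditor"],
    "entry-c": ["Conversion script"],
}
print(entries_to_tag(sample))  # ['entry-a', 'entry-c']
```

Entries that any other contributor has touched are skipped, on the assumption that they have already had at least some editorial review.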
- The original importation of the Webster material was done by a bot. The first couple of pages went directly into their proper article titles, and the rest into a "Webster 1913" pseudonamespace. The original practice was stopped because it was impossible to distinguish which articles were actually edited and which were imports still in need of editing. The latter practice was stopped because the rapidly building quantity of material was overwhelming a project that was still very small. The nearly 1000 articles imported up to that point barely got us to words that begin with "ad-". Poccil's work was not done until more than a year after the bot was stopped. His imports were seldom more than selected elements from what was in Webster, and by all appearances were done manually. Your allegation that the importation was stopped because of complaints about the definitions is at best speculative.
- Your pontifications about some of the words are a source of merriment. How do you manage to characterize vestigial as trivial? It is still as valid and useful to-day as it was a century ago. Toupees haven't changed their meaning much in the interval. Only one Webster meaning of troll is shown, and leaves one with the erroneous impression that the internet meaning is somehow related to the Scandinavian mythic creature.
- The vast majority of that 1913 material is just as good to-day as it was when it was written. (The OED could nevertheless be used to balance some of Webster's Americanization of the language.) The old entries are an important part of the historical record for these words, and as such have come to shape the language as we now know it.
- Of course, we need to fill in the details of what has transpired in the last century. But where do we get these meanings? Usage dictates meaning, and not the other way around. A good selection of citations will circumscribe a word's field of meanings, and allow us to distill an appropriate definition. Where would you find a modern definition? Assuming that the AHD is a reliable dictionary we can't use their material out of fear of copyright infringement. If we just make up a definition based on a best guess we risk inaccuracy, or at least wandering into the realm of original research. In that uncertain environment it's no wonder that I prefer the old meanings. Eclecticology 08:13:01, 2005-09-11 (UTC)
Pronunciation of US English
Another user has commented on how I have represented US pronunciation, in particular, the use of the IPA symbol /r/. In Received Pronunciation, /r/ is the sound you would expect a "posh" English person to use in "Round and round the ragged rock the ragged rascal ran", a sort of slightly rolled "r". In US English, to my knowledge, the "r" sound is pronounced with the tip of the tongue further back (behind the ridge at the top of the front of the mouth rather than in front of it), which has a different IPA symbol representing it.
So a few questions (it was going to be one but they multiplied as I was writing them):
- Should we be aiming for precision in the representation of variations in language or simplicity, allowing one symbol to stand for equivalent (but different) sounds in different varieties of English? The symbol representing "r" in US English is either /ɹ/ or /ɻ/, I believe - I can't remember which. There is also /ɾ/ which is the sound used in accents with flapping for intervocalic "t" and "d". Which method is appropriate?
- A subtler question, which I've asked before: what is the standard for US English? UK English has RP, but I'm not aware of there being a "standard" accent for US English. That's not to say there isn't one - I'm just interested to know what it is, if there is one. If we can find out, I think we should probably draw up a table of suitable IPA (and their equivalent SAMPA) symbols for use in transcribing US pronunciations. — Paul G 09:07, 23 August 2005 (UTC)
- The same goes for English spoken in Australia, New Zealand, South Africa and elsewhere in the Commonwealth, of course. Is it within Wiktionary's remit to provide pronunciations for these Englishes too? I don't think I've seen any given here.
- This means, unfortunately, that a lot of the pronunciations currently given are unsuitable or incomplete as they do not specify where they apply. (Some are just plain wrong as well, but we can correct these as we come across them.)
I think it's important we agree on some sort of standard for the sake of consistency and to avoid a lot of cleaning up at a later stage.
— Paul G 09:07, 23 August 2005 (UTC)
- If you ask me, I would like to see IPA representations that are portable between Wiktionary projects. Every IPA symbol stands for a specific sound, I'd prefer we use it that way, instead of romanizing (i.e. using recognizable symbols, when an English speaker won't be able to tell the difference between sounds). Of course, then we will have to add where a certain pronunciation is used in a lot more detail, because it gets a lot more specific.
- I certainly think it should be within Wiktionary's realm to provide pronunciations for other places in the Commonwealth. Eventually people will show up who will want to do just that. So we provide room for them, even when nobody seems to be interested to do so right now. A placeholder in the scheme, is all it takes.
- I think I added some pronunciations while trying to grok IPA. Sorry for any that are wrong. Polyglot 15:50, 23 August 2005 (UTC)
- Personally, I'd like to see two different layers of precision. In our basic, generalised pronunciations, we should do exactly that: generalise them. That is, distinguish sounds only as they are distinguished phonemically for those accents. In English, no accent distinguishes /ɹ/ and /r/, at least, not that I am aware of, making that distinction redundant, especially since, as Paul alludes to, US English uses /ɻ/ (retroflex r) on the whole, while most British accents use the alveolar form, /ɹ/. I'm saying this because I intend for the pronunciations we give there to apply as broadly as they can, and to be as simple as possible to read.
- However, I would also like to see narrower transcriptions next to, or as part of, the link to audio pronunciations. Most of us, as amateurs, wouldn't be able to (and thus shouldn't) transcribe all the phones in a particular pronunciation, but we can at least distinguish certain things which are more obvious: flapped intervocalic 't' and 'd', rhotic approximants, syllabic consonants etc. etc. As a corollary to this, I'd like to collect as many audio pronunciations as possible for each word, akin to the way we've begun to collect citations. This, of course, would need to be discussed more thoroughly, as we don't want a long list of about 30 very similar pronunciations on each page (so I'm thinking, more subpages, or even on the Citations subpage). -Wytukaze 16:07, 23 August 2005 (UTC)
- Wytukaze, that (overloading a page with "too many" pronunciations) will never happen. Not one British person has yet hooked up a microphone and used Audacity to create an .ogg of Wiktionary. I am beginning to think we may never get one! :-)
- Despite not owning a microphone, I may be able to explain how I, a Briton might say Wiktionary. As I see it, there are two possibilities (sorry, I have no understanding of IPA).
- Wik (rhymes with 'lick') - shun (rhymes with 'bun' or 'done') - ree (rhymes with 'tea', 'me' or 'disagree')
- Wik (rhymes with 'lick') - shon (rhymes with 'don', 'scone' or 'add-on') - eree (rhymes with 'Londonderry', 'merry' or 'adversary') (actually 2 syllables, but describing them separately would be difficult).
- Does that help at all? Celestianpower 18:52, 24 August 2005 (UTC)
- As to your alleged distinction between /r/ and /ɻ/ and /ɹ/, we may never know if there is that difference between US and UK because we have never heard an .ogg of a UK speaker saying them. I suspect the difference is nowhere near what you suggested...but without a Brit with a decent microphone, we'll never know! :-)
- OK, said a different way: Please, plEASE, pLEase, please, PLEASE hook up a microphone. I have tried to comprehend IPA but I cannot. I can't imagine other Americans, who also have never seen IPA symbols before, would somehow be expected to enter "flapped intervocalic 't' and 'd', rhotic approximants" even if such a concept might be obvious to you. --Connel MacKenzie 18:32, 24 August 2005 (UTC)
- Hah, well, as I have explained numerous times on IRC, I would like to produce audio pronunciations, but my microphone is broken :P. Additionally, I speak in a variant of Estuary English (specifically the Milton Keynes variant), not in Received Pronunciation. Of course, that doesn't make my audio pronunciations less valid. As for Americans not understanding IPA, that isn't true; as many Americans understand IPA as Brits, I should imagine. However, all that flapped this and rhotic that I was talking about isn't too difficult to explain. The former is essentially a short t or d, the way your average American says butter or shudder. The latter is merely how all of us say our Rs, as in, unrolled. The difference between /ɻ/ and /ɹ/ isn't that great either, the former is just a bit further back in the mouth. The reason why I describe these as obvious is that they are well described in the linguistic literature, so if you're speaking with an average American accent, you'll be saying /ɻ/ and flapping your intervocalic Ds (/ɾ/); even if you can't yourself fully distinguish the sounds, you know whether you're speaking like everyone else or not, ne? In fact, go to commons:General phonetics, and click on the symbols to hear examples of the sounds. --Wytukaze 19:30, 24 August 2005 (UTC)
- OK, I've added a UK pronunciation to wiktionary. For those who are wondering, "wik-tshone-airee" would be the received pronunciation. As the accent moves further from RP (which correlates very closely with lowering social class), the t quickly vanishes, the "o" moves closer to a schwa (like the vowel in 'put') and the 'a' moves to an 'e', getting shorter in the process. In a strong estuary accent (which is becoming the dominant accent in the UK), the 'a' vanishes completely and you get "wik-shun-ry". Have a listen and judge my social status for yourself. Allan 23:25 05-09-2005 UCT
- In parenthesis, I learned that [ɾ] appears in "merry" (intervocalic /r/), in the UK pronunciation. ―Gliorszio 16:40, September 8, 2005 (UTC)
Ha! I knew the East-pondians were joking about the three-syllable "Wiktionary." OK, so now, how do we get the main logo repaired? :-) --Connel MacKenzie 06:42, 14 September 2005 (UTC)
I am British and I have just recorded my pronunciation of Wikipedia: image:UK pronunciation of Wikipedia. Celestianpower 21:32, 25 December 2005 (UTC)
- And Wiktionary Image:En-uk-Wiktionary-2.ogg. Celestianpower 21:32, 25 December 2005 (UTC)
Layout style of Wikipedia links
It has come to my attention that wiktionary lacks a consistent layout style for links from wiktionary to wikipedia. The links appear in different places, and in different ways on wiktionary articles. I have witnessed the following:
- Text link in See Also
- Text link in External Links
- Template link at very top
- Template link in See Also
... and etc. Nowhere have i been able to find a policy saying what the preferred method of linking is. The Wiktionary:Entry layout explained page has nothing specific, and even mentions that a text link is suitable in See Also and says the same thing about External Links. This is not helpful.
Personally, i feel that the best layout is putting the {{wikipedia}} template at the very top of the article (that is, before ==english==). What this results in is the template box being placed in the upper-right of the page. In the upper-left is the Table of Contents. Thus, the template box balances the TOC, and also doesn't disrupt any formatting further down the page. It's also an obvious and recognisable way to say "here's more information if you need more than a definition". The language doesn't matter because we're linking from english wiktionary to english wikipedia, the link is therefore always english.
Regardless of which style is selected as best, i feel that having a standard, recorded in the Entry Layout Explained (which could be called 'guide' i think) would be helpful to current editors and new contributors alike. (I came to this confusion after trying to find where and how i should add a wikipedia link). Please comment -- Fudoreaper 20:55:32, 2005-08-28 (UTC)
- Personally, I prefer exactly the same style as you do. However, I have an idea that the most popular (or perhaps just populous) style is to link it under the ===Noun=== (etc) or ==English== headers. However, I think Wikipedia entries are somewhat language-independent (as evidenced by the interwiki linking practices), so matching an article to another article makes perfect sense to me. --Wytukaze 21:23, 28 August 2005 (UTC)
- For the graphical box-thingy templates {{wikipedia}} and {{wikipediapar|othername}}, they don't follow normal rules for placement in an article, because they are graphical in nature. Typically, that is placed above the ==English== header (because that should be the first header in our articles...with several exceptions.) When a short article has fewer than four heading sections, the table of contents is not auto-generated; therefore, to make the layout appear more logical, the template should be right after the second header. If there are other floating components (images, other graphical boxes) then the wikipedia template should be as close to the top as will allow for a reasonable layout. For ===See also=== or ===Further reading=== I prefer to use {{pedialite}}, which actually does make a little more sense to have at the bottom of a page (since it is not a graphical sort of box thing.) --Connel MacKenzie 23:11, 28 August 2005 (UTC)
- I like the appearance when the link is at the very top. However, for short articles the [edit] links are a mess, as shown on the Carl's Jr. page. If someone knows how to fix the [edit] boxes that would be nice. This problem only happens after the page is saved, not during preview. -- Nick1nildram 23:34, 28 August 2005 (UTC)
- What I just said. --Connel MacKenzie 23:53, 28 August 2005 (UTC)
- I meant fix the template. cummingtonite and loads of other pages also show that there is a problem with the template. -- Nick1nildram 00:25, 29 August 2005 (UTC)
- Kindly keep in mind any "policy" should take into account multiple wikipedia articles. See DOS as one example off the top of my head. Peace. --Stranger 00:55, 29 August 2005 (UTC) Also see bowling. --Stranger 02:42, 29 August 2005 (UTC)
- That's an excellent point. From the POV of a wikipedian, wiktionary is something like the ultimate disambiguation page. I rather like the templates, though. Should there be several in those cases? Meanings are often broken down in translations, but that doesn't seem the most applicable spot. Davilla 14:32, 31 August 2005 (UTC)
- While it seems like we could make a simple rule like "For short articles, use {{pedialite}} in 'external links'; for TOC articles, use {{wikipedia}} at the top", we have a problem with complex cases. What should we do when a word has more than one meaning? And especially, when those meanings have their own wikipedia articles? I just had an idea of a graphical box like the current template, but able to take multiple arguments, and create a bulleted list in the wikipedia box. Clearly, the current wikipedia template is not suited for 2 or more wikipedia entries. – Fudoreaper 14:53:15, 2005-09-01 (UTC)
- I just discovered this: Wiktionary:Style guide#Linking to Wikipedia. I didn't notice this at all before. This brings up the side point that there should probably be a master article (like wikipedia's Manual of Style) that links to all documents describing the way in which an article should be created.
- I'm not sure how much of an issue the layout is. While putting the template on a short page is not the best, because it distorts the edit links, it seems that 95% of competent editors will have no problem with this. The aesthetic argument is not so important. (though i am aware that the point of this discussion is in part, aesthetics, so perhaps i'm going crazy) My point is that to the reader, the Carl's Jr. page is coherent and readable, i think, even if the layout isn't "perfect". – Fudoreaper 14:24:52, 2005-09-01 (UTC)
- I've adjusted the articles mentioned here to match current practices. DOS seems a bit extreme to try to add the cute-looking templates for. It is still unclear to me what correction is being suggested for template:wikipedia. --Connel MacKenzie 15:27, 1 September 2005 (UTC)
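The rule of thumb floated in this thread can be written down as a small sketch. This is only an illustration of the discussion, not an agreed policy or an existing tool: the function name `choose_wikipedia_link`, the constant `TOC_MIN_HEADINGS`, and the plain bulleted list used for entries with several Wikipedia articles are all assumptions made for the example (the thread left the multi-article case open). The `[[w:...]]` link form is likewise assumed here as the text-link style.

```python
# Hypothetical sketch of the placement rule of thumb from this thread.
# Names below are illustrative only; they are not part of MediaWiki.

# Per the thread: with fewer than four headings, no TOC is auto-generated.
TOC_MIN_HEADINGS = 4


def choose_wikipedia_link(heading_count, wikipedia_articles):
    """Return (wikitext, placement) for an entry's Wikipedia link(s)."""
    if not wikipedia_articles:
        return ("", "none")
    if len(wikipedia_articles) > 1:
        # Open question in the thread; a bulleted text list is one option.
        lines = ["* [[w:%s|%s]]" % (name, name) for name in wikipedia_articles]
        return ("\n".join(lines), "external links")
    if heading_count < TOC_MIN_HEADINGS:
        # Short entry without a TOC: use the text-style template lower down.
        return ("{{pedialite}}", "external links")
    # Entry with a TOC: the graphical box at the top balances the TOC.
    return ("{{wikipedia}}", "top")
```

For example, an entry with two headings and one matching article would get `{{pedialite}}` under external links, while an entry with six headings would get `{{wikipedia}}` at the top.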
BP cleanup notes
Does anyone use the "policies in development" or the "summarized entries" links; otherwise, I'll archive them.
And what do people think about a one-month retention schedule before archiving/filing?
Cheers, --Stranger 14:36, 1 September 2005 (UTC)
- Do not archive those sections. --Connel MacKenzie 15:59, 1 September 2005 (UTC)
- Clarification; I think we should have an accepted alternate before those useful sections go away. --Connel MacKenzie 03:38, 2 September 2005 (UTC)
- I would list them in Wiktionary:Index_to_Internals. --Stranger 00:27, 5 September 2005 (UTC)
- Progress note. I finished filing previous (Jl-Sp) BP discussions on various other areas of Wiktionary, making notations in the Index where appropriate. I chickened out on anything before July, though I did a little. If folks wish me to do similar filing for previous months/years of BP discussions I can. Please let me know. Other comments/questions/suggestions are welcomed, though please be nice - my skin isn't as thick as others. Cheers, --Stranger 18:24, 7 September 2005 (UTC)
ELE-POS-phrase
There's a bit of confusion (perhaps only on my part) about when to use "phrase" as a part of speech. Are two words or more a phrase for the purposes of this POS classification? Cheers, --Stranger 00:30, 5 September 2005 (UTC)
- "Phrase" and "Proper noun" should be deleted from the list of possible parts of speech. Ncik 14:55, 5 September 2005 (UTC)
- Ncik, you are the only person on Wiktionary to express this POV so far... --Phroziac 16:42, 5 September 2005 (UTC)
- I have always interpreted "Part of speech" broadly in that context. That's why terms like "Phrase", "Idiom" or "Abbreviation" are all acceptable. The specific expression "Part of speech" does not generally appear on individual articles. If someone has a better term to broadly identify what kind of lexeme or word bit we are dealing with, maybe that would be better. Eclecticology 20:57:59, 2005-09-07 (UTC)
- Yeah, that's what I'm beginning to think. What we call POS really should be called a "third tier heading title style font" or something.
- Getting back to "what is a phrase?" though, my research says that "noun" is a word; "phrase" is two or more words - which is what I've been using. But cannot some two-word combinations be used as nouns (for example) or am I completely off base? I don't know, so I ask what is proper. I just want to do things correctly. Cheers, --Stranger 21:49, 7 September 2005 (UTC)
- In general terms you are right. One has to maintain some flexibility about these things. It's commendable to want to do things correctly, but it frequently turns out that there is no one correct way of doing things ... just different opinions about what it means to be right. Eclecticology 00:48:16, 2005-09-08 (UTC)
- As to idioms, it's better to mark the idiomatic sense {{idiom}}. For example, "put out" is a phrase, with a non-idiomatic sense ("Honey, would you put out the garbage?") and idiomatic senses like "Did they put out the fire?" and "Did she put out?" In other words, "idiom" is not a grammatical designation, while "phrase" is. -dmh 17:14, 8 September 2005 (UTC)
- It makes no difference. No need to feel "put out" about this point. Eclecticology 18:43:22, 2005-09-08 (UTC)
{{UK}} tag
With regard to the {{UK}} and {{US}} templates, is it acceptable to use them to specify country rather than language? As an example, please see NHS which means something different in each country. Kindly let me know if I am using the template wrong. Peace. --Stranger 01:25, 5 September 2005 (UTC)
- No problem there. Eclecticology 20:46:27, 2005-09-07 (UTC)
- Alas, doesn't this contradict the recent WT:ELE change? --Stranger 21:50, 7 September 2005 (UTC)
- Just because one person has made a change on a "policy" page doesn't mean that everyone has agreed to it. Eclecticology 03:18:09, 2005-09-08 (UTC)
- But if a newbie comes along and sees that page as it is now, they will assume that it is gospel, and that everyone has agreed to it. --Stranger 03:51, 8 September 2005 (UTC)
- Absolutely! You've put your finger on the problem of instruction creep. If you are an established active editor, how often are you going to be willing to scrutinize policy pages to make sure that nobody has changed them? Eclecticology 04:58:37, 2005-09-08 (UTC)
- Maybe I'm missing something. I review Recent Changes every day. When I see something like Allan changing WT:ELE or Tedius creating a bunch of appendixes, I drop him/her an e-mail and ask what's up - or bring the matter up in the WT:BP. Because I know how frustrating it can be to leave a note on a Talk page without getting an answer, I try to review every new Talk page entry and give them at least a courtesy reply. I try to look at new entries by anons to see if they make sense - and mark those that are rubbish, and recently just make "google research notes" on some of the questionable ones. And from time to time, when I see a new account was opened, I try to drop a note on their user page complimenting them on their efforts and thanking them: I would send them the {{welcome}} message, but I feel that's something a sysop should do, and it's been on my mind to ask about in the WT:BP. I've been feeling overwhelmed in the past few days knowing that Semper and Connel are not here for whatever reasons and I have to stop doing some of these tasks lest I burn out. This is why I've been advocating getting rid of things like current events - it takes time to maintain, time and resources which might be better spent someplace else. Or why I suggest RFD RFD, to help streamline that process.
- But what does this have to do with instruction creep? I don't get it. Cheers, --Stranger 07:08, 8 September 2005 (UTC)
- It has to do with instruction creep because it describes the way that instruction creep creeps in. It's not just a matter of someone coming in with a whole page full of new policies. It's an accumulation of small, perhaps even uncontroversial, changes which collectively lead us to where we might not want to go.
- To be perfectly frank I seldom look at recent changes anymore. I find enough to do without that. Feeling overwhelmed carries a serious risk of burnout, the more so when you get no response or when you feel that you have to do battle with someone that has made a questionable entry. The welcome messages are a good thing even though many of those will not get a response either. It's impossible for any one person to do everything, even when it appears that nobody else is doing anything. You need to decide on your own comfort zone and stick to it for your own sake. Eclecticology 19:07:18, 2005-09-08 (UTC)
- PS. I've deleted Current events which was becoming a portal for putting spam on current events. Now if someone with a better technical understanding than I were to remove that entry from the side bar, that would be appreciated. Eclecticology 19:21:15, 2005-09-08 (UTC)
- Are you talking to me, or Hippietrail? A broken link not just on the front page, but every Wiktionary page was so annoying...so very... Was the site hacked? No. Should I block the vandal that did this, then? :-) Fixed, as per the prior discussions on this topic. --Connel MacKenzie 18:55, 12 September 2005 (UTC)
Who's Who
I "stopped" User:Tedius_Zanarukando from entering names into an appendix list recently and asked that s/he bring the matter up here before continuing. I don't see that s/he has, so I will. Is there a separate WikiWho or do we intend this dictionary to also be a dictionary of names? Cheers, --Stranger 17:37, 7 September 2005 (UTC)
- There is fundamentally no objection to having names in Wiktionary. This has been the subject of debates in the past. The appendix lists are pretty well open-ended so I would not be critical of TZ for adding things there. Articles about names should have a little more than simply "Xxxx is a man's name", but should say something about the name. My own differences with TZ have tended to be more about including articles on characters from fantasy and video games. For these articles on anything other than the absolutely best known tend to be encyclopedic. Eclecticology 19:01:32, 2005-09-07 (UTC)
- I didn't know if "appendix" was a reserved namespace or not. So, it's more like a category - anyone can create anything they want?
- I would prefer a WikiWho but if y'all want names here, I'll abide by the consensus. One other thing, didn't someone recently work pretty hard to condense all the names into one general category instead of dividing them up by letter and gender? (Or was that something other than "names"?)
- But I don't care, I guess; it just seemed a little rash and I thought a little discussion and a little wait wouldn't irreparably damage anything. Cheers, --Stranger 22:00, 7 September 2005 (UTC)
- Connel did a lot of work to sort those lists out and consolidate them. It was a clear improvement. Something like a "WikiWho" did get mentioned, and had a bit of support ... but not enough to start a new project. Eclecticology 03:24:53, 2005-09-08 (UTC)
- If there is then, a consensus that Connel's path was the right one to follow, perhaps we should encourage Tedius to walk down that path rather than blasting through the trees to create another. Cheers, --Stranger 03:42, 8 September 2005 (UTC)
protecting ELE
Earlier today, a rather new user added some material to ELE. Some good, some incorrect. Can we put a sterner warning on the top of ELE not to change it without discussion? --Stranger 17:38, 7 September 2005 (UTC)
- My apologies for appearing thick. I tend not to use abbreviations myself so I'm having difficulties in deciphering what you mean by "ELE". Eclecticology 18:25:33, 2005-09-07 (UTC)
- It was I who changed WT:ELE. I did take soundings from several of the more senior users both before and after the change, and consensus seems to be that the ELE now more accurately reflects common practice. My intention was to avoid all contentious areas, but since you've made this comment (yours is the only one, by the way), it looks like I trod somewhere I didn't intend to. If you'd like to point out the areas you disagree with, I'll happily reverse my changes and start a discussion. Allan 18:41, 7 September 2005 (UTC)
- I'll try to avoid jargon in future. Entry Layout Explained.
- I just noticed the dated template, Allan. That was/is undergoing rather rigorous discussion at Requests for Deletion and had been nominated for deletion. Given that, I was surprised to see you had jumped into ELE and posted it as an acceptable restrictive whatever-you-called it. I was just surprised, that's all; in my glancing at it, I thought it looked good. I'll have to give it a better looking over when I have more clear-thinking time. Cheers, --Stranger 19:01, 7 September 2005 (UTC)
- I'm glad we don't have axes to grind. Just for clarity: It was Eclecticology, not I, who removed the reference to the dated template from ELE. Coincidentally, I included (Dated) as an acceptable restrictive label for one reason only: do a search for (Dated) and you'll find it in a couple of hundred articles.
- Oh, and I do agree that the ELE should contain some reference to the proper process for gaining consensus before editing it. Other users have been less circumspect than me in the past, and some have attempted to be downright prescriptive. Allan 19:24, 7 September 2005 (UTC)
- I don't think that we are even at the warning level yet, stern or otherwise. Even if I don't agree with all of Allan's changes, I will certainly assume good faith. It was only yesterday that I was finally able to dump the "dated" template and category which I have always found to be one that is incapable of being precise. I also did not think that it is always obvious that "UK" would be interpreted as "Commonwealth". There are a few other points that I would dispute, but none of them are very important. I think that Allan raises a few detailed points, like starting a definition with a capital letter, that have never been addressed. This is a matter of convention where either option could be correct. Abiding by that convention helps the overall appearance of the Wiktionary, but failing to do so would not be grounds for getting upset with people. Eclecticology 19:34:34, 2005-09-07 (UTC)
RFD RFD
In broad strokes: There are too many places to look for the status of words: Tea Room, RFD (both the automatic-list category and the manual-list discussion page), RFC (both the automatic-list category and the manual-list discussion page), List of Protos, the Proto page itself, Neos - stable, Neos - unstable. It's becoming too work-intensive to monitor each one, and to clean each one out. This might have worked in the past, but it's not working anymore. (And, not that this is new, but people are upset when their pet word gets nominated.) I propose: (a) we eliminate the Tea Room, RFC, Protos and Neos; (b) we reserve RFD for Internal Wiktionary pages only (category deletion, template deletion, etc.); (c) stipulate that all discussions about words take place on the talk page of that word; (d) put one standard notice on the main page of the word that the word is under discussion and has not yet reached a consensus that it conforms to CFI; (e) if a word is not to be listed, include a notice at the beginning of the main page that this word has failed to meet the CFI, see the talk page for more information - never delete word talk pages. This way, an advocate for a proto will always have a place to add more citations and, after a year or so, can drop a line in the BP to ask people to go to the word and review it for further discussion to ask that the non-CFI notice be removed. PEACE. --Stranger 18:18, 7 September 2005 (UTC)
- Changing neologisms to a category might satisfy one of your concerns. As I understand it, protologisms are usually only suggested by a single user. There's no risk of protologisms reappearing independently, and they're not words that have to be hunted down. Protologisms should stay on a separate page to reinforce the idea that this is a dictionary of language as it is used, and that the ability to edit the dictionary does not equate to editing the language. Even with a warning banner placed on proto pages, new users would get the encourageable idea that these are allowed.
- I like the idea of never deleting a word's talk page if it would avoid a problem in having to monitor deleted words. If there's no other technical solution, this might be accomplished by a redirect to a special "word not found" page (unless that would make words more difficult to monitor). And I do think it makes sense to talk about words on their individual talk pages, which is furthermore the most accessible place for new users. However, with this as a norm there would have to be a list of recently updated talk pages to make sure comments don't go unnoticed. I also find the tea room to be a great community site, and if nothing else it should be kept as an index to talk pages, either with brief comments on the subject of the discussion or at least a date and classifications such as "pronunciation", "part of speech", "criteria for inclusion", etc. Perhaps this function could be rolled into the automatic list that keeps discussion from falling through the cracks, or the automatic list could omit words that are listed in the tea room. Davilla 02:32, 8 September 2005 (UTC)
- Fast comments made while I'm tired, so take that into account if I don't make much sense: (a) I want one place to review words and make comments. The system now, I think, is broken. Did anyone see my comments on rotary-dial or crank-telephones when we were discussing pulse and touch-tone phones? The RFD/RFC notice itself says to see the word's talk page, yet it doesn't appear people do that after the word is listed on the RFD-page/RFC-page. (b) One motivation for this change is that I wasn't even aware there were comments on a RFC list (similar to that on RFD) until today. Is this just because I'm a newbie? (c) If people see a word listed with a nasty banner that it doesn't meet the CFI, I think people will want to learn more about CFI and will look at the talk page to see what the discussion was all about and actually learn that they can't just add any tosh that comes to their mind. I think it will provide a learning experience for them rather than an excuse for them to add crappy words. (d) If we can design the Tea Room to be a list of all topics talked about on talk pages that day (which AUTOMATICALLY REFRESHES every day - thus doesn't need maintenance) then that would be great. Cheers, --Stranger 03:00, 8 September 2005 (UTC)
- I offer as "proof" of this (c), the following comment posted by an anon on our favourite phrase You're the man now, dog: Keep I was curious, this was informative. The sites are very strange to say the least.
- I say we're missing more comments like this because the discussion is not taking place conveniently on the word's talk page. WAY past bedtime, now. Later, --Stranger 07:37, 8 September 2005 (UTC)
- My only objection to your more recent points is with (c), and I only object to (c) in the case of protologisms. YTMND is not a protologism, so a banner would be appropriate. (However, others might wish a non-entry to be limited to a talk page.) Given the narrowness of my dissension, I should have said that the whole looks like it could be a great idea, provided (d). Davilla 16:38, 10 September 2005 (UTC)
- What about changing the Appendix:List of protologisms page to a category? That would avoid the need for that manual list. The description of the proto can be put on the talk page. So, the only thing on the main page would be a note that "this has been classified as a proto and therefore fails to meet the CFI - see talk page for (if any) discussion" - or something like that. Would this not be similar to neos? Would this satisfy?
- Similarly, what about changing the Tea Room to a category. If you think a word needs more discussion, just add the flag {tea} to it and list your concerns on the word's talk page. The word would then automatically show up in the Tea Room. After the discussion was concluded, the {tea} flag could be removed, and the word would no longer appear in the Tea Room. Cheers, --Stranger 18:42, 11 September 2005 (UTC)
- I mentioned protologisms in my first response specifically because I believe these to be an exception. The rules for inclusion, if they are not themselves arbitrary, can sometimes be subjective. But in my mind there is a fairly strong distinction between (i) "weakly attested" words and (ii) those that are admittedly invented and not at all in use. In the first group, a word that is not yet official may appear on a number of webpages as slang but not in print, or may arguably be technical jargon or a minor brand. They may be appropriate for a reference since they could be run across in some context. In the second group, however, a protologism is understood nowhere and has to be explained in every context in which it is used. They have no more place in Wiktionary than does original research in Wikipedia. Thus they are listed under exclusions in the CFI along with vandalism and encyclopedic entries such as genealogies. There's no reason to advertise talk-page protologisms with a banner. If neologisms are pushed to talk pages, I would vote for protologisms to remain outcast as a list.
- Categories solve some but not all concerns. The above personal objection aside, a {{proto}} warning template would solve the mechanical problems for protologisms much as the suggested (tag and) category for neologisms. A Tea Room category and its subcategories would be an interesting variation of the list I had first suggested, although it would lack date tags for the occasionally necessary spring cleaning. More importantly, it would not satisfy (d) in catching conversation that would otherwise fall through the cracks. I see that as both the barrier and catapult for this proposal. Without an automatic list, even regulars are isolated, but with it the learning curve in entering the wiktionary community is immediately reduced. Davilla 16:35, 15 September 2005 (UTC)
dmh's response
- This seems like a reasonable proposal. I haven't thought it through well enough to wholeheartedly endorse it, but it's a constructive proposal, and we certainly need that.
- As one who probably raises a good proportion of the objections on RFD, let me make it very clear that I don't have any "pet words". I can't think of many words that I've personally contributed that have been RFD'd. Most of the words I've advocated for have been entered by others. I've spoken up for them simply because they're clearly in use and idiomatic — that is, they clearly meet CFI. Further, it's generally easy to confirm this.
- In the past, the conversation was roughly:
- I haven't heard of this.
- Really? I have and it gets X number of google hits as well.
- Oh. Never mind.
- This worked fairly well. Lately, this conversation has shifted:
- This word is marketing crap/illiterate drivel/nonsense/your pejorative here. It has no place in a respectable dictionary.
- Really? I've heard of it, and it gets X number of google hits as well.
- followed by some dismal variation of one of the following
- Google hits don't count. Only print counts.
- Really? You know, there's all kinds of utter crap in print. Look, here comes some now.
- — or —
- I don't see any citations on the page.
- Just do the search. It's not hard.
- No. I'm going to delete it instead.
- and so forth. The problem with all this was twofold:
- There was an attempt to unilaterally impose a "show sources on the page or it's out" policy without prior discussion, an attempt which has, sadly, resulted in something of a pyrrhic victory. Several words have in fact been deleted, some more than once, with no noticeable improvement in the overall level of documentation of new or existing entries.
- This policy was applied arbitrarily. The RFD page tells the story. Further, since the story kept changing depending on which logical flaw was last pointed out, there was inherently no way to enforce such a policy other than arbitrarily.
- All this is to try to make it painfully clear that this is not some sort of pissing match over some particular set of words. When I started working on Wiktionary, it was an enjoyable place to contribute what I knew and watch the rest of the world correct and improve it. We've lost quite a bit of that, and for no good reason. Taking a step back from the current spate of deletions would do a lot to recover what made Wiktionary such a nice and interesting place to be. -dmh 21:30, 7 September 2005 (UTC)
Eclecticology's response
- There are some useful ideas in Stranger's proposals even if I don't agree with all of them.
- The Tea Room has a role similar to that of the Beer Parlour, except that it is more directed to dealing with specific words and newbie questions about words, and separates these from the broader questions that are found on this page. Questions there tend to be a lot less contentious. Like this page, it needs maintenance though it is likely to be easier there than here. If it is worthwhile it can be moved to the word's talk page. Many newbie questions that don't really give rise to any new information can simply be deleted. It's just that somebody has to do it.
- The two Neologism pages are a throwback to the early debates about protologisms. I don't think that they have seen much activity lately, and their role should be reviewed.
- It will surprise no-one if I say that I am committed to the present format of the RfD. For me everything on there is something about which a decision MUST be taken, with a priority for those items that have drifted to the top. I admit that I have not yet looked into the rotary-dial telephone issue at all. This should not be taken as an expression of any POV on the article, but I know that if it manages to drift to the top of the list I will have to deal with it. Many of the items listed are no-brainers that can be put through in a routine fashion. In others a near consensus will develop without much difficulty. Even so, there remains a goodly number that require careful consideration that takes into account lexical issues, policy insights, and the personalities involved. Being decisive then means that some people will not be pleased with any given decision. That's the way it goes. I am acutely aware of the perpetual VfD crisis on Wikipedia, and have been since before Wiktionary started.
- The problem with RfC is that it is one giant carpet for sweeping dirt under. Things are put on that list and ignored. In several cases the tag has been added, but I have no idea what that person wants fixed. Putting these deletion discussions on the word's talk page does no better. Nobody notices, and if nobody notices nothing gets done.
- We could probably do more to save some talk pages of deleted words, or even to save the relevant RfD discussion. These commentaries should probably be factored down first. The best place to do this remains an open question.
- Eclecticology 04:49:22, 2005-09-08 (UTC)
- I have to make a clarification on the touch-tone phone business. The RFD template says to see the Request for Deletion page for notes on this word. The RFC template says: "Please consult the Talk: page for this article for the reasoning behind this request for cleanup." I made the mistake of thinking that the RFD template also said "see the talk page".
- After touch-tone was listed for RFD but before it made it to the Request for Deletion page, I made a note on the talk page regarding the deletion request wherein I made reference to rotary-dial phones. No one mentioned rotary-dial phones in the subsequent discussion on the Request for Deletion page which makes me think that no one even bothered to look at the corresponding Talk page for touch-tone.
- And that's what makes things confusing/frustrating - is when information about a word is spread over several different areas of Wiktionary (like the Request for Cleanup/Deletion page and the article's Talk page - and maybe the Tea Room and List of Protos). Cheers, --Stranger 18:57, 8 September 2005 (UTC)
- I'll keep this in mind when I get around to investigating that item. Remember that the Tea Room is primarily for polite or newbie type questions. Probably nothing there should stay for more than a month. Eclecticology 19:36:19, 2005-09-08 (UTC)
- Just because a discussion is in the Beer Parlour instead of the Tea Room, that's no reason why we shouldn't be polite. --Stranger 18:45, 11 September 2005 (UTC)
- The problem I see with the current method, and a purely hypothetical one, is that it isn't scalable. A single page for deletions makes it easier for one devoted wikipedian such as yourself, a committed contributor acting as the chief editor, to keep tabs on progress. However, there will come a day when that is no longer possible.
- I also believe the idea as put forward is critical in shifting from an archiving scheme, such as the one you propose for RFD and as exemplified by pretty much any discussion area on this site, to an indexing scheme. Davilla 17:08, 10 September 2005 (UTC)
- I agree in both respects. For the centralization the challenge will be how to subdivide the work, and distribute the workload while maintaining a reasonable level of openness that allows anyone to see what is going on.
- On the second point, my habit has been to remove sections one at a time even though it could often be more convenient to remove entire sections together. This ensures that each removal is noted in the summary line. Indexing and refactoring would be nice, but they require an awful lot more work than simply reading the discussion before taking action. Eclecticology 08:34:04, 2005-09-11 (UTC)
- Subdivision of work? You take A-M and I'll take N-Z. Or, I'll take the contributions by even numbered IP addresses and you take the odds. No, wait. I'll take A-D, you take E-Z. :-) Cheers, --Stranger 19:14, 11 September 2005 (UTC)
Dmh's response to Eclecticology's response
I don't particularly mind the current format of RFD. I do mind the current process, and I especially mind the lack of discussion of it.
Personally, I can't see anything, other than obvious vandalism, on RFD that MUST be dealt with anytime soon. Someone enters webinar. Fine. Someone else doesn't like it. Fine. Where's the crisis? Not sure it meets CFI? Note that and move on. No crisis.
This may put a lot of stuff under RFC, but so what? There really is a lot of cleanup to be done on Wiktionary, and putting things on RFC makes that clear and gives people a chance to go after it actively, instead of just stumbling on something broken and fixing (which is, of course, also a vital part of the process).
- Perhaps, then, let's just get rid of RFC as a page and leave it as an automatic category. If there are notes on what someone wants to cleanup, they can be left on the talk page. --Stranger 19:14, 11 September 2005 (UTC)
I'm also a bit leery of the analogy with Wikipedia. First, the Wikipedia community and its edit rate are at least an order of magnitude larger than Wiktionary's. Such a quantitative difference becomes qualitative. Second, Wiktionary is a dictionary, recording usage, not an encyclopedia, which tries to give in-depth information.
If I make reference to the infamous photon belt in a Wikipedia article, I should expect someone to dispute that it exists. If I make reference to the term photon belt in Wiktionary, I'm on much safer ground. It clearly exists and has the (fanciful) meaning given. If I point to some random web page as evidence that the photon belt exists, I should expect derision. If I point to some random web page as evidence that the term "photon belt" exists, I should expect little trouble, as it is generally unlikely that someone put up that page just to promote the idea that the term "photon belt" is in use (and exceptions to this principle are generally quite clear).
In other words, the kind of evidence needed to prove usage of a term (which is what we care about here), is qualitatively different from the kind of evidence needed to support an assertion in Wikipedia. For similar reasons, the question of whether a term should appear in Wiktionary is qualitatively different from the question of whether an article should appear in Wikipedia.
In any case, let's actually have the discussion of what process we should follow with non-vandalistic entries and RFD. It ought to prove more fruitful than the current battles, and might well be more enjoyable as well, amusing though the latest escapades have been. -dmh 05:32, 8 September 2005 (UTC)
- There's a big difference between the entries for "webinar" and "photon belt". The one that was evidenced was kept; the one that you invented was deleted. Eclecticology 17:59:15, 2005-09-08 (UTC)
- What does that even mean, "the one you invented"? I certainly didn't invent "webinar", either the word or the activity. -dmh 03:43, 9 September 2005 (UTC)
- Just for full disclosure, I created a talk page for webinar just to hold my research notes. Maybe that's all we can agree on at the moment: is to have talk pages without articles. --Stranger 18:25, 8 September 2005 (UTC)
- I'll keep in mind that we need some kind of procedure for handling these. Eclecticology 19:36:19, 2005-09-08 (UTC)
some examples
Some example(s) I've stumbled across:
- Talk:coug_it - I think what Daniel's doing here is great. He's trying his best. If memory serves, wasn't coug it on the deletion list before and get removed? If we had saved those discussions, we wouldn't have to rely on memory if a word was re-added later.
- webcest - Now this is a word which I know has been readded after we decided to delete it. I think that having a history on a talk page about how many times this has been added and deleted would be of assistance in determining if the word is gaining more public acceptance.
That's it for now. --Stranger 16:25, 8 September 2005 (UTC)
- This is another one where it would be great to track the early progress of the term in case it gains wide acceptance — which, from the evidence, it might actually do. Has it been a year since the first mention? If so, it's clearly in. For my money, this is one of the few areas where Wiktionary can outdo other dictionaries. -dmh 14:59, 10 September 2005 (UTC)
- I find no record that "coug it" was ever deleted, but it is altogether possible that it would have eventually been so nominated. It is quite likely that it would also have been questioned as a localism of a type that is rampant in university residences. Daniel has done the right thing in seeking documentation for his contribution. I would have preferred having the key quotes on the actual page for the word rather than the talk page since it contributes to the documentation of the term for the benefit of the general public, but this deficiency is not fatal.
- IIRC the deletion of "webcest" went quickly, and there was not much of a discussion, although I did make the usual point of keeping it on the RfD page for at least a week after its deletion. I don't think that it's necessary to keep all deletion discussions. The short undisputed ones in particular are not worth keeping. We just need to agree on the details. Eclecticology 18:33:31, 2005-09-08 (UTC)
- Thanks, Ec. Yeah, my memory's bad. I thought "coug it" was deleted and that Daniel was collecting evidence so that it wouldn't be deleted again; I agree that a couple (not all citations) should be on the main page.
- My point, I guess, with webcest was that it has been entered a few times by several different people. That in itself is worth cataloguing, I think. And how different or similar their definitions are to each other would also be a helpful gauge for us in trying to determine if the word is proto or not. Cheers, --Stranger 18:45, 8 September 2005 (UTC)
- Oh, and, I don't think anyone disputes that absolute rubbish ("I wuz here!") should just be deleted. --Stranger 18:46, 8 September 2005 (UTC)
- libnut - Ec, this is the word I was thinking of (instead of coug it). Cheers, --Stranger 16:13, 9 September 2005 (UTC)
- The big change between what was there before and the reincarnation is that it now has references. That makes all the difference. The major argument is not about whether any particular word is valid, but about the academic rigour of citing sources, and who has that responsibility. Eclecticology
- So let's have that argument first, and then change our practices if it appears necessary, OK? -dmh 14:51, 10 September 2005 (UTC)
same page
There's, like, two conversations going on here. dmh is discussing more the philosophy of what gets deleted and how; I'm more interested in procedure and streamlining things regardless of that philosophy; and Ec, bless him, is nobly trying to wrap his mind around both aspects. Just to make sure y'all can follow things that are going on here a little bit better. PEACE. --Stranger 01:11, 10 September 2005 (UTC)
- That's an interesting way of putting it. :-) Eclecticology 05:57:11, 2005-09-10 (UTC)
- It's not a philosophical point. It's a practical point. Philosophically, I'm pretty sure we're both very much in favor of better documentation. But as a practical matter, "show citations when challenged or I gun it" is a non-starter.
- I believe we've already agreed that the vast majority of entries, old and new, come in with no documentation whatsoever. Hang on a sec while I spot check that and reinstate webinar if it's been deleted again. Let's see ... traduzcáis: nope. translating dictionary: Content was rubbish, someone had RFD'd it, but why jump through hoops? Just fix it (done). waschen: nope. By the way, it's nice to see so many non-English entries coming in. washcloth:Nope. Definition was "flannel". I added the more general definition of flannel to that page and moved the definition "a cloth used to wash the face and body" here. We may want to re-arrange a bit more to keep both sides of the pond happy. conglomerate: Nope. Reformatted, added verb sense. More to be done here. webinar: nope, but hey, it's still there!
- The purpose of that last was twofold. First to re-emphasize that entries overwhelmingly start life with no documentation whatsoever, and second to illustrate how, IMHO, Wiktionary should work. You find something broken, you fix it. You don't RFD it, and you try not to even RFC it, unless there's not a clear, direct way to proceed.
- Before I continue on the practicalities of "show citations", let me point out what I should think would be obvious to anyone working on a wiki: that having entries come in undocumented and even malformatted is a Good Thing per se. Wikis depend, vitally, on lowering the bar for entering and editing data. The whole idea is that if people can "just do it" without administrative hindrance, they will. Sometimes this results in garbage, but most often it doesn't. People want to help. Let them. OK, enough philosophy for now, though more could be said.
- Given that entries tend to come in undocumented (and as a near-corollary that the vast majority of existing articles are undocumented), it's clear that there is no way that "show cites or be deleted" could be applied uniformly in Wiktionary as we know it. So, how do we go about reducing the number of articles facing deletion (if only to spare our poor sysops the burden of tracking them all :-)? There are currently two proposals on the table, I believe:
- (1) Require citations to be produced for any term which is placed on RFD, whatever the reason.
- (2) Require citations to be produced for any term for which the usual search indicates that the CFI are not met.
- As a practical matter, (1) shows the wisdom of the instruction on PDG that "[I]t is usually better to give the benefit of the doubt to keeping the article. This has a positive effect on the overall communal mental health." (who wrote that, anyway? :-). People nominate articles for all sorts of reasons, some better than others. Some among us would like to actively encourage people to nominate words they don't find familiar. Not what I'd do if I ran across an unfamiliar word in any other dictionary on the planet, but never mind. The point is that (1) invests essentially arbitrary challenges with far greater force than they deserve. Deletion is a pretty big hammer. It's particularly ironic that this is done in the name of rigor. And, not to put too fine a point on it, this already inherently arbitrary rule has not been applied with complete rigor in practice, even if you discount challenges tainted by my involvement.
- On the other hand, (2) reflects common practice over the last year or so (with a couple of notable exceptions). Essentially, we require a prospective challenger to do a minimal amount of investigation before claiming that a word does not meet the CFI and so should be deleted. Though this has been repeatedly misinterpreted as requiring proof of a negative, it isn't. All you have to show is that you did a minimal search and explain why it indicates that the CFI were not met. This is generally quite easy.
- I should also point out that blatant vandalism is shot on sight no matter what. We trust the sysops to know vandalism when they see it.
- There are a couple of points that need to be fleshed out to make (2) work, but since it's based in existing practice, we have a pretty good handle on them:
- What is the "usual search"? For most folks, this means existing dictionaries, maybe BNC, google print, followed by google groups and web if those turn up nothing, but since anything in existing dictionaries is going to turn up on google, in practice it means something more like: Try google web first. If there isn't much there, try google groups and print, and maybe other stuff like BNC. I acknowledge there is some dispute over whether google web in particular is a reliable source, but that's a separate argument from whether there should be some sort of reasonably standard search required before an RFD challenge is considered to have merit.
- What to do with articles that need more support. First, the presumption should be that documentation is nice-to-have in almost all cases, but much more meaningful for a term which has survived a serious challenge. In this case, it would be best for the person who turned up the off-the-beaten-path documentation required to add it to the page itself (or the talk page, or .../citations), but this is not a dire necessity.
- In short, we know that (2) will work fine because we've already tried it. From experience with (1), it appears to increase the heat/light ratio significantly without actually improving the quality of entries noticeably. To the contrary, if anything. -dmh 04:12, 11 September 2005 (UTC)
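The tiered "usual search" described in this post can be sketched roughly as below. This is only an illustration under stated assumptions: the source names, the `min_hits` threshold, and the hit-counting callables are hypothetical stand-ins, and the counts in the demo data are made up. Nothing here reflects an actual Wiktionary tool or an agreed threshold.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in: a source is any callable mapping a term to a hit count.
HitCounter = Callable[[str], int]


def usual_search(term: str,
                 tiers: List[Dict[str, HitCounter]],
                 min_hits: int = 3) -> Tuple[bool, Dict[str, int]]:
    """Consult each tier of sources in order and stop at the first tier
    where at least one source reports `min_hits` or more hits."""
    seen: Dict[str, int] = {}
    for tier in tiers:
        for name, count_hits in tier.items():
            seen[name] = count_hits(term)
        if any(seen[name] >= min_hits for name in tier):
            return True, seen   # enough evidence; no need to dig further
    return False, seen          # counts a challenger could quote on RFD


# Made-up counts for the demo only.
_web = {"webinar": 120}
_groups = {"webinar": 4, "libnut": 5}
demo_tiers: List[Dict[str, HitCounter]] = [
    {"web": lambda t: _web.get(t, 0)},
    {"groups": lambda t: _groups.get(t, 0), "print": lambda t: 0},
]
```

With the demo data, a term found on the web is settled in the first tier, while a term missing there falls through to the groups/print tier; a term found nowhere returns `False` together with the counts that were checked.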
Name change?
Is there any openness here to considering a name change for this page? If not fine, but I imagine there may be others too who would prefer a more "sober" name (for different reasons perhaps). Brettz9 20:08, 9 September 2005 (UTC)
- Is there any similar forum on the other Wiktionaries? What do they call it? Is there something proposed for the Unified Wiktionary? Is the "sober" concern more an American concern? The problem is that various conversations merely refer to the BP as a place to hammer things out, so I think keeping the initials would be important. "Big Pub" perhaps? --Stranger 19:28, 10 September 2005 (UTC)
- There are certainly some naming issues to be worked out for a unified Wiktionary :-) One big question, once it is easier for people to see what is being said in all languages, is how to allow people to filter which comments they see... right now, there is no "edit-level language tag". I think the closest approximation to the current set of BPs would be a separate page for each language. I'm thinking particularly about how to handle this problem. +sj + 04:02, 11 September 2005 (UTC)
My preference is to keep the name "Beer Parlor" as it very accurately describes the character of brawls one can expect here.
To those of a 12-step inclination, I would say their objections are misplaced. To quote from http://www.aa.org/bigbookonline/en_BigBook_chapt7.pdf starting at the bottom of page 100:
- ...People have said ... we mustn't think or be reminded about alcohol at all. Our experience shows that this is not necessarily so.
- We meet these conditions every day. ... His only chance for sobriety would be some place like the Greenland Ice Cap, and even there an Eskimo might turn up with a bottle of scotch and ruin everything! ...
- In our belief any scheme ... which proposes to shield the sick man from temptation is doomed to failure. ... These attempts to do the impossible have always failed.
I for one, would also object to a name change on that basis. --Connel MacKenzie [+] (contribs) 23:16, 15 September 2005 (UTC)
- For those who abstain or are recovering alcoholics, non-alcoholic beers and soft drinks are available. I think the UK spelling of "parlour" is appropriate as this is a quaint term. I'm not sure that US pubs, er, bars have such things as parlours or snugs. — Paul G 12:38, 2 November 2005 (UTC)
- Bars and clubs (nightclubs) often do have back rooms and private rooms. Some have nice little alcoves to hide in too (make-out spots.) Private rooms/booths take on a quite different meaning in strip clubs however. (I assume that much is the same, whichever side of the pond.) But you are correct; parlors are generally limited to mansions in Southern States. I'll try to remember to spell it as the Beer Parlour from now on. --Connel MacKenzie T + C # 22:31, 5 November 2005 (UTC)
Protologisms/Idea wiki
Hello... Per Eclecticology's suggestion, I am raising this discussion here about the advisability of making reference on the protologism page(s) (e.g., Appendix:List of protologisms) to http://allyourideas.com a website recently set up (under the GNU license) to house (practical) ideas and idea snippets. The site could conceivably host protologisms, and may be of interest and relevance to Wiktionary for at least two reasons:
- Some people may feel unduly constrained about not being able to add entire pages for a protologism (also having its own page could allow categories to be added to individual protologisms). As the site allows more general ideas as well as new words (unlike Wiktionary), protologisms could easily become extended and elaborated into additional pages as well, if the need arose.
- Wiktionary may wish to be more clear in steering away from original research.
Brettz9 20:08, 9 September 2005 (UTC)
- I don't want Wiktionary to become a list of links to other, outside websites. Google does that. That's my concern. --Stranger 03:48, 12 September 2005 (UTC)
Questions about Criteria for Inclusion
I'm afraid that I'm new to Wiktionary, though I've made contributions to Wikipedia regularly. I had several questions about the Wiktionary:Criteria for inclusion; I imagine these are known to the community, but they aren't explicitly addressed on the page. First, are scientific and technical jargon terms appropriate? For example, I started the w:NL (complexity) article on Wikipedia, about a certain complexity class. Would it also be appropriate to add this definition to the NL article here, with a link to the Wikipedia article? Also, I notice that nonword characters such as +, *, etc. have both attestation and idiomaticity; are there any past discussions that I should know about before starting these articles? Thanks! -- Creidieki 22:52, 11 September 2005 (UTC)
- Thanks for your question! Way over my head, I'm afraid. If no one answers you here, try asking SemperBlotto. He's a smart one. Cheers, --Stranger 03:43, 12 September 2005 (UTC)
- Hi Creidieki,
- First of all: Welcome to Wiktionary! As far as I'm concerned anything that is a word or an expression which you can give a definition for, translate or describe lexicologically is welcome. Indeed we even have entries for symbols and punctuation. Jargon, technical and otherwise, certainly has its place here. Where we draw the line is invented words and some artificial languages. And of course vandalism, spam and plain silliness get deleted on sight, but that wasn't really your question.
- I would say: go for it. Create a few entries and see how the other contributors react. At first you might get a few remarks or pointers how you can do it better, but you'll soon get the hang of it. Polyglot 07:41, 12 September 2005 (UTC)
- What you need to keep in mind is that users of this project are not mathematicians. I looked at the Wikipedia article and I think that it was lacking any sort of statement about what "NL" is. I had a "What the hell are these guys talking about?" kind of reaction. Technical jargon needs to be defined just as much as anything else. The problem with symbols is a bit different. A few can't be used because the software can read them as instructions. A much larger selection are useless because few people know how to enter them in the search box. It could be nice to know what ∮ means but how is the average user going to find it? Eclecticology 23:58:08, 2005-09-12 (UTC)
dict
Hi. Is there a way to read entries via the dict protocol (that dictd, gnome-dictionary and others use)?
- Currently, no. However, dict support is (last I checked) one of the features planned for the so-called Ultimate Wiktionary. —Muke Tever 00:24, 16 September 2005 (UTC)
Perhaps this? w:Wikipedia:Tools#Extensions/Plugins --Connel MacKenzie 07:01, 23 September 2005 (UTC)
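For readers unfamiliar with it, the dict protocol (RFC 2229, normally served on TCP port 2628) is a simple line-based exchange: the client sends a `DEFINE` command and the server answers with status lines and dot-terminated text blocks. The sketch below only builds a request and parses a canned response; it makes no network connection, and the database name `demo` and the sample reply are invented for illustration. As noted above, Wiktionary itself did not offer such a service at the time.

```python
from typing import Iterable, List, Optional


def define_command(word: str, database: str = "!") -> bytes:
    """Build a DEFINE request; "!" asks for the first database with a match.
    Assumes `word` contains no double quotes."""
    return f'DEFINE {database} "{word}"\r\n'.encode("utf-8")


def parse_definitions(lines: Iterable[str]) -> List[str]:
    """Collect the text block that follows each 151 status line.
    A block ends at a line holding a single "."."""
    definitions: List[str] = []
    current: Optional[List[str]] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if current is None:
            if line.startswith("151 "):
                current = []          # start of one definition's text
        elif line == ".":
            definitions.append("\n".join(current))
            current = None
        else:
            # A text line starting with "." arrives doubled ("dot-stuffed").
            current.append(line[1:] if line.startswith("..") else line)
    return definitions


# Canned reply in the shape RFC 2229 describes; content is invented.
sample_reply = [
    "150 1 definitions retrieved",
    '151 "parlour" demo "Demo dictionary"',
    "parlour",
    "  a sitting room",
    "...and related senses",
    ".",
    "250 ok",
]
```

A real client would open a socket to a dict server, read the greeting banner, send the bytes from `define_command`, and feed the reply lines to `parse_definitions`.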
Babel
WF and User:Krun have got some Babel stuff going, the top of which is Category:User languages. It's taken from w:Category:User languages, see also Wiktionary:Babel, and Template:babels, Template:babels2
Wikiversity Vote
Voting has started for a new Wikimedia sister project proposal called Wikiversity. This is a request for anybody that is interested to cast a vote either in support of or in opposition to this new project proposal. The results of this vote will determine if this project will be started on its own separate group of wikis as a Wikimedia sister project, together with approval from the Wikimedia Foundation Board. Discussion about this proposal should take place on the Wikiversity discussion page.
Complicated
Is there a simplified version of Wiktionary, where I don't have to know if a word is a verb or not, what an infinitive means, understand IPA etc? --Commander Keane 16:14, 16 September 2005 (UTC)
- Write what you know, and let others add to it. You don't have to produce an enormous, full-fledged page all at once or all alone. In any case, verb and infinitive are probably things you might want to learn about if you plan on writing for a dictionary. —Muke Tever 18:35, 16 September 2005 (UTC)
- It's hard to help in a meaningful way if you don't know some basic concepts like what nouns, verbs, infinitives etc. are. You don't need to know about IPA (phonetic alphabet to help people know how to pronounce words).
- A verb is a do-word, a word that expresses an action or a state of being.
- An infinitive is the ground form of a verb: to do, to make, to be, to have
- A conjugated verb can be one of the following: I have, we are, he makes, she did and many more. Some of them coincide with how their infinitives are written. This is what makes English seem so easy at first sight.
- A noun is a word that represents an object or an idea/concept.
- An adjective is a word that says something about a noun.
- An adverb is a word that says something about an adjective, a verb or another adverb. It makes it more specific or says something about how a verb action is.
- A number is a counting word.
- A personal pronoun is one of I, you, he, she, it, we, they, one
- A possessive pronoun is one of my, your, his, her, its, our, their, one's (not entirely sure about this last one)
- There are some more pronouns but with this basic knowledge you will get pretty far already.
- Some words have more than one function. play is a noun (performance) and a verb (perform a pleasurable action, sometimes together with others). Sorry that's not a great definition.
- Anyway, all these things you can look them up on Wikipedia and other resources across the internet. As long as you're willing to learn, you'll get by.
- To me it's the learning experience that makes it worthwhile to contribute over here. Before I came here I didn't know how to read IPA. Now I invested some time into learning about it and I can mostly read it now. There are many more examples, but I have to avoid turning this into my memoirs... :-) Polyglot 19:26, 16 September 2005 (UTC)
- Don't get turned off by TMI! You don't have to know how to read IPA pronunciations. You don't really have to know what an infinitive is. To write the simplest entry, all you have to know is how to tell the difference between verbs, nouns, and adjectives. A noun is something you can touch, kind of, like an egg or a toe, but also the sky or an idea. An adjective is something that describes it: hard, sore, blue, dumb. A verb is an action word: get, lie, quit, agree. And if you're not sure about a word, you can always ask here first. Davilla 18:38, 27 September 2005 (UTC)
- I should have made my question clearer. I'm not worried about contributing to Wiktionary, I'm worried about reading it. English is my only language, and I speak at a relatively good level, but I don't want to have to trawl through all the IPA and infinitive crap just to find the meaning of a word. I think Wiktionary's format is too complicated. Is there a simplified version? --Commander Keane 05:51, 5 October 2005 (UTC)
Rank
Greetings,
I am following a suggestion on migrating the Project Gutenberg frequency count rankings to individual pages. I will soon rank all words in Wiktionary that have ==English== and I would like to indicate that rank on individual pages. How that should appear is of question. Before automating this, I am starting to experiment with several different formats on the following pages:
- the - vanilla third level heading, placed towards top of entry. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- of - bland description of rank, towards bottom of entry. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- and - table showing prev & next, at top of page. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- to - table showing prev 2 & next 2 below TOC. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- a - table showing prev 4 & next 4 towards top of entry. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- in - sidebar showing prev 5 & next 5 on right of TOC. --Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- that - sidebar showing only three prev and next. --Connel MacKenzie [+] (contribs) 19:20, 16 September 2005 (UTC)
- I
- was
- he
- his
- with
- is
- it
- for
- as
- had
- you
- not
- be
I would appreciate feedback here, particularly on aesthetics.
I have not made templates of the various tables formats yet, but I plan to when there is constructive feedback.
PLEASE DO NOT CHANGE THE RANKINGS ABOVE. If you have better ideas on how it should be done, please show what you mean by formatting one of the unformatted ones (and note it on this page.)
When technical issues are addressed, we'll probably be left with only one or two formats; if two we can probably vote on them then. Please don't vote yet, as I'm confident others may come up with additional (better) layout ideas.
--Connel MacKenzie [+] (contribs) 19:16, 16 September 2005 (UTC)
- I think that for positioning the line immediately after the "English" heading would be the best. It is clearly visible, and tells the reader that it is about the English word only.
- Yes, I think the example a demonstrated that the best. --Connel MacKenzie [+] (contribs) 07:02, 17 September 2005 (UTC)
- I don't think that showing the prior and next word on the list does much for us. A link to the list might be better for those who are interested in that sort of thing.
- If I'm going to automate this, it only makes sense to add obvious features like a couple previous and a couple next. It is very difficult to guess how much of a good or bad thing this concept is, until it is seen in practice. Why is it a pleasure to flip through a paper dictionary? Sometimes it is the related words, other times it is the completely un-related words on the same page. 'Bot-added content is just as easily bot-removed. --Connel MacKenzie [+] (contribs) 07:02, 17 September 2005 (UTC)
- Fair enough. I only saw it as not particularly useful. It's not the kind of issue where I would keep up an argument. Eclecticology 19:04, 17 September 2005 (UTC)
- Would it be difficult to add the actual frequency? Thus: "The word occurred n times per million words." ... where n = (total occurrences of word x 1,000,000)/total words in sample. Eclecticology 03:42, 17 September 2005 (UTC)
- Since I did not calculate that for each term on this last run, yes, that would be hard. The next time I run it, I could do that...I think. But to limit it to English words only is still a problem. That is, I'd need to calculate the total number of English words in the sample, not the total number of words. OK. I'll see about adding that when I finish processing 03-September's XML dump. --Connel MacKenzie [+] (contribs) 07:02, 17 September 2005 (UTC)
- the 52555.71, of 31545.15, and 27971.81, to 24370.46, a 18431.97, in 15524.04, that 10269.95, I 10009.75. --Connel MacKenzie [+] (contribs) 07:02, 17 September 2005 (UTC)
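The per-million figure proposed above, n = (total occurrences of word x 1,000,000) / total words in sample, can be sketched as follows. This is only an illustration of the arithmetic: the function name and the counts used are made up for the example and are not taken from the actual Project Gutenberg run.

```python
def per_million(occurrences: int, total_words: int) -> float:
    """Occurrences of one word per million words of the sample."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return occurrences * 1_000_000 / total_words


# Hypothetical counts, chosen only to show the calculation.
sample_total = 2_000_000
print(per_million(105_111, sample_total))  # 52555.5
```

To restrict the figure to English, `total_words` would have to be the number of English words in the sample rather than the number of all words, which is the difficulty raised in the reply above.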
- Looks good, though I think that I would round things off to the nearest integer. What would these numbers look like at the 100,000 end of the scale? I can appreciate the difficulties that come with limiting this to English words, but like dealing with the Gutenberg legalese in each file, these refinements will come with time. At this stage being able to deal with the fundamental technical issues is the obvious priority. I don't want to discourage your efforts with a whole lot of impossible demands. Eclecticology 19:04, 17 September 2005 (UTC)
- #99,985 threaten 6.64. It will be interesting to see where the bell curve peaks...my guess is that it will be less than 100. Whoops, that ranking included other languages. The frequency is going to be much lower than that at 100,000 words...two decimal places may not be enough at 200,000...certainly won't be enough at 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000 or even at 7,000,000. --Connel MacKenzie [+] (contribs) 07:25, 18 September 2005 (UTC)
- Rather than rounding to integer level I would suggest rounding to 3 significant figures. Then "the" would have a frequency of 52,600 occurrences per million words, while "threaten" would be 6.64 occurrences per million words. When you get to fewer than 3 occurrences in your entire sample I humbly suggest a "less than" figure, e.g. if you have a sample of 1 million words and a word occurs twice, it has a frequency of less than 3 occurrences per million words. MGSpiller 18:53, 18 September 2005 (UTC)
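A minimal sketch of the rounding suggested above, assuming three significant figures and a "less than" floor of 3 occurrences in the whole sample. The helper names `round_sig` and `describe` are hypothetical, and the digit count is a parameter because the thread has not settled on three or four.

```python
from math import floor, log10


def round_sig(value: float, digits: int = 3) -> float:
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


def describe(occurrences: int, total_words: int, min_count: int = 3) -> str:
    """Frequency per million words, with a 'less than' figure for rare words."""
    if occurrences < min_count:
        limit = round_sig(min_count * 1_000_000 / total_words)
        return f"less than {limit:g} occurrences per million words"
    freq = round_sig(occurrences * 1_000_000 / total_words)
    return f"{freq:g} occurrences per million words"


print(round_sig(52555.71))       # 52600.0
print(round_sig(6.64))           # 6.64
print(describe(2, 1_000_000))    # less than 3 occurrences per million words
```

Rounding is applied only when a value is displayed; keeping the unrounded counts in the underlying data avoids the concern that truncation is hard to undo.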
- Now those are some neat ideas. The concept of limiting the number of significant digits is quite valuable. I'd quibble that four (not three) digits will probably be required, but I won't know that until I've done a test run and seen the results. One concern I have about this is whether the numbers will be mostly obvious at the lower range...if ten items in a row each are 6.64, then one might guess that they are tied (when they really are not...or maybe only some are.) The other major concern is that once I produce this data, others can whittle it down to their liking...but truncation is harder to undo!
- I also really like your idea about cutting off at the low-end. I do cut off at the low end as it is. But I counted 1.7 billion words. With the method I used, that came out to just over 9,053,000 unique words. Perhaps if I listed frequency not per million (10^6) but instead per billion (10^9), then cutting off terms that occur less than 10 times might make more sense. That would leave only 1,176,929 unique words.
- Words that occurred in my P.G. sample of 1.7 billion words, more than 10^0: 7,876,380, 10^1: 914,647, 10^2: 203,635, 10^3: 46,937, 10^4: 10,176, 10^5: 1,368, 10^6: 150, 10^7: 16.
- In fact, looking at those numbers, perhaps I should limit further analysis to words that occur at least 100 times in those 1.7 billion words...that would leave a "manageable" 262,282 words to work on.
- --Connel MacKenzie [+] (contribs) 17:37, 19 September 2005 (UTC)
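The counts above can be read as order-of-magnitude buckets (words seen 1 to 9 times, 10 to 99 times, and so on). That reading is an inference, but it agrees with the totals of 1,176,929 and 262,282 quoted in the same comment. The sketch below shows how such buckets could be built from raw counts and how the two cut-offs fall out of them; the function names are illustrative only.

```python
from collections import Counter


def magnitude_buckets(counts: dict) -> dict:
    """Group words by the power of ten of their occurrence count."""
    buckets = Counter()
    for n in counts.values():
        if n > 0:
            buckets[len(str(n)) - 1] += 1  # 1-9 -> 0, 10-99 -> 1, ...
    return dict(buckets)


def words_with_at_least(buckets: dict, exponent: int) -> int:
    """Number of distinct words occurring at least 10**exponent times."""
    return sum(v for k, v in buckets.items() if k >= exponent)


# Figures quoted in the thread, keyed by power of ten.
thread_buckets = {0: 7_876_380, 1: 914_647, 2: 203_635, 3: 46_937,
                  4: 10_176, 5: 1_368, 6: 150, 7: 16}

print(words_with_at_least(thread_buckets, 1))  # 1176929
print(words_with_at_least(thread_buckets, 2))  # 262282
```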
- Removing the French and other foreign words would increase rather than decrease the frequency for "threaten" since you have made the divisor smaller. Three significant digits is probably enough, but I won't complain about four. I would expect the low frequency words to have unstable rankings. A single additional occurrence can probably move it up dramatically in the rankings; this suggests at that level the frequencies will be far more important than the rankings. There is no mathematical difference between "per million" or "per billion" frequencies; it's a matter of which is the more aesthetically pleasing. In the long run keeping the low frequency words will be useful in developing criteria for what we call a rare or obsolete word. If we can also report that some word was used 15 times, all between 1868 and 1881, I'm sure that someone will find that information interesting. Eclecticology 19:55, 19 September 2005 (UTC)
- Right. I meant that Word #100,000 would be way further down on the list than threaten was...therefore would have a much smaller value. By the way, I just looked up cruisin (zero), and cruising (88,879), and cruzin (187,536)! --Connel MacKenzie [+] (contribs) 21:07, 19 September 2005 (UTC)
- Now that the monthly backup situation is stabilizing, I may actually get to do this. Any more comments? --Connel MacKenzie 06:29, 5 October 2005 (UTC)
My current plan is to put the #5 format into a template, and put it on the top 16 entries. I'll then use Polyglot's bot as my base to do the top 150. At that point it would be valuable to know if accounts can still be marked as bots. If that goes well, I'll try the top 1,000, then 10,000.
One improvement I still need to make on my counting is to make the frequency counts case-insensitive. The case sensitivity is relevant for finding words that we are missing (e.g. Marquis) but only throws the statistical analysis off (e.g. the + The.) I think I shall regenerate the numbers case-insensitive before proceeding any further. Sound good? Any other formatting ideas out there, in the meantime? --Connel MacKenzie 06:18, 9 October 2005 (UTC)
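As a rough illustration of the case-insensitive regeneration described above, the sketch below merges case-sensitive counts such as "the" and "The" into one total. It assumes the counts are already available as a word-to-count mapping; the function name and the sample numbers are hypothetical.

```python
from collections import Counter


def fold_case(counts: dict) -> Counter:
    """Merge counts of words that differ only in capitalisation."""
    folded = Counter()
    for word, n in counts.items():
        folded[word.casefold()] += n
    return folded


# Made-up case-sensitive counts for illustration.
case_sensitive = {"the": 10, "The": 4, "THE": 1, "Marquis": 2}
print(fold_case(case_sensitive)["the"])  # 15
```

The case-sensitive table is still worth keeping alongside the folded one, since it is what reveals missing capitalised entries such as Marquis.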
- Case insensitivity is needed. But of course there is then a caveat that it is not possible to discern words which always have their first letter capitalised from words which don't when the word is the first in the sentence or part of a phrase in all capitals. Another is of course that without some very good comp.sci it's also not possible to know which of several homonyms a given example is. There is a more minor one dealing with hyphenated vs. non hyphenated variants, especially in etexts formatted with hyphens at the end of some lines to split words either at syllable boundaries or when there is a hyphen in that position anyway - but this can be overlooked for now. To deal with the others I would make it clear in the template that the stats are for all capitalisation variants and all homonyms, and thus should be positioned outside any of the etymology sections - perhaps along with the "disambiguation see also" at the very top.
- Another idea you might want to think about is collocations. You can build a table of every word preceding and every word following every other word, then put the top 3 or 5 or 10 in some similar table. Yet another idea would be to have a separate tab just for statistics in the manner of my citations tab. That you could fill with all sorts of arcane stuff of interest to many but not getting in the way of the casual dictionary user.
- Keep up the great work! — Hippietrail 15:16, 9 October 2005 (UTC)
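The collocation table suggested above (every word preceding and every word following each word, keeping the top few) might look something like this. It is a sketch on an already tokenised text; tokenisation, hyphen handling and case-folding are assumed to have happened earlier, and the function name is made up.

```python
from collections import Counter, defaultdict


def neighbours(tokens: list, top: int = 3):
    """Return (before, after): the most common preceding and following words."""
    before = defaultdict(Counter)
    after = defaultdict(Counter)
    for left, right in zip(tokens, tokens[1:]):
        after[left][right] += 1
        before[right][left] += 1
    top_before = {w: c.most_common(top) for w, c in before.items()}
    top_after = {w: c.most_common(top) for w, c in after.items()}
    return top_before, top_after


tokens = "the cat sat on the mat and the cat slept".split()
top_before, top_after = neighbours(tokens, top=1)
print(top_after["the"])   # [('cat', 2)]
print(top_before["cat"])  # [('the', 2)]
```

On a corpus of 1.7 billion words the in-memory tables here would grow very large, so a real run would probably need to prune rare pairs or work in batches.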
- Thanks. I suppose I could exclude words that Wiktionary identifies as ===Proper nouns=== from the case-folding. But not this round. I've re-sync'ed with Project Gutenberg; now finishing processing of 25,112 books.
- Thanks for the good feedback. I was hoping to finesse the hyphenation and homonym problems, if you don't mind. Also, I'm leery of crowding things *above* the ==English== header...since these rankings apply only to words that are tagged within Wiktionary as ==English==, I was planning on adding it immediately below that header line. This may have an undesired side effect of including French and German words in the counts (where the French word is spelled the same as another word in English) but again, I am hoping to simply ignore these odd cases for now, through a technique similar to that of the ostrich.
- --Connel MacKenzie 03:00, 10 October 2005 (UTC)
- I've had an idea for statistically checking to see if a text is in English. Total the occurrences of the ten most frequent English words (perhaps excluding "a" and "I"). Normalize the total, and compare it with the generally expected value for English. If the result is less than 10% of the expected value the text is probably not English. With refinement a similar test might be used to identify the language of any text. Eclecticology 18:08, 14 October 2005 (UTC)
- Clever. I like it. I'll have to do better at filtering off the header and footer pages to do that effectively. (Not too hard, just something I haven't bothered with yet.) Both things could take quite a while though. --Connel MacKenzie 18:55, 14 October 2005 (UTC)
- No rush. It's not like anybody was clawing his way into competing with you for the work. Is the text in the headers and footers fairly stable? If so you could take a long enough string of words at the beginning and end of each of these and simply delete everything between. Eclecticology 00:32, 15 October 2005 (UTC)
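A minimal sketch of the 10% test described above. The marker words are the ten highest-ranked words in the list at the top of this section with "a" and "I" left out. The expected share passed in is a placeholder value for illustration, not a measured figure, and the function names are hypothetical.

```python
import re

# Top-ranked words from the list earlier in this section, minus "a" and "I".
MARKERS = frozenset(
    ["the", "of", "and", "to", "in", "that", "was", "he", "his", "with"]
)


def marker_share(text: str) -> float:
    """Fraction of the words in text that are marker words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(1 for w in words if w in MARKERS) / len(words)


def probably_english(text: str, expected_share: float,
                     threshold: float = 0.10) -> bool:
    """False when the marker share is below threshold * expected_share."""
    return marker_share(text) >= threshold * expected_share


EXPECTED = 0.2  # placeholder; a real value would come from known English texts
print(probably_english("The cat sat on the mat and he was with his dog.", EXPECTED))
print(probably_english("Der Hund und die Katze sind im Garten.", EXPECTED))
```

Header and footer text would need to be stripped first, as noted in the replies, since English boilerplate at the start of a non-English etext would inflate the share.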
Disappointing news: with only 4 significant digits, I found duplicates within the first 1,000. I'm trying to run that again with 6 significant digits right now.
- I wouldn't worry about this. Any change in the corpus is bound to induce small changes to their relative positions. You are just as well off accepting that they are tied for that rank, and leave as many ranks blank as are needed to fill the space. Eclecticology 00:32, 15 October 2005 (UTC)
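The tie handling suggested above (tied words share a rank, and the following ranks are left blank) corresponds to what is usually called standard competition ranking. A sketch, with made-up frequencies and a hypothetical function name:

```python
def rank_with_ties(freqs: dict) -> dict:
    """Rank words by descending frequency; ties share the lowest rank number."""
    ordered = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    ranks = {}
    previous = None
    rank = 0
    for position, (word, freq) in enumerate(ordered, start=1):
        if freq != previous:
            rank = position  # a new frequency starts a new rank
            previous = freq
        ranks[word] = rank
    return ranks


# Illustrative values only: two words tied at 6.64.
print(rank_with_ties({"c": 9.0, "a": 6.64, "b": 6.64, "d": 5.0}))
# {'c': 1, 'a': 2, 'b': 2, 'd': 4}
```

Here rank 3 is skipped because two words share rank 2. Comparing the rounded frequencies rather than raw counts would make more words tie, which is the effect seen with 4 significant digits.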
Hippietrail, The collocations idea is great. I started something along those lines when I did the first couple test books. That information I think is highly valuable also, but that is not what I'm after on this pass. It will also require me re-running the whole shebang (that right now I am able to supplement incrementally.)
Sidenote: Brion seems to have the automated backups generating weekly XML dumps. I presume this means the [[Special:]] pages are now also being updated weekly?
--Connel MacKenzie 07:16, 10 October 2005 (UTC)
- Included in this latest run are frequencies (per billion) case insensitive. --Connel MacKenzie 09:47, 14 October 2005 (UTC)
- A couple of remarks I made about the template from Connel's talk page: The template should, concisely, indicate what it is all about, or be placed under a 'Project Gutenberg frequency' ===header===. Also, most people using Wiktionary won't be interested in how often a word occurs in the Project Gutenberg texts. Therefore the template should be moved to a less prominent position, as the end of the language section. In addition I would welcome some documentation on how you decide which language a word belongs to, and how you handle capitalization of words at the beginning of a sentence. Ncik 22:38, 19 November 2005 (UTC)
Current status
Implementing Ec's 10% filter was much easier than I thought, but is having inconsistent results so far, as I refine the weighting criteria I am using. Finding a line in the preamble that says "This etext is in German" is sometimes a good clue, but is not consistent either. --Connel MacKenzie 22:44, 16 October 2005 (UTC)
I'll be republishing the recalculated rankings as a simple list on Wiktionary:Frequency lists soon. The 'bot will continue individual experiments through the first 100, when I'll stop for additional comments before letting the (throttled) ranking bot User:GutenBot do its thing for the top 1,000 and 10,000 English words. --Connel MacKenzie T + C # 18:16, 10 November 2005 (UTC)
- Right now, I'm letting the 'bot run for English words #1-100. With case insensitivity, (and including English only) the top ten reordered a little bit. Since I use the week-old XML list, there might be a couple minor mistakes in the top ten. Please give me some feedback on #11-100. (1-10 will get corrected when there is a new XML backup copy on the Wiktionary download page.) --Connel MacKenzie T C 23:47, 25 November 2005 (UTC)
- Proceeding to 1,000 to get a better feel for it. --Connel MacKenzie T C 02:32, 26 November 2005 (UTC)
While I think this work is very interesting, it is very annoying to find these things at the very top of articles thus pushing the essential information down, sometimes off the bottom of the screen. I think this info is of about the least importance to actual dictionary users (about equal to the Anagrams sections that have begun to appear) and thus belongs at the bottom of articles unless some very compact sidebar or such can be devised. — Hippietrail 00:59, 27 November 2005 (UTC)
I agree with Hippietrail Gerard Foley 03:18, 27 November 2005 (UTC)
- I believe that was format choice #8... http://en.wiktionary.org/w/index.php?title=that&direction=prev&oldid=629559 demonstrated that format. The opinions I heard were unfavorable. Perhaps you could offer advice on how to render the information at a lower level? Parsing an entry and changing text immediately after ==English== is relatively easy. Are you suggesting I add this instead immediately before the first "----" perhaps? --Connel MacKenzie T C 18:58, 28 November 2005 (UTC)
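The two placements discussed above, immediately after the ==English== header or immediately before the first "----", can be sketched with plain string handling. This is not the bot's actual code; the template name `{{rank-demo|42}}` and the function names are invented for the example.

```python
import re

ENGLISH_HEADER = re.compile(r"^==[ \t]*English[ \t]*==[ \t]*$", re.MULTILINE)
SEPARATOR = re.compile(r"^----[ \t]*$", re.MULTILINE)


def insert_after_english(wikitext: str, line: str) -> str:
    """Insert line directly below the ==English== header, if there is one."""
    header = ENGLISH_HEADER.search(wikitext)
    if header is None:
        return wikitext
    return wikitext[:header.end()] + "\n" + line + wikitext[header.end():]


def insert_at_section_end(wikitext: str, line: str) -> str:
    """Insert line just before the first ---- that follows ==English==."""
    header = ENGLISH_HEADER.search(wikitext)
    if header is None:
        return wikitext
    sep = SEPARATOR.search(wikitext, header.end())
    if sep is None:
        # English is the last section: append at the end of the page.
        return wikitext.rstrip("\n") + "\n" + line + "\n"
    return wikitext[:sep.start()] + line + "\n" + wikitext[sep.start():]


page = "==English==\n===Noun===\n# thing\n----\n==French==\n===Nom===\n"
print(insert_after_english(page, "{{rank-demo|42}}"))
print(insert_at_section_end(page, "{{rank-demo|42}}"))
```

Both functions leave the page unchanged when there is no ==English== header, which matches the plan of ranking only words tagged as English.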
I like this style. Gerard Foley 19:26, 28 November 2005 (UTC)
- On reflection, I am more partial to the sidebar format as well.
- It has occurred to me that much of the conversation about this side project of mine has ended up on my talk page (or skype or e-mail) when it really does belong here. Now that 1-1,000 are done, perhaps I should archive these conversations, and reopen a new vote for just the formatting style/preferences? (I think there are really only two viable formatting candidates at this point: the horizontal bar currently in use, vs. the sidebar.) With this particular conversation being old, it is getting hard-to-find within this Beer Parlour page. (And the BP itself needs a fresh round of archiving.)
- Also, where should I describe technical aspects (converting hyphens to spaces, case-insensitive methods, including in the ranking only words that have ==English==, handling of tied ranks, etc.)? Wiktionary:Frequency lists?
- I'd also like opinions on what to do for an encore: should I make frequency lists for German, French and Latin, or should I go back to experimenting on the collocations within English?
- --Connel MacKenzie T C 16:34, 2 December 2005 (UTC)
Now that I'm getting used to seeing these ranking bars I have to say that after all I don't find it useful in the least to see the words on either side of this word in the rankings - there's just no relationship. I'd like to see a sidebar that has just the ranking of this particular word in a nice format and maybe it could have a link to a subpage or appendix that shows a bunch of words around it. Don't get me wrong, I still think it's valuable - just not in the same way as truly dictionaryish info. — Hippietrail 18:16, 2 December 2005 (UTC)
- I agree with Hippietrail about the words that come before and after. The link to the frequency list should suffice (where all the information is provided, for those who are interested). I also think the present template is too big and in too prominent a location. Maybe we should think about having an Interesting Facts section as suggested by Paul G below, in which case the template (if we need one at all) could just be displayed as a normal line saying: Project Gutenberg rank: [number], with "Project Gutenberg rank" linked to the Frequency_lists, where one should find information on what these lists are and how the ranks are worked out. Ncik 23:17, 2 December 2005 (UTC)
- While I feel that "truly dictionaryish info" is only a tiny subset of what Wiktionary covers (not being paper and all) I think you may be quite correct that the befores and afters just aren't relevant. Perhaps for the top ten words, the trivia is interesting, but for the others, nah.
- There was talk of revisiting the sidebar format. Would a format similar to something like {{wikipedia}} be more acceptable to folks? That is, without befores and afters. The idea of adding a ===Trivia=== section is simply too great a challenge at this point in time. I will keep it in mind, but neither my Python nor my 'bot inner-workings knowledge is that good yet.
- This conversation is still too hard to find, within the Beer Parlour. Shall I propose a vote (as a new section at the end?) --Connel MacKenzie T C 19:03, 5 December 2005 (UTC)
- Connel please feel free to move this convo to someplace better. I suggest re-doing half as a sidebar at the top and half as a more minimal regular formatted entry at the bottom of the ==English== section and then taking a vote. — Hippietrail 19:50, 5 December 2005 (UTC)
- Um, that's what I just said I'm not able to do. --Connel MacKenzie T C 21:45, 5 December 2005 (UTC)
Main Page
Hi, can someone with admin rights please undelete the Main Page? Thanks. -- Schneelocke 00:42, 17 September 2005 (UTC)
- I've protected it again. There was nothing in the protection log to show that it had been unprotected. I'm wondering how an anonymous IP could have undone the protection to enable his vandalism. Is there some kind of security hole? Eclecticology 03:30, 17 September 2005 (UTC)
- Last I checked, protection doesn't stop admins from deleting pages.
- I've taken the step of desysopping User:Wonderfool on account of the strange deletions that were reported. --Brion 03:32, 17 September 2005 (UTC)
- I suspect that the others may be referring to the anon that made the mainpage again saying it was fubar'd, and the lack of protection after undeletion, since the process destroys any protection. --Phroziac 03:38, 17 September 2005 (UTC)
- Sorry - I was called away. I saw User:Schneelocke's request and restored the page, (as well as about 8 others,) but then had to go away until just now. --Connel MacKenzie [+] (contribs) 06:11, 17 September 2005 (UTC)
Verifiability
I have copied the Wikipedia policy on Verifiability to Wiktionary:Verifiability, and I will be editing it to adapt the text to our terminology in the next few days. This policy is considered as one of the pillars of Wikipedia credibility, along with Neutral Point of View and No Original Research. Wiktionary is now big enough that we should pay more attention to its credibility. Eclecticology 08:52, 20 September 2005 (UTC)
- Great stuff, Ec! This should go a long way toward stemming some of the recent abuses of deletion, which are clearly out of line with Wikipedia practice (not to mention Wiki practice in general). I'll have a look to try to make sure that the policies translate to the Wiktionary world. As I've mentioned before, we're trying to verify usage here, not factual accuracy. Nonetheless, much of the verifiability policy does seem applicable.
- While I've got you here, here's the latest cut at the "Attestation" section of CFI. I've toned down the references to print media considerably, as they're clearly not practical as stated, and played up the concept of durably archived media. All in all, our domain seems to fit best with "facts that can be verified fairly quickly by most editors, requiring only resources available over the Internet". With this in mind:
"Attested" means verified through
- Clearly widespread use,
- Usage in a well known work,
- Appearance in a refereed academic journal, or
- Usage in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year.
Where possible, it is better to cite sources that are likely to remain easily accessible over time, so that someone referring to Wiktionary years from now is likely to be able to find the original source. As Wiktionary is an online dictionary, this naturally favors media such as blogs and usenet groups, which are durably archived by Google. Print media such as books and magazines will also do, particularly if their contents are indexed online. Other recorded media such as audio and video are also acceptable, provided they are of verifiable origin and are durably archived. When citing a quotation from a book, please include the ISBN.
- What do you think?
- Thanks! -dmh 03:35, 21 September 2005 (UTC)
- Ah ... here's one of the points I had in mind: "Use your common sense to work out what other resources would help, and check them. If you can confirm the statement using them, leave it in; otherwise, continue." Note that this does not say "delete the entire article". The entire policy revolves around increasing the level of citation by addition, not around deleting unsupported matter.
- Yes, the burden does rest initially on the person making the change, but if that person fails, particularly from being innocently ignorant of the process, it's up to the rest of us to pick up the slack, not to castigate the offender and undo useful work, however minimal, in order to enforce some notion of discipline. I would certainly have taken your calls for citation much more seriously had you not appeared to abdicate all responsibility for digging up cites yourself. Yes, it is my homework, but it's yours as well. There are certainly contributors in our community who put both of us to shame in the homework department. Let's emulate them better, shall we?
- Now, I realize you may well take offense at that last, as you are, I sincerely believe, quite diligent in consulting printed sources and most likely not so averse to searching online as you sometimes appear. If so, I apologize, and ask you to read through this section and others like it carefully and less adversarially than the "fun and games" comments I've made. We're really much closer than might appear.
- Underlying all this there is still the issue of what sources to consult. You have made your dislike of blogs and such quite well known and, I daresay, taken perhaps a bit more extreme position on this than you might otherwise have in order to be sure that this point of view was represented. This is a separate issue, however. It is up to the community to reach consensus on how (or, I'll add for the sake of completeness, whether) online material can be used as verifiable support for our work. If you are uncomfortable with neologisms (as opposed to protologisms) appearing with only online support, I can understand that, and I even share your concern up to a point.
- However, until such consensus is clearly reached, we need to work around each other's points of view and preserve whatever information has been found, even for terms we may not like to see included. I'm perfectly happy, and have been for quite some time, with declaring clearly what is known about a term, and even disclaiming certain entries as "weakly supported" or "don't use this on your English paper" or whatever. This seems infinitely preferable to haggling over CFI and PGD, though these will of course need to be continually revisited for other reasons.
- If you really are serious about bringing Wikipedia-level rigor to Wiktionary, I'm happy to hear it. I for one will welcome it as a refreshing change, one that I've been asking for quietly for quite a while now, even as I have continued to take the piss out of you for what I honestly regard as your more egregious violations. -dmh 04:13, 21 September 2005 (UTC)
Wiktionary:Verifiability not editable
In order to help speed the process along, I adapted the section on "Dubious Sources" to the realities of dictionary-making. Unfortunately, this edit seems to have been reverted. Worse, the page has been protected, so I am unable to restore this work. As it stands, only those with sysop privileges are able to edit this page, effectively blocking others from contributing on an equal basis. Knowing the commitment of the community to reaching consensus through an open process, I can only assume that this was accidental. Could some sysop please re-enable editing?
Alternatively, we could confine all discussion and "adaptation" of this page to the talk page until consensus is reached, then apply the results en masse, but this seems cumbersome compared to simply editing the page directly but disclaiming appropriately that no consensus has yet been reached. -dmh 18:34, 21 September 2005 (UTC)
- Will I unprotect it? No. Looking at the edit history, it is painfully clear that it is a work in progress. I'm sure it will be unprotected once it stabilizes. Likewise, I am confident that Ec will take some of your efforts into account and include them as he can, as he adapts the texts for Wiktionary. --Connel MacKenzie [+] (contribs) 19:09, 21 September 2005 (UTC)
- Right. So I adapted a section which was clearly not appropriate as it stood into something that at least spoke to the needs of Wiktionary. Naturally, my take on the subject of "Dubious Sources" runs somewhat counter to Ec's, but so what? Ec's edits are not exactly unbiased either, as he would like to use the page to lend legitimacy to certain of his positions. If the page wasn't ready to be edited publicly, it should have been polished offline and then posted. If it's a work in progress, let everyone work on it, especially if there is controversy over what the page should say. Otherwise, what legitimacy would it have?
- Frankly, from recent experience, I am no longer confident that Ec will take anything at all that I say on board, notwithstanding that — to his great credit — he has done so in the past even if he has sometimes had to hold his nose while doing it. What I find now is that anything substantive I put up regarding issues like who should have to prove what when is simply ignored, apparently out of a misplaced fear that I want to turn Wiktionary into UrbanDictionary, but that there is plenty of time to continue the spitwad fight on RFD.
- What I'm seeing here is an escalating attempt to control the process, first by overusing the sysop's power to delete, then by keeping up one side of the silly edit war on CFI, and now by pulling rank on the Verifiability page — but only after I went so far as to edit a section. I don't think this is directed at me personally so much as it is born out of a fear of Wiktionary being cheapened by the too-easy addition of slang, vulgarity, netspeak and such. As a vocal proponent of inclusivity (within the bounds of rigor), I have become something of a lightning rod for Ec's criticism, but then, I haven't exactly run from the role, either.
- Be that as it may, having one person control any part of the process is antithetical to the way Wikis work. What I would really like to see here is an honest, open discussion of what to let in, why and with what support, with due attention to existing Wiki practice (definitely including Wikipedia policies) and to the needs of a dictionary. I would like to see this discussion take place directly (and not via edit war) and outside the policy pages proper, with any edits to the policy pages done by consensus.
- At this point, I'm not happy with having WT:V protected, but if it helps Ec to be able to do the initial adaptation undisturbed, I suppose I can go with that. I do, however, fully expect the page to be unprotected sooner rather than later. -dmh 19:49, 21 September 2005 (UTC)
- Admittedly I did reverse dmh's edits on the verifiability page before I read his comments on this page. I had not yet worked on the section that he changed. His last paragraph above is in the spirit of what I have done. Meanwhile, I have copied his changed version of the section to the talk page. I will indeed unprotect the page once the initial adaptations have been made.
- Thanks for commenting Connel. There are some fundamental issues involved here which should draw in a wider community. Having them resolved in favour of the winner of a two-party pissing match would not be the best outcome for the project. Eclecticology 20:52, 21 September 2005 (UTC)
- Heartily seconded. Now we're getting somewhere. I hope from some of the comments I've made on the talk page there that it's clear there is at least some room for compromise and hopefully more common ground than there might appear at first. One of the enduring ironies of this discussion is that, as far as I can tell, we have fairly similar views of the desired end result. E.g., for the infamous choda, I'm confident that we will eventually find published authors using the word, probably in devanagari script, just as we can find published references for various forms of fuck. Indeed, it's precisely because I'm confident of this that I want to see the incompletely formed article there with a call to native speakers to complete the puzzle. I don't take up for these words for the thrill of seeing vulgar slang in a dictionary, or even (solely :-), to twit Ec. -dmh 21:43, 21 September 2005 (UTC)
Declension tables for German nouns
I just thought I'd let everyone know I've added some declension table templates for German nouns, namely Template:de-declension, Template:de-declension-uncountable and Template:de-declension-pluraliatantum. They should be self-explanatory, I hope - if they're not, the first is used on Federweißer and Roter Sauser, for example, so refer to those pages to see how they're used. :) I hope they'll be found useful. :) -- Schneelocke 00:24, 22 September 2005 (UTC)
- This is good. I do have a suggestion for a small improvement. Could the horizontal line between the plural and singular sections be made bolder? Eclecticology 16:12, 22 September 2005 (UTC)
- Sure. :) How does it look now? -- Schneelocke 16:20, 22 September 2005 (UTC)
- Much better. As for the grammar there, since "grapes" is plural "have" is the correct verb. Eclecticology 04:46, 26 September 2005 (UTC)
There also exists Wytukaze's template Template:de-decl-noun. Ncik 18:02, 7 October 2005 (UTC)
Pseudonamespaces (again)
Some months ago, there was discussion about formalizing some pseudo namespaces. Rather than trying to discuss them all again, I'd like to request just the two namespaces "Appendix:" and "Index:" to become official namespaces (#101 and #102.) That way, the "Wiktionary:" namespace (#4) might gradually become less cluttered.
Do we need to have an official vote here for such a thing to happen? Or can a bureaucrat simply request it from the developers? (Or, is it a good idea to have a vote anyway?)
--Connel MacKenzie 16:32, 27 September 2005 (UTC)
- Sorry, I've been slow off the mark in dealing with this. Can you look at m:Help:Custom namespaces, and let me know if it makes any sense. When it was discussed, I don't think that there was much opposition. I don't expect that there would be any opposition now, but one never knows around here. Eclecticology 09:28, 28 September 2005 (UTC)
- Being us it's probably always safer to at least propose a vote to see if anyone disagrees. But I'm fully in favour of both namespaces so you can count my vote in. — Hippietrail 16:37, 28 September 2005 (UTC)
- Well, I certainly vote for it also. Anyone familiar with the process of making it a formal vote, or can we just have people chime in here? --Connel MacKenzie 06:27, 5 October 2005 (UTC)
- As soon as this is voted on, we can move the "Place names in xxx" entries out of the main namespace, right? --Connel MacKenzie 02:29, 12 October 2005 (UTC)
Romanization
I think placing romanization of Korean terms in parentheses is useless and even does the reader a disservice. Unlike Japanese, which maps rather well to English romanization, Korean romanization is tricky. It depends on hard-to-learn rules.
For example, the Korean word for principle is romanized as weonri or weonli which is ridiculous, because it's pronounced weolli. There is a pronunciation rule in Korean: ㄴ before ㄹ makes both sound like ㄹㄹ.
I think our readers will stumble over the difference between pronunciation and Romanization.
I think we should add pronunciation somehow - but probably not right next to the first appearance of the Hangul glyphs (i.e., Korean word written in Korean!).
My interest in Korean is primarily as a language learner and sometime language teacher. The people I know who are learning Korean are either (1) Japanese, who will ignore any romanizations because they would prefer a "kana-ization" in any case! or (2) Westerners who can barely read the Korean alphabet.
Learning the alphabet is slowed greatly by showing students a romanization - particularly when the romanization is different from the pronunciation. I've heard students consistently mispronounce even simple words like Han-goong-mal (as Han-gook-mal with a K) simply because that's how the romanization convention puts it. Even more annoying is Ton-gil instead of Tong-il when singing the song "Urie so wonun tongil". The NG sound of Tong should be followed in the next syllable with EEL (sounding like the fish).
What's the right place to continue this discussion? Ed Poor 12:15, 28 September 2005 (UTC)
- In defense of the romanizations it's important to remember that it serves an entirely different purpose than pronunciations. It's the difference between the written word and the spoken word. If they correspond that's great, but both should still be shown. One of the benefits of a clearly defined romanization is that it allows us to consistently find things. The Library of Congress, for example, has rules for romanizing Asian languages to ensure that books are filed consistently. Eclecticology 02:15, 29 September 2005 (UTC)
- Do I understand you correctly? It seems it would be wiser to first add the pronunciation in IPA or the Korean phonetic alphabet (if there is one) before considering the romanization. Even phoneticized in Japanese, assuming the two languages share the same allophones, would be better. There may first of all be several ways to romanize Korean, and certainly Korean does not share the same phonetic sounds with any of the European languages. As a matter of principle, the romanization cannot be considered a pronunciation. That it is treated as such is lamentable, and should be discouraged to the best of our ability. (As Eclecticology notes, however, it cannot simply be done away with.)
- As I have laid it out, the case in Chinese is quite the same. The Chinese use a phonetic system called Zhuyin (or bopomofo) in their elementary education. Neither the consonants nor the vowels are the same as English, and the ones that are close are often confused because of the romanizations used. Davilla 15:51, 3 October 2005 (UTC)
- Romanization is just that: a (usually systematic) respelling of a word in Roman characters, for the benefit of those who cannot read foreign alphabets (either from lack of skill or lack of fonts). It is no phoneticization; the spelling is no more meant to accurately indicate the pronunciation than the spelling of English 'eighth' or 'have' indicate the pronunciation of the words they represent. If we weren't a dictionary, we might worry about this, but we are a dictionary and all our entries have room for ===Pronunciation=== sections. (At any rate, the McCune-Reischauer romanization of 원리 is wŏlli, isn't it?). —Muke Tever 20:31, 3 October 2005 (UTC)
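The pronunciation rule quoted in this thread (ㄴ before ㄹ makes both sound like ㄹㄹ) can be illustrated with a minimal Python sketch. It relies only on the standard arithmetic layout of precomposed Hangul syllables in Unicode; the function name is made up for illustration, and the sketch ignores every other Korean sound change and any exceptions to this one.

```python
# Precomposed Hangul syllables are laid out arithmetically in Unicode:
# code = 0xAC00 + (initial * 21 + vowel) * 28 + final
HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3
VOWEL_COUNT = 21
FINAL_COUNT = 28

INITIAL_RIEUL = 5   # index of ㄹ among initial consonants
FINAL_NIEUN = 4     # index of ㄴ among final consonants
FINAL_RIEUL = 8     # index of ㄹ among final consonants


def is_syllable(ch):
    """True if ch is a precomposed Hangul syllable."""
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def assimilate_n_before_l(word):
    """Hypothetical helper: respell final ㄴ as ㄹ when the next syllable starts with ㄹ."""
    chars = list(word)
    for i in range(len(chars) - 1):
        cur, nxt = chars[i], chars[i + 1]
        if not (is_syllable(cur) and is_syllable(nxt)):
            continue
        final = (ord(cur) - HANGUL_BASE) % FINAL_COUNT
        initial = (ord(nxt) - HANGUL_BASE) // (VOWEL_COUNT * FINAL_COUNT)
        if final == FINAL_NIEUN and initial == INITIAL_RIEUL:
            # Swap only the final consonant; initial and vowel stay the same.
            chars[i] = chr(ord(cur) - FINAL_NIEUN + FINAL_RIEUL)
    return "".join(chars)


print(assimilate_n_before_l("원리"))
```

For 원리 this yields the respelling 월리, which is the pronunciation the thread writes as weolli; the written form and its romanization are left to a separate step, which is the distinction the replies above are drawing.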
Labels used to highlight categories
From a discussion on my user page:
User:--Stranger created the template {{rivers}} which consisted of the label (rivers) and a category. When used in an entry, this then gives something along the lines of (using a simple example):
Seine
- (river) A river in France.
which then gets put in the "Rivers" category.
Now, to my mind, the label is redundant, as we can see it is a river from the definition. My understanding is that labels are used in Wiktionary to show the area in which a term is used (such as geology) or to show the level of language (such as slang).
However --Stranger says:
- "I like/need the restrictive label (river) in the template because it lets me know the template actually exists. I know there are other ways of telling, but on late at night when clicking the "preview" button, if I don't see that my {river} tag did anything, I'll likely just delete it. (I had this problem with Mississippi.) While I agree it's redundant, please keep it for the sake of newbies (and Strangers)."
This is a fair point, but I think this is a misuse of labels. It might help newbies and Strangers but makes for odd-looking entries.
Any other views? — Paul G 14:59, 29 September 2005 (UTC)
- What categories a page is listed in is available at preview, at the very bottom of the page. Jon Harald Søby 16:25, 29 September 2005 (UTC)
- For the record, I can't take credit for the {rivers} template creation - that was someone else. I was just editing Mississippi and I didn't see anything happen when I added "{rivers}" to the beginning of the entry I created for the river, so I removed it. Paul G has created a workable solution to this, he made {rivers} print out (geography).
- For newbies/idiots/Strangers, I would like to open a can of worms here - how do you add categories to entries? Initialisms, acronyms and abbreviations are easy because the category is built into a template for the third level heading. Can something similar be done for nouns, verbs and adjectives? I mean, can someone make a ==={{noun}}=== template so that a word is listed under the Noun category? Or aren't we using categories for something as broad as the incorrectly-named parts of speech? Peace, --Stranger 17:23, 29 September 2005 (UTC)
- --Stranger, if the category already exists, just type [[Category:xxx]] at the bottom of the entry, replacing xxx with the name of the category (for example, [[Category:Mathematics]]). (Actually, it can go anywhere in the entry, but it's neater to list all the categories at the bottom.) The entry will then get added to that category. You can make new categories yourself if they don't already exist. Have a look at some of the existing ones in the Category namespace (one way to do this is to go to the main page, click on any of the letters under "Words beginning with:" on the right of the page, select "Category" from the "Namespace:" pull-down menu and click "Go").
- You will see that there are several hundred categories already, and that "Nouns" is already there. This is probably too broad a category, though, and it is better to use the language-specific ones when creating an entry (eg, Category:English nouns) if you are inclined to do so. — Paul G 09:53, 30 September 2005 (UTC)
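The category advice above amounts to appending a `[[Category:...]]` link at the bottom of an entry. A minimal Python sketch of that step, for anyone scripting it, follows; the function name and the sample entry text are assumptions for illustration, not an existing tool.

```python
def add_category(wikitext, category):
    """Hypothetical helper: append a category link at the bottom of an entry.

    Does nothing if the same link is already present anywhere in the text.
    """
    tag = f"[[Category:{category}]]"
    if tag in wikitext:
        return wikitext
    return wikitext.rstrip("\n") + "\n\n" + tag + "\n"


# Illustrative entry text only; not copied from a real page.
entry = "==English==\n===Proper noun===\n'''Seine'''\n# A river in France.\n"
print(add_category(entry, "Rivers"))
```

Calling it twice with the same category leaves the text unchanged the second time, so a script can be rerun safely.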
I love this! I've just added it to my user page. Can some nice Greek speaker create el-1 for me, please? — Paul G 15:44, 29 September 2005 (UTC)
- The easier way is to copy template:User el-1 from w:template:User el-1. (Disclaimer: I'm so monolingual, it hurts.) --Connel MacKenzie 16:29, 29 September 2005 (UTC)
Merging of "Alternative forms" with "Alternative spellings" headers
Connel has started this process, and it is a fairly beneficial move, in my opinion. However, I disagree with how it is being merged (as I have already noted to Connel, who has stopped for the time being, allowing us to examine the results). Currently, all "Alternative forms" headers are being removed in favour of "Alternative spellings" headers. This, I feel, is an incorrect approach, as, for example, ze is not a mere spelling alternative of zij, it is an unstressed variant, and is thus pronounced differently too, which is the reason for the difference in spelling at all. Another example is þrie and þreo. These are pronounced quite differently, and the spelling is a reflection of this.
As I see it, there are three options:
- We halt the merging process completely, and return all articles back to their previous state. This keeps the "different in spelling only" and "different in spelling and pronunciation(/something else)" distinction, but perhaps gives us an unnecessary amount of (unnecessarily discriminating) headers to choose from.
- We merge the headers, but under "Alternative forms" instead. An alternative spelling is clearly also an alternative form, but the reverse does not have to be true. I quite like this idea, but it may not be to everyone's liking, who may like our current distinctions.
- We instead link relations like the þrie/þreo one under "Synonyms", as it can be argued they are etymologically different, if not very distantly at all. However, this creates two problems. Firstly, it can be quite hard to know where to draw the line, much like the problem many had/have with the "dated" qualifier. Secondly, many alternative spellings (words that are spelt differently but pronounced the same) also have different etymologies, that we are well aware of. For example, the spelling reforms by Noah Webster, creating our much-beloved color/colour debate.
I'm open to more ideas, naturally. --Wytukaze 21:27, 29 September 2005 (UTC)
- I vote for #2; merge to "===Alternative forms===". --Connel MacKenzie 21:37, 29 September 2005 (UTC)
- I'm making a cautious vote for continuing the current process but leaving out any articles which use both headers since those need special attention. It is the right thing to merge all the variations of the alternative spelling headers into a single standard. The oddities y2 points out are interesting but indicate why we need a tight nomenclature. In the past I have put similar things under "Related terms" if they seem etymologically related, and "Synonyms" if they seem not to be. If possible I would recommend looking at some print dictionaries in difficult cases. Most have "also" or cross-references for spelling variations. Some have "See", "cf" or "note" for irregular inflections and other kinds of uncommon relationships between two headwords. — Hippietrail 23:07, 29 September 2005 (UTC)
- I agree with === Alternative forms ===. It is the header used in the Latin wiktionary (well, 'Other forms', technically) and is a good overarching form that comprises 1) same pronunciation, different spelling (Lutetia, Lutecia = Paris); 2) same form, different declension (commentarius masc., commentarium neut. = notebook); 3) different pronunciation, same 'word' (mainly in the case of proper nouns—whose form in spoken languages change over time and may be 'reborrowed', as Olisipo → Lyxebona → Lisbona = Lisbon; but also sometimes in other words: centesimus, centensimus = hundredth; diaeta, zeta = diet) —Muke Tever 07:52, 1 October 2005 (UTC)
- I think the moving of everything to "Alternative spellings" is not the right approach, as alternative spellings offer a choice of how to spell a single word (such as with -ise or with -ize), whereas "Alternative forms" is more appropriate for inflections, where there is no choice (such as where a word must be used in the given context in a particular number, gender or case, and using any other would be incorrect). Under "Synonyms" would come words that are completely different but mean more or less the same thing. "Related terms" are different words that do not have the same meaning but are related by concept.
- So, for example: "categorize" has the alternative spelling "categorise", the alternative forms (inflections) "categorizes", "categorizing" and "categorized", the synonym "classify", among others, and the related term "category" (among others).
- I think it is useful to retain these distinctions rather than putting everything into "Alternative forms", or have I misunderstood what is being proposed? — Paul G 17:04, 4 October 2005 (UTC)
- I do not think you misunderstood. On this, Paul, I agree with you. Although this makes it harder for me to automate the replacement of misspelt third level headings, it should result in a cleaner Wiktionary. If everyone agrees, we should probably update WT:ELE to reflect the new "valid" third level heading. --Connel MacKenzie 15:26, 5 October 2005 (UTC)
WikiSpecies
Is there any way to link to the Wikispecies project? I keep coming across things like snapdragon. Thanks. --Stranger 13:51, 2 October 2005 (UTC)
- Perhaps {{wikispecies}}? See Cocos for a usage example. -- Nick1nildram 08:25, 3 October 2005 (UTC)
- But you won't find snapdragon there (or even Antirrhinum) - but in taxonomic entries, we link there anyway - it's only a matter of time before they catch up. SemperBlotto 09:32, 3 October 2005 (UTC)
PAGETITLE (?)
I want to create a new template that uses the page title. I've seen PAGETITLE (or something similar) used elsewhere before, but Template:PAGETITLE does not exist. Could someone remind me of the syntax to refer to the title of a page, please? Thanks. — Paul G 09:47, 3 October 2005 (UTC)
- See Template:new en basic for an example of use. SemperBlotto 09:57, 3 October 2005 (UTC)
- {{PAGENAME}} when used in templates is very cool - it takes on the entry name of the entry that includes the template. So, when you preview a template, it has to use its own name (the name of the template) and therefore likely looks wrong.
- In my personal javascript file I let javascript auto-substitute the PAGENAME variable for the actual page name. Of all the features in my javascript, I think that one feature should be enabled for all (or most) users by adding that portion of the code to MediaWiki:Monobook.js. --Connel MacKenzie 19:11, 3 October 2005 (UTC)
- Thanks, Connel, I've made my new template (see "New template" below). — Paul G 09:09, 5 October 2005 (UTC)
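The {{PAGENAME}} behaviour described above, including why a template preview looks wrong, can be mimicked with a small Python sketch. This is plain string replacement standing in for what MediaWiki does on transclusion, and the template body shown is a made-up example, not the markup of any real template.

```python
def expand_pagename(template_text, page_title):
    """Rough stand-in for {{PAGENAME}} expansion: plain string replacement."""
    return template_text.replace("{{PAGENAME}}", page_title)


# Hypothetical template body, for illustration only.
template_body = "'''{{PAGENAME}}''' (plural '''{{PAGENAME}}s''')"

# Transcluded in an entry, the variable takes the entry's name.
print(expand_pagename(template_body, "hrunk"))

# Previewed on the template page itself, it takes the template's own name,
# which is why the preview tends to look wrong.
print(expand_pagename(template_body, "en-noun-reg"))
```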
Meef
I think this one should probably go, unless we are including nonce words. Rich Farmbrough 13:37, 3 October 2005 (UTC)
- Moving to Wiktionary:Requests for deletion, although I'm fairly sure this one has come up before. — Paul G 09:10, 5 October 2005 (UTC)
Copyvio on prodigal article.
I've noticed that the prodigal article seems to be a direct copy of the dictionary.com [1]. I couldn't seem to find a unified place to report this, like the Wikipedia copyvio pages, so I'm posting here. Thanks. -- Creidieki 17:10, 3 October 2005 (UTC)
- Here or WT:RFD are fine. Actually, it looks like all of the Special:Contributions/Alex011 were similar vandalism, where "clean" content was replaced with copyvios. May take a while to clean these up. --Connel MacKenzie 18:58, 3 October 2005 (UTC)
- I think I've deleted these. Would another admin please review (delete any I missed.) Thanks. --Connel MacKenzie T C 19:17, 9 December 2005 (UTC)
Hm, there is a bit of an edit war going on over facade (without the cedilla) at the moment. The content there was identical to that at façade (with the cedilla) so I expanded "façade" with the intention of making "facade" a simple redirect. One of the changes was to remove "see also facade" from the top of the page, as this became redundant. While I was making these changes, which took some time, the "see also" was restored twice.
I don't think we need anything more than a redirect at "facade", as the alternative spelling is given at "façade". If, in future, meanings are found for "facade" that don't apply to "façade" (perhaps words in other languages) then "facade" can be made back into a full entry. Until then, there is no need for a separate entry, as far as I can see. — Paul G 16:51, 4 October 2005 (UTC)
- Other languages may also have that particular spelling of the word, so a redirect shouldn't be used. And facade without the cedilla is, sadly, just as valid a spelling as what façade is. Both articles should be kept. Jon Harald Søby 16:54, 4 October 2005 (UTC)
- I strongly object to replacing real content with a redirect. Lexically, a spelling difference is a very important distinction. To have the translations section of one entry point to the translations section of another entry I have no objection to, but every other valid heading we have has the potential of being different. Pronunciation especially. In the US, these two are spoken differently. --Connel MacKenzie 16:59, 4 October 2005 (UTC)
- I agree that other languages may have that particular spelling, but none have been entered until now. The redirect can be changed to include an English entry saying "alternative spelling of façade" or something similar when those are found.
- I am unaware of any distinction in meaning between "facade" with and without the cedilla, which is why I have moved the content of "facade" without into "facade" with. As far as I am aware, they are simply alternative spellings. Connel, how do "facade" and "façade" differ in meaning and pronunciation in the US? If this is the case, then by all means we can have separate pages. — Paul G 17:08, 4 October 2005 (UTC)
- By the way, this isn't to favour one spelling over another - I see this as the same issue as "categorize"/"categorise" - write the article for one page and redirect from the other as an alternative spelling, if that is the case, although I concede that this may not be true here. — Paul G 17:13, 4 October 2005 (UTC)
- As Muke pointed out on IRC, facade is not a valid spelling of the French word. --Connel MacKenzie 17:20, 4 October 2005 (UTC)
- The ç is a secondary English letter. Michael Everson describes it, and several other characters such as è and ï, as "fundamental letters normal to the alphabet of a language, used in writing native or naturalized (non-foreign) words, but which are, in the sources, interfiled with the base letter" and suggests the loss of diacritics to be due to ASCII. Cf. [2] and PDF. (See also la:Auxilium:Abecedarium Anglicum). :p I, too, would like to see the main entry at façade and reference facade to it (whether by redirect or by explicit see-under doesn't particularly matter to me, until another language's 'facade' emerges) and list facade as an alternative spelling. And I have never heard 'facade' spoken differently from 'façade' in this country, unless perhaps in the mouths of people who are reading out a word they don't actually know. —Muke Tever 18:22, 4 October 2005 (UTC)
- Well I pronounce it differently; with the c-cedilla, it is usually meant to sound pretentious. Perhaps that is not as widespread as I thought. --Connel MacKenzie 18:30, 4 October 2005 (UTC)
- The word itself may be somewhat pretentious, but I've never heard of any variant pronunciation being any less pretentious. —Muke Tever 04:29, 5 October 2005 (UTC)
- Point conceded; my pronunciation distinction seems to be my own. Although pronunciation was a specious example, I still assert that "c" is different from "ç". --Connel MacKenzie 15:20, 5 October 2005 (UTC)
To address the renewed American English vs. British debate is perhaps misplaced. But the technique of replacing an entry was not ever an acceptable solution. The only argument for having one's "translations" section refer to another's was one of pure laziness. But to redirect any entire entry is inadequate (and certain to offend one side of the pond or the other.) --Connel MacKenzie 18:30, 4 October 2005 (UTC)
Checking http://googlefight.com/index.php?lang=en_GB&word1=fa%E7ade&word2=facade implies that the main entry should exist at facade not façade. --Connel MacKenzie 18:34, 4 October 2005 (UTC)
- On the matter of "diacritic vs. non-diacritic", Google cannot be used as a reliable source, as most people don't know how to/are too lazy to write the word with the diacritic on computers. Jon Harald Søby 19:18, 4 October 2005 (UTC)
Firstly I apologize for editing something that was underway but of course I had no idea that was the case.
Secondly and overridingly, as far as I am aware we have a reasonably long-standing policy of allowing redirects for alt spellings as a minimal effort, we expressly allow replacing such redirects with full articles, and we expressly don't permit replacing full articles with redirects.
If Wiktionary were built on something other than MediaWiki/WikiMedia (I always forget which is which), and was designed to have sections shared by multiple 'headwords' without retyping them, that would indeed be better. But it isn't, so in the meantime I think we're doing it in a way that works. We've discussed this all before several times anyway.
AHD and Collins both prefer façade over facade. Encarta and M-W prefer facade over façade. That's a 50/50 split in the opinions of reputable lexicographers right there. If the experts can't agree which is 'right' we should not give a preference to either one.
Collins doesn't provide pronunciations but the other three all agree on /f@"sA:d/. I've also heard /eI/ in place of /A:/ and I think I've heard /k/ in place of /s/. I wouldn't have been surprised if either was considered acceptable in some place by some dictionary, but apparently not.
On Google, in the past several months they stopped distinguishing diacritics and ligatures in English mode. You can type with or without and get the same results. Other language modes work differently. — Hippietrail 23:20, 4 October 2005 (UTC)
- It's not just stopping to distinguish them: It's just an unusual default mode. If you search 'facade' you will get back 'facade' and 'façade'. If you do 'façade' you will just get 'façade'. If you want 'facade' only you can search for 'facade -façade' (but if another accent, say 'facadé', appears in some other language, that will still show up in results). Presumably it's trying to be helpful but it's not so much so for those of us who are studying the diacritics. —Muke Tever 04:25, 5 October 2005 (UTC)
- Um, facade -façade still gets a lot more google hits than façade -facade. And facade is still not a French word. --Connel MacKenzie 04:45, 5 October 2005 (UTC)
- True, "facade" is not a French word, but I don't see how the redirect suggests that it is. Perhaps the redirect should be replaced with an English entry cross-referencing the spelling with the cedilla and a "see also" for other languages. — Paul G 08:52, 5 October 2005 (UTC)
- I think one solution is as I have just suggested above: to note on the page for "facade" that it is an alternative spelling of the English entry, and nothing more, until such time as we find that it is a word in other languages.
- I wonder whether, if "façade" had just been created and "facade" had been made a redirect (or cross-reference) to it, we would be having this discussion. There seem to be two issues here: whether it is ever acceptable to replace content with a redirect when that content is made available at the page that is redirected to, and whether "facade" is in any way distinct from "façade". My current view is that the latter is not true and so the redirect is sufficient.
- Hippietrail, thank you, but no need to apologise - if I'd known I was going to take so long to edit the page, I would have put a "editing in progress" lock on it. — Paul G 08:59, 5 October 2005 (UTC)
- Your example is having immediate implications. Newcomers are entering redirects for other word forms now, as well as requesting articles to be removed and replaced with redirects. Here on the English Wiktionary, we make a distinction at a word's spelling. Would you like to change that basic premise of how the English Wiktionary functions? Shouldn't we take a vote on that or something? (edited at) --Connel MacKenzie 23:04, 5 October 2005 (UTC)
- With Muke's help, I'm preparing a couple templates that might help clarify the entries and why we (Wiktionary) would like in general to prescribe the c-cedilla form. (Yes, I certainly can concede that.) Muke also pointed out some of the flaws in my desire to cite the Object Oriented Programming uses for facade only. So yes, it does look like most parts of the entry at facade should point to façade. When done and everyone has agreed the Template:diacrit should be added to both. Template:nodiacrit will be for terms like İstanbul, and Template:prondiacrit for entries like Zurich/ Zürich. --Connel MacKenzie 23:04, 5 October 2005 (UTC)
Whoa slow down. I really don't think it's so simple at all. We should be looking at print dictionaries and at the words and at usage guides etc. Not all authorities are in favour of keeping the diacritics. Unfortunately I made a typo above that made it look like all dictionaries recommend façade but in fact Encarta and Merriam-Webster both recommend facade and only give façade as secondary. This is bound to vary from dictionary to dictionary and country to country and has been happening since way before the internet or ASCII. Also look at the cousin of the diacritic: the ligature. In fact I have been building an index of words in English which use both: Category:English words spelled with diacritics or ligatures.
Also note that User:Ed Poor is doing something to merge at least these two Korean articles: 을 and 를 using another system of his own. It looks like it may well be time to vote on what we do for redirects, overwriting articles with redirects, etc. And also do a lot more research about how English really handles diacritics (and ligatures) before running off making templates etc. I don't feel those templates reflect how any dictionaries treat these words at all. I don't think they are what we need. Maybe categories or something more subtle. — Hippietrail 00:02, 6 October 2005 (UTC)
- To reiterate a point above, there must be an entry at façade because this is the only correct ==French== spelling. Where you want to put the ==English== definition is a fight I'm not interested in. Davilla 07:51, 22 October 2005 (UTC)
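The "interfiled with the base letter" idea that runs through this thread, where façade and facade are treated as the same word for filing or searching, can be sketched in Python with Unicode decomposition. This is only an illustration of folding diacritics onto base letters; it is not a claim about how Google or any print dictionary does it, and the function names are made up for the example.

```python
import unicodedata


def fold_diacritics(word):
    """Strip combining marks after canonical decomposition (ç -> c, ü -> u)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_base_spelling(a, b):
    """True if two spellings differ only in diacritics."""
    return fold_diacritics(a) == fold_diacritics(b)


print(fold_diacritics("façade"))
print(same_base_spelling("Zürich", "Zurich"))
```

A comparison like this shows which entries differ only by a diacritic, but it says nothing about which spelling should be the main entry; that remains the editorial question debated above.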
New template
I'd just like to announce a new template: {{en-noun-reg}}. Put it underneath the ===Noun=== header to get the noun and its plural when the noun is regular, for example, when used for "hrunk", it would give you:
- I fail to see the value of this template. What it represents is not that much longer than the template. If we want to include the plural for all words that's fine, and I can adopt the suggested format, but I will still write it out in full instead of having to recall yet another unnecessary template. Eclecticology 23:20, 8 October 2005 (UTC)
- One immediate benefit is that we no longer quibble about general formatting on individual pages, but on the talk pages of the templates only.
- No one is saying this must be used by everyone. I like it. I like using it, too. I find it highly superior to other such formatting attempts that managed to disrupt the machine-readability of entries by doing such things as commingling the pronunciation sections within the definitions area.
- Another benefit of Paul's templates is that they reduce the number of typos. Not often, but just today, I came across an entry that had the word spelled incorrectly on that line.
- It also allows for a standard look for entries. Additionally, using what-links-here on the irregular template automatically gives a list of irregular plurals. That alone is golden.
- --Connel MacKenzie 00:03, 9 October 2005 (UTC)
Noun
hrunk (plural hrunks)
By analogy with the templates that give verb inflections, others could be written for nouns that take -es and nouns that have irregular plurals; in the latter case, the irregular plural would be an argument to the template. — Paul G 09:31, 5 October 2005 (UTC)
- OK, I've made {{en-noun-reg}}, {{en-noun-reg-es}} and {{en-noun-irreg}} and described their use in Wiktionary:Index_to_templates. — Paul G 10:08, 5 October 2005 (UTC)
- Paul, these are excellent! Thank you. --Connel MacKenzie 17:02, 5 October 2005 (UTC)
- Well, I like this. In the true spirit of Wiktionary, however, I'm going to voice a complaint. I don't like the italicised "plural" you have there, but I can live with it, so long as we get standardisation. Thanks! --Wytukaze 17:06, 5 October 2005 (UTC)
- As far as I've seen, italics is the standard. Jon Harald Søby 19:59, 5 October 2005 (UTC)
- Wytukaze, you should clarify your joke, before someone takes you seriously. (I too thought it was funny.) --Connel MacKenzie 20:30, 7 October 2005 (UTC)
- Would the addition of a mass noun template (with no plural & possibly a mass noun category) be a useful addition?MGSpiller 20:37, 5 October 2005 (UTC)
- To answer my own question I think the {{uncountable}} is most of the answer. Though I'm tempted to ask for a rename of the associated category to English uncountable nouns. MGSpiller 21:01, 5 October 2005 (UTC)
Thanks, Connel. I've now added templates for plurals of French nouns and comparatives and superlatives of adjectives (although I managed to mess up the tables while I was at it - please could someone fix them if they can).
Wytukaze, it is fairly standard for print dictionaries to italicise terms such as "plural" and "comparative". This adds to readability as it shows that these words are there to indicate supplementary material rather than as part of the headword or its definition. For this reason, I think it makes sense for Wiktionary to do the same. — Paul G 10:18, 7 October 2005 (UTC)
- Alas and alack, I was too late. As Connel stated, it was just a joke, a jab at our current penchant for politics, rants and arguments. You may well note I don't italicise "plural" when I'm linking to one, but that's just because I never have. Using this template, I can and will, or perhaps it's more a case of "won't be able to not". I hope I don't need to stick my tongue out before someone gets offended; I agree with your assertion, Paul, and I thank you again for being the only one of us to realise this template is a good idea. --Wytukaze 15:35, 9 October 2005 (UTC)
Noun templates updated with templates for uncountable nouns, as requested by MGSpiller, and for nouns ending in consonant + y. I didn't get the joke - sorry. Never mind. — Paul G 15:07, 11 October 2005 (UTC)
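The choice between the noun templates discussed above can be sketched in a few lines of Python. This is only an illustration, not how the templates themselves work: the function name `plural_rule` is hypothetical, the list of endings that take -es is an assumed heuristic, and "consonant+y" is a plain label rather than a template name, since the thread does not name that template.

```python
VOWELS = set("aeiou")

def plural_rule(singular, irregular_plural=None):
    """Return (rule, plural) for an English noun.

    The rule is one of the template names mentioned in the thread
    (en-noun-reg, en-noun-reg-es, en-noun-irreg) or the plain label
    "consonant+y". The -es endings below are an assumed heuristic.
    """
    # Irregular plurals cannot be derived, so they are passed in,
    # just as the irregular plural is an argument to the template.
    if irregular_plural is not None:
        return ("en-noun-irreg", irregular_plural)
    # Consonant + y: drop the y and add -ies (city -> cities).
    if len(singular) > 1 and singular.endswith("y") and singular[-2] not in VOWELS:
        return ("consonant+y", singular[:-1] + "ies")
    # Sibilant endings take -es (box -> boxes).
    if singular.endswith(("s", "x", "z", "ch", "sh")):
        return ("en-noun-reg-es", singular + "es")
    # Everything else just takes -s (hrunk -> hrunks).
    return ("en-noun-reg", singular + "s")
```

Anything the heuristic gets wrong (for example nouns in -o) would have to be handled by passing the plural explicitly, which mirrors the role of the irregular template.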
'Ye Olde Queene' and such; do archaic spellings deserve entries?
I've been pondering if words should be entered 'yn' the odd spellings as they might be found in old documents. Of course, there might be quite a lot of them, and a lot of old documents would have to be read, too; also, without an institutionalized spelling system there may be a ridiculous range of different spellings for many words. But it might be a fun undertaking as well as informative to anyone who wants to look up words from an old source. Citizen Premier 01:01, 7 October 2005 (UTC)
- I'm pretty sure I'm not the only one who assumed we were going to include old spellings. After all, people need to read old writing and will come across words they need to look up. We definitely have a few old spellings of words in German and Russian already.
- By the way, ye is not an old spelling of the. But people have been using it for a while thinking it is so it probably deserves an entry anyway (: — Hippietrail 01:59, 7 October 2005 (UTC)
- It would generally be pointless to try to include all these old(e) variants, particularly when the only change has been the suppression of a silent "e". Ye is a bit special given its widespread misinterpretation. My Oxford calls it a "pseudo-archaic" graphical variant. the was originally written "þe", and as the thorn fell into disuse in English it often became a "y", though it apparently retained its pronunciation. Pronouncing it as "ye" is a more modern development. Eclecticology 22:36, 8 October 2005 (UTC)
Oops
In adding new templates for nouns and adjectives to Wiktionary:Index_to_templates, I've messed the tables up and don't know how to fix them. Could someone who knows what they are doing please fix them? Thanks. — Paul G 10:08, 7 October 2005 (UTC)
- You were very close; all better now. Very nicely done, Paul! Are you planning on renaming the -infl-'s to -verb- next?
- My only nitpick (about that section of WT:I2T) is that I thought we were having languages be each under separate headings...since each language seems to have its own set of conjugations (that don't correspond very well to English or others). That has the secondary benefit of making the wiki tables themselves a little simpler. --Connel MacKenzie 06:17, 8 October 2005 (UTC)
- OK, I've reorganized those tables and sections a bit. (I wikified derived terms in the ADJ templates while I was there.) Everything look OK now? --Connel MacKenzie 07:06, 8 October 2005 (UTC)
- Thanks for the clean-up. I don't intend to change -infl- to -verb- as these are not my templates, and the -infl- ones are already used in many pages. Using -verb- rather than -infl- would make sense if these templates were being created now, but I think we might be stuck with the -infl- forms now.
- I noticed that someone has added categories (eg, English nouns) to the templates - good work! — Paul G 11:12, 17 October 2005 (UTC)
Hypothesized Indo-European root words
Wikipedia is well on the way to sending several articles on hypothesized Indo-European root words in our direction. My view is that they probably won't satisfy our attestation criteria, being not attested in works independent of the work where they are hypothesized (Indogermanisches Etymologisches Wörterbuch). They are listed in an appendix in AHD, rather than in the main body of the dictionary, and most mentions of them that I can find use an asterisk prefix to denote a hypothesized form. Please contribute to the discussion at w:Wikipedia:Articles for deletion/Indo-European root word articles. Whilst you are there, you may want to pay a side-trip to w:Wikipedia:Articles for deletion/Steezy, too. Uncle G 04:10, 8 October 2005 (UTC)
- Um, Pokorny's Wörterbuch is not the only reference on Proto-Indo-European. (Indeed, it is ridiculously outdated—if you see a laryngeal spelled in any PIE word, it postdates Pokorny, and many roots were discovered after him.) It's also ridiculous to say that a language doesn't meet attestation criteria due to independence — for that, say, you'd have to throw out Gothic and Avestan as well — we have other criteria that would apply here, such as the appearance in well-known works (the Wörterbuch being one) and the appearance of roots in academic journals. (Note that an argument in that CFI you favored that works better than independence is that they are not—except in rare cases like the Owis ekwoskwe story—used to convey meaning.)
- Nevertheless, I would object to the addition of Proto-Indo-European roots as anything other than an appendix vel sim. here: first off, the major POV issue of the forms of the roots and stems themselves, many of which differ from expert to expert; secondly, the spelling is not at all uniform: the palatals may be spelled *ḱ, *ǵ, or *k̂, *ĝ, and sometimes even *c, *j in ASCII, and that's with sidestepping whether the writer even considers them to exist at all; the semivowels are variously spelled as i̯, u̯, or *y, *w, or *j, *u̯..., mainly depending on the language habits of those writing the text; the laryngeals—again, even without the question of how many they may be—may be spelled *H₁, *H₂, *H₃...; *ə₁, *ə₂, *ə₃..., various notations for syllabic resonants, certain consonant clusters, e.g. *tḱ vs. *kþ; the question of whether or not to mark stress when it is known; basically, what you have is an NPOV nightmare on your hands with just the orthography. The semantics, also, may be much in debate. In short, the whole thing is worse than a conlang, and rather than having the whole mess flying across the main namespace at the whim of whatever soi-disant expert may next arrive it would be better to keep it contained. —Muke Tever 07:03, 8 October 2005 (UTC)
I agree. The best place for these is probably merged in a Wikipedia article on Proto-Indo-European roots. They can also keep "steezy" in the absence of any verification. Eclecticology 22:08, 8 October 2005 (UTC)
Four dashes
I see that there is a sort of tradition to put a horizontal line (four dashes, ----) before a new language in entries. However, I find this utterly ugly and disturbing with one line before a "==level two header==", as there is also automatically one line just beneath it. So, is there a possibility that this tradition might end? Jon Harald Søby 13:22, 9 October 2005 (UTC)
- I don't mind either way, but I speak only for myself. I do like the consistency we seem to have now with it though (after English section.) Are you volunteering to 'bot-remove all "----\n"'s if this is approved? --Connel MacKenzie 19:21, 9 October 2005 (UTC)
- The practice is inherited from before we had the current skin, when ==level two headers== didn't come with a line below. The removal of these lines was discussed when the monobook skin was introduced, but no consensus was reached, IIRC. One thing to take into account is that some of the regular contributors (not me) have admitted to not using the monobook skin and thus don't have the line beneath the level two header. —Muke Tever 01:01, 10 October 2005 (UTC)
I believe Ec uses another skin so his input would be valuable. I'm easy. If everybody wants to get rid of the ---- then I have no problem with it. I do think it delineates sections a little better but I also agree it is ugly on Monobook. I do want consistent formatting though, and it will be a lot of work to remove these from all articles - if we decide to do it I am in favour of seeking help from the devs since they can make such large-scale changes in a tiny fraction of the time even a bot can. — Hippietrail 15:27, 10 October 2005 (UTC)
- Connel: Well, I don't know how to write a bot, but if someone did, I would be more than happy to run it… Jon Harald Søby 15:34, 10 October 2005 (UTC)
Yes, I use the Classic skin, which does not put in the line after a level 2 heading. A few of the other skins also do not automatically add this line. I find the sans-serif type used in the default skin to be smaller and more difficult to read. A feature which gives contradictory results depending on which skin a person uses could be seen as a bug. Hippietrail, since we have discussed stylesheets in another context, is it possible to amend the Monobook stylesheet to suppress the line after the heading? Eclecticology 07:30, 11 October 2005 (UTC)
- It is possible, but IMO it would be better to add it to the other skins… Jon Harald Søby 14:25, 11 October 2005 (UTC)
- Ec, I only recall that Classic has no counterpart of Monobook's monobook.js - I don't know if it has a .css file - or if this would be sufficient. — Hippietrail 15:20, 11 October 2005 (UTC)
- Classic does have such a counterpart; for historic reasons, it's at standard.js (and standard.css). —Cryptic (talk) 20:03, 23 October 2005 (UTC)
- It could just be added to MediaWiki:Common.css. It doesn't exist here, but it exists (and works) on Wikipedia… Jon Harald Søby 15:26, 11 October 2005 (UTC)
- I usually look at the pages in edit mode and I don't really care all that much how they look in the different skins. It's the content and its consistency that matters, as far as I'm concerned. For a long time I was also using an alternative skin, but now I conformed to the default. I like those four dashes. They delineate where a new language entry starts. Polyglot 20:01, 23 October 2005 (UTC)
I would like to see the lines removed. The lines not appearing with the other skins doesn't matter, visitors will see the pages in the default skin. Who are we writing this stuff for, ourselves or everyone?! Gmcfoley 20:19, 8 November 2005 (UTC)
- Agreed. —msh210 04:39, 9 November 2005 (UTC)
I just noticed that User:Jon_Harald_Søby has taken it upon himself to start removing these [3]. Has this decision been made?
- No, this decision has not been made, all we have is 1 month old talk about what skins people use. Can you really blame him for getting tired of waiting? Gerard Foley 17:06, 25 November 2005 (UTC)
- Well since he is waiting for a decision, which could go either way, yes I can blame him. He is trying to force the decision. We have not heard enough from those who might want to keep them. This is precisely what the Beer parlour is for. — Hippietrail 17:43, 25 November 2005 (UTC)
- I'm not trying to force anything. Anyways, there is no standard in having the lines either, it is about 50-50. And yes, I'm tired of waiting… That is the problem of trying to solve things with consensus, the discussions usually just die out without any clear solution on what to do. Jon Harald Søby 20:16, 26 November 2005 (UTC)
How long should he wait? The last edit was 2 weeks ago, in support. Reading the comments this is my interpretation of where people stand on this.
Keep the extra lines
- Polyglot
Remove the extra lines
- Jon Harald Søby
- Gerard Foley
- Msh210
Don't care
- Connel MacKenzie
- CLARIFICATION: Keep. I perhaps didn't care, but having heard several reasons why they should stay, I am inclined to keep them. Whomever changed the default skin to "hide" the four dashes should revert that change immediately. They are very useful on non-main namespace entries; the only dispute I've heard is about their appearance on main-namespace entries. But even for them, I think we should retain the single "----" between English and other languages. --Connel MacKenzie T C 17:32, 28 November 2005 (UTC)
- It was I who changed the default skin (and whomever is for the object case d-;). I've explained it below. But your being against it is enough for me and I'll revert it. It's extremely easy for those who hate them to put the same change into their personal css page. — Hippietrail 01:44, 29 November 2005 (UTC)
- Thank you. I agree the conversation should remain open. Perhaps there is a way to limit those sorts of masking changes to main namespace entries? Maybe the older skins should be updated to match? Or maybe Monobook shouldn't add the second <hr>? --Connel MacKenzie T C 05:55, 29 November 2005 (UTC)
- Hippietrail
I think it's understandable that he started to remove the stupid things. I added them back into kamera, they look very bad and confusing, while not adding anything of value. Gerard Foley 23:52, 25 November 2005 (UTC)
To the people looking for consistency, I just looked up color, the extra line is used above Spanish, but not above Latin. With this system, we can't even get consistency in the same article. Also, if the issue comes up every so often, it must mean that most people don't want them. I can't see people asking Why don't you add an extra line above the language name? very often. Gerard Foley 19:33, 26 November 2005 (UTC)
I would think that in the interests of playing fair, anybody would wait for a decision to be made, just as we encourage over on RFD. Currently I would consider Ec capable of making a deciding decision since he will be the most affected, but also anybody else using another skin. I don't think it's too much to ask people to be patient or keep commenting here in the meantime. — Hippietrail 01:15, 27 November 2005 (UTC)
WHAT? Look, you have had a month and a half to say you want to keep the extra lines, only 1 person has. Ec is not the person most affected, it is Mr. and Mrs. Joe Bloggs who visit this site. Now I am going over to Wikipedia to tell them all that they have to start inserting extra lines as I want the articles to look good in the classic skin. I don't think that will work somehow. Gerard Foley 01:42, 27 November 2005 (UTC)
I certainly hope you will take this same attitude when an issue arises against your own personal preference. Umm have fun telling things to those at Wikipedia I guess... — Hippietrail 16:54, 27 November 2005 (UTC)
I added myself to the "keep" section (albeit a weak one), since I think they are a benefit when in the edit-mode. \Mike 17:05, 27 November 2005 (UTC)
- Note: I removed Mike from the keep section, as it is not a vote, only my own thoughts. Gerard Foley 21:54, 27 November 2005 (UTC)
- Ok, sorry for my misunderstanding. But I still keep that opinion, though :) \Mike 22:00, 27 November 2005 (UTC)
Four dashes (merged from RFD)
Question: User:Gmcfoley and User:Jon_Harald_Søby have begun removing the four dashes (----) that separate the language sections on each page. Is this an approved formatting change? I think it’s a bad idea ... I think the four-dash separator is needed. —Stephen 11:27, 27 November 2005 (UTC)
- Can you not continue the discussion in the Beer Parlour? In any case, if you want a line above the second header elements, it should be added with CSS, not four dashes in front of every single second level header in every entry. The dashes are archaic no matter how you look at it. Jon Harald Søby 11:35, 27 November 2005 (UTC)
- Since they are not archaic as I see it, that statement cannot be correct. Without the line, the pages are confusing, and typographically they don’t look good. —Stephen 11:51, 27 November 2005 (UTC)
- Still, if there should be lines, they should be added with CSS and not with four dashes. The <HR> HTML element is AFAIK archaic in XHTML (though I'm not 100 % sure there). Jon Harald Søby 14:09, 27 November 2005 (UTC)
- Who says it should be with CSS? If there should be lines, why should they not be with four dashes? I think what you’re doing amounts to vandalism. —Stephen 15:05, 27 November 2005 (UTC)
- Please. Maintaining a system where one should add four dashes in front of every header is inconvenient, and I hope you see that too. But fine, I'll stop removing them until a solution is reached. By the way, is there a specific reason why you want these lines there? "[…] typographically they don’t look good" is something I absolutely don't agree to. IMO, they are very ugly, and ruining the appearance of Wiktionary. Jon Harald Søby 15:25, 27 November 2005 (UTC)
- I think the appearance of the four dashes in the entry edit-text is very helpful. Also, rendered, I think the lines do "look good." If the community decides they don't like them, I'll shrug. (Perhaps a similar previous comment of mine, was how you misinterpreted my indifference?) But I doubt that all contributors (or even a majority) have your POV about their appearance. --Connel MacKenzie T C 17:39, 28 November 2005 (UTC)
- The appearance of the four dashes in the entry edit-text is helpful, but this can be achieved with comments. Also, if the two lines look so good why was this not done from day 1, with four dashes before and after the heading? Gerard Foley 17:54, 28 November 2005 (UTC)
I think I said somewhere the first time this came up that I don't like how these look in monobook. But I find them very helpful in edit mode and since we offer a choice of skins we should endeavour to look our best in all of them. I am not in favour of impatience or forcing a decision on a minority without due discussion of alternative fixes that could make everybody happy.
It is in keeping with hoping to keep everybody happy that I just modified MediaWiki:Monobook.css to hide the manual HR element caused by the ----. The other horizontal lines which are part of the level 2 headings are unaffected. So far I haven't seen any other side effects but perhaps some exist and should be brought to our attention.
But this change was made before I saw Stephen's comments. So please treat it as experimental at this stage. There is surely more that can be done with CSS to keep "really everybody" happy. At the moment however it seems that enforcing the ---- to a) keep edit mode clear and b) keep non-monobook users happy is a good way to move forward. More comments and suggestions welcome. Please no shouting or impatience. — Hippietrail 01:45, 28 November 2005 (UTC)
- That seems good enough. And on a second note, I wasn't trying to force anything, it was just that there had been no activity here for a long time, and most of the response was in favour of removing the lines. That was why I started removing the lines. Jon Harald Søby 09:23, 28 November 2005 (UTC)
This was written before Hippietrail's latest comments. I have to agree with Jon, there was no activity here until he started to remove the lines, at which point I started also. And by my count there were 3 in favour and only 1 against. I clearly marked in the edit summary what I was doing. As for my shouting, I did get angry at the suggestion that Ec is the most important person here, and I apologize. Gerard Foley 15:07, 28 November 2005 (UTC)
- The community-friendly way to behave would have been to keep commenting and perhaps adding suggestions. Now I can only hope that the impatient deleters can turn around and patiently repair their hasty damage ): — Hippietrail 14:49, 28 November 2005 (UTC)
- I'll put it in wherever I edit, just as I took it out earlier. I won't retrace my edits just to add it in again, but revision of the Norwegian stuff (difference between Bokmål and Nynorsk is sorely needed – so far I have only used the Bokmål variants under the ==Norwegian== header, which is technically incorrect) will lead me to edit about all the places I have edited earlier. In other words, it will be fixed. =) Jon Harald Søby 08:55, 29 November 2005 (UTC)
- I must confess I have taken these lines out on occasion - I think from skimming this discussion earlier I had somehow got the impression that a majority of people were against them. I certainly find them unhelpful and inelegant. Widsith 13:53, 1 December 2005 (UTC)
We should keep the lines. I think they're very helpful when editing and they do look good. --Dijan 06:36, 2 December 2005 (UTC)
Delete the lines. There is already a line under each heading (Chinese, Japanese, etc.) It's a waste of bandwidth and completely unnecessary. Badagnani 02:09, 2 January 2006 (UTC)
Keep. I like the extra visual separation. Millie 02:57, 2 January 2006 (UTC)
I don't know if there is any use in contributing to an archived thread, but my vote is to remove the lines from the Wiki entries and move it to CSS. CSS is meant for layout, Wiki is meant for content and structure. Also, when placed in CSS, you will end the problem of inconsistency, since the lines will then show up either always or never. I would really like a decision on this point, but would like to stress that we have two questions here; do we want an extra divider line above each level 2 heading, and do we really need that line to be in WikiText instead of in CSS. -- Pbb 19:44, 6 March 2006 (UTC)
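For anyone weighing up the 'bot-remove all "----\n"' option raised earlier in this thread, the text transformation itself is small. The following is a minimal sketch only, assuming the separator is a line of four or more dashes sitting directly above a level two header; the function name is hypothetical, and fetching and saving pages (the actual bot part) is deliberately left out.

```python
import re

# A line made up only of four or more dashes, immediately followed by a
# line that starts a level two header ("==" but not "===").
RULE_BEFORE_L2 = re.compile(r"^-{4,}[ \t]*\n(?===[^=])", re.MULTILINE)

def strip_rules_before_language_headers(wikitext: str) -> str:
    """Remove ---- separator lines that directly precede a ==header==.

    Separators that are not followed by a level two header are kept,
    so uses outside entry language sections are left alone.
    """
    return RULE_BEFORE_L2.sub("", wikitext)
```

Restricting the match to separators directly above a level two header is a design choice that would leave other uses of ---- (for example on discussion pages) untouched; a real run would still need community approval and a check for blank lines between the separator and the header.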
Experiment format for translating dictionary entries for non idiomatic phrases
Please read this and leave comments and suggestions before blindly reverting boiled egg
I have put this article in a new experimental format for translating dictionary entries for use with non idiomatic phrases which normal dictionaries never include. For now I have used the level-3 heading "phrasebook", but maybe "translating dictionary" is better, though I don't feel that term to be a set phrase itself. "Bilingual dictionary" is more common, but we deal with more than 2 languages and I also don't think "multilingual dictionary" would mean much to most people.
For phrases which are very common and very old and plainly obvious and not included by any well-known print monolingual dictionary, I feel very strongly that we should be traditional rather than radical. I also think it comes close to the "no original research" rule, but that's of less importance than what this dictionary wants to be compared to what every good dictionary before has been and continues to be. Do we really want to be something different? — Hippietrail 15:22, 10 October 2005 (UTC)
- I disagree with your assumption that boiled egg does not merit an English entry. --Connel MacKenzie 15:42, 10 October 2005 (UTC)
- Please elaborate, for keeping it and others like it makes us depart radically from what dictionaries have traditionally been. This term is so obvious as to not be in the online American Heritage, Collins, Encarta, or Merriam-Webster. And I would be very surprised if it is in any of the well-known English print dictionaries. Why would you like Wiktionary to be so different from these respected dictionaries? If we make this radical departure I feel we should advertise it everywhere, not least in our Main Page. — Hippietrail 22:10, 10 October 2005 (UTC)
- I don't know what you mean. --Connel MacKenzie 06:15, 11 October 2005 (UTC)
- I do not consider Wordnet, which all of your links bar one specifically use, as a more professional dictionary than Wiktionary. I consider it on a very similar level and susceptible to the same pitfalls as us. I am talking about established dictionaries such as AHD, Cambridge, Chambers, Collins, Duden, Encarta, Gage, Langenscheidt, Larousse, Longman, Macquarie, Oxford, Penguin, PONS, RAE, Robert, Van Dale, Websters. The other link you gave was to Ultralingua, which uses an edited version of Wordnet. — Hippietrail 15:44, 11 October 2005 (UTC)
- The discussion here is not about whether we should keep the specific expression boiled egg or not. The issue is a more general one about a problem that comes up repeatedly about a wide range of terms.
- Indeed. That is a separate discussion over on RFD. — Hippietrail 15:20, 11 October 2005 (UTC)
- If kept, I think that simply calling this a "noun phrase" would be sufficient. While there might be some validity for using "phrasebook", I think that what we normally want at that point in an article is a grammatical identification that gives us an idea of how the term functions in a text. I find "phrasebook" to be too collective as a term. Talking about translations or other languages at that point in an article takes us even further into unknown territory. There is adequate place later in the article for dealing with translations.
- There is one criterion for keeping these terms that is suggested by the translations. Does a literal translation of the component parts of a phrase yield a reasonably equivalent result in major languages? Eclecticology 08:10, 11 October 2005 (UTC)
- Ec, I would like your opinion on whether to keep, and if so, whether to include definitions for these obvious phrases. Where do you believe we should draw the line - where established dictionaries have traditionally drawn it, or where Connel and some others wish to draw it? — Hippietrail 15:20, 11 October 2005 (UTC)
- The Pawley list in the rfd discussion for fictional character could be a good starting place. Generally my position on these is clearly more conservative than Connel's. A key factor should be that there needs to be something more to back these word combinations than that they are used a lot. Using the expression set phrase doesn't help us because there is bound to be a difference of opinion about what that really means, and whether it can be used as a synonym for idiom. We can easily find Google hits for these terms, but that doesn't help us because those hits don't tell us whether there is anything special about the usage. Someone who wants to keep one of these needs to be able to positively make a case for it. I hope this doesn't sound too evasive, but I do find this a difficult question where I hope to maintain some openness. Eclecticology 17:17, 12 October 2005 (UTC)
- I'm in favour of keeping a non-idiomatic phrase if it is unambiguously associated with a certain object and there is at least one language in which different word by word translations are conceivable. I'm indifferent about whether or not to give a definition. My point of view about the level 2 header is well-known: boiled egg is a noun, so let's classify it as such. Ncik 23:48, 12 October 2005 (UTC)
- On fr:, we keep a lot of phrases (we call them locutions), and we treat them (the idiomatic phrases too) exactly like the others, with definitions, pronunciations, etymology, synonyms, etc. Just take a look at fr:Catégorie:Locutions françaises, or fr:naine jaune to see the results. Phrases are just seen as forms like initials, acronyms, or simple words. The only question is - of course - which phrases should be kept in Wiktionary. Actually, we didn't have any problems with that... It seems odd to me not to describe boiled egg in its own article for example. - Dakdada 23:25, 22 October 2005 (UTC)
Japanese Kanji
As it stands, there seem to be three separate categories for Japanese Kanji: Category:Kanji, Category:Japanese kanji and Category:Japanese Kanji. "Japanese Kanji" is even listed as a separate language. As I only just created an account and don't know much about the workings of Wiktionary yet, I was wondering if anyone else had any thoughts on how this should be handled. TheIncredibleEdibleOompaLoompa 05:48, 11 October 2005 (UTC)
- Your criticism is valid. The word "Japanese" in these headings is redundant. Chinese and Korean use the characters but they have their own names for them. I would suggest, if you're looking for something to do, editing the entries under both longer names to simply be in Category:Kanji. When the contents have been moved out it will be an easy task to get rid of the disused categories. Eclecticology 08:33, 11 October 2005 (UTC)
- I agree they should be merged. I'm not so sure about keeping just Kanji mainly because the corresponding Chinese term hanzi and Korean term hanja are so rare that though we define them, I have never seen either one in any major print dictionary or online dictionary - and I have looked. Also the fact that in articles we already use the full Japanese Kanji, Chinese Hanzi, Korean Hanja. I'm not necessarily arguing against Kanji on its own but these points are worth considering. — Hippietrail 15:20, 11 October 2005 (UTC)
- I have to apologize that I had already started working on it before I found your discussion. I'm integrating Japanese kanji categories into Category:Japanese kanji and Chinese ones into Category:Chinese hanzi. Not touching Korean categories. I only intended to change the discouraging situation among the categories immediately while I feel the preceding language names are surely arguable (yet I'm considering it as the best way). You can regard the change as interim and make further improvement on it. As I think my modification won't make things worse and can also help bots to work on this later, I'll go ahead and finish it up in a few days. -- Tohru 18:43, 30 October 2005 (UTC)
Basic flaw in Wiktionary--What is a 'word'??
In 'Wiktionary: Entry Layout Explained' it says that an entry is '...a given sequence of letters'. This is a very unusual way to define 'word' for a dictionary, and means that entirely different words are grouped under the one headword merely because they happen to be written using the same string of letters (in some cases they're not even pronounced the same). As far as I can see there has never been any discussion about this, and no reason given for adopting this approach. I certainly can't see any justification for it and have never seen such extreme lumping in any dictionary that I'm familiar with. As well, it seems to contribute greatly to the complexity of entries. Has anyone got any comments about this? Is everyone happy with it? (Wiktionary is having problems keeping me logged-in, so my username may not appear: it's Dougg.) Dougg 01:33, 13 October 2005 (UTC)
- Is "remember across sessions" checked on the first page of your preferences?
- Maybe we should be using "lexeme" instead of "word". There was some discussion of the points that you raise in the very earliest days. How would you deal with homographs? Since you can't have two articles with identical names splitting these up would create a whole new set of problems. Eclecticology 08:44, 13 October 2005 (UTC)
Yes, I had that checked. When I came back a bit later and tried to log in it wouldn't let me, saying 'there's no such user'. I went away and came back again a while later and it was working, so hopefully it stays that way.
Yes, possibly lexeme instead of word, though a lexeme is definitely not a string of characters either. While a very few dictionaries don't bother with any distinction between different entries that have the same orthographic form, the usual way is to use superscript numbers (but there are other approaches). Surely this is easy to do, and would greatly simplify the structure of entries? As for searching, isn't it possible to have disambiguation pages, as in Wikipedia, to handle this if necessary? Dougg 11:56, 13 October 2005 (UTC)
- Of course, "lexeme" and "string of characters" are not equivalent, but these discordances are bound to happen when the terminology is taken from different fields of study. It is dubious that the "leet" nonsense entries would be valid lexemes, but they are graphemes, and clearly strings of characters. While it may be useful to employ disambiguation pages at some future time when articles have become much bigger, I think that it is now still better to see all the possibilities on one page. A dictionary that uses superscripts to distinguish homographs is still able to see them together on the same page. Going to these entries is not just a matter of putting something in the search function. We also want to be able to Wikify other articles where the reader wants to understand a word that is there. Can a disambiguation page make a distinction that is clear enough for such a searcher? Eclecticology 18:46, 13 October 2005 (UTC)
I'm not sure what you mean when you say Terminology taken from different fields: lexeme is from lexicology, but string of characters is not specific to any field of study and is commonly used in linguistics, especially computational linguistics. While I won't comment on whether or not leet is a lexeme, it is not a grapheme (a symbol which represents a phoneme). Yes, it is a string of characters, but so is 12(*.&.
Ok, I've got no objection to having orthographically identical words on the one page, but in my opinion where they are different words they should always be identified as such through some mechanism. This makes it possible to distinguish between polysemy and homography, and better represents what native speakers know about the language (which is surely what dictionaries try to do).
When you say ...Wikify other articles... do you mean having automatic links between Wikis so that where a word is unfamiliar the user can click on it and be taken to Wiktionary to see a definition? If so it sounds like a good idea, but surely separation of different words is even more important as you want the user to be presented the relevant definition, not a (potentially huge) bunch of unrelated definitions. This sort of thing is pretty well established in computational linguistics, at least to the level of getting the right word--getting to the correct sense is still hugely problematic. Dougg 00:24, 14 October 2005 (UTC)
- By "string of characters" I would draw from computer jargon, and use it for anything that can be represented. The "leet" material refers to such entries as "1337" or "pr0n" which include numbers treated as though they were letters. Personally, I would be delighted to outlaw them completely, but we have too many contributors who believe they really should be in the dictionary.
- I agree that orthographically identical words should be identified as different words when that is the case, whether by language, etymology, pronunciation or part of speech. Our degree of success is another matter.
- You correctly interpret the term "Wikify". As things stand any word can be turned into a link whether or not an article already exists, and a red link will simply indicate that nothing has been done with it yet. The only practical approach still seems like having one page where this can all be sorted. Eclecticology 06:09, 14 October 2005 (UTC)
Looking up words you cannot spell.
Is the Wiktionary any good for looking up words you can't spell? I tried a couple of intentional misspellings (unecessary and curiculum) and it appears not. But I would have thought you'd want a dictionary to help you with that. --Bodnotbod 03:59, 15 October 2005 (UTC)
- It would be a good idea, however care would have to be taken not to confuse "alternative spellings" with "misspellings." We should probably do various misspellings for long words primarily, as shorter words can be found by variation. Citizen Premier 04:19, 15 October 2005 (UTC)
- Am I right to assume that the model you're thinking of consists of pages created for predictable misspellings and creating redirect/disambiguation pages? --Bodnotbod 05:09, 15 October 2005 (UTC)
- For now we conservatively discourage entries for all but the most common misspellings. So many misspellings and typos are possible that it is difficult to establish criteria for deciding which to include. I don't think that redirects are an appropriate way of handling misspellings. A reader who finds on the curiculum page, "common misspelling of curriculum", is more likely to remember his error than someone who is simply redirected. I don't think that disambiguation pages would be helpful for this problem. It's important to remember that the most hilarious misspellings are the ones that produce other real words. We need only think of those ladies that are often seen with pedants hanging from their necks. Eclecticology 07:25, 15 October 2005 (UTC)
- I wonder if the "word not found" page could be adjusted to show a browsable list from "Special:Allpages" that would include the looked-up word? Then the user might see the word he had in mind (if we have it). SemperBlotto 07:48, 15 October 2005 (UTC)
- That thought did cross my mind. A "Browse this word" function in the sidebar could be helpful, especially when the error is in the latter part of a word. Perhaps one of the more technically minded Wiktionarians could comment on this. Hippietrail? Connel? Eclecticology 17:36, 15 October 2005 (UTC)
- The main problem with having a word browser (like the one in Encarta's dictionary) is that the wiki software has no way for us to tell it which pages have articles for which languages - all this information is available as human-readable text only. Without this the browser will contain many entries for foreign languages the user knows nothing about. The only way to fix this properly would be for the Wiktionary techies to come up with a proposal and then to get as many other contributors to help petition the wiki devs to get them to implement it. Or if our techies can come up with a patch, to get the wiki devs to apply it. — Hippietrail 15:20, 16 October 2005 (UTC)
- I'd love to see a variety of searching lookup improvements. Things like an option for W:SOUNDEX lookups would be nice, optional filters by language would be nice too. But for pendant/pedant, I think we need to just keep Eclecticology proofreading for us. Offline, it is conceivable to find the previous and next words in English, and stuff them into entries in a similar way to the proposed format for frequency counts. But that doesn't help for the situation that SemperBlotto described. Also, the offline method is (as any 'bot) prone to significant time lags. (I dropped in a "browse nearby" thing in MediaWiki:Nogomatch, but that isn't quite what SemperBlotto asked for, nor does it work "right." I expect to roll that change back in a day or two when everyone has seen it...seems like that may be hard on the servers.) Hippietrail, do you have something you are proposing to the devs that needs community support? --Connel MacKenzie 16:07, 16 October 2005 (UTC)
- I think something akin to soundex or metaphone or even using aspell since it has improvements over both of these, is absolutely necessary for any dictionary software. Merriam-Webster online has such a feature for misspelt words. Encarta has a browse feature. Collins just leaves you wondering. (AHD I access via answers.com which has all kinds of stuff.) I don't think a bot could help for spelling suggestions like it could for browsing. But some of us techies could certainly install the wiki software at home and start hacking to try to come up with extensions and patches. If not, I feel we are at a point in terms of size and scope where we should be able to ask the devs for features. But to do that we really need to figure out just what we want first, or they'll probably just ignore us because of the demands on their time.
- As for the "change" and rolling it back, I don't even know what it is. I just saw some tech talk and thought I'd put in my 2 cents (: — Hippietrail 16:33, 16 October 2005 (UTC)
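(Editorial sketch.) The "browse nearby" idea discussed above can be illustrated in a few lines of Python. This is only a rough sketch under stated assumptions: the list of titles is made up, `browse_nearby` is a hypothetical helper rather than anything in the wiki software, and it does nothing about the language-filtering problem Hippietrail raises.

```python
import bisect

def browse_nearby(titles, query, radius=3):
    """Return the titles that would surround `query` in an alphabetical list.

    `titles` stands in for the output of Special:Allpages (an assumption);
    `radius` is how many neighbours to show on each side.
    """
    ordered = sorted(titles)
    # Position where the (possibly misspelled) query would be inserted.
    pos = bisect.bisect_left(ordered, query)
    start = max(0, pos - radius)
    return ordered[start:pos + radius]

# Hypothetical sample of page titles.
titles = ["curricle", "curriculum", "currier", "currish",
          "curry", "curio", "curious", "curl"]

# The error is in the latter part of the word, so the intended
# entry shows up among the neighbours.
print(browse_nearby(titles, "curriculem", radius=2))
```

As Eclecticology notes, this only helps when the error is late in the word: a query such as "curiculum" sorts well away from "curriculum" and its neighbours would not include the intended entry.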
- Speaking as someone very familiar with Wikipedia but not the Wiktionary I have to agree with Eclecticology that trying to pre-empt misspellings is fraught with danger. I confess I'm surprised this isn't something there's already a project page or policy about. I sort of think of it being an intrinsic part of a dictionary's use.
- As I (sketchily) understand Wikiprojects, each page is stored as part of a MediaWiki database with (I'm assuming again) a cell in the database table carrying the name of the page. I wonder whether this couldn't be used to compare someone's (misspelled) search against the available pagenames and return suggestions based on % nearness to one that actually exists. A similar thing happens with a Wikipedia search, though seemingly you have to get the entire word within a multi-word topic correct, ie Magic eejit ball will get you to WP's Magic 8 Ball in a way that mugic ate bull won't.
- Perhaps this is something to ask the hard-core techies about. Again, as someone very new to Wiktionary I'm ill-equipped, but I've got the impression from my recent reading that Wiktionary may try, instead of having English/Italian/Spanish/xxx/yyy/zzz editions, to have a one world dictionary. I think that would make a dictionary that attempts to help a user with poor spelling an even more difficult proposition because you have the opportunity to accidentally correctly spell a word, just not the one you meant in the language you meant it. --Bodnotbod 10:37, 15 October 2005 (UTC)
- Funny, but when I saw "eejit" I read it as "idiot" rather than "eight" in parallel with the "injuns" in the wild west. :-) Your Wikipedia search probably worked because two of the words were correct, and it would have begun by listing those titles that included those two words. With "magic ate bull" I had visions of a stage magician bringing a bull on stage and making him disappear.
- The hard-core techies tend to avoid Wiktionary. It would be nice to have one dedicated to Wiktionary, but I have no illusions that that will happen. Still we do have people working on some very interesting ideas that would help develop Wiktionary's unique nature.
- I do defend the different language editions as directed to readers with the relevant native language. A particular edition should emphasize its own language in greater detail. Any major decision entails its own special challenges. Eclecticology 17:36, 15 October 2005 (UTC)
- Oh absolutely. I was delighted yesterday when I looked up a word and found the word translated into around 20 other languages, "wow!" I thought. --Bodnotbod 18:01, 16 October 2005 (UTC)
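(Editorial sketch.) Bodnotbod's suggestion of comparing a misspelled search against the existing page names and returning the nearest ones can be tried out with Python's standard `difflib` module. This is a minimal sketch, not how the wiki search works: the page list and the `suggest` helper are hypothetical, and the 0.8 similarity cutoff is an arbitrary assumption.

```python
import difflib

def suggest(query, page_names, limit=3, cutoff=0.8):
    """Return up to `limit` page names whose similarity to `query`
    is at least `cutoff` (0.0 to 1.0), best match first."""
    return difflib.get_close_matches(query, page_names, n=limit, cutoff=cutoff)

# Hypothetical sample of existing page names.
pages = ["curriculum", "unnecessary", "necessary", "pendant", "pedant"]

print(suggest("curiculum", pages))
print(suggest("unecessary", pages))
```

Note that this kind of matching cannot help with the pendant/pedant case mentioned above: a misspelling that happens to be another real word is found as an exact match, so no suggestion is ever triggered.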
- I'm not familiar with Metaphone, but Soundex could probably be implemented through the category system. Eclecticology 09:05, 17 October 2005 (UTC)
- w:Metaphone is another algorithm that has the same aim as Soundex but different internals and supposedly better results.
- At first I didn't know how you could implement these via the category system since you don't know what the user will type and will have to generate the "key" (or whatever the technical name is). But in fact we could do them just as we do the pronunciations now. For every English word we generate these keys which then link into a category for all words which have like keys. It would be a nice feature for a bot to build, but would it really help searches so much? — Hippietrail 16:39, 17 October 2005 (UTC)
- Unfortunately the Wikipedia stub does not describe the nuts and bolts of how metaphone works. Is there a copyright problem with it? I did find something here. I like the Soundex use of numbers because it avoids some potential scaling conflicts with other ideas that could develop in the future. Metaphone does address some of the problems that are overlooked in Soundex. Maybe WikiΦone could be a hybrid system with potential use for other languages. :-) Eclecticology 17:57, 17 October 2005 (UTC)
- I'm intrigued by the potential of Category:SOUNDEX:... categories, but the search software would need to be modified pretty significantly to make it useful. Plus it would require a recent-changes-aware 'bot to be meaningful. At this point in time, these seem like prohibitive obstacles, but that should be revisited in the future. A MediaWiki software solution (building a SOUNDEX cross reference table by language) would certainly be more efficient. --Connel MacKenzie T + C # 18:49, 10 November 2005 (UTC)
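(Editorial sketch.) The idea of generating a Soundex key for every English word and grouping words with like keys into a category can be sketched as follows. The `soundex` function follows the usual American Soundex rules; the `build_categories` helper and the exact category naming pattern ("Category:SOUNDEX:" followed by the key) are assumptions made for illustration, not an existing bot or an agreed convention.

```python
from collections import defaultdict

# Letter-to-digit table used by American Soundex.
CODES = {}
for digit, group in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                     "4": "l", "5": "mn", "6": "r"}.items():
    for ch in group:
        CODES[ch] = digit

def soundex(word):
    """Return the four-character Soundex key for `word`."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Append a digit unless it repeats the previous coded letter.
        if code and code != prev:
            key += code
        # 'h' and 'w' do not separate repeated codes; vowels do.
        if ch not in "hw":
            prev = code
    return (key + "000")[:4]

def build_categories(words):
    """Group words under a hypothetical 'Category:SOUNDEX:<key>' name."""
    categories = defaultdict(list)
    for word in words:
        categories["Category:SOUNDEX:" + soundex(word)].append(word)
    return dict(categories)

print(build_categories(["curriculum", "curiculum", "pendant", "pedant"]))
```

In this sample the misspelling "curiculum" lands in the same bucket as "curriculum", while "pendant" and "pedant" get different keys. As Hippietrail asks above, whether such buckets would really help searches is a separate question: the user's typed query would still have to be converted to a key somewhere.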
Another question from the light user - The Wordnet Collaboration
I asked this Q on the talk page there, but clearly this place is far more popular. I wondered what had happened with project Wiktionary:Princeton_wordnet? There's some discussion on Wiktionary_talk:Princeton_wordnet but activity seems to have tailed off 18 months ago even though things seemed to be proceeding very happily. Anyone know what became of it? --Bodnotbod 18:01, 16 October 2005 (UTC)
- It seems that the people who were interested in this idea, are mostly inactive. That's not unusual in a volunteer community. Eclecticology 09:02, 17 October 2005 (UTC)
- Ah. That's a shame. Thanks. --Bodnotbod 05:40, 19 October 2005 (UTC)
Word-linking policy in entries
I think that every unique word in the definition part of every entry should be a link to that word in the Wiktionary. There is simply no reason not to go with massive linking, it is in no way confusing or ambiguous, it absolves the entry-writers from deciding which words are most unusual or important (I have noticed many strange choices), and it makes browsing the dictionary so simple and natural. (It also makes the links on the site a relational map of the language, which might be fun for some geeks.) The only issues I see with this are (a) phrasal verbs (how to link them as they can span other words as in "he punched the window of my car out with a rock" where the verb is not "punch" but "punch out".) and (b) getting a script or robot to robustly insert all the links without doing damage in rare or odd cases. But there would be very few problems if the massive linking was applied only to the definitions, and not to all the example sentences, usage notes, etc. Hogghogg 18:55, 16 October 2005 (UTC)
- This has been suggested before, and it is a valid point. There is in fact a very good reason why this is not done, however. Wikifying every unique word in an entry would put too great a load on the server, making Wiktionary excessively slow and unusable.
- Phrasal verbs can be linked, and this is done as follows: [[punch out|punch]] — Paul G 11:06, 17 October 2005 (UTC)
- Thanks for this. I had forgotten that all the links are resolved so actively (ie, they are not just passive HTML). Okay, well too bad; I hope when the 'tionary becomes stronger, this can be done by someone with a super-smart perl bot. Hogghogg 15:30, 18 October 2005 (UTC)
- You may wish to check this Firefox extension out. It seems to do the right-click-on-any-word thing you want it to. Only key words are wikified in our entries to help convey emphasis, as well as to try to be nice to the servers. --Connel MacKenzie 16:46, 18 October 2005 (UTC)
- While it is true that linking every word increases the load, there is not one "server", but many. On the order of 120 currently. In addition to that, wiktionary is a minor project, as compared with the load generated by the wikipedias. So while load is a concern, it's a minor one. Most of the server load right now is not coming from too many links, but from overwhelming demand, currently peaking at 5000 requests per second.
- Anyway, linking EVERY word seems like too much. Certainly most users of the dictionary can be expected to know basic words like "the", "of", "when" and so on. I think it makes much more sense to link what I would call "key" words. Words that are reasonably complex, and "key" to your understanding of the definition. The connecting words of the definition sentence need not be linked. — Fudoreaper 04:53, 24 November 2005 (UTC)
- Personally I hate the idea of having all words as links. I think it is confusing and ambiguous, and I think using links to emphasise just the key words for further reference is very elegant and useful, particularly within definitions and etymologies. Widsith 09:49, 29 November 2005 (UTC)
- Headwords of a phrase are all linked, including the, of, and, when etc. as those particular entries are some of the most extensive entries we have. But in the definitions I strongly agree with Widsith that we should strive for elegance, and wikify only key terms. Etymologies should have all foreign terms wikified and all relevant component terms wikified, as well as language names. The definitions and quotations should read like prose; the technical terms (inflections, synonyms, etymological components, derived terms, translations,) that bear a relation to the word being described should be wikilinked. The headings themselves used to be wikilinked too, (e.g. ==English==, ===Noun===) but that was eliminated for obscure technical reasons that I'm still not clear on. --Connel MacKenzie T C 16:59, 29 November 2005 (UTC)
- I'm with Widsith and Connel. I'm totally against linking every single word. We can get the same functionality without the ugliness by using Javascript. I've seen at least one Wikipedia or Wiktionary mirror which does it. That could be a project for somebody. Well not 100% the same since you won't get the red vs. blue link.
- Personally I never link the "minor" words in English phrases but always do so in phrases in other languages. I link "major" words always.
- I constantly find articles which wikify random words with no apparent rhyme nor reason. I often change them by dewikifying words that are common and of no interest to the article, and wikify words which are rare or of interest to the article. I also only link noninflected forms in articles but often find inflected forms wikified.
- I don't usually wikify even interesting words in quotations but maybe I should - I'm open on that one.
- The reason we stopped linking headings was because there were huge slowdowns due to overloaded servers. Cutting back lots of links which were always the same and on every page helped. I still link headings if the terms are obscure, such as rare languages with few articles (Amuzgo, Tamazight) or parts of speech unknown to most people (coverb, postposition) - but those don't come up too often. — Hippietrail 18:06, 29 November 2005 (UTC)
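(Editorial sketch.) The "link only key words" approach favoured in this thread could be approximated by a script along the following lines. It is a rough sketch under loud assumptions: the list of common words is a tiny made-up sample, `wikify_key_words` is a hypothetical helper, and the code makes no attempt to handle phrasal verbs, inflected forms or capitalisation, all of which are raised above as real problems.

```python
import re

# Tiny illustrative list of words assumed too common to link.
COMMON_WORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "on",
                "from", "when", "that", "is", "with", "for", "by"}

def wikify_key_words(definition, common=COMMON_WORDS):
    """Wrap the first occurrence of each non-common word in [[...]]."""
    seen = set()

    def link(match):
        word = match.group(0)
        lower = word.lower()
        # Leave common words and already-linked words as plain text.
        if lower in common or lower in seen:
            return word
        seen.add(lower)
        return "[[" + word + "]]"

    return re.sub(r"[A-Za-z]+", link, definition)

print(wikify_key_words("the fruit of the oak tree"))
```

A human editor would still need to review the result, for example to replace a plain link with a piped one such as [[punch out|punch]] where a phrasal verb is meant.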
Treatment of foreign given names
I've been thinking about how we ought to be treating given names here. Leaving aside whether or not these should be included, which is a separate issue (and the tacit consensus seems to be that they should), I have been thinking specifically about foreign given names. Here are some issues, using my own name as an example.
- "Paul" is an English proper noun. Now, when I go abroad, I don't call myself "Pablo", "Paolo", "Πάβλο(ς)", etc - I am still called Paul and that is how I ask others to address me. So I suggest that these "translations" are actually equivalents in other languages. I think we should therefore have "Foreign equivalents", "Equivalents in other languages" or something similar rather than "Translations" as the header for the section that lists these.
- Similarly, "Paolo" is an Italian proper noun, but if someone called Paolo goes to an English-speaking country (or anywhere else for that matter), he is still called Paolo. Now, does that make "Paolo" an English proper noun as well? I think it probably does. How should we then treat "Paolo"? I suggest that we treat it as an English proper noun (and an Italian one), and give the English equivalent, as I have done at Paolo (I have not added an Italian section):
Proper noun
Paolo
- A male given name of Italian origin. English equivalent: Paul
You might ask what happens with names in non-Latin scripts — these names are certainly not English proper nouns. In that case, English typically transliterates into the Latin alphabet, so, for example, "Sergei" and "Dimitri" are English proper nouns but not Russian ones.
Any other thoughts about this? — Paul G 11:01, 17 October 2005 (UTC)
- In Norwegian, names of royal people are translated. So king James the 2nd becomes Jakob 2. But this only goes for royal names, and usually not until the person in question is dead (Elizabeth is translated, but Charles is not, neither is Juan Carlos)…
- The deal with translations, is if it means the same. What name people use for you, is what you prefer to use, not what they prefer. Jon Harald Søby 13:58, 17 October 2005 (UTC)
- Just because not everybody uses translations of their name while traveling, does not mean they are not translations. Personally when I'm travelling in Spanish-speaking countries I always introduce myself as Andres (or is it Andrés?) because of the surprising tendency of people to think my name is Henry if I use the English Andrew. But, the other way around, when I meet foreigners, I always try to find out and use their names the way they would be in their own language. For instance I know that in some Germanic or Scandinavian versions of Anne, the final "e" is actually pronounced as a schwa, which is too odd for most English speakers, but I like to pronounce it. So anyway, I say we keep given names from all languages and keep the translations but don't force anybody to use the translations for themselves. — Hippietrail 16:33, 17 October 2005 (UTC)
- It all seems a non-issue to me. We could enter into a lot of non-productive debate about the semantic difference between "translation" and "English equivalent", but the issue would remain. How a person wants his own name pronounced will remain a personal decision, and immigrant populations are divided about what to do when they migrate. Some take great pride in their national origins; others want to be assimilated and lose all trace of those origins. We can let them know that "Paolo" is often Englished as "Paul", but the choice of what to do remains theirs. If one is translating fiction it is useful to know the equivalents, but the translator needs to retain the option of choosing what will best represent the circumstance of the story. Eclecticology 16:55, 17 October 2005 (UTC)
- In Latin foreign names are generally translated; in modern practice, one translates the first name if possible and usually leaves the surname as-is. So not only is King James Iacobus, your neighbor Kevin is Coemgenus, etc.
- Anyway, basically the deal is, there's two different ways that names are treated:
- There is the name generally given; Mark, Jeff, Dmitri, with no particular referent. Generally these aren't translated (though some languages may translate or transliterate them as a rule, and these equivalents should be given).
- There is the name as applied to particular, well-known individuals—particularly historical ones, and kings. Often these do have translations: Luke the evangelist translates in some languages to Lucas; James the king to Jakob; 孔子 to Confucius.
- And of course there is the question of attestation—we shouldn't give Paolo as any kind of translation for Paul unless we have evidence of it actually being translated as such; that is, just being a cognate is no guarantee of being a translation. —Muke Tever 23:06, 17 October 2005 (UTC)
- Good point! Cognates can (and should) go in the etymology section no matter whether they also belong in the translation section. — Hippietrail 03:02, 18 October 2005 (UTC)
Twice today, contributions of mine have made the counter go to 96,000. Is this a record? It was due to intervening deletions. Does this make a mockery of the whole Milestone business? SemperBlotto 13:45, 17 October 2005 (UTC)
- Actually, with the Romanica deletions, we've passed 96,000 five times now. The first two times, I missed catching the precise entries. I suppose we could keep at it all day but then again, maybe not. (I had been trying to get those deletions out of the way before we got to the milestone, but missed.) I don't think the milestones have outlived their usefulness just yet. After 100,000, I do think we should limit them to every 10,000 though. Or do people enjoy them too much for that? --Connel MacKenzie 13:56, 17 October 2005 (UTC)
- The milestones should refer to the first hit at that number. Deletions and new entries mean that we will oscillate around any given number of entries from time to time, but there is nothing of interest in recording that we passed 96,000 entries five times. The earliest entry in the milestones that someone adds should be the official entry (whether or not it is spot-on accurate); if subsequently the count goes down and then back up, this should not be recorded. — Paul G 14:43, 17 October 2005 (UTC)
- Milestones were never intended as anything more than a rough guide to progress, and a celebration that someone had reached that point. The determination of those numbers was no more accurate than a McDonald's celebration of the 30 billionth hamburger. Do they take into account those that were returned by customers? When Bubba runs over somebody with his pickup truck does he go back and run him over again just to have extra run-over credits? I agree that we should change the frequency after 100,000 but would suggest increments of 5,000 until we reach 200,000, changing again to 10,000 when we get there. Eclecticology 17:13, 17 October 2005 (UTC)
Sumerian/Akkadian
Both these languages are long-dead (4000 and 2000 years resp.) and were written in cuneiform, which I think is a bit beyond what wiktionary can stomach at this point in time. However they both have a more or less standardized transliteration into Latin script. Akkadian has a few added diacritics like circumflex and hacek, so that is not really a problem. I have added a few words on the nl. wiktionary (where I usually reside). Sumerian, however, is more of a challenge. It is customary to separate syllables with a dot and to indicate which glyph is used by either acutes, graves or a subscript. Ma4 is a different word from ma5. (I wonder if it had something to do with tones, but I am certainly no expert). My question is: is there a good way to put this kind of subscript into lemma titles? Just putting square brackets around it does not work.
On a different subject, I was somewhat dismayed at the non-use of language templates here. Yes it is true that {{==something==}} produces a false edit button, but on nl this has been solved by putting in an explanation of what the header means and a synopsis of usual format. This stops people from putting information there. Also the header produces an automatic categorization of all words because the category 'English word' is included in {{-en-}}. It seems that even {{en}} is discouraged here. The problem I have with that is that it makes it much more difficult to exchange translations between the various wiktionaries. You cannot just copy and paste. Yes it is true that alphabetization is not automatic, but I'd rather shift around the entries a bit than having to translate all the labels. User:Jcwf 152.1.193.141 18:36, 17 October 2005 (UTC)
- There doesn't appear to be anything wrong with making page titles for ma₄ and ma₅, so... um... whatever. I believe the subscripts are meant to refer to different glyphs with the same pronunciation—at least, that is the case in Mycenaean transliteration.
- Incidentally on la.wikt {{-en-}} doesn't produce a false edit button... it produces the category, but not the header, so the usage is =={{-en-}}==. I have suggested this before for en. but the reception is lukewarm. (Another detriment to acceptance of this is that categories such as "English words" are entirely discouraged—for some reason they prefer part-of-speech subcategories such as "English nouns.")
- As for the regular {{en}}, I think the main reason for their being disliked is that most of the editors here don't like to remember them; they can be confusing for those not ready for them—some people were using sw for Swedish, though that comes up Swahili; on the same note, even nl.wikt made the rather boneheaded move of using -adj- for "Adjective" which will be very troublesome when you get around to adding words in Adioukrou; -rel- for "Related words" instead of Rendille.... —Muke Tever 22:48, 17 October 2005 (UTC)
- I think you kind of answered my question about subscripts: I should have said that <sup>5</sup> does not work. Pray tell me how to get the subscripts in your way if that does work. Thank you for the remark about -adj- and -rel-. I agree that this needs to be solved.
Perhaps non-language templates should have at least four letters? I doubt that for people who speak svenska the choice of sv for this language should really be a problem. nl:Gebruiker:Jcwf
- Four letters is a good length IMO. For things that might really need to be shorter another possibility is putting them in caps—la has a MC template for small-caps for example. Sub- and superscripts are encoded in Unicode, apparently just for plain-text problems like this: ⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾, ₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎. If they're not in the auto-insert dealie yet perhaps they should be. —Muke Tever 06:26, 20 October 2005 (UTC)
- I think it's great to get some Sumerian/Akkadian on here - I'd just add that cuneiform script is scheduled for inclusion in Unicode at some point, and when that happens it would be great to get the languages up in the ‘original’ forms Widsith 09:57, 29 November 2005 (UTC)
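(Editorial sketch.) For anyone preparing Sumerian page titles from ASCII transliterations, the Unicode subscript digits Muke lists can be substituted mechanically. The following is a small sketch only: `sumerian_title` is a hypothetical helper, and the rule it applies (digits directly following a letter are glyph indices) is an assumption about how the transliterations are typed.

```python
import re

# Map ASCII digits to the Unicode subscript digits U+2080..U+2089.
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def sumerian_title(transliteration):
    """Convert index digits that directly follow a letter into subscripts."""
    return re.sub(
        r"(?<=[^\W\d_])\d+",  # digits preceded by a letter
        lambda m: m.group(0).translate(SUBSCRIPT_DIGITS),
        transliteration,
    )

print(sumerian_title("ma4"))
print(sumerian_title("ma5"))
```

Digits that do not follow a letter are left untouched, so a bare number is not converted.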
need frequency count for consonant clusters in english
Hi,
I am looking for the frequency of certain consonant clusters which occur at the end of the word (pronunciation, not spelling) in English. Please let me know if it's possible to locate it in the Wiktionary and, if it is, how to go about it.
my e-mail is divyaufl@gmail.com if needed.
Thank you,
take care
Divya
- If I have understood you correctly, this request is probably beyond the current scope of the Wiktionary project. I think you will find that the Wiktionary as it stands reflects the full range of the English language incompletely. Even for the English words we currently have, our pronunciation information is incomplete by comparison. Finally, not all the words in the English Wiktionary are words in English. You may find some help on our rhymes pages, rhymes:English and its sub-pages. I have sent a copy of this note to the address specified. --Dvortygirl 21:24, 18 October 2005 (UTC)
Definitions for Kids
In my search of the internet I have noticed that there doesn't seem to be a very good dictionary that kids can use. Definitions tend to just confuse kids more than the original word. I figure if we can make a good dictionary for kids it will help promote the site to a new generation. I also think that people just learning English would benefit from very simple definitions. The format I was thinking about using was to just add a line to the definition labeled (kids); this will make it simple for kids to identify the definition. This option also allows them to see the regular definition to make the comparison and maybe learn what other words mean.
- There is a Simple English version of Wiktionary - link over on the left somewhere. SemperBlotto 13:07, 18 October 2005 (UTC)
- Go to simple:wiktionary.org That project could stand having a few people who are interested in that sort of thing. It has had very little activity. Eclecticology 20:54, 18 October 2005 (UTC)
- Still not having any luck in finding this simple dictionary. Why create a separate dictionary for kids when it's just as easy to add a line to the current dictionary with a kids definition? This would allow it to be kept up to date very easily.
- Simple English Wiktionary - but I wouldn't bother. It's only got about 30 (thirty!) words so far. I would recommend a printed one. SemperBlotto 12:27, 19 October 2005 (UTC)
- It is very small, and I'm not personally interested. I nevertheless think that it has a role to play if the right person can take leadership there. Eclecticology 16:13, 19 October 2005 (UTC)
How to format nuclides
Nuclides are specific isotopes. They come with two numbers (the atomic number and the atomic mass) and these are printed on top of each other before the symbol for the element. The best I can do is ¹⁴₆C - can anyone do better? SemperBlotto 16:32, 18 October 2005 (UTC)
- Some clever CSS in a template can do the trick (I'll try in a minute, and include the result here), but I'm not sure if it'll be cross-browser compatible. Jon Harald Søby 16:34, 18 October 2005 (UTC)
- ¹⁴₆C (Template:nuclide). This works only where there are X digits in the top number, and one in the bottom number. I can make one for one digit in the top number and X in the bottom number too… Jon Harald Søby 16:46, 18 October 2005 (UTC)
- Well done. The top number is always bigger than the bottom one - roughly double or more. This will be more useful over on -pedia, where there are articles on nuclear reactions. I'll get back to you. SemperBlotto 16:52, 18 October 2005 (UTC)
- Oh, can you stop it skipping to a new line? See nuclide SemperBlotto 16:57, 18 October 2005 (UTC)
- Hmm. For some weird reason, it doesn't work with indentation or ordered lists… I removed #:, and then it worked (in preview). I'm not sure how to fix that… Jon Harald Søby 17:07, 18 October 2005 (UTC)
- Ok, fixed; the only problem was one line break that shouldn't have been there… Jon Harald Søby 14:18, 19 October 2005 (UTC)
- OK Jon. Before I copy the system to Wikipedia, what needs to be done to make these nuclear reactions look good?
Lithium + Deuterium => Helium : ⁶₃Li + ²₁H => 2 ⁴₂He
Uranium-236 => Xenon + Strontium : Template:nuclide-2 => Template:nuclide-2 + Template:nuclide-2 +2n
- I'm not sure how that first one ought to be viewed, but for the second one, you can use Template:nuclide-2. Jon Harald Søby 14:58, 20 October 2005 (UTC)
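For plain text, where no template or CSS is available, the Unicode superscript and subscript digits mentioned elsewhere on this page give a rough fallback. The sketch below is only an illustration, not how Template:nuclide works; the function name is made up, and the two numbers end up side by side rather than stacked.

```python
# Translation tables from ASCII digits to Unicode superscript/subscript digits.
SUPERSCRIPT = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def nuclide(mass_number: int, atomic_number: int, symbol: str) -> str:
    """Format a nuclide as superscript mass number, subscript atomic number, symbol."""
    top = str(mass_number).translate(SUPERSCRIPT)
    bottom = str(atomic_number).translate(SUBSCRIPT)
    return f"{top}{bottom}{symbol}"


# The lithium + deuterium reaction from the thread above.
reaction = f"{nuclide(6, 3, 'Li')} + {nuclide(2, 1, 'H')} => 2 {nuclide(4, 2, 'He')}"
```

Because this is plain text, it sidesteps the line-break and indentation problems, at the cost of the numbers not being printed on top of each other.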
Quotation format
I have begun changing the quotation format from "Year: quotation - linked author, title" to "Year: linked author, title - quotation".
The simple purpose for this is to better facilitate the use of templates for references. Of the four elements in a quotation, only the quotation itself will vary every time, and there is already general agreement that having the year as the first element is useful for ordering the quotations. Thus now {{RQ:Shakespeare Timon}} will give us 1607: w:William Shakespeare, The Life of Timon of Athens. Please note the use of "RQ:" in the template; this will serve to list all quotes together alphabetically. "R:" will serve similarly for references. This approach will be very helpful for frequently used sources, but one needn't bother if an author or work is only rarely used. Eclecticology 09:05, 19 October 2005 (UTC)
- This is a very good thing, I think. But for Webster 1913 quotations, can we continue using the old format (especially those of us less versed in navigating Wikisource?) --Connel MacKenzie 17:18, 19 October 2005 (UTC)
- What I've been using for Webster references is {{R:Webster 1913}}; Wikisource doesn't enter into that one. Where the Webster has a quote I do try to identify it, but that can be a very time consuming task. I've also been using the 1914 Century Dictionary, which has triple the entries of the Webster, and is far more helpful for identifying them. It's just not available on line. Eclecticology 23:58, 20 October 2005 (UTC)
Template:m, Template:f, and Template:n
Could anybody briefly explain what these templates are good for? Ncik 00:31, 22 October 2005 (UTC)
- The idea is that they say whether something is masculine, feminine or neuter. However, they don't save any time, but are used on other language wiktionaries. --Wonderfool 12:52, 22 October 2005 (UTC)
- For consistency, I've had my javascript auto-replacing ''f'' with {{f}} as it does not seem harmful, but has the added benefit of Hippietrail's "hint" text appearing on mouse-over. --Connel MacKenzie 18:11, 22 October 2005 (UTC)
- There is also a p and a c template for plural and common (used in Scandinavian languages). GerardM 10:39, 27 October 2005 (UTC)
- Just in passing, I've noticed that the w:RAE also uses common for Spanish! I know the term stems from Latin, but I'm not sure if the RAE uses it for terms which do not change their form for masculine and feminine, or if they use it for terms representing humans and which take -a for females and -o for males. Need to look further... — Hippietrail 14:19, 27 October 2005 (UTC)
- They use it for words which use the same form for masculine and feminine, just like in Latin. e.g. policía, presidente, but for a word like maestro, maestra they use "m. y f." —Muke Tever 22:13, 28 October 2005 (UTC)
- Thanks! — Hippietrail 02:12, 29 October 2005 (UTC)
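The auto-replacement of ''f'' with {{f}} mentioned earlier in this thread can be pictured as a single regular-expression substitution. This is only a sketch in Python rather than the javascript actually used, and extending it from f to m and n is an assumption based on the three templates named in the heading.

```python
import re

# Match an italicised single gender letter such as ''f'', but not bold
# markup such as '''f''' (the lookarounds reject a neighbouring apostrophe).
GENDER_MARKUP = re.compile(r"(?<!')''([mfn])''(?!')")


def templatize_genders(wikitext: str) -> str:
    """Replace ''m'', ''f'' and ''n'' with the {{m}}, {{f}} and {{n}} templates."""
    return GENDER_MARKUP.sub(r"{{\1}}", wikitext)


converted = templatize_genders("*Spanish: [[casa]] ''f''")
```

Longer italic runs such as ''feminine'' are left alone, since the pattern only matches a single letter between the quote marks.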
Bots
meta:Requests_for_permissions#Requests for Bot status links here.
- ... Bot policy on Wiktionary (in particular en.wiktionary) is under discussion so bots should not be made without approval of the appropriate wiki.
I am not sure what the discussion in the past was, but can we please reassure meta that yes, we do like 'bots in general (well-behaved ones, with approval, that is)?
Should bots be permitted on en.wiktionary?
Votes for:
- --Connel MacKenzie 07:54, 22 October 2005 (UTC)
- Jon Harald Søby 18:13, 22 October 2005 (UTC)
- Wytukaze 18:36, 22 October 2005 (UTC)
- Dvortygirl 23:55, 22 October 2005 (UTC)
- Polyglot 20:01, 23 October 2005 (UTC)
- Hippietrail 23:46, 23 October 2005 (UTC) Of course we should have the bots that we vote in favour of.
Votes against:
Abstain
- Eclecticology 00:39, 23 October 2005 (UTC) My abstention is because I don't think that such a broad policy statement by itself accomplishes anything. Bots certainly do have their place, but the more important question is when, and under what circumstances should a particular bot be allowed. Eclecticology 00:39, 23 October 2005 (UTC)
- Each individual 'bot requires its own separate approval, following Meta conventions. I do not know why they link here to this subheading of Beer Parlor, but they do. A "no" here means an objection exists to all 'bots, not any possible particular one. --Connel MacKenzie 02:20, 23 October 2005 (UTC)
Appendix:
Should Appendix: and Appendix talk: be added as valid namespaces, probably as 100 and 101?
Votes for:
- --Connel MacKenzie 00:05, 23 October 2005 (UTC)
- Eclecticology 00:49, 23 October 2005 (UTC) with apology. I really should have dealt with this more diligently.
- Hippietrail 23:44, 23 October 2005 (UTC)
- Jon Harald Søby 07:39, 24 October 2005 (UTC)
- --Wonderfool 15:46, 27 October 2005 (UTC) Yeah, this is a step in the "changing the interface designed for an encyclopedia to make it more suitable to a dictionary" (CTIDFAETMIMSTAD).
- Wytukaze 18:27, 1 November 2005 (UTC)
Votes against:
Index:
Should Index: and Index talk: be approved as valid namespaces probably as numbers 102 and 103?
Votes for:
- --Connel MacKenzie 00:07, 23 October 2005 (UTC)
- Eclecticology 00:50, 23 October 2005 (UTC) with same apology.
- Hippietrail 23:43, 23 October 2005 (UTC)
- Jon Harald Søby 07:35, 24 October 2005 (UTC)
- Wytukaze 18:29, 1 November 2005 (UTC)
Votes against:
Request for bot status: DblRedirBot
I formally request community approval of running "redirects.py" from the account User:DblRedirBot.
Purpose: Clean up double redirects (~4,000 or so) and periodically re-run.
Owner: User:Connel MacKenzie
Testing status: OK. (with and without throttling) Testing again to check User:Hippietrail's concerns. --Connel MacKenzie 07:30, 24 October 2005 (UTC)
- Now that the obvious redirects have been deleted, I plan to do another test soon, unless there are other objections. --Connel MacKenzie T + C # 22:12, 5 November 2005 (UTC)
- Recent tests (after deletion pass first) were much better. Requesting bot status. --Connel MacKenzie T C 18:21, 9 December 2005 (UTC)
- 10 December 2005, 'bot flag set for account. --Connel MacKenzie T C 21:39, 13 December 2005 (UTC)
Votes for:
- --Connel MacKenzie 02:21, 23 October 2005 (UTC)
- Fine with me Polyglot 20:01, 23 October 2005 (UTC)
- Just make sure it works. Somebody has been running a bot to fix redirects lately that changed things but fixed nothing on the pages that I saw at least. — Hippietrail 23:40, 23 October 2005 (UTC)
- Jon Harald Søby 07:32, 24 October 2005 (UTC)
- I see what Ncik is saying, that in some instances there are valid reasons for rebuilding the middle link in the redirect chain into a full entry, or more likely a small entry which refers to the stem but has some separate content of its own. However each time a user types in a double redirect page they are left with an almost blank page & a bad user experience. We are as yet only 1% of the way to creating an entry for each of the 9 million unique "words"/"lexemes" found in the Gutenberg project so making some improvement automatically on the double redirects instead of spending time fixing them manually is a good thing. Perhaps giving us time to put in the separate entries. How about a special page for single redirects (as there will no longer be double ones) so they can be reviewed and people can perhaps put in the separate content? I've gone on enough. for MGSpiller 23:59, 26 October 2005 (UTC)
Votes against:
- The bot has 'fixed' two double redirects from my watchlist: One was Brains -> brains -> brain, the other one Baldric -> baldric -> baldrick. We might want to split off noun defn 4 from brain, and also add the third person singular of to brain to brains. The spelling variant baldric of baldrick definitely deserves its own page, as all other spelling variants do. In both cases the bot has produced additional work for editors embarking on implementing the aforementioned improvements by forcing them to change the redirects back to their previous targets. I don't think it is asking too much of Wiktionary users who somehow managed to end up on the page of a misspelling (I still think these pages shouldn't exist in the first place, and the problem wouldn't exist) to perform an additional click. Ncik 19:25, 24 October 2005 (UTC)
Comments
- So, why are the misspellings on your watchlist? Did you enter them incorrectly in the first place? Of relevant note: brains and Brains are probably currently on a separate cleanup list; Baldric/baldric should be there also.
- Furthermore, correcting a double redirect did not cause any of the problems you indicate...since they appear on your watchlist, I can only assume you had some involvement in them being "broken" to begin with. As usual, I expect your character assassination and wild misrepresentations (this time against a Wikipedia 'bot!) to go unpunished, lucky you. --Connel MacKenzie 03:54, 25 October 2005 (UTC)
- Responsible for the existence of these entries and them being on my watchlist is the now abolished first letter capitalization policy. In both cases the redirects were set up after deleting existent content last edited by myself. But I often have not enough time to check all the changes that show up on my watchlist.
- Redirects are bad, always. So the bot shouldn't have to be there in the first place. However, how else do double redirects arise, and is there any point in shortcutting them? Ncik 23:26, 26 October 2005 (UTC)
- (re-indented previous paragraph.)
- Blanket statements such as "Redirects are bad, always." ignore reality. Certainly for idioms they have proven useful. There are about 3,000 double redirects that still need correction. As discussed, I'll certainly give higher priority to first deleting incorrect redirects before testing any further. By then I do hope the bot has finished the approval process so the corrections don't clog Recent changes. (ReDirBot corrections will still appear on your watchlist, but deletions will not.) --Connel MacKenzie 05:33, 28 October 2005 (UTC)
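For readers unfamiliar with what "correcting a double redirect" involves, the core step can be sketched as follows. This is not redirects.py itself, just an illustration over a plain dictionary of page to redirect target; the function name and the decision to skip redirect loops are assumptions.

```python
def resolve_double_redirects(redirects):
    """Return {page: final_target} for each redirect that points at another redirect.

    `redirects` maps a redirect page title to the title it currently points to.
    """
    fixes = {}
    for page, target in redirects.items():
        seen = {page}
        final = target
        # Follow the chain until it reaches a page that is not itself a redirect.
        while final in redirects and final not in seen:
            seen.add(final)
            final = redirects[final]
        if final in seen:
            continue  # redirect loop: leave it for a human to sort out
        if final != target:
            fixes[page] = final
    return fixes


# The two chains discussed in this thread.
redirects = {
    "Brains": "brains",
    "brains": "brain",
    "Baldric": "baldric",
    "baldric": "baldrick",
}
fixes = resolve_double_redirects(redirects)
```

Note that a sketch like this only shortcuts the chain. It cannot tell whether the middle page deserves an entry of its own, which is the objection raised under "Votes against".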
Requests for pronunciation
There are two templates for requesting pronunciation for an entry. If you want to add a pronunciation but aren't familiar with the pronunciation schemes used on Wiktionary, just add {{rfp}} to the page. As well as adding text to the page saying that a pronunciation has been requested, the page will get added to a category so that all such pages can easily be found.
The template {{rfap}} is similar, but requests an audio pronunciation (a link to an audio file that reads the pronunciation out loud). — Paul G 09:04, 26 October 2005 (UTC)
's' in Bosnian entries
What does the 's' mean in the Bosnian entries (see, for example, bog)? Is it supposed to be an abbreviation of "singular"? I thought we always spelled "singular" and "plural" out in full. If it is something else, it should still be spelled out; if it is "singular", it seems to be redundant to me, because nouns are assumed to be singular unless they are marked otherwise. — Paul G (who can't stay logged on today, again)
- The Template:s that calls it says it's for marking singular (and apparently it has a mouseover that so indicates). See Wiktionary:Abbreviation for the list of abbreviations we use. —Muke Tever 20:45, 28 October 2005 (UTC)
- Right, well, this is a change of policy. When was this agreed? — Paul G 12:31, 2 November 2005 (UTC)
- Agreed? I don't know. Myself, I think 's' is too short; I think sg is more usual (and recognizable). —Muke Tever 17:29, 2 November 2005 (UTC)
- I've been wondering this too. Oddly, I wasn't too worried about "s" for singular but I would go with the popular vote out of "s", "sg", and "sing". I was worried about "p" for plural. I would much prefer "pl". I've never seen "p" used to my memory in a print dictionary. — Hippietrail 16:16, 4 November 2005 (UTC)
- More commonly used for "singular" than "s" are "sg" and "sing"; "s" doesn't immediately suggest "singular" to me, just as "p" doesn't suggest "plural".
- It looks like this has been introduced without discussion. I am going to open a new discussion so we can consider it (a new topic will be more likely to be seen by more people than this older one). — Paul G 09:32, 8 November 2005 (UTC)
b:Wikibooks:Votes for deletion#Body parts slang discussion on Wikibooks
I hesitate to even recommend it, particularly because this module was dumped on Wikibooks by Wikipedia from a VfD discussion on Wikipedia where the decision was to kill the whole page. I'm proposing to move it yet again, but I want to make sure that the people here on Wiktionary want it in the first place.
I'm an admin on Wikibooks and just trying to clean things up. I also want you to know that whether it gets moved here to Wiktionary is irrelevant to the discussion on Wikibooks, as that is a separate set of policies that are being addressed. While I do find the page offensive, the main "excuse" (and I'm admitting this here) to get rid of it from Wikibooks is that it is a list of dictionary terms, or would be if it becomes fully developed. Some of the terms are humorous, and a few are surprising ones that I hadn't heard before for some anatomical parts of the body.
The only thing that input from here on Wiktionary would have for me is to determine how long of a delay there might be before it is deleted on Wikibooks. If nobody wants it here, I'll go ahead and kill it in the next couple of days. If on the other hand some editor/contributors here on Wiktionary want to use it as source material, I'll put in a note that it is to be deleted in a few weeks and put in a deadline for deletion. That will (most likely) be honored by the other admins on Wikibooks as well. IMHO a straight transwiki may be inappropriate as it doesn't seem to follow formatting and other conventions on Wiktionary either, but that is something for you to decide how to deal with the topic here. --Robert Horning 00:00, 30 October 2005 (UTC)
There was no need to retain the copies at Wikibooks. I transwikied the originals directly from Wikipedia. We now have Transwiki:Body parts slang and Transwiki:List of sexual slang. Their natural home is quite obviously WikiSaurus, which already covers much of this ground, in fact. (All of the body part Wikipedia articles, some of which were growing their own mini WikiSauruses, now link to WikiSaurus, by the way.) Any editors who wish to help with splitting the articles up into their appropriate WikiSaurus headwords are welcome to be bold. ☺ Uncle G 01:23, 3 November 2005 (UTC)
Category lists in date sequence
I wanted to change the listing of Category:Requests for language cleanup so that the entries were in date order - so I could deal with them when they are more than a month old more easily. I tried adding "|*" to the Category entry in the {{nolanguage}} template, but that didn't work - they just get listed in random order. Any ideas? (I also tried "|&date;" on the offchance - no luck.) SemperBlotto|Talk 14:18, 31 October 2005 (UTC)
- Please take a look at what I tried with template:nolanguage. It will add entries (as they are edited) to a year/month subcategory. Downside: as they are edited again, they will move to the current month. (For reference, date functions are used on Main Page.) --Connel MacKenzie 18:36, 31 October 2005 (UTC)
- Another approach might be to maintain a manual list at Wiktionary:Requests for language cleanup but that seems like much more effort than it is worth, for these stale entries. This automatic subcategory method seems like it may work well for future entry taggings. --Connel MacKenzie 18:48, 31 October 2005 (UTC)
- Well done. If it turns out to be useful, we could do the same with the notenglish template. SemperBlotto|Talk 19:54, 31 October 2005 (UTC)
- Hint hint? :-) Done; categories added and populated, so it should be on autopilot now. --Connel MacKenzie T + C # 20:48, 31 October 2005 (UTC)
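The year/month subcategory idea can be illustrated outside of template syntax. In the sketch below the category naming scheme is hypothetical (the thread does not give the actual subcategory names); the point is only that a zero-padded year-month suffix makes plain alphabetical order match date order.

```python
from datetime import date


def dated_category(base: str, edited_on: date) -> str:
    """Build a year/month subcategory name (hypothetical naming scheme)."""
    return f"Category:{base}/{edited_on.year}-{edited_on.month:02d}"


tagged = [
    dated_category("Requests for language cleanup", date(2005, 10, 31)),
    dated_category("Requests for language cleanup", date(2005, 9, 2)),
    dated_category("Requests for language cleanup", date(2005, 11, 1)),
]
in_date_order = sorted(tagged)
```

As noted above, the date used is the date of the last edit, so an entry that is edited again moves to the current month's subcategory.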
phonetic spelling?
Why is phonetic spelling not included .....anybody
- Many articles do include a "Pronunciation" section which may in turn include phonetic spellings in IPA and/or a generic "American dictionary style" which is often misleadingly labelled as AHD. We do not include things such as "foe NET ick", which are often referred to as ad-hoc phonetic spellings since they are not systematic in a way that makes them as useful for persons from Britain, Australia, India, Jamaica, and America for instance. Particularly tricky in these is distinguishing the sounds in "book" and "boot". — Hippietrail 02:11, 1 November 2005 (UTC)
foreign words
Why are there so very few foreign words in English Wiktionary (except those that happen to be spelled the same as English words)? — msh210 04:06, 4 November 2005 (UTC)
- I guess more people here have made getting the English lexicon close-ish to complete the higher priority. Also the need for good bilingual skills or at least reference works means foreign terms are slightly more work. Personally I've been adding interesting words I've come across in all manner of languages and would encourage others. Perhaps we could have a drive for foreign terms soon or do some bot-work on what terms are on the other Wiktionaries but not here yet. — Hippietrail 16:11, 4 November 2005 (UTC)
- I guess we've progressed a long way from the time when people used to complain about CJK character entries suffocating the "real" (read: English) words :p —Muke Tever 18:36, 6 November 2005 (UTC)
"Sorry! We could not process your edit due to a loss of session data. Please try again. If it still doesn't work, try logging out and logging back in."
I was editing, and a message popped up, saying "Sorry! We could not process your edit due to a loss of session data. Please try again. If it still doesn't work, try logging out and logging back in.". What is session data? --Wonderfool 13:18, 4 November 2005 (UTC)
- This was happening to me too yesterday. Not so far today, but fingers crossed... — Hippietrail 16:08, 4 November 2005 (UTC)
- It's a new error message. I've been consistently getting it in those same situations where, previously, upon hitting "save" I would be sent back to the preview & edit screen, instead of a save directly. —Muke Tever 20:46, 4 November 2005 (UTC)
100,000
We are very close to reaching 100,000 entries. Is there anything special planned for this event? I'm not suggesting a party or anything, but a banner on the front page (like Wikipedia has for major milestones such as this one) would be fitting. — Paul G 09:29, 8 November 2005 (UTC)
- I had my own private party with me, a French slang dictionary and a glass of rosé, but less said about that the better. I did a tidbit of an article for Wikipedia's Signpost. The editor seems to want "perhaps quotes from a few major Wiktionary writers, etc.". SemperBlotto is pretty damn major, he could say something like "I'm flattered and hono(u)red to be the writer of the 100,000th article. And yes, I made it an instrument to look inside c**ts on purpose, hahaha". Probly nothing's gonna emerge tho, unless we were to make a Wiktionary signpost thing... but there'd'nt be much that goes in there. --Wonderfool 21:09, 9 November 2005 (UTC)
Abbreviating "singular" and "plural"
Wiktionary:Abbreviation was recently updated with s and pl for "singular" and "plural" respectively. The standard format here has always been to spell these words out in full, the only permitted abbreviations being those for genders (m, f, n and c).
I think this has been done by an individual without any discussion (at least none that I have seen anywhere), effectively introducing a change (or variation) in policy. Perhaps it is now time we discussed it. So there are two issues:
- Should "singular" and "plural" be abbreviated in Wiktionary entries? Some arguments for: print dictionaries do; users familiar with print dictionaries will automatically do the same; we should have been doing this from the start. Some arguments against: we should just leave things as they are; requirement to update many entries that spell the words out in full; there are no space limitations in an online dictionary so there is no need to abbreviate.
- If they are to be abbreviated, what abbreviations should we use? Wiktionary:Abbreviation has "s" and "pl". The latter is commonly used in print dictionaries, but "s" is less common than "sg" or "sing". My feeling is that "s" and "p" should certainly not be used, as it is not obvious what these stand for.
— Paul G 09:43, 8 November 2005 (UTC)
- It should be noted that the "recent update" referred to in the first paragraph is dated July, about half a year ago; since then, even the mere templates have been in use by several people, including at least two admins, Template:s having been placed in about 400 pages and Template:p in about 500 (often replacing already-present manual abbreviations). (Recency illusion.) Note that they appear not only in headwords but in translations lists, which are often rigorously formatted in columns (e.g. craft). Of course I do think 'sg' is more apt than 's'. —Muke Tever 17:31, 8 November 2005 (UTC)
- Allow me to treat this as a vote. I vote to keep using them but change s to sg and p to pl. Those are the more common in dictionaries and people are familiar enough with them. The tooltips on mouseover help the few that are not familiar. — Hippietrail 16:28, 9 November 2005 (UTC)
- The templates are called "s" and "p" but s (sg) and pl (pl) are what is displayed. Template:pl is busy being a language template for Polish. —Muke Tever 22:01, 9 November 2005 (UTC)
- I've approached these gender and number abbreviations with a spirit of tolerance, but my preference would still be to have them spelled out in full when they appear in articles. That would be a lot more helpful for those users unfamiliar with our conventions. Eclecticology 17:07, 10 November 2005 (UTC)
- I'll abstain from this vote. I have no strong preference to the spelled out or the abbreviated form; I think I contributed to the effort (in a semi-automated manner) thinking that this was the preferred form. I'm eager to hear what the consensus is. The template {{sg}} seems to be a country code, as well as pl. --Connel MacKenzie T + C # 18:29, 10 November 2005 (UTC)
Diacriticals in Old English
I've added a ton of Old English words to the (English) Wiktionary, and a couple of times things have been moved on account of accents. Example: I wrote an entry for OE ‘is’, meaning ‘ice’. I put it on the "is" page — but under the Noun heading I spelled it ‘īs’, which is a common convention in OE dictionaries and study texts to show vowel length. The entry was moved by someone to a new ‘īs’ page.
Now as I see it, accents in Old English are not like accents in, say, French, where "mange" (for instance) is an entirely different word from "mangé" and everyone writes the two words in the two different ways. The Anglo-Saxons did not use accents: they spelled "maga" (stomach) exactly the same as "māga" (relative), and therefore the two words should in my opinion be on the same page. It's worth using the accents within the entry, because they are so familiar from dictionaries and study texts etc, but it's surely wrong to think they are a part of the language proper, especially since some editors use macrons and others use acute accents.
So my question is really whether there is any kind of official policy on this. It is an issue which affects a lot of ancient languages (eg Old Norse has a lot of entries in already with acute accents, but personally I don't know enough about the language to know whether that's correct or not). Widsith 10:08, 8 November 2005 (UTC)
- IMO (I moved is, btw), the entries should be on the accented pages if that is the modern-day convention of writing them, but with references from non-accented versions on the "See also" thing on top of the page ({{Seealso|īs}}). Having accented letters in the entry when the entry word (e.g. title) does not, can get confusing. Jon Harald Søby 17:03, 8 November 2005 (UTC)
- I don't know if there's an official policy for Old English on en:. On la: the practice is to put the article at an accentless spelling, but use accents in text—both because, as mentioned, the Old Angles didn't use the accents, and because of the variance between whether acutes (á), macrons (ā), or apexes (â, roughly) should be used. On en: similar is done with Hebrew and Arabic: we don't put articles at pointed spellings, which appear in the headword but not the article title; with Russian, acute accents are used in the headword but not the article title; I personally think that moving ís/îs/īs from is was a mistake, unless we care to put the same entry on all three pages... —Muke Tever 17:14, 8 November 2005 (UTC)
- I'm inclined to agree with Muke on this. With three accentuation possibilities available it doesn't seem as though the "modern-day convention" is very stable. The "see also" technique may be useful if we ever need to split up the article, but that's not an immediate problem. Eclecticology 01:17, 9 November 2005 (UTC)
- We should have an FAQ or formatting page on this subject. It applies to all languages which have (systematic) optional pointing, vowels, accents, or diacritics: Arabic, Hebrew, Latin, Old English, Russian, Turkish.
- Words in such languages use the most minimal spelling with all optional marks omitted as the page title, and the most maximal spelling with all optional marks included in the headword section/inflection line. Subsequently, it's a good idea to create a redirect from the fully "pointed" or at least other common spellings (Arabic and Hebrew can be pointed to varying degrees) to the minimal spelling. Except in the case of Russian, where Stephen vehemently argued against it - which I still don't understand. — Hippietrail 16:24, 9 November 2005 (UTC)
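The "most minimal spelling as the page title" rule can be approximated mechanically by stripping combining marks. The following is a rough sketch, not an agreed procedure; the function name is made up, and it is only appropriate where the marks really are optional. It would wrongly alter letters such as Russian й or French é, where the mark is part of the normal spelling.

```python
import unicodedata


def minimal_spelling(headword: str) -> str:
    """Strip combining marks (macrons, acutes, Hebrew points) from a headword."""
    decomposed = unicodedata.normalize("NFD", headword)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", stripped)


# Old English examples from this thread: headword spelling -> page title.
titles = {word: minimal_spelling(word) for word in ["īs", "ís", "îs", "māga"]}
```

With this, īs, ís and îs all map to the single page title is, which is the outcome argued for above.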
Extra lines
Hello, I want to ask about the extra lines that are sometimes used to end a language section.
----
==Language==
I don't see the point in having them. They are not always used, hard to maintain, look bad and are actually confusing when the language sections are short. It looks like it was something done at a time when 2nd level headers did not have a line under them (Is this true?). Is it OK if I get rid of them, or is there a reason for keeping them? Gmcfoley 13:23, 8 November 2005 (UTC)
- This issue comes up regularly. In fact, there's already a discussion on this page (#Four dashes). —Muke Tever 16:58, 8 November 2005 (UTC)
Template:EDS
I just noticed this template on bot. It's also on about 2 other pages. Is this what we want? I would've posted it straight on RFD but I don't know if it's been discussed before. It's survived quite a few edits on bot and nobody's deleted it yet. — Hippietrail 01:46, 11 November 2005 (UTC)
- Probably nobody noticed it. The person who wrote up the template did not continue his experiment. I'm sure that a closer look at the list of Templates would give us many of these abandoned experiments that never got anywhere. Eclecticology 09:47, 11 November 2005 (UTC)
- Something like this was brought up before; a common reaction was that it might encourage copyvios. Personally I think it's a good idea. On la: I sometimes link to external dictionaries (such as Reta Vortaro, for which we already have interwiki links built in, e.g. ReVo:angl.) —Muke Tever 21:50, 11 November 2005 (UTC)
- I use my monobook.js to put some dictionaries in my "navigation" bar. They don't look up the current word but they could. If we really wanted this type of thing we would do much better to do it via the global monobook.js than by putting something like this on every single page. — Hippietrail 01:36, 12 November 2005 (UTC)
Easier multilingual lookup
Why don't we define identifiers for meanings, and link those to dictionaries? Then you could look up the id of an English word, and look up (preferably automatically) that id in French, German, Italian, or even Wikipedia, IPA English etc. Currently if you want to create a 6-language lookup in Wiktionary this involves adding 30 items (possibly in 6 wikis if you want Italian etc. descriptions: 180 items): six times an English description + 5 translations. If we used identifiers 12 could suffice (one description and word for the meaning in all 6 languages). Add more languages and the difference becomes more dramatic. 145.118.85.189 02:31, 11 November 2005 (UTC)
- Sounds like you probably want m:Ultimate Wiktionary. —Muke Tever 21:47, 11 November 2005 (UTC)
Guess I'll have a look there then. Thanks! 145.118.85.189 12:42, 16 November 2005 (UTC)
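The arithmetic in the proposal above can be checked with a short sketch. The formulas are one reading that reproduces the poster's figures (30, 180 and 12 for six languages), not something stated explicitly in the post, and the function names are made up.

```python
def translations_in_one_wiki(languages: int) -> int:
    """Each of the n words needs a translation into the other n - 1 languages."""
    return languages * (languages - 1)


def translations_in_all_wikis(languages: int) -> int:
    """The same set of translations repeated in each of the n language wikis."""
    return languages * translations_in_one_wiki(languages)


def items_with_identifiers(languages: int) -> int:
    """One description plus one word per language, all linked to a shared id."""
    return 2 * languages


for n in (6, 12, 24):
    print(n, translations_in_one_wiki(n), translations_in_all_wikis(n),
          items_with_identifiers(n))
```

On this reading the translation counts grow quadratically or faster with the number of languages, while the identifier approach grows linearly, which is the "more dramatic" difference the poster describes.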
Triliteral semitic roots in etymologies
A dictionary I have used (sadly I'm not sure which or in which language) includes Arabic triliteral roots for words of Arabic origin. I would really like to do this on Wiktionary too but to go one step further to the Semitic root and naturally do the same for words of Hebrew origin (Amharic or any other Semitic languages which might crop up too). One potential problem though is how to show Semitic roots without using either the Arabic or Hebrew script. These scripts should be used at their own level however. What I want to know is if there is a standard among Semiticists for showing the roots, or if everybody has their own ad-hoc method as turned out to be the case with Proto-Indo-European roots.
Here is an example of the kind of thing I'm talking about:
Indonesian
Etymology Arabic كتاب, from the root كتب, from the Semitic root ktb.
Noun
If there is a standard, I'm not sure whether we should wikify and create an entry for each. In any case we could certainly use categories such as:
Category:Arabic root|كتب
Category:Semitic root|ktb
I would really like to hear thoughts from Stephen and any other contributors familiar with Arabic or Hebrew, or languages with lots of borrowing from them such as Spanish, Portuguese, Swahili, Yiddish, Malay, Indonesian, Farsi, English etc. — Hippietrail 01:49, 12 November 2005 (UTC)
- On la we do have, e.g. la:Categoria:Radice Hebraica אדם. I still think that the best way for linking protolanguage reconstructions is by an infobox between such categories, e.g. such as the one at la:Categoria:Radice hιππ (but perhaps better formatted—IIRC that's sitting where it was before because of an old display bug). —Muke Tever 14:58, 12 November 2005 (UTC)
- As for notation, there are variations, mainly in the romanizations of special characters; in the two sources I have to hand, there is variance among whether to use rounded semicircles ʾ (aleph) and ʿ (ain) (in the online AHD4's semitic appendix), or the more IPA-like ʔ and ʕ (Ehret's 1995 book on Proto-Afroasiatic). Ehret also uses θ and ð (thorn and eth) for AHD's ṯ and ḏ (t and d with underscore). There may be a couple other differences as well. —Muke Tever 15:13, 12 November 2005 (UTC)
- FYI "θ" is theta, "þ" is thorn. - Ec.
- LOL of course. thinko on my part. —Muke Tever 17:58, 15 November 2005 (UTC)
- The structure of our etymology sections is still relatively undeveloped. Apart from a series of derivation templates it has all been very freestyle. Showing the triliteral roots is certainly desirable for Semitic languages. Having romanized forms is valuable, as is the use of Romaji for Japanese. Unfortunately there is not one standard for transliterating some of these languages, and we should think carefully about this on the appropriate language consideration page. I would avoid the use of underscored letters from some systems because by default our own software underlines links. (I have underlining turned off in my preferences, but a newbie won't know about that.) I am also uncertain about the use of underdots for the same reason. Eclecticology 08:15, 15 November 2005 (UTC)
- More broadly speaking, on the topic of triliteral roots, I wonder if there is already a policy on the "alphabetical" organization of entries in languages which use triliteral roots (Arabic being my concern; I guess Hebrew dictionaries are organized that way too). It comes down to a question of WHO the Arabic-English dictionary is going to be for. If it's for serious students of the language, it should be organized on roots, i.e. منظمة would be listed under ن since the م in the first letter position is just a prefix. In this case, the operative three letters are ن ظ م .
- For the few internet users who are not students of Arabic but want to look up a word, this would not be a useful way of ordering the dictionary (they wouldn't know anything about three-letter radicals but might be able to paste a word into a search box to look it up). I personally think the dictionary should be organized by root, making it potentially useful for students of the language. Anyone else thinking about this? Jackbrown
Special characters
In the "special characters" box under IPA, there's a list of vowels, a list of consonants, and some extra symbols for stress, vowel length, etc. Could someone perhaps add two more: ʲ for palatalization and ʰ for aspiration? They're not needed for English, but they're extremely common in other languages. Thanks! --Angr 09:45, 12 November 2005 (UTC)
DoubleRedirects at this Wiktionary
- Halló! This wiki has lots of DoubleRedirects. Maybe someone (or a bot) can fix them. Regards Gangleri T 18:20, 12 November 2005 (UTC)
- Maybe you could even fix some. Eclecticology 08:19, 15 November 2005 (UTC)
- I'm happy to run the bot. Please vote for it on this page at #Request for bot status: DblRedirBot. --Connel MacKenzie 17:30, 17 November 2005 (UTC)
Template for Current Wiktionary projects
Greetings! I've copied in the Current projects template and user page from Wikipedia. You may now mark your user page with any current projects on which you happen to be working. The Category page will also allow others to quickly see who is working on which projects. Directions for using the template appear at the top of the page linked above. -- EncycloPetey 04:10, 13 November 2005 (UTC)
Language code templates again
What is our policy on these? Do we accept them or reject them? Do we encourage people to replace names with templates or templates with names? User:Stahr seems to be active in the former: [[4]]
Is this what we want to encourage? I think we should make a decision. — Hippietrail 15:33, 14 November 2005 (UTC)
- I certainly don't like them. They could be useful if the names of languages in English were changed fortnightly (…), but I don't see any reason why any language name would, so there's no point in having them in templates. They also make alphabetical order harder; zh is a good example of that. Jon Harald Søby 15:37, 14 November 2005 (UTC)
- All true. On en: I could see keeping discouraging them because they're not much liked, because they don't do much other than display a language name, and because English names of languages—or at least, the names English uses for languages, which isn't necessarily the same thing—are pretty much stable. (For minor language Wiktionaries they may be more useful, as the language names might not be stable; and I know say on la: they link to theoretically useful pages like la:Auxilium:Lingua Anglica and la:Project:Lingua Anglica.) It might be better to keep them for compatibility for those copy-and-pasting translations tables from other wiktionaries—if we want to encourage that kind of behaviour. —Muke Tever 18:41, 14 November 2005 (UTC)
- For the record, I also don't like them and would vote against them if it came to that. But I wouldn't bother fighting it if several important contributors are vehemently in favour of keeping them. — Hippietrail 16:37, 14 November 2005 (UTC)
- User:Stahr is a newcomer to this project, but may have worked on some of the other language Wiktionaries where templates are used. I have reversed his changes of this sort, and advised him of our usual practice. Eclecticology 09:07, 15 November 2005 (UTC)
- I'm against language templates. The copy-and-pasting to other Wiktionaries argument is nonsense since the expanded language names will have a different alphabetical order there. Ncik 02:29, 20 November 2005 (UTC)
- So you're saying it's just as easy to rearrange the alphabetical order (with templates existing) as it is to both rearrange the alphabetical order and type out translations/transliterations of language names (as must be done without them)? Sure. —Muke Tever 19:48, 22 November 2005 (UTC)
Link to Wiktionary user page from Wikipedia
How can one put a link on one's Wikipedia user page to one's Wiktionary user page?
Thanks. Doc 18:48, 15 November 2005 (UTC)
- You can use either the "wikt:" prefix or "wiktionary:". If linking from another language, then use "en:wikt:". For example, [[en:wikt:User:Connel MacKenzie]]. --Connel MacKenzie 21:28, 16 November 2005 (UTC)
- It's more correct to use [[wikt:en:User:Jon Harald Søby]] than en:wikt:…, I think. But it doesn't really matter. Both will get you where you want. Jon Harald Søby 14:37, 21 November 2005 (UTC)
Linking TO Wiktionary
I am linking here from Wikisource. Is there any way to link to a particular use within the definition? This would be particularly useful with obsolete words. If not, can we at least link to noun or verb? Using the # doesn't seem to work as it does in Wikipedia.--24.107.197.177 04:48, 16 November 2005 (UTC)
- I seem to be able to link to it#Noun fine; from Wikipedia the link is wikt:it#Noun but the link wikt:it#noun does not work (lower case "n".) --Connel MacKenzie 18:00, 17 November 2005 (UTC)
Spanish (Castilian)
User:Caretaker Gorgon seems to believe there are several Spanishes and that all of our articles cover only the variety he disambiguates as "Castilian". I would just ask him to cease and desist but I'd prefer to get some support from other contributors here first. There seem to be several hundred affected articles but I don't have a quick way of counting. Many are new and very worthwhile articles; some are just the additions of the disambiguation. In the case of the latter those edits are marked as minor, which could be interpreted as sneaky, and there are no edit comments. — Hippietrail 22:33, 16 November 2005 (UTC)
- I knew this was going to come up. And as I see it, the main problem with labelling the articles with Spanish (Castilian), is not that it is incorrect, but that it could be confusing. But the reason I chose to do so, was not to offend those who might be offended by the name "Spanish" being given to the Spanish/Castilian language. Some Catalans, Galicians, Extremadurans and other inhabitants of Spain would consider it offensive if the language they call "Castilian" is the only language that has the right to bear a name that resembles the name of the country they live in, i.e. "Spain", as if Castilian was the truly "national" language of Spain. But, on the other hand, some people would prefer to use the name "Spanish", especially Catalans, because they feel Catalonia is not a part of Spain at all. So the name of the Spanish/Castilian language is a difficult and controversial issue. Not only in Spain, but in the rest of the Spanish/Castilian-speaking world too. The name "Spanish" (español) is without doubt the most used name for this language, at least by non-Spanish/Castilian-speaking people (that is the reason why I put the name "Castilian" in brackets and not behind a slash), but millions of hispanohablantes, or should I say, castellanohablantes, use the name "Castilian" (castellano), when referring to the language they speak, especially in some Latin American countries (see Names given to the Spanish language). But maybe I am just trying to be too politically correct when I am adding "Castilian" to the articles, just like I avoid using the word "American", when speaking of a person from the United States. But again, I can see how this can be confusing, and I will therefore accept it if we decide to leave it out, and only use "Spanish".
- While we're at it, what is the policy regarding the name(s) of the Serbian/Croatian/Bosnian/Serbo-Croatian/Croato-Serbian/Bosnian-Croatian-Serbian language(s)? — Caretaker Gorgon 13:02, 17 November 2005 (UTC)
- I think the idea that it's at all controversial to call the language Spanish outside the Iberian peninsula is an exaggeration. The vast majority of Latin Americans call their language español/Spanish. Even in Spain the term is essentially accepted; let's not let politics get in the way of clarity. RE: Serbian/Croatian/Bosnian - Yes, the intentional (and mostly imaginary) fracturing of Serbo-Croatian into three different "languages" is an amusing artifact of the first Yugoslavian war. Actually the only amusing artifact, I would say.--Jackbrown
- Our regular Serbian contributor puts Serbian words in and marks them as Serbian and Bosnian words as Bosnian, e.g. Jadransko more. Presumably Croatian words would be marked Croatian, and words in the now-apparently-defunct Serbo-Croatian the same. This is of course for the most part massively redundant, as they are generally identical in broad meaning if not in details, but this is probably the only way to satisfy NPOV without resorting to compound headers like ==Bosnian and Serbian== which I'm certain our parsing-oriented contributors would universally decry. —Muke Tever 23:27, 17 November 2005 (UTC)
- Using parentheses in level two headers is certainly syntactically incorrect on this English Wiktionary. We do not use Wikipedia style disambiguation at all. I have no knowledge regarding the POV of political correctness for the Spanish language. So I will not voice any opinion on whether or not to tag definition/meaning lines with something like {{castilian}} as I do not know if that is correct or appropriate. But mangling of the level two language headers should be reverted, I think. --Connel MacKenzie 18:11, 17 November 2005 (UTC)
One day it would be nice to be able to allow users to select between Burmese and Myanmar; Khmer and Cambodian (and perhaps Kampuchean); Farsi and Persian; Sinhala, Sinhalese, and Singhalese; Maori and Māori; Filipino and Pilipino; Lao and Laotian; etc. (In fact it's already possible using templates and CSS, except for sorting.) But for now, every print dictionary I've seen in English and in Spanish-speaking countries uses "Spanish". Latin-American countries do have preferences for which name they use, but nobody gets upset when somebody else uses the other one. Interestingly, my Catalan-Spanish dictionary which was published in Catalonia and which I bought in Barcelona, uses "Català-Castellà" so apparently they at least weren't too fussed about whether they were part of Spain or not, at least not in choosing the title of their dictionary.
As far as anybody being offended, that's equally likely whether we choose "Spanish (Castilian)" or "Castilian (Spanish)". So far nobody's been offended. — Hippietrail 19:47, 18 November 2005 (UTC)
Wiktionary logo
OK apologies if this has come up before, but it's been bugging me for ages. The Wiktionary logo over there on the left - the phonetic thing - well it's wrong, isn't it? Surely it should be ['wɪkʃənri] ....? Widsith 08:05, 18 November 2005 (UTC)
- I think /'wɪkʃənri/ would be even better... - Dakdada 17:09, 18 November 2005 (UTC)
- Perhaps in preparation for Wiktionary:Wiktionary Day (12 December) we could have a new logo contest? --Connel MacKenzie 19:14, 19 November 2005 (UTC)
- Whether it's an "i" or "ɪ" may be no more than a matter of what part of England one is listening to. My experience with logo issues is that we cannot possibly deal with that so quickly, even if it is desirable to change the logo. Eclecticology 07:12, 20 November 2005 (UTC)
- Well I don't know how these things are done, but personally I think it should be changed, even if it's not a quick job - it kind of looks bad. As for it being a regional pronunciation - that's a fair point, but then we're hardly using Yorkshire accents as a rule of thumb within the entries, so why should we do it on the main emblem? I just think for anyone used to checking phonetic information in dictionaries, it doesn't look like a regional variation, it just looks like an error. Widsith 19:38, 20 November 2005 (UTC)
- The definition isn't even in Wiktionary format SemperBlotto 19:46, 20 November 2005 (UTC)
- True, but our formats were far from established when the logo was drawn up; wilco is not in that format either, and we have decapitalized. One further observation: If this were drawn the same way now, the text above the name should probably read "a directory of species". Wikispecies is now the project that immediately precedes Wiktionary in alphabetical order, but that was not the case three years ago. Still, I think that the underlying concept was clever. Eclecticology 17:11, 22 November 2005 (UTC)
Translations of inflected forms (Old heading: was)
Looking at edit history of was I am wondering why a contributor would perform this sort of (repeated) vandalous data removal. Given my own history with this contributor, perhaps someone else could enlighten either myself or User:Ncik. --Connel MacKenzie T C 20:45, 19 November 2005 (UTC)
- I suppose you are talking about the translations. This matter was discussed before and, as far as I remember, the community came to the decision not to add translations (nor synonyms or antonyms) to inflected forms because this would mean that the definitions would have to be mirrored. Ncik 02:40, 20 November 2005 (UTC)
- There is no particular prohibition to adding translations to inflected forms. In many cases they may not be very helpful, but if someone wants to add them there is no need to remove them. Eclecticology 07:04, 20 November 2005 (UTC)
- We should encourage contributors to add information in places where other users would expect it to be found. A big, fat "Do not add translations for inflected forms" on WS:ESE would make sure editors only translate the basic form (infinitive, singular, positive, etc.) of the English word, and then add the inflections of the translations to the appropriate language sections of those translations. Similar considerations hold for synonyms and antonyms. Allowing translations to be added for inflected forms would also cause an enormous mess. Languages inflected more highly than English, e.g. Latin or German, will normally produce at least a dozen translations for the past tenses of most English verbs just by inflection (6 persons x 2 moods + past participle, imperative, etc.); multiply this by 1-5 translations per meaning, of which in many cases there will be more than 3, or even 5, and expect, considering overlap of translations of different definitions, 100+ translations. The problem is less dramatic for other parts of speech, though. Ncik 20:45, 20 November 2005 (UTC)
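The estimate above can be reproduced as a rough upper bound. This is only a sketch of the arithmetic: the specific values chosen (3 translations per meaning, 3 meanings) are assumptions picked from the ranges quoted in the comment, and overlap between translations of different definitions is ignored, so real counts would be lower.

```python
def upper_bound_translations(inflected_forms: int,
                             translations_per_meaning: int,
                             meanings: int) -> int:
    """Upper bound: every target-language form listed for every translation of every meaning."""
    return inflected_forms * translations_per_meaning * meanings


# 6 persons x 2 moods, plus past participle and imperative, as in the comment above.
FORMS = 6 * 2 + 2

if __name__ == "__main__":
    print(FORMS)
    print(upper_bound_translations(FORMS, translations_per_meaning=3, meanings=3))
```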
- I agree that it may not be terribly important to remove them all, but I believe that translations of inflections will cause more confusion than help for the great majority of the readers, and hence should - in general - be discouraged. \Mike 10:21, 21 November 2005 (UTC)
- I remember no such discussion that concluded that inflected forms should not have translations. I have only heard/read that they should since the translations themselves are very likely to be different. --Connel MacKenzie T C 06:40, 23 November 2005 (UTC)
- I don't remember any such discussions either. In general I wouldn't encourage it, but I don't support deleting them wherever they appear. Is somebody confusing this with our accepted practice of not having translation lists on foreign words? Eclecticology 17:18, 24 November 2005 (UTC)
- I see nothing confusing in translations of inflected forms. I wouldn't bother promoting the addition of them, but I'm certainly against their deletion when somebody has gone to the trouble to add them. — Hippietrail 17:38, 25 November 2005 (UTC)
- I've restored them (again). --Connel MacKenzie T C 01:19, 29 November 2005 (UTC)
- Here's why we should not be translating inflected forms. What are the translations of "got"? Well, it depends on your meaning of "get", so you need to go to get to find these. There are very many. You add translations of the meanings you know, but not all of them go in, perhaps. A new meaning of "get" is added, and the translations of "got" are not updated. Things get out of synch fast and are not helpful to the user.
- Here's another reason: how do you translate "ate" into French (just in its commonest sense, let alone its others)? Is it "mangé"? No, because that is the past participle, which is a translation of "eaten". How about "mangea"? That's closer, because that is the past historic, and "he ate" can be translated as "il mangea", but it's the third-person singular, so little use if you actually want to translate "we ate". How about, then, "avons mangé"? Well, that fits, but we would have to give "ai mangé", "as mangé" and all the others too. Hm, it's starting to get long-winded, and this is only for one meaning of "ate".
- Here's the only suitable way to do things: not to add translations to inflected forms. If the user wants to translate "got", they see that is the past tense (and past participle in UK English) of "get", go there, look up the appropriate meaning of the infinitive, find the translation of the infinitive, go to that page and look up the past tense. Any changes, additions or deletions keep this process consistent. (From a database point of view, this is a form of data normalisation, which is a good thing.)
- Here's what now happens with "we ate". "Ate" cross-refers to "eat", where the user finds "manger" (which, all being as it should, links directly to the French word rather than the page, which will begin with the English word of the same spelling). There they find a table of conjugated forms of the verb. Depending on whether the perfect or the past historic is the more appropriate tense, they can now see exactly what the translation should be. (There are already templates for tables of conjugations in some languages, so the work to add conjugations is minimal, at least for regular verbs.)
- Any existing translations of inflections will eventually be moved to conjugation tables and so the work done to add them will not have been in vain.
- This greatly cuts down the amount of work we have to do, eliminates inconsistency by making consistency permanent, avoids redundancy and, most importantly, actually helps the user. Giving translations of inflections does none of these things. — Paul G 18:51, 30 November 2005 (UTC)
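The lookup chain described in this comment (inflected form, then base form, then translation, then conjugation table) can be sketched as follows. The three dictionaries are hypothetical miniature data for illustration only, not Wiktionary's actual storage, and the function name is invented for this sketch.

```python
# Hypothetical miniature data: each fact is stored in exactly one place.
BASE_FORM = {"ate": "eat", "eaten": "eat"}           # inflected form -> base form
TRANSLATIONS = {("eat", "fr"): ["manger"]}           # (base form, language) -> translations
CONJUGATIONS = {                                     # translation -> tense -> person -> form
    "manger": {
        "passé composé": {"je": "ai mangé", "tu": "as mangé", "nous": "avons mangé"},
        "passé simple": {"il": "mangea", "nous": "mangeâmes"},
    }
}


def translate_inflected(form: str, lang: str, tense: str, person: str) -> list[str]:
    """Follow the cross-references instead of storing translations on the inflected entry."""
    base = BASE_FORM.get(form, form)
    results = []
    for target in TRANSLATIONS.get((base, lang), []):
        table = CONJUGATIONS.get(target, {}).get(tense, {})
        if person in table:
            results.append(table[person])
    return results


if __name__ == "__main__":
    print(translate_inflected("ate", "fr", "passé composé", "nous"))
    print(translate_inflected("ate", "fr", "passé simple", "il"))
```

Adding a new sense or translation under "eat" in this model is automatically visible from "ate", which is the consistency argument being made; the cost is that the reader has to follow several hops.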
I don't see these arguments showing "why not to include translations of inflected forms", but merely why doing so is hard. It's true that translating "ate" into French results in many forms. But so does any word with more than one sense. Any English adjective into Spanish results in up to 4 possibilities, into Russian up to 5 possibilities - both per sense, and counting only the non-inflected English forms. Also nobody is asking people who don't want them to do the work. People who want to do it will do it, or nobody will. In every respect Wiktionary has lots of missing data and surely always will; users should be made well aware of this.
Having said that, I think there's a very good chance that templates can be used to do it by including all inflected forms in the destination language and adding "hide" or "invisible" attributes to the forms which are not relevant. It shouldn't be too hard. — Hippietrail 21:12, 30 November 2005 (UTC)
- My main argument for not including translations of inflected forms is that they can very quickly become out of synch with the base forms, which makes the information useless. We've already had this problem with variant spellings, and the solution we came up with was to put the information in one place, that is, at whichever spelling is entered first, with a cross-reference at the other spelling(s). The ideal solution here is the same - put the information in one place to avoid redundancy and inconsistency (a basic principle of database management), not to mention lots of work (should anyone have lots of time on their hands and want to do it). Incidentally, this is the system used in print dictionaries, partly to save ink and paper but mainly for this reason. — Paul G 10:12, 1 December 2005 (UTC)
- Paul, that is not my recollection of those conversations. The AmE vs. CW/UK solution is to include both spellings. We've never had a truly acceptable experiment yet, though. Also, the concept was not to delete information but rather to merge it! But merging is only appropriate some of the time. In the case of was where the verb inflection is such a particularly tricky part of the language, the translations most certainly do belong in the inflected forms of to be. As someone pointed out earlier, no one is asking you to do this grunt work. I am merely asking for deletion of (valid!) content to stop. --Connel MacKenzie T C 01:21, 2 December 2005 (UTC)
- The trickier translating an inflected word gets the less desirable it is to have those translations on that page, not the other way round, Connel! You want to be as precise as possible. Deletion of existing translations (of which there really aren't many yet) should continue to make sure nobody gets the idea to add translations in the wrong place. Ncik 22:58, 2 December 2005 (UTC)
- Having separate translations was an argument presented to me for having separate entries for inflected forms in the first place! The inflected forms (especially for tricky core terms) should be on the separate entries. Over-consolidating them is what causes problems. --Connel MacKenzie T C 18:06, 9 December 2005 (UTC)
Templates and Categories
How do I bar categorising templates from adding themselves to categories? Ncik 02:46, 20 November 2005 (UTC)
- Template:rfd has this line: <includeonly>[[Category:Requests for deletion]]</includeonly>. --Connel MacKenzie T C 03:02, 20 November 2005 (UTC)
- Thanks. Ncik 20:01, 20 November 2005 (UTC)
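The effect of the `<includeonly>` wrapper quoted above can be illustrated with a deliberately simplified model. This is not the MediaWiki parser, only a sketch of the idea: text inside the wrapper is dropped when the template page itself is viewed and kept when the template is transcluded into an entry. The function names and the sample template text are hypothetical.

```python
import re

# Simplified model only: real MediaWiki parsing handles nesting, noinclude, onlyinclude, etc.
INCLUDEONLY = re.compile(r"<includeonly>(.*?)</includeonly>", re.DOTALL)


def render_template_page(wikitext: str) -> str:
    """Viewing the template's own page: includeonly content is omitted."""
    return INCLUDEONLY.sub("", wikitext)


def render_transcluded(wikitext: str) -> str:
    """Transcluding the template elsewhere: includeonly content is kept, tags removed."""
    return INCLUDEONLY.sub(lambda m: m.group(1), wikitext)


if __name__ == "__main__":
    template = ("This entry is nominated for deletion."
                "<includeonly>[[Category:Requests for deletion]]</includeonly>")
    print(render_template_page(template))
    print(render_transcluded(template))
```

In this model the category link never appears on the template page, so the template does not list itself in the category, while every entry that uses it does.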
Japanese links and redirects
Should the kana and romaji be links or not?
| With links | Without links |
|---|---|
| Japanese / Noun | Japanese / Noun / 歌 (うた, uta) |
I put the links in but other people have removed them.
Also, should ふらんす have its own entry as I gave it, or should it be a redirect, as it is now? Gerard Foley 17:21, 22 November 2005 (UTC)
- Yes, the kana, kanji, and romaji should be linked. I fail to see why he would remove the links. On the other point I would probably prefer that the hiragana version be an article in its own right so that it could explain why the katakana would be used. I do note that you use Category:Furigana. Why this instead of Category:Hiragana? My understanding is that the term furigana is only used to refer to the hiragana script when used beside the text rather than in the text.
- Participation in the entry of Japanese material has been spotty, so that our standards for these things have not really been established. I look forward to your reply. Eclecticology 02:18, 23 November 2005 (UTC)
I would like it, if someone else can change the templates back and revert ふらんす, (I don't want to start an edit war).
As for using Category:Furigana, I don't know why I use this category, probably something I read in the Help pages. Gerard Foley 02:50, 23 November 2005 (UTC)
- I support treating ふらんす as a redirect, as a loanword "フランス" should be written in that manner in light of the Japanese orthography, like a proper noun "France" should be like that in English. I've found "france" is also a redirect.
- And, I stand neutral on the wikification of kana and roma-ji while I feel that it'd be rather informative if kanji is also wikified there. --Tohru 04:16, 23 November 2005 (UTC)
I've spent the last few hours trying to understand the issues and to find ways to deal with them. The fact that "france" is a redirect has no bearing on this discussion. That page was a by-product of the conversion to case sensitive first letters; it would not otherwise have come into existence. If it were solely up to me I would have deleted that page long ago.
Any Japanese word can be represented in hiragana, katakana or romaji, but not necessarily in kanji. Of these hiragana is the most common syllabic form for writing Japanese, romaji is clearly foreign but very helpful for us foreigners, and katakana has become associated with a range of specialized uses among which the most notable is in the representation of most foreign nouns. Since the English Wiktionary is primarily for the benefit of the English speaking user he should be spared from the arcane long-lived debates about which script is appropriate in which circumstances. I would propose that the hiragana form be recognized as canonical for the purposes of Wiktionary. Romaji entries should exist for all Japanese words. Kanji and katakana entries should be set up as required; they should all link to the relevant hiragana form, preferably not as redirects. If a katakana entry is more common because it represents a foreign word, that should be clearly indicated.
We do not need categories for hiragana and romaji since these will apply to all Japanese words. Furigana represents a particular way of using hiragana, and should not be used as a category, except perhaps under specialized circumstances. There may be some justification for kanji, katakana and japanese nouns but by and large I find them useless because of the enormous numbers of elements that may belong in those sets. A useful category contributes to the hierarchical organization of knowledge.
I also have views about the way that templates are being applied to Japanese terms. I have simplified the Japanese entry at ana; a single template could still be used to show the three writing forms, but I'll get into that elsewhere. Eclecticology 22:19, 23 November 2005 (UTC)
- My thoughts on ana:
- "Japanese romaji" is not a language, the heading should be just Japanese
- I don't like the order of the あな, ana, 穴. It should be romaji and then kana, no kanji
- I think it is best to put the kanji next to the short-def as was written on the Help page
- I also like the brackets as in ana (あな)
- The reasons for the new templates are:
- to speed up the entry of new Japanese words
- to apply a consistent format to the entries
- to make changing layouts easier
- to place words in the appropriate category
- The templates were made to follow the style guide on the Help page (which I like)
- The templates are
- janoun
- janoun2
- kanji
- furigana
- romaji
- japdef
- t1verb
- t2verb
I have changed the entry at ana to a template that gives a series of boxes; this same template could be used for all three entries of the word, thus minimizing the number of different templates. I suppose that putting "romaji" in the heading isn't that necessary since a Japanese word in Roman script is necessarily romaji. Also, if need be, the order of the three forms could be changed; without the boxes having the romaji in the middle acted as a nice separator. Putting the hiragana in the first position emphasized its relative importance.
I've noticed that the part of speech is missing from many of the romaji and hiragana entries. It should be added. The "Romanization of" line is a redundancy in the romaji entries.
I don't see the point of sticking the kanji in the definition lines, or even much value to putting the definition inside the template. This doesn't leave much purpose to the "japdef" template. The definitions are generally not italicized.
As I said before the romaji and hiragana categories are pointless since there should be such entries for every Japanese word. Eclecticology 10:41, 24 November 2005 (UTC)
- You can look at かんちょう for a comparison of the two systems. Gerard Foley 13:42, 24 November 2005 (UTC)
Here's a few thoughts from me. I own half a dozen Japanese dictionaries and have tried reading Japanese texts with dictionaries since my knowledge of Japanese is pretty minimal. The former informs me what is traditional in Japanese dictionaries, the latter provides a basis for what works and doesn't work in the real world.
I think primary entries should be in whatever script is primary. I think secondary entries can benefit by having short glosses rather than full definitions. Hiragana, katakana, and romaji pages are very useful as indices when looking up a word you come across in those scripts. Following the links from them to the primary form will then provide a full definition. Including kanji in them will help disambiguate for users knowing various amounts of kanji, or help jog rusty memories.
Real Japanese dictionaries sort by kana spelling and treat hiragana and katakana as pretty much the same. A common way is to show hiragana for all words spelled in kanji or hiragana and katakana for words which use neither. Kanji is provided in the articles for words which usually or often use kanji. Some use kanji as primary headword, some stick with the kana.
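The kana-insensitive ordering described above can be sketched with a sort key that folds katakana onto hiragana before comparing. The sketch relies on the Unicode layout in which katakana U+30A1 to U+30F6 sit exactly 0x60 above the corresponding hiragana; the function name and word list are illustrative, and plain code-point order is only an approximation of real dictionary (gojūon) collation, which treats voiced and small kana more carefully.

```python
def kana_sort_key(word: str) -> str:
    """Fold katakana to hiragana so both scripts sort together (approximate collation)."""
    folded = []
    for ch in word:
        cp = ord(ch)
        if 0x30A1 <= cp <= 0x30F6:      # katakana small a .. small ke
            folded.append(chr(cp - 0x60))
        else:
            folded.append(ch)
    return "".join(folded)


if __name__ == "__main__":
    words = ["フランス", "かんちょう", "うた", "カメラ", "あな"]
    print(sorted(words))                      # raw code-point order: all katakana last
    print(sorted(words, key=kana_sort_key))   # katakana interleaved with hiragana
```

With the key applied, カメラ sorts between うた and かんちょう rather than after every hiragana word, which is closer to how the print dictionaries mentioned above arrange their headwords.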
As for italics, in my experience with bilingual dictionaries of various language pairs, normal straight text is used for glosses, italics is used when there is no gloss into the other language and an explanation is required.
I've said before that using "furigana" as a heading is misleading and unhelpful.
Using "Japanese hiragana" implies that hiragana is not the usual form for writing this word and fuller information may be found at the usual kanji spelling. Entries in their usual spelling, no matter what that is, should just use "Japanese". "Japanese romaji" may seem ambiguous from limited points of view. But when encountered in a list of articles in various languages using the same spelling, it gives a little more help to Japanese-ignorant people and also implies that this is not a usual spelling. Just using "Japanese" would be misleading for people looking for a bunch of translations of their favourite word. Just using "romaji" won't mean much to the very many people who know that Japanese is a language but know nothing about the terms for the various ways in which it can be written.
I think it's best to find a middle-ground that works for both beginners and experts of Japanese, without hindering either. — Hippietrail 17:26, 25 November 2005 (UTC)
- Hear, hear to all of that. Widsith 18:34, 25 November 2005 (UTC)
Some replies to Hippietrail's comments.
I think primary entries should be in whatever script is primary.
- Yes, I agree.
I think secondary entries can benefit by having short glosses rather than full definitions.
- Yes, I agree.
Hiragana, katakana, and romaji pages are very useful as indices when looking up a word you come across in those scripts.
- Yes, I agree.
using "furigana" as a heading is misleading and unhelpful.
- OK, what should we change it to?
- We could go with "hiragana" and "katakana" or just "kana" - perhaps depending on what we go with elsewhere. — Hippietrail 18:12, 26 November 2005 (UTC)
Using "Japanese hiragana" implies that hiragana is not the usual form for writing this word and fuller information may be found at the usual kanji spelling.
- Yeah, but I strongly think 2nd level headers should display the language only. I think that if the entry is formatted like I have been doing them (look at かんちょう under ==Japanese==) it is clear that this is not how it is usually written.
And for the romaji bit, the romaji entries say "Romanization of:" before a list of words which show how the word is actually written, I don't see a big problem there. Gerard Foley 00:19, 26 November 2005 (UTC)
- We already extend the level-2 header in a couple of instances, notably "translingual". While a level-3 header below seems okay, it won't be if it pushes other headers to an extra level. — Hippietrail 18:12, 26 November 2005 (UTC)
- Pondering the romaji bit a bit more, I think there is a clear and essential distinction between hiragana/katakana script and romaji script: the former is genuine while the latter is auxiliary, and it is suitable and informative to distinguish them as such with the top-level header, just as Hippietrail remarked. My own tentative choice for now is ==Japanese Romanization==, so that the convention is applicable to Romanization entries of other Asian languages like Thai, Korean and Chinese as well, though I feel ==Japanese romaji== is a feasible option too. Anyhow, I hope those Asian languages are taken into account too, since this issue is common to Japanese and to them. The conclusion here can be quite influential for them, or for the overall consistency of Wiktionary, for better or worse. -Tohru 09:16, 26 November 2005 (UTC)
- Romaji is auxiliary but it is mostly standardized. "Japanese romanization" would be equally correct for all manner of older ad-hoc borrowings from Japanese into English, of which "saki" comes to mind, but there are rarer terms which are much further from what English speakers and learners of Japanese think of when they use the term "romaji". Romaji is (without going off to read) a version of the Hepburn romanization scheme as currently in very wide use among those who need to express Japanese in the Latin alphabet. It includes macrons over long a/e/o/u and optionally over "i" which may also be written "ii". It also includes an optional apostrophe to disambiguate between final "n" and "n" forming part of a syllable with a vowel. Other forms of romanization are probably called "romaji" in Japanese but this is not usual with English speakers or romanized Japanese dictionaries in print.
- Chinese and Korean each have two (or more) standard, accepted romanization forms which are in wide use, or were in wide use in the past. Thai has one relatively new system which is not yet widely used outside the Thai government. Thai teaching materials all invent their own system or a new variation of a system used prior in other such material. No Thai system is very well known to English speakers, unlike romaji and pinyin.
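The apostrophe convention described above can be illustrated with a minimal sketch. It assumes the word has already been split into romanized morae (the function name `join_morae` and the input format are hypothetical, not part of any Wiktionary tool), and it only handles the "n" disambiguation, not macrons.

```python
# Letters that can begin a vowel-initial or y-initial mora (including macron vowels).
VOWELS_AND_Y = set("aiueoyāīūēō")

def join_morae(morae):
    """Join romanized morae into one word.

    An apostrophe is inserted after a syllabic "n" when the next mora
    starts with a vowel or "y", so that "ka-n-i" and "ka-ni" stay distinct.
    """
    parts = []
    for index, mora in enumerate(morae):
        parts.append(mora)
        following = morae[index + 1] if index + 1 < len(morae) else ""
        if mora == "n" and following and following[0] in VOWELS_AND_Y:
            parts.append("'")
    return "".join(parts)

print(join_morae(["ka", "n", "i"]))  # kan'i
print(join_morae(["ka", "ni"]))      # kani
```

Without the apostrophe both inputs would come out as "kani", which is the ambiguity the optional apostrophe removes.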
- OK. No extra level-2 headers, for romanized foreign words. I might have gone too far in this, and won't stick to the previous idea.
- And now, I'd like to know a little more about your thoughts on the lesser known or not so standardized romanization systems, like RTGS for Thai or the Arabic transliteration systems. Do you feel that titles in those scripts should be accepted here, or not? (My original concern was about how to cope with the possible confusion in case of accepting them.) In other words, should only romanization systems in widespread use among English speakers be accepted?
- Though this is not urgent, I feel that it'd be beneficial to discuss it a bit to prepare the way for further discussion on the romaji specific issues. --Tohru 07:00, 27 November 2005 (UTC)
- Personally I'm not in favour of adding articles using romanization systems not widely known by English speakers. However, if we get some people who feel strongly about adding romanized index/gloss style articles in specific standardized schemes for Thai, Greek, Russian, Hebrew, or Arabic, I will not argue against them. As long as the people submitting such articles know what they are doing. We don't need ad-hoc romanization articles as have appeared from time to time in the past. In particular, I strongly feel the name of the system should be used prominently in each article. — Hippietrail 16:54, 27 November 2005 (UTC)
- Thank you for the helpful suggestion. Yes, the coherence would be the matter in such relatively minor romanizations. -Tohru 16:58, 29 November 2005 (UTC)
==English misspelling== would imply that this is not the usual form for writing this word and fuller information may be found at the usual spelling, but we don't do that (to my knowledge); the main heading only lists the language. The romaji entries have a 3rd level heading saying "Romaji", and then a line saying "Romanization of:", how much clearer can it be? People looking up Japanese words will probably know that they are not written this way in Japanese anyway. Gerard Foley 13:38, 26 November 2005 (UTC)
- I wouldn't support this. I would prefer "English" at level 2 and "Misspelling" at level 3. The POS and other real info will be in the real article. — Hippietrail 18:12, 26 November 2005 (UTC)
We seem to have a lot of questions to answer about the layout of Japanese entries. For now let's concentrate on the original question that I asked: should the kana and romaji be links or not? At least let us answer this one. I think that they should, and I believe Ec agrees with me. Gerard Foley 01:00, 27 November 2005 (UTC)
- Keep both. My feeling is actually quite a bit stronger for the kana. Linking is what the web is all about, and unlike the "link every word" argument, all of these links will be 100% relevant. — Hippietrail 14:23, 28 November 2005 (UTC)
- I second linking the kana and romaji, while feeling further discussions will be needed about what kind of content should be there in romaji entries and how to normalize the titles (especially for the idiomatic or phrasal ones). -Tohru 16:58, 29 November 2005 (UTC)
- I think linking kana and rōmaji is fine, but I strongly feel the second of the two layouts on かんちょう is infinitely preferable (except for the "furigana" heading), not least for readability. I suggest it should be headed "Hiragana", and definition 10 should be removed and only appear under the (linked) rōmaji article. Widsith 10:24, 1 December 2005 (UTC)
- I like the furigana heading and I don't think it is confusing. The time you use it is when there are several kanji that sound the same. They have the same furigana. If there are no kanji (and thus no furigana), then you'd use 'noun' or 'verb' or whatever part of speech and just define it. I do like the かんちょう example. My votes: linking kana and romaji = yes and keep furigana Millie 16:29, 6 December 2005 (UTC)
Neat icons
Taken from Talk:Main Page:
- Is there any reason why the French language Wiktionary has nice icons (for example, see here http://fr.wiktionary.org/wiki/ann%C3%A9e), while the English version has none?
- Ultra megatron 02:01, 24 November 2005 (UTC)
The icons look brilliant. I want them!!! Gerard Foley 02:31, 24 November 2005 (UTC)
- Oooh.. they do look good. They seem to be using a template for the headings, which allows them to insert the image as well. What that means is we would have to change all ===Noun=== lines into {{Noun}} lines. Might be possible with a bot. This gets more difficult when you have to worry about heading depth, i would guess. But to answer the question in the spirit in which it was asked... The only thing standing between us and doing that is, uh... doing it. So if we want to, we can. — Fudoreaper 04:57, 24 November 2005 (UTC)
- No.... there is universal deprecation of replacing headers with templates here on en:. For one, it generally screws up edit sections. (fr: gets away with this by adding __NOEDITSECTION__ to every page through the language template, thus disabling edit sections everywhere.) Now, if it were ==={{Noun}}===, that'd be a whole different kettle of fish; it wouldn't break anything. —Muke Tever 07:53, 24 November 2005 (UTC)
I'm all for it! Also, I think there is a possible solution for it, where we don't have to use templates (the images won't become links to the image description page either). However, it would require a small change in the MonoBook template by the developers; these lines:
<p><a name="Noun" id="Noun"></a></p> <h2>Noun</h2>
to be changed to this:
<p><a name="Noun"></a></p> <h2 id="Noun">Noun</h2>
After that, we do a small CSS trick in MediaWiki:Monobook.css, and voilà! it's universal. I could ask the people on #mediawiki-tech to fix this – if there is a community consensus for it, that is. Jon Harald Søby 14:01, 24 November 2005 (UTC)
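The markup change proposed above, and one guess at the CSS that could follow it, can be sketched as below. The comment does not spell out the "CSS trick", so the background-image rule is an assumption, and the helper names (`move_id_to_heading`, `icon_rule`), the icon file name and the 24px padding are all hypothetical.

```python
import re

# Matches the anchor-plus-heading pattern quoted in the comment above.
OLD_HEADING = re.compile(
    r'<p><a name="(?P<name>[^"]+)" id="(?P=name)"></a></p>\s*<h2>(?P<title>.*?)</h2>'
)

def move_id_to_heading(html):
    """Move the id attribute from the empty anchor onto the <h2> element."""
    return OLD_HEADING.sub(
        r'<p><a name="\g<name>"></a></p> <h2 id="\g<name>">\g<title></h2>',
        html,
    )

def icon_rule(heading_id, icon_url):
    """Build a CSS rule that shows an icon as the heading's background (assumed trick)."""
    return (
        f'h2#{heading_id} {{ background: url("{icon_url}") no-repeat left center; '
        f'padding-left: 24px; }}'
    )

old = '<p><a name="Noun" id="Noun"></a></p> <h2>Noun</h2>'
print(move_id_to_heading(old))
print(icon_rule("Noun", "noun-icon.png"))
```

Because such a rule would live in the stylesheet, a reader who does not want icons could override it in personal CSS, and the wikitext headings themselves would stay untouched.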
- Yes, yes, yes let's do it!!! Gerard Foley 15:42, 24 November 2005 (UTC)
- We would also need to decide on which icons to use, and for which purposes. We could just copy the french, of course, or we could find our own images. Perhaps an intrepid user should collect all the icons needed, identify which headings they would be used for, and put this on their userpage. Then we could all look at the proposed icons before we start suggesting we switch. Any takers? — Fudoreaper 16:20, 24 November 2005 (UTC)
- Perhaps a new skin could be requested from the developers that incorporates these features. I do not want my Wiki connection to get any slower; icons seem to be the most problematic aspect of page loading/slow performance. Adding four to twenty unique icons per page certainly cannot help speed things up. With a new skin, users that want the {{prettyicons}} can have them. I admit, they do look good, but I don't think they add much lexical information. --Connel MacKenzie T C 16:39, 24 November 2005 (UTC)
The icons will not add any information, but they look so good. A new image server was added earlier this week, so the icons should load faster. I think the icons are better than having the edit links, but I would like both if possible. Gerard Foley 16:49, 24 November 2005 (UTC)
- Also, other examples: fr:xénon, fr:purchase, fr:галоп, fr:lingerie and fr:Page d'accueil is pretty cool too.
- One more image server isn't even a drop in the bucket. My point is that images are separate httpd requests, therefore even if the servers perform instantaneously, (that'll be the day) there is still a very significant performance lag. --Connel MacKenzie T C 17:05, 24 November 2005 (UTC) --17:10, 24 November 2005 (UTC)
- It's not so bad, Connel. The TEXT will still load just as fast, but perhaps the images will load more slowly. And as you say, since the images aren't lexical info, a non-loading image doesn't mean the wiktionary is less useful. Additionally, a set of 10 (or so) icons on en.wiktionary is such a small increase in load i'm sure it won't be felt on the overall load of wikimedia. If the icons are the same as fr.wiktionary, they're already being cached on the squid servers. The images would be cached locally, on the browser, as well, since each page would have the same set of icons. A user looking at multiple pages would have the images cached locally, removing the need for additional httpd [sic] requests. In short, i disagree with your fearmongering about "destroying performance/response time just for the sake of a little eye candy". — Fudoreaper 19:29, 25 November 2005 (UTC)
- Why'd you put your question here? The bottom of this section might be better for comments. Fudoreaper, you are wrong. 10 (actually about 30 icons) hosted on 180 separate Wiktionaries is very significant. There has been a very long issue on WikiCommons regarding the "WikiSister" icons at the bottom of Main Pages on most WikiMedia sites. For about four months now, an effort has been underway to reduce the icons' "weight" on the squid servers. Unfortunately, each separate URL reference to an image gets cached separately on the squids, abolishing the benefit of common icons from language to language (because they are referenced and rendered as separate URLs.) Having images uploaded on separate sites compounds the problem significantly. But the caching memory is expended for each site (e.g. en.wiktionary vs. fr.wiktionary) that references that same (rendered) image. Your browser caches them separately also. If you close your browser, and reopen it, then load our main page, you will see the slowness (even today) that I'm describing. (Depending on your browser and settings, you may not see that delay again.) Yet the majority of visitors start at the main page...therefore you are suggesting a situation that makes the (already slow) first page even more painful to new visitors.
- I object to the images because it would screw up my activities a lot. Javascript does not begin processing on a WikiMedia page until all images are loaded. While turning off image loading is something I do often, it hoses javascript too much to leave turned off. Average page loads here are already pathetically slow. You are proposing making them an order of magnitude slower. In a couple years, as technology improves, perhaps. Right now, no way, dude. --Connel MacKenzie T C 20:01, 25 November 2005 (UTC)
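The per-URL caching point made above can be illustrated with a toy cache keyed by URL. This is only a sketch of the general idea, not a description of how the Squid servers were actually configured; the class `UrlKeyedCache`, the `fetch_icon` stand-in and the example.org URLs are all hypothetical.

```python
class UrlKeyedCache:
    """A toy cache that, like any URL-keyed cache, treats each URL as a separate object."""

    def __init__(self):
        self.store = {}
        self.fetches = 0  # number of times the origin had to be contacted

    def get(self, url, fetch):
        if url not in self.store:
            self.fetches += 1
            self.store[url] = fetch(url)
        return self.store[url]

def fetch_icon(url):
    # Stand-in for an origin request; every URL returns identical bytes.
    return b"same-icon-bytes"

cache = UrlKeyedCache()
requests = [
    "http://en.example.org/icons/noun.png",
    "http://fr.example.org/icons/noun.png",  # same image, different URL
    "http://en.example.org/icons/noun.png",  # repeat of the first URL
]
for url in requests:
    cache.get(url, fetch_icon)

print(cache.fetches, len(cache.store))  # 2 2
```

The repeated URL is served from the cache, but the identical image under a second URL costs another fetch and another stored copy, which is the effect being described for icons referenced separately by each language site.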
Also, their Wiktionary logo is better than ours; it fades in at the top and bottom. We should change ours to be the same style! Gerard Foley 17:15, 24 November 2005 (UTC)
- The pretty picture icons are not important for me, but I have no objection if others want to add them as long as it does not require a massive adoption of the templates that we have already rejected, or does not have any unpredictable effects on the way we edit. Eclecticology 18:46, 24 November 2005 (UTC)
Can we start a vote on this? I think that our first vote should be on whether we start to include the icons. After that, we can discuss how to do it.
Gerard Foley 19:40, 24 November 2005 (UTC)
- Yeah, a vote is good. On the matter of loading, there will only be about five–eight icons of very small file size, so I don't think they'll have any effect on loading. Also, if they are added with CSS, as I suggested, they can easily be turned off in your private CSS. Jon Harald Søby 21:40, 24 November 2005 (UTC)
OK let's start.
- This vote is silly! If people want to design icons for their own skin let them go ahead. What can you possibly hope to accomplish with a vote? What's the matter with working to find a consensus on this issue? That technique has not yet been exhausted. Eclecticology 02:03, 25 November 2005 (UTC)
- I think this is about whether the standard skin should include it or not, if it was personal, it wouldn't matter, would it? Jon Harald Søby 10:54, 25 November 2005 (UTC)
We should look at what looks good to Mr. Joe Bloggs who visits the site, not what looks good to us. Designing icons for your own skin means nothing. A vote will show clearly where people stand on this issue, not the Yeah but, No but in the above discussion. The problem with working to find a consensus on this issue is the consensus reached with the stupid extra lines placed everywhere. I asked a question about this and got no real response.
Jon, if the standard skin included the icons, would they be available to all language Wiktionaries, would we still have the edit links, and what would we have to type to get it to work? Gerard Foley 15:37, 25 November 2005 (UTC)
- No, but if the change I proposed was to be done, it would be very simple to add it (just a few lines in Monobook.css). Jon Harald Søby 20:08, 25 November 2005 (UTC)
The vote is for the idea of using the icons, we will worry about how to do it if it passes.
Yes, use the icons
- I see absolutely no harm in having them. Jon Harald Søby 06:08, 25 November 2005 (UTC)
- The icons make the text look prettier and easier to read, although a background color like in the Babel languages is not a bad idea. Optimix 06:04, 4 January 2006 (UTC)
No, don't use the icons
- See absolutely no point in having them. Ncik 01:03, 25 November 2005 (UTC)
- Performance penalty is far greater than proponents imagine. --Connel MacKenzie T C 17:24, 25 November 2005 (UTC)
- They seem useless and I agree with Connel, they are terrible for performance. --Dijan 23:31, 26 November 2005 (UTC)
- Use background colour instead; images are a performance issue. — Fudoreaper 21:02, 27 November 2005 (UTC)
Comments
- Letting the French Wiktionnaire continue their experiment (and further refine the icons) is valuable, but until WikiMedia has image serving 100 times faster than current technology, these should be discouraged here, especially on the default skin. If you load this Main Page you will see that our handful of front-page icons already severely slows page loading. Paying for caching servers distributed over the globe is very expensive. And there is no lexical gain from the icons. I use the default skin, not because I like it, but rather to see what our pages look like to new visitors. Utterly destroying performance/response time just for the sake of a little eye candy would be quite counter-productive at this time. --Connel MacKenzie T C 17:24, 25 November 2005 (UTC)
- As I said, there would be about seven–eight icons, about 50x50 pixels. Unless you have turned off cache completely, these would be cached and ready for every opening (in fact, eight, or even ten, icons of that size would be smaller than the Wiktionary logo in size, not to speak of the background on every Wik* project). You don't have loading problems with these, do you? This problem – if it could even be called a problem – is insignificant. Jon Harald Søby 20:08, 25 November 2005 (UTC)
- I deleted my cache and loaded the main page and some pages on French Wiktionnaire; all the icons loaded in less than 2 sec., so I don't know what Connel MacKenzie is talking about. The icons won't slow the rest of the page down, so the actual information will load just as quickly. Many websites use lots of images that can take some time to load (including Wikipedia), so obviously our target audience likes pretty pictures to go with the information. Gerard Foley 00:01, 26 November 2005 (UTC)
- Perhaps it was the time of day you tried your experiments, perhaps your browser still had them cached. Then again, with my cable modem, I very rarely get a wiki page (even text-only) to load in under two seconds. Whenever I restart my browser (cache set to clear on exit) I *do* have to wait for the first page to load.
- I did notice that the French pages load much faster (because they are serving the images from fr.wiktionnaire instead of commons?) But I do hope you aren't suggesting that pages load faster with images? Pages with images are an order of magnitude slower - everyone can see it (but you can continue denying the fact, if you want.) --Connel MacKenzie T C 22:23, 26 November 2005 (UTC)
- Wiktionnaire (fr) uses images from commons (so I don't know why it seems to load faster), and we limited their number to 5 (etymology, pronunciation, definition, translation and see also). We chose to add them to make the articles nicer to look at, even if they load more slowly; the purpose was also to seem attractive and accessible to visitors (and that's not as easy as WP...). Since we use templates it was easy to try them, but I don't know if it would work like that here. - Dakdada (from fr ;) 22:57, 26 November 2005 (UTC)
- I had a discussion with the sysops (or developers) on #wikimedia-tech about this issue. I asked whether they thought that additional images on each page would load the wikimedia system. Their answer was yes, it will. And traffic is growing so fast that they can barely keep up. It's been doubling every 3 months for the past 2 years (approximately). Therefore, they suggested that adding images would only make the problems worse, and that they would strongly recommend that we NOT take steps that would needlessly increase the wikimedia cluster load. I say needlessly since the icons are purely cosmetic; they serve no functional purpose. The suggestion was, then, to instead use a background color in the headings. The load increase from that would be very very small, but it would achieve the same visual difference between headings. I thought this was a very good idea, and a reasonable compromise between the "want icons" and "don't want icons" groups. Therefore, i recommend we do not add icons, but instead look at the possibility of adding a background colour to the headline lines. — Fudoreaper 21:02, 27 November 2005 (UTC)
I did not say that the page loaded in less than 2 sec.; the icons loaded in less than 2 sec. after the page. It is my understanding that images have no effect on the speed a page loads, as the images are fetched after the page loads. Articles on Wikipedia regularly have lots of pictures; they even have a Picture of the day. I can't understand how everyone is so worried about the performance impact of 5 or 6 little pictures. Gerard Foley 23:14, 26 November 2005 (UTC)
- I just read Connel MacKenzie's latest comments, and they seem to be well researched, so I am not going to vote. They still look nice though, and I hope we can get them sometime in the future. Gerard Foley 22:50, 27 November 2005 (UTC)
No one uses Wiktionary
This saddens me. Lotsofissues 23:59, 24 November 2005 (UTC)
- A friend of mine got a Chinese character tattooed on to his arm a few months ago. I asked him what it meant many times, but he refused to tell me. I eventually asked if I could copy it and look it up on the internet. He didn't think that I would be able to find it, so he offered me €10 if I could tell him what it meant the next day. So I typed the character into Google (using the handwriting recognition software built into Windows XP), and asked for pages with that character, but written in English. Guess what site I found! I got my €10, thanks everyone! Gerard Foley 00:13, 25 November 2005 (UTC)
- It's used by answers.com as part of their standard response and is easily added to Firefox as a lookup engine in the box at the top right. I use it & recommend it, though I'm biased. There is a lot of work to do before its use becomes as popular as Wikipedia but it's not getting any smaller. The best way to make its use ubiquitous is to make it the most comprehensive & accurate answer to any lexical question. I'm not sure about this but I suspect it's growing faster than other lexical resources so it's just a matter of time & keeping up the good work before it will become the natural & default choice. :-) --MGSpiller 01:33, 25 November 2005 (UTC)
- How does one add it to the Firefox search box? Odd bloke 01:13, 2 December 2005 (UTC)
Alternative spelling of...
Just a call for people to be careful with their wording.
I've noticed a spate of minimal articles recently using "alternative spelling of X". This should be avoided since it implies that "X" is the accepted/best/standard/usual/proper spelling and the current article uses an inferior spelling. The most recent article I noticed used a post-Webster American spelling as "X". I do not know if this is always the case, or if such articles generally point to whichever version already exists. In either case it's not appropriate.
In the absence of research indicating which spelling is more widely used in which areas and which eras, or preferred by which publications or other bodies, all alternatives should be treated as equal. — Hippietrail 17:50, 25 November 2005 (UTC)
I have added text to colorisation, colorization, colourisation and colourization to show that all alternate spellings are equally valid. If you are happy with that, I will roll the templates out to other cases. SemperBlotto 14:56, 26 November 2005 (UTC)
- I'm not sure what the best solution is, but thanks for trying something. Colourisation might be even more tricky than most since it involves two variations which are both considered "British" but which do not always come together. British and Commonwealth English has always used "colour" and its forms. American English gradually began using "color" and its forms after the appearance of Webster's dictionary and used "colour" before that. In Britain and Commonwealth countries there is popular support for "ise" and "isation" but most authorities such as dictionaries and style guides actually prefer "ize" and "ization". I do not know if "ise" and "isation" were ever current in American English even before Webster's.
- Now because "colour" is mandatory but "isation" and "ization" are interchangeable in British and Commonwealth English, all 3 forms exist. As far as I'm aware "colorisation" does not exist in English anywhere. There are also three forms for the verb to colourise and the adjective colourised.
- The spelling "colour" was widely used in all English speaking countries until Webster's changes gained popularity sometime after his dictionary - I do not know how long this took. This means that "color" is the alternative spelling of "colour". Different people will consider "ise" and "ize" to be the standard. Americans and makers of dictionaries and style guides for British and Commonwealth markets consider "ise" to be an alternative to "ize". Many people in Britain and the Commonwealth nevertheless strongly feel that "ise" is the only correct form and that "ize" is an alternative to it. The process of adding colour to black & white films I am assuming was originally termed "colorization" in the USA and spread from there - but the Wikipedia article doesn't confirm this. If true this means that "colourization" and "colourisation" are alternatives to "colorization" - but in a different way.
- I'm not sure the disclaimers alone, even reworded, are sufficient to show that an entry which is basically a stub is equal to an entry which has a full article in another spelling when the stub asserts itself to be an alternative to the spelling used in the full article.
- Suggestions required! — Hippietrail 16:05, 26 November 2005 (UTC)
- Sorry, what's wrong with colorisation? Even in -our/-ise–flavoured English the -our changes to -or in derivatives: an honorific is still preferred to an honourific, and they valorise more often than valourise. —Muke Tever 17:39, 27 November 2005 (UTC)
- As far as I can tell it's a French word but not an English word. In the countries where "color" is correct, "ise" and "isation" are never correct. In the countries where "ise" and "isation" are legitimate alternatives to "ize" and "ization", "color" is never correct. A Google search turns up nothing but French. I'll check some online dictionaries and get back though - my mind is open... — Hippietrail 01:36, 28 November 2005 (UTC)
- I've just checked AHD, Collins, Encarta, and M-W. None contains either colorise or colorisation. I would consider it a spelling mistake. I'll rfv it. — Hippietrail 02:05, 28 November 2005 (UTC)
- AHD and M-W at the least are American dictionaries and wouldn't be expected to have -ise–flavored English; they don't have colourise or colourisation either. More supporting examples though are odour → deodorise / deodorisation; vapour → vaporise / vaporisation. A quick reverse dictionary search shows that colour / colourise and its antonym decolourise are the only words that "officially" retain their -u- in this kind of compound with -ise. There is no rule but custom that justifies "colourise", so it's understandable if the more correct but rarer spelling may turn up from time to time. —Muke Tever 19:27, 28 November 2005 (UTC)
- Good work Muke - you've totally convinced me! I do recall finding "British" spellings in American dictionaries before, but I don't know which ones are good at it and which are bad. "British" dictionaries generally try to include American spellings in my experience at least. Google Print is good but Amazon has a lot more books available. I was able to find more examples of all the spellings and inflections I tried there. I'm going to add all variations to my watchlist now though and I'd still really like to see some dictionary evidence just out of curiosity. — Hippietrail 17:52, 29 November 2005 (UTC)
Wiktionary apparently is no backwater. There is a discussion in Wikipedia about dictionaries, encyclopedias, entries and all sorts of other crap. Maybe a respected Wiktionarian's input would be wanted? --Wonderfool 01:44, 26 November 2005 (UTC)
Redirects
How do I make a redirect? --Freiberg, Let's talk!, contribs 01:19, 29 November 2005 (UTC)
- Generally, the English Wiktionary avoids redirects. For inflected forms, alternate spellings, redirects are a no-no. For alternate forms of idioms, enter a redirect as #redirect [[most generic idiom form]] as the only line. If an entry already exists, it is taboo to replace the content with a redirect. (Note that Wikipedia is "redirect-happy" while here, the lexical difference of spelling is much much more important.) Also, do not make redirect entries for misspellings. --Connel MacKenzie T C 06:10, 29 November 2005 (UTC)
- In Latin, there are many forms of each word because of declensions. I was trying to make the words in the declension link back to the original word. --Freiberg, Let's talk!, contribs 01:46, 30 November 2005 (UTC)
- The thing to do here is not to use a redirect but instead to write cross-references. These can be done as "definitions" that link to the infinitive (or the singular in the basic case (nominative?) for nouns, and so on) in the following form (using "parla", an Italian word, as an example):
- third-person singular indicative present tense of parlare
- The advantages of this are:
- If "parla" is a word in another language too, or has another Italian meaning, these other meanings can easily be added. This can't be done in a redirect.
- The cross-reference is concise and gives the minimal information necessary. The user can find out what the word means by going to the base form of the word and reading that entry.
- The user sees an entry for the word they typed in rather than being redirected to some other word, where it might not be obvious (or even explained at all) how the word they were looking up relates to the entry they were redirected to.
- One disadvantage is that there will be many, many of these to enter, but Wiktionary is for its users, not its contributors, so this donkey work will have to be done eventually. — 193.203.81.129 12:34, 30 November 2005 (UTC)
- Indeed. User:Freiberg, notice how I replaced your redirect at la:belli. —Muke Tever 17:18, 30 November 2005 (UTC)
- Sorry, I hadn't realized that there are similar words to Latin, especially if the language has declensions as well. I'll start spadeworking. --Freiberg, Let's talk!, contribs 03:33, 1 December 2005 (UTC)
- Oops, not spadework - that's preparatory work. "Donkey work" or "hard graft" are more appropriate terms. — Paul G 09:55, 1 December 2005 (UTC)
Inflexion categories
Should inflexions of the basic forms of words have their own categories? It gets kind of confusing when an inflected form of a word appears in a category where there are only the basic words. E.g. the appearance of sangen (inflected form of sang) in Category:Norwegian nouns. And if so, what should the category names be? Category:Definite singulars of Norwegian nouns? Jon Harald Søby 09:40, 29 November 2005 (UTC)
This is something that I thought was a bit odd, or perhaps short-sighted. We have some really helpful templates for inflected forms (I'm thinking of the ones that come up when you use the go button for a page that does not exist) which I use routinely but they put inflected forms in the same category as the root form. It always seemed like a recipe for infinitely huge categories when they could be reduced at least one order of magnitude by sub-categorising. --MGSpiller 01:53, 30 November 2005 (UTC)
- So, does the lack of other replies suggest that the idea is good, and can be executed? Jon Harald Søby 19:53, 2 December 2005 (UTC)
- Put them in different categories. See Category:English irregular verbs for what I did. Ncik 22:45, 2 December 2005 (UTC)
In most circumstances a "Nouns" category is so large as to be perfectly useless. Most of the time if something has Category:English nouns and I can put into a meaningful category, I simply remove it from "Nouns". I certainly never add it. "Irregular verbs" is only barely useful. What's the point of having these grammatical categories? Eclecticology 02:42, 3 December 2005 (UTC)
- Irregular verbs category is barely useful???? Every dictionary has this list in its appendix. It's of great importance for any non-native speaker, especially those learning the language, and surely is of interest to many native speakers as well. Ncik 13:19, 3 December 2005 (UTC)
Anagrams
Some users have been adding "Anagrams" sections to pages. My response in the past has been to remove them, but quite a few have been added recently.
People use dictionaries to look up words to find their meanings, synonyms, translations and so on. For anything related to wordplay, such as anagrams, they can consult specialist anagram dictionaries or websites such as the excellent Internet Anagram Server that provide this information.
I feel that it isn't Wiktionary's job to replicate the work of these existing resources. Discovering that "orchestra" is an anagram of "carthorse" or that "Britney Spears" is an anagram of "presbyterians" is fun but I wouldn't use Wiktionary to find that information.
Details of anagrams do not belong in Wiktionary. What do others think? — Paul G 10:03, 30 November 2005 (UTC)
- Why not? We have other undictionary-like information such as rhymes and thesaurary data. I've even seen some lists of minimal pairs on some es: pages. I would put it under 'associated terms' (or whatever the English name of that header is). —Muke Tever 17:16, 30 November 2005 (UTC)
- My arguments against:
- They are difficult to maintain. The entry for rob has orb as an anagram, but bor and bro (and possibly also Rob) belong there too. Adding another anagram to a word means adding it to all of its anagrams as well.
- You don't look for anagrams in a dictionary (but the argument "we have always done it that way" is not a valid argument, of course). Synonyms are not "undictionary-like" as thesauruses are a type of dictionary, and some print dictionaries list synonyms too. Rhymes are here because someone decided early on that they should go in (although they have only been going in recently). Anagrams can be found using websites that process wordlists to generate anagrams on the fly.
- Anagrams for short words are easy to come up with, but for longer words they are not (exercise for the reader: What are the anagrams of top? What are the anagrams of tergiversation (there are two - see here for the answer)?)
- What are the arguments for? — Paul G 18:19, 30 November 2005 (UTC)
- Well, for the maintainability, they could be put in templates, probably with a consistent naming scheme, e.g. with all the letters in alphabetical order, so anagrams of "orchestra"—perhaps an auto-generated list of them—would go in Template:anagrams/acehorrst and {{anagrams/acehorrst}} would go on every page thus listed. This could even be easy bot work. —Muke Tever 19:01, 30 November 2005 (UTC)
I don't think the arguments against anagrams are strong. Generally we combine all kinds of dictionaries into one place, that's why we have proper nouns and rhymes. Then again, I don't care for the inclusion of anagrams at all, which is not to say I oppose it. If some people want to burden themselves with it then let them. I do feel any Anagrams section must go very far down each article, past "See also", close to Connel's "most common words in Gutenberg" stuff, since those sections are of use to the fewest people. As for how to do it, I think what's been said above and following the rhymes pages should cover it. — Hippietrail 20:59, 30 November 2005 (UTC)
- I agree, my arguments are weak. I have no major objection to anagrams going in. The only harm I can see that it might do is that it could open the door to allowing all sorts of examples of wordplay being entered (such as noting under murder that if you spell it backwards, you get "Red Rum", and that if you swap the syllables of "outtake" you get another word), and, while this is a nice curiosity in the style of "did you know?", is not what dictionaries are used for. Perhaps any such info, if it is to be entered, could go under a "Trivia" or "Interesting facts" section. — Paul G 09:59, 1 December 2005 (UTC)
- Actually having anagrams would close the door to those particular examples, as they would be entered without comment in the anagrams section ;) But point taken anyway. —Muke Tever 19:49, 1 December 2005 (UTC)
If people want anagrams, why not? I have no objection to them, as long as they go very far down each article. -- Gerard Foley 22:11, 1 December 2005 (UTC)
I'd say yes, but only as long as there are only meaningful words in the list. The last thing we want is a bot-generated list. Unless, I suppose, it were to compare its output with the Wiktionary database, and only add words that already exist along with a link... Odd bloke 01:02, 2 December 2005 (UTC)
- I like Muke's idea to use templates. They could be made language sensitive. Paul G's idea to have an Interesting Facts section would also be the right place for the Gutenberg Project rank of a word. Ncik 22:40, 2 December 2005 (UTC)
I have no particular interest in anagrams, but I can tolerate them. Putting them in a trivia section further on in an article with such things as what is linguistically interesting about uncopyrightable would be acceptable. Except for a word's alphagram, I agree that the list should be limited to real words. Eclecticology 02:20, 3 December 2005 (UTC)
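For anyone curious how the template-naming idea discussed in this thread might work in practice, here is a minimal Python sketch. It assumes the scheme suggested above (the template name is the word's letters in alphabetical order, i.e. its alphagram) and restricts the lists to a supplied set of existing entry titles, as Odd bloke suggested. The function names and the sample word list are made up for illustration; this is not an existing bot.

```python
from collections import defaultdict


def alphagram(word):
    """Return the letters of a word in alphabetical order, lower-cased."""
    return "".join(sorted(word.lower()))


def anagram_templates(existing_titles):
    """Group existing entry titles by alphagram.

    Returns a mapping from a hypothetical template name
    (Template:anagrams/<alphagram>) to the sorted list of titles that
    share that alphagram. Words with no anagram partner are skipped.
    """
    groups = defaultdict(set)
    for title in existing_titles:
        groups[alphagram(title)].add(title)
    return {
        "Template:anagrams/" + key: sorted(members)
        for key, members in groups.items()
        if len(members) > 1
    }


# Illustrative word list only; a real run would use the list of entry titles.
sample_titles = ["orchestra", "carthorse", "rob", "orb", "bro", "bor", "top"]
templates = anagram_templates(sample_titles)
```

Because every word in a group maps to the same template name, adding a new anagram means editing one template rather than every page in the group, which is the maintainability point raised above.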
New Missing Words list: Oxford English Dictionary's 301,100 words
I ran a script to grab all 301,100 terms from the Oxford English Dictionary (aka the English dictionary). I will combine it with some other dictionaries' indexes to remove claims to copyright. Is anyone opposed to my uploading this list as part of your Missing Words lists? Wikipedia has a similar project which I've contributed several lists to: Missing articles project. --Brian0918 16:07, 30 November 2005 (UTC)
- Yes, I'm opposed, as I suspect this could still constitute a breach of copyright. No doubt the copyright notice at www.oed.com says something about it not being stored in a retrieval system (such as Wiktionary is) in any form, which would cover mixing it up with other words.
- In any case, there are lots of terms in the OED that we might not want to include - are we including obsolete terms, of which a large chunk of the OED's content is? We don't seem to have many. Are we including words that appear only in dictionaries or have no print citations, which the OED includes? That would seem to go against the criteria for inclusion.
- In any case, anyone with legal knowledge know what the position is here? — Paul G 18:32, 30 November 2005 (UTC)
- No, it is not a breach of copyright. We already went through the same ordeal on Wikipedia:WP:MEA, after several separate indexes from Encyclopaedia Britannica, Encarta, etc, were separately posted and subsequently removed as copyright violations. They discussed it extensively and consulted with juriwiki. They came to the conclusion that combining several lists together into a new list, completely unlike any of the other lists, is not a copyright violation. And, thus, we have the "Hotlist of topics" and the "General Encyclopedia topics" lists.
- As for what to include, that is ultimately up to the community. They can choose to create obsolete terms as redirects to the modern terms (just like what OED does), or they can choose not to include those terms. In any case, OED is the most complete English Dictionary out there, and en.wiktionary is trying to become the most complete English Dictionary out there (or at least something close to that ideal), so it only makes sense to consult OED (and other dictionaries) to reach that goal. This list serves as a starting point, nothing more. --Brian0918 19:00, 30 November 2005 (UTC)
I was under the impression that a list of words cannot be copyrighted but IANAL. If we can get very good copyright law advice and it's not a copyright infringement then I would be in favour. The OED itself makes no secret of the fact that it includes words based on their inclusion in other dictionaries. I also believe our scope is "all words" and I'm 100% in favour of adding archaic and obsolete terms. They're certainly needed when reading archaic or obsolete materials. — Hippietrail 20:59, 30 November 2005 (UTC)
- Well, I'll ask some of the foundation people, but since Wikipedia is legally allowed to do this, I don't see how Wiktionary wouldn't also be allowed. I'm currently also copying Merriam Webster's Unabridged dictionary. --Brian0918 02:43, 1 December 2005 (UTC)
- Is it the list of words or the definitions? IANALE, but our sister project may be allowed to use it in a very narrow manner; that doesn't quite sound like what we would use the same list for. www.m-w.com is also a no no (clearly!) I'd rather not see Wiktionary polluted with copyright-suspect material. --Connel MacKenzie T C 01:32, 2 December 2005 (UTC)
- IANAL either (nice acronym). I'm just erring on the side of caution. I would definitely like to see this if it is legal. — Paul G 10:02, 1 December 2005 (UTC)
My inclination is to suggest that the use of this list is probably legal, especially if we plan to combine it with other lists. There would certainly be no problem with the list of words in the original OED. I have no problem with including archaic or obsolete forms of words, but I would prefer to see these as actual entries which would show them as "archaic forms of ...". It is also important to remember that these archaic forms may not apply to all usages of the current form of a term.
Copying definitions can be problematical. If a definition in the current OED or Merriam-Webster is unchanged from what was in its earlier, clearly public domain form, it is not copyrightable. In other circumstances the given definition may be the only one possible; if a writing can be originated independently it is not an infringement of copyright even if the result is identical. If we must choose between a correct but copyright protected definition and an original, but wrong, one of our own design we need to use the copyright one. That situation is impossible with longer works, but perfectly conceivable for short dictionary definitions. Claims of fair use are also far more sustainable in a dictionary for individual entries; if anything in all this is clearly protected it is their general appearance and block selection. If we are going to use anything from a currently published dictionary we need to cite our sources. Eclecticology 01:47, 3 December 2005 (UTC)
Importing the Wiktionary database into a MySQL database
Hi, I would like to import the Wiktionary database (words and their meanings only) into a MySQL database, which I would then use for an SMS-based dictionary service. I have tried to follow all the instructions, but to no avail. I wonder if there is anybody out there who can do it for me. I am desperate to start this not-for-profit service in my country, where there are more people with mobile phone access than with internet access. Please assist.
Samuel.
(kenyaknowhow@gmail.com)
- I just edited this to make it readable. I hope no-one minds. Odd bloke 01:04, 2 December 2005 (UTC)
- I haven't imported it into MySQL, but I have worked with it. What part of the process is breaking down for you? (Perhaps this should move to your user talk page?) --Connel MacKenzie T C 01:42, 2 December 2005 (UTC)
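As a starting point for the import asked about above, here is a minimal Python sketch of one possible approach: stream the pages out of a MediaWiki XML dump and store each title with its wikitext. Several things here are assumptions rather than facts from this thread: that the dump uses page, title and text elements (the XML namespace is simply stripped), that definition lines begin with "#", and the table and function names. The standard-library sqlite3 module is used only as a stand-in so the sketch runs anywhere; a MySQL driver and MySQL's own SQL syntax would have to be substituted for the real service.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix from an XML tag name."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_file):
    """Yield (title, wikitext) pairs from a dump without loading it all."""
    title, text = None, None
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # free memory used by the finished page


def definitions(wikitext):
    """Return the numbered definition lines (assumed to start with '#')."""
    skip = ("#:", "#*", "##")
    return [
        line[1:].strip()
        for line in wikitext.splitlines()
        if line.startswith("#")
        and not line.startswith(skip)
        and not line.lower().startswith("#redirect")
    ]


def load_entries(xml_file, conn):
    """Store every page in a hypothetical 'entries' table (SQLite stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries (title TEXT PRIMARY KEY, wikitext TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO entries VALUES (?, ?)", iter_pages(xml_file))
    conn.commit()


# Tiny made-up dump fragment, for illustration only.
sample = (
    b"<mediawiki><page><title>parla</title><revision>"
    b"<text>==Italian==\n===Verb===\n# third-person singular of parlare</text>"
    b"</revision></page></mediawiki>"
)
conn = sqlite3.connect(":memory:")
load_entries(io.BytesIO(sample), conn)
```

Storing the raw wikitext first and extracting definitions afterwards keeps the import step simple; the extraction rule can then be refined without re-reading the dump.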
Plus tab added to Tea room
Just a quick note to say that since my old javascript to add a plus tab here was well-received and now that I use Wiktionary:Tea room a lot more, I've updated the code to add the plus tab there too. Please let me know if there are any problems. — Hippietrail 17:48, 1 December 2005 (UTC)
- I forgot to say last time: I think what you have done is fantastic! --Connel MacKenzie T C 01:43, 2 December 2005 (UTC)
Firefox Search Bar extension
Is there an extension for Firefox available that allows me to search Wiktionary without having to navigate to it first? As much as I want to use Wiktionary, the "dictionary.com" search ships with Firefox and it is much quicker and easier to simply use that. If there isn't, this would be a valuable addition (especially if the Firefox developers could be convinced to ship it), and would certainly help to boost traffic to the site. Odd bloke 21:04, 1 December 2005 (UTC)
- The Wikipedia page that used to be at w:Wikipedia:Tools (I think) described how to set that up. I don't visit 'pedia enough to know where they moved the instruction to, though. I agree that that would be an awesome boost, to have it linked directly from Firefox! --Connel MacKenzie T C 01:46, 2 December 2005 (UTC)
- Have you tried the extension "dictionary tooltip"? ([6]) It's not wiktionary-specific, but you rather pick a dictionary of your choice (e.g. en.wiktionary or ja.wikipedia(!)), doubleclick a word, and get the corresponding wiktionary article in a popup window. \Mike 08:44, 2 December 2005 (UTC)
Regional variations of English
Looking through Special:Categories, I saw Category:Australia, and thought we could categorise this. My best option was Category:Regional English, and one could put Americanisms, Britishisms, Canadaisms etc. into there. Do we have such a place already? --Wonderfool 11:06, 2 December 2005 (UTC)
Change to MediaWiki
As of 2005-12-01, MediaWiki now has two messages. This message now appears above the edit summary and Save button. For the new editing tools message, which appears in the old position below the Save button, see MediaWiki:Edittools. -- originally posted on another page by Uncle G 17:43, 2 December 2005 (UTC)
- This is a good move but it has had the side-effect of separating the control used for choosing the set of special characters from the special characters themselves. I've just had a quick go at fixing it but since I'm paying 15 pesos per hour right now and have no income I just can't afford to work on it more and debug it. I don't even have a proper editor. I would really appreciate it if Connel or some other Javascript-savvy contributor could take a look at it. Thanks in advance and apologies for not doing it myself. — Hippietrail 18:52, 2 December 2005 (UTC)
- Oops I spoke too soon. I got it working after all. It's a little different but I like it better. The savvy may still feel free to tweak. And please, non-monobook users, let us know if you've been affected. — Hippietrail 19:13, 2 December 2005 (UTC)
What about antonyms?
I know most dictionaries/thesauruses don't have antonyms for words, and maybe there are too many words in Wiktionary now to add antonyms for ALL the words, but I think it would help a lot of people if there were antonyms. Can it be considered to add antonyms to the Wiktionary entries too?
- Feel free to add antonyms where it makes sense. Polyglot 22:22, 3 December 2005 (UTC)
- There are plenty of articles to which antonyms have been added. See hot, e.g. Ncik 00:59, 5 December 2005 (UTC)
Greek words of Greek letters
I am unsure of the best way of handling entries such as the above. With foreign words using the same letters as English the policy is straightforward (angst looks the same in German and English). Should the entry for (say) βήτα be a redirection or should a brief explanation be added such as: The Greek spelling of the word beta. Saltmarsh 12:31, 5 December 2005 (UTC)
The entry for βήτα should look something like this
==Greek==
===Noun===
βήτα
# beta
i.e. a simple translation. You could add ===Etymology===, ===Pronunciation===, ===Derived terms=== etc. but NOT ===Translations=== SemperBlotto 15:48, 5 December 2005 (UTC)
- The Greek entry under beta (perhaps even béta) should simply show this as a transliteration of the Greek form. Eclecticology 17:14, 5 December 2005 (UTC)
- It seems Ec that you are endorsing entries of Greek in transliteration. There was a huge debate about this a long time ago based on the fact that there is no truly accepted standard for transliterating Greek in the same way that there is for Chinese, Japanese, and even Korean. Especially in light of supporting Ancient and modern Greek, let alone Katharevousa and other historical oddities of the Greek written language. Russian fares a little better than Greek but we don't endorse the creation of Russian articles in transliteration either. Russian seems to have two major kinds of ad-hoc transliteration systems plus variations. For Greek it's much worse. If people feel strongly about it they must first argue the merits of a particular system (or two systems) and have such accepted by the community.
- Meanwhile, beta is a normal English word of course and βήτα is a normal Greek word and both should have their usual entries. — Hippietrail 18:27, 5 December 2005 (UTC)
- So, what's wrong with transliterated entries? The main entry for any Greek word is still the one in its own script. It could be a useful tool for anyone who is not familiar with entering Greek letters. Since this technique is only an access tool we can determine our own standards for an acceptable transliteration. Eclecticology 07:06, 6 December 2005 (UTC)
- The other issue here is whether it should go under βήτα or βητα. Can someone who knows more than me about modern Greek confirm the diacriticals are an inherent part of the language? Widsith 11:21, 6 December 2005 (UTC)
- Diacritics are indeed an inherent part of modern Greek. All words (except for most monosyllables and some foreign words) have an acute accent indicating the syllable that carries the stress, and a few also have a diaeresis to indicate that adjacent vowels are to be pronounced separately. — Paul G 11:41, 6 December 2005 (UTC)
- Obviously as I have stated there are multiple transliteration systems for Greek so both beta and béta are transliterations of the word. Less obviously, η is a "long" e and can also be transliterated as "ē". Trickier again is the fact that this is a long e which carries the stress so it can also be transliterated as an e with both a macron and an acute. The same goes for omicron versus omega. There are also transliterations which try to preserve the original spelling and those which try to preserve the original pronunciation. There are those which strive for exactness and embrace exotic diacritics and those which strive to work for ASCII or similar.
- The "acute" accent is as essential in Modern Greek as it is in Spanish though it's not always transliterated and it's not always not transliterated. Of course English did not get the word from Modern Greek but something prior, probably Ancient Greek. And in any case the English word covers the letter no matter what period or orthography of Greek.
- The two points Ec brings up of "anyone who is not familiar with entering Greek letters" and "we can determine our own standards for an acceptable transliteration" are mutually incompatible. Hoping for these users not familiar with Greek letters to first learn our standard for acceptable transliteration is a lot of hope. People will just enter what they think is close, or what they saw in a book.
- That said, I'm not totally against transliteration entries but the fact that there will be a many to one mapping of potential transliterations to correct Greek orthography is something we have to think about hard first. I would suggest either a lot of bot work or some day in the future getting a better search function that can handle transliterations in code rather than in articles. — Hippietrail 16:09, 6 December 2005 (UTC)
- Beta might well be a transliteration of the ancient Greek, but it is not a transliteration of the modern. As the pronunciation of the modern Greek word "βήτα" is /ˈvita/ in IPA, "veta" or possibly "vita" would be closer. — Paul G 18:17, 6 December 2005 (UTC)
- Let's not confuse transliteration with pronunciation. If transliteration is intended to represent pronunciation you are correct. Perhaps romanization is a better term for the process that I'm talking about. The uninformed reader probably doesn't know whether the text before him is ancient or modern Greek. So the transliteration scheme that we follow should be designed with him in mind. Eclecticology 04:07, 7 December 2005 (UTC)
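To make the "transliterations in code rather than in articles" suggestion from this thread a little more concrete, here is a minimal Python sketch of a many-to-one search key. It strips diacritics from whatever the user types and from each Greek title, then maps Greek letters to plain ASCII, so that "beta", "béta" and "bēta" all land on the same key as βήτα. The letter table and function names are hypothetical and are not any agreed transliteration standard; the table follows spelling rather than modern pronunciation, so something like "vita" would not match.

```python
import unicodedata

# Hypothetical spelling-based ASCII table; not an accepted standard.
GREEK_TO_LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "e", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "ph", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def strip_marks(text):
    """Remove combining marks (accents, macrons, diaereses) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_key(text):
    """Reduce Greek or romanized input to a bare ASCII lookup key."""
    bare = strip_marks(text.lower())
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in bare)


def build_index(greek_titles):
    """Map each search key to the Greek-script titles that produce it."""
    index = {}
    for title in greek_titles:
        index.setdefault(search_key(title), []).append(title)
    return index


def lookup(index, typed):
    """Return the Greek entries whose key matches what the user typed."""
    return index.get(search_key(typed), [])


index = build_index(["βήτα"])
```

The main entry stays under the Greek spelling; the index only sends approximate romanizations to it, so no transliterated articles need to be created or maintained.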
Requests for cleanup
Could people who put an "rfc" tag on an article please indicate what kind of cleanup they are seeking. There are many pages that now have this tag, but nobody knows why. If you see such a tag with no apparent reason for being there you should feel free to remove it. Eclecticology 07:35, 6 December 2005 (UTC)
- I always (I hope) indicate this in the edit summary which is pretty easy to find in the history. I would hope others do this too. — Hippietrail 16:13, 6 December 2005 (UTC)
- Yes, I noted that. However, if others want to discuss it, then it's probably best done on the talk page. In some cases it may be a long time before the cleanup happens, or a newbie looking for something to do may not yet understand how to use the history page. Eclecticology 10:01, 8 December 2005 (UTC)
- This is a good reminder. I try to comment in the edit summary, but often forget. --Connel MacKenzie T C 17:58, 9 December 2005 (UTC)
Wiktionary:Requested articles
A series of "Requested articles" pages have sprung up for various languages. It would probably be a good idea if these could be kept in one page which would contain only entries where there is an immediate interest. If something stays on such a page for a month, there is a likelihood that the person has already found what he was seeking elsewhere, or is no longer interested. The entry can simply be moved to the appropriate Index page. Eclecticology 07:48, 6 December 2005 (UTC)
- These pages haven't just sprung up. The English one in particular has been around for ages. I've added most of the others over a long period of time. I see no benefit whatsoever in the "immediate interest" point of view, at least not for these pages. The "Wanted" and "Requests" fields on "Recent changes" were originally for that purpose. The "Requested articles" pages have many uses, chief among them being a collection of red links to articles which need to be written. When I'm reading a book or website in whatever language I regularly take notes of interesting words specifically to add to these pages. Often they are words which are not in my dictionary, words which look unusual for the language, or words which are under discussion on language-related sites. Much of this work is "seeding" to make these pages more visible so that a) users will request words to be added and b) contributors will look for words that they can add in a language they know. Stephen has been very good at adding Russian terms I've added here and I even got some of his work noticed on another website.
- A better idea is to get these pages more coverage. This has happened a little lately with one admin adding the English page to the monobook sidebar, another user long ago creating a category to link all the other language pages to it, and my standard preamble template to make them all look more like one another. The English page has really taken off lately I must say thanks to the sidebar and a few dedicated contributors. — Hippietrail 16:09, 6 December 2005 (UTC)
Any objections to Led Zeppelin, The Rolling Stones and Pink Floyd? --Wonderfool 15:07, 6 December 2005 (UTC). That is to say, of course, speaking from a dictionary's point of view. I'm not talking about the choice of music for our staff Christmas party --Wonderfool 15:09, 6 December 2005 (UTC)
- Usually, in some dictionaries, there is a part for proper nouns... however, since there is already Wikipedia, it seems useless on Wiktionary for that kind of proper nouns, and these entries don't have any lexical information. If I look for some name, I just go on Wikipedia, not here. - Dakdada 15:30, 6 December 2005 (UTC)
- It's the slippery slope raising its ugly head again. Either we have (potentially) all or none. I think that none would be better (well, maybe the Clash and the Who) SemperBlotto 15:37, 6 December 2005 (UTC)
- No, let's have none. We are not an encyclopedia.
- For that matter, I believe the Harry Potters, etc, should also go (for which I take responsibility for starting with my appendix of fictional characters). Mythological characters can definitely stay, as print dictionaries include these. — Paul G 18:11, 6 December 2005 (UTC)
- Very famous bands, persons and fictional characters should have entries. Harry Potter, The Beatles and Elvis are good, but they should be very very limited. "Legends", so to say, should be included, but not very much else. Jon Harald Søby 18:15, 6 December 2005 (UTC)
I think it's safe to say we've never really been an encyclopedic dictionary, but we have always been a translating dictionary. If we do decide to become the former then it will indeed leave us open to the "all or nothing" type arguments. But it was because we are the latter that we have the Harry Potter entries. For characters they might strictly not warrant definitions, for invented words like remembrall those should also have definitions as well as the translations. It would probably be a good idea to be able to mark what articles are truly "lexical" and which are here for other reasons though, but I'm not sure how. — Hippietrail 01:31, 7 December 2005 (UTC)
- I think most of them can be deleted safely. Our contributor tried to justify the inclusion of these on the basis of our including an obscure American band, Wilco. An inappropriate entry is not justification for further wrongful entries. Wilco was originally there because the contents were moved to wilco; these redirect pages are easy prey for these kinds of entry. To justify an article a name should have a meaning that goes beyond the simple personal name. Eclecticology 02:14, 7 December 2005 (UTC)
The vandal template
En-WikipediA has a cool template {{vandal}}, used in reporting vandal accounts, that gives useful links for that user/IP, such as page moves performed, blocks applied, and the comments in the block log about them. Can this be imported to Wiktionary? Thanx 68.39.174.238 04:05, 8 December 2005 (UTC)
- Imported from en.Wikipedia. - Amgine/talk 04:15, 8 December 2005 (UTC)
Wiktionary Christmas Competition 2005
New section to add notes on other dictionaries
I'd like to propose a new section below "See also" and "Anagrams" but above Connel's rankings if they end up at the bottom.
The idea is to have a place to note what other print or well-known dictionaries do. Nothing copyright would belong here but instead non-POV facts/factoids such as whether particular dictionaries have an entry or not for this term, which spelling variants they include and with which preference, how they mark terms or senses (obsolete, archaic, dated, etc). In The Tea room there was a recent discussion of whether dairy is an adjective as well as a noun. It turned out that the big online dictionaries are divided on this issue. It would be great to have a specialized section to add these notes rather than just sprinkling them in Beer parlour, talk pages, Tea room, etc.
What do other contribs think? — Hippietrail 20:15, 8 December 2005 (UTC)
- (I edited your link above.) This is tricky. The usual method would be to sweep it into a ===Usage note=== section. Do you expect more than a handful of terms to need a discussion about inconsistencies between other dictionaries? --Connel MacKenzie T C 21:11, 8 December 2005 (UTC)
- Well not really since a ===Usage note=== is just one kind of note, specifically about usage. This would be a note not about usage but about other stuff of interest to some people interested in words and dictionaries. Also it's not about "need" in much the same way as ranking is not about need. It's purely about interest. — Hippietrail 17:09, 9 December 2005 (UTC)
- If "the big online dictionaries" are divided on an issue, I don't see why we shouldn't come down on one side or other of the debate, particularly if a consensus is reached among us on the Beer Parlour/Tea Room. The fact that we are trying to be all-inclusive shouldn't mean we can't say if something is right or wrong. The issue with dairy comes up all the time because in English you can use a noun adjectivally - machine-gun, post-man etc etc. In theory any noun in the language could be used as an adjective. I just worry that a whole section about other authorities' disagreements might be confusing. Widsith 10:14, 9 December 2005 (UTC)
- In fact once we make a call on one side or the other we are expressing only one point of view (POV) which is specifically against Wikipedia policy though their policies do not always dictate our policies 1:1. But that's not what my proposed section is about anyway. It's an interest section below all the prominent sections for people who like that kind of stuff.
- Incidentally, the examples you give are not of nouns being used attributively, they are both compound nouns which are subtly different. Also the fact that most if not all nouns in English can be used attributively does not mean that some nouns are not also true adjectives. Two online dictionaries say "dairy" is also a true adjective. The other two say "dairy" is always a noun which is often used attributively. The fact that these dictionaries are all made by career lexicographers makes the variance all the more interesting than if it were merely untrained volunteers such as ourselves. I might add a few sections so people can get the idea. — Hippietrail 17:09, 9 December 2005 (UTC)
- I'd like to see your experiment. "Sweeping" it into a Usage note is good for reducing the number and types of third level headings, but might not be quite right for this tricky example. --Connel MacKenzie T C 17:32, 9 December 2005 (UTC)
- I just can't bring myself to belie the term "Usage notes" by putting non-usage in there. If we renamed it "Notes" I wouldn't mind. I don't see any problem with whatever number of level-3 headers. In the meantime, I have added a "Dictionary notes" section with a category to just one article for now but I'll try to do a couple more covering different topics soon. Please see: tsarist. — Hippietrail 01:57, 10 December 2005 (UTC)
- I've now also added a section to gay. I've also made a category Category:Dictionary notes where more will be added. — Hippietrail 17:43, 10 December 2005 (UTC)
- Neat. Would the more general ===Reference notes=== (to allow usage guides, thesauruses, etc.) be slightly better? --Connel MacKenzie T C 23:57, 10 December 2005 (UTC)
How about a "Miscellaneous" header? It could contain:
- External links (including links to Wikipedia)
- Anagrams (as a level 2 header only)
- Gutenberg Project ranking (as a level 2 header only)
- Dictionary (thesaurus, etc.) notes
- The "See also" section??
- To be continued...
Ncik 00:25, 11 December 2005 (UTC)
- Maybe, but I think that having a "Miscellaneous" header would encourage a plethora of crap we don't want. The concept of having specific headers probably shouldn't be abandoned entirely. --Connel MacKenzie T C 02:16, 14 December 2005 (UTC)
- A valid point. Only, having a level 2 header for each bit of information that one wouldn't consider essential for a dictionary might actually encourage people to invent new headers to add their crap there, in which case I'd prefer they added it under a Miscellaneous header. Ncik 00:14, 15 December 2005 (UTC)
- I'd prefer it not be encouraged by a Miscellaneous header.
- BTW, when you say "level 2 headers" you mean level four headers, right? ====Like this====? The least ambiguous way of describing them (I've learned) is not their relative levels, but rather the number of equal signs before the heading text. --Connel MacKenzie T C 02:00, 15 December 2005 (UTC)
- By level 2 I mean 3 equal signs on each side (you won't get a header using just 2 ='s). I will change this and use your terminology from now on. Seems more natural. Ncik 18:17, 15 December 2005 (UTC)
- Minor note: The RFD page uses =level 1= headings. --Connel MacKenzie T C 19:34, 15 December 2005 (UTC)
- Interesting. Wasn't aware this is possible. Ncik 20:47, 15 December 2005 (UTC)
Verbs that are transitive and intransitive
Regarding verbs that have transitive and intransitive meanings, do we want to:
- have a ===Transitive Verb=== and a ===Intransitive Verb=== section;
- have a ===Verb=== section with subsections ====Intransitive==== and ====Transitive====;
- have a ===Verb=== section and specify for each meaning whether it is transitive and intransitive;
- do something completely different?
My view:
- Strong opposition. One level 2 header per part of speech. Also don't want same conjugation twice.
- Used to prefer this one, but now think it's inelegant, and instead tend towards
- Having the opinion that all further grammatical specifications should be given definition dependent (incidentally, I think similarly about things like (un)countable for nouns, and (not )comparable for adjectives and adverbs). Otherwise it would be difficult to add a new definition if one doesn't know whether it's transitive or intransitive (or auxiliary, or ditransitive, or whatever). Probably not a concern for most frequent editors, but Wiktionary should be in a format that won't discourage people with little grammatical knowledge from contributing. Ncik 18:07, 9 December 2005 (UTC)
I actively support #3 as it is my impression from prior Beer Parlour discussion that it is the best. In school, I learned the distinction between 1) nouns, 2) verbs, 3) adjectives, 4) adverbs, 5) pronouns, 6) prepositions and 7) conjunctions while the other parts of speech were merely mentioned as being special. To native speakers of English, there is not a very noticeable difference between transitive and intransitive; it makes a huge difference here on Wiktionary for translations. But I don't think that for the sake of translations, one (navigating a long Wiktionary page) should have to hunt down meanings of a verb - they all must be listed together. --Connel MacKenzie T C 18:51, 9 December 2005 (UTC)
- I used to support #1 but I've learned a lot since I've been active on Wiktionary and now I think #3 is the least we should do. I would actually go further ideally and remove the POS stuff from headings at all and put POS stuff inline at the beginning of each sense.
- This of course would make it possible for people that can't determine the POS of a meaning to add a definition not in the wrong place, rather than guessing the POS (but maybe the editors of such erroneous classifications are in many cases actually convinced that their noun definition is an adjective definition, in which case we are helpless anyway). Ncik 02:31, 11 December 2005 (UTC)
Probably still keeping senses of the same POS together and only adding the POS details at the first sense where it differs from the previous senses. This seems to be what most of the best dictionaries do.
- The English Wiktionary included, lol. Where do you want to put the inflections? All in one place? 'Translations' and 'Related terms' will get ridiculously big, other sections might over time. Ncik 02:31, 11 December 2005 (UTC)
- Oh good point! How silly of me. I guess I just prefer the inline look to the POS-headings look that I overlooked they were the same otherwise. The inflections would go at the same places but without interrupting senses with a level-2 heading. I guess the obvious difference is that there would only be one sequence of sense numbers per homonym rather than per POS.
- I don't think Related terms would be affected other than on some few pages that have multiple such sections per homonym they would be merged. Translations actually already have quite a bit of redundancy where we overdifferentiate senses based on "sub-part of speech". In many cases where we currently have the same word for several senses in a given language we would then need it only once. — Hippietrail 03:33, 12 December 2005 (UTC)
Also I would stop counting transitive/intransitive, countable/uncountable as necessarily distinct. Some senses are the same for both and good dictionaries reflect this. This would all mean more disambiguation tags in Translation sections of course. But again this is what is done in many good translating dictionaries. — Hippietrail 01:38, 11 December 2005 (UTC)
Given names
Why are some capitalised (eg Stephen) and others lower case (eg [[susan]])? -- SGBailey 10:32, 10 December 2005 (UTC)
- On a related issue, why do several names (Appendix:Names_male-M) have full-stops in them? There are at least 3 in the male Ma section of which Mat.j is the oddest. I was going to remove them all but thought I should ask first. -- SGBailey 10:43, 10 December 2005 (UTC)
- The proper names in lower case are probably left over from our conversion to upper and lower case for the first letter which previously had to always be upper case due to software issues. Though some given names actually happen to also be common nouns, in which case both articles will exist. As for the appendix I wouldn't be surprised if it was machine-generated by somebody and contains errors. I'd say go ahead and remove them. — Hippietrail 16:30, 10 December 2005 (UTC)
- OK, thanks. Now what do I do about names like Mace and Mack. The former reckons Mace is a tear gas trade name - I added "A male name" but I've never heard of this as a name, so ... Meanwhile Mack redirects to Mac. Do I add the fact that Mack is a name (if it is) to Mac or do I un-redirect with a See also? And then Magnus redirects to the latin word magnus. Do I add it to magnus or to Magnus? etc etc etc -- SGBailey 18:04, 10 December 2005 (UTC)
- Mack is a perfectly good first name for a person, and should not redirect to Mac. It is also a brand name for large trucks, and has to some extent become genericized. Eclecticology 09:29, 11 December 2005 (UTC)
- Don't ever hesitate to replace a redirect with an entry. There is nothing in Wiktionary that encourages redirects; the only redirects that exist are remnants of a page that has been moved to a better title for what was on it. —Muke Tever 20:35, 10 December 2005 (UTC)
- Well said Muke! Perhaps that should be an official Wiktionary Policy? "Don't ever hesitate to replace a redirect with an entry." I like that a lot! The corollary Never replace an entry with a redirect should probably also be voted into official status. That particular concept seems to be a recurring problem when we get visitors from Wikipedia (whose policies are nearly opposite.) --Connel MacKenzie T C 23:47, 10 December 2005 (UTC)
- We don't need a vote. It is already a matter of general acceptance as part of the culture. Endless formality won't do any good if RTFI is ignored. Eclecticology 09:29, 11 December 2005 (UTC)
- Neither of these are hard-and-fast rules. "Don't ever hesitate to replace a redirect with an entry" is fine when the entry is little more than a cross-reference. It becomes bad when the cross-reference replicates the information at the entry referred to and begins to become out of synch with it. "Never replace an entry with a redirect" is a good general rule but there are exceptions, for example, when a junk entry is cleaned up and becomes a redirect (or perhaps a cross-reference) to a legitimate entry. They are however good rules in most cases. — Paul G 18:26, 12 December 2005 (UTC)
- re: "It becomes bad when the cross-reference replicates the information at the entry referred to and begins to become out of synch with it." — Nevertheless: color/colour, armor/armour, etc., Other wikts may have different policies on standard but regionally-marked spellings such as these, but on en: this appears to be the way things are done... —Muke Tever 20:37, 12 December 2005 (UTC)
Lost in in translation?
There seems to be a persistent user using two usernames User talk:Jezerfetnaeget / User talk:Jezerfetnae to avoid discussion? Perhaps visiting from the Romanche Wikipedia? I took a shot at cleaning up the mess twice now, but the user is not getting it; we are not Wikipedia. Anyone from Switzerland care to try talking to this person? --Connel MacKenzie T C 00:10, 11 December 2005 (UTC)
- I would tend to write off the two names as newbie confusion. I just cleaned up a few, but there were more than what I wanted to handle tonight. Perhaps a note on the pages he is likely to visit as a part of adding these communes will be noticed. Eclecticology 09:15, 11 December 2005 (UTC)
Frequency lists
If anybody is interested in frequency lists, I had the authorization to use the lists here http://wortschatz.uni-leipzig.de/html/wliste.html from the laboratory (they said it's ok to put it under GFDL). They told me that they have other languages somewhere, and that they could maybe send them to me (I said yes of course!). I already wikified the lists and put them in fr: here: fr:Wiktionnaire:Listes de fréquence. --Kipmaster 16:13, 11 December 2005 (UTC)
- Wiktionary:French frequency lists/1-2000 is a place where stuff happens --Wonderfool 22:48, 11 December 2005 (UTC)
- There are not only French frequency lists that I imported, there are also German, Netherlands and English frequency lists. --Kipmaster, 12 Dec, 9:54 French time :p.
- Some very cool stuff indeed! I am interested to see how my analysis measures up. For those of us who don't understand German, where did they derive these lists from? I think in the interest of CYA, you should post the letter where they release their work under the GFDL. Very, very well done! --Connel MacKenzie T C 21:44, 15 December 2005 (UTC)
- Well, I don't understand enough German either to understand where the data comes from ;-). So if someone could translate...
- And yes, I should add the e-mail on wikt. In fact, they told me that they would have released it under public domain if it would have been possible (this is possible in the USA, but not in France and Germany from what I know). --Kipmaster 16:58, 16 December 2005 (UTC)
- e-mail added there: fr:Discussion Wiktionnaire:Listes de fréquence. I don't know if you want to copy/paste the e-mail here. Kipmaster 13:58, 20 December 2005 (UTC)
Klingon
Why do links to the Klingon Wiktionary not work? For example, tlh:neck appears as a link and not on the sidebar. --Gerard Foley 06:20, 12 December 2005 (UTC)
- Some central decision made by someone. I think the Klingon Wik*s may exist as long as they aren't interwikied. I think that was because they thought Wikipedia would seem unserious if a Klingon edition got lots of references. Or something. Jon Harald Søby 10:42, 12 December 2005 (UTC)
Phobias supercategory
Category:Phobias has "Diseases" as a supercategory. I am not a doctor, but my understanding is that things like phobias are conditions, not diseases - can anyone confirm this and make the necessary correction (perhaps making a new category if necessary)? — Paul G 12:53, 12 December 2005 (UTC)
Stress marks in AHD
This has been raised before, but my request is different from the previous discussion.
The symbol used to mark primary stress in AHD is similar to ´ and I've seen this symbol or something similar used by some people. (Secondary stress is indicated by a plain apostrophe. Most of the AHD pronunciations in Wiktionary use a single quote and a double quote respectively instead of these two symbols.)
Could someone please add this symbol (or, preferably, the similar one that has been used in some AHD pronunciations) to the box that appears at the bottom of the page when you edit it? You used to be able to do this by editing Mediawiki:Copyrightwarning but it looks like it's been moved out of there.
Thanks. — Paul G 18:22, 12 December 2005 (UTC)
- this page must be modified: MediaWiki:Edittools Kipmaster 09:17, 14 December 2005 (UTC)
- Ah, excellent. Thanks, Kipmaster.
- I've added this symbol: ´ (Unicode 00B4/ASCII 180). Is it the right one? — Paul G 09:53, 14 December 2005 (UTC)
- I've found the right one now, and added the secondary stress mark too (which is just an apostrophe). — Paul G 11:16, 14 December 2005 (UTC)
A few points:
- We shouldn't use the name AHD. It's my fault because at the time I expected all American dictionaries used the same system and this was the closest I could find to a name for it. In fact each American dictionary uses a different system. Some are similar, some are not. The system we use is similar but different to AHD but nobody has come up with a better name for it.
- As for the stress marks, different dictionaries use different ones. I've been checking for some time. Some use a left-pointing diagonal apostrophe and a right-pointing diagonal apostrophe. Some use both left-pointing but one much heavier than the other. There may be other variations. Unicode does have some marks which superficially look similar but no pair which is semantically correct.
- Since we are not AHD and our pronunciation scheme is not AHD, there is no right or wrong stress mark. Since Unicode lacks a defined pair of characters for this use there is no right or wrong stress mark. Without some solid research and conclusions I see no point in changing horses in the middle of this stream. — Hippietrail 16:55, 14 December 2005 (UTC)
- These are valid points. Also some dictionaries put the stress mark at the beginning of a syllable and others at the end. One needs to pause to determine the policy there before determining the syllable to which the mark applies.
- Pronunciations are very inexact, and will depend on where the speaker comes from. The periodic debate over the pronunciation of the last vowel in the logo has no solution because some people do pronounce it the way it is shown. I personally prefer to avoid getting involved in arguments about pronunciation. I avoid adding pronunciations unless it is clearly important to distinguish two different pronunciations of the word like "lead".
- To me, each phoneme of a word lives in a probability cloud that permits a range of pronunciations. That's what goes into making up the different accents that we hear. What native speaker would have any difficulty distinguishing between the voices of Winston Churchill and Groucho Marx after hearing only a single sentence? Eclecticology 20:30, 14 December 2005 (UTC)
- Just to clarify: I encourage/challenge any resident to England to provide such a pronunciation on Wiktionary#Pronunciation. So far, all English speaking persons have said the four syllable version (Canada, US, UK.) --Connel MacKenzie T C 22:35, 14 December 2005 (UTC)
- Yes, pronunciations do vary greatly with accent, however, British English has RP as a clear, standardised pronunciation, and I use this for UK pronunciations.
- The RP for "Wiktionary" (based on the RP for "dictionary") would be (in SAMPA) /"wIkS@nrI/ - three syllables, with a "short" i at the end. However, this sounds very dated to modern British ears. Few British speakers would use it these days, using something closer to the US pronunciation: /"wIkS@n@ri/ or /"wIkS@%nEri/, with a shortened "long" i at the end. — Paul G 18:32, 15 December 2005 (UTC)
Translations of names of chemical elements
Over the past few days I've been going through the chemical elements (all 118 of them - phew!) and cleaning up, adding derived terms, synonyms, etc.
I know that most of the translations were imported from an external site (with permission) but it looks as though these might not be 100% up to scratch. It's not always clear whether this is down to the author of the external site or the person who copied the translations. Here are the problems I have noticed:
- Many of the translations (in particular in non-Latin scripts) were entered with an initial capital letter. I am fairly certain in the case of modern Greek that this is incorrect, and have been correcting these. However I cannot comment on the translations in Cyrillic script. It is possible that the elements named after proper nouns (people, countries and places) might have an initial capital, but I have no idea whether this is the case. What makes me particularly suspicious is that all the translations on the external site seem to have initial capital letters, which does not help. I know next to nothing about these languages. It would be useful if someone more knowledgeable could look into this.
- Most of the modern Greek translations are missing accents. All polysyllabic words in modern Greek have an accent on the stressed syllable. These need to be checked by someone who is familiar with modern Greek or has access to a good, up-to-date Greek print dictionary. The same might or might not apply to words in other languages. — Paul G
- Some of the translations have commas in them, as if they have been edited badly when copying and pasting them. Again, I don't know whether these commas are part of the translations or not.
- The Latin translations of the elements that have been known for the longest time (such as for "tin" and "iron") are no doubt correct, but the translations for the more recently discovered elements (in particular those that as yet have no official name) must be New Latin or Neo-Latin or whatever this would be called, not Latin. Despite the Latin appearance of the name, the Romans didn't know about ununbium, et al!
I know that someone went to a great deal of trouble to copy these translations into Wiktionary, and of course we are indebted to them, but it looks like there is still some work to be done to check these.
This is always the danger with copying translations into languages one is unfamiliar with. Ideally, we would all enter translations that we know or can be certain of, but of course this is not possible, especially as Wiktionary does not yet have contributors whose combined knowledge covers all foreign languages. — Paul G 16:15, 13 December 2005 (UTC)
- Latin is no more the exclusive domain of the Romans than English is of the British. This is a recent notion; see Humanist Latin. —Muke Tever 20:08, 13 December 2005 (UTC)
- (I should also mention that chemistry is one of the few fields in which official Latin names are still being produced — w:International Nonproprietary Names of chemicals are published in English, French, and Latin.) —Muke Tever 20:10, 13 December 2005 (UTC)
- Botany even more so. The last time I looked the formal description of a plant had to be written completely in Latin. This is not the case for zoölogy.
- Good point about the Greek. Perhaps someone could add the accented Greek letters to the list at the bottom of the edit page. At this point we only have the rough breathings. Classical Greek would have even more diacritics but I don't think we need them as much. Eclecticology 23:44, 13 December 2005 (UTC)
- Ah, I didn't realise this about Latin. Thank you for enlightening me, Muke. I know that Latin terms are created for new concepts (there was a dictionary published recently that included Latin for "football" and "mobile phone" among other things, I believe), but I thought these terms were called "New Latin".
- The accented Greek letters are already in the Greek section of the edit box at the bottom of the page. The letters given there are for modern Greek only - perhaps we need to change "Greek" to "Greek, Modern" (or "Modern Greek") to emphasise this, and add a separate section for Ancient Greek. These will be very useful for etymologies. — Paul G 09:45, 14 December 2005 (UTC)
- You're right. It's sometimes hard on the eyes to distinguish these things :-). We have ά έ ή ί ό ύ ώ which are indeed the accents. We should have ἁ for the rough breathing and ἅ for the accented rough breathing. This should apply for all vowels. Perhaps too we should have ϊ and ϋ with and without the accent. That one should not be necessary for the other vowels. ISO 8859-7 would have us using combining diacriticals for some of these, but I'm not enthused about having to repeatedly explain how that works. Classical Greek would open up a wide range of possibilities, but that can be discussed at some time in the future. Eclecticology
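As an illustration of the precomposed versus combining distinction raised above, here is a minimal Python sketch. It uses only the standard `unicodedata` module; the helper name `describe` is invented for this example and is not part of any Wiktionary tooling. It shows how a precomposed accented Greek letter breaks down into a base letter plus combining marks under canonical decomposition (NFD).

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for the NFD decomposition of text."""
    decomposed = unicodedata.normalize("NFD", text)
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in decomposed]


# Modern Greek alpha with tonos, precomposed as U+03AC.
for code_point, name in describe("\u03ac"):
    print(code_point, name)

# Alpha with rough breathing and acute accent, precomposed as U+1F05.
for code_point, name in describe("\u1f05"):
    print(code_point, name)
```

An edit box could offer either form; the precomposed characters are easier to paste, while the decomposed form makes explicit which diacritics are present.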
Abandoned work pages
I was just looking at User:Kevin Rector/Offline reports/Articles that are probably missing etymology, and related pages. My first impulse was to want to delete them. It is, however, a part of a broader class of pages. For the most part they have been set up by someone with a perfectly valid personal project in mind, when they want to have means of tracking progress in that project. In this particular case Kevin set up the page in April, became an admin in May, and hasn't edited at all since June. It seems that the page is hopelessly out of date, so that if he were to resume his project he would do well to recreate the page using current data.
Pages of this sort contain long lists that regularly appear and re-appear on "What links here" when I'm trying to clean up something else. That can add tedious complexity to otherwise simple clean-up jobs. If such a page is currently being used by an editor it should stay. If the editor is currently active without using the page, he should be asked about it. Does anyone have any opinion on when these work pages should be considered abandoned and deletable? Eclecticology 09:23, 14 December 2005 (UTC)
- I check links when I delete a page, but if it is one of these "project" pages, I'll leave it alone. I think most of the time, they provide a reasonable reference (especially for my redirects pages) on what remains to be done, at least in some situations. I guess I don't understand what keeping them for a year hurts. --Connel MacKenzie T C 01:51, 15 December 2005 (UTC)
Do we have any policy about smileys? I see no reason to omit things like -), (_x_) (kiss my ass), (_E=mc2_) (a smart ass) are examples of assicons
- The Wiktionary community has objected to leet to varying degrees over time, but pure symbols generally get deleted pretty quickly. Please sign ~~~~ your posts here. --Connel MacKenzie T C 22:30, 14 December 2005 (UTC)
- Personally I'm against them, at least in the regular namespace. Maybe an index page like we've had for certain other things. I've seen them in print dictionaries but in a special section which would be equivalent to our indices. One problem would be marking languages since some are cross-cultural and others are only seen or are much more common only in certain parts. The subject of Japanese (or Asian?) smileys being just one case. Another issue is currency, many lists have all kinds of exotic variations that are just not common enough to bother about in the wild. — Hippietrail 01:00, 15 December 2005 (UTC)
Anagrams
I am a new user and I was wondering what the policy is.
- I think policy is to include them, see Wiktionary:Beer parlour/2005/October-December#Anagrams "official" and Wiktionary:Beer parlour/2005/October-December#Anagrams. Gerard Foley 15:04, 14 December 2005 (UTC)
Spam
Someone keeps adding spam to wiki projects [http://www.g155 .info 0] See [7]. This is being added to other wiki projects also, is there a way to stop it automatically? Gerard Foley 23:34, 14 December 2005 (UTC)
- I searched my off-line copy of all current_pages from this week's XML dump and found only a handful of references to ".info". They each turned out to be legitimate ===Further reading===, so apparently Wiktionary's sysops are doing an adequate job blasting that stuff when it crops up. Trying to post a reply here, I was blocked by Wiktionary's new spam filter, so I edited/broke the link you had above (by inserting a space) just to post this reply. --Connel MacKenzie T C 16:19, 16 December 2005 (UTC)
- It's because g155/.info has already been added to the spam - blacklist on meta. Please post new spamlinkpages on the spam blacklist-talk page, thanks --birdy (:> )=| 16:33, 16 December 2005 (UTC)
- Thanks for this Gerard Foley 23:49, 18 December 2005 (UTC)
Date and time
I just noticed that this web site never adjusted for daylight savings time. I don't have time to submit the bug to bugzilla right now. But if you are a logged in user in a timezone that uses DST, you may want to check your "preferences" at the very top of this page, then under "Date and time" click on "fill in from browser" then save. --Connel MacKenzie T C 16:04, 16 December 2005 (UTC)
- It's like that on every wiki, not just this one… Jon Harald Søby 16:18, 16 December 2005 (UTC)
statistics
Are there any statistics somewhere that show how many articles there are per language (in en.wikt)? If not, I think I'll do that (waiting until Friday for my holidays). Kipmaster 12:53, 20 December 2005 (UTC)
- No. Basically because as yet the MediaWiki software has no specific support for Wiktionary, this means it has no awareness of what languages words are defined for on each page. To find this information, the only current method is to download the database file and create your own tool to parse it. I did this many months ago but lost my tool in a power surge since. Good luck! — Hippietrail 16:09, 20 December 2005 (UTC)
- You should have used language templates :p. I'll see what I can do though, I've already developed such a tool for fr:, that's why I propose to help. Kipmaster 18:36, 20 December 2005 (UTC)
- Templates didn't make much difference. Especially since here in the early days we even had competing templates. Basically my parser accepted many synonyms. It even logged which variants were more popular. Anything it didn't understand went into a separate log which I could check, adding the common ones and looking at the uncommon ones for errors. — Hippietrail 15:23, 21 December 2005 (UTC)
- Here it is: User:Kipmaster/statistics. Should inconsistencies in the database be corrected? (see my comments on the stats page). Anyone can edit this page of course, and maybe move it to Wiktionary:something. Kipmaster 12:43, 24 December 2005 (UTC)
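For readers curious what such a per-language count might look like, here is a rough Python sketch of the approach described in this thread: scan each page's wikitext for language headings, map spelling variants to one canonical name, and log anything unrecognised separately. It is not the tool either contributor wrote. The synonym table, the sample pages and the function names are all invented for illustration, and the sketch assumes a language section is introduced by a heading with exactly two equal signs on each side.

```python
import re
from collections import Counter

# Hypothetical synonym table: normalised heading text -> canonical language name.
SYNONYMS = {
    "english": "English", "en": "English",
    "french": "French", "fr": "French",
    "german": "German", "de": "German",
}

# A heading with exactly two equal signs on each side, e.g. ==English==.
HEADING = re.compile(r"^==[ \t]*([^=].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def normalise(raw):
    """Strip link/template punctuation and lowercase the heading text."""
    return raw.strip("[]{}- ").lower()


def count_languages(pages):
    """pages: iterable of (title, wikitext) pairs.

    Returns three Counters: pages per language, raw heading variants seen,
    and headings that matched no known synonym.
    """
    languages, variants, unknown = Counter(), Counter(), Counter()
    for _title, text in pages:
        seen = set()
        for raw in HEADING.findall(text):
            variants[raw] += 1
            language = SYNONYMS.get(normalise(raw))
            if language is None:
                unknown[raw] += 1
            else:
                seen.add(language)
        languages.update(seen)  # count each language once per page
    return languages, variants, unknown


# Invented sample data standing in for pages read from a database dump.
SAMPLE_PAGES = [
    ("chat", "==English==\n===Noun===\n# talk\n==[[French]]==\n===Noun===\n# cat\n"),
    ("Haus", "=={{-de-}}==\n===Noun===\n# house\n"),
    ("foo", "==Elvish==\n# ?\n"),
]

languages, variants, unknown = count_languages(SAMPLE_PAGES)
print(languages)
print(unknown)
```

Keeping the unknown headings in their own counter mirrors the separate log mentioned above: frequent entries are candidates for the synonym table, rare ones are likely errors on the page.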
Can someone please update this to
This is a file from the Wikimedia Commons. The description on its description page there is shown below. |
Gerard Foley 11:08, 21 December 2005 (UTC)
- Done; thank you Gerard. --Connel MacKenzie T C 00:15, 23 December 2005 (UTC)
No problem, I only copied it from b:simple:MediaWiki:Sharedupload. Gerard Foley 00:58, 23 December 2005 (UTC)
HTML
Can I use HTML or XHTML in my pages on Wikimedia?
--Natovr 19:21, 21 December 2005 (UTC)
- Use Wiki markup whenever you can. There's not too many kinds of things HTML markup will be needed for. Wiktionary is light on formatting. — Hippietrail 01:12, 22 December 2005 (UTC)
IPA templates broken
Earlier I did some work on the pronunciation section of peduncle. Our IPA templates are supposed to force IPA-capable fonts such as Arial Unicode MS. The computer I used for those edits did have that font installed but the page was still rendered unreadable. Does anybody know what could be wrong? I do have a custom CSS which sets body text to Times New Roman - could that be the problem? If so are there any CSS experts who might be able to suggest a way for both the template and custom CSS to work together? — Hippietrail 00:23, 23 December 2005 (UTC)
I have been reviving this page and deprecating Wiktionary:Index to Internals. Thus far that has involved reorganizing the page in broad interest areas, and removing those links from the Index to Internals. A few others have been moved from the Index to Internals, and others will follow.
Index to Internals is an alphabetical list of topics about Wiktionary, often with a topic having several entries to reflect possible ways of approaching a topic. This approach is useful only if you know precisely what you're looking for. If you know what you are searching for the search box is always available.
What I'm looking for now is feed back on the broad topics that I have put on the Utilities page. What others could there or should there be? What are too many or too few sub-topics? Please respond at Wiktionary Talk:Utilities. Eclecticology 20:49, 23 December 2005 (UTC)
Category tracing
Hand in hand with the organizing mentioned above is the idea of category tracing. This means that whenever a category is placed on an article we are fitting it into a hierarchical taxonomy of categories. At this point we have one top level category Category:*Topics. Ideally, every categorized article should be able to trace its way back to that category. These top level categories (which begin with an asterisk) will always appear at the top of a list, and must remain very few to be effective. I will be introducing a second one Category:*Wiktionary which will deal with matters that are specific to Wiktionary operations. These will be about the medium rather than the message.
I would ask some of our more technical members whether there is a way to have a box on an article which would automatically generate such a trace. Thus if we categorize a skunk in Category:Mammals the box will show that it is in turn in Category:Animals which is in turn in Category:Biology, and in turn Category:Sciences which is finally in Category:*Topics. What would be required to make this work? Eclecticology 20:49, 23 December 2005 (UTC)
- Well, there is a feature ($wgUseCategoryBrowser) built in to the software that will, on category pages, show all the up-tree categories. According to bugzilla:1571 this is turned off for performance reasons — presumably the same performance issues (indeed more so) would attend the 'portable version' of this feature. —Muke Tever 21:18, 25 December 2005 (UTC)
- How serious would that performance issue be? Back before we had categories I took a random sampling of articles on Wikipedia and used "What links here" to trace the article back to the Main Page. The resulting path seldom had more than five steps in it. I believe that IMDB had (and may still have) a feature which connected any two movie personalities through a series of movies. That path could be amazingly short. If I'm not mistaken the performance problem would be a direct function of the number of lookups needed when a page is called up. Link-rich pages tend to be the slowest. If a trace to the top category involves five steps that would mean five lookups, and the performance problem would only become serious if an article is directly in too many categories. It would be an interesting experiment to have this turned on for Wiktionary, which is still a much smaller project than Wikipedia. Eclecticology 21:59, 30 December 2005 (UTC)
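The trace described in this thread can be sketched roughly as follows. This is only an illustration, not how MediaWiki's category browser works: the `PARENT` map and the `trace` function are hypothetical, the map contains only the chain named above, and it assumes each category has a single parent, which real categories often do not.

```python
# Hypothetical parent map: each category points to the one category that
# contains it. Only the chain named in the discussion is included.
PARENT = {
    "Category:Mammals": "Category:Animals",
    "Category:Animals": "Category:Biology",
    "Category:Biology": "Category:Sciences",
    "Category:Sciences": "Category:*Topics",
}


def trace(category, parent=PARENT, limit=50):
    """Return the chain from `category` up to a category with no parent."""
    path = [category]
    seen = {category}
    while path[-1] in parent:
        nxt = parent[path[-1]]
        # Guard against category loops and runaway chains.
        if nxt in seen or len(path) >= limit:
            raise ValueError("category loop or chain too long")
        path.append(nxt)
        seen.add(nxt)
    return path


path = trace("Category:Mammals")
print(" > ".join(path))
# One lookup per step up the tree, which is the cost discussed above.
print("lookups:", len(path) - 1)
```

With one parent per category the cost grows with the depth of the chain. An article sitting directly in many categories would need one such walk per category, which is where the lookups would multiply.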
Where report vandalism?
Anon 63.19.199.217 is replacing pages with Mayodan (apparently a known vandal). I'm not as familiar with Wiktionary as Wikipedia; where do I report it? I reverted all the pages I could but a couple are creations. JillianE 19:47, 25 December 2005 (UTC)
Now at user:63.19.143.81 . JillianE 20:20, 25 December 2005 (UTC)
(Previously placed in tea room, sorry). JillianE 20:31, 25 December 2005 (UTC)
Category for Dialect words
There is no (or I couldn't find) a category for dialect words - this seems odd, so maybe the omission is deliberate? Is one needed? – Saltmarsh 07:22, 26 December 2005 (UTC)
- Feel free to create some. I guess we'd need a supercategory for each major language to cover its dialects, a category for each dialect of each major language, and a superdupercategory to cover the whole lot. But no need to make them all up front. Just make the ones that you find or create entries for. I'm not sure what kind of naming convention might be best but maybe "Dialect words", "English dialect words", "Australian English words"... — Hippietrail 16:33, 26 December 2005 (UTC)
Multiple Etymologies
Is there any consensus on the clearest way of setting out multiple etymologies? I have found a number of different existing examples, all of which may have shortcomings. Example 3 at User:Saltmarsh/Sandbox2 gives clarity but may compromise structure because:
- It adds to headings viz Etymology 1
- It indents headings beyond the usual position viz ====Noun==== (with 4=)
Is there a good example of how it might be done? cheers Saltmarsh 07:35, 26 December 2005 (UTC)
- Personally I don't like the numbers because they are arbitrary. It's not as though they are stable enough to use when cross-referencing. I do like the extra heading levels since heading levels show the structure which is very helpful and also reflected in the table of contents. This is also what HTML heading levels were specifically designed for. Trying to force rigid heading levels is a simplistic view. I'm not against also upping the levels on articles with a single etymology but it feels like a lot of pointless work and I wouldn't be surprised if other contributors would be against it. — Hippietrail 16:26, 26 December 2005 (UTC)
- I agree that better ways of dealing with multiple etymologies may exist, but until we do have that I can see no other immediate practical approach that recognizes the fact that a wide series of words may be spawned by the same etymology.
- I agree with what Hippietrail says about heading levels. Except for using Level 2 for languages, I have always considered heading levels as relative rather than absolute. Eclecticology 23:07, 26 December 2005 (UTC)
- Saltmarsh has my full support. Example 3 is exactly what I used to do and tried to make other people do. It allows cross-referencing to different etymologies, and it unifies format and makes adding a new etymology less tedious. It is bad manners to force people adding new things to change stuff they are not (and indeed should not be) concerned about. If other people agree, the changes won't take very long as we know from experience (top/mid/bottom templates). Ncik 02:12, 27 December 2005 (UTC)
- I have now found this format as the recommended method, but thus: Etymology (1), under Wiktionary:Entry_layout_explained#Homographs. As they say, RTFM. Thanks for your time. Saltmarsh 07:39, 27 December 2005 (UTC)
- Forget about the unnecessary parentheses. Removed them on ELE. Ncik 03:03, 28 December 2005 (UTC)
Boxing day present
Merry Christmas,
Have a look at this first Wikidata application for the GEMET data. GerardM 19:44, 26 December 2005 (UTC)
- It has a long way to go, but I suppose that as a set of translation lists it could be helpful. I happened to look at its entry for waste and I found it defined as "Straw, hay or similar material used as bedding by animals". This is a completely eccentric definition, even if it is appropriate to the sometimes synonym litter. To whatever extent it may have value the 1913 Webster gives five definitions for the noun alone, and that is not one of them. Eclecticology 00:44, 27 December 2005 (UTC)
Page format
I use the classic skin and Win98/MSIE5.5. From this viewpoint, there is something wrong with many (all?/most?/some?) Wiktionary pages. They are way too wide, and you have to constantly scroll left and right. In particular this applies to the main page, which is the worst so far. Wikipedia does NOT suffer from the same problem. -- SGBailey 09:26, 28 December 2005 (UTC)
Creating a new entry
I've been here before and done it a few times. Yet today I couldn't find ANY reference to how to create a new entry. There used to be a link from either main page or community portal to "creating a new entry". It just isn't there. I suppose I'll have to edit my user page to make a redlink and then follow the link.
(If it is there and I've missed it, then it is too well hidden to be fit for purpose.) -- SGBailey 09:33, 28 December 2005 (UTC)
- I don't think it's expected that one will just sit down and say "I'm going to create a new entry." The first step in the process is generally seeing whether an entry already exists, i.e. typing your word into the search box and hitting 'Go' (or enter). If the article doesn't exist, the search failure page prominently features a link to create the new article, and even offers several helpful templates to begin creating the article from. —Muke Tever 19:21, 29 December 2005 (UTC)
- I agree with your procedure but not your facts. I use classic skin, MSIE5.5 Win98SE and I do not get a create a new entry link. To be specific, I get the usual links in top and left of page, the fundraising links, and
- "Search For query "melpel" For more information about searching Wiktionary, see _Searching_Wiktionary_. Sorry, there were no exact matches to your query."
-- SGBailey 21:05, 29 December 2005 (UTC)
- That is what happens when you go out of your way to click "search" (which is inappropriate here, as you're not looking for the word in any article, you're looking for this word as an article title). When you click "go" (or press "enter" when the form is in its default state of having 'go' selected) you get "Search For query "melpel" For more information about searching Wiktionary, see Searching Wiktionary. No page with this exact title exists; trying to find similar titles. You can create an entry with that title or put up a request for it or browse nearby pages. For single word entires [sic], you can also create it with one of the following preloaded templates:" [snip table] "Sorry, there were no exact matches to your query." This happens in the classic skin as well as the default Monobook skin (I just checked this). —Muke Tever 18:57, 30 December 2005 (UTC)
NEW PROJECT SUGGESTION: Wikimology
I think you should start a new database of names and surnames, called...oh, I don't know, Wikimology, perhaps? At any rate, you should. People would love to look up name and surname ideas for books or movies or even for fun!
It would sort of be like behindthename.com and surnames.behindthename.com.
Well, tell me what you think of the idea.
Sincerely,
Vaidehi
How many words?
There are currently 109,097 word entries in Wiktionary. But we still have a lot of entries in requested entries and the list of redlinks is still very long. So there is a lot more to be done. Can anyone hazard a guess on how many entries there will be when we're done? (That doesn't even cover that we need to go back and polish, expand and improve existing entries). JillianE 05:03, 29 December 2005 (UTC)
- Our current number of entries is close to the number of entries in the 1913 Webster, but there are still a lot of its words that we don't have. By my estimate my copy of the 1914 Century Dictionary in 10 volumes has about 300,000 words. Then there's all that stuff in the OED. That must bring things close to a half-million for English alone. Add in all the other languages (which admittedly fall short of English in vocabulary richness) and we should easily be able to exceed 2 million.
- The number of entries when we're done presumes that we will some day be done. Eclecticology 09:34, 29 December 2005 (UTC)
Special:Wantedpages and people's lists of words
This list of most linked-to pages is heavily influenced by people's various lists of words. For instance, of the 19 links to "shows" only 1 is from a proper article (the singular). Perhaps people might consider deleting any lists that they no longer have a use for. SemperBlotto 08:40, 29 December 2005 (UTC)
- I'm sympathetic to what you say; see what I raised at #Abandoned work pages above. When these lists get long enough or there are too many they lose much of their usefulness. Look at Wiktionary:Requested articles:English or Wiktionary:Requested articles:English/DictList/A or the indexes and we find that people have done an excellent job indexing the things that we don't have several times over. I think that these are all subject to the law of diminishing returns. It would be nice if a lot of these could be consolidated. In many cases it's not a matter of no longer having any use for the lists. Their author may even have forgotten that they exist or may himself be no longer active. Yet some of our colleagues will argue very strongly that we should keep them all. Eclecticology 10:12, 29 December 2005 (UTC)
- Any chance of getting developers to add a filter so it doesn't count list pages somehow? It begs the question of how to recognize entries not to count. (Maybe also have it stop reporting links to non-existent user and user:talk pages.) JillianE 20:25, 29 December 2005 (UTC)
- I don't know if that would help, but then I'm naturally suspicious of technical solutions. I also don't attach much importance to article counts; they never give much more than an approximate idea of where we stand. I just clicked on "Random page" 20 times. This gave me 16 regular pages, 2 transwiki pages, 1 appendix page and 1 rhyme page. I don't know how statistically significant that is. "Appendix:" is now a pseudonamespace, and the rhymes could conceivably be in there too; there has been community approval for that to be a full namespace. The transwiki pages just need cleaning up, and decisions made. User links should not normally appear in the main namespace. Eclecticology 21:20, 30 December 2005 (UTC)
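The kind of filter asked about in this thread might look something like the sketch below. It is not a MediaWiki feature: the prefix list, the function names and the sample links are all made up for illustration, and it assumes list pages can be recognized by a title prefix, which is exactly the open question raised above.

```python
from collections import Counter

# Hypothetical prefixes treated as "not a proper article". "Appendix:" was a
# pseudo-namespace at the time, so it is matched by title prefix as well.
NON_ARTICLE_PREFIXES = (
    "User:", "User talk:", "Wiktionary:", "Appendix:", "Transwiki:",
)


def is_article(title):
    """True if the title does not start with one of the listed prefixes."""
    return not title.startswith(NON_ARTICLE_PREFIXES)


def wanted(links, existing):
    """Count links to missing pages, ignoring non-article sources and targets.

    `links` is an iterable of (source_title, target_title) pairs and
    `existing` is the set of titles that already have a page.
    """
    counts = Counter()
    for source, target in links:
        if target in existing:
            continue  # the page exists, so it is not wanted
        if not is_article(source):
            continue  # link comes from a word list, user page, etc.
        if not is_article(target):
            continue  # missing user and user talk pages are not counted
        counts[target] += 1
    return counts.most_common()


# Made-up sample data, loosely modelled on the "shows" case above.
sample_links = [
    ("show", "shows"),
    ("User:Example/wordlist", "shows"),
    ("Wiktionary:Requested articles:English", "shows"),
    ("show", "User:Nobody"),
    ("belly", "show"),
]
print(wanted(sample_links, existing={"show", "belly"}))
```

A prefix test would still miss word lists kept in the main namespace, so some explicit marking of list pages would probably be needed as well.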
Help pages
There has recently been an effort to import help pages from Meta, and to update them automatically whenever they are changed on Meta without regard to changes that we may want. To discourage further editing a template has been devised to add the following on each such page:
- This is a copy of the master help page at m:Help:H:h Help. Do not edit this page. Edits will be lost in the next update from the master page. Either edit the master help page for all projects at Meta, or edit the project-specific text at Template:Ph:H:h Help. You are welcome to copy the exact wikitext from the master page at Meta and paste it into this page at any time.
I have blanked that template at Template:H:h Help.
Much of what is said in these master templates is very good information, and if most of that information is studied on its own merits would probably stay. Nevertheless, it should be up to the community on this project to decide which of these policies should apply; as is often said, "That's the wiki way". Changes to these pages may very well be discussed on Meta before they are made, but that does not mean that their applicability has been discussed here, or that we have abandoned this community's rights to be consulted or to have a different version on some parts of the help pages.
If anyone wants to see how any of these topics are treated in Meta, a simple interwiki link should be enough. Eclecticology 19:30, 29 December 2005 (UTC)
Templates for inflexions
What has happened to section 16 of Wiktionary:Index to templates ("Inflexions")? All of a sudden, the system that gives entries of the form:
is being given as the standard, over the more frequently used system that gives:
- belly (plural bellies)
and the tables of templates that give inflexions of nouns, adjectives and verbs have been deleted.
I don't believe there has been any discussion about this.
I prefer the second of the two templates above for giving the plural of a noun, as I think the layout is neater (and likewise with the templates for adjectives and verbs), but I don't have a problem with the first one. I'm not keen on the tabulated form, but if people want to use it, that's up to them. What I do have a problem with is the suggestion now seems to be that the first way of doing it is the only way (which it is not) and is the Wiktionary standard (which it is not).
Can't we just allow users to choose which they would prefer to use, and have the tables restored to "Index to templates", please? Some of us put a lot of effort into constructing these templates and tables and I am disheartened, not to mention angered, to see my work cast aside.
I see Ncik made these changes. I'll be speaking to him privately about this. — Paul G 21:31, 29 December 2005 (UTC)
- Your comments are well taken. I am not too attracted to a long series of templates that I can't remember, but these things need time to live or die on their own merits. I essentially like the appearance of what Ncik has done for verbs, but have adopted a simplified model which in most cases requires remembering only a single template which may require a little more typing. For nouns, I think that the previous layout was fine, but I see no point to having nine forgettable templates to accomplish this.
- At this point perhaps we can run with both sets of templates, showing both options on the Index page, and let people choose as they will. A little Wikidarwinism may be in order. Eclecticology 22:49, 29 December 2005 (UTC)
- Only just came across this thread. Here is what I replied to Paul G in this matter on my talk page:
- There are too many of those templates. How can we ask editors to memorise all these things or to look them up every time they add a word. They are also not flexible enough, hence don't cover sufficiently many cases. The irregular verbs template is completely inadequate as many irregular verbs have more than just one possible conjugation and you are not able to add these additional forms with the template's automatic formatting. The irregular verbs (same for all grammatical categories, actually) category should be added manually, so nobody is required to determine whether a word is irregular or not. The conjugation templates also don't cover for regular doubling verbs which are not doubled in American English (e.g. travel). Despite the huge amount of adjective templates they don't cover all irregularities; eg: old, older, oldest and old, elder, eldest. I also don't like that there is a template for not comparable words: You never know if you know all meanings of a word, there might be one you don't know about which is comparable. Use Template:not comparable at the beginning of each definition instead. Same holds for countability of nouns, etc. The noun templates are unsuitable to be used with words that have more than one plural (you will find many of these in Category:English irregular plurals). A look at the rules for when a word is to be counted as irregular at the beginning of that category shows that your templates would produce the incorrect plural *Germanies. These arguments are not exhaustive, but anyway some thinking of your own should convince you that the templates you prefer should be abolished. Ncik 03:01, 30 December 2005 (UTC) :Ncik 04:45, 31 December 2005 (UTC)
- You're missing the point entirely. It's not just about which templates are better; it's about how decisions are made. It's about showing some respect for the work that others have done even when you have found a better way to do things. Until a consensus is reached both systems should be allowed and described. Meanwhile any organized attempt to remove all traces of the other system should henceforth be viewed as POV pushing. Eclecticology 09:30, 31 December 2005 (UTC)
- I have used and improved my templates for several months now. I have repeatedly introduced them in the BP and on User talk pages and asked for feedback. Didn't get much, but the comments I got were mostly positive (apart from Connel's inevitable outright disapproval). That's why I decided to whack my templates on Wiktionary:Index to templates and delete the others, wondering whether they will survive the Wiki-Darwinism. Ncik 15:46, 31 December 2005 (UTC)
- I have restored the original to the Index page, and left Ncik's version as an alternative. Although Connel may have been most vocal in opposing Ncik's version he was not alone. As for the lack of feedback ... that's life on the wiki. There are always endless ideas floating about. Most will not be taken very far by their own creators, perhaps out of his own boredom, so there's no point to debating those. Opposition generally does not appear until one starts putting his ideas into action; it gets strongest when the creator starts imposing those ideas. Eclecticology 23:07, 31 December 2005 (UTC)
- Thank you, Eclecticology, for doing this. I think this was the appropriate thing to do. I would have done it myself but I am not keen on cleaning up after others and wanted it to be raised for discussion first.
- Ncik, a couple of points in defence of the templates. I disagree that they should be abolished, for the following reasons:
- They have been active for a long time now, and you are the only person as far as I am aware who has objected to their use.
- They are used not just by me but also by other users, which means they are now an established part of Wiktionary.
- They are a part of many pages, and abolishing them would mean a lot of pages would need to be edited.
- The inflections can always be written out in full if desired, if the template doesn't fit or if the category would be wrong. There is no requirement to use them if you don't want to.
- The templates cover almost all situations and save a lot of typing. Again, cases that don't fit can be written out in full, but having the templates available means that this need only be done 1% or whatever of the time.
- The templates automatically categorise various words. Again, if the category comes out wrong, for a particular word, don't use the template.
- The plural of "Germany" is indeed "Germanys" rather than "Germanies" (all proper nouns ending in "y" add an "s" irrespective of standard spelling rules). If there isn't a template to cover this, the options are either not to use a template or create a new one, rather than to say that the template is wrong and should be thrown out.
- If a new sense comes along that means that the template no longer applies (eg, a countable sense of a noun or a comparable sense of a verb, or new comparatives and superlatives such as "badder" and "baddest"), a more appropriate template can be used, or the inflections can be written out in full. It is no big deal to switch to a different template, short of looking it up, and any categories to which the word is added (or was previously added) will automatically be updated when the edit is saved.
- Yes, there are a lot of templates, as there are a lot of cases to cover. Although I created many of them, I know only the most frequently used ones off by heart and do not intend to memorise them all. That's what the tables at Wiktionary:Index to templates are for - as a reference. That page is in my Internet bookmarks and I look up templates there as and when I need them.
- Yes, you're right about the "doubling" case for verbs (such as "travel" -> "travelling" (UK), "traveling" (US)), so why not be proactive and change it so it gives both UK and US spellings?
- To summarise: no one has to use the templates; most of them do a fine job in my view; and if they don't do what you want them to do, either don't use them or make a new one that does. — Paul G 12:33, 1 January 2006 (UTC)
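The two noun cases argued over in this thread ("belly" and "Germany") can be put side by side in a small sketch. This is not the logic of any existing Wiktionary template: the `plural` function and its `proper` flag are hypothetical, and it only covers the regular rules mentioned here, so irregular plurals and words with more than one plural would still have to be written out by hand.

```python
VOWELS = "aeiou"


def plural(noun, proper=False):
    """Regular English plural for the cases discussed in this thread.

    Proper nouns ending in consonant + "y" simply add "s";
    common nouns ending in consonant + "y" change "y" to "ies".
    """
    if len(noun) > 1 and noun.endswith("y") and noun[-2].lower() not in VOWELS:
        if proper:
            return noun + "s"
        return noun[:-1] + "ies"
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


print(plural("belly"))
print(plural("Germany", proper=True))
```

Whether a noun is proper has to be supplied by the editor, which matches the point made above that a template cannot decide this on its own.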
I'd just like to point out that the popularity of Ncik's templates is due to him converting thousands of entries (starting with the irregular verbs, then stalking my entries since then.) Until *all* of Ncik's entries are reverted it is unfair to claim (as Ec seems to be) that one is somehow more popular than the other. I repeat: Ncik is the only militant proponent of his ugly non-standard templates. The longer his abuses are condoned, the more newcomers think that perhaps the ugly pink boxes are intentional. --Connel MacKenzie T C 01:45, 5 January 2006 (UTC)
- Other people like and use these templates as well. The only criticism you have is that they are ugly. Nothing substantiated has been contributed by you. Ncik 15:58, 7 January 2006 (UTC)
- I, too, find them unaesthetic, wasteful of browser real estate. I have not commented previously because I choose not to use them. - Amgine/talk 00:02, 8 January 2006 (UTC)
- Now Ncik claims to know what other people think? My oh my. Other people have used your templates occasionally probably under the assumption that you would continue hounding them if they didn't. That is hardly consensus. Your vandalism of thousands of entries (unchecked) certainly skewed the balance...so even if a newcomer knew your templates were wrong, they'd have a hard or impossible time finding the correct ones. Your flat out lies are silly. I've presented dozens of specific critiques of your templates as have others. Since you've adopted Polyglot's suggested input parameter format (slightly reordered) almost half of my complaints have been addressed. But then again, you reordered the input parameters against his suggestion, therefore it remains wildly unacceptable. My complaints about the boxes' inflexibility remain. My complaints about categories remain. My complaints about the excessive verbosity remain. My complaints about your stomping on other people's naming conventions remain. My complaints about your vandalizing entries remain. And the pink shit is ugly. --Connel MacKenzie T C 01:01, 12 January 2006 (UTC)
Old entries - why?
The Old entries link on the main page seems to no longer serve a purpose. It pulls up a list of Chinese/Japanese ideograms(?) and question marks. I am sure it gives an odd impression to new visitors! Or am I missing something? Saltmarsh 12:50, 30 December 2005 (UTC)
- Special:Ancientpages is really more of interest to potential editors than potential readers. It shows those pages which have not been edited in a long time. This either means articles so perfect that nobody can think of any edits to make, or, more likely, articles that have been overlooked for a long time and are probably out of date. —Muke Tever 18:48, 30 December 2005 (UTC)
- I realised that - I have looked again, the list has 1,000 lines all but one are not English (and this is an English dictionary) - the list serves no purpose. Saltmarsh 05:46, 31 December 2005 (UTC)
- Gmcfoley, Millie and I have maintained all the Japanese words so that they are removed from the list. But I guess the number of the remaining heaped Chinese characters, automatically added by User:NanshuBot, is more than 10,000 and it wouldn't be a good idea to try to dig them away with bare hands. --Tohru 07:58, 31 December 2005 (UTC)
- The list is automatically generated. I agree that most of these are likely out of date, and anyone familiar with the Chinese characters is free to edit them. The question marks refer to characters in the CJK Unified Ideographs Extension A of Unicode between U+3400 and U+4DB5. This character set may need to be specially downloaded into someone's browser. Eclecticology 08:40, 31 December 2005 (UTC)
- It is a dictionary written in English, but it is a dictionary of all languages, so the list would be useful to those users/contributors who aren't interested in looking up English words. At any event it is automatically generated by the wiki software so debating its purposefulness is moot: we can't get rid of it. We may as well use it. —Muke Tever 16:38, 31 December 2005 (UTC)
- I think, like Saltmarsh, that it shouldn't be on the Main page. It is nothing we want to be "in the spotlight". It should be on the Wiktionary:Community Portal, but not on the main page. Jon Harald Søby 16:41, 31 December 2005 (UTC)
- The Main page needs a thorough overhaul. It is particularly worrying that it gives new users absolutely no guidance on how to use Wiktionary. Ncik 19:26, 31 December 2005 (UTC)
- It was just a matter of time before those Nanshubot pages drifted to the top. It probably makes sense not to have that page linked from the Main Page, but I also agree with Ncik that the Main Page itself needs some updating. I have made a few minor adjustments lately, but they have just scratched the surface. Any design ideas? What relation should there be between the Main Page and the Community Portal? Eclecticology 23:16, 31 December 2005 (UTC)
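For anyone wanting to pick out the entries that show up as question marks, the Extension A range quoted above (U+3400 to U+4DB5) can be tested for directly. The sketch below is only an illustration: the function names are hypothetical and the sample titles are made up, not taken from the actual Special:Ancientpages list.

```python
# Code point range cited in the discussion for CJK Unified Ideographs
# Extension A.
EXT_A_FIRST = 0x3400
EXT_A_LAST = 0x4DB5


def in_cjk_extension_a(ch):
    """True if the single character `ch` falls in the cited range."""
    return EXT_A_FIRST <= ord(ch) <= EXT_A_LAST


def extension_a_titles(titles):
    """Keep only titles containing at least one Extension A character."""
    return [t for t in titles if any(in_cjk_extension_a(c) for c in t)]


# Made-up sample titles: one Extension A character, one ordinary CJK
# ideograph (U+4E00), and one English word.
sample_titles = ["\u3400", "\u4e00", "peduncle"]
print(len(extension_a_titles(sample_titles)))
```

Titles picked out this way are the ones most likely to render as question marks on a machine without a font covering that block.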
IPA/FAQ
The Wiktionary:FAQ#Pronunciation page has two questions about IPA characters that don't show up properly in Internet Explorer. I don't feel competent to deal with these. Perhaps someone else could have a look. Eclecticology 01:11, 31 December 2005 (UTC)