Hi everyone,
I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/)
and I need to extract the content from it and convert it into LaTeX
syntax for printed documentation. I have googled for a suitable OSS
solution, but nothing suitable turned up.
I would prefer a script written in Python, but any recommendations
would be very welcome.
Do you know of anything suitable?
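To make the request concrete, here is the kind of thing I have in mind
(a minimal, untested sketch in Python; the conversion rules are only
examples, and the index.php path and page title are guesses for my wiki):

import re
import urllib.request

# Illustrative rules only; a real converter also needs templates,
# tables, nested lists and character escaping. Order matters:
# longer markup (====, ''') must be matched before shorter (==, '').
RULES = [
    (re.compile(r"^====(.+?)====$", re.M), r"\\subsubsection{\1}"),
    (re.compile(r"^===(.+?)===$", re.M), r"\\subsection{\1}"),
    (re.compile(r"^==(.+?)==$", re.M), r"\\section{\1}"),
    (re.compile(r"'''(.+?)'''"), r"\\textbf{\1}"),
    (re.compile(r"''(.+?)''"), r"\\emph{\1}"),
    (re.compile(r"\[\[(?:[^|\]]+\|)?([^\]]+)\]\]"), r"\1"),  # keep link text
]

def wiki_to_latex(text):
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

# action=raw returns the unrendered wikitext of a page
url = ("http://server.bluewatersys.com/w90n740/index.php"
       "?title=Main_Page&action=raw")
print(wiki_to_latex(urllib.request.urlopen(url).read().decode("utf-8")))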
Kind Regards,
Hugo Vincent,
Bluewater Systems.
Hi,
I have just joined. I am from Mumbai, India. I would like to get the
articles translated into Marathi, my mother tongue. Looking at the effort
involved and the number of volunteers, this will not be usable in any
reasonable amount of time.
That has made me think of alternatives - machine translation. A state-funded
institute has software available, but I don't have access to it
yet.
Please comment on this approach. Has this been tried for any other
language before?
Thanks & regards,
Prasad Gadgil
Hello,
As Wikipedia is slow at busy times, I propose getting some new servers for our cluster:
- Some new web servers (3 or 4): P4 2.8 GHz with 2 GB of RAM
- A server which could be a backup for the NFS server, zwinger, with bigger disks; 80 GB is very low, maybe 200 or 250 GB
- Upgrading zwinger's disk to 200 or 250 GB (or adding a new one)
- A DB server in 64-bit mode with 4 GB of RAM (if we can't get geoffrin working), like this one:
http://www.macomp.com/products/servers/patriot2200.asp
with a RAID 10 disk system: 4 or 6 drives in the array and 1 on stand-by. I prefer 15,000 rpm disks, but I can understand that they are more expensive.
- Maybe another Squid server
What do you think?
Shaihulud
Hello wikitech-l,
The Belarusian language (http://en.wikipedia.org/wiki/Belarusian_language)
now has two quite widely used alphabets: Cyrillic and Latin
(it also has an Arabic alphabet, but that is very rarely used).
For now, the be: Wikipedia uses Cyrillic, but we really need a Latin
version for those who prefer that alphabet. We have strict
bidirectional rules to transform any text between Cyrillic and Latin.
We are interested in creating a "live mirror" (an automatic
converter) between the Cyrillic and Latin alphabets for the be: Wikipedia.
That is, it would be great if anyone could read and submit any article
in either alphabet.
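To show what I mean, here is a toy sketch in Python of such a
bidirectional table (only a few letters are covered; the real rules are
context-sensitive, so a production converter needs more than a flat table):

# Toy bidirectional Cyrillic <-> Latin table for Belarusian.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "h",
    "д": "d", "э": "e", "м": "m", "н": "n",
}
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}

def transliterate(text, table):
    # Characters without a mapping (punctuation, digits) pass through.
    return "".join(table.get(ch, ch) for ch in text)

# Because the table is strictly one-to-one, the directions are inverses:
word = "мэна"  # hypothetical example
assert transliterate(transliterate(word, CYR_TO_LAT), LAT_TO_CYR) == word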
As far as I know, something similar was created for the different
script variants of Chinese, so this problem may already have been
worked on.
I am myself an experienced PHP+MySQL developer, so I can participate
directly in this project.
Can anyone share their thoughts or offer any help with this? It is
surely an interesting and quite important undertaking.
Thank you.
--
Best regards,
Monk ([[en:User:Monkbel]], mailto:monk@zoomcon.com)
Just a reminder of work in progress and general background, for those
who might be commenting without being aware of present work...
First, in MediaWiki 1.5 we've made a major schema change, intended to
reduce the number of changes to data rows that have to be made and to
slim down the amount of data that has to be pulled per-row when scanning
non-bulk-text metadata.
Specifically, the 'cur' and 'old' tables are being split into 'page',
'revision', and 'text'. Lists of pages won't be trudging through large
page text fields, and operations like renames of heavily-edited pages
won't have to touch 15000 records. This will also give us the potential
to move the bulk text to a separate replicated object store to keep the
core metadata DBs relatively small and limber (and cacheable).
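For illustration, fetching the current text of a page under the new
layout is two narrow joins, with the bulk text touched only in the last
step. A sketch in Python (the MySQLdb driver and the connection details
are placeholders):

import MySQLdb  # placeholder driver and connection details

conn = MySQLdb.connect(host="localhost", user="wikiuser",
                       passwd="secret", db="wikidb")
cur = conn.cursor()

# Metadata lives in 'page' and 'revision'; the bulk text is only
# touched by the final join against 'text'.
cur.execute("""
    SELECT old_text
      FROM page
      JOIN revision ON rev_id = page_latest
      JOIN text     ON old_id = rev_text_id
     WHERE page_namespace = 0 AND page_title = %s
""", ("Main_Page",))
row = cur.fetchone()
print(row[0] if row else "(no such page)")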
Talk to icee in #mediawiki if interested in the object store; he's
working on a prototype for us, to use for image uploads and potentially
bulk text storage.
Second, remember that each wiki's database is independent. It's very
likely that at some point we'll want to split out some of the larger
wikis to separate master servers; aside from localizing disk and cache
utilization, this could provide some fault isolation in that a failure
in one master would not affect the wikis running off the other master.
Third, we're expecting to have at least two new additional data centers
soon in Europe and the US. Initially these are probably going to be
squid proxies since that's easy for us to do (we have a small offsite
squid farm in France currently in addition to the squids in the main
cluster in Florida) but local web server boxen pulling from local slaved
databases at least for read-only requests is something we're likely to
see, to move more of the load off of the central location.
Finally, people constantly bring up the 'PostgreSQL cures cancer'
bugaboo. 1.4 has experimental PostgreSQL support, which I'd like to see
as a first-class supported configuration for the 1.5 release. This is
only going to happen, though, if people pitch in to help with testing
and bug fixing, and of course run some benchmarks and failure-mode
tests against MySQL! If you ever want Wikimedia to consider switching, the
software needs to be available to make it work and it needs to be
demonstrated as a legitimate improvement with a feasible conversion.
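A benchmark doesn't have to be elaborate to be useful. A minimal sketch
of the kind of timing comparison meant here (the connections and the
query are placeholders; any DB-API 2.0 driver works the same way):

import time

def time_query(conn, sql, runs=100):
    """Run the same query repeatedly; report average wall-clock time."""
    cur = conn.cursor()
    start = time.time()
    for _ in range(runs):
        cur.execute(sql)
        cur.fetchall()
    return (time.time() - start) / runs

# Placeholder connections and query:
# import MySQLdb; import psycopg2
# mysql_conn = MySQLdb.connect(db="wikidb", user="wikiuser", passwd="secret")
# pg_conn = psycopg2.connect("dbname=wikidb user=wikiuser")
# sql = "SELECT page_title FROM page WHERE page_namespace = 0 LIMIT 1000"
# print("MySQL:      %.4fs" % time_query(mysql_conn, sql))
# print("PostgreSQL: %.4fs" % time_query(pg_conn, sql))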
Domas is the PostgreSQL partisan on our team and wrote the existing
PostgreSQL support. If you'd like to help you should probably track him
down; in #mediawiki you'll usually find him as 'dammit'.
-- brion vibber (brion @ pobox.com)
The 'revision' and 'text' tables now use separate row ID numbers in HEAD. A revision
row refers to text.old_id with its rev_text_id key; this allows text
revisions to be stored independently of a given revision.
* Operations that change only metadata can be put in the page history
without storing a new text record. I've done this for page move as a
start. (It might be good to also add a marker field for metadata-only
changes so they can be shown distinctly in the history.)
* In theory, reverts could do the same, referring to the prior text
record without saving a new copy (see the sketch after this list).
* The storage backend can number text objects using its own scheme; if
necessary text object IDs can be reassigned during batch recompression.
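A sketch of what such a metadata-only revert could look like at the SQL
level (placeholder values; this is not the actual MediaWiki code):

# Insert a new revision row that points at an existing text row,
# instead of saving a duplicate copy of the text.
def revert_to(cur, page_id, old_rev_text_id, user_id, timestamp):
    cur.execute("""
        INSERT INTO revision (rev_page, rev_text_id, rev_user,
                              rev_timestamp, rev_comment)
        VALUES (%s, %s, %s, %s, 'revert')
    """, (page_id, old_rev_text_id, user_id, timestamp))
    new_rev_id = cur.lastrowid
    # Point the page at the new head revision.
    cur.execute("UPDATE page SET page_latest = %s WHERE page_id = %s",
                (new_rev_id, page_id))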
If you're running a 1.5 test wiki, you'll have to run update.php to
add the field (or manually source maintenance/archive/patch-rev_text_id.sql).
-- brion vibber (brion @ pobox.com)
At ETech in San Francisco, I met with Ross Mayfield of Socialtext, and
we discussed the idea that there should be a central core set of
standard wiki syntax. Ross was quite keen on the concept.
Standard syntax is important for the entire wiki world so that as
people become more accustomed to using wikis for all kinds of things,
they can feel comfortable on a variety of platforms.
It seems natural to me that as the largest wiki project (and we are
probably the wiki engine with the most installations, although I have
no actual way of knowing that), we should take a leadership role in
this.
http://www.socialtext.net/mayfield/index.cgi?action=refcard&login=user4232
is a quick 'refcard' on the syntax of Socialtext.
I propose that we set up a group of people either in a mailing list or
a wiki or both, and invite representatives from all the major wiki
software projects and companies to participate in a syntax standard.
I don't know anything about how formal standards are proposed and
decided, but just as with HTML, it seems that wiki syntax is a natural
for some standardization.
--Jimbo
--
"Pianosa is een Italie" - first words of 50,000th article on nl.wikipedia.org
The following changes have been made to the Python Wikipediabot
framework since the previous overview of February 6. As always, the
new files can be downloaded at
http://cvs.sourceforge.net/viewcvs.py/pywikipediabot/pywikipedia/, and
one can of course use a newer version as well. Furthermore, changes
were made to the Wikimedia software in early February, and to the
bot as well. Therefore:
* versions of wikipedia.py older than 1.391 (February 5) do not work any more
* If you use a version of anything from February 6 or later, you
should use a version of everything (more precisely, wikipedia.py,
config.py and the specific bot you are using) from that date or later.
But on to the newer changes. For the bug fixes I will describe which
bot or bots are affected, and what goes wrong with older versions.
"int." means that the bug was introduced in some earlier version.
Bugs both introduced and solved in the current period are not
mentioned. For all changes, the files and versions that are needed are
listed.
Andre Engels
== Dependencies ==
In general, if error messages occur after downloading a new version of
a bot, the first thing to try is getting a new version of wikipedia.py
as well. Versions of wikipedia.py 1.405 and higher need family.py
1.21 (and vice versa).
== Bug fixes ==
* general * wikipedia.py 1.397 (int. 1.392 - does not count number
of bot processes correctly)
* general * wikipedia.py 1.397 (int. 1.392 - cannot edit redirect
pages and cannot create new pages)
* general * wikipedia.py 1.406 (is unable to edit after having been
dormant for some time)
* catall.py * catall.py 1.13 (int. 1.12 - gives an error message
before ending)
* category.py * category.py 1.62 (int. 1.61 - major malfunction)
* category.py (and others) * catlib.py 1.32 (finds at most 200
articles in a category)
* interwiki.py * family.py 1.20 (does not find pages on csb:)
* interwiki.py * family.py 1.21, wikipedia.py 1.406 (does not
recognize [[{xx:PAGENAME}]] interwiki links and a few redirects)
* interwiki.py * interwiki.py 1.135 (crashes when the -continue
option is used with an empty dumpfile)
* interwiki.py * interwiki.py 1.136 (chance of not being removed
from the list of bot processes if stopped *very* soon after being
started)
* interwiki.py * titletranslate.py 1.38 (crashes when a non-existing
language is given as a hint)
* pagefromfile.py * pagefromfile.py 1.7 (int. 1.6 - gives error
message at end and is not removed from the list)
* pagefromfile.py * pagefromfile.py 1.8 (the option "-end" is not recognized)
* redirect.py * redirect.py 1.19 (int. 1.18 - major malfunction)
* replace.py * replace.py 1.35 (int. 1.6 - gives error message at
end and is not removed from the list)
== Major changes ==
* interwiki.py can now, when asking for hints, be asked for the text
of the page by typing "?" or adding the "-showpage" option *
interwiki.py 1.136
* replace.py and solve_disambiguation.py now give their diffs colored
(Unix only; see the sketch after this list) * replace.py 1.37,
solve_disambiguation.py 1.128, wikipedia.py 1.404
* Two new features of sqldump.py: findr finds regular expressions; the
function of baddisambiguation is not clear to me * sqldump.py 1.17
* interwiki.py uses nb: instead of no: in the presence of an nn: link
or on the nn: wiki * interwiki.py 1.139
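For the curious, coloring a diff is mostly a matter of wrapping added
and removed lines in ANSI escape codes. A toy sketch (not the bot's
actual code):

import difflib

RED, GREEN, RESET = "\033[91m", "\033[92m", "\033[0m"

def colored_diff(old, new):
    """Toy colored unified diff (Unix terminals only)."""
    for line in difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     lineterm=""):
        if line.startswith("+") and not line.startswith("+++"):
            print(GREEN + line + RESET)
        elif line.startswith("-") and not line.startswith("---"):
            print(RED + line + RESET)
        else:
            print(line)

colored_diff("one\ntwo\nthree", "one\n2\nthree")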
== Minor changes ==
* Swedish translations for interwiki.py: interwiki.py 1.137
* Change of Icelandic text for category.py: category.py 1.63
* Hawaiian and Chichewa added to known languages (on Wiktionary only
Hawaiian so far): wikipedia_family.py 1.89, wiktionary_family.py 1.21
== Cosmetic changes / invisible changes ==
* Multiple alternative redirect texts for one language are supported:
wikipedia.py 1.406
* Special care for zh-cn/zh-tw difference removed: interwiki.py 1.139,
wikipedia.py 1.408
Hi,
There is a need for a special syntax for texts in verse, as
":" creates an indentation and <br/> is quite ugly.
Would it be possible to have a tag such as <verse>
which would produce a line break at the end of each line?
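The transformation itself would be trivial; a toy sketch in Python of
what the tag could do (hypothetical, just to show the intent):

import re

def render_verse(wikitext):
    """Turn each newline inside <verse>...</verse> into <br/>."""
    def repl(match):
        lines = match.group(1).strip("\n").split("\n")
        return "<br/>\n".join(lines)
    return re.sub(r"<verse>(.*?)</verse>", repl, wikitext, flags=re.S)

poem = "<verse>\nRoses are red,\nViolets are blue.\n</verse>"
print(render_verse(poem))
# Roses are red,<br/>
# Violets are blue.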
Thanks,
Yann
--
http://www.non-violence.org/ | Site collaboratif sur la non-violence
http://www.forget-me.net/ | Alternatives sur le Net
http://fr.wikipedia.org/ | Encyclopédie libre
http://www.forget-me.net/pro/ | Formations et services Linux
I've added a rev_deleted field to the revision table in CVS HEAD. If
you're currently running a wiki on 1.5 CVS, run update.php or manually
source maintenance/archives/patch-rev_deleted.sql to update the table.
This will be part of changes to the deletion system, not yet fully
implemented:
* Deleted revisions will remain listed in the contributions history
(though grayed out and in strike-through), making it easier to track
serial offenders.
* Deleted revisions will be listed in the regular page history (again,
grayed out and in strike-through). In cases where individual revisions
have been struck from a page for e.g. copyright reasons, this will make
it easier to see where portions were removed.
* The actual text won't have to be shuffled into an archive table;
simply updating a 1-byte field in the revision records ought to be
faster, and this will allow deletion again on pages which have been
block-compressed. (Block compression combines multiple revisions' text
into a single compressed record; since this interacts poorly with the
current deletion/undeletion system, deletion is currently disabled for
any page which has been block-compressed.) Deleted text could be pruned
at leisure by a batch or background process if necessary. (See the
sketch after this list.)
* For admins, the diff and view source tools will remain available on
deleted pages, which the current Special:Undelete viewer doesn't
provide. This can aid in analysis of a vandalism spree.
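To make the one-byte-field point concrete, here is a sketch of what
deletion and the admin-only text check reduce to (placeholder code, not
the actual MediaWiki implementation):

def delete_revision(cur, rev_id):
    # Deletion becomes a one-byte flag flip, not a move to an archive table.
    cur.execute("UPDATE revision SET rev_deleted = 1 WHERE rev_id = %s",
                (rev_id,))

def fetch_text_for_view(cur, rev_id, is_admin):
    # Deleted revisions stay listed in history, but only admins see the text.
    cur.execute("SELECT rev_deleted, rev_text_id FROM revision "
                "WHERE rev_id = %s", (rev_id,))
    deleted, text_id = cur.fetchone()
    if deleted and not is_admin:
        return None  # shown grayed/struck-through, text withheld
    cur.execute("SELECT old_text FROM text WHERE old_id = %s", (text_id,))
    return cur.fetchone()[0]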
The deletion/undeletion system has not yet been updated to actually use
this field, nor has limitation of viewing of such revisions to admins
been implemented. (Remember the purpose of _deletion_ is to make
materials inaccessible to the public, especially when done for legal
reasons, so non-admins will still not be able to look at the actual
deleted revision text.)
Once finished there will probably be a slight change to the page table
as well, to mark pages which have no non-deleted revisions.
-- brion vibber (brion @ pobox.com)