Hi everyone,
I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/)
and I need to extract the content from it and convert it into LaTeX
syntax for printed documentation. I have googled for a suitable OSS
solution, but nothing suitable turned up.
I would prefer a script written in Python, but any recommendations
would be very welcome.
Do you know of anything suitable?
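To make the request concrete, here is the kind of thing I have in mind
(a minimal, untested sketch in Python; the conversion rules are only
examples, and the index.php path and page title are guesses for my wiki):

import re
import urllib.request

# Illustrative rules only; a real converter also needs templates,
# tables, nested lists and character escaping. Order matters:
# longer markup (====, ''') must be matched before shorter (==, '').
RULES = [
    (re.compile(r"^====(.+?)====$", re.M), r"\\subsubsection{\1}"),
    (re.compile(r"^===(.+?)===$", re.M), r"\\subsection{\1}"),
    (re.compile(r"^==(.+?)==$", re.M), r"\\section{\1}"),
    (re.compile(r"'''(.+?)'''"), r"\\textbf{\1}"),
    (re.compile(r"''(.+?)''"), r"\\emph{\1}"),
    (re.compile(r"\[\[(?:[^|\]]+\|)?([^\]]+)\]\]"), r"\1"),  # keep link text
]

def wiki_to_latex(text):
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

# action=raw returns the unrendered wikitext of a page
url = ("http://server.bluewatersys.com/w90n740/index.php"
       "?title=Main_Page&action=raw")
print(wiki_to_latex(urllib.request.urlopen(url).read().decode("utf-8")))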
Kind Regards,
Hugo Vincent,
Bluewater Systems.
Hi,
I have just joined. I am from Mumbai, India. I would like to get the
articles translated into Marathi, my mother tongue. Looking at the effort
involved and the number of volunteers, this will not be usable in any
reasonable amount of time.
That has made me think of alternatives - machine translation. A state-funded
institute has software available, but I don't have access to it
yet.
Please comment on this approach. Has this been tried for any other
language before?
Thanks & regards,
Prasad Gadgil
Hello,
As Wikipedia is slow at busy times, I propose getting some new servers for our cluster:
- Some new web servers (3 or 4): P4 2.8 GHz with 2 GB of RAM
- A server which could be a backup for the NFS server, zwinger, with bigger disks; 80 GB is very low, maybe 200 or 250 GB
- Upgrading zwinger's disk to 200 or 250 GB (or adding a new one)
- A DB server in 64-bit mode with 4 GB of RAM (if we can't get geoffrin working), like this one:
http://www.macomp.com/products/servers/patriot2200.asp
with a RAID 10 disk system: 4 or 6 drives in the array and 1 on stand-by. I prefer 15,000 rpm disks, but I can understand that they are more expensive.
- Maybe another Squid server
What do you think?
Shaihulud
Hello wikitech-l,
The Belarusian language (http://en.wikipedia.org/wiki/Belarusian_language)
now has two quite widely used alphabets: Cyrillic and Latin
(it also has an Arabic alphabet, but that is very rarely used).
For now, the be: Wikipedia uses Cyrillic, but we really need a Latin
version for those who prefer that alphabet. We have strict
bidirectional rules to transform any text between Cyrillic and Latin.
We are interested in creating a "live mirror" (an automatic
converter) between the Cyrillic and Latin alphabets for the be: Wikipedia.
That is, it would be great if anyone could read and submit any article
in either alphabet.
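To show what I mean, here is a toy sketch in Python of such a
bidirectional table (only a few letters are covered; the real rules are
context-sensitive, so a production converter needs more than a flat table):

# Toy bidirectional Cyrillic <-> Latin table for Belarusian.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "h",
    "д": "d", "э": "e", "м": "m", "н": "n",
}
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}

def transliterate(text, table):
    # Characters without a mapping (punctuation, digits) pass through.
    return "".join(table.get(ch, ch) for ch in text)

# Because the table is strictly one-to-one, the directions are inverses:
word = "мэна"  # hypothetical example
assert transliterate(transliterate(word, CYR_TO_LAT), LAT_TO_CYR) == word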
As far as I know, something similar was created for the different
script variants of Chinese, so this problem may already have been
worked on.
I am myself an experienced PHP+MySQL developer, so I can participate
directly in this project.
Can anyone share their thoughts or offer any help with this? It is
surely an interesting and quite important undertaking.
Thank you.
--
Best regards,
Monk ([[en:User:Monkbel]], mailto:monk@zoomcon.com)
Just a reminder of work in progress and general background, for those
who might be commenting without being aware of present work...
First, in MediaWiki 1.5 we've made a major schema change, intended to
reduce the number of changes to data rows that have to be made and to
slim down the amount of data that has to be pulled per-row when scanning
non-bulk-text metadata.
Specifically, the 'cur' and 'old' tables are being split into 'page',
'revision', and 'text'. Lists of pages won't be trudging through large
page text fields, and operations like renames of heavily-edited pages
won't have to touch 15000 records. This will also give us the potential
to move the bulk text to a separate replicated object store to keep the
core metadata DBs relatively small and limber (and cacheable).
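For illustration, fetching the current text of a page under the new
layout is two narrow joins, with the bulk text touched only in the last
step. A sketch in Python (the MySQLdb driver and the connection details
are placeholders):

import MySQLdb  # placeholder driver and connection details

conn = MySQLdb.connect(host="localhost", user="wikiuser",
                       passwd="secret", db="wikidb")
cur = conn.cursor()

# Metadata lives in 'page' and 'revision'; the bulk text is only
# touched by the final join against 'text'.
cur.execute("""
    SELECT old_text
      FROM page
      JOIN revision ON rev_id = page_latest
      JOIN text     ON old_id = rev_text_id
     WHERE page_namespace = 0 AND page_title = %s
""", ("Main_Page",))
row = cur.fetchone()
print(row[0] if row else "(no such page)")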
Talk to icee in #mediawiki if interested in the object store; he's
working on a prototype for us, to use for image uploads and potentially
bulk text storage.
Second, remember that each wiki's database is independent. It's very
likely that at some point we'll want to split out some of the larger
wikis to separate master servers; aside from localizing disk and cache
utilization, this could provide some fault isolation in that a failure
in one master would not affect the wikis running off the other master.
Third, we're expecting to have at least two new additional data centers
soon in Europe and the US. Initially these are probably going to be
squid proxies since that's easy for us to do (we have a small offsite
squid farm in France currently in addition to the squids in the main
cluster in Florida) but local web server boxen pulling from local slaved
databases at least for read-only requests is something we're likely to
see, to move more of the load off of the central location.
Finally, people constantly bring up the 'PostgreSQL cures cancer'
bugaboo. 1.4 has experimental PostgreSQL support, which I'd like to see
as a first-class supported configuration for the 1.5 release. This is
only going to happen, though, if people pitch in to help with testing
and bug fixing, and of course run some benchmarks and failure-mode
tests against MySQL! If you ever want Wikimedia to consider switching, the
software needs to be available to make it work and it needs to be
demonstrated as a legitimate improvement with a feasible conversion.
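A benchmark doesn't have to be elaborate to be useful. A minimal sketch
of the kind of timing comparison meant here (the connections and the
query are placeholders; any DB-API 2.0 driver works the same way):

import time

def time_query(conn, sql, runs=100):
    """Run the same query repeatedly; report average wall-clock time."""
    cur = conn.cursor()
    start = time.time()
    for _ in range(runs):
        cur.execute(sql)
        cur.fetchall()
    return (time.time() - start) / runs

# Placeholder connections and query:
# import MySQLdb; import psycopg2
# mysql_conn = MySQLdb.connect(db="wikidb", user="wikiuser", passwd="secret")
# pg_conn = psycopg2.connect("dbname=wikidb user=wikiuser")
# sql = "SELECT page_title FROM page WHERE page_namespace = 0 LIMIT 1000"
# print("MySQL:      %.4fs" % time_query(mysql_conn, sql))
# print("PostgreSQL: %.4fs" % time_query(pg_conn, sql))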
Domas is the PostgreSQL partisan on our team and wrote the existing
PostgreSQL support. If you'd like to help you should probably track him
down; in #mediawiki you'll usually find him as 'dammit'.
-- brion vibber (brion @ pobox.com)
The 'revision' and 'text' tables now use separate row ID numbers in HEAD. A revision
row refers to text.old_id with its rev_text_id key; this allows text
revisions to be stored independently of a given revision.
* Operations that change only metadata can be put in the page history
without storing a new text record. I've done this for page move as a
start. (It might be good to also add a marker field for metadata-only
changes so they can be shown distinctly in the history.)
* In theory, reverts could do the same, referring to the prior text
record without saving a new copy (see the sketch after this list).
* The storage backend can number text objects using its own scheme; if
necessary text object IDs can be reassigned during batch recompression.
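A sketch of what such a metadata-only revert could look like at the SQL
level (placeholder values; this is not the actual MediaWiki code):

# Insert a new revision row that points at an existing text row,
# instead of saving a duplicate copy of the text.
def revert_to(cur, page_id, old_rev_text_id, user_id, timestamp):
    cur.execute("""
        INSERT INTO revision (rev_page, rev_text_id, rev_user,
                              rev_timestamp, rev_comment)
        VALUES (%s, %s, %s, %s, 'revert')
    """, (page_id, old_rev_text_id, user_id, timestamp))
    new_rev_id = cur.lastrowid
    # Point the page at the new head revision.
    cur.execute("UPDATE page SET page_latest = %s WHERE page_id = %s",
                (new_rev_id, page_id))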
If you're running a 1.5 test wiki, you'll have to run update.php to
add the field (or manually source maintenance/archive/patch-rev_text_id.sql).
-- brion vibber (brion @ pobox.com)
At ETech in San Francisco, I met with Ross Mayfield of Socialtext, and
we discussed the idea that there should be a central core set of
standard wiki syntax. Ross was quite keen on the concept.
Standard syntax is important for the entire wiki world so that as
people become more accustomed to using wikis for all kinds of things,
they can feel comfortable on a variety of platforms.
It seems natural to me that as the largest wiki project (and we are
probably the wiki engine with the most installations, although I have
no actual way of knowing that), we should take a leadership role in
this.
http://www.socialtext.net/mayfield/index.cgi?action=refcard&login=user4232
is a quick 'refcard' on the syntax of Socialtext.
I propose that we set up a group of people either in a mailing list or
a wiki or both, and invite representatives from all the major wiki
software projects and companies to participate in a syntax standard.
I don't know anything about how formal standards are proposed and
decided, but just as with HTML, it seems that wiki syntax is a natural
for some standardization.
--Jimbo
--
"Pianosa is een Italie" - first words of 50,000th article on nl.wikipedia.org
The following changes have been made to the Python Wikipediabot
framework since the previous overview of February 6. As always, the
new files can be downloaded at
http://cvs.sourceforge.net/viewcvs.py/pywikipediabot/pywikipedia/, and
one can of course use a newer version as well. Furthermore, changes
were made to the Wikimedia software in early February, and to the
bot as well. Therefore:
* versions of wikipedia.py older than 1.391 (February 5) do not work any more
* If you use a version of anything from February 6 or later, you
should use a version of everything (more precisely, wikipedia.py,
config.py and the specific bot you are using) from that date or later.
But on to the newer changes. For the bug fixes I will describe which
bot or bots are affected, and what goes wrong with older versions.
"int." means that the bug was introduced in some earlier version.
Bugs both introduced and solved in the current period are not
mentioned. For all changes, the files and versions that are needed are
listed.
Andre Engels
== Dependencies ==
In general, if error messages occur after downloading a new version of
a bot, the first thing to try is getting a new version of wikipedia.py
as well. Versions of wikipedia.py 1.405 and higher need family.py
1.21 (and vice versa).
== Bug fixes ==
* general * wikipedia.py 1.397 (int. 1.392 - does not count number
of bot processes correctly)
* general * wikipedia.py 1.397 (int. 1.392 - cannot edit redirect
pages and cannot create new pages)
* general * wikipedia.py 1.406 (is unable to edit after having been
dormant for some time)
* catall.py * catall.py 1.13 (int. 1.12 - gives an error message
before ending)
* category.py * category.py 1.62 (int. 1.61 - major malfunction)
* category.py (and others) * catlib.py 1.32 (finds at most 200
articles in a category)
* interwiki.py * family.py 1.20 (does not find pages on csb:)
* interwiki.py * family.py 1.21, wikipedia.py 1.406 (does not
recognize [[{xx:PAGENAME}]] interwiki links and a few redirects)
* interwiki.py * interwiki.py 1.135 (crashes when the -continue
option is used with an empty dumpfile)
* interwiki.py * interwiki.py 1.136 (chance of not being removed
from the list of bot processes if stopped *very* soon after being
started)
* interwiki.py * titletranslate.py 1.38 (crashes when a non-existing
language is given as a hint)
* pagefromfile.py * pagefromfile.py 1.7 (int. 1.6 - gives error
message at end and is not removed from the list)
* pagefromfile.py * pagefromfile.py 1.8 (the option "-end" is not recognized)
* redirect.py * redirect.py 1.19 (int. 1.18 - major malfunction)
* replace.py * replace.py 1.35 (int. 1.6 - gives error message at
end and is not removed from the list)
== Major changes ==
* interwiki.py can now, when asking for hints, be asked for the text
of the page by typing "?" or adding the "-showpage" option *
interwiki.py 1.136
* replace.py and solve_disambiguation.py now give their diffs colored
(Unix only; see the sketch after this list) * replace.py 1.37,
solve_disambiguation.py 1.128, wikipedia.py 1.404
* Two new features of sqldump.py: findr finds regular expressions; the
function of baddisambiguation is not clear to me * sqldump.py 1.17
* interwiki.py uses nb: instead of no: in the presence of an nn: link
or on the nn: wiki * interwiki.py 1.139
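For the curious, coloring a diff is mostly a matter of wrapping added
and removed lines in ANSI escape codes. A toy sketch (not the bot's
actual code):

import difflib

RED, GREEN, RESET = "\033[91m", "\033[92m", "\033[0m"

def colored_diff(old, new):
    """Toy colored unified diff (Unix terminals only)."""
    for line in difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     lineterm=""):
        if line.startswith("+") and not line.startswith("+++"):
            print(GREEN + line + RESET)
        elif line.startswith("-") and not line.startswith("---"):
            print(RED + line + RESET)
        else:
            print(line)

colored_diff("one\ntwo\nthree", "one\n2\nthree")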
== Minor changes ==
* Swedish translations for interwiki.py: interwiki.py 1.137
* Change of Icelandic text for category.py: category.py 1.63
* Hawaiian and Chichewa added to known languages (on Wiktionary only
Hawaiian so far): wikipedia_family.py 1.89, wiktionary_family.py 1.21
== Cosmetic changes / invisible changes ==
* Multiple alternative redirect texts for one language are supported:
wikipedia.py 1.406
* Special care for zh-cn/zh-tw difference removed: interwiki.py 1.139,
wikipedia.py 1.408
Hi,
There is a need for a special syntax for texts in verse, as
":" creates an indentation and <br/> is quite ugly.
Would it be possible to have a tag such as <verse>
which would produce a line break at the end of each line?
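The transformation itself would be trivial; a toy sketch in Python of
what the tag could do (hypothetical, just to show the intent):

import re

def render_verse(wikitext):
    """Turn each newline inside <verse>...</verse> into <br/>."""
    def repl(match):
        lines = match.group(1).strip("\n").split("\n")
        return "<br/>\n".join(lines)
    return re.sub(r"<verse>(.*?)</verse>", repl, wikitext, flags=re.S)

poem = "<verse>\nRoses are red,\nViolets are blue.\n</verse>"
print(render_verse(poem))
# Roses are red,<br/>
# Violets are blue.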
Thanks,
Yann
--
http://www.non-violence.org/ | Site collaboratif sur la non-violence
http://www.forget-me.net/ | Alternatives sur le Net
http://fr.wikipedia.org/ | Encyclopédie libre
http://www.forget-me.net/pro/ | Formations et services Linux
I've added a rev_deleted field to the revision table in CVS HEAD. If
you're currently running a wiki on 1.5 CVS, run update.php or manually
source maintenance/archives/patch-rev_deleted.sql to update the table.
This will be part of changes to the deletion system, not yet fully
implemented:
* Deleted revisions will remain listed in the contributions history
(though grayed out and in strike-through), making it easier to track
serial offenders.
* Deleted revisions will be listed in the regular page history (again,
grayed out and in strike-through). In cases where individual revisions
have been struck from a page for e.g. copyright reasons, this will make
it easier to see where portions were removed.
* The actual text won't have to be shuffled into an archive table;
simply updating a 1-byte field in the revision records ought to be
faster, and this will allow deletion again on pages which have been
block-compressed. (Block compression combines multiple revisions' text
into a single compressed record; since this interacts poorly with the
current deletion/undeletion system, deletion is currently disabled for
any page which has been block-compressed.) Deleted text could be pruned
at leisure by a batch or background process if necessary. (See the
sketch after this list.)
* For admins, the diff and view source tools will remain available on
deleted pages, which the current Special:Undelete viewer doesn't
provide. This can aid in analysis of a vandalism spree.
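To make the one-byte-field point concrete, here is a sketch of what
deletion and the admin-only text check reduce to (placeholder code, not
the actual MediaWiki implementation):

def delete_revision(cur, rev_id):
    # Deletion becomes a one-byte flag flip, not a move to an archive table.
    cur.execute("UPDATE revision SET rev_deleted = 1 WHERE rev_id = %s",
                (rev_id,))

def fetch_text_for_view(cur, rev_id, is_admin):
    # Deleted revisions stay listed in history, but only admins see the text.
    cur.execute("SELECT rev_deleted, rev_text_id FROM revision "
                "WHERE rev_id = %s", (rev_id,))
    deleted, text_id = cur.fetchone()
    if deleted and not is_admin:
        return None  # shown grayed/struck-through, text withheld
    cur.execute("SELECT old_text FROM text WHERE old_id = %s", (text_id,))
    return cur.fetchone()[0]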
The deletion/undeletion system has not yet been updated to actually use
this field, nor has limitation of viewing of such revisions to admins
been implemented. (Remember the purpose of _deletion_ is to make
materials inaccessible to the public, especially when done for legal
reasons, so non-admins will still not be able to look at the actual
deleted revision text.)
Once finished there will probably be a slight change to the page table
as well, to mark pages which have no non-deleted revisions.
-- brion vibber (brion @ pobox.com)