There should be no need for a purge action exposed to users. Every time a user has to manually trigger a purge (or a null edit) to get a useful representation of some page, MediaWiki has failed that user. Eventually, it would be nice if purge could be removed completely; but for now, MediaWiki fails users quite often, so we keep track of the issues causing that here.
Description
Details
- Reference
- bz54902
Event Timeline
It *is* an ugly hack, and it's not meant to be exposed to end-users.
Any time people feel the need to use action=purge, it's because MediaWiki failed to properly handle updates to page contents (usually because of some feature that's not compatible with the parser cache, sometimes because of software updates that don't clear the parser cache).
MZMcBride: When would this bug be "evaluated", so action could take place?
So far it feels like this discussion should happen on a mailing list instead of a bugtracker, because this ticket is far from being actionable. If there's consensus a bug report could be filed that defines problems that were agreed on.
A null edit is not the same as a simple purge. A null edit also refreshes the database tables (like categorylinks and so on). With the API purge and the forcelinkupdate parameter you can achieve that as well.
Along with the job queue and new features (core software update or a new installed extension), bug 18478 is also a reason to do a purge.
(In reply to comment #5)
MZMcBride: When would this bug be "evaluated", so action could take place?
Looks like one of Wikimedia's (or MediaWiki's) architects confirmed that the purge action is a hack in comment 4. :-)
Any hack should eventually be killed (I believe that's the nature of a hack). If we're not going to expose the purge action in the user interface, I think we should work toward deprecating it.
It would be nice if this purging wasn't needed at all. There's a gadget specifically for this at the English Wikipedia [1] and probably at several other Wikipedias as well.
IIRC, whenever a highly used template is changed at the English Wikipedia, bots are used to purge all pages where the template is used.
[1]: https://en.wikipedia.org/wiki/MediaWiki:Gadget-purgetab.js
I guess it needs some clarification that "purge action" here refers to index.php?action=purge and/or api.php?action=purge ?
(In reply to Liangent from comment #9)
I guess it needs some clarification that "purge action" here refers to
index.php?action=purge and/or api.php?action=purge ?
Probably both. Is there a good reason to have either?
(In reply to MZMcBride from comment #10)
Probably both. Is there a good reason to have either?
api.php?action=purge can do more than index.php?action=purge - the "force[recursive]linkupdate" parameter.
Assume we have a page containing [[Category:{{CURRENTTIMESTAMP}}]]. Maybe we can deprecate the "parser cache clearing action" by postprocessing parser output or simply disabling parser cache, but I don't think we'll find a way to update the categorylinks row every second. In this case a "link updating action" is still needed. The equivalent thing on index.php is a null edit.
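A minimal sketch of the two API variants discussed above: a plain purge versus one that also forces a links update. It only builds the POST body and sends nothing. The endpoint URL and the helper name `build_purge_request` are made up for illustration; `action=purge`, `titles`, `forcelinkupdate` and `forcerecursivelinkupdate` are the API parameters this thread refers to.

```python
from urllib.parse import urlencode


def build_purge_request(api_url, titles, force_link_update=False, recursive=False):
    """Return (url, body) for a POST to api.php?action=purge.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    params = {
        "action": "purge",
        "format": "json",
        "titles": "|".join(titles),  # the API separates multiple titles with "|"
    }
    if recursive:
        # Also refresh link tables of pages that use the purged page as a template.
        params["forcerecursivelinkupdate"] = "1"
    elif force_link_update:
        # Refresh link tables (e.g. categorylinks) of the purged page itself.
        params["forcelinkupdate"] = "1"
    return api_url, urlencode(params)


# Example endpoint; replace with a real wiki's api.php before sending anything.
url, body = build_purge_request(
    "https://example.org/w/api.php", ["Main Page", "Sandbox"], force_link_update=True
)
```

The body can then be sent with any HTTP client as a POST request.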
(In reply to Liangent from comment #11)
(In reply to MZMcBride from comment #10)
Probably both. Is there a good reason to have either?
api.php?action=purge can do more than index.php?action=purge - the
"force[recursive]linkupdate" parameter.
Why are these options necessary? Which use-cases is the API purge action and its additional parameters solving?
Assume we have a page containing [[Category:{{CURRENTTIMESTAMP}}]]. Maybe we
can deprecate the "parser cache clearing action" by postprocessing parser
output or simply disabling parser cache, but I don't think we'll find a way
to update the categorylinks row every second. In this case a "link updating
action" is still needed. The equivalent thing on index.php is a null edit.
Null edits should not be necessary.
There are many ideas to explore here. For example, we could make purging more probabilistic by purging pages (including links updates) every thousandth or millionth view.
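The probabilistic idea could look roughly like the sketch below. This is only an illustration, not something MediaWiki implements in this form; `should_refresh`, `serve_view` and the `refresh` callback are hypothetical names.

```python
import random


def should_refresh(views_per_refresh, rng=random):
    """Return True on average once per `views_per_refresh` page views."""
    if views_per_refresh < 1:
        raise ValueError("views_per_refresh must be at least 1")
    return rng.random() < 1.0 / views_per_refresh


def serve_view(page, views_per_refresh, refresh, rng=random):
    """Handle one page view; occasionally trigger a purge plus links update.

    `refresh` is a hypothetical callback standing in for that work.
    Returns True if this view triggered a refresh.
    """
    if should_refresh(views_per_refresh, rng):
        refresh(page)
        return True
    return False
```

With `views_per_refresh=1000`, a page viewed a million times a day would be refreshed about a thousand times, while a rarely viewed page could stay stale for a long time, which is the trade-off of this approach.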
(In reply to Liangent from comment #11)
api.php?action=purge can do more than index.php?action=purge - the
"force[recursive]linkupdate" parameter.
This reminds me of the "don't leave a redirect" functionality when moving a page. The functionality was originally added only to the API's move action and eventually it was declared that API functionality and UI (index.php) functionality at Special:MovePage should not intentionally diverge like this.
That is, index.php?action=purge should probably include the force[recursive]linkupdate parameter as well.
Or that parameter should be removed from both, depending on its justification and utility.
Null edits shouldn't be needed but there are several bugs that have been open for a long time, in fact one that I ran into on my private wiki has been open and unresolved for over two years. Until such a time as the purge action is no longer needed it should remain.
(In reply to Betacommand from comment #14)
Null edits shouldn't be needed but there are several bugs that have been
open for a long time, in fact one that I ran into on my private wiki has
been open and unresolved for over two years. Until such a time as the purge
action is no longer needed it should remain.
Yeah I can remember some of them. Is there a full list (a tracking bug?) somewhere?
(In reply to Betacommand from comment #14)
Until such a time as the purge action is no longer needed it should remain.
Indeed. I think the focus should be slow deprecation and eventual removal.
(In reply to Liangent from comment #15)
Yeah I can remember some of them. Is there a full list (a tracking bug?)
somewhere?
This bug report may become a tracking bug.
In my experience, one of the most common reasons to use purge is after infrastructure failures (for example, when the purge feed between the main and caching centers has been down), which cause inconsistencies.
In other cases, it's almost always cached 'time dependent' content. The category with speedy deletes not updating quickly enough for the wikignomes' liking (job queue). Or calculated ages of article subjects, for instance, that are no longer up to date (because calculations are cached, which for time-based content is thus inherently broken).
If we don't want to sacrifice flexibility and want to keep caching at the same time, I would say that calculations based on time could have cache invalidation timers associated with them (more ugliness, but doable/manageable in Lua, I think).
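A rough sketch of what such a cache invalidation timer could mean for the "calculated age" case: the rendering returns a time-to-live alongside the text, so the cached copy expires when the value could next change. The function names are hypothetical, and naive local datetimes are assumed for simplicity.

```python
from datetime import date, datetime, time, timedelta


def age_on(birth, today):
    """Whole years elapsed between `birth` and `today`."""
    years = today.year - birth.year
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1  # this year's birthday has not happened yet
    return years


def seconds_until_next_day(now):
    """Seconds from `now` until the following midnight."""
    next_midnight = datetime.combine(now.date() + timedelta(days=1), time.min)
    return int((next_midnight - now).total_seconds())


def render_age(birth, now):
    """Return (rendered text, cache TTL in seconds).

    The TTL ends at midnight because the computed age can only change
    when the date changes.
    """
    return str(age_on(birth, now.date())), seconds_until_next_day(now)
```

A tighter timer could expire only on the next birthday, but expiring at midnight keeps the sketch simple and is still correct.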
There was a recent situation with All-and-every-Wikisource that needed an almighty purge of many, many pages, and purge.py needed updating. @Mpaa & @Billinghurst did it, iirc..?
Wikisource does indeed use the purge function
- When Proofread Page hasn't replicated its page status properly to indexes, which resulted in the purging of both Index: ns and Page: ns, and it was done across all WSes. [There is a phabricator ticket]
Also note that when a file: is updated at Commons (overwritten), we often need to purge the index to refresh the text layer for the Page: ns pages that are pending (redlinked). We used to have issues with text layer availability, though I cannot say that I have had to do that for a while.
Personally, I have had to use purge on occasions when user scripts have failed to load, either where added to the sidebar as an option, or where a page has failed to load properly. Today for example, I have had issues with Commons script MediawikiExCommons.js failing to find used files and offering to transfer them.
So while I have no issue with never having to purge a page, I definitely know that we are not ready to remove it.
Factual note:
Any page (most typically main pages and portal pages) which transcludes time-changing templates (e.g. "word of the day", "quote of the week", "featured article" etc.) needs the purge feature because it (obviously) never updates on its own.
I'm not saying that isn't true, but it shouldn't be true. Using magic words like {{CURRENTDAY}} is supposed to reduce the cache TTL (time-to-live) of the rendering. For the built-in magic words, there's a handy list in MagicWord.php. For ParserFunctions' {{#time:}} it's kind of difficult to follow, but I think it uses the calculations in Language.php. Of course it's not impossible that this functionality is broken, but it exists and it's supposed to work.
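The mechanism described above, where each time-dependent construct lowers the cache TTL of the whole rendering, can be modelled roughly as follows. This is a toy model, not MediaWiki's code: the class, the function and every TTL value in the usage example are invented for illustration.

```python
class RenderResult:
    """Toy stand-in for a parser output that tracks its own cache lifetime."""

    def __init__(self, default_ttl):
        self.ttl = default_ttl

    def update_cache_expiry(self, seconds):
        # The rendering may only be cached as long as its shortest-lived part.
        self.ttl = min(self.ttl, seconds)


def render_ttl(words_used, ttl_by_word, default_ttl=86400):
    """Return the cache TTL for a page that uses the given magic words.

    `ttl_by_word` maps a time-dependent word to a TTL in seconds. Words
    that are missing from the mapping do not shorten the cache lifetime.
    """
    result = RenderResult(default_ttl)
    for word in words_used:
        if word in ttl_by_word:
            result.update_cache_expiry(ttl_by_word[word])
    return result.ttl


# Made-up values: a page using CURRENTDAY is cached for at most one hour.
example_ttl = render_ttl(["PAGENAME", "CURRENTDAY"], {"CURRENTDAY": 3600})
```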
There is a use case for /api.php?action=purge on plwiktionary, see the task description in T109638: Page categorization logs expose user's IP.
I have honestly never seen a page transcluding a time-dependent template "auto-purge". We always had to do it manually or have a bot purge at midnight.
Also, not only parser functions or magic words can create this kind of transclusion. Mind Lua modules as well, which do not have to use any of those constructions simply because of the built-in os.date/os.time.
Lua is not my strong point, but this also seems to be implemented for Lua modules, and indeed for os.date/os.time.
https://github.com/wikimedia/mediawiki-extensions-Scribunto/blob/master/engines/LuaCommon/lualib/mw.lua#L113
https://github.com/wikimedia/mediawiki-extensions-Scribunto/blob/master/engines/LuaCommon/lualib/mw.lua#L477
In addition to what Billinghurst reported about Wikisource, page transclusion is also a major cause for the need of purging: very often, when a transcluded page is modified, the transcluding page remains outdated even for several minutes, and we have to purge it to refresh it.
On ptwiki, we often see people complaining that "today's" date on the main page is incorrect, which I assume is caused by the cache, so I suggest purging the page when they want it to be updated. See e.g.:
https://pt.wikipedia.org/wiki/WP:Esplanada/geral/Datas_autom%C3%A1ticas,_ou_n%C3%A3o..._(6nov2012)
Also, on many wikis we have documentation templates transcluding /doc subpages to the main template pages and these provide a purge link so that changes to the docs can be propagated to the main template page. See:
https://en.wikipedia.org/wiki/Template:Documentation
Consider also the problem reported at pt:WP:CP#Categoria:!Redirecionamentos de categorias não vazios não está esvaziando. It seems to be like this:
Suppose there is a single page "P" in the category "C", and in the description of "C" there is a template "T" containing the test
{{#ifeq: {{PAGESINCATEGORY: {{PAGENAME}} }}|0|| [[Categoria:N]] }}
In this situation, if a user goes to "P" and removes category "C", the test in the template now evaluates to "true" and the category "N" is removed. However, the category "C" is still listed at category "N" (i.e. it is not purged automatically, and the user has to do a null edit at the description of "C", so that the list at "N" gets updated).
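The staleness in this report can be reproduced with a small toy model. It assumes, as the report suggests, that editing "P" re-parses only "P" and not the description page of "C"; the function names are hypothetical.

```python
def categories_declared_by_c(member_count):
    """Mimic the #ifeq test: C declares category N only while C is non-empty."""
    return set() if member_count == 0 else {"N"}


def simulate():
    """Return (links stored for C after editing P, links after a null edit on C)."""
    members_of_c = {"P"}
    # Links recorded when C's description page was last parsed.
    stored_links = categories_declared_by_c(len(members_of_c))
    # Editing P to drop category C re-parses P only, so C's stored links stay as-is.
    members_of_c.discard("P")
    stale_links = set(stored_links)
    # A null edit (or a forced links update) on C re-parses its description page.
    refreshed_links = categories_declared_by_c(len(members_of_c))
    return stale_links, refreshed_links
```

In this model the stored links for "C" still contain "N" until "C" itself is re-parsed, which matches the behaviour described above.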
Look at https://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Weblinkwartung/Toter_Link
:Improvements since last year
CategoryTree works much faster. Good job, thank you.
:Same as last year
The transclusion of {{Spezial:Linkliste/Vorlage:Toter_Link/!...nourl}} (special link list) needs a purge, because it should be up to date. There is still some work to do here.
Hmm... the purge action may be needed for now, until MediaWiki can reliably keep content up to date. Sure, the purge buttons are grating, but anonymous users using IP addresses, if they can discover it, are left with the option of typing ?action=purge or &action=purge in the address bar, which is more effective than simply pressing the refresh/reload button.
As an extension writer, I find the purge action massively, massively useful. It is not uncommon, whilst working on an extension, to mess up rendering, and without an easy way to purge the page, you're a bit stuck.
It also helps users who encounter extension bugs that may cause temporary rendering issues. Issues in non-core extensions are not something that could ever be resolved by changes to the core MW code.
Therefore, I vote for this to be a WONTFIX as there will always be some situations for which action=purge is the only feasible solution.
On Wikidata purge is a standard again. I'd guess for good reasons. I think that this increased the chances of retaining this useful feature. Touching wood.
It seems a strange idea to me. Changes in the MediaWiki or extension codebase (per @gh87), Lua modules, templates and pages with Semantic MediaWiki annotations do not propagate along dependencies in real time, and neither does parser cache invalidation.
Since it got bumped again, this ticket is more of an idealistic ticket, expressing that MediaWiki should do cache invalidation perfectly so that a manual purge is never needed by users. If users have to click "purge" to get things in sync, that's a bug, which is why this ticket exists. We may never get to the eventual goal because cache invalidation is a hard problem, but it seems worthwhile to aim for, track issues and hopefully fix them.
I get that, but my point was that this is inherently an impossibility - even if this could be guaranteed for MW core, it could not be guaranteed for extensions (which are outside of your control) nor for any dev environment, where things are inherently unstable.
I would understand a permanent tracking ticket for issues relating to cache invalidation, but I don't think it is realistic to expect that the MW ecosystem would ever get to a point where page-specific purging is not useful and necessary functionality.
Why can't extensions be fixed? Why can't extension developers vary their caches on page_touched or adjust cache TTLs/disable caches when that isn't possible? Relying on users to click purge is a bug, just like any other broken extension functionality.
Developers should be able to manipulate caches on their own. Maybe we need better tooling to make this straightforward?
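One way to read "vary their caches on page_touched" is to make the page's page_touched value part of the cache key, so a touched page misses the cache without anyone clicking purge. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, and old entries are never evicted.

```python
class TouchedKeyedCache:
    """Toy cache whose keys include the page's page_touched timestamp."""

    def __init__(self):
        self._store = {}

    def get_or_render(self, page_id, page_touched, render):
        """Return the cached output for this page version, rendering on a miss.

        `render` is a zero-argument callable producing the output. When
        page_touched changes, the key changes, so stale output is never
        served for a touched page.
        """
        key = (page_id, page_touched)
        if key not in self._store:
            self._store[key] = render()
        return self._store[key]
```

A real cache would also need expiry or eviction for the entries left behind under old page_touched values.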
I would understand a permanent tracking ticket for issues relating to cache invalidation, but I don't think it is realistic to expect that the MW ecosystem would ever get to a point where page-specific purging is not useful and necessary functionality.
We'll have to settle for dreaming then.
I'm not suggesting we rely on users to click purge as part of any design, but it is a very useful tool when asking a user to help you debug an issue remotely, or for users to work around bugs that have not yet been fixed (I have definitely encountered a few of these in my time, both as a user and a developer). The alternative is for the extension to disable caching altogether on pages that use it, which I don't think would be a great idea.
Developers should be able to manipulate caches on their own. Maybe we need better tooling to make this straightforward?
If you mean a tool to purge a specific page whilst you are debugging, then yes, that would be great, but... well... isn't that what we already have?