This task tracks progress on implementing the reverted tag as part of my GSoC 2020 project, T248775: Proposal: Add Reverted filter to RecentChanges Filters
Requirements
These are the requirements the solution has to satisfy. They are based on the discussion in T252366: Choose the definition of a "reverted" edit.
What is a reverted edit?
A reverted edit is an edit that was later removed from the article by another edit. We will consider the following cases:
- An edit is reverted using the rollback link. This quickly reverts one or more top revisions made by a single author.
- An edit is reverted using the undo link. This allows reverting a single edit in the article's history. The reverting editor has the option to apply additional changes on top of the revert; this still counts as reverting.
  - An optional enhancement that can be implemented later is handling undoing many revisions using the undoafter parameter. See T153570.
- A manual revert is made, i.e. the editor reverts the page to an exact previous state.
  - This will be done by exact revision SHA-1 matching. As the revision table lacks an index over that field, we will have to restrict the search to a certain radius. It will probably be 15 edits, as suggested by this paper, but that could also be a configuration variable.
  - See also T154637 for a request for a similar feature.
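The manual revert check described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not MediaWiki code: the `Revision` type, `content_sha1`, and `find_manual_revert` are hypothetical names, hashes are plain hex strings, and picking the nearest matching revision is an assumption rather than a settled decision.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Revision:
    """Hypothetical stand-in for a row of the revision table."""
    rev_id: int
    sha1: str  # hash of the revision's content


def content_sha1(text: str) -> str:
    """Hash page content; a hex digest is used here purely for illustration."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_manual_revert(
    history: List[Revision],
    new_sha1: str,
    radius: int = 15,
) -> Optional[Tuple[Revision, List[Revision]]]:
    """Look for an exact earlier state among the last `radius` revisions.

    `history` is ordered oldest first and does not include the new edit.
    Returns (restored revision, reverted revisions), or None if the new
    edit does not restore any state inside the search radius.
    """
    # Guard against radius <= 0, where history[-0:] would be the whole list.
    recent = history[-radius:] if radius > 0 else []
    # Walk from newest to oldest so the nearest matching state wins.
    for index in range(len(recent) - 1, -1, -1):
        if recent[index].sha1 == new_sha1:
            reverted = recent[index + 1:]
            if not reverted:
                # Same content as the current top revision: nothing was reverted.
                return None
            return recent[index], reverted
    return None
```

Limiting the scan to the last `radius` revisions keeps the cost bounded even without an index on the hash field, at the price of missing reverts to older states.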
Other
- The solution must allow for easy filtering of reverted edits on Special:RecentChanges. This requirement will be satisfied automatically if we use an edit tag to mark reverted edits.
- The reverted tag is permanent – under no circumstances will it be removed from an edit by MediaWiki.
- We should record which revert detection method was used to mark an edit as reverted. This is a request from the Product Analytics team.
- The solution should cover at least all cases that the Echo extension currently covers for detecting reverted edits. We will want to get rid of redundant revert detection code in Echo.
Technical design
Note that this is still being drafted and may change a lot. I will probably update this section a few times. It may be temporarily complete nonsense.
I think the most sensible way to start is to refactor the code around the PageContentSaveComplete hook, mostly in the PageUpdater class. A new class would be introduced to store information related to an edit, to simplify data handling and related hooks. This was already attempted in this patch. Note that this patch is pre-MCR, so a lot of stuff will have to be changed.
This would also involve changing parameter lists for some hooks and probably deprecating the ArticleRollbackComplete hook (see T154263).
- The new class (let's call it EditResult for now) would store references to things like the revision we reverted to and revisions that were reverted by it.
- The class may seem quite similar in role to RevisionRecord, but from what I understand, RevisionRecord should be completely independent of its page and other revisions. Also, it would not be possible to trivially reconstruct the new fields from a DB row of the revision table.
- It may also be tempting to integrate that into the PageUpdater class, as it has a somewhat similar role. That would also be bad, as that class is used mostly as a controller for saving edits, not as a container. In the PageContentSaveComplete hook we would have to pass $this to extensions, which is way beyond the scope of the hook.
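To make the container idea more concrete, here is a rough sketch of what such a class could hold. It is written in Python only for illustration; every field name (`new_rev_id`, `revert_method`, `restored_rev_id`, `reverted_rev_ids`) and the `RevertMethod` enum are assumptions, not a proposed final interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class RevertMethod(Enum):
    """The three revert cases listed in the requirements (hypothetical names)."""
    ROLLBACK = "rollback"
    UNDO = "undo"
    MANUAL = "manual"


@dataclass(frozen=True)
class EditResult:
    """Immutable value object describing the outcome of saving an edit.

    It only carries data; saving the edit stays the job of the controller
    (PageUpdater in the text above), so hooks can receive this object
    instead of the controller itself.
    """
    new_rev_id: int
    revert_method: Optional[RevertMethod] = None  # None: not a revert
    restored_rev_id: Optional[int] = None         # revision we reverted to
    reverted_rev_ids: Tuple[int, ...] = ()        # revisions undone by this edit

    def is_revert(self) -> bool:
        return self.revert_method is not None
```

Making the object immutable is a design choice of this sketch: hook handlers can read it freely without being able to change what the core code recorded.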
Then the actual revert detection code would be implemented. Relevant information would be stored in an object of the new class and reverted edits would be tagged appropriately. Additional information about the revert can be stored in the tag using the ct_params field of the change_tag table.
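One possible way to pack the revert details into ct_params is a small JSON object. The sketch below is in Python for illustration only; the JSON encoding, the key names (`revertMethod`, `revertingRevId`, `restoredRevId`), and both helper functions are assumptions, since the format of the stored data has not been decided here.

```python
import json
from typing import Optional

# The three detection methods from the requirements (hypothetical identifiers).
ALLOWED_METHODS = {"rollback", "undo", "manual"}


def encode_revert_params(
    method: str,
    reverting_rev_id: int,
    restored_rev_id: Optional[int] = None,
) -> str:
    """Build the string that would be stored alongside the reverted tag."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown revert method: {method!r}")
    params = {"revertMethod": method, "revertingRevId": reverting_rev_id}
    if restored_rev_id is not None:
        params["restoredRevId"] = restored_rev_id
    # sort_keys keeps the output stable, which makes stored values comparable.
    return json.dumps(params, sort_keys=True)


def decode_revert_params(blob: str) -> dict:
    """Read the stored parameters back, e.g. for analytics queries."""
    return json.loads(blob)
```

Storing the detection method next to the tag would cover the Product Analytics request without needing a separate tag per method.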
Note about scheduling: according to my original internship schedule in T248775, I would first do the revert detection code and make the programmatic interface later. I think doing it the opposite way would save me some time on refactoring the code later, so I'll want to start with implementing the new class.