While examining this AbuseLog entry on itwikinews, the page_namespace variable has the incorrect value of null. However, the fields below clearly show it as 0, and the edit was indeed on a ns0 page. I couldn't reproduce the error on other AbuseLog entries: there the namespace has the correct value (0 or anything else).
My first suspicion was that this is due to the edit not being saved; however, other non-saved edits do have the correct value for page_namespace.
My second suspicion was that page_namespace isn't used in any filter and is thus only computed upon examination (something similar to T176291); however, the variable was actually computed at edit time.
I'll keep investigating, but I have no clue for the moment.
UPDATE 1: This comes from wmf.19. In fact, page_namespace is null for all entries older than August 30th, 19:47 UTC, which is the deployment time.
UPDATE 2: This affects all renamed variables. I now also suspect the reason: currently, we only use the new names for computing, and if the pattern contains an old name, the parser translates it to the new one. However, var_dumps for old entries are stored with the old names, so trying to retrieve (for instance) page_namespace from a dump where the variable is saved as article_namespace returns null, because no variable with that name exists in the dump. If this is correct, the possible solutions are:
- Change the mapping direction, i.e. translate new names to the old ones. This would require a big change (basically, most of the patches for T173889 would have to be reverted) and would produce the opposite problem for entries logged after wmf.19. It would also add some inconsistency (since the preferred names wouldn't actually be used) and would just delay the problem if we decide to completely remove the old variables.
- Add a special-case backward mapping when examining old entries.
- Use a maintenance script to convert old names to the new ones in old entries. Note: We should seize the opportunity and fix T110854 and T187153 with this script.
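To illustrate the suspected failure and the second option, here is a minimal Python sketch (not AbuseFilter's actual PHP code). It models a stored dump as a plain dict; `NEW_TO_OLD`, `get_var` and `get_var_compat` are hypothetical names, and the rename map contains only the one pair mentioned above.

```python
# Hypothetical rename map (new name -> old name). Only this pair is taken
# from the report; the real list of renamed variables is longer.
NEW_TO_OLD = {"page_namespace": "article_namespace"}


def get_var(dump, name):
    """Naive lookup: yields None when the dump was stored under the old name."""
    return dump.get(name)


def get_var_compat(dump, name):
    """Lookup with a backward mapping for dumps stored under old names."""
    if name in dump:
        return dump[name]
    old_name = NEW_TO_OLD.get(name)
    if old_name is not None and old_name in dump:
        return dump[old_name]
    return None


old_dump = {"article_namespace": 0}  # entry logged before the deployment
new_dump = {"page_namespace": 0}     # entry logged after the deployment

print(get_var(old_dump, "page_namespace"))         # the reported symptom
print(get_var_compat(old_dump, "page_namespace"))  # resolved via the old name
print(get_var_compat(new_dump, "page_namespace"))  # new entries are unaffected
```

The fallback only kicks in when the new name is missing, so entries logged after the deployment are read exactly as before.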
Point 2 (the backward mapping) could be a good short-term solution, while point 3 (the maintenance script) would be the long-term one. If, however, we find that either of the two could produce other problems, we could proceed with point 1 and apply point 3 (and maybe 2) only to newer entries.
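As a rough sketch of what the conversion in point 3 would do to a single dump, assuming again that a dump can be treated as a name-to-value dict (the real script would have to read and rewrite the stored dump format, which is not modelled here). `OLD_TO_NEW` and `convert_dump` are hypothetical names, and the map lists only the pair mentioned above.

```python
# Hypothetical rename map (old name -> new name); illustrative only.
OLD_TO_NEW = {"article_namespace": "page_namespace"}


def convert_dump(dump):
    """Return a copy of the dump with old variable names replaced by new ones."""
    converted = {}
    for name, value in dump.items():
        new_name = OLD_TO_NEW.get(name, name)
        # If both names are present, the value stored under the new name wins.
        if name != new_name and new_name in converted:
            continue
        converted[new_name] = value
    return converted


old_dump = {"article_namespace": 0, "user_name": "Example"}
print(convert_dump(old_dump))
```

Running the conversion on an already converted dump leaves it unchanged, so such a script could safely be re-run over all entries.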