
ReleaseTaggerBot tagged a task with a later branch than when the code landed, as the branch was created between the merge and the time the bot ran
Open, Needs Triage, Public

Description

ReleaseTaggerBot tagged T297664 with MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), but the only associated patch made it into wmf.13. The patch was merged at 01:52 UTC, the bot updated the task at 02:00 UTC, and the branch commit is timestamped 02:07 UTC, so this could be a timing issue / race condition.

(BTW, the deployments page on wikitech is wrong: it claims that the branch cut happens at 03:00 UTC)

Event Timeline

Entirely guessing: the branch in Gerrit was created at 02:00 UTC, RTB did not see this patch in that branch and tagged it as going in wmf.16, and the commit for that branch was only pushed at 02:07 UTC.

We could adjust the cron to run at 30 minutes past each hour instead of on the hour to avoid this, since train branching is now on a pretty reliable schedule.

(BTW, the deployments page on wikitech is wrong: it claims that the branch cut happens at 03:00 UTC)

DST bug maybe? That page is generated by https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/tools/release/+/refs/heads/master/make-deployment-calendar/deployments-calendar.json

From the log file:

2021-12-14 02:00:23,050 - forrestbot - INFO - Processing https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MediaSearch/+/746979
2021-12-14 02:00:23,050 - root - DEBUG - Requesting 'master' branches for mediawiki/extensions/MediaSearch
2021-12-14 02:00:23,065 - urllib3.connectionpool - DEBUG - https://gerrit.wikimedia.org:443 "GET /r/projects/mediawiki%2Fextensions%2FMediaSearch/branches/ HTTP/1.1" 200 1455
2021-12-14 02:00:23,066 - forrestbot - INFO - https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MediaSearch/+/746979: merged in branch master, Task 297664, needs slugs ['mw1.38.0-wmf.14']

Now, how does RTB figure out the slug for a master branch commit?

By getting a list of branches from Gerrit, finding the latest release branch, and adding 1 to the wmf-part:
https://github.com/wikimedia/labs-tools-forrestbot/blob/master/forrestbot.py#L47
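As a rough illustration of that "latest release branch plus one" rule (not the actual forrestbot code), here is a minimal sketch. The function name `next_master_slug` and the assumed ref format `refs/heads/wmf/1.38.0-wmf.13` are hypothetical; the `mw1.38.0-wmf.14` slug shape is taken from the log above.

```python
import re

# Assumption for this sketch: release branches look like
# "refs/heads/wmf/1.38.0-wmf.13". The real bot may parse them differently.
WMF_BRANCH_RE = re.compile(r"^refs/heads/wmf/(\d+)\.(\d+)\.(\d+)-wmf\.(\d+)$")


def next_master_slug(branch_refs):
    """Return the slug for a commit merged to master:
    take the latest wmf release branch and add 1 to its wmf number."""
    versions = []
    for ref in branch_refs:
        match = WMF_BRANCH_RE.match(ref)
        if match:
            # Compare as integers so that wmf.13 sorts after wmf.9.
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        raise ValueError("no wmf release branches found")
    major, minor, patch, wmf = max(versions)
    return f"mw{major}.{minor}.{patch}-wmf.{wmf + 1}"
```

With this rule the result depends only on which branches exist when the bot runs, not on when the commit was merged: the same master commit maps to wmf.13 before the wmf.13 branch exists and to wmf.14 afterwards.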

So what I think happened is the following:

  1. Commit to master.
  2. Branch-off to the -wmf.13 branch.
  3. RTB processes a batch, sees a commit to master. This will go into the next release, hence -wmf.14.

This suggests that moving the cron to 30 minutes past the hour will make the issue worse, not better. The only way to fix this correctly would be to verify whether the commit is part of the 'current' release branch, maybe through https://gerrit-review.googlesource.com/Documentation/rest-api-projects.html#get-included-in ? There is of course still a potential for a race condition then, but it is at least limited to a few seconds rather than 5 minutes.
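A minimal sketch of what such a check could look like, assuming the "included in" response body has already been fetched. The function names are hypothetical, and so is the assumption that release branches appear in the response as short names like `wmf/1.38.0-wmf.13`. The `branches` field and the `)]}'` prefix on Gerrit JSON responses come from the Gerrit REST API documentation.

```python
import json
import re

XSSI_PREFIX = ")]}'"  # Gerrit prepends this line to its JSON responses
# Assumption: branch names in the response are short, e.g. "wmf/1.38.0-wmf.13".
WMF_BRANCH_RE = re.compile(r"^wmf/(\d+)\.(\d+)\.(\d+)-wmf\.(\d+)$")


def parse_gerrit_json(body):
    """Strip Gerrit's anti-XSSI prefix, then decode the JSON payload."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def slug_from_included_in(body, next_slug):
    """Return the slug of the earliest wmf branch that already contains
    the commit, or next_slug if no release branch contains it yet."""
    info = parse_gerrit_json(body)
    versions = []
    for name in info.get("branches", []):
        match = WMF_BRANCH_RE.match(name)
        if match:
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        # Commit is only on master so far: it goes into the next branch.
        return next_slug
    major, minor, patch, wmf = min(versions)
    return f"mw{major}.{minor}.{patch}-wmf.{wmf}"
```

In the scenario above, the response for the merged commit would list the wmf.13 branch once it exists, so the task would be tagged with wmf.13 rather than the computed "next" slug.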

Jdforrester-WMF renamed this task from "ReleaseTaggerBot tagged a task with a branch that hadn't been cut yet" to "ReleaseTaggerBot tagged a task with a later branch than when the code landed, as the branch was created between the merge and the time the bot ran". Dec 14 2021, 6:53 PM

(BTW, the deployments page on wikitech is wrong: it claims that the branch cut happens at 03:00 UTC)

DST bug maybe? That page is generated by https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/tools/release/+/refs/heads/master/make-deployment-calendar/deployments-calendar.json

This was reported as a separate issue minutes later: T297724