
Outstanding issues with scap phatality deployment
Open, Needs Triage, Public

Description

  • Canary deployment should be used.
    • Open question: which subset of servers should act as canaries?
  • As implemented, checks are not suitable for installation steps: a failing check does not prevent subsequent checks in the same stage from running. (Maybe this should be a scap3 task instead?)
  • It's not clear to me whether checks are run again after the rollback when scap deploy offers to roll back (e.g., due to a check error). Needs investigation.
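
To make the second point concrete, here is a minimal Python sketch of the two behaviours. It is not scap's code and does not use scap's API: the function names, the step names, and the commands are all hypothetical. It only contrasts "every step runs regardless of earlier failures" (which is how checks in one stage appear to behave, per the bullet above) with the fail-fast ordering that installation steps would need.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple

# Hypothetical representation of a step: (name, argv). Not a scap data structure.
Step = Tuple[str, Sequence[str]]


def run_steps_independently(steps: Sequence[Step]) -> List[str]:
    """Run every step even if an earlier one failed; return the names that failed."""
    failed = []
    for name, argv in steps:
        if subprocess.run(argv).returncode != 0:
            failed.append(name)
    return failed


def run_steps_fail_fast(steps: Sequence[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first non-zero exit status.

    Returns (names that completed, name of the failing step or None).
    """
    completed = []
    for name, argv in steps:
        if subprocess.run(argv).returncode != 0:
            return completed, name  # later steps are never started
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Stand-in commands that just exit 0 or 1; real steps would differ.
    ok = [sys.executable, "-c", "raise SystemExit(0)"]
    bad = [sys.executable, "-c", "raise SystemExit(1)"]
    steps = [("install", ok), ("configure", bad), ("restart", ok)]

    print(run_steps_independently(steps))  # "restart" still runs after the failure
    print(run_steps_fail_fast(steps))      # "restart" is skipped
```

With the independent runner, the hypothetical "restart" step still executes after "configure" has failed, which is the hazard when checks are used for installation steps.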

xref: https://wikitech.wikimedia.org/wiki/Incidents/2024-09-16_logstash_unavailability
xref: T374880