Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We currently save the GitHub OAuth access tokens in plaintext in our database, which isn't a huge issue as long as nobody unauthorized gains access to our database. But from a security standpoint it's better to encrypt them so that an attacker would need both access to the database and to the encryption key.
This PR implements step 1 of the migration towards using encrypted access tokens. It adds a new column to the database, changes the code to start saving encrypted tokens to that column and adds an admin tool to backfill the column based on the current plaintext tokens.
This also adds a new required
GITHUB_TOKEN_ENCRYPTION_KEY
environment variable without which the server and background worker won't start. The env var expects a 64 character hex string, which is then decoded to a 32 byte AES key. I've already generated corresponding values for our staging and production environments viaopenssl rand -hex 32
.The encryption uses "AES-256-GCM" which from my understanding should be a reasonable algorithm for this situation and requires no extra dependencies since it was already used transitively.
For completeness: Phase 2 will turn the column non-nullable and will start using the encrypted tokens instead of the plaintext tokens. Phase 3 will then remove the plaintext token column. These phases should be deployed separately for obvious reasons 😅