Description
As a developer, I expect my Coder volume to persist until my workspace is deleted, so that I can trust that my half-finished work is not lost (barring exceptional circumstances).
Scenario 1: Currently there is one way for a volume to be deleted unintentionally (with #3000 there are two):
- A template author updates a template and changes the volume name format
- The user sees their workspace is outdated and goes ahead and updates
- The workspace is re-created. During start, the old volume is deleted (it is no longer referenced in the TF template) and a new one is created in its place
Scenario 2: When #3000 is implemented, our current users will run into this scenario:
- The user renames their workspace
- The user (or autostart/stop) starts or stops the workspace
- The volume will be deleted and a new one created in its place
To avoid the volume loss introduced by #3000, template authors should move to using `data.coder_workspace.me.id` instead of `data.coder_workspace.me.name`, which will prevent volume deletion due to a naming change. However, the migration path means that all existing Coder workspaces will lose their volumes, either through the former case (rename in template + update) or through a workspace rename.
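To illustrate why an ID-based name survives a rename while a name-based one does not, here is a minimal Go sketch. The `Workspace` type, both helper functions, and the `coder-<value>-home` format are hypothetical stand-ins for what a template would interpolate; they are not taken from the Coder codebase.

```go
package main

import "fmt"

// Workspace holds the two fields a template could use to name a volume.
// Hypothetical type, for illustration only.
type Workspace struct {
	ID   string // assumed stable for the lifetime of the workspace
	Name string // user-visible, changes on rename
}

// volumeNameFromName mimics a template that interpolates
// data.coder_workspace.me.name into the volume name.
func volumeNameFromName(w Workspace) string {
	return fmt.Sprintf("coder-%s-home", w.Name)
}

// volumeNameFromID mimics a template that interpolates
// data.coder_workspace.me.id into the volume name.
func volumeNameFromID(w Workspace) string {
	return fmt.Sprintf("coder-%s-home", w.ID)
}

func main() {
	w := Workspace{ID: "8d3f2c1a", Name: "dev"}
	beforeName, beforeID := volumeNameFromName(w), volumeNameFromID(w)

	// Simulate the rename from Scenario 2.
	w.Name = "dev-renamed"

	// The name-based volume name changed, so the old volume would be replaced.
	fmt.Println(beforeName != volumeNameFromName(w)) // true
	// The ID-based volume name is unchanged, so the volume would be kept.
	fmt.Println(beforeID == volumeNameFromID(w)) // true
}
```

Note that the sketch also shows the migration problem: for the same workspace, the two functions return different names, so switching a template from one to the other replaces the volume once.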
There are a few ways we could dampen this impact:
1. We could detect when a template push would cause this and warn the operator
2. We could detect when a workspace update would cause this and warn the user
3. We could detect when a rename would cause this and warn the user
4. We could consider workspace volume names immutable after creation (i.e. we compute the volume name and store it separately, re-use it for the lifetime of the workspace)
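The ideas above could be sketched roughly as follows, in Go. Everything here is an assumption made for illustration: `WorkspaceRecord`, `NameFormat`, `WouldReplaceVolume`, and `ResolveVolumeName` are hypothetical names, and how a stored name would actually be passed to Terraform is left open.

```go
package main

import "fmt"

// WorkspaceRecord is a hypothetical stored workspace row.
type WorkspaceRecord struct {
	ID               string
	Name             string
	PinnedVolumeName string // empty until the first build computes it
}

// NameFormat is a hypothetical stand-in for the template's volume name expression.
type NameFormat func(id, name string) string

// WouldReplaceVolume covers the three detection ideas: it reports whether
// moving from the current format and name to the next format and name would
// yield a different volume name, i.e. whether a warning should be shown.
func WouldReplaceVolume(current, next NameFormat, id, oldName, newName string) bool {
	return current(id, oldName) != next(id, newName)
}

// ResolveVolumeName covers the immutable-name idea: the name is computed once,
// stored on the record, and reused for the lifetime of the workspace.
func ResolveVolumeName(w *WorkspaceRecord, format NameFormat) string {
	if w.PinnedVolumeName == "" {
		w.PinnedVolumeName = format(w.ID, w.Name)
	}
	return w.PinnedVolumeName
}

func main() {
	byName := func(_, name string) string { return "coder-" + name + "-home" }
	byID := func(id, _ string) string { return "coder-" + id + "-home" }

	// A rename under a name-based template should trigger a warning.
	fmt.Println(WouldReplaceVolume(byName, byName, "8d3f2c1a", "dev", "dev2")) // true
	// A template change from name-based to ID-based should trigger a warning.
	fmt.Println(WouldReplaceVolume(byName, byID, "8d3f2c1a", "dev", "dev")) // true
	// A rename under an ID-based template is safe.
	fmt.Println(WouldReplaceVolume(byID, byID, "8d3f2c1a", "dev", "dev2")) // false

	// With a pinned name, neither a rename nor a template change alters it.
	w := &WorkspaceRecord{ID: "8d3f2c1a", Name: "dev"}
	first := ResolveVolumeName(w, byName)
	w.Name = "dev2"
	fmt.Println(first == ResolveVolumeName(w, byID)) // true
}
```

The detection helper only compares computed names, so it cannot tell an intentional rename from an accidental one; it can only surface the consequence before the build runs.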
It could be argued that renaming volumes in the template is an operator error and we shouldn't care about it. On the flip side, operator error or not, it's something we could have guarded against, preventing a potential loss in developer productivity.
I think option 4 would be best, as it lets us guard against data loss in all cases. It's also the one requiring the fewest changes to our code to handle the edge cases.
At minimum, we should implement 1 and 2 so that operators and users are aware of the impact. It'd be great to have 3 too in this case, but it's not mandatory if we expect all our users to migrate to using IDs.
I think this is something of a feature-bug, so I've labeled it as such. Depending on solutions it will also require CLI and frontend work (but I've omitted those tags for now).