feat: track resource replacements when claiming a prebuilt workspace (#17571)

Closes https://github.com/coder/internal/issues/369

We can't know a priori whether a replacement (i.e. drift of Terraform
state that forces a resource to be deleted and recreated) will take
place; we can only detect it at `plan` time, because the provider
decides whether a resource must be replaced, and that cannot be inferred
through static analysis of the template.
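
To illustrate what detection at `plan` time can look like, here is a minimal sketch that scans Terraform's JSON plan representation for resources whose planned actions include both `delete` and `create`, and collects the attribute paths listed under `replace_paths`. The type and function names (`plan`, `resourceChange`, `findReplacements`) are hypothetical and are not taken from this change; only the JSON field names come from Terraform's plan format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// plan mirrors only the parts of Terraform's JSON plan that this sketch needs.
type plan struct {
	ResourceChanges []resourceChange `json:"resource_changes"`
}

type resourceChange struct {
	Address string `json:"address"`
	Change  struct {
		Actions      []string        `json:"actions"`
		ReplacePaths [][]interface{} `json:"replace_paths"`
	} `json:"change"`
}

// isReplace reports whether the planned actions both delete and create the resource.
func isReplace(actions []string) bool {
	var del, create bool
	for _, a := range actions {
		switch a {
		case "delete":
			del = true
		case "create":
			create = true
		}
	}
	return del && create
}

// findReplacements (hypothetical helper) maps each replaced resource address
// to the attribute paths that forced the replacement.
func findReplacements(planJSON []byte) (map[string][]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, fmt.Errorf("parse plan: %w", err)
	}
	out := make(map[string][]string)
	for _, rc := range p.ResourceChanges {
		if !isReplace(rc.Change.Actions) {
			continue
		}
		paths := make([]string, 0, len(rc.Change.ReplacePaths))
		for _, path := range rc.Change.ReplacePaths {
			steps := make([]string, 0, len(path))
			for _, step := range path {
				steps = append(steps, fmt.Sprint(step))
			}
			paths = append(paths, strings.Join(steps, "."))
		}
		sort.Strings(paths)
		out[rc.Address] = paths
	}
	return out, nil
}

func main() {
	// Made-up plan fragment for illustration only.
	sample := []byte(`{"resource_changes":[
	  {"address":"docker_container.workspace","change":{"actions":["delete","create"],"replace_paths":[["env"]]}},
	  {"address":"coder_agent.main","change":{"actions":["update"]}}
	]}`)
	replacements, err := findReplacements(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(replacements)
}
```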

**This is likely to be the most common gotcha with using prebuilds,
since it requires a slight template modification to use prebuilds
effectively**, so let's head this off before it's an issue for
customers.

Drift details will now be logged in the workspace build logs:


![image](https://github.com/user-attachments/assets/da1988b6-2cbe-4a79-a3c5-ea29891f3d6f)

Plus a notification will be sent to template admins when this situation
arises:


![image](https://github.com/user-attachments/assets/39d555b1-a262-4a3e-b529-03b9f23bf66a)

A new metric - `coderd_prebuilt_workspaces_resource_replacements_total`
- will also increment each time a workspace encounters replacements.

We only track _that_ a resource replacement occurred, not how many. Just
one is enough to ruin a prebuild, but we can't know a priori which
replacement would cause this.
For example, say we have 2 replacements: a `docker_container` and a
`null_resource`. We don't know which one might cause an issue (or indeed
whether either would), so we just track that a replacement happened.
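
A rough sketch of that counting rule, under the assumption that replacements arrive as a map from resource address to changed paths: the counter goes up by one per workspace build that has any replacements, regardless of how many resources were replaced. `replacementTracker` is a hypothetical stand-in for the real metric and uses a plain atomic counter rather than a metrics library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// replacementTracker is a hypothetical stand-in for the
// coderd_prebuilt_workspaces_resource_replacements_total metric.
type replacementTracker struct {
	total atomic.Int64
}

// observe increments the counter once if the build had at least one
// replacement, and reports whether it did. The number of replaced
// resources is deliberately ignored.
func (t *replacementTracker) observe(replacements map[string][]string) bool {
	if len(replacements) == 0 {
		return false
	}
	t.total.Add(1)
	return true
}

func main() {
	var tracker replacementTracker

	// A claim with no drift does not touch the counter.
	tracker.observe(nil)

	// A claim with two replaced resources still counts once.
	tracker.observe(map[string][]string{
		"docker_container.workspace": {"env"},
		"null_resource.init":         {"triggers"},
	})

	fmt.Println("builds with replacements:", tracker.total.Load())
}
```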

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Danny Kopping
2025-05-14 14:52:22 +02:00
committed by GitHub
parent e75d1c1ce5
commit 6e967780c9
33 changed files with 2048 additions and 969 deletions

@@ -0,0 +1 @@
DELETE FROM notification_templates WHERE id = '89d9745a-816e-4695-a17f-3d0a229e2b8d';

@@ -0,0 +1,34 @@
INSERT INTO notification_templates
(id, name, title_template, body_template, "group", actions)
VALUES ('89d9745a-816e-4695-a17f-3d0a229e2b8d',
'Prebuilt Workspace Resource Replaced',
E'There might be a problem with a recently claimed prebuilt workspace',
$$
Workspace **{{.Labels.workspace}}** was claimed from a prebuilt workspace by **{{.Labels.claimant}}**.
During the claim, Terraform destroyed and recreated the following resources
because one or more immutable attributes changed:
{{range $resource, $paths := .Data.replacements -}}
- _{{ $resource }}_ was replaced due to changes to _{{ $paths }}_
{{end}}
When Terraform must change an immutable attribute, it replaces the entire resource.
If you're using prebuilds to speed up provisioning, unexpected replacements will slow down
workspace startup, even when claiming a prebuilt environment.
For tips on preventing replacements and improving claim performance, see [this guide](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#preventing-resource-replacement).
NOTE: this prebuilt workspace used the **{{.Labels.preset}}** preset.
$$,
'Template Events',
'[
{
"label": "View workspace build",
"url": "{{base_url}}/@{{.Labels.claimant}}/{{.Labels.workspace}}/builds/{{.Labels.workspace_build_num}}"
},
{
"label": "View template version",
"url": "{{base_url}}/templates/{{.Labels.org}}/{{.Labels.template}}/versions/{{.Labels.template_version}}"
}
]'::jsonb);
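
The `range` clause in the body template above uses Go `text/template` syntax. The sketch below renders just that fragment to show how the replacement list expands. The `payload` type and `renderReplacements` function are hypothetical, and the assumption that each value under `replacements` is a single string of changed paths is mine, not something this migration confirms.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// listTemplate is the range fragment copied from the notification body.
const listTemplate = `{{range $resource, $paths := .Data.replacements -}}
- _{{ $resource }}_ was replaced due to changes to _{{ $paths }}_
{{end}}`

// payload is a hypothetical shape for the data handed to the template.
type payload struct {
	Labels map[string]string
	Data   map[string]any
}

// renderReplacements (hypothetical helper) executes the fragment against p.
func renderReplacements(p payload) (string, error) {
	tmpl, err := template.New("replacements").Parse(listTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderReplacements(payload{
		Data: map[string]any{
			// Assumed value type: one string of changed paths per resource.
			"replacements": map[string]string{
				"docker_container.workspace": "env",
				"null_resource.init":         "triggers",
			},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

`text/template` visits map keys in sorted order, so the rendered list is stable across runs.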