fix: reduce cost of prebuild failure (#17697)

Relates to https://github.com/coder/coder/issues/17432

### Part 1

Notes:
- A `GetPresetsAtFailureLimit` SQL query is added. It is similar to
`GetPresetsBackoff` and uses the same CTEs (`filtered_builds`,
`time_sorted_builds`), but the two queries are still different.

- The query is executed on every loop iteration. As an optimization, we
could mark a specific preset as permanently failed to avoid re-running
the query on every iteration, but I decided not to do that for now.

- By default `FailureHardLimit` is set to 3.

- `FailureHardLimit` is configurable. Setting it to zero disables the
hard limit.
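
The hard-limit rule from the notes above can be sketched in Go roughly as follows. This is only an illustration, not the code from this PR: `presetFailures`, `presetsAtFailureLimit`, and the in-memory counting are hypothetical stand-ins for what `GetPresetsAtFailureLimit` does in SQL. Only the default of 3 and the "zero disables the limit" behavior come from the notes.

```go
package main

// defaultFailureHardLimit mirrors the default value mentioned in the notes.
const defaultFailureHardLimit = 3

// presetFailures is a hypothetical in-memory stand-in for the per-preset
// failure counts that the real SQL query derives from build history.
type presetFailures struct {
	PresetName string
	Failed     int // number of failed prebuild attempts (assumed semantics)
}

// presetsAtFailureLimit returns the names of presets whose failure count has
// reached hardLimit. A hardLimit of zero (or less) disables the check.
func presetsAtFailureLimit(presets []presetFailures, hardLimit int) []string {
	if hardLimit <= 0 {
		return nil // hard limit disabled
	}
	var limited []string
	for _, p := range presets {
		if p.Failed >= hardLimit {
			limited = append(limited, p.PresetName)
		}
	}
	return limited
}
```

Calling `presetsAtFailureLimit(presets, 0)` always returns `nil` in this sketch, which is how the "disabled" setting is modeled here.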

### Part 2

Notes:
- A `PrebuildFailureLimitReached` notification is added.
- The notification is sent to template admins.
- The notification is sent only the first time the hard limit is reached,
but a `log.Warn` is emitted on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
  'normal',           -- Prebuilds are working as expected; this is the default, healthy state.
  'hard_limited',     -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
  'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` is not used in this PR, but I expect it to be used in
the next one, so I wanted to save us an extra migration.

- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>
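
The "notify once, warn every iteration" behavior described in the notes above could be sketched in Go as follows. This is a simplified assumption about how the status can be used to deduplicate notifications, not the implementation from this PR: the `preset` struct, `handleHardLimit`, and the `warn`/`notify` callbacks are hypothetical names. Only the three status values come from the `prebuild_status` enum.

```go
package main

// PrebuildStatus mirrors the values of the prebuild_status enum.
type PrebuildStatus string

const (
	PrebuildStatusNormal           PrebuildStatus = "normal"
	PrebuildStatusHardLimited      PrebuildStatus = "hard_limited"
	PrebuildStatusValidationFailed PrebuildStatus = "validation_failed"
)

// preset is a hypothetical, simplified view of a preset's stored state.
type preset struct {
	Name   string
	Status PrebuildStatus
}

// handleHardLimit is meant to be called on each loop iteration for a preset
// that is at the failure limit. It always emits a warning, but it notifies
// only when the preset transitions into the hard_limited status.
func handleHardLimit(p *preset, warn func(msg string), notify func(presetName string)) {
	warn("preset is hard limited: " + p.Name)
	if p.Status == PrebuildStatusHardLimited {
		return // already marked, so the notification was sent earlier
	}
	p.Status = PrebuildStatusHardLimited
	notify(p.Name)
}
```

In this sketch the stored status doubles as the "already notified" flag, so repeated iterations produce repeated warnings but a single notification.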

### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
Author: Yevhenii Shcherbina
Date: 2025-05-21 15:16:38 -04:00
Committed by: GitHub
Parent: e1934fe119
Commit: 53e8e9c7cd
32 changed files with 1160 additions and 60 deletions

@@ -0,0 +1,112 @@
From: system@coder.com
To: bobby@coder.com
Subject: There is a problem creating prebuilt workspaces
Message-Id: 02ee4935-73be-4fa1-a290-ff9999026b13@blush-whale-48
Date: Fri, 11 Oct 2024 09:03:06 +0000
Content-Type: multipart/alternative; boundary=bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4
MIME-Version: 1.0
--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8
Hi Bobby,
The number of failed prebuild attempts has reached the hard limit for templ=
ate docker and preset particle-accelerator.
To resume prebuilds, fix the underlying issue and upload a new template ver=
sion.
Refer to the documentation for more details:
Troubleshooting templates (https://coder.com/docs/admin/templates/troublesh=
ooting)
Troubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templa=
tes/extending-templates/prebuilt-workspaces#administration-and-troubleshoot=
ing)
View failed prebuilt workspaces: http://test.com/workspaces?filter=3Downer:=
prebuilds+status:failed+template:docker
View template version: http://test.com/templates/cern/docker/versions/angry=
_torvalds
--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=UTF-8
<!doctype html>
<html lang=3D"en">
<head>
<meta charset=3D"UTF-8" />
<meta name=3D"viewport" content=3D"width=3Ddevice-width, initial-scale=
=3D1.0" />
<title>There is a problem creating prebuilt workspaces</title>
</head>
<body style=3D"margin: 0; padding: 0; font-family: -apple-system, system-=
ui, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Oxygen', 'Ubuntu', 'Cantarel=
l', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif; color: #020617=
; background: #f8fafc;">
<div style=3D"max-width: 600px; margin: 20px auto; padding: 60px; borde=
r: 1px solid #e2e8f0; border-radius: 8px; background-color: #fff; text-alig=
n: left; font-size: 14px; line-height: 1.5;">
<div style=3D"text-align: center;">
<img src=3D"https://coder.com/coder-logo-horizontal.png" alt=3D"Cod=
er Logo" style=3D"height: 40px;" />
</div>
<h1 style=3D"text-align: center; font-size: 24px; font-weight: 400; m=
argin: 8px 0 32px; line-height: 1.5;">
There is a problem creating prebuilt workspaces
</h1>
<div style=3D"line-height: 1.5;">
<p>Hi Bobby,</p>
<p>The number of failed prebuild attempts has reached the hard limi=
t for template <strong>docker</strong> and preset <strong>particle-accelera=
tor</strong>.</p>
<p>To resume prebuilds, fix the underlying issue and upload a new template =
version.</p>
<p>Refer to the documentation for more details:<br>
- <a href=3D"https://coder.com/docs/admin/templates/troubleshooting">Troubl=
eshooting templates</a><br>
- <a href=3D"https://coder.com/docs/admin/templates/extending-templates/pre=
built-workspaces#administration-and-troubleshooting">Troubleshooting of pre=
built workspaces</a></p>
</div>
<div style=3D"text-align: center; margin-top: 32px;">
=20
<a href=3D"http://test.com/workspaces?filter=3Downer:prebuilds+stat=
us:failed+template:docker" style=3D"display: inline-block; padding: 13px 24=
px; background-color: #020617; color: #f8fafc; text-decoration: none; borde=
r-radius: 8px; margin: 0 4px;">
View failed prebuilt workspaces
</a>
=20
<a href=3D"http://test.com/templates/cern/docker/versions/angry_tor=
valds" style=3D"display: inline-block; padding: 13px 24px; background-color=
: #020617; color: #f8fafc; text-decoration: none; border-radius: 8px; margi=
n: 0 4px;">
View template version
</a>
=20
</div>
<div style=3D"border-top: 1px solid #e2e8f0; color: #475569; font-siz=
e: 12px; margin-top: 64px; padding-top: 24px; line-height: 1.6;">
<p>&copy;&nbsp;2024&nbsp;Coder. All rights reserved&nbsp;-&nbsp;<a =
href=3D"http://test.com" style=3D"color: #2563eb; text-decoration: none;">h=
ttp://test.com</a></p>
<p><a href=3D"http://test.com/settings/notifications" style=3D"colo=
r: #2563eb; text-decoration: none;">Click here to manage your notification =
settings</a></p>
<p><a href=3D"http://test.com/settings/notifications?disabled=3D414=
d9331-c1fc-4761-b40c-d1f4702279eb" style=3D"color: #2563eb; text-decoration=
: none;">Stop receiving emails like this</a></p>
</div>
</div>
</body>
</html>
--bbe61b741255b6098bb6b3c1f41b885773df633cb18d2a3002b68e4bc9c4--

@@ -0,0 +1,35 @@
{
  "_version": "1.1",
  "msg_id": "00000000-0000-0000-0000-000000000000",
  "payload": {
    "_version": "1.2",
    "notification_name": "Prebuild Failure Limit Reached",
    "notification_template_id": "00000000-0000-0000-0000-000000000000",
    "user_id": "00000000-0000-0000-0000-000000000000",
    "user_email": "bobby@coder.com",
    "user_name": "Bobby",
    "user_username": "bobby",
    "actions": [
      {
        "label": "View failed prebuilt workspaces",
        "url": "http://test.com/workspaces?filter=owner:prebuilds+status:failed+template:docker"
      },
      {
        "label": "View template version",
        "url": "http://test.com/templates/cern/docker/versions/angry_torvalds"
      }
    ],
    "labels": {
      "org": "cern",
      "preset": "particle-accelerator",
      "template": "docker",
      "template_version": "angry_torvalds"
    },
    "data": {},
    "targets": null
  },
  "title": "There is a problem creating prebuilt workspaces",
  "title_markdown": "There is a problem creating prebuilt workspaces",
  "body": "The number of failed prebuild attempts has reached the hard limit for template docker and preset particle-accelerator.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n\nTroubleshooting templates (https://coder.com/docs/admin/templates/troubleshooting)\nTroubleshooting of prebuilt workspaces (https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)",
  "body_markdown": "\nThe number of failed prebuild attempts has reached the hard limit for template **docker** and preset **particle-accelerator**.\n\nTo resume prebuilds, fix the underlying issue and upload a new template version.\n\nRefer to the documentation for more details:\n- [Troubleshooting templates](https://coder.com/docs/admin/templates/troubleshooting)\n- [Troubleshooting of prebuilt workspaces](https://coder.com/docs/admin/templates/extending-templates/prebuilt-workspaces#administration-and-troubleshooting)\n"
}
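
For readers who consume this JSON document, a minimal Go sketch of decoding it might look like the following. The type and function names (`notificationMessage`, `notificationPayload`, `notificationAction`, `decodeNotificationMessage`) are hypothetical and chosen only for illustration. The field names and JSON keys are taken from the document above, and only a subset of the keys is mapped.

```go
package main

import "encoding/json"

// notificationAction maps one entry of the "actions" array.
type notificationAction struct {
	Label string `json:"label"`
	URL   string `json:"url"`
}

// notificationPayload maps a subset of the "payload" object.
type notificationPayload struct {
	NotificationName string               `json:"notification_name"`
	UserEmail        string               `json:"user_email"`
	Actions          []notificationAction `json:"actions"`
	Labels           map[string]string    `json:"labels"`
}

// notificationMessage maps a subset of the top-level keys.
type notificationMessage struct {
	Version string              `json:"_version"`
	MsgID   string              `json:"msg_id"`
	Payload notificationPayload `json:"payload"`
	Title   string              `json:"title"`
	Body    string              `json:"body"`
}

// decodeNotificationMessage parses raw JSON into a notificationMessage.
// Keys that are not mapped above are ignored by encoding/json.
func decodeNotificationMessage(raw []byte) (notificationMessage, error) {
	var msg notificationMessage
	err := json.Unmarshal(raw, &msg)
	return msg, err
}
```

A consumer could then read `msg.Payload.Labels["preset"]` and `msg.Payload.Labels["template"]` to identify which preset and template reached the limit.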