Relates to https://github.com/coder/coder/issues/17432
### Part 1:
Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.
- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.
- By default `FailureHardLimit` is set to 3.
- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.
### Part 2
Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
'normal', -- Prebuilds are working as expected; this is the default, healthy state.
'hard_limited', -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.
- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>
### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
Closes https://github.com/coder/internal/issues/369
We can't know whether a replacement (i.e. drift of terraform state
leading to a resource needing to be deleted/recreated) will take place
apriori; we can only detect it at `plan` time, because the provider
decides whether a resource must be replaced and it cannot be inferred
through static analysis of the template.
**This is likely to be the most common gotcha with using prebuilds,
since it requires a slight template modification to use prebuilds
effectively**, so let's head this off before it's an issue for
customers.
Drift details will now be logged in the workspace build logs:

Plus a notification will be sent to template admins when this situation
arises:

A new metric - `coderd_prebuilt_workspaces_resource_replacements_total`
- will also increment each time a workspace encounters replacements.
We only track _that_ a resource replacement occurred, not how many. Just
one is enough to ruin a prebuild, but we can't know apriori which
replacement would cause this.
For example, say we have 2 replacements: a `docker_container` and a
`null_resource`; we don't know which one might
cause an issue (or indeed if either would), so we just track the
replacement.
---------
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
We do not want password reset notifications to end up in Coder Inbox as
this doesn't make much sense. This implements the logic to ensure they
are not delivered if the method is Coder Inbox.
In the future we might want to investigate a better solution but for now
this works.
Related to #17082
Some notifications ( workspace created and workspace manually updated )
are using wrong variables to build the Action URL. Fixing it.
Updating golden files is an unnecessary extra step in addition to gen
that is easily overlooked, leading to the developer noticing the issue
in CI leading to lost developer time waiting for tests to complete.
Currently the `targets` column in `inbox_notifications` doesn't get
filled. This PR fixes that. Rather than give targets special treatment,
we should put it in the payload like everything else. This correctly
propagates notification targets to the inbox table without much code
change.
This PR is part of the inbox notifications topic, and rely on previous
PRs merged - it adds :
- Endpoints to :
- WS : watch new inbox notifications
- REST : list inbox notifications
- REST : update the read status of a notification
Also, this PR acts as a follow-up PR from previous work and :
- fix DB query issues
- fix DBMem logic to match DB
This PR is [resolving the dispatch part of Coder
Inbocx](https://github.com/coder/internal/issues/403).
Since the DB layer has been merged - we now want to insert notifications
into Coder Inbox in parallel of the other delivery target.
To do so, we push two messages instead of one using the `Enqueue`
method.
Change as part of https://github.com/coder/coder/pull/16071
It has been decided that we want to be able to have some notification
templates be disabled _by default_
https://github.com/coder/coder/pull/16071#issuecomment-2580757061.
This adds a new column (`enabled_by_default`) to
`notification_templates` that defaults to `TRUE`. It also modifies the
`inhibit_enqueue_if_disabled` function to reject notifications for
templates that have `enabled_by_default = FALSE` with the user not
explicitly enabling it.
- Adds `testutil.GoleakOptions` and consolidates existing options to
this location
- Pre-emptively adds required ignore for this Dependabot PR to pass CI
https://github.com/coder/coder/pull/16066
Relates to https://github.com/coder/coder/issues/15845
When the `/workspace/<name>/builds` endpoint is hit, we check if the
requested template version is different to the previously used template
version. If these values differ, we can assume that the workspace has
been manually updated and send the appropriate notification. Automatic
updates happen in the lifecycle executor and bypasses this endpoint
entirely.
Resolves https://github.com/coder/coder/issues/15513
Disables notifications when both `$CODER_NOTIFICATIONS_WEBHOOK_ENDPOINT` and `$CODER_EMAIL_SMARTHOST` are unset.
Breaking change: `$CODER_EMAIL_SMARTHOST` is no longer set by default as `localhost:587`, meaning any deployments that make use of this default value will need to add it back.
---------
Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.
Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
Closes https://github.com/coder/coder/issues/15213
This PR enables sending notifications without requiring the auth system
context, instead using a new auth notifier context.
This PR aims to close#14253
We keep the default behavior using the Coder logo if there's no logo
set.
Otherwise we want to use the logo based on the URL set in appearance.
---------
Co-authored-by: defelmnq <yvincent@coder.com>
In investigating https://github.com/coder/internal/issues/109 I noticed many of the notification tests are still using `time.Sleep` and `require.Eventually`. This is an initial effort to start converting these to Quartz.
One product change is to switch the `notifier` to use a `TickerFunc` instead of a normal Ticker, since it allows the test to assert that a batch process is complete via the Quartz `Mock` clock. This does introduce one slight behavioral change in that the notifier waits the fetch interval before processing its first batch. In practice, this is inconsequential: no one will notice if we send notifications immediately on startup, or just a little later.
But, it does make a difference to some tests, which are fixed up here.
A bunch of notification tests create a whole `coderd`, when all they use is the database and logger. This makes the tests more expensive to run, and pollutes the test logs with a bunch of stuff that doesn't matter (e.g. tailnet).
This PR closes#15065.
As advised by @mtojek, a template's display name may be set to "", which
is not useful in an email notification. We'd like to provide a friendly
name for the template, but it also needs to be identifiable.
As such, we fall back to template.Name in the case that the template's
display name is empty.
This Pull request addresses the more trivial items in
https://github.com/coder/coder/issues/14893.
These were simple formatting changes that I was able to fix despite
limited context.
Some more changes are required for which I will have to dig a bit deeper
into how the template contexts are populated. I'm happy to add those to
this PR or create a subsequent PR.
Relates to https://github.com/coder/coder/issues/14232
This implements two endpoints (names subject to change):
- `/api/v2/users/otp/request`
- `/api/v2/users/otp/change-password`