This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.
Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.
---------
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Closes https://github.com/coder/internal/issues/510
<details>
<summary> Refactoring Summary </summary>
### 1) `CalculateActions` Function
#### Issues Before Refactoring:
- Large function (~150 lines), making it difficult to read and maintain.
- The control flow is hard to follow due to complex conditional logic.
- The `ReconciliationActions` struct was partially initialized early,
then mutated in multiple places, making the flow error-prone.
Original source:
fe60b569ad/coderd/prebuilds/state.go (L13-L167)
#### Improvements After Refactoring:
- Simplified and broken down into smaller, focused helper methods.
- The flow of the function is now more linear and easier to understand.
- Struct initialization is cleaner, avoiding partial and incremental
mutations.
Refactored function:
eeb0407d78/coderd/prebuilds/state.go (L67-L84)
---
### 2) `ReconciliationActions` Struct
#### Issues Before Refactoring:
- The struct mixed both actionable decisions and diagnostic state, which
blurred its purpose.
- It was unclear which fields were necessary for reconciliation logic,
and which were purely for logging/observability.
#### Improvements After Refactoring:
- Split into two clear, purpose-specific structs:
- **`ReconciliationActions`** — defines the intended reconciliation
action.
- **`ReconciliationState`** — captures runtime state and metadata,
primarily for logging and diagnostics.
Original struct:
fe60b569ad/coderd/prebuilds/reconcile.go (L29-L41)
</details>
---------
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Dean Sheather <dean@deansheather.com>
Co-authored-by: Spike Curtis <spike@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
- Removes displaying XRay scan results in the dashboard. I'm not sure
anyone was even using this integration so it's just debt for us to
maintain. We can open up a separate issue to get rid of the db tables
once we know for sure that we haven't broken anyone.
This does ~95% of the backend work required to integrate the AI work.
Most left to integrate from the tasks branch is just frontend, which
will be a lot smaller I believe.
The real difference between this branch and that one is the abstraction
-- this now attaches statuses to apps, and returns the latest status
reported as part of a workspace.
This change enables us to have a similar UX to in the tasks branch, but
for agents other than Claude Code as well. Any app can report status
now.
* Adds `codersdk.ExperimentWebPush` (`web-push`)
* Adds a `coderd/webpush` package that allows sending native push
notifications via `github.com/SherClockHolmes/webpush-go`
* Adds database tables to store push notification subscriptions.
* Adds an API endpoint that allows users to subscribe/unsubscribe, and
send a test notification (404 without experiment, excluded from API docs)
* Adds server CLI command to regenerate VAPID keys (note: regenerating
the VAPID keypair requires deleting all existing subscriptions)
---------
Co-authored-by: Kyle Carberry <kyle@carberry.com>
Closes
[coder/internal#477](https://github.com/coder/internal/issues/477)

I'm solving this issue in two parts:
1. Updated the postgres function so that it doesn't omit 0 values in the
error
2. Created a new query to fetch the number of resources associated with
an organization and using that information to provider a cleaner error
message to the frontend
> **_NOTE:_** SQL is not my strong suit, and the code was created with
the help of AI. So I'd take extra time looking over what I wrote there
Pre-requisite for https://github.com/coder/coder/pull/16891
Closes https://github.com/coder/internal/issues/515
This PR introduces a new concept of a "system" user.
Our data model requires that all workspaces have an owner (a `users`
relation), and prebuilds is a feature that will spin up workspaces to be
claimed later by actual users - and thus needs to own the workspaces in
the interim.
Naturally, introducing a change like this touches a few aspects around
the codebase and we've taken the approach _default hidden_ here; in
other words, queries for users will by default _exclude_ all system
users, but there is a flag to ensure they can be displayed. This keeps
the changeset relatively small.
This user has minimal permissions (it's equivalent to a `member` since
it has no roles). It will be associated with the default org in the
initial migration, and thereafter we'll need to somehow ensure its
membership aligns with templates (which are org-scoped) for which it'll
need to provision prebuilds; that's a solution we'll have in a
subsequent PR.
---------
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
This change allows specifying devcontainers in terraform and plumbs it
through to the agent via agent manifest.
This will be used for autostarting devcontainers in a workspace.
Depends on coder/terraform-provider-coder#368
Updates #16423
[Resolve this issue](https://github.com/coder/internal/issues/506)
Add a mark-all-as-read endpoint which is marking as read all
notifications that are not read for the authenticated user.
Also adds the DB logic.
This change adds support for workspace app auditing.
To avoid audit log spam, we introduce the concept of app audit sessions.
An audit session is unique per workspace app, user, ip, user agent and
http status code. The sessions are stored in a separate table from audit
logs to allow use-case specific optimizations. Sessions are ephemeral
and the table does not function as a log.
The logic for auditing is placed in the DBTokenProvider for workspace
apps so that wsproxies are included.
This is the final change affecting the API fo #15139.
Updates #15139
Addresses https://github.com/coder/nexus/issues/195. Specifically, just
the "tracking templates" requirement:
> ## Tracking in templates
> To enable resource alerts, a user must add the resource_monitoring
block to a template's coder_agent resource. We'd like to track if
customers have any resource monitoring enabled on a per-deployment
basis. Even better, we could identify which templates are using resource
monitoring.
Third and final PR to address
https://github.com/coder/coder/issues/16230.
This PR enables GitHub OAuth2 login by default on new deployments.
Combined with https://github.com/coder/coder/pull/16629, this will allow
the first admin user to sign up with GitHub rather than email and
password.
We take care not to enable the default on deployments that would upgrade
to a Coder version with this change.
To disable the default provider an admin can set the
`CODER_OAUTH2_GITHUB_DEFAULT_PROVIDER` env variable to false.
- Add deleted column to organizations table
- Add trigger to check for existing workspaces, templates, groups and
members in a org before allowing the soft delete
---------
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
This change adds provisioner daemon ID filter to the provisioner daemons
endpoint, and also implements the limiting to 50 results.
Test coverage is greatly improved and template information for jobs
associated to the daemon was also fixed.
Updates #15084
Updates #15192
Related #16532
This pull requests adds the necessary migrations and queries to support
presets within the coderd database. Future PRs will build functionality
to the provisioners and the frontend.
As requested for [this
issue](https://github.com/coder/internal/issues/245) we need to have a
new resource `resources_monitoring` in the agent.
It needs to be parsed from the provisioner and inserted into a new db
table.
Addresses https://github.com/coder/nexus/issues/175.
## Changes
- Adds the `telemetry_items` database table. It's a key value store for
telemetry events that don't fit any other database tables.
- Adds a telemetry report when HTML is served for the first time in
`site.go`.
Replace Depot build action with Nix for Nix dogfood image builds
The dogfood Nix image is now built using Nix's native container tooling instead of Depot. This change:
- Adds Nix setup steps to the GitHub Actions workflow
- Removes the Dockerfile.nix in favor of a Nix-native container build
- Updates the flake.nix to support building Docker images
- Introduces a hash file to track Nix-related changes
- Updates the vendorHash for Go dependencies
Change-Id: I4e011fe3a19d9a1375fbfd5223c910e59d66a5d9
Signed-off-by: Thomas Kosiewski <tk@coder.com>
RE: https://github.com/coder/coder/issues/15740,
https://github.com/coder/coder/issues/15297
In order to add a graph to the coder frontend to show user status over
time as an indicator of license usage, this PR adds the following:
* a new `api.insightsUserStatusCountsOverTime` endpoint to the API
* which calls a new `GetUserStatusCountsOverTime` query from postgres
* which relies on two new tables `user_status_changes` and
`user_deleted`
* which are populated by a new trigger and function that tracks updates
to the users table
The chart itself will be added in a subsequent PR
---------
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Another PR to address https://github.com/coder/coder/issues/15109.
- adds the DisableForeignKeysAndTriggers utility, which simplifies
converting tests from in-mem to postgres
- converts the dbauthz test suite to pass on both the in-mem db and
Postgres
Relates to https://github.com/coder/coder/issues/15390
Currently when a user creates a workspace, their workspace's TTL is
determined by the template's default TTL. If the Coder instance is AGPL,
or if the template has disallowed the user from configuring autostop,
then it is not possible to change the workspace's TTL after creation.
Any changes to the template's default TTL only takes effect on _new_
workspaces.
This PR modifies the behaviour slightly so that on AGPL Coder, or on
enterprise when a template does not allow user's to configure their
workspace's TTL, updating the template's default TTL will also update
any workspace's TTL to match this value.
https://github.com/coder/coder/pull/15608 introduced a buggy behaviour
with dbcrypt enabled.
When clearing an oauth refresh token, we had been setting the value to
the empty string.
The database encryption package considers decrypting an empty string to
be an error, as an empty encrypted string value will still have a nonce
associated with it and thus not actually be empty when stored at rest.
Instead of 'deleting' the refresh token, 'update' it to be the empty
string.
This plays nicely with dbcrypt.
It also adds a 'utility test' in the dbcrypt package to help encrypt a
value. This was useful when manually fixing users affected by this bug
on our dogfood instance.
Relates to https://github.com/coder/coder/issues/15082
Further to https://github.com/coder/coder/pull/15429, this reduces the
amount of false-positives returned by the 'is eligible for autostart'
part of the query. We achieve this by calculating the 'next start at'
time of the workspace, storing it in the database, and using it in our
`GetWorkspacesEligibleForTransition` query.
The prior implementation of the 'is eligible for autostart' query would
return _all_ workspaces that at some point in the future _might_ be
eligible for autostart. This now ensures we only return workspaces that
_should_ be eligible for autostart.
We also now pass `currentTick` instead of `t` to the
`GetWorkspacesEligibleForTransition` query as otherwise we'll have one
round of workspaces that are skipped by `isEligibleForTransition` due to
`currentTick` being a truncated version of `t`.
Once a token refresh fails, we remove the `oauth_refresh_token` from the
database. This will prevent the token from hitting the IDP for
subsequent refresh attempts.
Without this change, a bad script can cause a failing token to hit a
remote IDP repeatedly with each `git` operation. With this change, after
the first hit, subsequent hits will fail locally, and never contact the
IDP.
The solution in both cases is to authenticate the external auth link. So
the resolution is the same as before.
Adds an api endpoint to grab all available sync field options for IDP
sync. This is for autocomplete on idp sync forms. This is required for
organization admins to have some insight into the claim fields available
when configuring group/role sync.
Addresses https://github.com/coder/nexus/issues/35.
This PR:
- Adds a `workspace_modules` table to track modules used by the
Terraform provisioner in provisioner jobs.
- Adds a `module_path` column to the `workspace_resources` table,
allowing to identify which module a resource originates from.
- Starts pushing this new information into telemetry.
For the person reviewing this PR, do not fret about the 1,500 new lines
- ~1,000 of them are auto-generated.
Relates to https://github.com/coder/coder/issues/15082
The old implementation of `GetWorkspacesEligibleForTransition` returns
many workspaces that are not actually eligible for transition. This new
implementation reduces this number significantly (at least on our
dogfood instance).
Second PR for #14716.
Adds a query that, given a user ID, returns all the workspaces they own, that can also be `ActionRead` by the requesting user.
```
type GetWorkspacesAndAgentsByOwnerIDRow struct {
WorkspaceID uuid.UUID `db:"workspace_id" json:"workspace_id"`
WorkspaceName string `db:"workspace_name" json:"workspace_name"`
JobStatus ProvisionerJobStatus `db:"job_status" json:"job_status"`
Transition WorkspaceTransition `db:"transition" json:"transition"`
Agents []AgentIDNamePair `db:"agents" json:"agents"`
}
```
`JobStatus` and `Transition` are set using the latest build/job of the workspace. Deleted workspaces are not included.
Joins in fields like `username`, `avatar_url`, `organization_name`,
`template_name` to `workspaces` via a **view**.
The view must be maintained moving forward, but this prevents needing to
add RBAC permissions to fetch related workspace fields.
Relates to https://github.com/coder/coder/issues/14232
This implements two endpoints (names subject to change):
- `/api/v2/users/otp/request`
- `/api/v2/users/otp/change-password`
* feat: begin impl of agent script timings
* feat: add job_id and display_name to script timings
* fix: increment migration number
* fix: rename migrations from 251 to 254
* test: get tests compiling
* fix: appease the linter
* fix: get tests passing again
* fix: drop column from correct table
* test: add fixture for agent script timings
* fix: typo
* fix: use job id used in provisioner job timings
* fix: increment migration number
* test: behaviour of script runner
* test: rewrite test
* test: does exit 1 script break things?
* test: rewrite test again
* fix: revert change
Not sure how this came to be, I do not recall manually changing
these files.
* fix: let code breathe
* fix: wrap errors
* fix: justify nolint
* fix: swap require.Equal argument order
* fix: add mutex operations
* feat: add 'ran_on_start' and 'blocked_login' fields
* fix: update testdata fixture
* fix: refer to agent_id instead of job_id in timings
* fix: JobID -> AgentID in dbauthz_test
* fix: add 'id' to scripts, make timing refer to script id
* fix: fix broken tests and convert bug
* fix: update testdata fixtures
* fix: update testdata fixtures again
* feat: capture stage and if script timed out
* fix: update migration number
* test: add test for script api
* fix: fake db query
* fix: use UTC time
* fix: ensure r.scriptComplete is not nil
* fix: move err check to right after call
* fix: uppercase sql
* fix: use dbtime.Now()
* fix: debug log on r.scriptCompleted being nil
* fix: ensure correct rbac permissions
* chore: remove DisplayName
* fix: get tests passing
* fix: remove space in sql up
* docs: document ExecuteOption
* fix: drop 'RETURNING' from sql
* chore: remove 'display_name' from timing table
* fix: testdata fixture
* fix: put r.scriptCompleted call in goroutine
* fix: track goroutine for test + use separate context for reporting
* fix: appease linter, handle trackCommandGoroutine error
* fix: resolve race condition
* feat: replace timed_out column with status column
* test: update testdata fixture
* fix: apply suggestions from review
* revert: linter changes