Commit Graph

2517 Commits

Author SHA1 Message Date
c2bc801f83 chore: add 'classic_parameter_flow' column setting to templates (#17828)
We are forcing users to try the dynamic parameter experience first.
Currently this setting only comes into effect if an experiment is
enabled.
2025-05-15 17:55:17 -05:00
1bacd82e80 feat: add API key scope to restrict access to user data (#17692) 2025-05-15 15:32:52 +01:00
2aa8cbebd7 fix: exclude deleted templates from metrics collection (#17839)
Also add some clarification about the lack of database constraints for
soft template deletion.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-15 13:33:58 +02:00
f2edcf3f59 fix: add missing clause for tracking replacements (#17849)
We should only be tracking resource replacements during a prebuild
claim.

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-15 13:02:30 +02:00
789c4beba7 chore: add dynamic parameter error if missing metadata from provisioner (#17809) 2025-05-14 12:21:36 -05:00
f3bcac2e90 refactor: improve overlayFS errors (#17808) 2025-05-14 10:26:47 -06:00
6e967780c9 feat: track resource replacements when claiming a prebuilt workspace (#17571)
Closes https://github.com/coder/internal/issues/369

We can't know whether a replacement (i.e. drift of terraform state
leading to a resource needing to be deleted/recreated) will take place
apriori; we can only detect it at `plan` time, because the provider
decides whether a resource must be replaced and it cannot be inferred
through static analysis of the template.

**This is likely to be the most common gotcha with using prebuilds,
since it requires a slight template modification to use prebuilds
effectively**, so let's head this off before it's an issue for
customers.

Drift details will now be logged in the workspace build logs:


![image](https://github.com/user-attachments/assets/da1988b6-2cbe-4a79-a3c5-ea29891f3d6f)

Plus a notification will be sent to template admins when this situation
arises:


![image](https://github.com/user-attachments/assets/39d555b1-a262-4a3e-b529-03b9f23bf66a)

A new metric - `coderd_prebuilt_workspaces_resource_replacements_total`
- will also increment each time a workspace encounters replacements.

We only track _that_ a resource replacement occurred, not how many. Just
one is enough to ruin a prebuild, but we can't know apriori which
replacement would cause this.
For example, say we have 2 replacements: a `docker_container` and a
`null_resource`; we don't know which one might
cause an issue (or indeed if either would), so we just track the
replacement.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:52:22 +02:00
425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
60762d4c13 feat: load terraform modules when using dynamic parameters (#17714) 2025-05-13 16:07:29 -05:00
ef745c0c5d chore: optimize workspace_latest_builds view query (#17789)
Avoids two sequential scans of massive tables (`workspace_builds`,
`provisioner_jobs`) and uses index scans instead. This new view largely
replicates our already optimized query `GetWorkspaces` to fetch the
latest build.

The original query and the new query were compared against the dogfood
database to ensure they return the exact same data in the exact same
order (minus the new `workspaces.deleted = false` filter to improve
performance even more). The performance is massively improved even
without the `workspaces.deleted = false` filter, but it was added to
improve it even more.

Note: these query times are probably inflated due to high database load
on our dogfood environment that this intends to partially resolve.

Before: 2,139ms
([explain](https://explain.dalibo.com/plan/997e4fch241b46e6))

After: 33ms
([explain](https://explain.dalibo.com/plan/c888dc223870f181))

Co-authored-by: Cian Johnston <cian@coder.com>

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-13 20:51:01 +02:00
64807e1d61 chore: apply the 4mb max limit on drpc protocol message size (#17771)
Respect the 4mb max limit on proto messages
2025-05-13 11:24:51 -05:00
b0788f410f chore: rename "Test Notification" to "Troubleshooting Notification" (#17790)
Rename the "Test Notification" to "Troubleshooting Notification"
2025-05-13 13:52:55 +01:00
599bb35a04 fix(coderd): list templates returns non-deprecated templates by default (#17747)
## Description

Modifies the behaviour of the "list templates" API endpoints to return
non-deprecated templates by default. Users can still query for
deprecated templates by specifying the `deprecated=true` query
parameter.

**Note:** The deprecation feature is an enterprise-level feature

## Affected Endpoints
* /api/v2/organizations/{organization}/templates
* /api/v2/templates

Fixes #17565
2025-05-13 12:44:46 +01:00
0b5f27f566 feat: add parent_id column to workspace_agents table (#17758)
Adds a new nullable column `parent_id` to `workspace_agents` table. This
lays the groundwork for having child agents.
2025-05-13 00:01:31 +01:00
398b999d8f chore: pass previous values into terraform apply (#17696)
Pass previous workspace build parameter values into the terraform
`plan/apply`. Enforces monotonicity in terraform as well as `coderd`.
2025-05-12 15:32:00 -05:00
d0ab91c16f fix: reduce size of terraform modules archive (#17749) 2025-05-12 13:50:07 -06:00
37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
af2941bb92 feat: add is_prebuild_claim to distinguish post-claim provisioning (#17757)
Used in combination with
https://github.com/coder/terraform-provider-coder/pull/396

This is required by both https://github.com/coder/coder/pull/17475 and
https://github.com/coder/coder/pull/17571

Operators may need to conditionalize their templates to perform certain
operations once a prebuilt workspace has been claimed. This value will
**only** be set once a claim takes place and a subsequent `terraform
apply` occurs. Any `terraform apply` runs thereafter will be
indistinguishable from a normal run on a workspace.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-12 14:19:03 +00:00
3ee95f14ce chore: upgrade terraform-provider-coder & preview libs (#17738)
The changes in `coder/preview` necessitated the changes in
`codersdk/richparameters.go` & `provisioner/terraform/resources.go`.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-05-09 17:41:19 +02:00
a9f1a6b2a2 fix: revert fix: persist terraform modules during template import (#17665) (#17734)
This reverts commit ae3d90b057.
2025-05-08 22:03:08 -04:00
ae3d90b057 fix: persist terraform modules during template import (#17665) 2025-05-08 16:13:46 -06:00
d5360a6da0 chore: fetch workspaces by username with organization permissions (#17707)
Closes https://github.com/coder/coder/issues/17691

`ExtractOrganizationMembersParam` will allow fetching a user with only
organization permissions. If the user belongs to 0 orgs, then the user "does not exist" 
from an org perspective. But if you are a site-wide admin, then the user does exist.
2025-05-08 14:41:17 -05:00
1bb96b8528 fix: resolve flake test on manager (#17702)
Fixes coder/internal#544

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2025-05-08 16:49:57 +03:00
4369765996 test: fix TestWorkspaceAgentReportStats flake (#17678)
Closes https://github.com/coder/internal/issues/609.

As seen in the below logs, the `last_used_at` time was updating, but just to the same value that it was on creation; `dbtime.Now` was called in quick succession. 

```
 t.go:106: 2025-05-05 12:11:54.166 [info]  coderd.workspace_usage_tracker: updated workspaces last_used_at  count=1  now="2025-05-05T12:11:54.161329Z"
    t.go:106: 2025-05-05 12:11:54.172 [debu]  coderd: GET  host=localhost:50422  path=/api/v2/workspaces/745b7ff3-47f2-4e1a-9452-85ea48ba5c46  proto=HTTP/1.1  remote_addr=127.0.0.1  start="2025-05-05T12:11:54.1669073Z"  workspace_name=peaceful_faraday34  requestor_id=b2cf02ae-2181-480b-bb1f-95dc6acb6497  requestor_name=testuser  requestor_email=""  took=5.2105ms  status_code=200  latency_ms=5  params_workspace=745b7ff3-47f2-4e1a-9452-85ea48ba5c46  request_id=7fd5ea90-af7b-4104-91c5-9ca64bc2d5e6
    workspaceagentsrpc_test.go:70: 
        	Error Trace:	C:/actions-runner/coder/coder/coderd/workspaceagentsrpc_test.go:70
        	Error:      	Should be true
        	Test:       	TestWorkspaceAgentReportStats
        	Messages:   	2025-05-05 12:11:54.161329 +0000 UTC is not after 2025-05-05 12:11:54.161329 +0000 UTC
```

If we change the initial `LastUsedAt` time to be a time in the past, ticking with a `dbtime.Now` will always update it to a later value. If it never updates, the condition will still fail.
2025-05-06 00:15:24 +10:00
544259b809 feat: add database tables and API routes for agentic chat feature (#17570)
Backend portion of experimental `AgenticChat` feature:
- Adds database tables for chats and chat messages
- Adds functionality to stream messages from LLM providers using
`kylecarbs/aisdk-go`
- Adds API routes with relevant functionality (list, create, update
chats, insert chat message)
- Adds experiment `codersdk.AgenticChat`

---------

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2025-05-02 17:29:57 +01:00
b7e08ba7c9 fix: filter out deleted users when attempting to delete an organization (#17621)
Closes
[coder/internal#601](https://github.com/coder/internal/issues/601)
2025-05-01 13:26:01 -03:00
2acf0adcf2 chore(codersdk/toolsdk): improve static analyzability of toolsdk.Tools (#17562)
* Refactors toolsdk.Tools to remove opaque `map[string]any` argument in
favour of typed args structs.
* Refactors toolsdk.Tools to remove opaque passing of dependencies via
`context.Context` in favour of a tool dependencies struct.
* Adds panic recovery and clean context middleware to all tools.
* Adds `GenericTool` implementation to allow keeping `toolsdk.All` with
uniform type signature while maintaining type information in handlers.
* Adds stricter checks to `patchWorkspaceAgentAppStatus` handler.
2025-04-29 16:05:23 +01:00
1fc74f629e refactor(agent): update agentcontainers api initialization (#17600)
There were too many ways to configure the agentcontainers API resulting
in inconsistent behavior or features not being enabled. This refactor
introduces a control flag for enabling or disabling the containers API.
When disabled, all implementations are no-op and explicit endpoint
behaviors are defined. When enabled, concrete implementations are used
by default but can be overridden by passing options.
2025-04-29 17:53:10 +03:00
02b2de9ae4 refactor: skip reconciliation for some presets (#17595) 2025-04-29 07:55:37 -04:00
12589026b6 chore: update error message for duplicate organization members (#17594)
Closes https://github.com/coder/internal/issues/345
2025-04-28 14:51:33 -06:00
a78f0fc4e1 refactor: use specific error for agpl and prebuilds (#17591)
Follow-up PR to https://github.com/coder/coder/pull/17458
Addresses this discussion:
https://github.com/coder/coder/pull/17458#discussion_r2055940797
2025-04-28 16:37:41 -04:00
14105ff301 test: do not run memory race test in parallel (#17582)
Closes
https://github.com/coder/internal/issues/597#issuecomment-2835262922

The parallelized tests share configs, which when accessed concurrently
throw race errors. The configs are read only, so it is fine to run these
tests with shared idp configs.
2025-04-28 12:20:07 -05:00
37c5e7c440 chore: return safe copy of string slice in 'ParseStringSliceClaim' (#17439)
Claims parsed should be safe to mutate and filter. This was likely not
causing any bugs or issues, and just doing this out of precaution
2025-04-28 12:18:02 -05:00
e0483e3136 feat: add prebuilds metrics collector (#17547)
Closes https://github.com/coder/internal/issues/509

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-04-28 12:28:56 +02:00
08ad910171 feat: add prebuilds configuration & bootstrapping (#17527)
Closes https://github.com/coder/internal/issues/508

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Cian Johnston <cian@coder.com>
2025-04-25 11:07:15 +02:00
118f12ac3a feat: implement claiming of prebuilt workspaces (#17458)
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Edward Angert <EdwardAngert@users.noreply.github.com>
Co-authored-by: EdwardAngert <17991901+EdwardAngert@users.noreply.github.com>
Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Co-authored-by: M Atif Ali <atif@coder.com>
Co-authored-by: Aericio <16523741+Aericio@users.noreply.github.com>
Co-authored-by: M Atif Ali <me@matifali.dev>
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
2025-04-24 09:39:38 -04:00
36a72a2b25 chore: loosen static validation when using dynamic parameters (#17516)
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-04-23 10:15:49 -06:00
71dbd0c888 fix: nil ptr deref when removing OIDC from deployment and accessing old users (#17501)
If OIDC is removed from a deployment, trying to create a workspace for a previous user
on OIDC would panic.
2025-04-23 08:45:26 -05:00
ca38729840 chore: revert dynamic params as a safe experiment (#17510) 2025-04-22 16:21:15 +00:00
2cc56ab515 chore: fill out workspace owner data for dynamic parameters (#17366) 2025-04-17 14:51:50 -06:00
ea65ddc17d fix: correct user roles being passed into terraform context (#17460)
Roles were being passed into the workspace context incorrectly. Site
wide scopes were being org scoped. Roles outside the org should also not
be sent.
2025-04-17 15:42:23 -05:00
183146e2c9 fix: add minor fix to reconciliation loop (#17454)
Follow-up PR to https://github.com/coder/coder/pull/17261
I noticed that 1 metrics-related test fails in `dk/prebuilds` after
merging my PR into `dk/prebuilds`.
2025-04-17 13:43:24 -04:00
27bc60d1b9 feat: implement reconciliation loop (#17261)
Closes https://github.com/coder/internal/issues/510

<details>
<summary> Refactoring Summary </summary>

### 1) `CalculateActions` Function

#### Issues Before Refactoring:

- Large function (~150 lines), making it difficult to read and maintain.
- The control flow is hard to follow due to complex conditional logic.
- The `ReconciliationActions` struct was partially initialized early,
then mutated in multiple places, making the flow error-prone.

Original source:  

fe60b569ad/coderd/prebuilds/state.go (L13-L167)

#### Improvements After Refactoring:

- Simplified and broken down into smaller, focused helper methods.
- The flow of the function is now more linear and easier to understand.
- Struct initialization is cleaner, avoiding partial and incremental
mutations.

Refactored function:  

eeb0407d78/coderd/prebuilds/state.go (L67-L84)

---

### 2) `ReconciliationActions` Struct

#### Issues Before Refactoring:

- The struct mixed both actionable decisions and diagnostic state, which
blurred its purpose.
- It was unclear which fields were necessary for reconciliation logic,
and which were purely for logging/observability.

#### Improvements After Refactoring:

- Split into two clear, purpose-specific structs:
- **`ReconciliationActions`** — defines the intended reconciliation
action.
- **`ReconciliationState`** — captures runtime state and metadata,
primarily for logging and diagnostics.

Original struct:  

fe60b569ad/coderd/prebuilds/reconcile.go (L29-L41)

</details>

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Dean Sheather <dean@deansheather.com>
Co-authored-by: Spike Curtis <spike@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
2025-04-17 09:29:29 -04:00
daafa0d689 chore: add missing prometheus tests for UNKNOWN/STATIC paths (#17446) 2025-04-17 13:50:18 +02:00
0bc49ff5ae test: fix flake in TestRoleSyncTable with test cases sharing resources (#17441)
The test case definition shares maps that can have concurrent access if run in parallel.
2025-04-17 00:14:11 +00:00
7f6e5139eb chore: format code (#17438) 2025-04-16 17:21:14 -06:00
2e5cd299f2 chore: load 'assign_default' value from legacy value (#17428)
If this value was set before v2.19.0, then assign_default was in a json
field that would not match. And it would default to `false`. This
corrects that.
2025-04-16 15:55:37 -05:00
c4d3dd2791 chore: prevent null loading sync settings (#17430)
Nulls passed to the frontend caused a page to fail to load.

`Record<string,string>` can be `nil` in golang
2025-04-16 14:39:57 -05:00
f670bc31f5 chore: update testutil chan helpers (#17408) 2025-04-16 10:37:09 -06:00
2a76f5028e fix: don't attempt to insert empty terraform plans into the database (#17426) 2025-04-16 10:14:35 -06:00