Commit Graph

544 Commits

Author SHA1 Message Date
425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
b2a1de9e2a feat: fetch prebuilds metrics state in background (#17792)
`Collect()` is called whenever the `/metrics` endpoint is hit to
retrieve metrics.

The queries used in prebuilds metrics collection are quite heavy, and we
want to avoid having them running concurrently / too often to keep db
load down.

Here I'm moving towards a background retrieval of the state required to
set the metrics, which gets invalidated every interval.

Also introduces `coderd_prebuilt_workspaces_metrics_last_updated` which
operators can use to determine when these metrics go stale.

See https://github.com/coder/coder/pull/17789 as well.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-13 20:27:41 +02:00
64807e1d61 chore: apply the 4mb max limit on drpc protocol message size (#17771)
Respect the 4mb max limit on proto messages
2025-05-13 11:24:51 -05:00
37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
d5360a6da0 chore: fetch workspaces by username with organization permissions (#17707)
Closes https://github.com/coder/coder/issues/17691

`ExtractOrganizationMembersParam` will allow fetching a user with only
organization permissions. If the user belongs to 0 orgs, then the user "does not exist" 
from an org perspective. But if you are a site-wide admin, then the user does exist.
2025-05-08 14:41:17 -05:00
a646478aed fix: move pubsub publishing out of database transactions to avoid conn exhaustion (#17648)
Database transactions hold onto connections, and `pubsub.Publish` tries
to acquire a connection of its own. If the latter is called within a
transaction, this can lead to connection exhaustion.

I plan two follow-ups to this PR:

1. Make connection counts tuneable

https://github.com/coder/coder/blob/main/cli/server.go#L2360-L2376

We will then be able to write tests showing how connection exhaustion
occurs.

2. Write a linter/ruleguard to prevent `pubsub.Publish` from being
called within a transaction.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-05 11:54:18 +02:00
ef11d4f769 fix: fix bug with deletion of prebuilt workspaces (#17652)
Don't specify the template version for a delete transition, because the
prebuilt workspace may have been created using an older template
version.
If the template version isn't explicitly set, the builder will
automatically use the version from the last workspace build - which is
the desired behavior.
2025-05-01 17:26:30 -04:00
98e5611e16 fix: fix for prebuilds claiming and deletion (#17624)
PR contains:
- fix for claiming & deleting prebuilds with immutable params
- unit test for claiming scenario
- unit test for deletion scenario

The parameter resolver was failing when deleting/claiming prebuilds
because a value for a previously-used parameter was provided to the
resolver, but since the value was unchanged (it's coming from the
preset) it failed in the resolver. The resolver was missing a check to
see if the old value != new value; if the values match then there's no
mutation of an immutable parameter.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-01 08:52:23 +00:00
6936a7b5a2 fix: fix prebuild omissions (#17579)
Fixes accidental omission from https://github.com/coder/coder/pull/17527

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-04-30 14:26:30 +00:00
02b2de9ae4 refactor: skip reconciliation for some presets (#17595) 2025-04-29 07:55:37 -04:00
a78f0fc4e1 refactor: use specific error for agpl and prebuilds (#17591)
Follow-up PR to https://github.com/coder/coder/pull/17458
Addresses this discussion:
https://github.com/coder/coder/pull/17458#discussion_r2055940797
2025-04-28 16:37:41 -04:00
9167cbfe4c refactor: claim prebuilt workspace tests (#17567)
Follow-up to: https://github.com/coder/coder/pull/17458
Specifically it addresses these discussions:
- https://github.com/coder/coder/pull/17458#discussion_r2053531445
2025-04-28 12:49:23 -04:00
e0483e3136 feat: add prebuilds metrics collector (#17547)
Closes https://github.com/coder/internal/issues/509

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-04-28 12:28:56 +02:00
08ad910171 feat: add prebuilds configuration & bootstrapping (#17527)
Closes https://github.com/coder/internal/issues/508

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Cian Johnston <cian@coder.com>
2025-04-25 11:07:15 +02:00
118f12ac3a feat: implement claiming of prebuilt workspaces (#17458)
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Edward Angert <EdwardAngert@users.noreply.github.com>
Co-authored-by: EdwardAngert <17991901+EdwardAngert@users.noreply.github.com>
Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Co-authored-by: M Atif Ali <atif@coder.com>
Co-authored-by: Aericio <16523741+Aericio@users.noreply.github.com>
Co-authored-by: M Atif Ali <me@matifali.dev>
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
2025-04-24 09:39:38 -04:00
27bc60d1b9 feat: implement reconciliation loop (#17261)
Closes https://github.com/coder/internal/issues/510

<details>
<summary> Refactoring Summary </summary>

### 1) `CalculateActions` Function

#### Issues Before Refactoring:

- Large function (~150 lines), making it difficult to read and maintain.
- The control flow is hard to follow due to complex conditional logic.
- The `ReconciliationActions` struct was partially initialized early,
then mutated in multiple places, making the flow error-prone.

Original source:  

fe60b569ad/coderd/prebuilds/state.go (L13-L167)

#### Improvements After Refactoring:

- Simplified and broken down into smaller, focused helper methods.
- The flow of the function is now more linear and easier to understand.
- Struct initialization is cleaner, avoiding partial and incremental
mutations.

Refactored function:  

eeb0407d78/coderd/prebuilds/state.go (L67-L84)

---

### 2) `ReconciliationActions` Struct

#### Issues Before Refactoring:

- The struct mixed both actionable decisions and diagnostic state, which
blurred its purpose.
- It was unclear which fields were necessary for reconciliation logic,
and which were purely for logging/observability.

#### Improvements After Refactoring:

- Split into two clear, purpose-specific structs:
- **`ReconciliationActions`** — defines the intended reconciliation
action.
- **`ReconciliationState`** — captures runtime state and metadata,
primarily for logging and diagnostics.

Original struct:  

fe60b569ad/coderd/prebuilds/reconcile.go (L29-L41)

</details>

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Dean Sheather <dean@deansheather.com>
Co-authored-by: Spike Curtis <spike@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
2025-04-17 09:29:29 -04:00
669e790df6 test: add unit test to excercise bug when idp sync hits deleted orgs (#17405)
Deleted organizations are still attempting to sync members. This causes
an error on inserting the member, and would likely cause issues later in
the sync process even if that member is inserted. Deleted orgs should be
skipped.
2025-04-16 09:27:35 -05:00
06d39151dc feat: extend request logs with auth & DB info (#17304)
Closes #16903
2025-04-15 13:27:23 +02:00
c06ef7c1eb chore!: remove JFrog integration (#17353)
- Removes displaying XRay scan results in the dashboard. I'm not sure
  anyone was even using this integration so it's just debt for us to
  maintain. We can open up a separate issue to get rid of the db tables
  once we know for sure that we haven't broken anyone.
2025-04-11 14:45:21 -04:00
46d4b28384 chore: add x-authz-checks debug header when running in dev mode (#16873) 2025-04-10 11:36:27 -06:00
0b58798a1a feat: remove site wide perms from creating a workspace (#17296)
Creating a workspace required `read` on site wide `user`. 
Only organization permissions should be required.
2025-04-09 14:35:43 -05:00
52d555880c chore: add custom samesite options to auth cookies (#16885)
Allows controlling `samesite` cookie settings from the deployment config
2025-04-08 14:15:14 -05:00
ce22de8d15 feat: log long-lived connections acceptance (#17219)
Closes #16904
2025-04-08 08:30:05 +00:00
0b2b643ce2 feat: persist prebuild definitions on template import (#16951)
This PR allows provisioners to recognise and report prebuild definitions
to the coder control plane. It also allows the coder control plane to
then persist these to its store.

closes https://github.com/coder/internal/issues/507

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-07 10:35:28 +02:00
e8b7ce80de ci: re-enable revive and gosec linters (#17225)
* Reenables revive linter for test files (with an exception for the
`unused-parameter` rule)
* Reenables gosec linter for test files
2025-04-02 16:19:23 +01:00
17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
cf10d98aab fix: improve error message when deleting organization with resources (#17049)
Closes
[coder/internal#477](https://github.com/coder/internal/issues/477)

![Screenshot 2025-03-21 at 11 25
57 AM](https://github.com/user-attachments/assets/50cc03e9-395d-4fc7-8882-18cb66b1fac9)

I'm solving this issue in two parts:

1. Updated the postgres function so that it doesn't omit 0 values in the
error
2. Created a new query to fetch the number of resources associated with
an organization and using that information to provider a cleaner error
message to the frontend

> **_NOTE:_** SQL is not my strong suit, and the code was created with
the help of AI. So I'd take extra time looking over what I wrote there
2025-03-25 15:31:24 -04:00
4c33846f6d chore: add prebuilds system user (#16916)
Pre-requisite for https://github.com/coder/coder/pull/16891

Closes https://github.com/coder/internal/issues/515

This PR introduces a new concept of a "system" user.

Our data model requires that all workspaces have an owner (a `users`
relation), and prebuilds is a feature that will spin up workspaces to be
claimed later by actual users - and thus needs to own the workspaces in
the interim.

Naturally, introducing a change like this touches a few aspects around
the codebase and we've taken the approach _default hidden_ here; in
other words, queries for users will by default _exclude_ all system
users, but there is a flag to ensure they can be displayed. This keeps
the changeset relatively small.

This user has minimal permissions (it's equivalent to a `member` since
it has no roles). It will be associated with the default org in the
initial migration, and thereafter we'll need to somehow ensure its
membership aligns with templates (which are org-scoped) for which it'll
need to provision prebuilds; that's a solution we'll have in a
subsequent PR.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
2025-03-25 12:18:06 +00:00
092c129de0 chore: perform several small frontend permissions refactors (#16735) 2025-03-07 10:33:09 -07:00
522181fead feat(coderd): add new dispatch logic for coder inbox (#16764)
This PR is [resolving the dispatch part of Coder
Inbocx](https://github.com/coder/internal/issues/403).

Since the DB layer has been merged - we now want to insert notifications
into Coder Inbox in parallel of the other delivery target.

To do so, we push two messages instead of one using the `Enqueue`
method.
2025-03-05 22:43:18 +01:00
04c33968cf refactor: replace golang.org/x/exp/slices with slices (#16772)
The experimental functions in `golang.org/x/exp/slices` are now
available in the standard library since Go 1.21.

Reference: https://go.dev/doc/go1.21#slices

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2025-03-04 00:46:49 +11:00
91a4a98c27 chore: add an unassign action for roles (#16728) 2025-02-27 10:39:06 -07:00
cccdf1ecac feat: implement WorkspaceCreationBan org role (#16686)
Using negative permissions, this role prevents a user's ability to
create & delete a workspace within a given organization.

Workspaces are uniquely owned by an org and a user, so the org has to
supercede the user permission with a negative permission.

# Use case

Organizations must be able to restrict a member's ability to create a
workspace. This permission is implicitly granted (see
https://github.com/coder/coder/issues/16546#issuecomment-2655437860).

To revoke this permission, the solution chosen was to use negative
permissions in a built in role called `WorkspaceCreationBan`.

# Rational

Using negative permissions is new territory, and not ideal. However,
workspaces are in a unique position.

Workspaces have 2 owners. The organization and the user. To prevent
users from creating a workspace in another organization, an [implied
negative
permission](36d9f5ddb3/coderd/rbac/policy.rego (L172-L192))
is used. So the truth table looks like: _how to read this table
[here](36d9f5ddb3/coderd/rbac/README.md (roles))_

| Role (example)  | Site | Org  | User | Result |
|-----------------|------|------|------|--------|
| non-org-member  | \_   | N    | YN\_ | N      |
| user            | \_   | \_   | Y    | Y      |
| WorkspaceBan    | \_   | N    | Y    | Y      |
| unauthenticated | \_   | \_   | \_   | N      |


This new role, `WorkspaceCreationBan` is the same truth table condition
as if the user was not a member of the organization (when doing a
workspace create/delete). So this behavior **is not entirely new**.

<details>

<summary>How to do it without a negative permission</summary>

The alternate approach would be to remove the implied permission, and
grant it via and organization role. However this would add new behavior
that an organizational role has the ability to grant a user permissions
on their own resources?

It does not make sense for an org role to prevent user from changing
their profile information for example. So the only option is to create a
new truth table column for resources that are owned by both an
organization and a user.

| Role (example)  | Site | Org  |User+Org| User | Result |
|-----------------|------|------|--------|------|--------|
| non-org-member  | \_   | N    |  \_    | \_   | N      |
| user            | \_   | \_   |  \_    | \_   | N      |
| WorkspaceAllow  | \_   | \_   |   Y    | \_   | Y      |
| unauthenticated | \_   | \_   |  \_    | \_   | N      |

Now a user has no opinion on if they can create a workspace, which feels
a little wrong. A user should have the authority over what is theres.

There is fundamental _philosophical_ question of "Who does a workspace
belong to?". The user has some set of autonomy, yet it is the
organization that controls it's existence. A head scratcher 🤔

</details>

## Will we need more negative built in roles?

There are few resources that have shared ownership. Only
`ResourceOrganizationMember` and `ResourceGroupMember`. Since negative
permissions is intended to revoke access to a shared resource, then
**no.** **This is the only one we need**.

Classic resources like `ResourceTemplate` are entirely controlled by the
Organization permissions. And resources entirely in the user control
(like user profile) are only controlled by `User` permissions.


![Uploading Screenshot 2025-02-26 at 22.26.52.png…]()

---------

Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
Co-authored-by: ケイラ <mckayla@hey.com>
2025-02-27 06:23:18 -05:00
95363c9041 fix(enterprise/coderd): remove useless provisioner daemon id from request (#16723)
`ServeProvisionerDaemonRequest` has had an ID field for quite a while
now.
This field is only used for telemetry purposes; the actual daemon ID is
created upon insertion in the database. There's no reason to set it, and
it's confusing to do so. Deprecating the field and removing references
to it.
2025-02-27 09:08:08 +00:00
e005e4e51d chore: merge provisioner key and provisioner permissions (#16628)
Provisioner key permissions were never any different than provisioners.
Merging them for a cleaner permission story until they are required (if
ever) to be seperate.

This removed `ResourceProvisionerKey` from RBAC and just uses the
existing `ResourceProvisioner`.
2025-02-24 13:31:11 -06:00
546a549dcf feat: enable soft delete for organizations (#16584)
- Add deleted column to organizations table
- Add trigger to check for existing workspaces, templates, groups and
members in a org before allowing the soft delete

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
2025-02-24 12:59:41 -05:00
9469b78290 fix!: enforce regex for agent names (#16641)
Underscores and double hyphens are now blocked. The regex is almost the
exact same as the `coder_app` `slug` regex, but uppercase characters are
still permitted.
2025-02-20 05:09:26 +00:00
7f061b9faf fix(coderd): add stricter authorization for provisioners endpoint (#16587)
References #16558
2025-02-17 14:34:47 +02:00
77306f3de1 feat(coderd): add filters and fix template for provisioner daemons (#16558)
This change adds provisioner daemon ID filter to the provisioner daemons
endpoint, and also implements the limiting to 50 results.

Test coverage is greatly improved and template information for jobs
associated to the daemon was also fixed.

Updates #15084
Updates #15192
Related #16532
2025-02-14 17:26:46 +02:00
d0a534e30d chore: prevent authentication of non-unique oidc subjects (#16498)
Any IdP returning an empty field here breaks the assumption of a
unique subject id. This is defined in the OIDC spec.
2025-02-10 09:31:08 -06:00
0e2ae10b47 feat: add additional patch routes for group and role idp sync (#16351) 2025-01-31 12:14:24 -07:00
6ea5c6f0ef fix: show user-auth provisioners for all organizations (#16350) 2025-01-30 14:08:27 -07:00
b256b204d0 feat: add endpoint for partial updates to org sync field and assign_default (#16337) 2025-01-30 13:55:17 -07:00
2371153a37 feat: add endpoint for partial updates to org sync mapping (#16316) 2025-01-30 10:52:50 -07:00
92d22e296b chore: track usage of organizations in telemetry (#16323)
Addresses https://github.com/coder/internal/issues/317.

## Changes

Requirements are quoted below:

> how many orgs does deployment have

Adds the Organization entity to telemetry.

> ensuring resources are associated with orgs

All resources that reference an org already report the org id to
telemetry. Adds a test to check that.

> whether org sync is configured

Adds the `IDPOrgSync` boolean field to the Deployment entity.

## Implementation of the org sync check

While there's an `OrganizationSyncEnabled` method on the IDPSync
interface, I decided not to use it directly and implemented a
counterpart just for telemetry purposes. It's a compromise I'm not happy
about, but I found that it's a simpler approach than the alternative.
There are multiple reasons:

1. The telemetry package cannot statically access the IDPSync interface
due to a circular import.
2. We can't dynamically pass a reference to the
`OrganizationSyncEnabled` function at the time of instantiating the
telemetry object, because our server initialization logic depends on the
telemetry object being created before the IDPSync object.
3. If we circumvent that problem by passing the reference as an
initially empty pointer, initializing telemetry, then IDPSync, then
updating the pointer to point to `OrganizationSyncEnabled`, we have to
refactor the initialization logic of the telemetry object itself to
avoid a race condition where the first telemetry report is performed
without a valid reference.

I actually implemented that approach in
https://github.com/coder/coder/pull/16307, but realized I'm unable to
fully test it. It changed the initialization order in the server
command, and I wanted to test our CLI with Org Sync configured with a
premium license. As far as I'm aware, we don't have the tooling to do
that. I couldn't figure out a way to start the CLI with a mock license,
and I didn't want to go down further into the refactoring rabbit hole.

So I decided that reimplementing the org sync checking logic is simpler.
2025-01-29 15:54:31 +01:00
c069563af1 test: fix use of t.Logf where t.Log would suffice (#16328) 2025-01-29 14:35:04 +00:00
76adde91dc fix(provisioner/terraform/tfparse): allow empty values in coder_workspace_tag defaults (#16303)
* chore(docs): update docs re workspace tag default values
* chore(coderdenttest): use random name instead of t.Name() in newExternalProvisionerDaemon
* fix(provisioner/terraform/tfparse): allow empty values in coder_workspace_tag defaults
2025-01-28 09:11:39 +00:00
5841c0aacb fix: fetch custom roles from workspace agent context (#16237) 2025-01-23 12:57:09 -06:00
f34e6fd92c chore: implement 'use' verb to template object, read has less scope now (#16075)
Template `use` is now a verb.
- Template admins can `use` all templates (org template admins same in
org)
- Members get the `use` perm from the `everyone` group in the
`group_acl`.
2025-01-17 11:55:41 -06:00
f32f7c6862 test(enterprise/coderd): fix ctx init in multiple workspace tests (#16176) 2025-01-17 14:33:58 +00:00