177 Commits

Author SHA1 Message Date
0238f2926d feat: persist AI task state in template imports & workspace builds (#18449) 2025-06-24 10:36:37 +00:00
6cc4cfa346 feat: allow for default presets (#18445) 2025-06-24 12:19:19 +02:00
4699393522 fix: upsert coder_app resources in case they are persistent (#18509) 2025-06-23 18:50:44 +00:00
66e8dbbe17 feat: persist generated coder_app id (#18487) 2025-06-23 08:46:18 +00:00
0f6ca55238 feat: implement scheduling mechanism for prebuilds (#18126)
Closes https://github.com/coder/internal/issues/312
Depends on https://github.com/coder/terraform-provider-coder/pull/408

This PR adds support for defining an **autoscaling block** for
prebuilds, allowing number of desired instances to scale dynamically
based on a schedule.

Example usage:
```
data "coder_workspace_preset" "us-nix" {
  ...
  
  prebuilds = {
    instances = 0                  # default to 0 instances
    
    scheduling = {
      timezone = "UTC"             # a single timezone is used for simplicity
      
      # Scale to 3 instances during the work week
      schedule {
        cron = "* 8-18 * * 1-5"    # from 8AM–6:59PM, Mon–Fri, UTC
        instances = 3              # scale to 3 instances
      }
      
      # Scale to 1 instance on Saturdays for urgent support queries
      schedule {
        cron = "* 8-14 * * 6"      # from 8AM–2:59PM, Sat, UTC
        instances = 1              # scale to 1 instance
      }
    }
  }
}
```

### Behavior
- Multiple `schedule` blocks per `prebuilds` block are supported.
- If the current time matches any defined autoscaling schedule, the
corresponding number of instances is used.
- If no schedule matches, the **default instance count**
(`prebuilds.instances`) is used as a fallback.

### Why
This feature allows prebuild instance capacity to adapt to predictable
usage patterns, such as:
- Scaling up during business hours or high-demand periods
- Reducing capacity during off-hours to save resources

### Cron specification
The cron specification is interpreted as a **continuous time range.**

For example, the expression:

```
* 9-18 * * 1-5
```

is intended to represent a continuous range from **09:00 to 18:59**,
Monday through Friday.

However, due to minor implementation imprecision, it is currently
interpreted as a range from **08:59:00 to 18:58:59**, Monday through
Friday.

This slight discrepancy arises because the evaluation is based on
whether a specific **point in time** falls within the range, using the
`github.com/coder/coder/v2/coderd/schedule/cron` library, which performs
per-minute matching rather than strict range evaluation.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2025-06-19 11:08:48 -04:00
8b27983d14 fix: fix TestAcquireJobWithCancel_Cancel flake (#18441) 2025-06-18 22:51:13 -04:00
b0fa3275d2 fix: increase TestAcquireJob_LongPoll timeout to prevent flakiness (#18442)
I'll be honest I'm not even really sure the point of this test but it
was failing due to

```
2025-06-16T15:01:54.0863251Z         	Error:      	Received unexpected error:
2025-06-16T15:01:54.0863554Z         	            	acquire job:
2025-06-16T15:01:54.0864230Z         	            	    github.com/coder/coder/v2/coderd/provisionerdserver.(*server).AcquireJob
2025-06-16T15:01:54.0865173Z         	            	        /home/runner/work/coder/coder/coderd/provisionerdserver/provisionerdserver.go:329
2025-06-16T15:01:54.0865683Z         	            	  - failed to acquire job:
2025-06-16T15:01:54.0866374Z         	            	    github.com/coder/coder/v2/coderd/provisionerdserver.(*Acquirer).AcquireJob
2025-06-16T15:01:54.0867262Z         	            	        /home/runner/work/coder/coder/coderd/provisionerdserver/acquirer.go:148
2025-06-16T15:01:54.0867819Z         	            	  - pq: canceling statement due to user request
```

which is certainly unintended.
2025-06-19 02:50:53 +00:00
c1341cccdd feat: use proto streams to increase maximum module files payload (#18268)
This PR implements protobuf streaming to handle large module files by:
1. **Streaming large payloads**: When module files exceed the 4MB limit,
they're streamed in chunks using a new UploadFile RPC method
2. **Database storage**: Streamed files are stored in the database and
referenced by hash for deduplication
3. **Backward compatibility**: Small module files continue using the
existing direct payload method
2025-06-13 12:46:26 -05:00
0428c5ec1c chore: include 'everyone' group in template importing (#18257) 2025-06-05 19:25:36 +00:00
8387dd27ab chore: add form_type parameter argument to db (#17920)
`form_type` is a new parameter field in the terraform provider. Bring
that field into coder/coder.

Validation for `multi-select` has also been added.
2025-05-29 08:55:19 -05:00
776c144128 fix(coderd): ensure agent timings are non-zero on insert (#18065)
Relates to https://github.com/coder/coder/issues/15432

Ensures that no workspace build timings with zero values for started_at or ended_at are inserted into the DB or returned from the API.
2025-05-29 13:36:06 +01:00
9fc3329575 feat: persist app groups in the database (#17977) 2025-05-27 13:13:08 -06:00
6f6e73af03 feat: implement expiration policy logic for prebuilds (#17996)
## Summary 

This PR introduces support for expiration policies in prebuilds. The TTL
(time-to-live) is retrieved from the Terraform configuration
([terraform-provider-coder
PR](https://github.com/coder/terraform-provider-coder/pull/404)):
```
prebuilds = {
	  instances = 2
	  expiration_policy {
		  ttl = 86400
	  }
  }
```
**Note**: Since there is no need for precise TTL enforcement down to the
second, in this implementation expired prebuilds are handled in a single
reconciliation cycle: they are deleted, and new instances are created
only if needed to match the desired count.

## Changes

* The outcome of a reconciliation cycle is now expressed as a slice of
reconciliation actions, instead of a single aggregated action.
* Adjusted reconciliation logic to delete expired prebuilds and
guarantee that the number of desired instances is correct.
* Updated relevant data structures and methods to support expiration
policies parameters.
* Added documentation to `Prebuilt workspaces` page
* Update `terraform-provider-coder` to version 2.5.0:
https://github.com/coder/terraform-provider-coder/releases/tag/v2.5.0

Depends on: https://github.com/coder/terraform-provider-coder/pull/404
Fixes: https://github.com/coder/coder/issues/17916
2025-05-26 20:31:24 +01:00
b7462fb256 feat: improve transaction safety in CompleteJob function (#17970)
This PR refactors the CompleteJob function to use database transactions
more consistently for better atomicity guarantees. The large function
was broken down into three specialized handlers:

- completeTemplateImportJob
- completeWorkspaceBuildJob
- completeTemplateDryRunJob

Each handler now uses the Database.InTx wrapper to ensure all database
operations for a job completion are performed within a single
transaction, preventing partial updates in case of failures.

Added comprehensive tests for transaction behavior for each job type.

Fixes #17694

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-05-21 16:48:51 +02:00
1bacd82e80 feat: add API key scope to restrict access to user data (#17692) 2025-05-15 15:32:52 +01:00
f2edcf3f59 fix: add missing clause for tracking replacements (#17849)
We should only be tracking resource replacements during a prebuild
claim.

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-15 13:02:30 +02:00
789c4beba7 chore: add dynamic parameter error if missing metadata from provisioner (#17809) 2025-05-14 12:21:36 -05:00
6e967780c9 feat: track resource replacements when claiming a prebuilt workspace (#17571)
Closes https://github.com/coder/internal/issues/369

We can't know whether a replacement (i.e. drift of terraform state
leading to a resource needing to be deleted/recreated) will take place
apriori; we can only detect it at `plan` time, because the provider
decides whether a resource must be replaced and it cannot be inferred
through static analysis of the template.

**This is likely to be the most common gotcha with using prebuilds,
since it requires a slight template modification to use prebuilds
effectively**, so let's head this off before it's an issue for
customers.

Drift details will now be logged in the workspace build logs:


![image](https://github.com/user-attachments/assets/da1988b6-2cbe-4a79-a3c5-ea29891f3d6f)

Plus a notification will be sent to template admins when this situation
arises:


![image](https://github.com/user-attachments/assets/39d555b1-a262-4a3e-b529-03b9f23bf66a)

A new metric - `coderd_prebuilt_workspaces_resource_replacements_total`
- will also increment each time a workspace encounters replacements.

We only track _that_ a resource replacement occurred, not how many. Just
one is enough to ruin a prebuild, but we can't know apriori which
replacement would cause this.
For example, say we have 2 replacements: a `docker_container` and a
`null_resource`; we don't know which one might
cause an issue (or indeed if either would), so we just track the
replacement.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:52:22 +02:00
425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
0b5f27f566 feat: add parent_id column to workspace_agents table (#17758)
Adds a new nullable column `parent_id` to `workspace_agents` table. This
lays the groundwork for having child agents.
2025-05-13 00:01:31 +01:00
398b999d8f chore: pass previous values into terraform apply (#17696)
Pass previous workspace build parameter values into the terraform
`plan/apply`. Enforces monotonicity in terraform as well as `coderd`.
2025-05-12 15:32:00 -05:00
d0ab91c16f fix: reduce size of terraform modules archive (#17749) 2025-05-12 13:50:07 -06:00
37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
af2941bb92 feat: add is_prebuild_claim to distinguish post-claim provisioning (#17757)
Used in combination with
https://github.com/coder/terraform-provider-coder/pull/396

This is required by both https://github.com/coder/coder/pull/17475 and
https://github.com/coder/coder/pull/17571

Operators may need to conditionalize their templates to perform certain
operations once a prebuilt workspace has been claimed. This value will
**only** be set once a claim takes place and a subsequent `terraform
apply` occurs. Any `terraform apply` runs thereafter will be
indistinguishable from a normal run on a workspace.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-12 14:19:03 +00:00
a9f1a6b2a2 fix: revert fix: persist terraform modules during template import (#17665) (#17734)
This reverts commit ae3d90b057.
2025-05-08 22:03:08 -04:00
ae3d90b057 fix: persist terraform modules during template import (#17665) 2025-05-08 16:13:46 -06:00
118f12ac3a feat: implement claiming of prebuilt workspaces (#17458)
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Edward Angert <EdwardAngert@users.noreply.github.com>
Co-authored-by: EdwardAngert <17991901+EdwardAngert@users.noreply.github.com>
Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Co-authored-by: M Atif Ali <atif@coder.com>
Co-authored-by: Aericio <16523741+Aericio@users.noreply.github.com>
Co-authored-by: M Atif Ali <me@matifali.dev>
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
2025-04-24 09:39:38 -04:00
71dbd0c888 fix: nil ptr deref when removing OIDC from deployment and accessing old users (#17501)
If OIDC is removed from a deployment, trying to create a workspace for a previous user
on OIDC would panic.
2025-04-23 08:45:26 -05:00
ea65ddc17d fix: correct user roles being passed into terraform context (#17460)
Roles were being passed into the workspace context incorrectly. Site
wide scopes were being org scoped. Roles outside the org should also not
be sent.
2025-04-17 15:42:23 -05:00
2a76f5028e fix: don't attempt to insert empty terraform plans into the database (#17426) 2025-04-16 10:14:35 -06:00
a98605913a feat: mark prebuilds as such and set their preset ids (#16965)
This pull request closes https://github.com/coder/internal/issues/513
2025-04-14 15:34:50 +02:00
0b2b643ce2 feat: persist prebuild definitions on template import (#16951)
This PR allows provisioners to recognise and report prebuild definitions
to the coder control plane. It also allows the coder control plane to
then persist these to its store.

closes https://github.com/coder/internal/issues/507

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-07 10:35:28 +02:00
99c6f235eb feat: add migrations and queries to support prebuilds (#16891)
Depends on https://github.com/coder/coder/pull/16916 _(change base to
`main` once it is merged)_

Closes https://github.com/coder/internal/issues/514

_This is one of several PRs to decompose the `dk/prebuilds` feature
branch into separate PRs to merge into `main`._

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-03 10:58:30 +02:00
7d4b3c8634 feat(agent): add devcontainer autostart support (#17076)
This change adds support for devcontainer autostart in workspaces. The
preconditions for utilizing this feature are:

1. The `coder_devcontainer` resource must be defined in Terraform
2. By the time the startup scripts have completed,
	- The `@devcontainers/cli` tool must be installed
	- The given workspace folder must contain a devcontainer configuration

Example Terraform:

```tf
resource "coder_devcontainer" "coder" {
  agent_id         = coder_agent.main.id
  workspace_folder = "/home/coder/coder"
  config_path      = ".devcontainer/devcontainer.json" # (optional)
}
```

Closes #16423
2025-03-27 12:31:30 +02:00
17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
5c8cac9fb7 feat: add name to workspace agent devcontainers (#17089)
In the presence of multiple devcontainers, it would be nice to
differentiate them by name. This change inherits the resource name from
terraform.

Refs #17076
2025-03-25 12:59:20 +00:00
5b3eda6719 chore: persist template import terraform plan in postgres (#17012) 2025-03-24 10:01:50 -06:00
69ba27e347 feat: allow specifying devcontainer on agent in terraform (#16997)
This change allows specifying devcontainers in terraform and plumbs it
through to the agent via agent manifest.

This will be used for autostarting devcontainers in a workspace.

Depends on coder/terraform-provider-coder#368
Updates #16423
2025-03-20 19:09:39 +02:00
04c33968cf refactor: replace golang.org/x/exp/slices with slices (#16772)
The experimental functions in `golang.org/x/exp/slices` are now
available in the standard library since Go 1.21.

Reference: https://go.dev/doc/go1.21#slices

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2025-03-04 00:46:49 +11:00
ca23abe12c feat(provisioner): add support for workspace_owner_rbac_roles (#16407)
Some checks are pending
ci / changes (push) Waiting to run
ci / lint (push) Blocked by required conditions
ci / gen (push) Waiting to run
ci / fmt (push) Blocked by required conditions
ci / test-go (macos-latest) (push) Blocked by required conditions
ci / test-go (ubuntu-latest) (push) Blocked by required conditions
ci / test-go (windows-2022) (push) Blocked by required conditions
ci / test-cli (macos-latest) (push) Blocked by required conditions
ci / test-cli (windows-2022) (push) Blocked by required conditions
ci / test-go-pg (ubuntu-latest) (push) Blocked by required conditions
ci / test-go-pg-16 (push) Blocked by required conditions
ci / test-go-race (push) Blocked by required conditions
ci / test-go-race-pg (push) Blocked by required conditions
ci / test-go-tailnet-integration (push) Blocked by required conditions
ci / test-js (push) Blocked by required conditions
ci / test-e2e (push) Blocked by required conditions
ci / test-e2e-premium (push) Blocked by required conditions
ci / chromatic (push) Blocked by required conditions
ci / offlinedocs (push) Blocked by required conditions
ci / required (push) Blocked by required conditions
ci / build-dylib (push) Blocked by required conditions
ci / build (push) Blocked by required conditions
ci / deploy (push) Blocked by required conditions
ci / deploy-wsproxies (push) Blocked by required conditions
ci / sqlc-vet (push) Blocked by required conditions
ci / notify-slack-on-failure (push) Blocked by required conditions
OpenSSF Scorecard / Scorecard analysis (push) Waiting to run
Part of https://github.com/coder/terraform-provider-coder/pull/330

Adds support for the coder_workspace_owner.rbac_roles attribute
2025-03-02 14:54:44 -06:00
9469b78290 fix!: enforce regex for agent names (#16641)
Underscores and double hyphens are now blocked. The regex is almost the
exact same as the `coder_app` `slug` regex, but uppercase characters are
still permitted.
2025-02-20 05:09:26 +00:00
3fddfef879 fix!: enforce agent names be case-insensitive-unique per-workspace (#16614)
Relates to https://github.com/coder/coder-desktop-macos/issues/54

Currently, it's possible to have two agents within the same workspace whose names only differ in capitalization:
This leads to an ambiguity in two cases:
- For CoderVPN, we'd like to allow support to workspaces with a hostname of the form: `agent.workspace.username.coder`.
- Workspace apps (`coder_app`s) currently use subdomains of the form: `<app>--<agent>--<workspace>--<username>(--<suffix>)?`.

Of note is that DNS hosts must be strictly lower case, hence the ambiguity.

This fix is technically a breaking change, but only for the incredibly rare use case where a user has:
- A workspace with two agents
- Those agent names differ only in capitalization.

Those templates & workspaces will now fail to build. This can be fixed by choosing wholly unique names for the agents.
2025-02-20 12:51:25 +11:00
d6b9806098 chore: implement oom/ood processing component (#16436)
Implements the processing logic as set out in the OOM/OOD RFC.
2025-02-17 16:56:52 +00:00
46e04c68e3 feat(provisioner): add support for presets to coder provisioners (#16574)
This pull request adds support for presets to coder provisioners.
If a template defines presets using a compatible version of the
provider, then this PR will allow those presets to be persisted to the
control plane database for use in workspace creation.
2025-02-17 13:00:44 +02:00
7cbd77fd94 feat: improve resources_monitoring for OOM & OOD monitoring (#16241)
As requested for [this
issue](https://github.com/coder/internal/issues/245) we need to have a
new resource `resources_monitoring` in the agent.

It needs to be parsed from the provisioner and inserted into a new db
table.
2025-02-04 18:45:33 +01:00
a160e8f06c chore(coderd): remove the window option in open_in (#16104)
As we worked on adding a `open_in` parameter for workspace_apps - we
initially created three options :
- window
- slim_window
- tab

After further investigation, `window` should not be used and has to be
removed.

ℹ️ I decided to remove the option instead of deprecating it as we've not
created any release nor documented the feature. Can be discussed.
2025-01-15 15:26:31 +01:00
08463c27d8 feat: add OpenIn option to coder_app (#15743)
This PR is the coder/coder part of [the open_in parameter
issue](https://github.com/coder/terraform-provider-coder/issues/297)
aiming to add a new optional parameter to choose how to open modules.

This PR is heavily linked [to this
PR](https://github.com/coder/terraform-provider-coder/pull/321).

ℹ️ For now, some integrations tests can not be pushed as it requires a
release on the terraform-provider repo.
2025-01-03 11:27:02 +01:00
e21a301682 fix: make GetWorkspacesEligibleForTransition return even less false positives (#15594)
Relates to https://github.com/coder/coder/issues/15082

Further to https://github.com/coder/coder/pull/15429, this reduces the
amount of false-positives returned by the 'is eligible for autostart'
part of the query. We achieve this by calculating the 'next start at'
time of the workspace, storing it in the database, and using it in our
`GetWorkspacesEligibleForTransition` query.

The prior implementation of the 'is eligible for autostart' query would
return _all_ workspaces that at some point in the future _might_ be
eligible for autostart. This now ensures we only return workspaces that
_should_ be eligible for autostart.

We also now pass `currentTick` instead of `t` to the
`GetWorkspacesEligibleForTransition` query as otherwise we'll have one
round of workspaces that are skipped by `isEligibleForTransition` due to
`currentTick` being a truncated version of `t`.
2024-12-02 21:02:36 +00:00
ef09b51912 fix(coderd): extract provisionerdserver.StaleInterval to 90 seconds (#15643)
Follow-up from https://github.com/coder/coder/pull/15578

Extracts `provisionerdserver.StaleInterval` and sets it to 90 seconds by
default
2024-11-28 12:57:43 +00:00
0896f339c4 refactor(coderd/provisionerdserver): use quartz.Clock instead of TimeNowFn (#15642)
Replace `TimeNowFn` in `provisionerdserver` with `quartz.Clock` as
well as pass `coderd`'s `Clock` to `provisionerdserver`.
2024-11-25 16:25:36 +00:00