Commit Graph

2597 Commits

Author SHA1 Message Date
fa86cc4adf chore: support the has_ai_task column in template version and workspace insert queries (#18385)
https://github.com/coder/coder/pull/18359 added the `has_ai_task`
columns on the `workspace_builds` and `template_versions` tables.
2025-06-16 16:07:16 +02:00
1d1070d051 chore: ensure proper rbac permissions on 'Acquire' file in the cache (#18348)
The file cache was caching the `Unauthorized` errors if a user without
the right perms opened the file first. So all future opens would fail.

Now the cache always opens with a subject that can read files. And authz
is checked on the Acquire per user.
2025-06-16 13:40:45 +00:00
9a432b8d9f fix: add workspace owner id as query param to websocket (#18363)
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-06-13 14:49:32 -04:00
c1341cccdd feat: use proto streams to increase maximum module files payload (#18268)
This PR implements protobuf streaming to handle large module files by:
1. **Streaming large payloads**: When module files exceed the 4MB limit,
they're streamed in chunks using a new UploadFile RPC method
2. **Database storage**: Streamed files are stored in the database and
referenced by hash for deduplication
3. **Backward compatibility**: Small module files continue using the
existing direct payload method
2025-06-13 12:46:26 -05:00
8e29ee50a3 feat: add ai tasks migrations (#18359)
Adds database migrations required for the Tasks feature.

There's a slight difference between the migrations in this PR and the
RFC: this PR adds `NOT NULL` constraints to the `has_ai_task` columns.
It was an oversight on my part when I wrote the RFC - I assumed the
`DEFAULT FALSE` value would make the columns implicitly NOT NULL, but
that's not the case with Postgres. We have no use for the NULL value.

The `DEFAULT FALSE` statement ensures that the migration will pass even
when there are existing rows in the template version and workspace
builds tables, so there's no danger in adding the `NOT NULL`
constraints.
2025-06-13 15:54:02 +02:00
f1cca03ed3 docs: reorganize the About section (#18236)
As part of an information architecture overhaul, this PR reorganizes the
About section and adds a Support section (but not content to it yet)

[preview](https://coder.com/docs/@docs-ia-about/about)

this PR is intentionally limited in scope so that we can ship meaningful
changes faster and followup PRs should include:

- [ ] edit + overhaul the About page
- [ ] decide on the `start` directory
- [ ] ~screenshots page updates~ (this should happen July or later)

redirects PR: https://github.com/coder/coder.com/pull/944

---------

Co-authored-by: EdwardAngert <17991901+EdwardAngert@users.noreply.github.com>
2025-06-12 13:56:45 -04:00
5944b1c595 chore: remove local storage based optin/optout (#18344)
This removes the opt-in and opt-out buttons for dynamic parameters on
the create workspace page and the workspace parameters settings page.

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-06-12 13:37:07 -04:00
f126931219 chore: remove dynamic-parameters experiment (#18290)
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: jaaydenh <1858163+jaaydenh@users.noreply.github.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-06-12 12:15:05 -04:00
70723d3b51 fix(coderd): fix panics by always checking for non-nil request logger (#18228) 2025-06-12 13:50:50 +03:00
ae3882a600 chore: move all images to new GCP project (#18324) 2025-06-11 13:06:31 +00:00
2377d76ebb test: ensure the return value of MockAuditor.Contains is checked (#18319)
It unfortunately doesn't seem possible, even with a custom ruleguard rule, to mark a function as requiring it's return value be used, it looks like you have to go all in on a linter that rejects *any* unused return values.
2025-06-11 17:16:18 +10:00
dd27a28cfa chore: fix comment on Acquire (#18313) 2025-06-10 15:36:48 -06:00
fb63c9445c test: fix test flake in TestDynamicParametersWithTerraformValues (#18311)
Wrong build ID was being used for the await.

Closes https://github.com/coder/internal/issues/687
2025-06-10 16:13:47 -05:00
9b9b89499e fix(coderd/database/db2sdk): add agent parent ID (#18310) 2025-06-10 18:07:05 +01:00
fca99174ad feat(agent/agentcontainers): implement sub agent injection (#18245)
This change adds support for sub agent creation and injection into dev
containers.

Updates coder/internal#621
2025-06-10 12:37:54 +03:00
910858b731 chore(coderd/provisionerdserver): convert dbmem tests to use postgres (#18278) 2025-06-09 10:05:29 +02:00
8daa0aacc6 feat(coderd/agentapi): support adding display apps to a sub agent (#18272)
Completely missed this in the original PR for adding support for
creating sub agents. This now allows specifying a list of display apps
to be added to the agent.
2025-06-06 17:30:52 +01:00
f569d9c33d feat: add separate max token lifetime for administrators (#18267)
# Add separate token lifetime limits for administrators

This PR introduces a new configuration option `--max-admin-token-lifetime` that allows administrators to create API tokens with longer lifetimes than regular users. By default, administrators can create tokens with a lifetime of up to 7 days (168 hours), while the existing `--max-token-lifetime` setting continues to apply to regular users.

The implementation:
- Adds a new `MaximumAdminTokenDuration` field to the session configuration
- Modifies the token validation logic to check the user's role and apply the appropriate lifetime limit
- Updates the token configuration endpoint to return the correct maximum lifetime based on the user's role
- Adds tests to verify that administrators can create tokens with longer and shorter lifetimes
- Updates documentation and help text to reflect the new option

This change allows organizations to grant administrators extended token lifetimes while maintaining tighter security controls for regular users.

Fixes #17395
2025-06-06 17:36:30 +02:00
a12429e9f8 feat(agent/agentcontainers): refactor Lister to ContainerCLI and implement new methods (#18243) 2025-06-06 10:33:09 +00:00
533c6dcbbe fix: remove error log when notification manager is already closed (#18264)
Closes https://github.com/coder/internal/issues/677

Resolves flakes such as:

```
$ go test -race -run "TestRunStopRace" github.com/coder/coder/v2/coderd/notifications -count=10000 -parallel $(nproc)
--- FAIL: TestRunStopRace (0.00s)
    t.go:106: 2025-06-06 02:44:39.348 [debu]  notifications-manager: notification manager started
    t.go:106: 2025-06-06 02:44:39.348 [debu]  notifications-manager: graceful stop requested
    t.go:106: 2025-06-06 02:44:39.348 [debu]  notifications-manager: notification manager stopped
    t.go:115: 2025-06-06 02:44:39.348 [erro]  notifications-manager: notification manager stopped with error ...
        error= manager already closed:
                   github.com/coder/coder/v2/coderd/notifications.(*Manager).loop
                       /home/coder/coder/coderd/notifications/manager.go:166
         *** slogtest: log detected at level ERROR; TEST FAILURE ***
--- FAIL: TestRunStopRace (0.00s)
    t.go:106: 2025-06-06 02:44:41.632 [debu]  notifications-manager: notification manager started
    t.go:106: 2025-06-06 02:44:41.632 [debu]  notifications-manager: graceful stop requested
    t.go:106: 2025-06-06 02:44:41.632 [debu]  notifications-manager: notification manager stopped
    t.go:115: 2025-06-06 02:44:41.633 [erro]  notifications-manager: notification manager stopped with error ...
        error= manager already closed:
                   github.com/coder/coder/v2/coderd/notifications.(*Manager).loop
                       /home/coder/coder/coderd/notifications/manager.go:166
         *** slogtest: log detected at level ERROR; TEST FAILURE ***
FAIL
FAIL    github.com/coder/coder/v2/coderd/notifications  6.847s
FAIL
```

These error logs are caused as a result of the `Manager` `Run` start operation being asynchronous. In the flaking test case we immediately call `Stop` after `Run`. It's possible for `Stop` to be scheduled to completion before the goroutine spawned by `Run` calls `loop` and checks `closed`. If this happens, `loop` returns an error and produces the error log.

We'll address this by replacing this specific error log with a warning log.

```
$ go test -run "TestRunStopRace" github.com/coder/coder/v2/coderd/notifications -count=10000 -parallel $(nproc)
ok      github.com/coder/coder/v2/coderd/notifications  1.294s

$ go test -race github.com/coder/coder/v2/coderd/notifications -count=100 -parallel $(nproc)
ok      github.com/coder/coder/v2/coderd/notifications  26.525s
```
2025-06-06 19:28:10 +10:00
0076e8479f chore(vpn): send ping results over tunnel (#18200)
Closes #17982.

The purpose of this PR is to expose network latency via the API used by Coder Desktop.

This PR has the tunnel ping all known agents every 5 seconds, in order to produce an instance of:
```proto
message LastPing {
	// latency is the RTT of the ping to the agent.
	google.protobuf.Duration latency = 1;
	// did_p2p indicates whether the ping was sent P2P, or over DERP.
	bool did_p2p = 2;
	// preferred_derp is the human readable name of the preferred DERP region,
	// or the region used for the last ping, if it was sent over DERP.
	string preferred_derp = 3;
	// preferred_derp_latency is the last known latency to the preferred DERP
	// region. Unset if the region does not appear in the DERP map.
	optional google.protobuf.Duration preferred_derp_latency = 4;
}
```
The contents of this message are stored and included on all subsequent upsertions of the agent. 
Note that we upsert existing agents every 5 seconds to update the `last_handshake` value.

On the desktop apps, this message will be used to produce a tooltip similar to that of the VS Code extension:
<img width="495" alt="image" src="https://github.com/user-attachments/assets/d8b65f3d-f536-4c64-9af9-35c1a42c92d2" />
(wording not final)

Unlike the VS Code extension, we omit:
- The Latency of *all* available DERP regions. It seems not ideal to send a copy of this entire map for every online agent, and it certainly doesn't make sense for it to be on the `Agent` or `LastPing` message. 
If we do want to expose this info on Coder Desktop, we should consider how best to do so; maybe we want to include it on a more generic `Netcheck` message.
- The current throughput (Bytes up/down). This is out of scope of the linked issue, and is non-trivial to implement. I'm also not sure of the value given the frequency we're doing these status updates (every 5 seconds).
If we want to expose it, it'll be in a separate PR.

<img width="343" alt="image" src="https://github.com/user-attachments/assets/8447d03b-9721-4111-8ac1-332d70a1e8f1" />
2025-06-06 14:18:57 +10:00
0428c5ec1c chore: include 'everyone' group in template importing (#18257) 2025-06-05 19:25:36 +00:00
623dcd97dc fix(cli): fix flakes related to context cancellation when establishing pg connections (#18246)
Since https://github.com/coder/coder/pull/18195 was merged, we started
running CLI tests with postgres instead of just dbmem. This surfaced
errors related to context cancellation while establishing postgres
connections.

This PR should fix https://github.com/coder/internal/issues/672. Related
to https://github.com/coder/coder/issues/15109.
2025-06-05 15:54:13 +02:00
b5fd3dd855 feat(coderd/agentapi): allow inserting apps for sub agents (#18129)
Allow creating workspace apps for a sub agent when the agent is being
created with `CreateSubAgent` in the agent api.
2025-06-05 11:57:02 +01:00
277c2c7ea7 chore(coderd/prometheusmetrics): remove dbmem from tests (#18238) 2025-06-05 09:30:27 +02:00
5f7e5d7097 feat: support prebuilt workspaces in non-default organizations (#18010)
closes https://github.com/coder/internal/issues/527
2025-06-04 14:20:29 +02:00
4d0fe20ca6 chore(coderd/database/dbauthz): update RBAC for InsertWorkspaceApp (#18223)
Instead of using `ResourceSystem` as the resource for
`InsertWorkspaceApp`, we instead use the associated workspace (if it
exists), with the action `ActionUpdate`.
2025-06-04 11:22:01 +00:00
246a829ea9 feat: evaluate dynamic parameters http endpoint (#18182)
Used when a websocket is too heavy. This implements a single request to
the preview engine.
2025-06-02 13:50:07 -05:00
2e7cd0fe22 chore(coderd/database/dbpurge): remove dbmem from tests (#18151)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 16:48:38 +02:00
1d48131e98 chore(coderd/externalauth): remove dbmem from tests (#18147)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 14:42:18 +02:00
d1fc0dd2c5 chore(coderd/database/dbmetrics): remove dbmem from tests (#18150)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 14:26:56 +02:00
8ca5519f57 chore(coderd/idpsync): run all tests with postgres (#18149)
Related to https://github.com/coder/coder/issues/15109.

Running postgres tests used to create a new postgres docker container
every time. I believe the slow down might've been caused by that and was
misattributed to postgres performance.

```
coder@main ~/coder ((0e90ac29))> DB=ci gotestsum --packages="./coderd/idpsync" -- -count=1
✓  coderd/idpsync (1.471s)

DONE 91 tests in 4.766s
```
2025-06-02 14:07:31 +02:00
bdf227c1f9 chore(coderd/database/dbgen): remove dbmem from tests (#18152)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 14:05:40 +02:00
d3ed6fe652 chore(coderd/autobuild): use dbtestutil.WillUsePostgres instead of os.Getenv in test (#18145)
Standardizing on `WillUsePostgres` will make it easier to remove the
check entirely once dbmem is removed.

Related to https://github.com/coder/coder/issues/15109.
2025-06-02 13:58:07 +02:00
782d01bae2 chore(coderd/httpmw): remove dbmem usage from tests (#18146)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 13:57:56 +02:00
9a28cb0545 chore(coderd/telemetry): remove dbmem from tests (#18148)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 13:57:27 +02:00
9812162190 chore(coderd/database/dbrollup): remove dbmem from tests (#18153)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 13:56:50 +02:00
9db114d17c feat: add filecache prometheus metrics (#18089)
Dynamic parameters has an in memory file cache. This adds prometheus
metrics to monitor said cache.
2025-05-30 11:54:54 -05:00
562c4696de fix(coderd/database/dbmem): fill DisplayGroup field for InsertWorkspaceApp (#18136)
It appears `dbmem` was missed in the new app groups feature
https://github.com/coder/coder/pull/17977.
2025-05-30 17:52:31 +01:00
9afdd33e64 fix(coderd/database/dbmem): apply rlock/runlock on GetTelemetryItems (#18133)
Fixes https://github.com/coder/coder/issues/18132
2025-05-30 16:39:32 +01:00
216fe441cf chore: align CSRF settings with deployment config (#18116) 2025-05-30 09:30:49 -05:00
f974add373 chore: rollback PR #18025 (#18118)
Rollback https://github.com/coder/coder/pull/18025
2025-05-30 11:00:26 -03:00
d779126ee3 chore: rollback PR #18081 (#18104)
Rollback https://github.com/coder/coder/pull/18081
2025-05-29 13:12:13 -03:00
e4648b6fc1 feat: allow iframing urls on the same domain as the deployment (#18102)
Used for AI tasks. We should eventually add regions to this csp header.
2025-05-29 10:07:57 -05:00
8387dd27ab chore: add form_type parameter argument to db (#17920)
`form_type` is a new parameter field in the terraform provider. Bring
that field into coder/coder.

Validation for `multi-select` has also been added.
2025-05-29 08:55:19 -05:00
776c144128 fix(coderd): ensure agent timings are non-zero on insert (#18065)
Relates to https://github.com/coder/coder/issues/15432

Ensures that no workspace build timings with zero values for started_at or ended_at are inserted into the DB or returned from the API.
2025-05-29 13:36:06 +01:00
b712d0b23f feat(coderd/agentapi): implement sub agent api (#17823)
Closes https://github.com/coder/internal/issues/619

Implement the `coderd` side of the AgentAPI for the upcoming
dev-container agents work.

`agent/agenttest/client.go` is left unimplemented for a future PR
working to implement the agent side of this feature.
2025-05-29 12:15:47 +01:00
bc83de2a72 feat: add prebuilt workspaces telemetry (#18084)
Adds telemetry for a _global_ account of prebuilt workspaces created,
failed to build, and claimed.

Partitioning this data by template/preset tuple is not currently in
scope.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-29 13:13:44 +02:00
2ec7404197 chore: make owner_name and owner_username consistent (#18081)
We've been using owner_name inconsistently as username. So this PR fixes
it by making the attribute naming more consistent.
2025-05-28 17:25:32 -03:00
b330c0803c fix: reimplement reporting of preset-hard-limited metric (#18055)
Addresses concerns raised in https://github.com/coder/coder/pull/18045
2025-05-28 14:18:32 -04:00