Commit Graph

1080 Commits

Author SHA1 Message Date
32239b29cb chore: add AI-tasks-specific fields to codersdk.WorkspaceBuild (#18436)
This will be needed by the frontend on the `/task/$id` page to display
the app in the sidebar.

Related to https://github.com/coder/coder/issues/18158
2025-06-20 10:59:34 +02:00
8f6a5afa4f feat: add backend logic for determining tasks tab visibility (#18401)
This PR implements the backend logic for determining if the Tasks tab
should be visible in the web UI as described in [the
RFC](https://www.notion.so/coderhq/Coder-Tasks-207d579be5928053ab68c8d9a4b59eaa?source=copy_link#210d579be5928013ab5acbe69a2f548b).

The frontend component will be added in a follow-up PR once the entire
Tasks backend is implemented so as not to break the dogfood environment
until then.
2025-06-18 18:32:34 +02:00
97474bb28b feat: support devcontainer agents in ui and unify backend (#18332)
This commit consolidates two container endpoints on the backend and improves the
frontend devcontainer support by showing names and displaying apps as
appropriate.

With this change, the frontend now has knowledge of the subagent and we can also
display things like port forwards.

The frontend was updated to show dev container labels on the border as well as
subagent connection status. The recreation flow was also adjusted a bit to show
placeholder app icons when relevant.

Support for apps was also added, although these are still WIP on the backend.
And the port forwarding utility was added in since the sub agents now provide
the necessary info.

Fixes coder/internal#666
2025-06-17 16:06:47 +03:00
5df70a613d feat: add organization scope for shared ports (#18314) 2025-06-16 16:15:59 -06:00
4bd5609e13 feat: add status watcher to MCP server (#18320)
This is meant to complement the existing task reporter since the LLM
does not call it reliably.

It also includes refactoring to use the common agent flags/env vars.
2025-06-13 12:53:43 -08:00
9a432b8d9f fix: add workspace owner id as query param to websocket (#18363)
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-06-13 14:49:32 -04:00
f126931219 chore: remove dynamic-parameters experiment (#18290)
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: jaaydenh <1858163+jaaydenh@users.noreply.github.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-06-12 12:15:05 -04:00
ae0c8701bb feat(agent): disable devcontainers for sub agents (#18303)
Updates coder/internal#621
Refs #18245
2025-06-10 10:47:02 +00:00
f569d9c33d feat: add separate max token lifetime for administrators (#18267)
# Add separate token lifetime limits for administrators

This PR introduces a new configuration option `--max-admin-token-lifetime` that allows administrators to create API tokens with longer lifetimes than regular users. By default, administrators can create tokens with a lifetime of up to 7 days (168 hours), while the existing `--max-token-lifetime` setting continues to apply to regular users.

The implementation:
- Adds a new `MaximumAdminTokenDuration` field to the session configuration
- Modifies the token validation logic to check the user's role and apply the appropriate lifetime limit
- Updates the token configuration endpoint to return the correct maximum lifetime based on the user's role
- Adds tests to verify that administrators can create tokens with longer and shorter lifetimes
- Updates documentation and help text to reflect the new option

This change allows organizations to grant administrators extended token lifetimes while maintaining tighter security controls for regular users.

Fixes #17395
2025-06-06 17:36:30 +02:00
d44d8abcd1 fix: improve task report tool complete status (#18138) 2025-05-30 17:46:19 +00:00
f974add373 chore: rollback PR #18025 (#18118)
Rollback https://github.com/coder/coder/pull/18025
2025-05-30 11:00:26 -03:00
bedeb4710b fix: improve task reporting tool description (#18119)
In my (albeit subjective) testing, this dramatically improved the
reporting ability - both in frequency and accuracy.
2025-05-30 00:00:12 +00:00
d779126ee3 chore: rollback PR #18081 (#18104)
Rollback https://github.com/coder/coder/pull/18081
2025-05-29 13:12:13 -03:00
8387dd27ab chore: add form_type parameter argument to db (#17920)
`form_type` is a new parameter field in the terraform provider. Bring
that field into coder/coder.

Validation for `multi-select` has also been added.
2025-05-29 08:55:19 -05:00
b712d0b23f feat(coderd/agentapi): implement sub agent api (#17823)
Closes https://github.com/coder/internal/issues/619

Implement the `coderd` side of the AgentAPI for the upcoming
dev-container agents work.

`agent/agenttest/client.go` is left unimplemented for a future PR
working to implement the agent side of this feature.
2025-05-29 12:15:47 +01:00
2ec7404197 chore: make owner_name and owner_username consistent (#18081)
We've been using owner_name inconsistently as username. So this PR fixes
it by making the attribute naming more consistent.
2025-05-28 17:25:32 -03:00
b4531c4218 feat: make dynamic parameters respect owner in form (#18013)
Closes https://github.com/coder/coder/issues/18012

---------

Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
2025-05-27 15:43:00 -05:00
9fc3329575 feat: persist app groups in the database (#17977) 2025-05-27 13:13:08 -06:00
a18eb9d08f feat(site): allow recreating devcontainers and showing dirty status (#18049)
This change allows showing the devcontainer dirty status in the UI as
well as a recreate button to update the devcontainer.

Closes #16424
2025-05-27 19:42:24 +03:00
71a647b001 feat: support ConvertUserLoginType for another user in codersdk(#17784)
Added `ConvertUserLoginType(ctx, user, req)` method to support
converting the login type for a specified user.
2025-05-27 09:53:02 -05:00
d63417b542 fix: update WorkspaceOwnerName to use user.name instead of user.username (#18025)
We have been using the user.username instead of user.name in wrong
places, making it very confusing for the UI.
2025-05-27 11:42:07 -03:00
9827c97f32 feat: add AI Tasks page (#18047)
**Preview:**

<img width="1624" alt="Screenshot 2025-05-26 at 21 25 04"
src="https://github.com/user-attachments/assets/2a51915d-2527-4467-bf99-1f2d876b953b"
/>
2025-05-27 11:34:07 -03:00
0731304905 feat(agent/agentcontainers): recreate devcontainers concurrently (#18042)
This change introduces a refactor of the devcontainers recreation logic
which is now handled asynchronously rather than being request scoped.
The response was consequently changed from "No Content" to "Accepted" to
reflect this.

A new `Status` field was introduced to the devcontainer struct which
replaces `Running` (bool). This reflects that the devcontainer can now
be in various states (starting, running, stopped or errored).

The status field also protects against multiple concurrent recrations,
as long as they are initiated via the API.

Updates #16424
2025-05-26 18:30:52 +03:00
94c129c03d fix!: omit name, avatar_url and last_seen_at from responses when empty (#18005)
User name, avatar URL, and last seen at, are not required fields so they
can be empty. Instead of returning the 0 values from Go, we want to make
it more agnostic, and omit them when they are empty. This make the docs
and usage way clearer for consumers.
2025-05-23 11:35:05 -03:00
53e8e9c7cd fix: reduce cost of prebuild failure (#17697)
Relates to https://github.com/coder/coder/issues/17432

### Part 1:

Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.

- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.

- By default `FailureHardLimit` is set to 3.

- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.

### Part 2

Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
  'normal',           -- Prebuilds are working as expected; this is the default, healthy state.
  'hard_limited',     -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
  'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.

- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>

### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
2025-05-21 15:16:38 -04:00
3e7ff9d9e1 chore(coderd/rbac): add Action{Create,Delete}Agent to ResourceWorkspace (#17932) 2025-05-20 21:20:56 +01:00
a123900fe8 chore: remove coder/preview dependency from codersdk (#17939) 2025-05-20 10:45:12 -05:00
e76d58f2b6 chore: disable parameter validatation for dynamic params for all transitions (#17926)
Dynamic params skip parameter validation in coder/coder.
This is because conditional parameters cannot be validated 
with the static parameters in the database.
2025-05-20 10:09:53 -05:00
769c9ee337 feat: cancel stuck pending jobs (#17803)
Closes: #16488
2025-05-20 15:22:44 +02:00
9c000468a1 chore: expose use_classic_parameter_flow on workspace response (#17925) 2025-05-19 21:59:15 +00:00
61f22a59ba feat(agent): add ParentId to agent manifest (#17888)
Closes https://github.com/coder/internal/issues/648

This change introduces a new `ParentId` field to the agent's manifest.
This will allow an agent to know if it is a child or not, as well as
knowing who the owner is.

This is part of the Dev Container Agents work
2025-05-19 16:09:56 +01:00
f044cc3550 feat: add provisioner daemon name to provisioner jobs responses (#17877)
# Description

This PR adds the `worker_name` field to the provisioner jobs endpoint.

To achieve this, the following SQL query was updated:
-
`GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisioner`

As a result, the `codersdk.ProvisionerJob` type, which represents the
provisioner job API response, was modified to include the new field.

**Notes:** 
* As mentioned in
[comment](https://github.com/coder/coder/pull/17877#discussion_r2093218206),
the `GetProvisionerJobsByIDsWithQueuePosition` query was not changed due
to load concerns. This means that for template and template version
endpoints, `worker_id` will still be returned, but `worker_name` will
not.
* Similar to `worker_id`, the `worker_name` is only present once a job
is assigned to a provisioner daemon. For jobs in a pending state (not
yet assigned), neither `worker_id` nor `worker_name` will be returned.

---

# Affected Endpoints

- `/organizations/{organization}/provisionerjobs`
- `/organizations/{organization}/provisionerjobs/{job}`

---

# Testing

- Added new tests verifying that both `worker_id` and `worker_name` are
returned once a provisioner job reaches the **succeeded** state.
- Existing tests covering state transitions and other logic remain
unchanged, as they test different scenarios.

---

# Front-end Changes

Admin provisioner jobs dashboard:
<img width="1088" alt="Screenshot 2025-05-16 at 11 51 33"
src="https://github.com/user-attachments/assets/0e20e360-c615-4497-84b7-693777c5443e"
/>

Fixes: https://github.com/coder/coder/issues/16982
2025-05-19 16:05:39 +01:00
98e2ec4417 feat: show devcontainer dirty status and allow recreate (#17880)
Updates #16424
2025-05-19 12:56:10 +03:00
f36fb67f57 chore: use static params when dynamic param metadata is missing (#17836)
Existing template versions do not have the metadata (modules + plan) in
the db. So revert to using static parameter information from the
original template import.

This data will still be served over the websocket.
2025-05-16 11:47:59 -05:00
c2bc801f83 chore: add 'classic_parameter_flow' column setting to templates (#17828)
We are forcing users to try the dynamic parameter experience first.
Currently this setting only comes into effect if an experiment is
enabled.
2025-05-15 17:55:17 -05:00
3de0003e4b feat(agent): send devcontainer CLI logs during recreate (#17845)
We need a way to surface what's happening to the user, since autostart
logs here, it's natural we do so during re-create as well.

Updates #16424
2025-05-15 16:06:56 +03:00
425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
64807e1d61 chore: apply the 4mb max limit on drpc protocol message size (#17771)
Respect the 4mb max limit on proto messages
2025-05-13 11:24:51 -05:00
0b5f27f566 feat: add parent_id column to workspace_agents table (#17758)
Adds a new nullable column `parent_id` to `workspace_agents` table. This
lays the groundwork for having child agents.
2025-05-13 00:01:31 +01:00
37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
3ee95f14ce chore: upgrade terraform-provider-coder & preview libs (#17738)
The changes in `coder/preview` necessitated the changes in
`codersdk/richparameters.go` & `provisioner/terraform/resources.go`.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-05-09 17:41:19 +02:00
29bce8d9e6 feat(cli): make MCP server work without user authentication (#17688)
Part of #17649

---

# Allow MCP server to run without authentication

This PR enhances the MCP server to operate without requiring authentication, making it more flexible for environments where authentication isn't available or necessary. Key changes:

- Replaced `InitClient` with `TryInitClient` to allow the MCP server to start without credentials
- Added graceful handling when URL or authentication is missing
- Made authentication status visible in server logs
- Added logic to skip user-dependent tools when no authenticated user is present
- Made the `coder_report_task` tool available with just an agent token (no user token required)
- Added comprehensive tests to verify operation without authentication

These changes allow the MCP server to function in more environments while still using authentication when available, improving flexibility for CI/CD and other automated environments.
2025-05-07 21:53:06 +02:00
9fe5b71d31 chore!: fix workspace apps response (#17700)
Fix WorkspaceApp response type to better reflect the schema from
https://registry.terraform.io/providers/coder/coder/latest/docs/resources/app.
2025-05-07 10:05:07 -03:00
544259b809 feat: add database tables and API routes for agentic chat feature (#17570)
Backend portion of experimental `AgenticChat` feature:
- Adds database tables for chats and chat messages
- Adds functionality to stream messages from LLM providers using
`kylecarbs/aisdk-go`
- Adds API routes with relevant functionality (list, create, update
chats, insert chat message)
- Adds experiment `codersdk.AgenticChat`

---------

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2025-05-02 17:29:57 +01:00
4ac71e9fd9 fix(codersdk/toolsdk): ensure all tools include required fields of aisdk.Schema (#17632) 2025-05-01 12:19:35 +00:00
98e5611e16 fix: fix for prebuilds claiming and deletion (#17624)
PR contains:
- fix for claiming & deleting prebuilds with immutable params
- unit test for claiming scenario
- unit test for deletion scenario

The parameter resolver was failing when deleting/claiming prebuilds
because a value for a previously-used parameter was provided to the
resolver, but since the value was unchanged (it's coming from the
preset) it failed in the resolver. The resolver was missing a check to
see if the old value != new value; if the values match then there's no
mutation of an immutable parameter.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-01 08:52:23 +00:00
53ba3613b3 feat(cli): use coder connect in coder ssh --stdio, if available (#17572)
Closes https://github.com/coder/vscode-coder/issues/447
Closes https://github.com/coder/jetbrains-coder/issues/543
Closes https://github.com/coder/coder-jetbrains-toolbox/issues/21

This PR adds Coder Connect support to `coder ssh --stdio`. 

When connecting to a workspace, if `--force-new-tunnel` is not passed, the CLI will first do a DNS lookup for `<agent>.<workspace>.<owner>.<hostname-suffix>`. If an IP address is returned, and it's within the Coder service prefix, the CLI will not create a new tailnet connection to the workspace, and instead dial the SSH server running on port 22 on the workspace directly over TCP.

This allows IDE extensions to use the Coder Connect tunnel, without requiring any modifications to the extensions themselves. 

Additionally, `using_coder_connect` is added to the `sshNetworkStats` file, which the VS Code extension (and maybe Jetbrains?) will be able to read, and indicate to the user that they are using Coder Connect.

One advantage of this approach is that running `coder ssh --stdio` on an offline workspace with Coder Connect enabled will have the CLI wait for the workspace to build, the agent to connect (and optionally, for the startup scripts to finish), before finally connecting using the Coder Connect tunnel.

As a result, `coder ssh --stdio` has the overhead of looking up the workspace and agent, and checking if they are running. On my device, this meant `coder ssh --stdio <workspace>` was approximately a second slower than just connecting to the workspace directly using `ssh <workspace>.coder` (I would assume anyone serious about their Coder Connect usage would know to just do the latter anyway).
 
To ensure this doesn't come at a significant performance cost, I've also benchmarked this PR.

<details>
<summary>Benchmark</summary>

## Methodology
All tests were completed on `dev.coder.com`, where a Linux workspace running in AWS `us-west1` was created.
The machine running Coder Desktop (the 'client') was a Windows VM running in the same AWS region and VPC as the workspace.

To test the performance of specifically the SSH connection, a port was forwarded between the client and workspace using:
```
ssh -p 22 -L7001:localhost:7001 <host>
```
where `host` was either an alias for an SSH ProxyCommand that called `coder ssh`, or a Coder Connect hostname.

For latency, [`tcping`](https://www.elifulkerson.com/projects/tcping.php) was used against the forwarded port:
```
tcping -n 100 localhost 7001
```

For throughput, [`iperf3`](https://iperf.fr/iperf-download.php) was used:
```
iperf3 -c localhost -p 7001
```
where an `iperf3` server was running on the workspace on port 7001.

## Test Cases

### Testcase 1: `coder ssh` `ProxyCommand` that bicopies from Coder Connect
This case tests the implementation in this PR, such that we can write a config like:
```
Host codercliconnect
    ProxyCommand /path/to/coder ssh --stdio workspace
```
With Coder Connect enabled, `ssh -p 22 -L7001:localhost:7001 codercliconnect` will use the Coder Connect tunnel. The results were as follows:

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 788.20 Mbits/sec
- Minimum average throughput: 731 Mbits/sec
- Maximum average throughput: 871 Mbits/sec
- Standard Deviation: 38.88 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.369ms
- Minimum: 0.290ms
- Maximum: 0.473ms

### Testcase 2: `ssh` dialing Coder Connect directly without a `ProxyCommand`

This is what we assume to be the 'best' way to use Coder Connect

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 789.50 Mbits/sec
- Minimum average throughput: 708 Mbits/sec
- Maximum average throughput: 839 Mbits/sec
- Standard Deviation: 39.98 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.369ms
- Minimum: 0.267ms
- Maximum: 0.440ms

### Testcase 3:  `coder ssh` `ProxyCommand` that creates its own Tailnet connection in-process

This is what normally happens when you run `coder ssh`:

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 610.20 Mbits/sec
- Minimum average throughput: 569 Mbits/sec
- Maximum average throughput: 664 Mbits/sec
- Standard Deviation: 27.29 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.335ms
- Minimum: 0.262ms
- Maximum: 0.452ms

## Analysis

Performing a two-tailed, unpaired t-test against the throughput of testcases 1 and 2, we find a P value of `0.9450`. This suggests the difference between the data sets is not statistically significant. In other words, there is a 94.5% chance that the difference between the data sets is due to chance.

## Conclusion

From the t-test, and by comparison to the status quo (regular `coder ssh`, which uses gvisor, and is noticeably slower), I think it's safe to say any impact on throughput or latency by the `ProxyCommand` performing a bicopy against Coder Connect is negligible. Users are very much unlikely to run into performance issues as a result of using Coder Connect via `coder ssh`, as implemented in this PR.

Less scientifically, I ran these same tests on my home network with my Sydney workspace, and both throughput and latency were consistent across testcases 1 and 2.

</details>
2025-04-30 15:17:10 +10:00
2acf0adcf2 chore(codersdk/toolsdk): improve static analyzability of toolsdk.Tools (#17562)
* Refactors toolsdk.Tools to remove opaque `map[string]any` argument in
favour of typed args structs.
* Refactors toolsdk.Tools to remove opaque passing of dependencies via
`context.Context` in favour of a tool dependencies struct.
* Adds panic recovery and clean context middleware to all tools.
* Adds `GenericTool` implementation to allow keeping `toolsdk.All` with
uniform type signature while maintaining type information in handlers.
* Adds stricter checks to `patchWorkspaceAgentAppStatus` handler.
2025-04-29 16:05:23 +01:00
268a50c193 feat(agent/agentcontainers): add file watcher and dirty status (#17573)
Fixes coder/internal#479
Fixes coder/internal#480
2025-04-29 11:53:58 +03:00
08ad910171 feat: add prebuilds configuration & bootstrapping (#17527)
Closes https://github.com/coder/internal/issues/508

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Cian Johnston <cian@coder.com>
2025-04-25 11:07:15 +02:00