Commit Graph

1451 Commits

Author SHA1 Message Date
7f056da088 feat: add hidden CODER_AGENT_IS_SUB_AGENT flag to coder agent (#17783)
Closes https://github.com/coder/internal/issues/620

Adds a new, hidden, flag `CODER_AGENT_IS_SUB_AGENT` to the `coder agent`
command.
2025-05-13 10:57:50 +01:00
578b9ff5fe fix: enrich the notLoggedInMessage error message with the full path to the coder (#17715)
---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-05-12 11:45:24 -07:00
e0dd50d7fb chore(cli): fix test flake in TestExpMcpServer (#17772)
Test was failing inside a Coder workspace.
2025-05-12 17:15:24 +01:00
37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
af2941bb92 feat: add is_prebuild_claim to distinguish post-claim provisioning (#17757)
Used in combination with
https://github.com/coder/terraform-provider-coder/pull/396

This is required by both https://github.com/coder/coder/pull/17475 and
https://github.com/coder/coder/pull/17571

Operators may need to conditionalize their templates to perform certain
operations once a prebuilt workspace has been claimed. This value will
**only** be set once a claim takes place and a subsequent `terraform
apply` occurs. Any `terraform apply` runs thereafter will be
indistinguishable from a normal run on a workspace.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-12 14:19:03 +00:00
3ee95f14ce chore: upgrade terraform-provider-coder & preview libs (#17738)
The changes in `coder/preview` necessitated the changes in
`codersdk/richparameters.go` & `provisioner/terraform/resources.go`.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-05-09 17:41:19 +02:00
a9f1a6b2a2 fix: revert fix: persist terraform modules during template import (#17665) (#17734)
This reverts commit ae3d90b057.
2025-05-08 22:03:08 -04:00
ae3d90b057 fix: persist terraform modules during template import (#17665) 2025-05-08 16:13:46 -06:00
c5c3a54fca fix: create ssh directory if it doesn't already exist when running coder config-ssh (#17711)
Closes
[coder/internal#623](https://github.com/coder/internal/issues/623)

> [!WARNING]  
> PR co-authored by Claude Code
2025-05-08 10:10:52 -04:00
29bce8d9e6 feat(cli): make MCP server work without user authentication (#17688)
Part of #17649

---

# Allow MCP server to run without authentication

This PR enhances the MCP server to operate without requiring authentication, making it more flexible for environments where authentication isn't available or necessary. Key changes:

- Replaced `InitClient` with `TryInitClient` to allow the MCP server to start without credentials
- Added graceful handling when URL or authentication is missing
- Made authentication status visible in server logs
- Added logic to skip user-dependent tools when no authenticated user is present
- Made the `coder_report_task` tool available with just an agent token (no user token required)
- Added comprehensive tests to verify operation without authentication

These changes allow the MCP server to function in more environments while still using authentication when available, improving flexibility for CI/CD and other automated environments.
2025-05-07 21:53:06 +02:00
544259b809 feat: add database tables and API routes for agentic chat feature (#17570)
Backend portion of experimental `AgenticChat` feature:
- Adds database tables for chats and chat messages
- Adds functionality to stream messages from LLM providers using
`kylecarbs/aisdk-go`
- Adds API routes with relevant functionality (list, create, update
chats, insert chat message)
- Adds experiment `codersdk.AgenticChat`

---------

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2025-05-02 17:29:57 +01:00
c278662218 feat: collect database metrics (#17635)
Currently we don't have a way to get insight into Postgres connections
being exhausted.

By using the prometheus' [`DBStats`
collector](https://github.com/prometheus/client_golang/blob/main/prometheus/collectors/dbstats_collector.go),
we get some insight out-of-the-box.

```
# HELP go_sql_idle_connections The number of idle connections.
# TYPE go_sql_idle_connections gauge
go_sql_idle_connections{db_name="coder"} 1
# HELP go_sql_in_use_connections The number of connections currently in use.
# TYPE go_sql_in_use_connections gauge
go_sql_in_use_connections{db_name="coder"} 2
# HELP go_sql_max_idle_closed_total The total number of connections closed due to SetMaxIdleConns.
# TYPE go_sql_max_idle_closed_total counter
go_sql_max_idle_closed_total{db_name="coder"} 112
# HELP go_sql_max_idle_time_closed_total The total number of connections closed due to SetConnMaxIdleTime.
# TYPE go_sql_max_idle_time_closed_total counter
go_sql_max_idle_time_closed_total{db_name="coder"} 0
# HELP go_sql_max_lifetime_closed_total The total number of connections closed due to SetConnMaxLifetime.
# TYPE go_sql_max_lifetime_closed_total counter
go_sql_max_lifetime_closed_total{db_name="coder"} 0
# HELP go_sql_max_open_connections Maximum number of open connections to the database.
# TYPE go_sql_max_open_connections gauge
go_sql_max_open_connections{db_name="coder"} 10
# HELP go_sql_open_connections The number of established connections both in use and idle.
# TYPE go_sql_open_connections gauge
go_sql_open_connections{db_name="coder"} 3
# HELP go_sql_wait_count_total The total number of connections waited for.
# TYPE go_sql_wait_count_total counter
go_sql_wait_count_total{db_name="coder"} 28
# HELP go_sql_wait_duration_seconds_total The total time blocked waiting for a new connection.
# TYPE go_sql_wait_duration_seconds_total counter
go_sql_wait_duration_seconds_total{db_name="coder"} 0.086936235
```

`go_sql_wait_count_total` is the metric I'm most interested in gaining,
but the others are also very useful.

Changing the prefix is easy (`prometheus.WrapRegistererWithPrefix`), but
getting rid of the `go_` segment is not quite so easy. I've kept the
changeset small for now.

**NOTE:** I imported a library to determine the database name from the
given conn string. It's [not as
simple](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING)
as one might hope. The database name is used for the `db_name` label.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-02 12:17:01 +02:00
c7fc7b91ec fix: create directory before writing coder connect network info file (#17628)
The regular network info file creation code also calls `Mkdirall`.

Wasn't picked up in manual testing as I already had the `/net` folder in
my VSCode.

Wasn't picked up in automated testing because we use an in-memory FS,
which for some reason does this implicitly.
2025-05-01 16:53:13 +10:00
d7e6eb7914 chore(cli): fix test flake when running in coder workspace (#17604)
This test was failing inside a Coder workspace due to
`CODER_AGENT_TOKEN` being set.
2025-04-30 09:18:58 +01:00
7a1e56b707 test: avoid sharing echo.Responses across tests (#17610)
I missed this in https://github.com/coder/coder/pull/17211 because I
only searched for `:= &echo.Responses` and not `= &echo.Responses` 🤦

Fixes flakes like
https://github.com/coder/coder/actions/runs/14746732612/job/41395403979
2025-04-30 05:18:13 +00:00
53ba3613b3 feat(cli): use coder connect in coder ssh --stdio, if available (#17572)
Closes https://github.com/coder/vscode-coder/issues/447
Closes https://github.com/coder/jetbrains-coder/issues/543
Closes https://github.com/coder/coder-jetbrains-toolbox/issues/21

This PR adds Coder Connect support to `coder ssh --stdio`. 

When connecting to a workspace, if `--force-new-tunnel` is not passed, the CLI will first do a DNS lookup for `<agent>.<workspace>.<owner>.<hostname-suffix>`. If an IP address is returned, and it's within the Coder service prefix, the CLI will not create a new tailnet connection to the workspace, and instead dial the SSH server running on port 22 on the workspace directly over TCP.

This allows IDE extensions to use the Coder Connect tunnel, without requiring any modifications to the extensions themselves. 

Additionally, `using_coder_connect` is added to the `sshNetworkStats` file, which the VS Code extension (and maybe Jetbrains?) will be able to read, and indicate to the user that they are using Coder Connect.

One advantage of this approach is that running `coder ssh --stdio` on an offline workspace with Coder Connect enabled will have the CLI wait for the workspace to build, the agent to connect (and optionally, for the startup scripts to finish), before finally connecting using the Coder Connect tunnel.

As a result, `coder ssh --stdio` has the overhead of looking up the workspace and agent, and checking if they are running. On my device, this meant `coder ssh --stdio <workspace>` was approximately a second slower than just connecting to the workspace directly using `ssh <workspace>.coder` (I would assume anyone serious about their Coder Connect usage would know to just do the latter anyway).
 
To ensure this doesn't come at a significant performance cost, I've also benchmarked this PR.

<details>
<summary>Benchmark</summary>

## Methodology
All tests were completed on `dev.coder.com`, where a Linux workspace running in AWS `us-west1` was created.
The machine running Coder Desktop (the 'client') was a Windows VM running in the same AWS region and VPC as the workspace.

To test the performance of specifically the SSH connection, a port was forwarded between the client and workspace using:
```
ssh -p 22 -L7001:localhost:7001 <host>
```
where `host` was either an alias for an SSH ProxyCommand that called `coder ssh`, or a Coder Connect hostname.

For latency, [`tcping`](https://www.elifulkerson.com/projects/tcping.php) was used against the forwarded port:
```
tcping -n 100 localhost 7001
```

For throughput, [`iperf3`](https://iperf.fr/iperf-download.php) was used:
```
iperf3 -c localhost -p 7001
```
where an `iperf3` server was running on the workspace on port 7001.

## Test Cases

### Testcase 1: `coder ssh` `ProxyCommand` that bicopies from Coder Connect
This case tests the implementation in this PR, such that we can write a config like:
```
Host codercliconnect
    ProxyCommand /path/to/coder ssh --stdio workspace
```
With Coder Connect enabled, `ssh -p 22 -L7001:localhost:7001 codercliconnect` will use the Coder Connect tunnel. The results were as follows:

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 788.20 Mbits/sec
- Minimum average throughput: 731 Mbits/sec
- Maximum average throughput: 871 Mbits/sec
- Standard Deviation: 38.88 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.369ms
- Minimum: 0.290ms
- Maximum: 0.473ms

### Testcase 2: `ssh` dialing Coder Connect directly without a `ProxyCommand`

This is what we assume to be the 'best' way to use Coder Connect

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 789.50 Mbits/sec
- Minimum average throughput: 708 Mbits/sec
- Maximum average throughput: 839 Mbits/sec
- Standard Deviation: 39.98 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.369ms
- Minimum: 0.267ms
- Maximum: 0.440ms

### Testcase 3:  `coder ssh` `ProxyCommand` that creates its own Tailnet connection in-process

This is what normally happens when you run `coder ssh`:

**Throughput, 10 tests, back to back:**
- Average throughput across all tests: 610.20 Mbits/sec
- Minimum average throughput: 569 Mbits/sec
- Maximum average throughput: 664 Mbits/sec
- Standard Deviation: 27.29 Mbits/sec

**Latency, 100 RTTs:**
- Average: 0.335ms
- Minimum: 0.262ms
- Maximum: 0.452ms

## Analysis

Performing a two-tailed, unpaired t-test against the throughput of testcases 1 and 2, we find a P value of `0.9450`. This suggests the difference between the data sets is not statistically significant. In other words, there is a 94.5% chance that the difference between the data sets is due to chance.

## Conclusion

From the t-test, and by comparison to the status quo (regular `coder ssh`, which uses gvisor, and is noticeably slower), I think it's safe to say any impact on throughput or latency by the `ProxyCommand` performing a bicopy against Coder Connect is negligible. Users are very much unlikely to run into performance issues as a result of using Coder Connect via `coder ssh`, as implemented in this PR.

Less scientifically, I ran these same tests on my home network with my Sydney workspace, and both throughput and latency were consistent across testcases 1 and 2.

</details>
2025-04-30 15:17:10 +10:00
2acf0adcf2 chore(codersdk/toolsdk): improve static analyzability of toolsdk.Tools (#17562)
* Refactors toolsdk.Tools to remove opaque `map[string]any` argument in
favour of typed args structs.
* Refactors toolsdk.Tools to remove opaque passing of dependencies via
`context.Context` in favour of a tool dependencies struct.
* Adds panic recovery and clean context middleware to all tools.
* Adds `GenericTool` implementation to allow keeping `toolsdk.All` with
uniform type signature while maintaining type information in handlers.
* Adds stricter checks to `patchWorkspaceAgentAppStatus` handler.
2025-04-29 16:05:23 +01:00
1fc74f629e refactor(agent): update agentcontainers api initialization (#17600)
There were too many ways to configure the agentcontainers API resulting
in inconsistent behavior or features not being enabled. This refactor
introduces a control flag for enabling or disabling the containers API.
When disabled, all implementations are no-op and explicit endpoint
behaviors are defined. When enabled, concrete implementations are used
by default but can be overridden by passing options.
2025-04-29 17:53:10 +03:00
22b932a8e0 fix(cli): fix prompt issue in mcp configure claude-code (#17599)
* Updates default Coder prompt.
* Skips the directions to report tasks if the pre-requisites are not
available (agent token and app slug).
* Adds the capability to override the default Coder prompt via
`CODER_MCP_CLAUDE_CODER_PROMPT`.
2025-04-29 15:23:16 +01:00
08ad910171 feat: add prebuilds configuration & bootstrapping (#17527)
Closes https://github.com/coder/internal/issues/508

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Cian Johnston <cian@coder.com>
2025-04-25 11:07:15 +02:00
9922240fd4 feat: enable masking password inputs instead of blocking echo (#17469)
Closes #17059
2025-04-24 09:54:00 +02:00
444bd6a212 fix(cli/server.go): switch to alternate maven repo for postgres binaries (#17451)
Not really guaranteed, but worth a shot.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2025-04-22 09:02:35 +01:00
9fe3fd4e28 chore: change config-ssh Call to Action to use suffix (#17445)
fixes #16828

With all the recent changes, I believe it is now safe to change the Call to Action for `config-ssh` to use the hostname suffix rather than prefix if it was set.
2025-04-17 12:16:29 +04:00
b0854aa971 feat: modify config-ssh to check for Coder Connect (#17419)
relates to #16828

Changes SSH config so that suffixes only match if Coder Connect is not running / available. This means that we will use the existing Coder Connect tunnel if it is available, rather than creating a new tunnel via `coder ssh --stdio`.
2025-04-17 12:04:00 +04:00
3b54254177 feat: add coder connect exists hidden subcommand (#17418)
Adds a new hidden subcommand `coder connect exists <hostname>` that checks if the name exists via Coder Connect. This will be used in SSH config to match only if Coder Connect is unavailable for the hostname in question, so that the SSH client will directly dial the workspace over an existing Coder Connect tunnel.

Also refactors the way we inject a test DNS resolver into the lookup functions so that we can test from outside the `workspacesdk` package.
2025-04-17 11:23:24 +04:00
f670bc31f5 chore: update testutil chan helpers (#17408) 2025-04-16 10:37:09 -06:00
b7cd545d0a test: fix TestConfigSSH_FileWriteAndOptionsFlow on Windows 11 24H2 (#17410)
Fixes tests on Windows 11 due to `printf` not being a recognized command name.
2025-04-16 14:29:45 +04:00
70b113de7b feat: add edit-role within user command (#17341) 2025-04-15 18:30:20 -04:00
2d2c9bda98 fix(cli): correct logic around CODER_MCP_APP_STATUS_SLUG (#17391)
Past me was not smart.
2025-04-14 16:24:02 +01:00
a98605913a feat: mark prebuilds as such and set their preset ids (#16965)
This pull request closes https://github.com/coder/internal/issues/513
2025-04-14 15:34:50 +02:00
7b0422b49b fix(codersdk/toolsdk): fix tool schemata (#17365)
Fixes two issues with the MCP server:
- Ensures we have a non-null schema, as the following schema was making
claude-code unhappy:

 
```
        "inputSchema": { "type": "object", "properties": null },
```


- Skip adding the coder_report_task tool if an agent client is not
available. Otherwise the agent may try to report tasks and get confused.
2025-04-11 18:58:17 +01:00
1235550637 feat(codersdk): add toolsdk and replace existing mcp server tool impl (#17343)
- Refactors existing `mcp` package to use `kylecarbs/aisdk-go` and moves
to `codersdk/toolsdk` package.
- Updates existing MCP server implementation to use `codersdk/toolsdk`

Co-authored-by: Kyle Carberry <kyle@coder.com>
2025-04-11 10:24:45 +01:00
e7e47537c9 chore: fix gpg forwarding test (#17355) 2025-04-11 03:33:53 +00:00
1e0051a9a2 feat(testutil): add GetRandomNameHyphenated (#17342)
This started coming up more often for me, so time for a test helper!

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-10 19:08:38 +01:00
a8fbe71a22 chore(cli): increase healthcheck timeout in TestSupportbundle (#17291)
Fixes https://github.com/coder/internal/issues/272

* Increases healthcheck timeout in tests. This seems to be the most
usual cause of test failures.
* Adds a non-nilness check before caching a healthcheck report.
* Modifies the HTTP response code to 503 (was 404) when no healthcheck
report is available. 503 seems to be a better indicator of the server
state in this case, whereas 404 could be misinterpreted as a typo in the
healthcheck URL.
2025-04-09 09:21:17 +01:00
8d122aa4ab chore(cli): avoid use of testutil.RandomPort() in prometheus test (#17297)
Should hopefully fix https://github.com/coder/internal/issues/282

Instead of picking a random port for the prometheus server, listen on
`:0` and read the port from the CLI stdout.
2025-04-09 09:20:47 +01:00
52d555880c chore: add custom samesite options to auth cookies (#16885)
Allows controlling `samesite` cookie settings from the deployment config
2025-04-08 14:15:14 -05:00
389e88ec82 chore(cli): refactor TestServer/Prometheus to use testutil.Eventually (#17295)
Updates https://github.com/coder/internal/issues/282

Refactors existing tests to use `testutil.Eventually` which plays nicer
with `testutil.Context`.
2025-04-08 16:53:22 +01:00
9eeb506ae5 feat: modify config-ssh to set the host suffix (#17280)
Wires up `config-ssh` command to use a hostname suffix if configured.

part of: #16828


e.g. `coder config-ssh --hostname-suffix spiketest` gives:

```
# ------------START-CODER-----------
# This section is managed by coder. DO NOT EDIT.
#
# You should not hand-edit this section unless you are removing it, all
# changes will be lost when running "coder config-ssh".
#
# Last config-ssh options:
# :hostname-suffix=spiketest
#
Host coder.* *.spiketest
        ConnectTimeout=0
        StrictHostKeyChecking=no
        UserKnownHostsFile=/dev/null
        LogLevel ERROR
        ProxyCommand /home/coder/repos/coder/build/coder_config_ssh --global-config /home/coder/.config/coderv2 ssh --stdio --ssh-host-prefix coder. --hostname-suffix spiketest %h
# ------------END-CODER------------
```
2025-04-08 11:48:18 +04:00
d312e82a51 feat: support --hostname-suffix flag on coder ssh (#17279)
Adds `hostname-suffix` flag to `coder ssh` command for use in SSH Config ProxyCommands.

Also enforces that Coder server doesn't start the suffix with a dot.

part of: #16828
2025-04-07 21:33:33 +04:00
fc471eb384 fix: handle vscodessh style workspace names in coder ssh (#17154)
Fixes an issue where old ssh configs that use the
`owner--workspace--agent` format will fail to properly use the `coder
ssh` command since we migrated off the `coder vscodessh` command.
2025-04-07 10:06:58 -04:00
59c5bc9bd2 feat: add hostname-suffix option to config-ssh (#17270)
Adds `hostname-suffix` as a Config SSH option that we get from Coderd, and also accept via a CLI flag.

It doesn't actually do anything with this value --- that's for PRs up the stack, since we need the `coder ssh` command to be updated to understand the suffix first.
2025-04-07 12:11:04 +04:00
24248736ac feat: add host suffix to /api/v2/deployment/ssh (#17269)
Adds `HostnameSuffix` to ssh config API and deprecates `HostnamePrefix`. We will still support setting and using the prefix for some time.
2025-04-07 11:57:10 +04:00
87d9ff0973 feat: add CODER_WORKSPACE_HOSTNAME_SUFFIX (#17268)
Adds deployment option `CODER_WORKSPACE_HOSTNAME_SUFFIX`. This will eventually replace `CODER_SSH_HOSTNAME_PREFIX`, but we will do this slowly and support both for `coder ssh` for some time.

Note that the name is changed to "workspace" hostname, since this suffix will also be used for Coder Connect on Coder Desktop, which is not limited to SSH.
2025-04-07 11:35:47 +04:00
ae7afd1aa0 feat: split cli roles edit command into create and update commands (#17121)
Closes #14239
2025-04-04 14:04:20 -04:00
f6bf6c6ec4 fix!: use names not IDs for agent SSH key seed (#17258)
Changes the SSH host key seeding to use the owner username, workspace name, and agent name. This prevents SSH from complaining about a mismatched host key if you use Coder Desktop to connect, and delete and recreate your workspace with the same name. Previously this would generate a different key because the workspace ID changed.

We also include the owner's username in anticipation of using Coder Desktop to access shared workspaces (or as a superuser) down the road, so that workspaces with the same name owned by different users will not have the same key.

This change is **BREAKING** in a limited sense that early access users of Coder Desktop will see their SSH clients complain about host keys changing the first time each workspace is rebuilt with this code. It can be resolved by clearing your `.ssh/known_hosts` file of the Coder workspaces you access this way.
2025-04-04 12:51:46 +04:00
aa3d71d169 feat(cli): support opening devcontainers in vscode (#17189)
Closes https://github.com/coder/coder/issues/16427

Adds the option `-c,--container` to `open vscode` that allows opening
VSCode into a running devcontainer.
2025-04-03 10:21:23 +01:00
4aa45a5c43 fix(cli): modify exp mcp configure to also read claude API key from CLAUDE_API_KEY env (#17229)
Currently you have to set `CODER_MCP_CLAUDE_API_KEY`, which can be
obnoxious.
2025-04-03 09:45:17 +01:00
6fdad0272d fix: avoid sharing echo.Responses across tests (#17211)
Closes https://github.com/coder/internal/issues/551

We've noticed lots of flakes in `go test -race` tests that use the echo provisioner. I believe the root cause of this to be https://github.com/coder/coder/pull/17012/, where we started mutating the `echo.Responses`. This only caused issues as we previously shared `echo.Responses` across multiple test cases.

This PR is therefore the same as https://github.com/coder/coder/pull/17128, but I believe this is all the cases where an `echo.Responses` is shared between tests - including tests that haven't flaked (yet).
2025-04-02 13:06:19 +11:00
88bae05223 feat(cli): implement exp mcp configure claude-code command (#17195)
Updates `~/.claude.json` and `~/.claude/CLAUDE.md` with required
settings for agentic usage.
2025-04-01 20:06:42 +01:00