coder

mirror of https://github.com/coder/coder.git synced 2025-07-10 23:53:15 +00:00

Author	SHA1	Message	Date
brettkolodny	2cd3f999a6	feat: add one shot commands to the coder ssh command (#17779) Closes #2154. Warning: the tests in this PR were co-authored by AI.	2025-05-16 10:09:46 -04:00
Ethan	53ba3613b3	feat(cli): use coder connect in `coder ssh --stdio`, if available (#17572 ) Closes https://github.com/coder/vscode-coder/issues/447 Closes https://github.com/coder/jetbrains-coder/issues/543 Closes https://github.com/coder/coder-jetbrains-toolbox/issues/21 This PR adds Coder Connect support to `coder ssh --stdio`. When connecting to a workspace, if `--force-new-tunnel` is not passed, the CLI will first do a DNS lookup for `<agent>.<workspace>.<owner>.<hostname-suffix>`. If an IP address is returned, and it's within the Coder service prefix, the CLI will not create a new tailnet connection to the workspace, and instead dial the SSH server running on port 22 on the workspace directly over TCP. This allows IDE extensions to use the Coder Connect tunnel, without requiring any modifications to the extensions themselves. Additionally, `using_coder_connect` is added to the `sshNetworkStats` file, which the VS Code extension (and maybe Jetbrains?) will be able to read, and indicate to the user that they are using Coder Connect. One advantage of this approach is that running `coder ssh --stdio` on an offline workspace with Coder Connect enabled will have the CLI wait for the workspace to build, the agent to connect (and optionally, for the startup scripts to finish), before finally connecting using the Coder Connect tunnel. As a result, `coder ssh --stdio` has the overhead of looking up the workspace and agent, and checking if they are running. On my device, this meant `coder ssh --stdio <workspace>` was approximately a second slower than just connecting to the workspace directly using `ssh <workspace>.coder` (I would assume anyone serious about their Coder Connect usage would know to just do the latter anyway). To ensure this doesn't come at a significant performance cost, I've also benchmarked this PR. 
<details> <summary>Benchmark</summary>

## Methodology

All tests were completed on `dev.coder.com`, where a Linux workspace running in AWS `us-west1` was created. The machine running Coder Desktop (the 'client') was a Windows VM running in the same AWS region and VPC as the workspace.

To test the performance of the SSH connection specifically, a port was forwarded between the client and workspace using:

```
ssh -p 22 -L7001:localhost:7001 <host>
```

where `host` was either an alias for an SSH ProxyCommand that called `coder ssh`, or a Coder Connect hostname.

For latency, [`tcping`](https://www.elifulkerson.com/projects/tcping.php) was used against the forwarded port:

```
tcping -n 100 localhost 7001
```

For throughput, [`iperf3`](https://iperf.fr/iperf-download.php) was used:

```
iperf3 -c localhost -p 7001
```

where an `iperf3` server was running on the workspace on port 7001.

## Test Cases

### Testcase 1: `coder ssh` `ProxyCommand` that bicopies from Coder Connect

This case tests the implementation in this PR, such that we can write a config like:

```
Host codercliconnect
    ProxyCommand /path/to/coder ssh --stdio workspace
```

With Coder Connect enabled, `ssh -p 22 -L7001:localhost:7001 codercliconnect` will use the Coder Connect tunnel. The results were as follows:

Throughput, 10 tests, back to back:
- Average throughput across all tests: 788.20 Mbits/sec
- Minimum average throughput: 731 Mbits/sec
- Maximum average throughput: 871 Mbits/sec
- Standard Deviation: 38.88 Mbits/sec

Latency, 100 RTTs:
- Average: 0.369ms
- Minimum: 0.290ms
- Maximum: 0.473ms

### Testcase 2: `ssh` dialing Coder Connect directly without a `ProxyCommand`

This is what we assume to be the 'best' way to use Coder Connect.

Throughput, 10 tests, back to back:
- Average throughput across all tests: 789.50 Mbits/sec
- Minimum average throughput: 708 Mbits/sec
- Maximum average throughput: 839 Mbits/sec
- Standard Deviation: 39.98 Mbits/sec

Latency, 100 RTTs:
- Average: 0.369ms
- Minimum: 0.267ms
- Maximum: 0.440ms

### Testcase 3: `coder ssh` `ProxyCommand` that creates its own Tailnet connection in-process

This is what normally happens when you run `coder ssh`:

Throughput, 10 tests, back to back:
- Average throughput across all tests: 610.20 Mbits/sec
- Minimum average throughput: 569 Mbits/sec
- Maximum average throughput: 664 Mbits/sec
- Standard Deviation: 27.29 Mbits/sec

Latency, 100 RTTs:
- Average: 0.335ms
- Minimum: 0.262ms
- Maximum: 0.452ms

## Analysis

Performing a two-tailed, unpaired t-test against the throughput of testcases 1 and 2, we find a P value of `0.9450`. This suggests the difference between the data sets is not statistically significant. In other words, if the two configurations had the same true mean throughput, a difference at least this large would be expected in about 94.5% of samples.

## Conclusion

From the t-test, and by comparison to the status quo (regular `coder ssh`, which uses gvisor, and is noticeably slower), I think it's safe to say any impact on throughput or latency by the `ProxyCommand` performing a bicopy against Coder Connect is negligible. Users are very unlikely to run into performance issues as a result of using Coder Connect via `coder ssh`, as implemented in this PR.
Less scientifically, I ran these same tests on my home network with my Sydney workspace, and both throughput and latency were consistent across testcases 1 and 2. </details>	2025-04-30 15:17:10 +10:00
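The lookup described in commit 53ba3613b3 (resolve `<agent>.<workspace>.<owner>.<hostname-suffix>`, check whether the answer falls inside the Coder service prefix, then dial port 22) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the prefix value, the function names, and the example workspace names are all assumptions made for this sketch.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"time"
)

// coderServicePrefix is an ASSUMED value used for illustration only;
// the real prefix is defined in the Coder source tree.
var coderServicePrefix = netip.MustParsePrefix("fd60:627a:a42b::/48")

// coderConnectHostname builds the name that is looked up
// (hypothetical helper, named for this sketch).
func coderConnectHostname(agent, workspace, owner, suffix string) string {
	return fmt.Sprintf("%s.%s.%s.%s", agent, workspace, owner, suffix)
}

// inServicePrefix reports whether addr lies inside the assumed service prefix.
func inServicePrefix(addr netip.Addr) bool {
	return coderServicePrefix.Contains(addr)
}

// resolveCoderConnect returns the first resolved IPv6 address inside the
// prefix. Any lookup error is treated as "Coder Connect is not available".
func resolveCoderConnect(ctx context.Context, host string) (netip.Addr, bool) {
	addrs, err := net.DefaultResolver.LookupNetIP(ctx, "ip6", host)
	if err != nil {
		return netip.Addr{}, false
	}
	for _, a := range addrs {
		if inServicePrefix(a) {
			return a, true
		}
	}
	return netip.Addr{}, false
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// "main", "dev", "alice" and "coder" are placeholder values.
	host := coderConnectHostname("main", "dev", "alice", "coder")
	if addr, ok := resolveCoderConnect(ctx, host); ok {
		fmt.Println("Coder Connect available, would dial", net.JoinHostPort(addr.String(), "22"))
		return
	}
	fmt.Println("no Coder Connect address for", host, "- would open a new tailnet connection")
}
```

Treating a failed lookup the same as "not available" keeps the fallback path (a new in-process tailnet connection) as the default, which matches the behaviour the commit message describes when Coder Connect is not running.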
Mathias Fredriksson	1fc74f629e	refactor(agent): update agentcontainers api initialization (#17600 ) There were too many ways to configure the agentcontainers API resulting in inconsistent behavior or features not being enabled. This refactor introduces a control flag for enabling or disabling the containers API. When disabled, all implementations are no-op and explicit endpoint behaviors are defined. When enabled, concrete implementations are used by default but can be overridden by passing options.	2025-04-29 17:53:10 +03:00
ケイラ	f670bc31f5	chore: update testutil chan helpers (#17408 )	2025-04-16 10:37:09 -06:00
Dean Sheather	e7e47537c9	chore: fix gpg forwarding test (#17355 )	2025-04-11 03:33:53 +00:00
Spike Curtis	d312e82a51	feat: support --hostname-suffix flag on coder ssh (#17279 ) Adds `hostname-suffix` flag to `coder ssh` command for use in SSH Config ProxyCommands. Also enforces that Coder server doesn't start the suffix with a dot. part of: #16828	2025-04-07 21:33:33 +04:00
Garrett Delfosse	fc471eb384	fix: handle vscodessh style workspace names in coder ssh (#17154 ) Fixes an issue where old ssh configs that use the `owner--workspace--agent` format will fail to properly use the `coder ssh` command since we migrated off the `coder vscodessh` command.	2025-04-07 10:06:58 -04:00
Spike Curtis	f6bf6c6ec4	fix!: use names not IDs for agent SSH key seed (#17258 ) Changes the SSH host key seeding to use the owner username, workspace name, and agent name. This prevents SSH from complaining about a mismatched host key if you use Coder Desktop to connect, and delete and recreate your workspace with the same name. Previously this would generate a different key because the workspace ID changed. We also include the owner's username in anticipation of using Coder Desktop to access shared workspaces (or as a superuser) down the road, so that workspaces with the same name owned by different users will not have the same key. This change is BREAKING in a limited sense that early access users of Coder Desktop will see their SSH clients complain about host keys changing the first time each workspace is rebuilt with this code. It can be resolved by clearing your `.ssh/known_hosts` file of the Coder workspaces you access this way.	2025-04-04 12:51:46 +04:00
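To illustrate why seeding from names rather than IDs (commit f6bf6c6ec4) keeps the host key stable when a workspace is deleted and recreated, here is a small sketch. It is an assumption-laden illustration: the key type (Ed25519), the hash (SHA-256), and the function name are chosen for brevity and are not taken from the Coder agent, which may derive its keys differently.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"fmt"
)

// hostKeyFromNames derives a deterministic key from the owner username,
// workspace name, and agent name. Hypothetical helper for illustration;
// the real derivation in the agent may use another key type and seed format.
func hostKeyFromNames(owner, workspace, agent string) ed25519.PrivateKey {
	h := sha256.New()
	for _, part := range []string{owner, workspace, agent} {
		// Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
		fmt.Fprintf(h, "%d:%s", len(part), part)
	}
	// A SHA-256 digest is 32 bytes, which equals ed25519.SeedSize.
	return ed25519.NewKeyFromSeed(h.Sum(nil))
}

func main() {
	before := hostKeyFromNames("alice", "dev", "main")
	// Recreating the workspace changes its ID but not its names,
	// so the derived key is unchanged.
	after := hostKeyFromNames("alice", "dev", "main")
	other := hostKeyFromNames("bob", "dev", "main")

	fmt.Println("same names, same key:", before.Equal(after))
	fmt.Println("different owner, same key:", before.Equal(other))
}
```

Including the owner in the seed is what gives two users' identically named workspaces distinct keys, as the commit message anticipates for shared workspaces.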
Cian Johnston	a9574fb4b1	chore(cli): increase timeout for TestSSH_Container subtests (#17148 ) Closes https://github.com/coder/internal/issues/524	2025-03-28 13:52:13 +00:00
Jon Ayers	17ddee05e5	chore: update golang to 1.24.1 (#17035 ) - Update go.mod to use Go 1.24.1 - Update GitHub Actions setup-go action to use Go 1.24.1 - Fix linting issues with golangci-lint by: - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <claude@anthropic.com>	2025-03-26 01:56:39 -05:00
Mathias Fredriksson	3ac844ad3d	chore(codersdk): rename WorkspaceAgent(Dev)container structs (#16996 ) This is to free up the devcontainer name space for more targeted structs. Updates #16423	2025-03-19 10:16:14 +00:00
Cian Johnston	ca23abcc30	chore(cli): fix test flake in TestSSH_Container/NotFound (#16771 ) If you hit the list containers endpoint with no containers running, the response is different. This uses a mock lister to ensure a consistent response from the agent endpoint.	2025-03-03 14:15:25 +00:00
Cian Johnston	ec44f06f5c	feat(cli): allow SSH command to connect to running container (#16726) Fixes https://github.com/coder/coder/issues/16709 and https://github.com/coder/coder/issues/16420 Adds the capability to `coder ssh` into a running container if `CODER_AGENT_DEVCONTAINERS_ENABLE=true`. Notes: * SFTP is currently not supported * Haven't tested X11 container forwarding * Haven't tested agent forwarding	2025-02-28 09:38:45 +00:00
Thomas Kosiewski	660746462e	fix(agent/agentssh): use deterministic host key for SSH server (#16626 ) Fixes: https://github.com/coder/coder/issues/16490 The Agent's SSH server now initially generates fixed host keys and, once it receives its manifest, generates and replaces that host key with the one derived from the workspace ID, ensuring consistency across agent restarts. This prevents SSH warnings and host key verification errors when connecting to workspaces through Coder Desktop. While deterministic keys might seem insecure, the underlying Wireguard tunnel already provides encryption and anti-spoofing protection at the network layer, making this approach acceptable for our use case. --- Change-Id: I8c7e3070324e5d558374fd6891eea9d48660e1e9 Signed-off-by: Thomas Kosiewski <tk@coder.com>	2025-02-21 14:58:41 +01:00
Mathias Fredriksson	7cf62423ec	test(cli): fix TestSSH/RemoteForward_Unix_Signal flake (#16172 )	2025-01-17 16:53:09 +02:00
Ethan	2413106f22	fix: improve shell compatibility of netstat check in test (#16141) When I wrote the original just the other day, I used `$?`, which is fine on CI and in most cases, but not when the person running the test has their system shell set to fish, which uses `$status` instead. In the interest of letting this test pass locally, I'll instead just grab the line count of the grep output. However, `wc` output is padded with spaces on macOS, so we need to get rid of those too.	2025-01-15 03:23:53 +00:00
Aaron Lehmann	1aa9e32a2b	feat: add --ssh-host-prefix flag for "coder ssh" (#16088 ) This adds a flag matching `--ssh-host-prefix` from `coder config-ssh` to `coder ssh`. By trimming a custom prefix from the argument, we can set up wildcard-based `Host` entries in SSH config for the IDE plugins (and eventually `coder config-ssh`). We also replace `--` in the argument with `/`, so ownership can be specified in wildcard-based SSH hosts like `<owner>--<workspace>`. Replaces #16087. Part of https://github.com/coder/coder/issues/14986. Related to https://github.com/coder/coder/pull/16078 and https://github.com/coder/coder/pull/16080.	2025-01-13 19:07:21 -06:00
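The host-name handling described in commit 1aa9e32a2b (trim a custom prefix, then replace `--` with `/`) amounts to two string operations. The sketch below shows the idea only; the function name and the example prefix are assumptions for illustration, not identifiers from the CLI.

```go
package main

import (
	"fmt"
	"strings"
)

// workspaceFromHost turns an SSH host alias into an owner/workspace argument.
// Hypothetical helper: it trims the configured prefix, then maps "--" to "/".
func workspaceFromHost(host, prefix string) string {
	name := strings.TrimPrefix(host, prefix)
	return strings.ReplaceAll(name, "--", "/")
}

func main() {
	// "coder-vscode--" is a placeholder prefix chosen for this example.
	fmt.Println(workspaceFromHost("coder-vscode--alice--dev", "coder-vscode--"))
	// A host without the prefix passes through the trim step unchanged.
	fmt.Println(workspaceFromHost("dev", "coder-vscode--"))
}
```

With a wildcard `Host` entry in the SSH config, the matched host name can then be passed straight to `coder ssh --ssh-host-prefix <prefix>`, so one entry covers every workspace.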
Aaron Lehmann	838ee3b244	feat: add --network-info-dir and --network-info-interval flags to coder ssh (#16078 ) This is the first in a series of PRs to enable `coder ssh` to replace `coder vscodessh`. This change adds `--network-info-dir` and `--network-info-interval` flags to the `ssh` subcommand. These were formerly only available with the `vscodessh` subcommand. Subsequent PRs will add a `--ssh-host-prefix` flag to the ssh subcommand, and adjust the log file naming to contain the parent PID.	2025-01-13 18:29:31 -06:00
Ethan	a7fe35af25	fix: use `netstat` over `ss` when testing unix socket (#16103) Closes https://github.com/coder/internal/issues/274. `TestSSH/RemoteForwardUnixSocket` previously used `ss` to confirm that a socket was listening. `ss` isn't available on macOS, causing the test to flake. The test previously passed on macOS because a `2` could always be read on the SSH connection, presumably as part of some escape sequence. I confirmed that the test also passed on Linux with the `ss` command commented out: the pty would always read a sequence ending in `[?2`.	2025-01-13 20:51:55 +11:00
Mathias Fredriksson	8c44cd3dfd	test(cli/ssh): fix ssh start conflict test by faking API response (#16082 )	2025-01-10 14:48:11 +00:00
Mathias Fredriksson	ba6e84dec3	fix(cli/ssh): retry on autostart conflict (#16058 )	2025-01-08 15:15:30 +02:00
Spike Curtis	5861e516b9	chore: add standard test logger ignoring db canceled (#15556 ) Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes. Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.	2024-11-18 14:09:22 +04:00
Steven Masley	343f8ec9ab	chore: join owner, template, and org in new workspace view (#15116 ) Joins in fields like `username`, `avatar_url`, `organization_name`, `template_name` to `workspaces` via a view. The view must be maintained moving forward, but this prevents needing to add RBAC permissions to fetch related workspace fields.	2024-10-22 09:20:54 -05:00
Kayla Washburn-Love	bf4b7abf14	chore(coderd): allow creating workspaces without specifying an organization (#14048 )	2024-07-30 10:44:02 -06:00
Garrett Delfosse	fed668b432	chore: switch ssh session stats based on experiment (#13637 )	2024-06-25 10:58:45 -04:00
Aaron Lehmann	8a1216254e	feat(cli): add `--env` flag for `coder ssh` (#12991 ) This allows environment variables to be set on the SSH session. Example: coder ssh myworkspace --env VAR1=val1,VAR2=val2	2024-04-22 13:13:48 +03:00
Mathias Fredriksson	f418ece9ae	test(cli): prevent flake due to outdated build in TestSSH (#12760 ) Fixes #12752	2024-03-26 10:46:58 +00:00
Spike Curtis	af3fdc68c3	chore: refactor agent routines that use the v2 API (#12223 ) In anticipation of needing the `LogSender` to run on a context that doesn't get immediately canceled when you `Close()` the agent, I've undertaken a little refactor to manage the goroutines that get run against the Tailnet and Agent API connection. This handles controlling two contexts, one that gets canceled right away at the start of graceful shutdown, and another that stays up to allow graceful shutdown to complete.	2024-02-23 11:04:23 +04:00
Spike Curtis	da376549a3	fix: stop waiting for Agent in a goroutine in ssh test (#12268 ) Fixes race seen here: https://github.com/coder/coder/runs/21852483781 What happens is that the agent connects, completes the test, and then disconnects before the Eventually condition runs. The waiter then times out because it's looking for a connected agent. Then, since it's a `require` in a goroutine, that causes the `tGo` cleanup to hang and the whole test suite to timeout after 10 minutes. Anyway, `agenttest.New` doesn't block, and we don't actually need to wait for the agent to connect, since a successful SSH session is evidence that it connected.	2024-02-22 17:01:06 +04:00
Mathias Fredriksson	e659957b65	fix(cli/ssh): prevent reads/writes to stdin/stdout in stdio mode (#12045 ) Fixes #11530	2024-02-08 13:09:42 +02:00
Marcin Tojek	77a4792ecd	fix(cli): ssh: auto-update workspace (#11773 )	2024-01-23 18:01:44 +01:00
Mathias Fredriksson	200a87e7d4	feat(cli/ssh): allow multiple remote forwards and allow missing local file (#11648 )	2024-01-19 15:21:10 +02:00
Mathias Fredriksson	385d58caf6	fix(agent/agentssh): allow remote forwarding a socket multiple times (#11631) Fixes #11198 Fixes https://github.com/coder/customers/issues/407	2024-01-16 21:26:13 +02:00
Steven Masley	cb89bc1729	feat: restart stopped workspaces on ssh command (#11050 ) * feat: autostart workspaces on ssh & port forward This is opt out by default. VScode ssh does not have this behavior	2023-12-08 10:01:13 -06:00
Spike Curtis	46d95cb0f0	fix: wait for dial goroutine to complete (#10959 ) Fixes flake seen here: https://github.com/coder/coder/runs/19170327767 The goroutine that attempts to dial the socket didn't complete before the test did. Here we add an explicit wait for it to complete in each run of the loop.	2023-12-01 11:37:32 +04:00
Jon Ayers	967db2801b	chore: refactor ResolveAutostart tests to use dbfake (#10603 )	2023-11-30 19:33:04 -06:00
Spike Curtis	2dc565d5de	chore: remove New----Builder from dbfake function names (#10882 ) Drop "New" and "Builder" from the function names, in favor of the top-level resource created. This shortens tests and gives a nice syntax. Since everything is a builder, the prefix and suffix don't add much value and just make things harder to read. I've also chosen to leave `Do()` as the function to insert into the database. Even though it's a builder pattern, I fear `.Build()` might be confusing with Workspace Builds. One other idea is `Insert()` but if we later add dbfake functions that update, this might be inconsistent.	2023-11-29 11:06:04 +04:00
Spike Curtis	78283a7fb9	chore: remove dbfake.WorkspaceWithAgent (#10879 ) Replace dbfake.WorkspaceWithAgent() with the builder pattern and remove this function.	2023-11-27 14:30:15 +04:00
Spike Curtis	b25e5dc90b	chore: remove dbfake.WorkspaceBuild in favor of builder pattern (#10814 ) I'd like to convert dbfake into a builder pattern to prevent a proliferation of XXXWithYYY methods. This is one step of the way by removing the Non-builder function.	2023-11-22 13:04:58 +04:00
Spike Curtis	5d5b5aa074	chore: use dbfake for ssh tests rather than provisionerd (#10812 ) Refactors SSH tests to skip provisionerd and instead use dbfake to insert workspaces and builds. This should make tests faster and more reliable. dbfake.WorkspaceBuild is refactored to use a "builder" pattern with "fluent" options, as the number of options and variants was starting to get out of hand.	2023-11-21 16:22:08 +04:00
Spike Curtis	92ef0baff3	fix: remove pty match for TestSSH/RemoteForward (#10789 ) Fixes #10578	2023-11-20 20:50:09 +04:00
Spike Curtis	3dd35e019b	fix: close ssh sessions gracefully (#10732) Re-enables TestSSH/RemoteForward_Unix_Signal and addresses the underlying race: we were not closing the remote forward on context expiry, only the session and connection. However, there is still a more fundamental issue in that we don't have the ability to ensure that TCP sessions are properly terminated before tearing down the Tailnet conn. This is due to the assumption in the sockets API that the underlying IP interface is long-lived compared with the TCP socket, and thus closing a socket returns immediately and does not wait for the TCP termination handshake; that is handled asynchronously in the tcpip stack. However, this assumption does not hold for us and tailnet, since on shutdown we also tear down the tailnet connection, and this can race with the TCP termination. Closing the remote forward explicitly should prevent forward state from accumulating, since the Close() function waits for a reply from the remote SSH server. I've also attempted to work around the TCP/tailnet issue for `--stdio` by using `CloseWrite()` instead of `Close()`. By closing the write side of the connection, we half-close the TCP connection; the server detects this and closes the other direction, which then triggers our read loop to exit only after the server has had a chance to process the close. A TODO for a stacked PR is to implement this logic for `vscodessh` as well.	2023-11-17 12:43:20 +04:00
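The half-close technique mentioned in commit 3dd35e019b can be demonstrated on a plain loopback TCP connection. This is a generic sketch of `CloseWrite()` semantics using only the Go standard library; it does not involve SSH or tailnet, and the function name and reply format are made up for the example.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// halfCloseRoundTrip sends payload, half-closes the connection, and then
// reads until the server closes its side. Hypothetical demo helper.
func halfCloseRoundTrip(payload string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		// ReadAll returns once the client half-closes (the server sees EOF).
		data, _ := io.ReadAll(conn)
		// The server can still write, because only one direction is closed.
		_, _ = conn.Write([]byte("got " + string(data)))
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(payload)); err != nil {
		return "", err
	}
	tcpConn, ok := conn.(*net.TCPConn)
	if !ok {
		return "", fmt.Errorf("unexpected connection type %T", conn)
	}
	// Close only the write side; the read side stays open for the reply.
	if err := tcpConn.CloseWrite(); err != nil {
		return "", err
	}
	// The read loop ends only after the server has processed the close.
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

func main() {
	reply, err := halfCloseRoundTrip("ping")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply)
}
```

The client's final read cannot finish before the server has seen the EOF and closed its own side. That ordering is what a full `Close()` does not provide.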
Spike Curtis	34c9661f1b	fix: disable flaky test TestSSH/RemoteForward_Unix_Signal (#10711 )	2023-11-15 11:04:36 +00:00
Spike Curtis	4894eda711	feat: capture cli logs in tests (#10669 ) Adds a Logger to cli Invocation and standardizes CLI commands to use it. clitest creates a test logger by default so that CLI command logs are captured in the test logs. CLI commands that do their own log configuration are modified to add sinks to the existing logger, rather than create a new one. This ensures we still capture logs in CLI tests.	2023-11-14 22:56:27 +04:00
Spike Curtis	f400d8a0c5	fix: handle SIGHUP from OpenSSH (#10638 ) Fixes an issue where remote forwards are not correctly torn down when using OpenSSH with `coder ssh --stdio`. OpenSSH sends a disconnect signal, but then also sends SIGHUP to `coder`. Previously, we just exited when we got SIGHUP, and this raced against properly disconnecting. Fixes https://github.com/coder/customers/issues/327	2023-11-13 15:14:42 +04:00
Marcin Tojek	a1ee4d44aa	fix: test: TestSSH_RemoteForward wait for startup script (#10211 )	2023-10-11 14:17:04 +02:00
Spike Curtis	24c80bf532	fix: remove AwaitWorkspaceAgents in goroutines AwaitWorkspaceAgent calls testify.require which isn't allowed from a goroutine and causes cascading failures in the test suite such as: https://github.com/coder/coder/actions/runs/6458768855/job/17533163316 I don't believe these functions serve a direct purpose since nothing else is "waiting" for the functions to return before doing other things.	2023-10-09 20:37:23 +04:00
Kayla Washburn	c194119689	chore: rename `AwaitTemplateVersionJobCompleted` and `AwaitWorkspaceBuildJobCompleted` (#10003 )	2023-10-03 11:02:56 -06:00
Monika Pawluczuk	4966ef02cf	feat(cli): add reverse tunnelling SSH support for unix sockets (#9976 )	2023-10-03 16:39:39 +10:00
Cian Johnston	fad02081fc	fix: avoid logging env in unit tests (#9885 )	2023-09-27 13:34:40 +01:00


100 Commits