Commit Graph

22 Commits

Author SHA1 Message Date
6c0bed0f53 chore: update to coder/quartz v0.2.0 (#18007)
Upgrade to coder/quartz v0.2.0 including fixing up a minor API breaking change.
2025-05-27 16:05:03 +04:00
5f516ed135 feat: improve coder connect tunnel handling on reconnect (#17598)
Closes https://github.com/coder/internal/issues/563

The [Coder Connect
tunnel](https://github.com/coder/coder/blob/main/vpn/tunnel.go) receives
workspace state from the Coder server over a [dRPC
stream.](114ba4593b/tailnet/controllers.go (L1029))
When first connecting to this stream, the current state of the user's
workspaces is received, with subsequent messages being diffs on top of
that state.

However, if the client disconnects from this stream, such as when the
user's device is suspended, and then reconnects later, no mechanism
exists for the tunnel to differentiate that message containing the
entire initial state from another diff, and so that state is incorrectly
applied as a diff.

In practice:
- Tunnel connects, receives a workspace update containing all the
existing workspaces & agents.
- Tunnel loses connection, but isn't completely stopped.
- All the user's workspaces are restarted, producing a new set of
agents.
- Tunnel regains connection, and receives a workspace update containing
all the existing workspaces & agents.
- This initial update is incorrectly applied as a diff, with the
Tunnel's state containing both the old & new agents.

This PR introduces a solution in which tunnelUpdater, when created,
sends a FreshState flag with the WorkspaceUpdate type. This flag is
handled in the vpn tunnel in the following fashion:
- Preserve existing Agents
- Remove current Agents in the tunnel that are not present in the
WorkspaceUpdate
- Remove unreferenced Workspaces
2025-05-06 16:00:16 +02:00
f670bc31f5 chore: update testutil chan helpers (#17408) 2025-04-16 10:37:09 -06:00
9e2af3e127 feat: add configurable DNS match domain for tailnet connections (#17336)
Use the hostname suffix to set the DNS match domain when creating a Tailnet as part of the vpn `Tunnel`.

part of: #16828
2025-04-11 15:00:48 +04:00
2c573dc023 feat: vpn uses WorkspaceHostnameSuffix for DNS names (#17335)
Use the hostname suffix to set DNS names as programmed into the DNS service and returned by the vpn `Tunnel`.

part of: #16828
2025-04-11 13:24:20 +04:00
3c1cb5d05a chore: add generic DNS record for checking if Coder Connect is running (#17298)
Closes https://github.com/coder/internal/issues/466

```
$ dig -6 @fd60:627a:a42b::53 is.coder--connect--enabled--right--now.coder AAAA

; <<>> DiG 9.10.6 <<>> -6 @fd60:627a:a42b::53 is.coder--connect--enabled--right--now.coder AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62390
;; flags: qr aa rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;is.coder--connect--enabled--right--now.coder. IN AAAA

;; ANSWER SECTION:
is.coder--connect--enabled--right--now.coder. 2	IN AAAA	fd60:627a:a42b::53

;; Query time: 3 msec
;; SERVER: fd60:627a:a42b::53#53(fd60:627a:a42b::53)
;; WHEN: Wed Apr 09 16:59:18 AEST 2025
;; MSG SIZE  rcvd: 134
```

Hostname considerations:
- Workspace names, usernames, and agent names can't have double hyphens, so this name can't conflict with a real Coder Connect hostname.
- Components can't start or end with hyphens according to [RFC 952](https://www.rfc-editor.org/rfc/rfc952.html)
- DNS records can't have hyphens in the 3rd and 4th positions, as to not conflict with IDNs https://datatracker.ietf.org/doc/html/rfc5891
2025-04-11 13:59:25 +10:00
17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
ba48069325 chore: implement CoderVPN client & tunnel (#15612)
Addresses #14734.

This PR wires up `tunnel.go` to a `tailnet.Conn` via the new `/tailnet` endpoint, with all the necessary controllers such that a VPN connection can be started, stopped and inspected via the CoderVPN protocol.
2024-12-05 13:30:22 +11:00
029cd5d064 fix(tailnet): prevent redial after Coord graceful restart (#15586)
fixes: https://github.com/coder/internal/issues/217

> There are a couple problems:
>
> One is that we assert the RPCs succeed, but if the pipeDialer context is canceled at the end of the test, then these assertions happen after the test is officially complete, which panics and affects other tests.

This converts these to just return the error rather than assert.

> The other is that the retrier is slightly bugged: if the current retry delay is 0 AND the ctx is done, (e.g. after successfully connecting and then gracefully disconnecting), then retrier.Wait(c.ctx) is racy and could return either true or false.

Fixes the phantom redial by explicitly checking the context before dialing. Also, in the test, we assert that the controller is closed before completing the test.
2024-11-19 11:37:11 +04:00
85c3c4c025 feat(tailnet): add alias with username and short alias to DNS (#15585)
Adds DNS aliases of the form `<agent>.<workspace>.<username>.coder.` and
`<workspace>.coder.`
2024-11-19 11:23:17 +04:00
5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
16992ee548 feat(tailnet): add workspace updates support to Controller (#15529)
re: #14730

Adds support in `tailnet.Controller` for WorkspaceUpdates.

Also checks configured controllers against the clients returned by the dialer, so that if we connect with a dialer that doesn't support an RPC (for instance the in-memory dialer for ServerTailnet doesn't support WorkspaceUpdates), we throw an error if there is a controller expecting it.
2024-11-18 10:41:19 +04:00
40802958e9 fix: use explicit api versions for agent and tailnet (#15508)
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.

What happened is that we accidentally shipped two new API features without bumping the version.  `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.

Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet.  That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs.  This isn't great, but hasn't caused us major issues because 

1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported. 
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.

Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.

To mitigate against this thing happening again, this PR also:

1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version.  That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.

With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
2024-11-15 11:16:28 +04:00
916df4d411 feat: set DNS hostnames in workspace updates controller (#15507)
re: #14730

Adds support for the workspace updates protocol controller to also program DNS names for each agent.

Right now, we only program names like `myagent.myworkspace.me.coder` and `myworkspace.coder.` (if there is exactly one agent in the workspace).  We also want to support `myagent.myworkspace.username.coder.`, but for that we need to update WorkspaceUpdates RPC to also send the workspace owner's username, which will be in a separate PR.
2024-11-15 11:00:19 +04:00
08216aaad6 feat: add workspace updates controller (#15506)
re: #14730

Adds a protocol controller for WorkspaceUpdates RPC that takes all the agents we learn about over the RPC, and programs them into the Coordination controller, so that we set up tunnels to all the agents.

Handling DNS is in a PR up the stack, as is actually wiring it up to anything.
2024-11-14 16:16:04 +04:00
e5661c2748 feat: add support for multiple tunnel destinations in tailnet (#15409)
Closes #14729

Expands the Coordination controller used by the CLI client to allow multiple tunnel destinations (agents).  Our current client uses just one, but this unifies the logic so that when we add Coder VPN, 1 is just a special case of "many."
2024-11-08 13:32:07 +04:00
8c00ebc6ee chore: refactor ServerTailnet to use tailnet.Controllers (#15408)
chore of #14729

Refactors the `ServerTailnet` to use `tailnet.Controller` so that we reuse logic around reconnection and handling control messages, instead of reimplementing.  This unifies our "client" use of the tailscale API across CLI, coderd, and wsproxy.
2024-11-08 13:18:56 +04:00
718722af1b chore: refactor tailnetAPIConnector to tailnet.Controller (#15361)
Refactors `workspacesdk.tailnetAPIConnector` as a `tailnet.Controller` to reuse all the reconnection and graceful disconnect logic.

chore re: #14729
2024-11-08 10:10:54 +04:00
d7e86278c8 chore: add resume token controller (#15346)
Implements a controller for the Tailnet API resume token RPC, by refactoring from `workspacesdk`.

chore re: #14729
2024-11-07 11:32:20 +04:00
335e4ab6bf chore: refactor sending telemetry (#15345)
Implements a tailnet API Telemetry controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-06 20:23:23 +04:00
9126cd78a6 chore: refactor DERP setting loop (#15344)
Implements a Tailnet API DERP controller by refactoring from `workspacesdk`

chore re: #14729
2024-11-06 20:04:05 +04:00
886dcbec84 chore: refactor coordination (#15343)
Refactors the way clients of the Tailnet API (clients of the API, which include both workspace "agents" and "clients") interact with the API.  Introduces the idea of abstract "controllers" for each of the RPCs in the API, and implements a Coordination controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-05 13:50:10 +04:00