Commit Graph

21 Commits

Author SHA1 Message Date
335e4ab6bf chore: refactor sending telemetry (#15345)
Implements a tailnet API Telemetry controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-06 20:23:23 +04:00
9126cd78a6 chore: refactor DERP setting loop (#15344)
Implements a Tailnet API DERP controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-06 20:04:05 +04:00
886dcbec84 chore: refactor coordination (#15343)
Refactors the way clients of the Tailnet API (which include both workspace "agents" and "clients") interact with the API. Introduces the idea of abstract "controllers" for each of the RPCs in the API, and implements a Coordination controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-05 13:50:10 +04:00
b1298a3c1e feat: add WorkspaceUpdates tailnet RPC (#14847)
Closes #14716
Closes #14717

Adds a new user-scoped tailnet API endpoint (`api/v2/tailnet`) with a new RPC stream for receiving updates on workspaces owned by a specific user, as defined in #14716. 

When a stream is started, the `WorkspaceUpdatesProvider` will begin listening on the user-scoped pubsub events implemented in #14964. When a relevant event type is seen (such as a workspace state transition), the provider will query the DB for all the workspaces (and agents) owned by the user. This gets compared against the result of the previous query to produce a set of workspace updates. 

Workspace updates can be requested for any user ID; however, only workspaces the authorised user is permitted to `ActionRead` will have their updates streamed.
Opening a tunnel to an agent requires that the user can perform `ActionSSH` against the workspace containing it.
2024-11-01 14:53:53 +11:00
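
The "compare against the previous query" step described in #14847 can be sketched as a diff between two snapshots. This is a minimal illustration only: the `workspace` and `update` types and the `diff` function are hypothetical and far simpler than whatever `WorkspaceUpdatesProvider` actually stores.

```go
package main

import (
	"fmt"
	"sort"
)

// workspace is a hypothetical, trimmed-down row from the DB query.
type workspace struct {
	ID     string
	Status string
}

// update lists what changed between two snapshots (hypothetical shape).
type update struct {
	Upserted []workspace // new or changed workspaces
	Deleted  []string    // IDs present before but missing now
}

// diff compares the previous query result with the current one.
func diff(prev, cur map[string]workspace) update {
	var u update
	for id, w := range cur {
		if old, ok := prev[id]; !ok || old != w {
			u.Upserted = append(u.Upserted, w)
		}
	}
	for id := range prev {
		if _, ok := cur[id]; !ok {
			u.Deleted = append(u.Deleted, id)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Slice(u.Upserted, func(i, j int) bool { return u.Upserted[i].ID < u.Upserted[j].ID })
	sort.Strings(u.Deleted)
	return u
}

func main() {
	prev := map[string]workspace{
		"w1": {ID: "w1", Status: "running"},
		"w2": {ID: "w2", Status: "stopped"},
	}
	cur := map[string]workspace{
		"w1": {ID: "w1", Status: "stopped"},
		"w3": {ID: "w3", Status: "running"},
	}
	fmt.Printf("%+v\n", diff(prev, cur))
}
```

In this sketch the provider would keep `cur` as the new `prev` after each event, so only the changes since the last query are sent to the stream.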
cd890aa3a0 feat: enable key rotation (#15066)
This PR contains the remaining logic necessary to hook up key rotation
to the product.
2024-10-25 17:14:35 +01:00
7d9f5ab81d chore: add Coder service prefix to tailnet (#14943)
re: #14715

This PR introduces the Coder service prefix, `fd60:627a:a42b::/48`, and refactors our existing code to refer to the Tailscale service prefix explicitly (rather than implicitly).

Removes the unused `Addresses` agent option. All clients today assume they can compute the Agent's IP address based on its UUID, so an agent started with a custom address would break things.
2024-10-04 10:04:10 +04:00
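
To illustrate why clients can derive an Agent's address from its UUID (#14943), here is a minimal sketch. The prefix is the one named in the commit message; the helper `addrFromUUID` and the exact byte layout (keep the 48-bit prefix, fill the remaining 10 bytes from the tail of the UUID) are assumptions made for this example, not a statement of the repository's actual mapping.

```go
package main

import (
	"fmt"
	"net/netip"
)

// coderServicePrefix is the /48 named in the commit message.
var coderServicePrefix = netip.MustParsePrefix("fd60:627a:a42b::/48")

// addrFromUUID is a hypothetical helper. It keeps the first 6 bytes
// (48 bits) of the prefix and fills the other 10 bytes from the UUID.
// The byte layout is an assumption made for illustration.
func addrFromUUID(prefix netip.Prefix, uuid [16]byte) netip.Addr {
	b := prefix.Addr().As16()
	copy(b[6:], uuid[6:])
	return netip.AddrFrom16(b)
}

func main() {
	// An arbitrary example UUID; any 16 bytes would do.
	var id [16]byte
	for i := range id {
		id[i] = byte(i + 1)
	}
	addr := addrFromUUID(coderServicePrefix, id)
	fmt.Println(addr, coderServicePrefix.Contains(addr))
}
```

Because the mapping is deterministic, any client that knows the UUID computes the same address without asking the agent, which is why a custom `Addresses` option would break that assumption.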
2df9a3e554 fix: fix tailnet remoteCoordination to wait for server (#14666)
Fixes #12560

When gracefully disconnecting from the coordinator, we would send the Disconnect message and then close the dRPC stream.  However, closing the dRPC stream can cause the server not to process the Disconnect message, since we use the stream context in a `select` while sending it to the coordinator.

This is a product bug uncovered by the flake, and probably results in us failing graceful disconnect some minority of the time.

Instead, the `remoteCoordination` (and `inMemoryCoordination` for consistency) should send the Disconnect message and then wait for the coordinator to hang up (on some graceful disconnect timer, in the form of a context).
2024-09-16 09:24:30 +04:00
fb3523b37f chore: remove legacy AgentIP address (#14640)
Removes support for the Agent's "legacy IP", a hardcoded IP address that all agents used before we introduced "single tailnet". Single tailnet went GA in 2.7.0.
2024-09-12 07:40:19 +04:00
8c15192433 feat(cli): add p2p diagnostics to ping (#14426)
First PR to address #14244.

Adds to `coder ping` the common potential reasons why a direct connection to the workspace agent couldn't be established:
- If the Coder deployment administrator has blocked direct connections (`CODER_BLOCK_DIRECT`).
- If the client has no STUN servers within its DERP map.
- If the client or agent appears to be behind a hard NAT, as per Tailscale `netInfo.MappingVariesByDestIP`.

Also adds a warning if the client or agent has a network interface below the 'safe' MTU for tailnet. This warning is always displayed at the end of a `coder ping`.
2024-08-28 15:39:01 +10:00
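
The three checks listed in #14426 could be expressed as a small function that collects human-readable reasons. The `connInfo` struct, its fields, and `p2pBlockers` are hypothetical names for this sketch; the real CLI gathers these facts from the deployment and from Tailscale's network info.

```go
package main

import "fmt"

// connInfo holds hypothetical inputs for the diagnostics.
type connInfo struct {
	DirectBlocked bool // deployment has CODER_BLOCK_DIRECT set
	STUNServers   int  // STUN servers in the client's DERP map
	ClientHardNAT bool // client's MappingVariesByDestIP
	AgentHardNAT  bool // agent's MappingVariesByDestIP
}

// p2pBlockers returns the likely reasons a direct connection failed.
func p2pBlockers(c connInfo) []string {
	var reasons []string
	if c.DirectBlocked {
		reasons = append(reasons, "administrator has blocked direct connections")
	}
	if c.STUNServers == 0 {
		reasons = append(reasons, "client has no STUN servers in its DERP map")
	}
	if c.ClientHardNAT {
		reasons = append(reasons, "client appears to be behind a hard NAT")
	}
	if c.AgentHardNAT {
		reasons = append(reasons, "agent appears to be behind a hard NAT")
	}
	return reasons
}

func main() {
	for _, r := range p2pBlockers(connInfo{STUNServers: 0, AgentHardNAT: true}) {
		fmt.Println("possible cause:", r)
	}
}
```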
cf8be4eac5 feat: add resume support to coordinator connections (#14234) 2024-08-20 17:16:49 +10:00
e2cec454bc fix: check for io.EOF error in derpmap to resolve flake (#14125)
See: https://github.com/coder/coder/actions/runs/10218717887/job/28275465405?pr=14045
2024-08-02 17:08:47 +00:00
e8db21c89e chore: add additional network telemetry stats & events (#13800) 2024-07-10 14:14:35 +10:00
a110d18275 chore: add DRPC tailnet & cli network telemetry (#13687) 2024-07-03 15:23:46 +10:00
6c94dd4f23 chore: add DRPC server implementation for network telemetry (#13675) 2024-07-02 01:50:52 +10:00
c94b5188bd fix: modify workspacesdk to ask for tailnet API 2.0 (#13684)
#13617 bumped the Agent/Tailnet API minor version because it adds telemetry features.  However, we don't actually use the protocol features yet, so it's a bit obnoxious for our CLI client to ask for the newest API version.

This is particularly true of the CLI client, since that's distributed separately, so if an end user installs the latest CLI client and their organization hasn't fully upgraded, then it will fail to connect.

Since we have a release coming up and the telemetry stuff won't make it, I think we should roll back to version 2.0 until we actually implement the telemetry stuff. That way the newest release (2.13) will work with Coder servers all the way back to 2.9.
2024-06-27 15:38:21 +04:00
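
The reasoning in #13684 rests on how a requested API version is matched against what the server supports. The sketch below assumes a common rule (same major version, and the server's minor version at least as high as the one requested); that rule, the `version` type, and `compatible` are assumptions for illustration rather than the project's confirmed negotiation logic.

```go
package main

import "fmt"

// version is a hypothetical major.minor API version.
type version struct{ Major, Minor int }

// compatible applies an assumed rule: majors must match and the server
// must support at least the minor version the client asks for.
func compatible(server, requested version) bool {
	return server.Major == requested.Major && server.Minor >= requested.Minor
}

func main() {
	oldServer := version{Major: 2, Minor: 0}
	// Asking for 2.1 fails against an older server; asking for 2.0 works.
	fmt.Println(compatible(oldServer, version{Major: 2, Minor: 1}))
	fmt.Println(compatible(oldServer, version{Major: 2, Minor: 0}))
}
```

Under that assumed rule, a client that requests the lowest minor version it actually needs stays usable against servers that have not been upgraded yet.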
5b59f2880f fix: fix workspacesdk to return error on API mismatch (#13683) 2024-06-27 15:02:43 +04:00
1f9bdc36bf fix: ignore yamux.ErrSessionShutdown on TestTailnetAPIConnector_Disconnects (#13532) 2024-06-11 11:16:49 +04:00
3de737fdc8 fix: start packet capture immediately on speedtest (#13128)
I initially made this change when hacking wgengine to also capture wireguard packets going into the magicsock, so that we could capture the initial wireguard handshake. 

I don't think we should ship that additional capture logic, but... it seems generally useful to capture packets from the get-go on speedtest, so that you can see disco and pings before the TCP speedtest session starts.
2024-05-02 19:44:32 +04:00
6b4eb03192 chore: give additional time in tests for tailnetAPIConnector graceful disconnect (#12980)
Failure seen here: https://github.com/coder/coder/actions/runs/8711258577/job/23894964182?pr=12979
2024-04-17 12:38:17 -05:00
e801e878ba feat: add agent acks to in-memory coordinator (#12786)
When an agent receives a node, it responds with an ACK which is relayed
to the client. After the client receives the ACK, it's allowed to begin
pinging.
2024-04-10 17:15:33 -05:00
4d5a7b2d56 chore(codersdk): move all tailscale imports out of codersdk (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00