Fixes a test flake on TestTailnet_ForcesWebsockets like:
```
t.go:106: 2024-11-18 07:44:25.939 [debu] w2: dial tcp addr_port="[fd7a:115c:a1e0:46cc:bd8e:400d:1bc6:f6ac]:35565"
t.go:106: 2024-11-18 07:44:25.943 [debu] w1.net.netstack: netstack: could not connect to local server at 127.0.0.1:35565 (or [::1]:35565)%!(EXTRA *net.OpError=dial tcp [::1]:35565: connect: connection refused)
conn_test.go:146:
Error Trace: /Users/spike/repos/coder/tailnet/conn_test.go:146
Error: Received unexpected error:
connect tcp [fd7a:115c:a1e0:46cc:bd8e:400d:1bc6:f6ac]:35565: connection was refused
Test: TestTailnet/ForcesWebSockets
t.go:106: 2024-11-18 07:44:25.945 [info] w1: closing tailnet Conn
t.go:106: 2024-11-18 07:44:25.945 [debu] w1: closing configMaps configLoop
t.go:106: 2024-11-18 07:44:25.945 [debu] w1: closing nodeUpdater updateLoop
t.go:106: 2024-11-18 07:44:25.945 [debu] w1: closed netstack
conn_test.go:135:
Error Trace: /Users/spike/repos/coder/tailnet/conn_test.go:135
/Users/spike/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.8.darwin-arm64/src/runtime/asm_arm64.s:1222
Error: Received unexpected error:
connection closed:
github.com/coder/coder/v2/tailnet.init
<autogenerated>:1
Test: TestTailnet/ForcesWebSockets
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1039771dc]
goroutine 2224 [running]:
github.com/coder/coder/v2/tailnet_test.TestTailnet.func3.2()
/Users/spike/repos/coder/tailnet/conn_test.go:136 +0x7c
created by github.com/coder/coder/v2/tailnet_test.TestTailnet.func3 in goroutine 109
/Users/spike/repos/coder/tailnet/conn_test.go:133 +0x7dc
```
Test didn't synchronize listening on the port before dialing it.
It also has a nil pointer deference when the test fails, which causes a bunch of unrelated output. Also fixed.
`coder vpn-daemon run` will instantiate a RPC connection with the
specified pipe handles and communicate with the (yet to be implemented)
parent process.
The tests don't ensure that the tunnel is actually usable yet as the
tunnel functionality isn't implemented, but it does make sure that the
tunnel tries to read from the RPC pipe.
Closes#14735
Related: https://github.com/coder/internal/issues/212
This PR modifies the logic responsible for creating a server in E2E
tests to check if the port is free. Alternatively, we could refactor the
framework to dynamically create server instances, but this solution
might be a cheaper quick win.
Note:
I'll leave it as is now, it might be worth asking somebody with a
frontend skillset to double-check this contribution.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.
Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
closes#14730
Adds support for WorkspaceUpdates to the WebsocketDialer. This allows us to dial the new endpoint added in #14847 and connect it up to a `tailnet.Controllers` to connect to all agents over the tailnet.
I refactored the fakeWorkspaceUpdatesProvider to a mock and moved it to `tailnettest` so it could be more easily reused. The Mock is a little more full-featured.
re: #14730
Adds support in `tailnet.Controller` for WorkspaceUpdates.
Also checks configured controllers against the clients returned by the dialer, so that if we connect with a dialer that doesn't support an RPC (for instance the in-memory dialer for ServerTailnet doesn't support WorkspaceUpdates), we throw an error if there is a controller expecting it.
Addresses https://github.com/coder/nexus/issues/35.
This PR:
- Adds a `workspace_modules` table to track modules used by the
Terraform provisioner in provisioner jobs.
- Adds a `module_path` column to the `workspace_resources` table,
allowing to identify which module a resource originates from.
- Starts pushing this new information into telemetry.
For the person reviewing this PR, do not fret about the 1,500 new lines
- ~1,000 of them are auto-generated.
Fixes https://github.com/coder/internal/issues/214#15475 missed that we also write to `rpty` after starting
`rpty.lifecycle()`.
This PR moves the function call right at the end. Hopefully this should
address the data races before we go resorting to mutexes.
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.
What happened is that we accidentally shipped two new API features without bumping the version. `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.
Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet. That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs. This isn't great, but hasn't caused us major issues because
1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported.
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.
Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.
To mitigate against this thing happening again, this PR also:
1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version. That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.
With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
re: #14730
Adds support for the workspace updates protocol controller to also program DNS names for each agent.
Right now, we only program names like `myagent.myworkspace.me.coder` and `myworkspace.coder.` (if there is exactly one agent in the workspace). We also want to support `myagent.myworkspace.username.coder.`, but for that we need to update WorkspaceUpdates RPC to also send the workspace owner's username, which will be in a separate PR.
Move claims from a `debug` column to an actual typed column to be used.
This does not functionally change anything, it just adds some Go typing to build
on.
The hardcoded image is an anti-pattern, leading to weird errors if the
`docker` group is absent. We should either provide a better error
in-product or just have a better image.
@matifali - also down to use a Devcontainers universal image instead or
make this a parameter. Let me know what you think the best "default
install" is
re: #14730
Adds a protocol controller for WorkspaceUpdates RPC that takes all the agents we learn about over the RPC, and programs them into the Coordination controller, so that we set up tunnels to all the agents.
Handling DNS is in a PR up the stack, as is actually wiring it up to anything.
- Adds documentation for how to correctly hold --parameter with list(string)
- Adds tests for the aforementioned documented correct finger positions for --parameter list(string)
Adds a `--things` flag to our `multi-select` example prompt command.
```
go run ./cmd/coder exp prompt-example multi-select --things=Code,Bike,Potato=mashed
"Code, Bike, Potato=mashed" are nice choices.
```
Relates to https://github.com/coder/coder/issues/15082
The old implementation of `GetWorkspacesEligibleForTransition` returns
many workspaces that are not actually eligible for transition. This new
implementation reduces this number significantly (at least on our
dogfood instance).
Related to #15309
As we already are doing for agent logs - this PR is enabling the logs
rotation for coderd logs.
Currently keeping the same logic than we had for agent - with 5MB as the
file size for rotation.
- Assert rbac in fake notifications enqueuer
- Move fake notifications enqueuer to separate notificationstest package
- Update dbauthz rbac policy to allow provisionerd and autostart to create and read notification messages
- Update tests as required
Fixes test flakes _a la_
```
t.go:108: 2024-11-05 09:52:37.996 [erro] workspacestats: failed to load template schedule bumping activity, defaulting to bumping by 60min request_id=f14215d2-73dc-47ba-aa81-422c62f257e4 workspace_id=545d73c7-3a62-4466-8c08-b6abb12867b7 template_id=49747428-3abb-40e4-a6b2-03653e9f2506 ...
error= fetch object:
github.com/coder/coder/v2/coderd/database/dbauthz.(*querier).GetTemplateByID.fetch[...].func1
/home/runner/work/coder/coder/coderd/database/dbauthz/dbauthz.go:497
- pq: canceling statement due to user request
*** slogtest: log detected at level ERROR; TEST FAILURE ***
```
seen here on main: https://github.com/coder/coder/actions/runs/11681605747/job/32527006174
Whilst the `networking-troubleshooting` docs page already mentions that
a direct connection can be established over a private network, even if
there are no STUN servers, it's worth this is the case at the end of the
ping output.
This also removes a print statement that was dirtying up the diagnostic
output, and corrects the name of the `--disable-direct-connections`
flag.