127 Commits

Author SHA1 Message Date
f01cab9894 feat: use tailnet v2 API for coordination (#11638)
This one is huge, and I'm sorry.

The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.

There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
2024-01-22 11:07:50 +04:00
b173195e0d Revert "fix: detect JetBrains running on local ipv6 (#11653)" (#11664)
This reverts commit 2d61d5332a.
2024-01-17 15:38:39 +04:00
2d61d5332a fix: detect JetBrains running on local ipv6 (#11653) 2024-01-16 15:53:41 -09:00
dd05a6b13a chore: mockgen archived, moved to new location (#11415)
* chore: mockgen archived, moved to new location
2024-01-04 18:35:56 -06:00
df3c310379 feat(cli): add coder open vscode (#11191)
Fixes #7667
2024-01-02 20:46:18 +02:00
520c3a8ff7 fix: use TSMP for pings and checking reachability (#11306)
We're seeing some flaky tests related to agent connectivity - https://github.com/coder/coder/actions/runs/7286675441/job/19856270998

I'm pretty sure what happened in this one is that the client opened a connection while the wgengine was in the process of reconfiguring the wireguard device, so the fact that the peer became "active" as a result of traffic being sent was not noticed.

The test calls `AwaitReachable()` but this only tests the disco layer, so it doesn't wait for wireguard to come up.

I think we should be using TSMP for pinging and reachability, since this operates at the IP layer, and therefore requires that wireguard comes up before being successful.

This should also help with the problems we have seen where a TCP connection starts before wireguard is up and the initial round trip has to wait for the 5 second wireguard handshake retry.

fixes: #11294
2024-01-02 15:53:52 +04:00
db9104c02e fix: avoid panic on nil connection (#11305)
Related to https://github.com/coder/coder/actions/runs/7286675441/job/19855871305

Fixes a panic if the listener returns an error, which can obfuscate the underlying problem and cause unrelated tests to be marked failed.
2023-12-21 14:26:11 +04:00
b7bdb17460 feat: add metrics to workspace agent scripts (#11132)
* push startup script metrics to agent
2023-12-13 11:45:43 -06:00
dbbf8acc26 fix: track JetBrains connections (#10968)
* feat: implement jetbrains agentssh tracking

Based on tcp forwarding instead of ssh connections

* Add JetBrains tracking to bottom bar
2023-12-07 12:15:54 -09:00
70cede8f7a test(agent): improve TestAgent_Dial tests (#11013)
Refs #11008
2023-12-04 13:11:30 +02:00
6c67add2d9 fix: detect and retry reverse port forward on used port (#10844)
Fixes #10799

The flake happens when we try to remote forward, but the port we've chosen is not free. In the flaked example, it's actually the SSH listener that occupies the port we try to remote forward, leading to confusing reads (cf. the linked issue).

This fix simplifies the tests considerably by using the Go ssh client, rather than shelling out to OpenSSH. This avoids using a pseudoterminal, avoids the need for starting any local OS listeners to communicate the forwarding (Go ssh just returns in-process listeners), and avoids an OS listener to wire OpenSSH up to the agentConn.

With the simplified logic, we can immediately tell if a remote forward on a random port fails, so we can do this in a loop until success or timeout.

I've also simplified and fixed up the other forwarding tests. Since we set up forwarding in-process with Go ssh, we can remove a lot of the `require.Eventually` logic.
2023-11-27 09:42:45 +04:00
1286904de8 test(agent): improve TestAgent_Session_TTY_MOTD_Update (#10385) 2023-10-23 17:32:28 +00:00
8f1b4fb061 test(agent): fix service banner trim test flake (#10384) 2023-10-23 18:06:59 +03:00
76c65b1e1b fix(agent): send metadata in batches (#10225)
Fixes #9782

---

I recommend reviewing with ignore whitespace.
2023-10-13 17:48:25 +03:00
4857d4bd55 feat(codersdk/agentsdk): use new agent metadata batch endpoint (#10224)
Part of #9782
2023-10-13 17:32:28 +03:00
7eeba15d16 feat(coderd): add support for sending batched agent metadata (#10223)
Part of #9782
2023-10-13 16:37:55 +03:00
a9077812e2 fix: use UTF-8 encoding with screen (#10190)
This will make characters like ❯ and ⇣ work, for example.
2023-10-11 13:25:04 -08:00
b039dc6989 fix: correct escaping in test regex (#10138)
Fixes regex escaping.  Spotted during a code read.
2023-10-10 08:42:39 +04:00
54fd350913 feat: improve logging for speedtest connections
part of #7963

improve connection logging for speedtest connections
2023-10-09 20:48:28 +04:00
c67db6efb0 fix: wait for bash prompt before commands (#9882)
Signed-off-by: Spike Curtis <spike@coder.com>
2023-09-27 12:26:24 +04:00
1262eef2c0 feat: add support for coder_script (#9584)
* Add basic migrations

* Improve schema

* Refactor agent scripts into its own package

* Support legacy start and stop script format

* Pipe the scripts!

* Finish the piping

* Fix context usage

* It works!

* Fix sql query

* Fix SQL query

* Rename `LogSourceID` -> `SourceID`

* Fix the FE

* fmt

* Rename migrations

* Fix log tests

* Fix lint err

* Fix gen

* Fix story type

* Rename source to script

* Fix schema jank

* Uncomment test

* Rename proto to TimeoutSeconds

* Fix comments

* Fix comments

* Fix legacy endpoint without specified log_source

* Fix non-blocking by default in agent

* Fix resources tests

* Fix dbfake

* Fix resources

* Fix linting I think

* Add fixtures

* fmt

* Fix startup script behavior

* Fix comments

* Fix context

* Fix cancel

* Fix SQL tests

* Fix e2e tests

* Interrupt on Windows

* Fix agent leaking script process

* Fix migrations

* Fix stories

* Fix duplicate logs appearing

* Gen

* Fix log location

* Fix tests

* Fix tests

* Fix log output

* Show display name in output

* Fix print

* Return timeout on start context

* Gen

* Fix fixture

* Fix the agent status

* Fix startup timeout msg

* Fix command using shared context

* Fix timeout draining

* Change signal type

* Add deterministic colors to startup script logs

---------

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2023-09-25 16:47:17 -05:00
70e481e7a5 fix: use terminal emulator that keeps state in ReconnectingPTY tests (#9765)
* Add more pty diagnostics for terminal parsing

Signed-off-by: Spike Curtis <spike@coder.com>

* print escaped strings

Signed-off-by: Spike Curtis <spike@coder.com>

* Only log on failure - heisenbug?

Signed-off-by: Spike Curtis <spike@coder.com>

* use the terminal across matches to keep cursor & contents state

Signed-off-by: Spike Curtis <spike@coder.com>

* Only log bytes if we're not expecting EOF

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-09-19 17:57:30 +00:00
7311ffbd9d feat: implement agent process management (#9461)
- An opt-in feature has been added to the agent to allow
   deprioritizing non coder-related processes for CPU by setting their
   niceness level to 10.
- Opting in to the feature requires setting CODER_PROC_PRIO_MGMT to a non-empty value.
2023-09-14 19:45:05 -05:00
22e781eced chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
02ee724d9f fix: do terminal emulation in reconnecting pty tests (#9114)
It looks like it is possible for screen to use control sequences instead
of literal newlines which fails the tests.

This reuses the existing readUntil function used in other pty tests.
2023-08-16 13:02:03 -08:00
b993cab49a fix: use screen for reconnecting terminal sessions on Linux if available (#8640)
* Add screen backend for reconnecting ptys

The screen portion is a port from wsep.  There is an interface that lets
you choose between screen and the previous method.  By default it will
choose screen if it is installed but this can be overridden (mostly for
tests).

The tests use a scanner instead of a reader now because the reader will
loop infinitely at the end of a stream.

Replace /bin/bash with bash since bash is not always in /bin.

* Remove connection_id from reconnecting PTY logger

This serves multiple connections so it makes no sense to scope it to a
single connection.

Also lets us use "connection_id" when logging write errors instead of
"other_conn_id".

* Use PATH to test buffered reconnecting pty
2023-08-14 11:19:13 -08:00
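The backend selection described in b993cab49a (prefer screen when it is installed, allow an override) could look roughly like the sketch below. The backend names and the `chooseBackend` function are hypothetical; the commit message does not give the real interface.

```go
package main

import "os/exec"

const (
	backendScreen   = "screen"   // hypothetical name for the screen backend
	backendBuffered = "buffered" // hypothetical name for the previous method
)

// chooseBackend returns the explicit override if one is given. Otherwise it
// picks screen when the binary is found on PATH and falls back to the
// previous method when it is not.
func chooseBackend(override string) string {
	if override != "" {
		return override
	}
	if _, err := exec.LookPath("screen"); err == nil {
		return backendScreen
	}
	return backendBuffered
}
```

Tests can pass an explicit override to pin the backend regardless of what is installed on the machine running them.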
3c52b01850 chore: add tailscale magicsock debug logging controls (#8982) 2023-08-08 17:56:08 +00:00
c575292ba6 fix: fix tailnet netcheck issues (#8802) 2023-08-02 01:50:43 +10:00
2f0a9996e7 chore: add derpserver to wsproxy, add proxies to derpmap (#7311) 2023-07-27 02:21:04 +10:00
c8d65de4b7 test(agent): fix TestAgent_Metadata/Once flake (#8613) 2023-07-20 18:49:44 +00:00
5fd77ad7cf test(agent): fix service banner and metadata intervals (#8516) 2023-07-14 16:10:26 +03:00
c47b78c44b chore: replace wsconncache with a single tailnet (#8176) 2023-07-12 17:37:31 -05:00
3f058f28e7 test(agent): use afero for motd tests to allow parallel execution (#8329) 2023-07-06 10:57:51 +03:00
6015319e9d feat: show service banner in SSH/TTY sessions (#8186)
* Allow workspace agents to get appearance
* Poll for service banner every two minutes
* Show service banner before MOTD if not quiet
2023-06-30 10:41:29 -08:00
6d176aee5d test(agent): fix lifecycle test flakiness (#8230) 2023-06-27 12:44:16 +00:00
8dac0356ed refactor: replace startup script logs EOF with starting/ready time (#8082)
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.

This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need to understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
2023-06-20 14:41:55 +03:00
0c5077464b fix: avoid missed logs when streaming startup logs (#8029)
* feat(coderd,agent): send startup log eof at the end

* fix(coderd): fix edge case in startup log pubsub

* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)

* fix(codersdk): fix startup log channel shared memory bug

* fix(site): remove the EOF log line
2023-06-16 17:14:22 +03:00
14efdadd3c feat: Collect agent SSH metrics (#7584) 2023-05-25 12:52:36 +02:00
70d2203b9e chore: reduce the log output of skipped tests (#7520)
With the introduction of the workspace proxy tests there was a lot
of output if a test was eventually skipped.
2023-05-14 19:37:00 -05:00
9c030a8888 fix: pty.Start respects context on Windows too (#7373)
* fix: pty.Start respects context on Windows too

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows imports; rename ToExec -> AsExec

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix import in windows test

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-05-03 11:43:05 +04:00
465fe8658d chore: skip timing-sensitive AgentMetadata test in the standard suite (#7237)
* chore: skip timing-sensitive AgentMetadata test in the standard suite

* Add test-timing target

* fix windows?

* Works on my Windows desktop?

* Use tag system

* fixup! Use tag system
2023-05-02 10:41:41 +00:00
b6666cf1cf chore: tailnet debug logging (#7260)
* Enable discovery (disco) debug

Signed-off-by: Spike Curtis <spike@coder.com>

* Better debug on reconnectingPTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Agent logging in appstest

Signed-off-by: Spike Curtis <spike@coder.com>

* More reconnectingPTY logging

Signed-off-by: Spike Curtis <spike@coder.com>

* Add logging to coordinator

Signed-off-by: Spike Curtis <spike@coder.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Clarify logs; remove unrelated changes

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-27 13:59:01 +04:00
daee91c6dc refactor: PTY & SSH (#7100)
* Add ssh tests for longoutput, orphan

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY/SSH tests & improvements

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix some tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix linting

Signed-off-by: Spike Curtis <spike@coder.com>

* fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows test

Signed-off-by: Spike Curtis <spike@coder.com>

* Windows copy test

Signed-off-by: Spike Curtis <spike@coder.com>

* WIP Windows pty handling

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix truncation tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Appease linter/fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix typo

Signed-off-by: Spike Curtis <spike@coder.com>

* Rework truncation test to not assume OS buffers

Signed-off-by: Spike Curtis <spike@coder.com>

* Disable orphan test on Windows --- uses sh

Signed-off-by: Spike Curtis <spike@coder.com>

* agent_test running SSH in pty use ptytest.Start

Signed-off-by: Spike Curtis <spike@coder.com>

* More detail about closing pseudoconsole on windows

Signed-off-by: Spike Curtis <spike@coder.com>

* Code review fixes

Signed-off-by: Spike Curtis <spike@coder.com>

* Rearrange ptytest method order

Signed-off-by: Spike Curtis <spike@coder.com>

* Protect pty.Resize on windows from races

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows bugs

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY doesn't extend PTYCmd

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows types

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-04-24 14:53:57 +04:00
712098fa2b test(agent): Increase the time to wait for agent reachable (#7245) 2023-04-21 19:40:17 +00:00
300ae4a6bf test(agent): Fix TestAgent_UnixRemoteForwarding timeout (#7235) 2023-04-21 01:35:51 +03:00
bb43713d38 fix: VSCode desktop connection (#7120)
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-14 17:32:18 +03:00
0224426e5b refactor(agent): Move SSH server into agentssh package (#7004)
Refs: #6177
2023-04-06 19:39:22 +03:00
121c2bcde8 test(agent): Fix tests without cmd.Wait() (#7029) 2023-04-06 16:45:53 +03:00
ca4fa81570 feat: add agent metadata (#6614) 2023-03-31 15:26:19 -05:00
04e404e448 chore: dial the remote socket continually until connect (#6891)
It's possible that the command starts but the socket isn't ready
even when the file exists.
2023-03-30 15:36:23 +00:00