Commit Graph

98 Commits

Author SHA1 Message Date
c8d65de4b7 test(agent): fix TestAgent_Metadata/Once flake (#8613) 2023-07-20 18:49:44 +00:00
5fd77ad7cf test(agent): fix service banner and metadata intervals (#8516) 2023-07-14 16:10:26 +03:00
c47b78c44b chore: replace wsconncache with a single tailnet (#8176) 2023-07-12 17:37:31 -05:00
3f058f28e7 test(agent): use afero for motd tests to allow parallel execution (#8329) 2023-07-06 10:57:51 +03:00
6015319e9d feat: show service banner in SSH/TTY sessions (#8186)
* Allow workspace agents to get appearance
* Poll for service banner every two minutes
* Show service banner before MOTD if not quiet
2023-06-30 10:41:29 -08:00
6d176aee5d test(agent): fix lifecycle test flakeyness (#8230) 2023-06-27 12:44:16 +00:00
8dac0356ed refactor: replace startup script logs EOF with starting/ready time (#8082)
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.

This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
2023-06-20 14:41:55 +03:00
0c5077464b fix: avoid missed logs when streaming startup logs (#8029)
* feat(coderd,agent): send startup log eof at the end

* fix(coderd): fix edge case in startup log pubsub

* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)

* fix(codersdk): fix startup log channel shared memory bug

* fix(site): remove the EOF log line
2023-06-16 17:14:22 +03:00
14efdadd3c feat: Collect agent SSH metrics (#7584) 2023-05-25 12:52:36 +02:00
70d2203b9e chore: reduce the log output of skipped tests (#7520)
With the introduction of the workspace proxy tests there was a lot
of output if a test was eventually skipped.
2023-05-14 19:37:00 -05:00
9c030a8888 fix: pty.Start respects context on Windows too (#7373)
* fix: pty.Start respects context on Windows too

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows imports; rename ToExec -> AsExec

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix import in windows test

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-05-03 11:43:05 +04:00
465fe8658d chore: skip timing-sensistive AgentMetadata test in the standard suite (#7237)
* chore: skip timing-sensistive AgentMetadata test in the standard suite

* Add test-timing target

* fix windows?

* Works on my Windows desktop?

* Use tag system

* fixup! Use tag system
2023-05-02 10:41:41 +00:00
b6666cf1cf chore: tailnet debug logging (#7260)
* Enable discovery (disco) debug

Signed-off-by: Spike Curtis <spike@coder.com>

* Better debug on reconnectingPTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Agent logging in appstest

Signed-off-by: Spike Curtis <spike@coder.com>

* More reconnectingPTY logging

Signed-off-by: Spike Curtis <spike@coder.com>

* Add logging to coordinator

Signed-off-by: Spike Curtis <spike@coder.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Clarify logs; remove unrelated changes

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-27 13:59:01 +04:00
daee91c6dc refactor: PTY & SSH (#7100)
* Add ssh tests for longoutput, orphan

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY/SSH tests & improvements

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix some tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix linting

Signed-off-by: Spike Curtis <spike@coder.com>

* fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows test

Signed-off-by: Spike Curtis <spike@coder.com>

* Windows copy test

Signed-off-by: Spike Curtis <spike@coder.com>

* WIP Windows pty handling

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix truncation tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Appease linter/fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix typo

Signed-off-by: Spike Curtis <spike@coder.com>

* Rework truncation test to not assume OS buffers

Signed-off-by: Spike Curtis <spike@coder.com>

* Disable orphan test on Windows --- uses sh

Signed-off-by: Spike Curtis <spike@coder.com>

* agent_test running SSH in pty use ptytest.Start

Signed-off-by: Spike Curtis <spike@coder.com>

* More detail about closing pseudoconsole on windows

Signed-off-by: Spike Curtis <spike@coder.com>

* Code review fixes

Signed-off-by: Spike Curtis <spike@coder.com>

* Rearrange ptytest method order

Signed-off-by: Spike Curtis <spike@coder.com>

* Protect pty.Resize on windows from races

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows bugs

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY doesn't extend PTYCmd

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows types

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-04-24 14:53:57 +04:00
712098fa2b test(agent): Increase the time to wait for agent reachable (#7245) 2023-04-21 19:40:17 +00:00
300ae4a6bf test(agent): Fix TestAgent_UnixRemoteForwarding timeout (#7235) 2023-04-21 01:35:51 +03:00
bb43713d38 fix: VSCode desktop connection (#7120)
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-14 17:32:18 +03:00
0224426e5b refactor(agent): Move SSH server into agentssh package (#7004)
Refs: #6177
2023-04-06 19:39:22 +03:00
121c2bcde8 test(agent): Fix tests without cmd.Wait() (#7029) 2023-04-06 16:45:53 +03:00
ca4fa81570 feat: add agent metadata (#6614) 2023-03-31 15:26:19 -05:00
04e404e448 chore: dial the remote socket continually until connect (#6891)
It's possible that the command starts but the socket isn't ready
even when the file exists.
2023-03-30 15:36:23 +00:00
90d18dd2e5 fix(agent): Close stdin and stdout separately to fix pty output loss (#6862)
Fixes #6656
Closes #6840
2023-03-29 21:58:38 +03:00
891bbda995 fix(agent): More protection for lost output of SSH PTY commands (#6833)
Fixes #6656 (part 2)
2023-03-28 09:11:15 +00:00
76bdde7f1b fix(agent): Prevent SSH TTYs from losing command output on exit (#6777) 2023-03-24 18:23:41 +00:00
cb7375450b feat: add startup script logs to the ui (#6558)
* Add startup script logs to the database

* Add coderd endpoints for startup script logs

* Push startup script logs from agent

* Pull startup script logs on frontend

* Rename queries

* Add constraint

* Start creating log sending loop

* Add log sending to the agent

* Add tests for streaming logs

* Shorten notify channel name

* Add FE

* Improve bulk log performance

* Finish UI display

* Fix startup log visibility

* Add warning for overflow

* Fix agent queue logs overflow

* Display staartup logs in a virtual DOM for performance

* Fix agent queue with loads of logs

* Fix authorize test

* Remove faulty test

* Fix startup and shutdown reporting error

* Fix gen

* Fix comments

* Periodically purge old database entries

* Add test fixture for migration

* Add Storybook

* Check if there are logs when displaying features

* Fix startup component overflow gap

* Fix startup log wrapping

---------

Co-authored-by: Asher <ash@coder.com>
2023-03-23 14:09:13 -05:00
5304b4e483 feat: add connection statistics for workspace agents (#6469)
* fix: don't make session counts cumulative

This made for some weird tracking... we want the point-in-time
number of counts!

* Add databasefake query for getting agent stats

* Add deployment stats endpoint

* The query... works?!?

* Fix aggregation query

* Select from multiple tables instead

* Fix continuous stats

* Increase period of stat refreshes

* Add workspace counts to deployment stats

* fmt

* Add a slight bit of responsiveness

* Fix template version editor overflow

* Add refresh button

* Fix font family on button

* Fix latest stat being reported

* Revert agent conn stats

* Fix linting error

* Fix tests

* Fix gen

* Fix migrations

* Block on sending stat updates

* Add test fixtures

* Fix response structure

* make gen
2023-03-08 21:05:45 -06:00
22e3ff96be feat(agent): Add shutdown lifecycle states and shutdown_script support (#6139)
* feat(api): Add agent shutdown lifecycle states

* feat(agent): Add shutdown_script support

* feat(agent): Add shutdown_script timeout

* feat(site): Support new agent lifecycle states

---

Co-authored-by: Marcin Tojek <marcin@coder.com>
2023-03-06 21:34:00 +02:00
2ff1c6d613 feat: add agent stats for different connection types (#6412)
This allows us to track when our extensions are used, when the
web terminal is used, and average connection latency to the agent.
2023-03-02 08:06:00 -06:00
05e449943d chore: convert agent stats to use a table (#6374)
* chore: convert workspace agent stats from json to table

* chore: convert agent stats to use a table

Backwards compatibility becomes hard when all agent stats are in a JSON blob.
We also want to query this table for new agents that are failing health checks
so we can display it in the UI.

* Fix migration using default values
2023-02-28 13:33:33 -06:00
677721e4a1 fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers (#6320)
* fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers
2023-02-24 18:16:29 +02:00
4432cd08d6 chore: update tailscale (#6091) 2023-02-09 21:43:18 -06:00
691495d761 feat: add expanded_directory to the agent for extension support (#6087)
This will enable opening the default `dir` of an agent in
the VS Code extension!
2023-02-07 21:35:09 +00:00
2c2bbcc019 chore: update tests to support fish (#6023)
* fix: update tests to add fish support

* Track connections for SSH sessions to prevent leaks

* Revert SSH conn handling
2023-02-03 12:25:11 -06:00
7ad87505c8 chore: move agent functions from codersdk into agentsdk (#5903)
* chore: rename `AgentConn` to `WorkspaceAgentConn`

The codersdk was becoming bloated with consts for the workspace
agent that made no sense to a reader. `Tailnet*` is an example
of these consts.

* chore: remove `Get` prefix from *Client functions

* chore: remove `BypassRatelimits` option in `codersdk.Client`

It feels wrong to have this as a direct option because it's so infrequently
needed by API callers. It's better to directly modify headers in the two
places that we actually use it.

* Merge `appearance.go` and `buildinfo.go` into `deployment.go`

* Merge `experiments.go` and `features.go` into `deployment.go`

* Fix `make gen` referencing old type names

* Merge `error.go` into `client.go`

`codersdk.Response` lived in `error.go`, which is wrong.

* chore: refactor workspace agent functions into agentsdk

It was odd conflating the codersdk that clients should use
with functions that only the agent should use. This separates
them into two SDKs that are closely coupled, but separate.

* Merge `insights.go` into `deployment.go`

* Merge `organizationmember.go` into `organizations.go`

* Merge `quota.go` into `workspaces.go`

* Rename `sse.go` to `serversentevents.go`

* Rename `codersdk.WorkspaceAppHostResponse` to `codersdk.AppHostResponse`

* Format `.vscode/settings.json`

* Fix outdated naming in `api.ts`

* Fix app host response

* Fix unsupported type

* Fix imported type
2023-01-29 15:47:24 -06:00
1cd5f38cb0 feat: add debug server for tailnet coordinators (#5861)
Implements a Tailscale-like debug server for our in-memory coordinator. This should provide some visibility into why connections could be failing.
Resolves: https://github.com/coder/coder/issues/5845

![image](https://user-images.githubusercontent.com/6332295/214680832-2724d633-2d54-44d6-a7ce-5841e5824ee5.png)
2023-01-25 21:27:36 +00:00
138887de7e feat: Add workspace agent lifecycle state reporting (#5785) 2023-01-24 14:24:27 +02:00
73afdd7c09 chore: agent_test.go: use ptty.Peek() instead of expecting caret in TestAgent_SessionTTYShell (#5821) 2023-01-23 11:23:25 +00:00
f1fe2b5c06 feat: add GPG forwarding to coder ssh (#5482) 2023-01-06 07:52:19 +00:00
760419a965 chore: Refactor agent tests to avoid t.Run when not needed (#5376)
It turns out that writing tests that contain subtests should probably be
limited to table-based tests and tests that share a common setup shared
between tests.

Writing tests with a subtest like this:

```
func TestSomething(t *testing.T) {
	t.Run("Subtest", func(t *testing.t) {})
}
```

Has the following disadvantages:

- It can lead to multiple tests failing with `(unknown)` status when
  only one of the subtests hang (never exit)
- In Go 1.20rc1, using `t.Setenv` is no longer allowed if the parent
  test is parallel
2022-12-12 22:20:46 +02:00
88bb901283 fix: Close tailnet if agent is closed during creation (#5375) 2022-12-12 11:26:49 +00:00
05130db571 fix: Improve closing of services in agent tests (#5355) 2022-12-09 12:22:27 +02:00
eff99f78fa feat: Add support for MOTD file in coder agents (#5147) 2022-11-24 12:22:20 +00:00
ae38bbeab6 chore: refactor agent stats streaming (#5112) 2022-11-18 16:46:53 -06:00
69e8c9e7b4 feat: add reconnectingpty loadtest (#5083) 2022-11-17 16:57:15 +00:00
73f91e4690 ci: use big runners (#4990)
* chore: Close idle connections on test cleanup

It's possible that this was the source of a leak on Windows...

* ci: use big runners

* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Add logging to Startupscript test

* Add better logging

* Write startup script logs to fs dir

* Fix startup script test

* Fix startup script test

* Reduce test timeout

* Use central tmp dir in agent

* Adjust output

* Skip startup script test on Windows

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2022-11-13 14:23:23 -06:00
82f494c99c fix: Improve tailnet connections by reducing timeouts (#5043)
* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Update Tailscale
2022-11-13 11:33:05 -06:00
16e9b1eb1a fix: Add timeouts to every tailnet ping (#4986)
A ping isn't guaranteed to deliver, so these need to have a
tight timeout for tests to not flake.
2022-11-09 20:12:51 +00:00
d82364b9b5 feat: make trace provider in loadtest, add tracing to sdk (#4939) 2022-11-09 08:10:48 +10:00
8e743d28c8 fix: Use instance identity session token for git subcommands (#4884)
This broke using gitssh with instance identity!
2022-11-04 09:44:36 -07:00
104d6608d9 feat: Add VSCODE_PROXY_URI to surface code-server ports (#4798)
* feat: Add `VSCODE_PROXY_URI` to surface code-server ports

Fixes #4776.

* Check if app host is provided
2022-11-04 04:45:43 +00:00