Commit Graph

90 Commits

Author SHA1 Message Date
14efdadd3c feat: Collect agent SSH metrics (#7584) 2023-05-25 12:52:36 +02:00
70d2203b9e chore: reduce the log output of skipped tests (#7520)
With the introduction of the workspace proxy tests there was a lot
of output if a test was eventually skipped.
2023-05-14 19:37:00 -05:00
9c030a8888 fix: pty.Start respects context on Windows too (#7373)
* fix: pty.Start respects context on Windows too

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows imports; rename ToExec -> AsExec

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix import in windows test

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-05-03 11:43:05 +04:00
465fe8658d chore: skip timing-sensistive AgentMetadata test in the standard suite (#7237)
* chore: skip timing-sensistive AgentMetadata test in the standard suite

* Add test-timing target

* fix windows?

* Works on my Windows desktop?

* Use tag system

* fixup! Use tag system
2023-05-02 10:41:41 +00:00
b6666cf1cf chore: tailnet debug logging (#7260)
* Enable discovery (disco) debug

Signed-off-by: Spike Curtis <spike@coder.com>

* Better debug on reconnectingPTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Agent logging in appstest

Signed-off-by: Spike Curtis <spike@coder.com>

* More reconnectingPTY logging

Signed-off-by: Spike Curtis <spike@coder.com>

* Add logging to coordinator

Signed-off-by: Spike Curtis <spike@coder.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update agent/agent.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Clarify logs; remove unrelated changes

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-27 13:59:01 +04:00
daee91c6dc refactor: PTY & SSH (#7100)
* Add ssh tests for longoutput, orphan

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY/SSH tests & improvements

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix some tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix linting

Signed-off-by: Spike Curtis <spike@coder.com>

* fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows test

Signed-off-by: Spike Curtis <spike@coder.com>

* Windows copy test

Signed-off-by: Spike Curtis <spike@coder.com>

* WIP Windows pty handling

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix truncation tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Appease linter/fmt

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix typo

Signed-off-by: Spike Curtis <spike@coder.com>

* Rework truncation test to not assume OS buffers

Signed-off-by: Spike Curtis <spike@coder.com>

* Disable orphan test on Windows --- uses sh

Signed-off-by: Spike Curtis <spike@coder.com>

* agent_test running SSH in pty use ptytest.Start

Signed-off-by: Spike Curtis <spike@coder.com>

* More detail about closing pseudoconsole on windows

Signed-off-by: Spike Curtis <spike@coder.com>

* Code review fixes

Signed-off-by: Spike Curtis <spike@coder.com>

* Rearrange ptytest method order

Signed-off-by: Spike Curtis <spike@coder.com>

* Protect pty.Resize on windows from races

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows bugs

Signed-off-by: Spike Curtis <spike@coder.com>

* PTY doesn't extend PTYCmd

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix windows types

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-04-24 14:53:57 +04:00
712098fa2b test(agent): Increase the time to wait for agent reachable (#7245) 2023-04-21 19:40:17 +00:00
300ae4a6bf test(agent): Fix TestAgent_UnixRemoteForwarding timeout (#7235) 2023-04-21 01:35:51 +03:00
bb43713d38 fix: VSCode desktop connection (#7120)
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-04-14 17:32:18 +03:00
0224426e5b refactor(agent): Move SSH server into agentssh package (#7004)
Refs: #6177
2023-04-06 19:39:22 +03:00
121c2bcde8 test(agent): Fix tests without cmd.Wait() (#7029) 2023-04-06 16:45:53 +03:00
ca4fa81570 feat: add agent metadata (#6614) 2023-03-31 15:26:19 -05:00
04e404e448 chore: dial the remote socket continually until connect (#6891)
It's possible that the command starts but the socket isn't ready
even when the file exists.
2023-03-30 15:36:23 +00:00
90d18dd2e5 fix(agent): Close stdin and stdout separately to fix pty output loss (#6862)
Fixes #6656
Closes #6840
2023-03-29 21:58:38 +03:00
891bbda995 fix(agent): More protection for lost output of SSH PTY commands (#6833)
Fixes #6656 (part 2)
2023-03-28 09:11:15 +00:00
76bdde7f1b fix(agent): Prevent SSH TTYs from losing command output on exit (#6777) 2023-03-24 18:23:41 +00:00
cb7375450b feat: add startup script logs to the ui (#6558)
* Add startup script logs to the database

* Add coderd endpoints for startup script logs

* Push startup script logs from agent

* Pull startup script logs on frontend

* Rename queries

* Add constraint

* Start creating log sending loop

* Add log sending to the agent

* Add tests for streaming logs

* Shorten notify channel name

* Add FE

* Improve bulk log performance

* Finish UI display

* Fix startup log visibility

* Add warning for overflow

* Fix agent queue logs overflow

* Display staartup logs in a virtual DOM for performance

* Fix agent queue with loads of logs

* Fix authorize test

* Remove faulty test

* Fix startup and shutdown reporting error

* Fix gen

* Fix comments

* Periodically purge old database entries

* Add test fixture for migration

* Add Storybook

* Check if there are logs when displaying features

* Fix startup component overflow gap

* Fix startup log wrapping

---------

Co-authored-by: Asher <ash@coder.com>
2023-03-23 14:09:13 -05:00
5304b4e483 feat: add connection statistics for workspace agents (#6469)
* fix: don't make session counts cumulative

This made for some weird tracking... we want the point-in-time
number of counts!

* Add databasefake query for getting agent stats

* Add deployment stats endpoint

* The query... works?!?

* Fix aggregation query

* Select from multiple tables instead

* Fix continuous stats

* Increase period of stat refreshes

* Add workspace counts to deployment stats

* fmt

* Add a slight bit of responsiveness

* Fix template version editor overflow

* Add refresh button

* Fix font family on button

* Fix latest stat being reported

* Revert agent conn stats

* Fix linting error

* Fix tests

* Fix gen

* Fix migrations

* Block on sending stat updates

* Add test fixtures

* Fix response structure

* make gen
2023-03-08 21:05:45 -06:00
22e3ff96be feat(agent): Add shutdown lifecycle states and shutdown_script support (#6139)
* feat(api): Add agent shutdown lifecycle states

* feat(agent): Add shutdown_script support

* feat(agent): Add shutdown_script timeout

* feat(site): Support new agent lifecycle states

---

Co-authored-by: Marcin Tojek <marcin@coder.com>
2023-03-06 21:34:00 +02:00
2ff1c6d613 feat: add agent stats for different connection types (#6412)
This allows us to track when our extensions are used, when the
web terminal is used, and average connection latency to the agent.
2023-03-02 08:06:00 -06:00
05e449943d chore: convert agent stats to use a table (#6374)
* chore: convert workspace agent stats from json to table

* chore: convert agent stats to use a table

Backwards compatibility becomes hard when all agent stats are in a JSON blob.
We also want to query this table for new agents that are failing health checks
so we can display it in the UI.

* Fix migration using default values
2023-02-28 13:33:33 -06:00
677721e4a1 fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers (#6320)
* fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers
2023-02-24 18:16:29 +02:00
4432cd08d6 chore: update tailscale (#6091) 2023-02-09 21:43:18 -06:00
691495d761 feat: add expanded_directory to the agent for extension support (#6087)
This will enable opening the default `dir` of an agent in
the VS Code extension!
2023-02-07 21:35:09 +00:00
2c2bbcc019 chore: update tests to support fish (#6023)
* fix: update tests to add fish support

* Track connections for SSH sessions to prevent leaks

* Revert SSH conn handling
2023-02-03 12:25:11 -06:00
7ad87505c8 chore: move agent functions from codersdk into agentsdk (#5903)
* chore: rename `AgentConn` to `WorkspaceAgentConn`

The codersdk was becoming bloated with consts for the workspace
agent that made no sense to a reader. `Tailnet*` is an example
of these consts.

* chore: remove `Get` prefix from *Client functions

* chore: remove `BypassRatelimits` option in `codersdk.Client`

It feels wrong to have this as a direct option because it's so infrequently
needed by API callers. It's better to directly modify headers in the two
places that we actually use it.

* Merge `appearance.go` and `buildinfo.go` into `deployment.go`

* Merge `experiments.go` and `features.go` into `deployment.go`

* Fix `make gen` referencing old type names

* Merge `error.go` into `client.go`

`codersdk.Response` lived in `error.go`, which is wrong.

* chore: refactor workspace agent functions into agentsdk

It was odd conflating the codersdk that clients should use
with functions that only the agent should use. This separates
them into two SDKs that are closely coupled, but separate.

* Merge `insights.go` into `deployment.go`

* Merge `organizationmember.go` into `organizations.go`

* Merge `quota.go` into `workspaces.go`

* Rename `sse.go` to `serversentevents.go`

* Rename `codersdk.WorkspaceAppHostResponse` to `codersdk.AppHostResponse`

* Format `.vscode/settings.json`

* Fix outdated naming in `api.ts`

* Fix app host response

* Fix unsupported type

* Fix imported type
2023-01-29 15:47:24 -06:00
1cd5f38cb0 feat: add debug server for tailnet coordinators (#5861)
Implements a Tailscale-like debug server for our in-memory coordinator. This should provide some visibility into why connections could be failing.
Resolves: https://github.com/coder/coder/issues/5845

![image](https://user-images.githubusercontent.com/6332295/214680832-2724d633-2d54-44d6-a7ce-5841e5824ee5.png)
2023-01-25 21:27:36 +00:00
138887de7e feat: Add workspace agent lifecycle state reporting (#5785) 2023-01-24 14:24:27 +02:00
73afdd7c09 chore: agent_test.go: use ptty.Peek() instead of expecting caret in TestAgent_SessionTTYShell (#5821) 2023-01-23 11:23:25 +00:00
f1fe2b5c06 feat: add GPG forwarding to coder ssh (#5482) 2023-01-06 07:52:19 +00:00
760419a965 chore: Refactor agent tests to avoid t.Run when not needed (#5376)
It turns out that writing tests that contain subtests should probably be
limited to table-based tests and tests that share a common setup shared
between tests.

Writing tests with a subtest like this:

```
func TestSomething(t *testing.T) {
	t.Run("Subtest", func(t *testing.t) {})
}
```

Has the following disadvantages:

- It can lead to multiple tests failing with `(unknown)` status when
  only one of the subtests hang (never exit)
- In Go 1.20rc1, using `t.Setenv` is no longer allowed if the parent
  test is parallel
2022-12-12 22:20:46 +02:00
88bb901283 fix: Close tailnet if agent is closed during creation (#5375) 2022-12-12 11:26:49 +00:00
05130db571 fix: Improve closing of services in agent tests (#5355) 2022-12-09 12:22:27 +02:00
eff99f78fa feat: Add support for MOTD file in coder agents (#5147) 2022-11-24 12:22:20 +00:00
ae38bbeab6 chore: refactor agent stats streaming (#5112) 2022-11-18 16:46:53 -06:00
69e8c9e7b4 feat: add reconnectingpty loadtest (#5083) 2022-11-17 16:57:15 +00:00
73f91e4690 ci: use big runners (#4990)
* chore: Close idle connections on test cleanup

It's possible that this was the source of a leak on Windows...

* ci: use big runners

* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Add logging to Startupscript test

* Add better logging

* Write startup script logs to fs dir

* Fix startup script test

* Fix startup script test

* Reduce test timeout

* Use central tmp dir in agent

* Adjust output

* Skip startup script test on Windows

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2022-11-13 14:23:23 -06:00
82f494c99c fix: Improve tailnet connections by reducing timeouts (#5043)
* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Update Tailscale
2022-11-13 11:33:05 -06:00
16e9b1eb1a fix: Add timeouts to every tailnet ping (#4986)
A ping isn't guaranteed to deliver, so these need to have a
tight timeout for tests to not flake.
2022-11-09 20:12:51 +00:00
d82364b9b5 feat: make trace provider in loadtest, add tracing to sdk (#4939) 2022-11-09 08:10:48 +10:00
8e743d28c8 fix: Use instance identity session token for git subcommands (#4884)
This broke using gitssh with instance identity!
2022-11-04 09:44:36 -07:00
104d6608d9 feat: Add VSCODE_PROXY_URI to surface code-server ports (#4798)
* feat: Add `VSCODE_PROXY_URI` to surface code-server ports

Fixes #4776.

* Check if app host is provided
2022-11-04 04:45:43 +00:00
a0bdb4fca2 fix: Remove pkg/sftp fork, fix SFTP test (#4759) 2022-10-26 16:02:06 +03:00
eec406b739 feat: Add Git auth for GitHub, GitLab, Azure DevOps, and BitBucket (#4670)
* Add scaffolding

* Move migration

* Add endpoints for gitauth

* Add configuration files and tests!

* Update typesgen

* Convert configuration format for git auth

* Fix unclosed database conn

* Add overriding VS Code configuration

* Fix Git screen

* Write VS Code special configuration if providers exist

* Enable automatic cloning from VS Code

* Add tests for gitaskpass

* Fix feature visibiliy

* Add banner for too many configurations

* Fix update loop for oauth token

* Jon comments

* Add deployment config page
2022-10-24 19:46:24 -05:00
bf3224e373 fix: Refactor agent to consume API client (#4715)
* fix: Refactor agent to consume API client

This simplifies a lot of code by creating an interface for
the codersdk client into the agent. It also moves agent
authentication code so instance identity will work between
restarts.

Fixes #3485 and #4082.

* Fix client reconnections
2022-10-23 22:35:08 -05:00
173b7a2c83 fix: Start SFTP sessions in user home (working directory) (#4549)
* fix: Start SFTP sessions in user home (working directory)

This commit switches to our fork of `pkg/sftp` which includes a Server
option for changing the current working directory.

Attempt to upstream: https://github.com/pkg/sftp/pull/528

Supercedes and closes #4420

Fixes #3620

* Update fork
2022-10-21 09:54:06 -05:00
2ba4a62a0d feat: Add high availability for multiple replicas (#4555)
* feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* remove printlns

* close all connections on coordinator

* impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* fixup! impelement high availability feature

* Add replicas

* Add DERP meshing to arbitrary addresses

* Move packages to highavailability folder

* Move coordinator to high availability package

* Add flags for HA

* Rename to replicasync

* Denest packages for replicas

* Add test for multiple replicas

* Fix coordination test

* Add HA to the helm chart

* Rename function pointer

* Add warnings for HA

* Add the ability to block endpoints

* Add flag to disable P2P connections

* Wow, I made the tests pass

* Add replicas endpoint

* Ensure close kills replica

* Update sql

* Add database latency to high availability

* Pipe TLS to DERP mesh

* Fix DERP mesh with TLS

* Add tests for TLS

* Fix replica sync TLS

* Fix RootCA for replica meshing

* Remove ID from replicasync

* Fix getting certificates for meshing

* Remove excessive locking

* Fix linting

* Store mesh key in the database

* Fix replica key for tests

* Fix types gen

* Fix unlocking unlocked

* Fix race in tests

* Update enterprise/derpmesh/derpmesh.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Rename to syncReplicas

* Reuse http client

* Delete old replicas on a CRON

* Fix race condition in connection tests

* Fix linting

* Fix nil type

* Move pubsub to in-memory for twenty test

* Add comment for configuration tweaking

* Fix leak with transport

* Fix close leak in derpmesh

* Fix race when creating server

* Remove handler update

* Skip test on Windows

* Fix DERP mesh test

* Wrap HTTP handler replacement in mutex

* Fix error message for relay

* Fix API handler for normal tests

* Fix speedtest

* Fix replica resend

* Fix derpmesh send

* Ping async

* Increase wait time of template version jobd

* Fix race when closing replica sync

* Add name to client

* Log the derpmap being used

* Don't connect if DERP is empty

* Improve agent coordinator logging

* Fix lock in coordinator

* Fix relay addr

* Fix race when updating durations

* Fix client publish race

* Run pubsub loop in a queue

* Store agent nodes in order

* Fix coordinator locking

* Check for closed pipe

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-10-17 13:43:30 +00:00
39cf329404 fix: Replace access URL for built-in DERP servers (#4197)
Fixes #4195.
2022-09-26 12:56:04 -05:00
4c8be34d81 feat: add health check monitoring to workspace apps (#4114) 2022-09-23 15:51:04 -04:00
a7ee8b31e0 fix: Don't use StatusAbnormalClosure (#4155) 2022-09-22 18:26:05 +00:00