Commit Graph

69 Commits

Author SHA1 Message Date
677721e4a1 fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers (#6320)
* fix(tailnet): Skip nodes without DERP, avoid use of RemoveAllPeers
2023-02-24 18:16:29 +02:00
4432cd08d6 chore: update tailscale (#6091) 2023-02-09 21:43:18 -06:00
691495d761 feat: add expanded_directory to the agent for extension support (#6087)
This will enable opening the default `dir` of an agent in
the VS Code extension!
2023-02-07 21:35:09 +00:00
2c2bbcc019 chore: update tests to support fish (#6023)
* fix: update tests to add fish support

* Track connections for SSH sessions to prevent leaks

* Revert SSH conn handling
2023-02-03 12:25:11 -06:00
7ad87505c8 chore: move agent functions from codersdk into agentsdk (#5903)
* chore: rename `AgentConn` to `WorkspaceAgentConn`

The codersdk was becoming bloated with consts for the workspace
agent that made no sense to a reader. `Tailnet*` is an example
of these consts.

* chore: remove `Get` prefix from *Client functions

* chore: remove `BypassRatelimits` option in `codersdk.Client`

It feels wrong to have this as a direct option because it's so infrequently
needed by API callers. It's better to directly modify headers in the two
places that we actually use it.

* Merge `appearance.go` and `buildinfo.go` into `deployment.go`

* Merge `experiments.go` and `features.go` into `deployment.go`

* Fix `make gen` referencing old type names

* Merge `error.go` into `client.go`

`codersdk.Response` lived in `error.go`, which is wrong.

* chore: refactor workspace agent functions into agentsdk

It was odd conflating the codersdk that clients should use
with functions that only the agent should use. This separates
them into two SDKs that are closely coupled, but separate.

* Merge `insights.go` into `deployment.go`

* Merge `organizationmember.go` into `organizations.go`

* Merge `quota.go` into `workspaces.go`

* Rename `sse.go` to `serversentevents.go`

* Rename `codersdk.WorkspaceAppHostResponse` to `codersdk.AppHostResponse`

* Format `.vscode/settings.json`

* Fix outdated naming in `api.ts`

* Fix app host response

* Fix unsupported type

* Fix imported type
2023-01-29 15:47:24 -06:00
1cd5f38cb0 feat: add debug server for tailnet coordinators (#5861)
Implements a Tailscale-like debug server for our in-memory coordinator. This should provide some visibility into why connections could be failing.
Resolves: https://github.com/coder/coder/issues/5845

![image](https://user-images.githubusercontent.com/6332295/214680832-2724d633-2d54-44d6-a7ce-5841e5824ee5.png)
2023-01-25 21:27:36 +00:00
138887de7e feat: Add workspace agent lifecycle state reporting (#5785) 2023-01-24 14:24:27 +02:00
73afdd7c09 chore: agent_test.go: use ptty.Peek() instead of expecting caret in TestAgent_SessionTTYShell (#5821) 2023-01-23 11:23:25 +00:00
f1fe2b5c06 feat: add GPG forwarding to coder ssh (#5482) 2023-01-06 07:52:19 +00:00
760419a965 chore: Refactor agent tests to avoid t.Run when not needed (#5376)
It turns out that tests containing subtests should probably be
limited to table-based tests and tests that share a common setup.

Writing tests with a subtest like this:

```
func TestSomething(t *testing.T) {
	t.Run("Subtest", func(t *testing.T) {})
}
```

Has the following disadvantages:

- It can lead to multiple tests failing with `(unknown)` status when
  only one of the subtests hangs (never exits)
- In Go 1.20rc1, using `t.Setenv` is no longer allowed if the parent
  test is parallel
2022-12-12 22:20:46 +02:00
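As a contrast to the snippet in the commit message above, here is a minimal sketch of the two shapes it recommends: a table-based test, where subtests share one setup, and a standalone top-level test for a one-off check. The function `firstLine` and both test names are hypothetical and only serve the illustration; they are not taken from the repository.

```go
package main

import (
	"strings"
	"testing"
)

// firstLine is a hypothetical function under test: it returns the text
// before the first newline in s.
func firstLine(s string) string {
	line, _, _ := strings.Cut(s, "\n")
	return line
}

// TestFirstLine is table-based: every case shares the same assertion,
// which is the situation where subtests are worth having.
func TestFirstLine(t *testing.T) {
	t.Parallel()

	cases := []struct {
		name  string
		input string
		want  string
	}{
		{name: "Empty", input: "", want: ""},
		{name: "SingleLine", input: "hello", want: "hello"},
		{name: "MultiLine", input: "hello\nworld", want: "hello"},
	}
	for _, tc := range cases {
		tc := tc // capture the range variable (required before Go 1.22)
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel()
			if got := firstLine(tc.input); got != tc.want {
				t.Errorf("firstLine(%q) = %q, want %q", tc.input, got, tc.want)
			}
		})
	}
}

// TestFirstLine_TrailingNewline is a one-off check written as its own
// top-level test rather than as a lone subtest, so a hang is reported
// against this test by name.
func TestFirstLine_TrailingNewline(t *testing.T) {
	t.Parallel()
	if got := firstLine("hello\n"); got != "hello" {
		t.Errorf("got %q, want %q", got, "hello")
	}
}
```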
88bb901283 fix: Close tailnet if agent is closed during creation (#5375) 2022-12-12 11:26:49 +00:00
05130db571 fix: Improve closing of services in agent tests (#5355) 2022-12-09 12:22:27 +02:00
eff99f78fa feat: Add support for MOTD file in coder agents (#5147) 2022-11-24 12:22:20 +00:00
ae38bbeab6 chore: refactor agent stats streaming (#5112) 2022-11-18 16:46:53 -06:00
69e8c9e7b4 feat: add reconnectingpty loadtest (#5083) 2022-11-17 16:57:15 +00:00
73f91e4690 ci: use big runners (#4990)
* chore: Close idle connections on test cleanup

It's possible that this was the source of a leak on Windows...

* ci: use big runners

* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Add logging to Startupscript test

* Add better logging

* Write startup script logs to fs dir

* Fix startup script test

* Fix startup script test

* Reduce test timeout

* Use central tmp dir in agent

* Adjust output

* Skip startup script test on Windows

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2022-11-13 14:23:23 -06:00
82f494c99c fix: Improve tailnet connections by reducing timeouts (#5043)
* fix: Improve tailnet connections by reducing timeouts

This awaits connection ping before running a dial. Before,
we were hitting the TCP retransmission and handshake timeouts,
which could intermittently add 1 or 5 seconds to a connection
being initialized.

* Update Tailscale
2022-11-13 11:33:05 -06:00
16e9b1eb1a fix: Add timeouts to every tailnet ping (#4986)
A ping isn't guaranteed to deliver, so these need to have a
tight timeout for tests to not flake.
2022-11-09 20:12:51 +00:00
d82364b9b5 feat: make trace provider in loadtest, add tracing to sdk (#4939) 2022-11-09 08:10:48 +10:00
8e743d28c8 fix: Use instance identity session token for git subcommands (#4884)
This broke using gitssh with instance identity!
2022-11-04 09:44:36 -07:00
104d6608d9 feat: Add VSCODE_PROXY_URI to surface code-server ports (#4798)
* feat: Add `VSCODE_PROXY_URI` to surface code-server ports

Fixes #4776.

* Check if app host is provided
2022-11-04 04:45:43 +00:00
a0bdb4fca2 fix: Remove pkg/sftp fork, fix SFTP test (#4759) 2022-10-26 16:02:06 +03:00
eec406b739 feat: Add Git auth for GitHub, GitLab, Azure DevOps, and BitBucket (#4670)
* Add scaffolding

* Move migration

* Add endpoints for gitauth

* Add configuration files and tests!

* Update typesgen

* Convert configuration format for git auth

* Fix unclosed database conn

* Add overriding VS Code configuration

* Fix Git screen

* Write VS Code special configuration if providers exist

* Enable automatic cloning from VS Code

* Add tests for gitaskpass

* Fix feature visibility

* Add banner for too many configurations

* Fix update loop for oauth token

* Jon comments

* Add deployment config page
2022-10-24 19:46:24 -05:00
bf3224e373 fix: Refactor agent to consume API client (#4715)
* fix: Refactor agent to consume API client

This simplifies a lot of code by creating an interface for
the codersdk client into the agent. It also moves agent
authentication code so instance identity will work between
restarts.

Fixes #3485 and #4082.

* Fix client reconnections
2022-10-23 22:35:08 -05:00
173b7a2c83 fix: Start SFTP sessions in user home (working directory) (#4549)
* fix: Start SFTP sessions in user home (working directory)

This commit switches to our fork of `pkg/sftp` which includes a Server
option for changing the current working directory.

Attempt to upstream: https://github.com/pkg/sftp/pull/528

Supersedes and closes #4420

Fixes #3620

* Update fork
2022-10-21 09:54:06 -05:00
2ba4a62a0d feat: Add high availability for multiple replicas (#4555)
* feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* fixup! feat: HA tailnet coordinator

* remove printlns

* close all connections on coordinator

* implement high availability feature

* fixup! implement high availability feature

* fixup! implement high availability feature

* fixup! implement high availability feature

* fixup! implement high availability feature

* Add replicas

* Add DERP meshing to arbitrary addresses

* Move packages to highavailability folder

* Move coordinator to high availability package

* Add flags for HA

* Rename to replicasync

* Denest packages for replicas

* Add test for multiple replicas

* Fix coordination test

* Add HA to the helm chart

* Rename function pointer

* Add warnings for HA

* Add the ability to block endpoints

* Add flag to disable P2P connections

* Wow, I made the tests pass

* Add replicas endpoint

* Ensure close kills replica

* Update sql

* Add database latency to high availability

* Pipe TLS to DERP mesh

* Fix DERP mesh with TLS

* Add tests for TLS

* Fix replica sync TLS

* Fix RootCA for replica meshing

* Remove ID from replicasync

* Fix getting certificates for meshing

* Remove excessive locking

* Fix linting

* Store mesh key in the database

* Fix replica key for tests

* Fix types gen

* Fix unlocking unlocked

* Fix race in tests

* Update enterprise/derpmesh/derpmesh.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Rename to syncReplicas

* Reuse http client

* Delete old replicas on a CRON

* Fix race condition in connection tests

* Fix linting

* Fix nil type

* Move pubsub to in-memory for twenty test

* Add comment for configuration tweaking

* Fix leak with transport

* Fix close leak in derpmesh

* Fix race when creating server

* Remove handler update

* Skip test on Windows

* Fix DERP mesh test

* Wrap HTTP handler replacement in mutex

* Fix error message for relay

* Fix API handler for normal tests

* Fix speedtest

* Fix replica resend

* Fix derpmesh send

* Ping async

* Increase wait time of template version jobs

* Fix race when closing replica sync

* Add name to client

* Log the derpmap being used

* Don't connect if DERP is empty

* Improve agent coordinator logging

* Fix lock in coordinator

* Fix relay addr

* Fix race when updating durations

* Fix client publish race

* Run pubsub loop in a queue

* Store agent nodes in order

* Fix coordinator locking

* Check for closed pipe

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-10-17 13:43:30 +00:00
39cf329404 fix: Replace access URL for built-in DERP servers (#4197)
Fixes #4195.
2022-09-26 12:56:04 -05:00
4c8be34d81 feat: add health check monitoring to workspace apps (#4114) 2022-09-23 15:51:04 -04:00
a7ee8b31e0 fix: Don't use StatusAbnormalClosure (#4155) 2022-09-22 18:26:05 +00:00
714c366d16 chore: Remove WebRTC networking (#3881)
* chore: Remove WebRTC networking

* Fix race condition

* Fix WebSocket not closing
2022-09-19 19:46:29 -05:00
0f8c2f592e feat: Use Tailscale networking by default (#4003)
* feat: Use Tailscale networking by default

Removal of WebRTC code will happen in another PR, but it
felt dangerous to default and remove in a single commit.

Ideally, we can release this version and collect final
thoughts and feedback before a full commitment.

* Remove UNIX forwarding

Tailscale doesn't support this, and adding support
for it shouldn't block our rollout. Customers can
always forward over SSH.

* Update cli/portforward_test.go

Co-authored-by: Dean Sheather <dean@deansheather.com>

Co-authored-by: Dean Sheather <dean@deansheather.com>
2022-09-13 15:55:56 -05:00
00104096c2 fix: Resolve CI flakes for tailnet agent (#3924) 2022-09-07 09:24:58 -05:00
1254e7a902 feat: Add speedtest command for tailnet (#3874) 2022-09-05 17:15:49 -05:00
04b03792cb feat: add last used to Workspaces page (#3816) 2022-09-02 00:08:51 +00:00
30f8fd9b95 Daily Active User Metrics (#3735)
* agent: add StatsReporter

* Stabilize protoc
2022-09-01 14:58:23 -05:00
9bd83e5ec7 feat: Add Tailscale networking (#3505)
* fix: Add coder user to docker group on installation

This makes for a simpler setup, and reduces the likelihood
a user runs into a strange issue.

* Add wgnet

* Add ping

* Add listening

* Finish refactor to make this work

* Add interface for swapping

* Fix conncache with interface

* chore: update gvisor

* fix tailscale types

* linting

* more linting

* Add coordinator

* Add coordinator tests

* Fix coordination

* It compiles!

* Move all connection negotiation in-memory

* Migrate coordinator to use net.conn

* Add closed func

* Fix close listener func

* Make reconnecting PTY work

* Fix reconnecting PTY

* Update CI to Go 1.19

* Add CLI flags for DERP mapping

* Fix Tailnet test

* Rename ConnCoordinator to TailnetCoordinator

* Remove print statement from workspace agent test

* Refactor wsconncache to use tailnet

* Remove STUN from unit tests

* Add migrate back to dump

* chore: Upgrade to Go 1.19

This is required as part of #3505.

* Fix reconnecting PTY tests

* fix: update wireguard-go to fix devtunnel

* fix migration numbers

* linting

* Return early for status if endpoints are empty

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Fix frontend entities

* Fix agent bicopy

* Fix race condition for the last node

* Fix down migration

* Fix connection RBAC

* Fix migration numbers

* Fix forwarding TCP to a local port

* Implement ping for tailnet

* Rename to ForceHTTP

* Add external derpmapping

* Expose DERP region names to the API

* Add global option to enable Tailscale networking for web

* Mark DERP flags hidden while testing

* Update DERP map on reconnect

* Add close func to workspace agents

* Fix race condition in upstream dependency

* Fix feature columns race condition

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-08-31 20:09:44 -05:00
e44f7adb7e feat: Set SSH env vars: SSH_CLIENT, SSH_CONNECTION and SSH_TTY (#3622)
Fixes #2339
2022-08-23 21:19:57 +03:00
f7ccfa2ab9 feat: Set CODER=true in workspaces (#3637)
Fixes #2340
2022-08-23 14:29:01 +03:00
4730c589fe chore: Use standardized test timeouts and delays (#3291) 2022-08-01 15:45:05 +03:00
36ffdce065 Return proper exit code on ssh with TTY (#3192)
* Return proper exit code on ssh with TTY

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix revive lint

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix Windows exit code for missing command

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix close error handling on agent TTY

Signed-off-by: Spike Curtis <spike@coder.com>
2022-07-27 14:23:28 -05:00
0b86c8047c fix: Close connections in agent tests (#3196) 2022-07-26 13:24:54 +03:00
51dd1fde3b fix: Remove use of require in require.Eventually in tests (#3110)
* fix: Remove use of `require` in `require.Eventually` in tests

Because require uses `t.FailNow()` and `require.Eventually` runs the
function in a goroutine, which is not allowed.

* feat: Add ruleguard for require.Eventually

Co-authored-by: Cian Johnston <cian@coder.com>
2022-07-22 20:02:49 +03:00
7d07e670ca chore: Improve test cleanup (#3112) 2022-07-22 15:14:45 +03:00
d9da96cad0 fix: Add test for SCP (#2692)
* fix: Elongate agent disconnect timeout in tests

This will fix the flake seen here:
https://github.com/coder/coder/runs/7071719863?check_suite_focus=true

* fix: Add test for SCP

This was hanging due to the stdin pipe never being closed.
A test has been added to make sure it works!
2022-06-27 17:41:53 +01:00
b9f3fe49cb fix: Start login shells on macOS and Linux (#2437)
This appends `-l` to the shell command on macOS and Linux.
It also adds environment variable expansion to allow for
chaining from `coder_agent.env`.
2022-06-17 05:54:45 +00:00
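The chaining that environment variable expansion enables (for example, extending `PATH` rather than replacing it) can be sketched with `os.Expand` from the standard library. The helper `expandEnv` and its processing order are assumptions made for illustration; the agent's real logic may differ.

```go
package main

import (
	"os"
	"sort"
)

// expandEnv layers extra on top of base. References such as $VAR or ${VAR}
// in an extra value are resolved against the variables known so far, so
// "PATH=$PATH:/opt/bin" appends to the inherited PATH. Keys of extra are
// processed in sorted order (an assumption of this sketch) to keep the
// result deterministic.
func expandEnv(base, extra map[string]string) map[string]string {
	out := make(map[string]string, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}

	keys := make([]string, 0, len(extra))
	for k := range extra {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		out[k] = os.Expand(extra[k], func(name string) string {
			return out[name] // unknown variables expand to ""
		})
	}
	return out
}
```

For instance, a base of `PATH=/usr/bin` combined with an extra entry `PATH=$PATH:/opt/bin` yields `PATH=/usr/bin:/opt/bin`.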
d0ed107b08 fix: Add command to reconnecting PTY (#1860)
This fixes #1708 and opens the door for PTYs to execute
non-shell commands!
2022-05-27 14:51:20 -05:00
781f3d0641 fix: use dir over full path for coder bin (#1795) 2022-05-26 19:05:46 +00:00
3052a6d88e Add coder executable to PATH (#1771) 2022-05-26 12:59:41 -05:00
4543a3b277 fix: log after test exit in TestAgent/StartupScript (#1726)
```
$ go test ./agent/ -v -run TestAgent/StartupScript -count 1
=== RUN   TestAgent
=== PAUSE TestAgent
=== CONT  TestAgent
=== RUN   TestAgent/StartupScript
=== PAUSE TestAgent/StartupScript
=== CONT  TestAgent/StartupScript
    t.go:56: 2022-05-24 20:22:39.648 [INFO]	<agent.go:112>	connected
--- PASS: TestAgent (0.00s)
    --- PASS: TestAgent/StartupScript (0.17s)
PASS
panic: Log in goroutine after TestAgent/StartupScript has completed: 2022-05-24 20:22:39.651 [WARN]	<agent.go:130>	agent script failed ...
"error": run:
             github.com/coder/coder/agent.(*agent).runStartupScript
                 /home/colin/Projects/coder/coder/agent/agent.go:183
           - signal: killed
```
2022-05-24 16:03:42 -05:00
c2f74f3cc2 chore: avoid concurrent usage of t.FailNow (#1683)
* chore: golangci: add linter rule to report usage of t.FailNow inside goroutines
* chore: avoid t.FailNow in goroutines to appease the race detector
2022-05-24 08:58:39 +01:00
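One common way to follow the rule in the commit above is to have goroutines hand their errors back over a channel and let the test goroutine decide whether to fail. The sketch below uses only the standard library; `runAll`, `errWorker`, and `TestWorkers` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"sync"
	"testing"
)

// errWorker is a hypothetical failure value for exercising runAll.
var errWorker = errors.New("worker failed")

// runAll runs each task in its own goroutine and returns every non-nil
// error. The goroutines never touch *testing.T; they only report errors.
func runAll(tasks ...func() error) []error {
	errs := make(chan error, len(tasks)) // buffered: senders never block
	var wg sync.WaitGroup
	for _, task := range tasks {
		task := task // capture the range variable (required before Go 1.22)
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := task(); err != nil {
				errs <- err
			}
		}()
	}
	wg.Wait()
	close(errs)

	var out []error
	for err := range errs {
		out = append(out, err)
	}
	return out
}

func TestWorkers(t *testing.T) {
	t.Parallel()

	errs := runAll(
		func() error { return nil },
		func() error { return nil },
	)
	// Back on the test goroutine, so t.Fatalf (and thus t.FailNow) is fine.
	if len(errs) != 0 {
		t.Fatalf("workers failed: %v", errs)
	}
}
```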