42 Commits

Author SHA1 Message Date
68a46198d3 fix(scaletest): deploy external provisionerd (#8618)
* scaletest: stop kubernetes_secret from being constantly recreated
* scaletest: ensure we do not get auto-upgraded
* scaletest: add external provisionerd deployment, the lazy way
2023-07-20 11:38:46 +01:00
278527cff4 feat(scaletest): add option to send traffic over SSH (#8521)
- Refactors the metrics logic to avoid needing to pass in a whole prometheus registry
- Adds an --ssh option to the workspace-traffic command to send SSH traffic

Fixes #8242
2023-07-18 12:17:11 +01:00
cdf9b9045f fix(scaletest/terraform): fix prometheus namespace deps, disable auto-upgrade (#8490)
* hotfix(scaletest/terraform): fix prometheus namespace deps, disable auto-upgrade

* fixup! hotfix(scaletest/terraform): fix prometheus namespace deps, disable auto-upgrade
2023-07-13 10:54:57 +01:00
c47b78c44b chore: replace wsconncache with a single tailnet (#8176) 2023-07-12 17:37:31 -05:00
435c67ab75 refactor(cli)!: move scaletest to exp/scaletest (#8339)
* refactor(cli): mv scaletest exp/scaletest

* make gen
2023-07-07 09:10:14 +01:00
7fcf319e01 fix(cli)!: protect client Logger and refactor cli scaletest tests (#8317)
- (breaking) Protects Logger and LogBodies fields of codersdk.Client with its mutex. This addresses a data race in cli/scaletest.
- Fillets the existing cli/createworkspaces unit test and moves the testing logic there into the tests under scaletest/createworkspaces.
- Adds testutil.RaceEnabled bool const and conditionally skips previously-skipped tests under scaletest/ if the race detector is enabled. This is unfortunate and sad, but I would prefer to have these tests at least running without the race detector than not running at all.
- Adds IgnoreErrors option to fake in-memory agent loggers; having the agents fail the test immediately when they encounter any sort of error isn't really helpful.
2023-07-06 09:43:39 +01:00
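
The mutex protection described in 7fcf319e01 can be illustrated with a minimal sketch. The `Client` type, its fields, and its accessor names below are illustrative stand-ins chosen for this example; they are not copied from codersdk, and the logger is simplified to a plain function. The idea is only that every read and write of the logging fields goes through one lock, so concurrent callers do not race.

```go
package main

import (
	"fmt"
	"sync"
)

// Client is an illustrative stand-in, NOT the real codersdk.Client.
type Client struct {
	mu        sync.RWMutex
	logger    func(msg string) // hypothetical logger type for this sketch
	logBodies bool
}

// SetLogger replaces the logger under the write lock.
func (c *Client) SetLogger(l func(msg string)) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.logger = l
}

// Logger returns the current logger under the read lock.
func (c *Client) Logger() func(msg string) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.logger
}

// SetLogBodies toggles body logging under the write lock.
func (c *Client) SetLogBodies(v bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.logBodies = v
}

// LogBodies reports whether body logging is enabled.
func (c *Client) LogBodies() bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.logBodies
}

func main() {
	c := &Client{}
	var wg sync.WaitGroup
	// Several goroutines read and write the same fields concurrently.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.SetLogBodies(i%2 == 0)
			_ = c.LogBodies()
		}(i)
	}
	wg.Wait()
	fmt.Println("done")
}
```

Making the fields unexported and reachable only through accessors is what makes such a change breaking for callers that previously assigned the fields directly.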
1e8cc2ca8d feat: scaletest: scale down nodegroups by default (#8276)
* feat: allow scaling down scaletest environments

* fix bugged namespace deletion

* misc fixes to scaletest.sh

* destroy namespaces is a no-op as the cluster will be gone anyway
2023-06-30 16:07:47 +01:00
a6bd85df38 feat: scaletest: add Grafana dashboard for scale testing (#8274)
* feat: scaletest: add Grafana dashboard for scale testing

Fixes #7600.

* make fmt
2023-06-30 14:04:46 +00:00
357f3b38f7 fix: scaletest: mount CODER_CACHE volume under /tmp (#8271)
Mounting the CODER_CACHE volume under /tmp/coder causes
template creation to fail due to read-only tmp dir.
2023-06-30 12:48:34 +01:00
7072b8eff5 chore: update scaletest terraform with latest findings (#8249)
Updates scaletest terraform with learnings from #8213:

- Increase max pods per node to 256
- Decrease CPU requests for test workspace to allow maxing out workspaces per node
- Explicitly set CODER_ACCESS_URL for ssh to work
- Explicitly disable rate limits in coderd
- Increase DB size for medium and large scenarios
- Mount cache volume directly under /tmp/coder instead of /tmp.
- Plumb through requests and limits for workspaces
- Plumb through requests for coderd
2023-06-29 14:03:11 +00:00
5d26637686 feat(scaletest): add license and experiment to scaletest (#8222)
* add license and experiment to scaletest

Signed-off-by: Spike Curtis <spike@coder.com>

* appease lint & fmt

2023-06-27 10:13:36 +00:00
b8437ce453 fix(scaletest): adjust sessionAffinity and scenario resources (#8205)
* scaletest: adjust scenario resources
* scaletest: set sessionAffinity=None for coder service
2023-06-26 15:54:05 +01:00
b1d1b63113 chore: ensure logs consistency across Coder (#8083) 2023-06-20 12:30:45 +02:00
6e598234b6 fix: only collect prometheus database metrics when explicitly enabled (#8045)
* fix: only collect prometheus database metrics when explicitly enabled

* add missing test

* de-duplicate wrapping
2023-06-15 12:34:16 +01:00
df842b31e8 chore: fix miscellaneous issues in scaletest scripts (#8006)
* chore: scaletest: plumb through more options

* bump terraform version

* scaletest.sh: pprof during traffic gen

* cli/scaletest: actually wait for prometheus metrics to be scraped

* increase prometheus wait
2023-06-14 09:38:04 +01:00
2bbe650eb0 chore: scaletest: collect database metrics using prometheus-postgres-exporter (#7945)
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2023-06-09 21:21:08 +00:00
efbb55803b chore: add scaletest convenience script (#7819)
- Adds a convenience script `scaletest.sh` to automate the process of running scale tests
- Enables pprof endpoint by default, and captures pprof traces before tearing down infra.
- Improves idempotency of coder_init.sh
- Removes the promtest.Float64 invocations in workspacetraffic runner, these metrics will be in prometheus.
- Increases default workspace traffic output to 40KB/s/workspace.
2023-06-08 09:30:02 +01:00
9ec1fcf1a7 ci: move timing tests to nightly gauntlet (#7910)
Test_Runner_Timing was one of our flakiest tests before.
2023-06-08 04:03:03 +00:00
795050bba3 chore: add prometheus monitoring of workspace traffic generation (#7583)
- Exposes reads/writes from scaletest traffic generation (default: 0.0.0.0:21112)
- Adds self-hosted prometheus with remote_write to loadtest terraform
- Adds convenience script to run a traffic generation test
2023-05-26 13:53:35 +01:00
854e974bb4 chore: add terraform for spinning up load test cluster (#7504)
Adds terraform configs for spinning up loadtest environments
2023-05-15 15:56:47 +01:00
50f2d0c7e9 fix: add a mutex around reading logs from scaletests (#7521) 2023-05-14 12:16:00 -05:00
08fb9a6f1b feat(cli): add trafficgen command for load testing (#7307)
This PR adds a scaletest workspace-traffic command for load testing. This opens a
ReconnectingPTY connection to each scaletest workspace (via coderd) and 
concurrently writes and reads random data to/from the PTY. Payloads are of the
form #${RANDOM_ALPHANUMERIC_STRING}, which essentially drops garbage
comments in the remote shell, and should not result in any commands being executed.
2023-05-05 10:34:58 +01:00
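
As a rough sketch of the payload shape described in 08fb9a6f1b, the snippet below builds a `#` followed by random alphanumeric characters. The function name `randomComment`, the alphabet constant, and the choice of crypto/rand are assumptions made for this example; the actual command may generate its payloads differently.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// alphanumeric is the character set assumed for this sketch.
const alphanumeric = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// randomComment returns "#" followed by n random alphanumeric characters.
// A shell treats such a string as a comment, so it should not run anything.
func randomComment(n int) (string, error) {
	buf := make([]byte, 0, n+1)
	buf = append(buf, '#')
	limit := big.NewInt(int64(len(alphanumeric)))
	for i := 0; i < n; i++ {
		idx, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		buf = append(buf, alphanumeric[idx.Int64()])
	}
	return string(buf), nil
}

func main() {
	payload, err := randomComment(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(payload), payload[:1])
}
```

Because every payload starts with `#`, a POSIX-style shell on the remote end reads it as a comment rather than a command, which matches the intent stated in the commit message.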
f1dfeb03db chore: fix flake in apptest reconnecting-pty test (#7281) 2023-04-26 00:31:41 +00:00
bf0fed4f3f chore: Update pion/udp and improve parallel/non-parallel tests (#7164)
* test(all): Improve and fix subtests with parallel/nonparallel parents

* chore: Update pion/udp to fix buffer close
2023-04-17 20:23:10 +03:00
17f692a89a fix(scaletest): correctly validate configs using SessionToken (#7111) 2023-04-12 17:36:05 -05:00
a44070e2ec feat(scaletest): allow scaletests to run using the host credentials (#7075) 2023-04-11 19:49:28 +00:00
72c84c5b0a fix(loadtest): use cryptorand.String to generate user password (#7006) 2023-04-05 12:52:47 -05:00
348530000f fix(coderd): Ensure agent disconnect happens after timeout (#6600)
Fixes #6598
2023-03-14 13:14:47 +00:00
bc26c4a27f chore: skip scaletest/reconnectingpty (#6599) 2023-03-14 11:37:31 +00:00
a78786119d chore: resolve race when running load tests with logs (#6523)
See https://github.com/coder/coder/actions/runs/4370166746/jobs/7644793277
2023-03-08 21:12:20 -06:00
f05609b4da chore: format Go more aggressively 2023-02-18 18:32:09 -06:00
a54de6093b feat: add coder ping (#6161) 2023-02-13 10:38:00 -06:00
4432cd08d6 chore: update tailscale (#6091) 2023-02-09 21:43:18 -06:00
e6f5623627 chore: Rename agent statistics server to http api server (#5961) 2023-02-01 20:05:57 +02:00
0d08065488 fix: use a waitgroup to ensure all connections are cleaned up in agent (#5910)
* fix: use a waitgroup to ensure all connections are cleaned up in agent

There was a race where connections would be created at the same time as close.
The `net.Conn` produced by Tailscale doesn't close when the listener does.

* Remove accidental test
2023-01-29 17:20:30 -06:00
7ad87505c8 chore: move agent functions from codersdk into agentsdk (#5903)
* chore: rename `AgentConn` to `WorkspaceAgentConn`

The codersdk was becoming bloated with consts for the workspace
agent that made no sense to a reader. `Tailnet*` is an example
of these consts.

* chore: remove `Get` prefix from *Client functions

* chore: remove `BypassRatelimits` option in `codersdk.Client`

It feels wrong to have this as a direct option because it's so infrequently
needed by API callers. It's better to directly modify headers in the two
places that we actually use it.

* Merge `appearance.go` and `buildinfo.go` into `deployment.go`

* Merge `experiments.go` and `features.go` into `deployment.go`

* Fix `make gen` referencing old type names

* Merge `error.go` into `client.go`

`codersdk.Response` lived in `error.go`, which is wrong.

* chore: refactor workspace agent functions into agentsdk

It was odd conflating the codersdk that clients should use
with functions that only the agent should use. This separates
them into two SDKs that are closely coupled, but separate.

* Merge `insights.go` into `deployment.go`

* Merge `organizationmember.go` into `organizations.go`

* Merge `quota.go` into `workspaces.go`

* Rename `sse.go` to `serversentevents.go`

* Rename `codersdk.WorkspaceAppHostResponse` to `codersdk.AppHostResponse`

* Format `.vscode/settings.json`

* Fix outdated naming in `api.ts`

* Fix app host response

* Fix unsupported type

* Fix imported type
2023-01-29 15:47:24 -06:00
8487127f5c chore: skip reconnecting pty scale tests (#5908)
* fix: close reconnecting pty conn when exiting agent

Fixes https://github.com/coder/coder/actions/runs/4038282899/jobs/6942170850

* Fix conpty

* Fix contrib

* Skip runner tests for being flakes

* Fix gpg key test

* Fix golden files

* Fix comments
2023-01-29 14:53:49 -06:00
45eb26d5d0 fix(scaletest): increase time range check causing flake on MacOS (#5776) 2023-01-18 22:41:14 +00:00
e61234f260 feat: Add vscodeipc subcommand for VS Code Extension (#5326)
* Add extio

* feat: Add `vscodeipc` subcommand for VS Code Extension

This enables the VS Code extension to communicate with a Coder client.
The extension will download the slim binary from `/bin/*` for the
respective client architecture and OS, then execute `coder vscodeipc`
for the connecting workspace.

* Add authentication header, improve comments, and add tests for the CLI

* Update cli/vscodeipc_test.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update cli/vscodeipc_test.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Update cli/vscodeipc/vscodeipc_test.go

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>

* Fix requested changes

* Fix IPC tests

* Fix shell execution

* Fix nix flake

* Silence usage

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2022-12-18 17:50:06 -06:00
ffb8df9655 test: Disable error on agent log in scaletest/reconnectingpty (#5445)
The way the reconnectingpty tests behave will inherently cause the
agent to occasionally log an error (e.g. due to the test disconnecting at a
certain time). Allowing these error logs to fail the test will cause
these tests to be flaky.

It's best for these tests to only rely on the observed behavior.
2022-12-16 16:13:31 +02:00
e2aec2709b test: Fix scaletest/reconnectingpty commands for use in powershell (#5439) 2022-12-16 12:18:14 +02:00
6b6eac2518 feat: remove loadtest cmd, add new scaletest cmd (#5310) 2022-12-15 15:04:24 +00:00