42 Commits

Author SHA1 Message Date
5762d8add4 fix: return only the first workspace agent script timing per script (#16203)
Fixes https://github.com/coder/coder/issues/16124

If a workspace agent crashes, it is possible for any startup scripts to
be run again. This PR makes it so that the
`GetWorkspaceAgentScriptTimingsByBuildID` query only returns the first
timing recorded per-script.
2025-01-21 11:54:43 +00:00
e232aee011 feat(site): add agent connection timings (#15276)
Close https://github.com/coder/internal/issues/116

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2024-11-01 13:29:00 -03:00
9c8ecb82a3 feat(coderd): return agent script timings (#14923)
Add the agent script timings into the
`/workspacebuilds/:workspacebuild/timings` response.

Close https://github.com/coder/coder/issues/14876
2024-10-14 09:31:03 -03:00
ae522c558d feat: add agent timings (#14713)
* feat: begin impl of agent script timings

* feat: add job_id and display_name to script timings

* fix: increment migration number

* fix: rename migrations from 251 to 254

* test: get tests compiling

* fix: appease the linter

* fix: get tests passing again

* fix: drop column from correct table

* test: add fixture for agent script timings

* fix: typo

* fix: use job id used in provisioner job timings

* fix: increment migration number

* test: behaviour of script runner

* test: rewrite test

* test: does exit 1 script break things?

* test: rewrite test again

* fix: revert change

Not sure how this came to be, I do not recall manually changing
these files.

* fix: let code breathe

* fix: wrap errors

* fix: justify nolint

* fix: swap require.Equal argument order

* fix: add mutex operations

* feat: add 'ran_on_start' and 'blocked_login' fields

* fix: update testdata fixture

* fix: refer to agent_id instead of job_id in timings

* fix: JobID -> AgentID in dbauthz_test

* fix: add 'id' to scripts, make timing refer to script id

* fix: fix broken tests and convert bug

* fix: update testdata fixtures

* fix: update testdata fixtures again

* feat: capture stage and if script timed out

* fix: update migration number

* test: add test for script api

* fix: fake db query

* fix: use UTC time

* fix: ensure r.scriptComplete is not nil

* fix: move err check to right after call

* fix: uppercase sql

* fix: use dbtime.Now()

* fix: debug log on r.scriptCompleted being nil

* fix: ensure correct rbac permissions

* chore: remove DisplayName

* fix: get tests passing

* fix: remove space in sql up

* docs: document ExecuteOption

* fix: drop 'RETURNING' from sql

* chore: remove 'display_name' from timing table

* fix: testdata fixture

* fix: put r.scriptCompleted call in goroutine

* fix: track goroutine for test + use separate context for reporting

* fix: appease linter, handle trackCommandGoroutine error

* fix: resolve race condition

* feat: replace timed_out column with status column

* test: update testdata fixture

* fix: apply suggestions from review

* revert: linter changes
2024-09-24 10:51:49 +01:00
0f8251be41 feat(coderd/database/dbpurge): retain most recent agent build logs (#14460)
Updates the `DeleteOldWorkspaceAgentLogs` query to:
- Retain logs for the most recent build regardless of age,
- Delete logs for agents that never connected and were created before
   the cutoff for deleting logs, while still retaining the logs of the most recent build.
2024-08-30 17:39:09 +01:00
a74273f1fd chore(coderd/database/dbpurge): replace usage of time.* with quartz (#14480)
Related to #10576

This PR introduces quartz to coderd/database/dbpurge and updates the following unit tests to make use of Quartz's functionality:

- TestPurge
- TestDeleteOldWorkspaceAgentLogs

Additionally, updates DeleteOldWorkspaceAgentLogs to replace the hard-coded interval with a parameter passed into the query. This aids in testing and brings us a step towards allowing operators to configure the cutoff interval for workspace agent logs.
2024-08-30 11:55:47 +01:00
79441e3609 perf(coderd/database): optimize GetWorkspaceAgentAndLatestBuildByAuthToken (#12809) 2024-03-28 19:38:16 +02:00
0723dd3abf fix: ensure agent token is from latest build in middleware (#12443) 2024-03-14 12:27:32 -04:00
7a453608c9 feat: support order property of coder_agent (#12121) 2024-02-15 13:33:13 +01:00
c0e169ebf9 feat: support custom order of agent metadata (#12066) 2024-02-08 17:29:34 +01:00
fe867d02e0 fix: correct perms for forbidden error in TemplateScheduleStore.Load (#11286)
* chore: TemplateScheduleStore.Load() throwing forbidden error
* fix: workspace agent scope to include template
2023-12-20 11:38:49 -06:00
228cbec99b fix: stop updating agent stats from deleted workspaces (#11026)
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2023-12-07 13:55:29 -05:00
a7c671ca07 feat: add workspace agent APIVersion (#10419)
Fixes #10339
2023-10-31 10:08:43 +04:00
7eeba15d16 feat(coderd): add support for sending batched agent metadata (#10223)
Part of #9782
2023-10-13 16:37:55 +03:00
1262eef2c0 feat: add support for coder_script (#9584)
* Add basic migrations

* Improve schema

* Refactor agent scripts into its own package

* Support legacy start and stop script format

* Pipe the scripts!

* Finish the piping

* Fix context usage

* It works!

* Fix sql query

* Fix SQL query

* Rename `LogSourceID` -> `SourceID`

* Fix the FE

* fmt

* Rename migrations

* Fix log tests

* Fix lint err

* Fix gen

* Fix story type

* Rename source to script

* Fix schema jank

* Uncomment test

* Rename proto to TimeoutSeconds

* Fix comments

* Fix comments

* Fix legacy endpoint without specified log_source

* Fix non-blocking by default in agent

* Fix resources tests

* Fix dbfake

* Fix resources

* Fix linting I think

* Add fixtures

* fmt

* Fix startup script behavior

* Fix comments

* Fix context

* Fix cancel

* Fix SQL tests

* Fix e2e tests

* Interrupt on Windows

* Fix agent leaking script process

* Fix migrations

* Fix stories

* Fix duplicate logs appearing

* Gen

* Fix log location

* Fix tests

* Fix tests

* Fix log output

* Show display name in output

* Fix print

* Return timeout on start context

* Gen

* Fix fixture

* Fix the agent status

* Fix startup timeout msg

* Fix command using shared context

* Fix timeout draining

* Change signal type

* Add deterministic colors to startup script logs

---------

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2023-09-25 16:47:17 -05:00
ee24260614 feat: allow configuring display apps from template (#9100) 2023-08-30 14:53:42 -05:00
5d4a17717f refactor(coderd): fetch owner information when authorizing workspace agent (#9123)
* Refactors the existing httpmw tests to use dbtestutil so that we can test them against a real database if desired,
* Modifies the GetWorkspaceAgentByAuthToken to return the owner and associated roles, removing the need for additional queries
2023-08-21 15:49:26 +01:00
07fd73c4a0 chore: allow multiple agent subsystems, add exectrace (#8933) 2023-08-08 22:10:28 -07:00
bd944e0d21 chore: rename startup logs to agent logs (#8649)
* chore: rename startup logs to agent logs

This also adds a `source` property to every agent log. It
should allow us to group logs and display them more nicely in
the UI as they stream in.

* Fix migration order

* Fix naming

* Rename the frontend

* Fix tests

* Fix down migration

* Match enums for workspace agent logs

* Fix inserting log source

* Fix migration order

* Fix logs tests

* Fix psql insert
2023-07-28 15:57:23 +00:00
8dac0356ed refactor: replace startup script logs EOF with starting/ready time (#8082)
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.

This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need to understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
2023-06-20 14:41:55 +03:00
0c5077464b fix: avoid missed logs when streaming startup logs (#8029)
* feat(coderd,agent): send startup log eof at the end

* fix(coderd): fix edge case in startup log pubsub

* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)

* fix(codersdk): fix startup log channel shared memory bug

* fix(site): remove the EOF log line
2023-06-16 17:14:22 +03:00
660bbb8d38 refactor: deprecate login_before_ready in favor of startup_script_behavior (#7837)
Fixes #7758
2023-06-06 11:58:07 +03:00
00a2413c03 feat: add telemetry support for workspace agent subsystem (#7579) 2023-05-17 22:49:25 -05:00
81e2b2500a feat: add level support for startup logs (#7067)
This allows external services like our devcontainer support to display
errors and warnings with custom styles to indicate failures to users.
2023-04-10 14:29:59 -05:00
ca4fa81570 feat: add agent metadata (#6614) 2023-03-31 15:26:19 -05:00
665b84de0d feat: use app tickets for web terminal (#6628) 2023-03-30 23:24:51 +10:00
cb7375450b feat: add startup script logs to the ui (#6558)
* Add startup script logs to the database

* Add coderd endpoints for startup script logs

* Push startup script logs from agent

* Pull startup script logs on frontend

* Rename queries

* Add constraint

* Start creating log sending loop

* Add log sending to the agent

* Add tests for streaming logs

* Shorten notify channel name

* Add FE

* Improve bulk log performance

* Finish UI display

* Fix startup log visibility

* Add warning for overflow

* Fix agent queue logs overflow

* Display startup logs in a virtual DOM for performance

* Fix agent queue with loads of logs

* Fix authorize test

* Remove faulty test

* Fix startup and shutdown reporting error

* Fix gen

* Fix comments

* Periodically purge old database entries

* Add test fixture for migration

* Add Storybook

* Check if there are logs when displaying features

* Fix startup component overflow gap

* Fix startup log wrapping

---------

Co-authored-by: Asher <ash@coder.com>
2023-03-23 14:09:13 -05:00
22e3ff96be feat(agent): Add shutdown lifecycle states and shutdown_script support (#6139)
* feat(api): Add agent shutdown lifecycle states

* feat(agent): Add shutdown_script support

* feat(agent): Add shutdown_script timeout

* feat(site): Support new agent lifecycle states

---

Co-authored-by: Marcin Tojek <marcin@coder.com>
2023-03-06 21:34:00 +02:00
691495d761 feat: add expanded_directory to the agent for extension support (#6087)
This will enable opening the default `dir` of an agent in
the VS Code extension!
2023-02-07 21:35:09 +00:00
981cac5e28 chore: Invert delay_login_until_ready, now login_before_ready (#5893) 2023-01-27 20:07:47 +00:00
138887de7e feat: Add workspace agent lifecycle state reporting (#5785) 2023-01-24 14:24:27 +02:00
eff99f78fa feat: Add support for MOTD file in coder agents (#5147) 2022-11-24 12:22:20 +00:00
90c34b74de feat: Add connection_timeout and troubleshooting_url to agent (#4937)
* feat: Add connection_timeout and troubleshooting_url to agent

This commit adds the connection timeout and troubleshooting url fields
to coder agents.

If an initial connection cannot be established within `connection_timeout`
seconds, then the agent status will be marked as `"timeout"`.

If a troubleshooting URL is configured in the Terraform template, it can
be presented to the user when the agent state is either
`"timeout"` or `"disconnected"`.

Fixes #4678
2022-11-09 17:27:05 +02:00
5be6c7071e feat: Associate connected workspace agents with replicas (#4914)
This will enable displaying a graph that associates agents
to running replicas.
2022-11-06 15:27:09 -06:00
9bd83e5ec7 feat: Add Tailscale networking (#3505)
* fix: Add coder user to docker group on installation

This makes for a simpler setup, and reduces the likelihood
a user runs into a strange issue.

* Add wgnet

* Add ping

* Add listening

* Finish refactor to make this work

* Add interface for swapping

* Fix conncache with interface

* chore: update gvisor

* fix tailscale types

* linting

* more linting

* Add coordinator

* Add coordinator tests

* Fix coordination

* It compiles!

* Move all connection negotiation in-memory

* Migrate coordinator to use net.conn

* Add closed func

* Fix close listener func

* Make reconnecting PTY work

* Fix reconnecting PTY

* Update CI to Go 1.19

* Add CLI flags for DERP mapping

* Fix Tailnet test

* Rename ConnCoordinator to TailnetCoordinator

* Remove print statement from workspace agent test

* Refactor wsconncache to use tailnet

* Remove STUN from unit tests

* Add migrate back to dump

* chore: Upgrade to Go 1.19

This is required as part of #3505.

* Fix reconnecting PTY tests

* fix: update wireguard-go to fix devtunnel

* fix migration numbers

* linting

* Return early for status if endpoints are empty

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Update cli/server.go

Co-authored-by: Colin Adler <colin1adler@gmail.com>

* Fix frontend entities

* Fix agent bicopy

* Fix race condition for the last node

* Fix down migration

* Fix connection RBAC

* Fix migration numbers

* Fix forwarding TCP to a local port

* Implement ping for tailnet

* Rename to ForceHTTP

* Add external derpmapping

* Expose DERP region names to the API

* Add global option to enable Tailscale networking for web

* Mark DERP flags hidden while testing

* Update DERP map on reconnect

* Add close func to workspace agents

* Fix race condition in upstream dependency

* Fix feature columns race condition

Co-authored-by: Colin Adler <colin1adler@gmail.com>
2022-08-31 20:09:44 -05:00
5362f4636e feat: show agent version in UI and CLI (#3709)
This commit adds the ability for agents to set their version upon start.
This is then reported in the UI and CLI.
2022-08-31 16:33:50 +01:00
9df6bc7ba1 fix: update template updated_at value (#2729)
* fix: update template updated_at value

* use Go time for all updated_at updates
2022-06-30 12:14:51 +00:00
05b67ab1cf feat: peer wireguard (#2445) 2022-06-24 10:25:01 -05:00
4cce969018 feat: Add anonymized telemetry to report product usage (#2273)
* feat: Add anonymized telemetry to report product usage

This adds a background service to report telemetry to a Coder
server for usage data. There will be realtime event data sent
in the future, but for now usage will report on a CRON.

* Fix flake and requested changes

* Add reporting options for setup

* Add reporting for workspaces

* Add resources as they are reported

* Track API key usage

* Ensure telemetry is tracked prior to exit
2022-06-17 00:26:40 -05:00
8701e0084c feat: Update Terraform provider to support "dir" in "coder_agent" (#1219)
This allows users to specify a starting directory for shell sessions.
2022-05-02 10:27:34 -05:00
19b4323512 feat: Allow workspace resources to attach multiple agents (#942)
This enables a "kubernetes_pod" to attach multiple agents that
could be for multiple services. Each agent is required to have
a unique name, so SSH syntax is:

`coder ssh <workspace>.<agent>`

A resource can also have zero agents; they aren't required.
2022-04-11 16:06:15 -05:00
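
The `<workspace>.<agent>` syntax from #942 can be parsed along these lines. `ParseSSHTarget` is a hypothetical helper, not the CLI's actual parser, and returning an empty agent name when no dot is present is an assumption about how a caller might then pick a default agent.

```go
package main

import (
	"errors"
	"strings"
)

// ParseSSHTarget splits "<workspace>.<agent>" into its two parts.
// The agent part is empty when the target contains no dot.
func ParseSSHTarget(target string) (workspace, agent string, err error) {
	workspace, agent, _ = strings.Cut(target, ".")
	if workspace == "" {
		return "", "", errors.New("workspace name is required")
	}
	return workspace, agent, nil
}
```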
fd523100bf chore: split queries.sql into files by table (#762) 2022-04-01 15:45:23 -05:00