* chore: rename startup logs to agent logs
This also adds a `source` property to every agent log. It
should allow us to group logs and display them nicer in
the UI as they stream in.
* Fix migration order
* Fix naming
* Rename the frontend
* Fix tests
* Fix down migration
* Match enums for workspace agent logs
* Fix inserting log source
* Fix migration order
* Fix logs tests
* Fix psql insert
This commit adds two new `agentsdk` functions, `StartupLogsSender` and
`StartupLogsWriter` that can be used by any client looking to send
startup logs to coderd.
We also refactor the `agent` to use these new functions.
As a bonus, agent startup logs are separated into "info" and "error"
levels to separate stdout and stderr.
---------
Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>
During agent close it was possible for the startup script logs consumer
to enter a deadlock state where by agent close was waiting via
`a.trackConnGoroutine` and the log reader for a flush event.
This refactor removes the mutex in favor of channel communication and
relies on two goroutines without shared state.
This commit reverts some of the changes in #8029 and implements an
alternative method of keeping track of when the startup script has ended
and there will be no more logs.
This is achieved by adding new agent fields for tracking when the agent
enters the "starting" and "ready"/"start_error" lifecycle states. The
timestamps simplify logic since we don't need understand if the current
state is before or after the state we're interested in. They can also be
used to show data like how long the startup script took to execute. This
also allowed us to remove the EOF field from the logs as the
implementation was problematic when we returned the EOF log entry in the
response since requesting _after_ that ID would give no logs and the API
would thus lose track of EOF.
* feat(coderd,agent): send startup log eof at the end
* fix(coderd): fix edge case in startup log pubsub
* fix(coderd): ensure startup logs are closed on lifecycle state change (fallback)
* fix(codersdk): fix startup log channel shared memory bug
* fix(site): remove the EOF log line
* chore: skip timing-sensistive AgentMetadata test in the standard suite
* Add test-timing target
* fix windows?
* Works on my Windows desktop?
* Use tag system
* fixup! Use tag system
* fix: don't make session counts cumulative
This made for some weird tracking... we want the point-in-time
number of counts!
* Add databasefake query for getting agent stats
* Add deployment stats endpoint
* The query... works?!?
* Fix aggregation query
* Select from multiple tables instead
* Fix continuous stats
* Increase period of stat refreshes
* Add workspace counts to deployment stats
* fmt
* Add a slight bit of responsiveness
* Fix template version editor overflow
* Add refresh button
* Fix font family on button
* Fix latest stat being reported
* Revert agent conn stats
* Fix linting error
* Fix tests
* Fix gen
* Fix migrations
* Block on sending stat updates
* Add test fixtures
* Fix response structure
* make gen
* feat(api): Add agent shutdown lifecycle states
* feat(agent): Add shutdown_script support
* feat(agent): Add shutdown_script timeout
* feat(site): Support new agent lifecycle states
---
Co-authored-by: Marcin Tojek <marcin@coder.com>
* chore: convert workspace agent stats from json to table
* chore: convert agent stats to use a table
Backwards compatibility becomes hard when all agent stats are in a JSON blob.
We also want to query this table for new agents that are failing health checks
so we can display it in the UI.
* Fix migration using default values