This PR updates the coder port-forward command to periodically inform coderd that the workspace is being used:
- Adds workspaceusage.Tracker which periodically batch-updates workspace LastUsedAt
- Adds coderd endpoint to signal workspace usage
- Updates coder port-forward to periodically hit this endpoint
- Modifies BatchUpdateWorkspacesLastUsedAt to avoid overwriting with stale data
Co-authored-by: Danny Kopping <danny@coder.com>
* chore: remove max_ttl from templates
Completely removing max_ttl as a feature on template scheduling. Must use other template scheduling features to achieve autostop.
* chore: add org ID as optional param to AcquireJob
* chore: plumb through organization id to provisioner daemons
* add org id to provisioner domain key
* enforce org id argument
* dbgen provisioner jobs defaults to default org
fixes#11950https://github.com/coder/coder/issues/11950#issuecomment-1987756088 explains the bug
We were also calling into `Unlisten()` and `Close()` while holding the mutex. I don't believe that `Close()` depends on the notification loop being unblocked, but it's hard to be sure, and the safest thing to do is assume it could block.
So, I added a unit test that fakes out `pq.Listener` and sends a bunch of notifies every time we call into it to hopefully prevent regression where we hold the mutex while calling into these functions.
It also removes the use of a `context.Context` to stop the PubSub -- it must be explicitly `Closed()`. This simplifies a bunch of the logic, and is how we use the pubsub anyway.
DERP mesh key setup would do a SELECT and then an INSERT on failure, without a lock. During some testing with multiple replicas, I managed to cause a replica to crash due to them initializing simultaneously.
Fixes:
Encountered an error running "coder server"
create coder API: insert mesh key: pq: duplicate key value violates unique constraint "site_configs_key_key"
Co-authored-by: Cian Johnston <cian@coder.com>
Alternative solution to #6442
Modifies the behaviour of AcquireProvisionerJob and adds a special case for 'un-tagged' jobs such that they can only be picked up by 'un-tagged' provisioners.
Also adds comprehensive test coverage for AcquireJob given various combinations of tags.
* feat: convertGroups() no longer requires organization info
Removing role information from some users in the api. This info is
excessive and not required. It is costly to always include
* fix: move oauth2 routes
From /login/oauth2/* to /oauth2/*.
/login/oauth2 causes /login to no longer get served by the frontend,
even if nothing is actually served on /login itself.
* Add forgotten comment on delete
* chore: add database test fixture to insert non-unique linked_ids
* chore: create unit test to exercise failed email change bug
* fix: add postgres triggers to keep user_links clear of deleted users
* Add migrations to prevent deleted users with links
* Force soft delete of users, do not allow un-delete
* fix: assign new oauth users to default org
This is not a final solution, as we eventually want to be able
to map to different orgs. This makes it so multi-org does not break oauth/oidc.
The first organization created is now marked as "default". This is
to allow "single org" behavior as we move to a multi org codebase.
It is intentional that the user cannot change the default org at this
stage. Only 1 default org can exist, and it is always the first org.
Closes: https://github.com/coder/coder/issues/11961
Fixes#12030
This is a good example of the kind of thing I'd like to address with a time-testing lib. The problem is that there is a race between the watchdog starting it's timer and the test incrementing the time. What would make this easier is if the time-testing library could wait for and assert the call to start the timer before incrementing the time.
adds a watchdog to our pubsub and runs it for Coder server.
If the watchdog times out, it triggers a graceful exit in `coder server` to give any provisioner jobs a chance to shut down.
c.f. #11950
Annoyingly, prometheus Registry collects metrics async, which is causing our test to be racy. They also don't export enough from the Metric interface for us to replicate a synchronous collect, so we have to use Eventually to test.
Adds prometheus metrics to PGPubsub for monitoring its health and performance in production.
Related to #11950 --- additional diagnostics to help figure out what's happening