Commit Graph

2409 Commits

Author SHA1 Message Date
e191d9650c feat: support created_at filter for the GET /users endpoint (#15633)
Closes https://github.com/coder/coder/issues/12747

We support these filters currently:
https://coder.com/docs/v2/latest/admin/users#user-filtering, adding
`created_at` filter as well.
2024-12-17 15:24:54 +11:00
409e2c7a20 fix: use random names for TestUpdateUserProfile (#15868)
Fixes a flake seen in
https://github.com/coder/coder/actions/runs/12346801529/job/34452940351
It's possible but exceedingly rare for the randomly generated username
to be exactly 32 characters.
Then, appending a `1` to that username causes the username to be invalid
and the test to fail. Instead of appending we'll just generate a new
username that is <=32 characters.

The `UpdateSelf` subtest has the same appending, but uses a fixed
username that is less than 32 characters, so it doesn't need to be
changed.
2024-12-16 06:59:44 +00:00
50ff06cc3c chore: acquire lock for individual workspace transition (#15859)
When Coder is ran in High Availability mode, each Coder instance has a
lifecycle executor. These lifecycle executors are all trying to do the
same work, and whilst transactions saves us from this causing an issue,
we are still doing extra work that could be prevented.

This PR adds a `TryAcquireLock` call for each attempted workspace
transition, meaning two Coder instances shouldn't duplicate effort.
2024-12-13 16:59:27 +00:00
d31c2f1fe7 chore: implement SCIM PUT endpoint, protect against missing active (#15829)
Closes https://github.com/coder/coder/issues/15828
2024-12-12 08:11:13 -06:00
36c2cf8a40 fix(coderd/database): exclude canceled jobs in queue position (#15835)
When calculating the queue position in
`GetProvisionerJobsByIDsWithQueuePosition` we only counted jobs with
`started_at = NULL`. This is misleading, as it allows canceling or
canceled jobs to take up rows in the computed queue position, giving an
impression that the queue is larger than it really is.

This modifies the query to also exclude jobs with a null `canceled_at`,
`completed_at`, or `error` field for the purposes of calculating the
queue position, and also adds a test to validate this behaviour.

(Note: due to the behaviour of `dbgen.ProvisionerJob` with `dbmem` I had
to use other proxy methods to validate the corresponding dbmem
implementation.)

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-12-12 12:37:45 +00:00
06e7739e7d chore: add e2e tests for organization members (#15807) 2024-12-11 15:48:48 -07:00
b39becba66 feat(site): add a provisioner warning to workspace builds (#15686)
This PR adds warnings about provisioner health to workspace build pages.
It closes https://github.com/coder/coder/issues/15048


![image](https://github.com/user-attachments/assets/fa54d0e8-c51f-427a-8f66-7e5dbbc9baca)

![image](https://github.com/user-attachments/assets/b5169669-ab05-43d5-8553-315a3099b4fd)
2024-12-11 13:38:13 +02:00
49c453b42e chore: replace unmaintained ping library (#15808)
Relates to
https://github.com/coder/coder/pull/15712#issuecomment-2527841239.

We only use a ping library to determine the closest devtunnel node, so
is a very minor change.
2024-12-10 08:51:46 +05:00
40624bf78b fix: update workspace TTL on template TTL change (#15761)
Relates to https://github.com/coder/coder/issues/15390

Currently when a user creates a workspace, their workspace's TTL is
determined by the template's default TTL. If the Coder instance is AGPL,
or if the template has disallowed the user from configuring autostop,
then it is not possible to change the workspace's TTL after creation.
Any changes to the template's default TTL only takes effect on _new_
workspaces.

This PR modifies the behaviour slightly so that on AGPL Coder, or on
enterprise when a template does not allow user's to configure their
workspace's TTL, updating the template's default TTL will also update
any workspace's TTL to match this value.
2024-12-06 11:01:39 +00:00
e744cde86f fix(coderd): ensure that clearing invalid oauth refresh tokens works with dbcrypt (#15721)
https://github.com/coder/coder/pull/15608 introduced a buggy behaviour
with dbcrypt enabled.
When clearing an oauth refresh token, we had been setting the value to
the empty string.
The database encryption package considers decrypting an empty string to
be an error, as an empty encrypted string value will still have a nonce
associated with it and thus not actually be empty when stored at rest.

Instead of 'deleting' the refresh token, 'update' it to be the empty
string.
This plays nicely with dbcrypt.

It also adds a 'utility test' in the dbcrypt package to help encrypt a
value. This was useful when manually fixing users affected by this bug
on our dogfood instance.
2024-12-03 13:26:31 -06:00
148a5a3593 fix: fix goroutine leak in log streaming over websocket (#15709)
fixes #14881

Our handlers for streaming logs don't read from the websocket. We don't allow the client to send us any data, but the websocket library we use requires reading from the websocket to properly handle pings and closing. Not doing so can [can cause the websocket to hang on write](https://github.com/coder/websocket/issues/405), leaking go routines which were noticed in #14881.

This fixes the issue, and in process refactors our log streaming to a encoder/decoder package which provides generic types for sending JSON over websocket.

I'd also like for us to upgrade to the latest https://github.com/coder/websocket but we should also upgrade our tailscale fork before doing so to avoid including two copies of the websocket library.
2024-12-03 10:12:30 +04:00
e21a301682 fix: make GetWorkspacesEligibleForTransition return even less false positives (#15594)
Relates to https://github.com/coder/coder/issues/15082

Further to https://github.com/coder/coder/pull/15429, this reduces the
amount of false-positives returned by the 'is eligible for autostart'
part of the query. We achieve this by calculating the 'next start at'
time of the workspace, storing it in the database, and using it in our
`GetWorkspacesEligibleForTransition` query.

The prior implementation of the 'is eligible for autostart' query would
return _all_ workspaces that at some point in the future _might_ be
eligible for autostart. This now ensures we only return workspaces that
_should_ be eligible for autostart.

We also now pass `currentTick` instead of `t` to the
`GetWorkspacesEligibleForTransition` query as otherwise we'll have one
round of workspaces that are skipped by `isEligibleForTransition` due to
`currentTick` being a truncated version of `t`.
2024-12-02 21:02:36 +00:00
2b57dcc68c feat(coderd): add matched provisioner daemons information to more places (#15688)
- Refactors `checkProvisioners` into `db2sdk.MatchedProvisioners`
- Adds a separate RBAC subject just for reading provisioner daemons
- Adds matched provisioners information to additional endpoints relating to
  workspace builds and templates
-Updates existing unit tests for above endpoints
-Adds API endpoint for matched provisioners of template dry-run job
-Updates CLI to show warning when creating/starting/stopping/deleting
 workspaces for which no provisoners are available

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2024-12-02 20:54:32 +00:00
3014713c47 fix(cli): handle version mismatch re MatchedProvisioners response (#15682)
* Modifies `MatchedProvisioners` response of `codersdk.TemplateVersion`
to be a pointer
* CLI now checks for absence of `*MatchedProvisioners` before showing
warning regarding provisioners
* Extracts logic for warning about provisioners to a function
* Improves test coverage for CLI template push with
`coder_workspace_tags`.
2024-11-29 19:45:58 +00:00
ef09b51912 fix(coderd): extract provisionerdserver.StaleInterval to 90 seconds (#15643)
Follow-up from https://github.com/coder/coder/pull/15578

Extracts `provisionerdserver.StaleInterval` and sets it to 90 seconds by
default
2024-11-28 12:57:43 +00:00
40f12aeca3 chore: update group and role sync notes (#15658) 2024-11-27 14:39:03 -07:00
b830c05e3e chore: track usage of built-in example templates (#15671)
Addresses https://github.com/coder/nexus/issues/99.

Changes:
- Save the id of the built-in example template used to create a template
version in the database
- Include the example id in telemetry
2024-11-27 20:01:08 +01:00
83c493e832 chore: fix more flaky tests on Windows with Postgres (#15629)
Addresses the following flakes:

- https://github.com/coder/internal/issues/222
- https://github.com/coder/internal/issues/223
- https://github.com/coder/internal/issues/224
- https://github.com/coder/internal/issues/225
- https://github.com/coder/internal/issues/226
- https://github.com/coder/internal/issues/227
- https://github.com/coder/internal/issues/228
- https://github.com/coder/internal/issues/229
- https://github.com/coder/internal/issues/230
2024-11-26 11:56:07 +01:00
8afb10e090 chore: improve validation of Security tag in swaggerparser (#15660)
Aims to resolve #15605 

There's currently one option valid for the `@Security` tag in
swaggerparser - which fails in the CI if we try to put any other value.

At least one of our endpoints does not accept `CoderSessionToken` as an
option for the authentication and so we need to add new possibilities in
order to keep the documentation up-to-date.

In this PR , I added `ProvisionerKey` which is the way our provisioner
daemon can authenticate to the backend - also modified a bit the code to
simplify other options later.
2024-11-26 07:19:43 +01:00
60ddcf5de2 chore: improve testing coverage on ExtractProvisionerDaemonAuthenticated middleware (#15622)
This one aims to resolve #15604 

Created some table tests for the main cases - 
also preferred to create two isolated cases for the most complicated
cases in order to keep table tests simple enough.

Give us full coverage on the middleware logic, for both optional and non
optional cases - PSK and ProvisionerKey.
2024-11-26 04:02:20 +01:00
d60b58874e fix: update /builds transition example (#15657) 2024-11-26 00:52:23 +00:00
0896f339c4 refactor(coderd/provisionerdserver): use quartz.Clock instead of TimeNowFn (#15642)
Replace `TimeNowFn` in `provisionerdserver` with `quartz.Clock` as
well as pass `coderd`'s `Clock` to `provisionerdserver`.
2024-11-25 16:25:36 +00:00
1cdc3e8921 feat!: extract provisioner tags from coder_workspace_tags data source (#15578)
Relates to https://github.com/coder/coder/issues/15087 and
https://github.com/coder/coder/issues/15427

- Extracts provisioner job tags from `coder_workspace_tags` on template
version creation using `provisioner/terraform/tfparse` added in
https://github.com/coder/coder/pull/15236
- Drops a WARN log in coderd if no matching provisioners found.
- Also drops a warning message in the CLI if no provisioners are found.
- To support both CLI and UI warnings, added a
`codersdk.MatchedProvisioners` struct to the `TemplateVersion` response
containing details of how many provisioners were around at the time of
the insert.

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-11-25 11:19:14 +00:00
26438aa91f chore: implement OIDCClaimFieldValues for idp sync mappings auto complete (#15576)
When creating IDP sync mappings, these are the values that can be
selected from. These are the values that can be mapped from in
org/group/role sync.
2024-11-21 13:04:00 -06:00
5b7fa78676 chore: add deployment config option to append custom csp directives (#15596)
Allows adding custom static CSP directives to Coder. Niche use case but
makes this easier then creating a reverse proxy that has to replace the
header. We want to preserve our directives, so having an append option
is preferred to a "replace" option via a reverse proxy.


Closes https://github.com/coder/coder/issues/15118
2024-11-21 11:53:53 -06:00
78f9f43c97 chore: do not refresh tokens that have already failed refreshing (#15608)
Once a token refresh fails, we remove the `oauth_refresh_token` from the
database. This will prevent the token from hitting the IDP for
subsequent refresh attempts.

Without this change, a bad script can cause a failing token to hit a
remote IDP repeatedly with each `git` operation. With this change, after
the first hit, subsequent hits will fail locally, and never contact the
IDP.

The solution in both cases is to authenticate the external auth link. So
the resolution is the same as before.
2024-11-20 20:13:07 -06:00
a518017a88 feat(coderd): add endpoint to fetch provisioner key details (#15505)
This PR is the first step aiming to resolve #15126 - 

Creating a new endpoint to return the details associated to a
provisioner key.

This is an authenticated endpoints aiming to be used by the provisioner
daemons - using the provisioner key as authentication method.

This endpoint is not ment to be used with PSK or User Sessions.
2024-11-20 18:04:47 +01:00
6ed76921dd chore: fix windows postgres tests (#15593)
Patches tests that caused Windows Postgres CI in
https://github.com/coder/coder/pull/15520 to consistently fail.

I tested this by temporarily adding Postgres Windows CI to this PR.
However, I reverted those changes to merge them with
https://github.com/coder/coder/pull/15520. For reference, here's [a
passing CI
run](https://github.com/coder/coder/actions/runs/11918816662/job/33219786238)
from an earlier commit.

**Note:** Although Windows tests now pass, they remain quite flaky. I
recommend running Postgres Windows CI to gather data on these flakes,
but I don’t think it should be a required job just yet.
2024-11-20 13:30:31 +01:00
97ce44a77d chore: track terraform module source type in telemetry (#15590) 2024-11-20 11:03:48 +01:00
fbe2fa66f5 chore: add test for coord rolling restart (#14680)
Closes https://github.com/coder/team-coconut/issues/50

---------

Co-authored-by: Ethan Dickson <ethan@coder.com>
2024-11-20 18:04:33 +11:00
576e1f48fe feat!: allow disabling notifications (#15509)
Resolves https://github.com/coder/coder/issues/15513

Disables notifications when both `$CODER_NOTIFICATIONS_WEBHOOK_ENDPOINT` and `$CODER_EMAIL_SMARTHOST` are unset.

Breaking change: `$CODER_EMAIL_SMARTHOST` is no longer set by default as `localhost:587`, meaning any deployments that make use of this default value will need to add it back.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-11-19 15:05:12 +00:00
c3c23ed3d9 chore: add query to fetch top level idp claim fields (#15525)
Adds an api endpoint to grab all available sync field options for IDP
sync. This is for autocomplete on idp sync forms. This is required for
organization admins to have some insight into the claim fields available
when configuring group/role sync.
2024-11-18 14:31:39 -06:00
48bb452829 fix: fix tailnet resume using incorrect DB reference (#15522)
- We were instantiating a cryptokey cache with a vanilla reference to
the database instead of one wrapped by dbcrypt.
- Fixes an issue where failing to instantiate unrelated keycaches does
not fatally error out.
2024-11-18 14:09:04 -06:00
4fedc7cf3d chore: include merged claims into the database (#15570)
Merging happens before IDP sync. Storing this will make some SQL queries
much simplier.
2024-11-18 11:58:19 -06:00
5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
aa0dc2daa1 chore: track terraform modules in telemetry (#15450)
Addresses https://github.com/coder/nexus/issues/35.

This PR:

- Adds a `workspace_modules` table to track modules used by the
Terraform provisioner in provisioner jobs.
- Adds a `module_path` column to the `workspace_resources` table,
allowing to identify which module a resource originates from.
- Starts pushing this new information into telemetry.

For the person reviewing this PR, do not fret about the 1,500 new lines
- ~1,000 of them are auto-generated.
2024-11-16 21:56:19 +01:00
4cb807670d chore: generate countries.tsx from Go code (#15274)
Closes https://github.com/coder/coder/issues/15074

We have a hard-coded list of countries at
https://github.com/coder/coder/blob/main/site/src/pages/SetupPage/countries.tsx.
This means Go code in coder/coder doesn't have an easy way of utilizing
it.

## Solution
Generate countries.tsx from Go code. Generated by `scripts/apitypings`
2024-11-15 12:05:21 -06:00
450c72f95c chore(coderd/database): fix duplicate migration numbers (#15533) 2024-11-15 11:39:05 +00:00
dbf41a1160 chore(coderd/database): fix duplicate migration numbers (#15530)
Renaming migrations to avoid duplicate numbering
2024-11-15 10:55:47 +00:00
814dd6f854 feat(coderd): update API to allow filtering provisioner daemons by tags (#15448)
This PR provides new parameters to an endpoint that will be necessary
for #15048
2024-11-15 11:33:22 +02:00
40802958e9 fix: use explicit api versions for agent and tailnet (#15508)
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.

What happened is that we accidentally shipped two new API features without bumping the version.  `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.

Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet.  That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs.  This isn't great, but hasn't caused us major issues because 

1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported. 
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.

Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.

To mitigate against this thing happening again, this PR also:

1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version.  That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.

With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
2024-11-15 11:16:28 +04:00
b6d0b7713a chore: implement user link claims as a typed golang object (#15502)
Move claims from a `debug` column to an actual typed column to be used.
This does not functionally change anything, it just adds some Go typing to build
on.
2024-11-14 10:05:44 -06:00
cb1a006ae4 chore: add link to RBAC usage doc in README (#15510)
Some new joiners had found the README, but not the usage doc

Signed-off-by: Danny Kopping <danny@coder.com>
2024-11-14 10:01:19 +00:00
4a6b28f5df feat(provisioner): add support for workspace_owner_login_type (#15499)
- Adds support for the `coder_workspace_owner.login_type` attribute.
- Adds a currently disabled test for `coder_workspace_owner.login_type`
2024-11-13 15:34:58 +00:00
f2fe379bd2 fix: make GetWorkspacesEligibleForTransition return less false-positives (#15429)
Relates to https://github.com/coder/coder/issues/15082

The old implementation of `GetWorkspacesEligibleForTransition` returns
many workspaces that are not actually eligible for transition. This new
implementation reduces this number significantly (at least on our
dogfood instance).
2024-11-13 10:24:20 +00:00
bebe4f06d2 chore: sort inserted users on dbmem (#15483) 2024-11-12 10:44:31 -03:00
30e6fbd35c fix(coderd): ensure correct RBAC when enqueueing notifications (#15478)
- Assert rbac in fake notifications enqueuer
- Move fake notifications enqueuer to separate notificationstest package
- Update dbauthz rbac policy to allow provisionerd and autostart to create and read notification messages
- Update tests as required
2024-11-12 12:40:46 +00:00
d1305ac25e fix: stop logging error when template schedule query is canceled (#15402)
Fixes test flakes _a la_

```
    t.go:108: 2024-11-05 09:52:37.996 [erro]  workspacestats: failed to load template schedule bumping activity, defaulting to bumping by 60min  request_id=f14215d2-73dc-47ba-aa81-422c62f257e4  workspace_id=545d73c7-3a62-4466-8c08-b6abb12867b7  template_id=49747428-3abb-40e4-a6b2-03653e9f2506 ...
        error= fetch object:
                   github.com/coder/coder/v2/coderd/database/dbauthz.(*querier).GetTemplateByID.fetch[...].func1
                       /home/runner/work/coder/coder/coderd/database/dbauthz/dbauthz.go:497
                 - pq: canceling statement due to user request
         *** slogtest: log detected at level ERROR; TEST FAILURE ***
```

seen here on main: https://github.com/coder/coder/actions/runs/11681605747/job/32527006174
2024-11-12 15:08:15 +04:00
a5d19776af chore: increase autostop requirement leeway to two hours (#15445)
Closes https://github.com/coder/coder/issues/12612

The problem in the linked issue was caused due to a mismatch of when the
Web UI tooltip shows up (2 hours before an autostop requirement) and the
leeway in the `autostop_requirement` algorithm (workspace builds must be
1 hour before an autostop requirement to skip them).

Now, restarting your workspace whilst the tooltip is showing will skip
the upcoming autostop requirement.

This also could have been fixed by only showing the tooltip one hour
before the autostop requirement, but it looks like 1 hour was chosen
arbitrarily, and it doesn't hurt to give users more time to skip the
autostop.
2024-11-12 13:53:21 +11:00
782214bcd8 chore: move organizatinon sync to runtime configuration (#15431)
Moves the configuration from environment to database backed, to allow
configuring organization sync at runtime.
2024-11-08 08:44:14 -06:00