Commit Graph

165 Commits

Author SHA1 Message Date
4d0fe20ca6 chore(coderd/database/dbauthz): update RBAC for InsertWorkspaceApp (#18223)
Instead of using `ResourceSystem` as the resource for
`InsertWorkspaceApp`, we instead use the associated workspace (if it
exists), with the action `ActionUpdate`.
2025-06-04 11:22:01 +00:00
b712d0b23f feat(coderd/agentapi): implement sub agent api (#17823)
Closes https://github.com/coder/internal/issues/619

Implement the `coderd` side of the AgentAPI for the upcoming
dev-container agents work.

`agent/agenttest/client.go` is left unimplemented for a future PR
working to implement the agent side of this feature.
2025-05-29 12:15:47 +01:00
110102a60a fix: optimize queue position sql query (#17974)
Use only `online provisioner daemons` for
`GetProvisionerJobsByIDsWithQueuePosition` query. It should improve
performance of the query.
2025-05-28 08:21:16 -04:00
53e8e9c7cd fix: reduce cost of prebuild failure (#17697)
Relates to https://github.com/coder/coder/issues/17432

### Part 1:

Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.

- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.

- By default `FailureHardLimit` is set to 3.

- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.

### Part 2

Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
  'normal',           -- Prebuilds are working as expected; this is the default, healthy state.
  'hard_limited',     -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
  'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.

- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>

### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
2025-05-21 15:16:38 -04:00
3e7ff9d9e1 chore(coderd/rbac): add Action{Create,Delete}Agent to ResourceWorkspace (#17932) 2025-05-20 21:20:56 +01:00
769c9ee337 feat: cancel stuck pending jobs (#17803)
Closes: #16488
2025-05-20 15:22:44 +02:00
f36fb67f57 chore: use static params when dynamic param metadata is missing (#17836)
Existing template versions do not have the metadata (modules + plan) in
the db. So revert to using static parameter information from the
original template import.

This data will still be served over the websocket.
2025-05-16 11:47:59 -05:00
1bacd82e80 feat: add API key scope to restrict access to user data (#17692) 2025-05-15 15:32:52 +01:00
425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
544259b809 feat: add database tables and API routes for agentic chat feature (#17570)
Backend portion of experimental `AgenticChat` feature:
- Adds database tables for chats and chat messages
- Adds functionality to stream messages from LLM providers using
`kylecarbs/aisdk-go`
- Adds API routes with relevant functionality (list, create, update
chats, insert chat message)
- Adds experiment `codersdk.AgenticChat`

---------

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2025-05-02 17:29:57 +01:00
669e790df6 test: add unit test to excercise bug when idp sync hits deleted orgs (#17405)
Deleted organizations are still attempting to sync members. This causes
an error on inserting the member, and would likely cause issues later in
the sync process even if that member is inserted. Deleted orgs should be
skipped.
2025-04-16 09:27:35 -05:00
c06ef7c1eb chore!: remove JFrog integration (#17353)
- Removes displaying XRay scan results in the dashboard. I'm not sure
  anyone was even using this integration so it's just debt for us to
  maintain. We can open up a separate issue to get rid of the db tables
  once we know for sure that we haven't broken anyone.
2025-04-11 14:45:21 -04:00
743d308eb3 feat: support multiple terminal fonts (#17257)
Fixes: https://github.com/coder/coder/issues/15024
2025-04-07 14:30:10 +02:00
0b2b643ce2 feat: persist prebuild definitions on template import (#16951)
This PR allows provisioners to recognise and report prebuild definitions
to the coder control plane. It also allows the coder control plane to
then persist these to its store.

closes https://github.com/coder/internal/issues/507

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-07 10:35:28 +02:00
99c6f235eb feat: add migrations and queries to support prebuilds (#16891)
Depends on https://github.com/coder/coder/pull/16916 _(change base to
`main` once it is merged)_

Closes https://github.com/coder/internal/issues/514

_This is one of several PRs to decompose the `dk/prebuilds` feature
branch into separate PRs to merge into `main`._

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-03 10:58:30 +02:00
184c1f0a59 chore: add db queries for dynamic parameters (#17137) 2025-04-01 15:47:30 -06:00
8ea956fc11 feat: add app status tracking to the backend (#17163)
This does ~95% of the backend work required to integrate the AI work.

Most left to integrate from the tasks branch is just frontend, which
will be a lot smaller I believe.

The real difference between this branch and that one is the abstraction
-- this now attaches statuses to apps, and returns the latest status
reported as part of a workspace.

This change enables us to have a similar UX to in the tasks branch, but
for agents other than Claude Code as well. Any app can report status
now.
2025-03-31 10:55:44 -04:00
06e5d9ef21 feat(coderd): add webpush package (#17091)
* Adds `codersdk.ExperimentWebPush` (`web-push`)
* Adds a `coderd/webpush` package that allows sending native push
notifications via `github.com/SherClockHolmes/webpush-go`
* Adds database tables to store push notification subscriptions.
* Adds an API endpoint that allows users to subscribe/unsubscribe, and
send a test notification (404 without experiment, excluded from API docs)
* Adds server CLI command to regenerate VAPID keys (note: regenerating
the VAPID keypair requires deleting all existing subscriptions)

---------

Co-authored-by: Kyle Carberry <kyle@carberry.com>
2025-03-27 10:03:53 +00:00
cf10d98aab fix: improve error message when deleting organization with resources (#17049)
Closes
[coder/internal#477](https://github.com/coder/internal/issues/477)

![Screenshot 2025-03-21 at 11 25
57 AM](https://github.com/user-attachments/assets/50cc03e9-395d-4fc7-8882-18cb66b1fac9)

I'm solving this issue in two parts:

1. Updated the postgres function so that it doesn't omit 0 values in the
error
2. Created a new query to fetch the number of resources associated with
an organization and using that information to provider a cleaner error
message to the frontend

> **_NOTE:_** SQL is not my strong suit, and the code was created with
the help of AI. So I'd take extra time looking over what I wrote there
2025-03-25 15:31:24 -04:00
4c33846f6d chore: add prebuilds system user (#16916)
Pre-requisite for https://github.com/coder/coder/pull/16891

Closes https://github.com/coder/internal/issues/515

This PR introduces a new concept of a "system" user.

Our data model requires that all workspaces have an owner (a `users`
relation), and prebuilds is a feature that will spin up workspaces to be
claimed later by actual users - and thus needs to own the workspaces in
the interim.

Naturally, introducing a change like this touches a few aspects around
the codebase and we've taken the approach _default hidden_ here; in
other words, queries for users will by default _exclude_ all system
users, but there is a flag to ensure they can be displayed. This keeps
the changeset relatively small.

This user has minimal permissions (it's equivalent to a `member` since
it has no roles). It will be associated with the default org in the
initial migration, and thereafter we'll need to somehow ensure its
membership aligns with templates (which are org-scoped) for which it'll
need to provision prebuilds; that's a solution we'll have in a
subsequent PR.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
2025-03-25 12:18:06 +00:00
5b3eda6719 chore: persist template import terraform plan in postgres (#17012) 2025-03-24 10:01:50 -06:00
69ba27e347 feat: allow specifying devcontainer on agent in terraform (#16997)
This change allows specifying devcontainers in terraform and plumbs it
through to the agent via agent manifest.

This will be used for autostarting devcontainers in a workspace.

Depends on coder/terraform-provider-coder#368
Updates #16423
2025-03-20 19:09:39 +02:00
4960a1e85a feat(coderd): add mark-all-as-read endpoint for inbox notifications (#16976)
[Resolve this issue](https://github.com/coder/internal/issues/506)

Add a mark-all-as-read endpoint which is marking as read all
notifications that are not read for the authenticated user.
Also adds the DB logic.
2025-03-20 13:41:54 +01:00
de41bd6b95 feat: add support for workspace app audit (#16801)
This change adds support for workspace app auditing.

To avoid audit log spam, we introduce the concept of app audit sessions.
An audit session is unique per workspace app, user, ip, user agent and
http status code. The sessions are stored in a separate table from audit
logs to allow use-case specific optimizations. Sessions are ephemeral
and the table does not function as a log.

The logic for auditing is placed in the DBTokenProvider for workspace
apps so that wsproxies are included.

This is the final change affecting the API fo #15139.

Updates #15139
2025-03-18 13:50:52 +02:00
8c0350e20c feat: add a paginated organization members endpoint (#16835)
Closes
[coder/internal#460](https://github.com/coder/internal/issues/460)
2025-03-10 14:42:07 -04:00
9041646b81 chore: add "user_configs" db table (#16564) 2025-03-05 10:46:03 -07:00
24f3445e00 chore: track workspace resource monitors in telemetry (#16776)
Addresses https://github.com/coder/nexus/issues/195. Specifically, just
the "tracking templates" requirement:

> ## Tracking in templates
> To enable resource alerts, a user must add the resource_monitoring
block to a template's coder_agent resource. We'd like to track if
customers have any resource monitoring enabled on a per-deployment
basis. Even better, we could identify which templates are using resource
monitoring.
2025-03-03 18:41:01 +01:00
c074f77a4f feat: add notifications inbox db (#16599)
This PR is linked [to the following
issue](https://github.com/coder/internal/issues/334).

The objective is to create the DB layer and migration for the new `Coder
Inbox`.
2025-03-03 10:12:48 +01:00
91a4a98c27 chore: add an unassign action for roles (#16728) 2025-02-27 10:39:06 -07:00
d3a56ae3ef feat: enable GitHub OAuth2 login by default on new deployments (#16662)
Third and final PR to address
https://github.com/coder/coder/issues/16230.

This PR enables GitHub OAuth2 login by default on new deployments.
Combined with https://github.com/coder/coder/pull/16629, this will allow
the first admin user to sign up with GitHub rather than email and
password.

We take care not to enable the default on deployments that would upgrade
to a Coder version with this change.

To disable the default provider an admin can set the
`CODER_OAUTH2_GITHUB_DEFAULT_PROVIDER` env variable to false.
2025-02-25 16:31:33 +01:00
546a549dcf feat: enable soft delete for organizations (#16584)
- Add deleted column to organizations table
- Add trigger to check for existing workspaces, templates, groups and
members in a org before allowing the soft delete

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
2025-02-24 12:59:41 -05:00
9469b78290 fix!: enforce regex for agent names (#16641)
Underscores and double hyphens are now blocked. The regex is almost the
exact same as the `coder_app` `slug` regex, but uppercase characters are
still permitted.
2025-02-20 05:09:26 +00:00
d6b9806098 chore: implement oom/ood processing component (#16436)
Implements the processing logic as set out in the OOM/OOD RFC.
2025-02-17 16:56:52 +00:00
5ba7ba6bfc fix(coderd): add strict org ID joins for provisioner job metadata (#16588)
References #16558
2025-02-17 14:16:45 +02:00
bc609d0056 feat: integrate agentAPI with resources monitoring logic (#16438)
As part of the new resources monitoring logic - more specifically for
OOM & OOD Notifications , we need to update the AgentAPI , and the
agents logic.

This PR aims to do it, and more specifically :  
We are updating the AgentAPI & TailnetAPI to version 24 to add two new
methods in the AgentAPI :
- One method to fetch the resources monitoring configuration
- One method to push the datapoints for the resources monitoring.

Also, this PR adds a new logic on the agent side, with a routine running
and ticking - fetching the resources usage each time , but also storing
it in a FIFO like queue.

Finally, this PR fixes a problem we had with RBAC logic on the resources
monitoring model, applying the same logic than we have for similar
entities.
2025-02-14 10:28:15 +01:00
71cbf735e5 feat(coderd): add support for presets to the coder API (#16526)
This pull request builds on the existing migrations and queries to add
support for presets to the coder API.
2025-02-12 14:41:14 +02:00
34b46f9205 feat(coderd/database): add support for presets (#16509)
This pull requests adds the necessary migrations and queries to support
presets within the coderd database. Future PRs will build functionality
to the provisioners and the frontend.
2025-02-11 13:55:09 +02:00
7cbd77fd94 feat: improve resources_monitoring for OOM & OOD monitoring (#16241)
As requested for [this
issue](https://github.com/coder/internal/issues/245) we need to have a
new resource `resources_monitoring` in the agent.

It needs to be parsed from the provisioner and inserted into a new db
table.
2025-02-04 18:45:33 +01:00
2ace044e0b chore: track the first time html is served in telemetry (#16334)
Addresses https://github.com/coder/nexus/issues/175.

## Changes

- Adds the `telemetry_items` database table. It's a key value store for
telemetry events that don't fit any other database tables.
- Adds a telemetry report when HTML is served for the first time in
`site.go`.
2025-01-31 13:55:46 +01:00
3864c7e3b0 feat(coderd): add endpoint to list provisioner jobs (#16029)
Closes #15190
Updates #15084
2025-01-20 11:18:53 +02:00
f34e6fd92c chore: implement 'use' verb to template object, read has less scope now (#16075)
Template `use` is now a verb.
- Template admins can `use` all templates (org template admins same in
org)
- Members get the `use` perm from the `everyone` group in the
`group_acl`.
2025-01-17 11:55:41 -06:00
50bf5ca8fe fix(coderd/database): aggregate user engagement statistics by interval (#16150)
This PR addresses the TODO comment here:

https://github.com/coder/coder/pull/16134/files#diff-1844f895bb005f036da11d800fe2a76b54bfddd481c5d8cb15f210c64679caa5R47

The backend now backfills entries for dates with no status changes.
2025-01-16 17:34:53 +02:00
071bb26018 feat(coderd): add endpoint to list provisioner daemons (#16028)
Updates #15190
Updates #15084
Supersedes #15940
2025-01-14 16:40:26 +00:00
4543b21b7c feat(coderd/database): track user status changes over time (#16019)
RE: https://github.com/coder/coder/issues/15740,
https://github.com/coder/coder/issues/15297

In order to add a graph to the coder frontend to show user status over
time as an indicator of license usage, this PR adds the following:

* a new `api.insightsUserStatusCountsOverTime` endpoint to the API
* which calls a new `GetUserStatusCountsOverTime` query from postgres
* which relies on two new tables `user_status_changes` and
`user_deleted`
* which are populated by a new trigger and function that tracks updates
to the users table

The chart itself will be added in a subsequent PR

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2025-01-13 13:08:16 +02:00
106b1cd3bc chore: convert dbauthz tests to also run with Postgres (#15862)
Another PR to address https://github.com/coder/coder/issues/15109.

- adds the DisableForeignKeysAndTriggers utility, which simplifies
converting tests from in-mem to postgres
- converts the dbauthz test suite to pass on both the in-mem db and
Postgres
2025-01-08 16:22:51 +01:00
08463c27d8 feat: add OpenIn option to coder_app (#15743)
This PR is the coder/coder part of [the open_in parameter
issue](https://github.com/coder/terraform-provider-coder/issues/297)
aiming to add a new optional parameter to choose how to open modules.

This PR is heavily linked [to this
PR](https://github.com/coder/terraform-provider-coder/pull/321).

ℹ️ For now, some integrations tests can not be pushed as it requires a
release on the terraform-provider repo.
2025-01-03 11:27:02 +01:00
b39becba66 feat(site): add a provisioner warning to workspace builds (#15686)
This PR adds warnings about provisioner health to workspace build pages.
It closes https://github.com/coder/coder/issues/15048


![image](https://github.com/user-attachments/assets/fa54d0e8-c51f-427a-8f66-7e5dbbc9baca)

![image](https://github.com/user-attachments/assets/b5169669-ab05-43d5-8553-315a3099b4fd)
2024-12-11 13:38:13 +02:00
40624bf78b fix: update workspace TTL on template TTL change (#15761)
Relates to https://github.com/coder/coder/issues/15390

Currently when a user creates a workspace, their workspace's TTL is
determined by the template's default TTL. If the Coder instance is AGPL,
or if the template has disallowed the user from configuring autostop,
then it is not possible to change the workspace's TTL after creation.
Any changes to the template's default TTL only takes effect on _new_
workspaces.

This PR modifies the behaviour slightly so that on AGPL Coder, or on
enterprise when a template does not allow user's to configure their
workspace's TTL, updating the template's default TTL will also update
any workspace's TTL to match this value.
2024-12-06 11:01:39 +00:00
e744cde86f fix(coderd): ensure that clearing invalid oauth refresh tokens works with dbcrypt (#15721)
https://github.com/coder/coder/pull/15608 introduced a buggy behaviour
with dbcrypt enabled.
When clearing an oauth refresh token, we had been setting the value to
the empty string.
The database encryption package considers decrypting an empty string to
be an error, as an empty encrypted string value will still have a nonce
associated with it and thus not actually be empty when stored at rest.

Instead of 'deleting' the refresh token, 'update' it to be the empty
string.
This plays nicely with dbcrypt.

It also adds a 'utility test' in the dbcrypt package to help encrypt a
value. This was useful when manually fixing users affected by this bug
on our dogfood instance.
2024-12-03 13:26:31 -06:00
e21a301682 fix: make GetWorkspacesEligibleForTransition return even less false positives (#15594)
Relates to https://github.com/coder/coder/issues/15082

Further to https://github.com/coder/coder/pull/15429, this reduces the
amount of false-positives returned by the 'is eligible for autostart'
part of the query. We achieve this by calculating the 'next start at'
time of the workspace, storing it in the database, and using it in our
`GetWorkspacesEligibleForTransition` query.

The prior implementation of the 'is eligible for autostart' query would
return _all_ workspaces that at some point in the future _might_ be
eligible for autostart. This now ensures we only return workspaces that
_should_ be eligible for autostart.

We also now pass `currentTick` instead of `t` to the
`GetWorkspacesEligibleForTransition` query as otherwise we'll have one
round of workspaces that are skipped by `isEligibleForTransition` due to
`currentTick` being a truncated version of `t`.
2024-12-02 21:02:36 +00:00