2671 Commits

Author SHA1 Message Date
699dd8e554 chore: create interface for pkgs to return codersdk errors (#18719)
This interface allows it to create rich codersdk errors and pass them up to the `wsbuilder` error handling.
2025-07-03 08:33:45 -05:00
09c50559f3 feat: implement RFC 6750 Bearer token authentication (#18644)
# Add RFC 6750 Bearer Token Authentication Support

This PR implements RFC 6750 Bearer Token authentication as an additional authentication method for Coder's API. This allows clients to authenticate using standard OAuth 2.0 Bearer tokens in two ways:

1. Using the `Authorization: Bearer <token>` header
2. Using the `access_token` query parameter

Key changes:

- Added support for extracting tokens from both Bearer headers and access_token query parameters
- Implemented proper WWW-Authenticate headers for 401/403 responses with appropriate error descriptions
- Added comprehensive test coverage for the new authentication methods
- Updated the OAuth2 protected resource metadata endpoint to advertise Bearer token support
- Enhanced the OAuth2 testing script to verify Bearer token functionality

These authentication methods are added as fallback options, maintaining backward compatibility with Coder's existing authentication mechanisms. The existing authentication methods (cookies, session token header, etc.) still take precedence.

This implementation follows the OAuth 2.0 Bearer Token specification (RFC 6750) and improves interoperability with standard OAuth 2.0 clients.
2025-07-02 19:14:54 +02:00
33bbf18a4b feat: add OAuth2 protected resource metadata endpoint for RFC 9728 (#18643)
# Add OAuth2 Protected Resource Metadata Endpoint

This PR implements the OAuth2 Protected Resource Metadata endpoint according to RFC 9728. The endpoint is available at `/.well-known/oauth-protected-resource` and provides information about Coder as an OAuth2 protected resource.

Key changes:
- Added a new endpoint at `/.well-known/oauth-protected-resource` that returns metadata about Coder as an OAuth2 protected resource
- Created a new `OAuth2ProtectedResourceMetadata` struct in the SDK
- Added tests to verify the endpoint functionality
- Updated API documentation to include the new endpoint

The implementation currently returns basic metadata including the resource identifier and authorization server URL. The `scopes_supported` field is empty until a scope system based on RBAC permissions is implemented. The `bearer_methods_supported` field is omitted as Coder uses custom authentication methods rather than standard RFC 6750 bearer tokens.

A TODO has been added to implement RFC 6750 bearer token support in the future.
2025-07-02 18:58:41 +02:00
630804ec92 chore: fix duplicate migration 000345 (#18721)
Fixes duplicate migration introduced by
https://github.com/coder/coder/pull/18575
2025-07-02 16:15:10 +00:00
f0c9c4dbcd feat: oauth2 - add RFC 8707 resource indicators and audience validation (#18575)
This pull request implements RFC 8707, Resource Indicators for OAuth 2.0 (https://datatracker.ietf.org/doc/html/rfc8707), to enhance the security of our OAuth 2.0 provider. 

This change enables proper audience validation and binds access tokens to their intended resource, which is crucial
  for preventing token misuse in multi-tenant environments or deployments with multiple resource servers.

##  Key Changes:


   * Resource Parameter Support: Adds support for the resource parameter in both the authorization (`/oauth2/authorize`) and token (`/oauth2/token`) endpoints, allowing clients to specify the intended resource server.
   * Audience Validation: Implements server-side validation to ensure that the resource parameter provided during the token exchange matches the one from the authorization request.
   * API Middleware Enforcement: Introduces a new validation step in the API authentication middleware (`coderd/httpmw/apikey.go`) to verify that the audience of the access token matches the resource server being accessed.
   * Database Schema Updates:
       * Adds a `resource_uri` column to the `oauth2_provider_app_codes` table to store the resource requested during authorization.
       * Adds an `audience` column to the `oauth2_provider_app_tokens` table to bind the issued token to a specific audience.
   * Enhanced PKCE: Includes a minor enhancement to the PKCE implementation to protect against timing attacks.
   * Comprehensive Testing: Adds extensive new tests to `coderd/oauth2_test.go` to cover various RFC 8707 scenarios, including valid flows, mismatched resources, and refresh token validation.

##  How it Works:


   1. An OAuth2 client specifies the target resource (e.g., https://coder.example.com) using the resource parameter in the authorization request.
   2. The authorization server stores this resource URI with the authorization code.
   3. During the token exchange, the server validates that the client provides the same resource parameter.
   4. The server issues an access token with an audience claim set to the validated resource URI.
   5. When the client uses the access token to call an API endpoint, the middleware verifies that the token's audience matches the URL of the Coder deployment, rejecting any tokens intended for a different resource.


  This ensures that a token issued for one Coder deployment cannot be used to access another, significantly strengthening our authentication security.

---

Change-Id: I3924cb2139e837e3ac0b0bd40a5aeb59637ebc1b
Signed-off-by: Thomas Kosiewski <tk@coder.com>
2025-07-02 17:49:00 +02:00
01163ea57b feat: allow users to pause prebuilt workspace reconciliation (#18700)
This PR provides two commands:
* `coder prebuilds pause`
* `coder prebuilds resume`

These allow the suspension of all prebuilds activity, intended for use
if prebuilds are misbehaving.
2025-07-02 15:05:42 +00:00
4072d228c5 feat: support dynamic parameters on create template request (#18636)
Future work is to add this checkbox to the UI to opt into dynamic
parameters from the first template create.
2025-07-02 09:44:01 -05:00
0b82f41a24 feat: allow masking workspace parameter inputs (#18595) 2025-07-01 16:27:43 -06:00
d22ac1cf65 chore: don't cache errors in file cache (#18555) 2025-07-01 13:50:37 -06:00
6f2834f62a feat: oauth2 - add authorization server metadata endpoint and PKCE support (#18548)
## Summary

  This PR implements critical MCP OAuth2 compliance features for Coder's authorization server, adding PKCE support, resource parameter handling, and OAuth2 server metadata discovery. This brings Coder's OAuth2 implementation significantly closer to production readiness for MCP (Model Context Protocol)
  integrations.

  ## What's Added

  ### OAuth2 Authorization Server Metadata (RFC 8414)
  - Add `/.well-known/oauth-authorization-server` endpoint for automatic client discovery
  - Returns standardized metadata including supported grant types, response types, and PKCE methods
  - Essential for MCP client compatibility and OAuth2 standards compliance

  ### PKCE Support (RFC 7636)
  - Implement Proof Key for Code Exchange with S256 challenge method
  - Add `code_challenge` and `code_challenge_method` parameters to authorization flow
  - Add `code_verifier` validation in token exchange
  - Provides enhanced security for public clients (mobile apps, CLIs)

  ### Resource Parameter Support (RFC 8707)
  - Add `resource` parameter to authorization and token endpoints
  - Store resource URI and bind tokens to specific audiences
  - Critical for MCP's resource-bound token model

  ### Enhanced OAuth2 Error Handling
  - Add OAuth2-compliant error responses with proper error codes
  - Use standard error format: `{"error": "code", "error_description": "details"}`
  - Improve error consistency across OAuth2 endpoints

  ### Authorization UI Improvements
  - Fix authorization flow to use POST-based consent instead of GET redirects
  - Remove dependency on referer headers for security decisions
  - Improve CSRF protection with proper state parameter validation

  ## Why This Matters

  **For MCP Integration:** MCP requires OAuth2 authorization servers to support PKCE, resource parameters, and metadata discovery. Without these features, MCP clients cannot securely authenticate with Coder.

  **For Security:** PKCE prevents authorization code interception attacks, especially critical for public clients. Resource binding ensures tokens are only valid for intended services.

  **For Standards Compliance:** These are widely adopted OAuth2 extensions that improve interoperability with modern OAuth2 clients.

  ## Database Changes

  - **Migration 000343:** Adds `code_challenge`, `code_challenge_method`, `resource_uri` to `oauth2_provider_app_codes`
  - **Migration 000343:** Adds `audience` field to `oauth2_provider_app_tokens` for resource binding
  - **Audit Updates:** New OAuth2 fields properly tracked in audit system
  - **Backward Compatibility:** All changes maintain compatibility with existing OAuth2 flows

  ## Test Coverage

  - Comprehensive PKCE test suite in `coderd/identityprovider/pkce_test.go`
  - OAuth2 metadata endpoint tests in `coderd/oauth2_metadata_test.go`
  - Integration tests covering PKCE + resource parameter combinations
  - Negative tests for invalid PKCE verifiers and malformed requests

  ## Testing Instructions

  ```bash
  # Run the comprehensive OAuth2 test suite
  ./scripts/oauth2/test-mcp-oauth2.sh

  Manual Testing with Interactive Server

  # Start Coder in development mode
  ./scripts/develop.sh

  # In another terminal, set up test app and run interactive flow
  eval $(./scripts/oauth2/setup-test-app.sh)
  ./scripts/oauth2/test-manual-flow.sh
  # Opens browser with OAuth2 flow, handles callback automatically

  # Clean up when done
  ./scripts/oauth2/cleanup-test-app.sh

  Individual Component Testing

  # Test metadata endpoint
  curl -s http://localhost:3000/.well-known/oauth-authorization-server | jq .

  # Test PKCE generation
  ./scripts/oauth2/generate-pkce.sh

  # Run specific test suites
  go test -v ./coderd/identityprovider -run TestVerifyPKCE
  go test -v ./coderd -run TestOAuth2AuthorizationServerMetadata
```

  ### Breaking Changes

  None. All changes maintain backward compatibility with existing OAuth2 flows.

---

Change-Id: Ifbd0d9a543d545f9f56ecaa77ff2238542ff954a
Signed-off-by: Thomas Kosiewski <tk@coder.com>
2025-07-01 15:39:29 +02:00
4e95b1d20e fix: revert changes to GetRunningPrebuiltWorkspaces (#18688)
… (#18588)"

This reverts commit 258a839d27.
2025-07-01 10:11:43 +00:00
258a839d27 chore(coderd/database): optimize GetRunningPrebuiltWorkspaces (#18588)
Fixes https://github.com/coder/internal/issues/715

After this change, the only use of the `workspace_prebuilds` view is the
`ClaimPrebuiltWorkspace` query. A subsequent PR will update the view.

Before: ~44ms https://explain.dalibo.com/plan/76cbe21d1a4c9329#plan

After: 7.3ms https://explain.dalibo.com/plan/5abbdf926315677e#plan
2025-07-01 09:42:01 +01:00
695de6e0c0 chore(coderd/database): optimize AuditLogs queries (#18600)
Closes #17689

This PR optimizes the audit logs query performance by extracting the
count operation into a separate query and replacing the OR-based
workspace_builds with conditional joins.

## Query changes
* Extracted count query to separate one
* Replaced single `workspace_builds` join with OR conditions with
separate conditional joins
* Added conditional joins
* `wb_build` for workspace_build audit logs (which is a direct lookup)
    * `wb_workspace` for workspace create audit logs (via workspace)

Optimized AuditLogsOffset query:
https://explain.dalibo.com/plan/4g1hbedg4a564bg8

New CountAuditLogs query:
https://explain.dalibo.com/plan/ga2fbcecb9efbce3
2025-07-01 07:31:14 +02:00
4756080eb2 feat(site): display devcontainer start error (#18637)
Fixes https://github.com/coder/internal/issues/705

Surface errors on the UI when a devcontainer agent is unable to be
injected.
2025-06-30 21:34:29 +01:00
f0251dfc91 chore: retry postgres connection on reset by peer in tests (#18632)
Fixes https://github.com/coder/internal/issues/695

Retries initial connection to postgres in testing up to 3 seconds if we
see "reset by peer", which probably means that some other test proc just
started the container.

---------

Co-authored-by: Hugo Dutka <hugo@coder.com>
2025-06-27 13:03:32 +00:00
3cb9b20b11 chore: improve rbac and add benchmark tooling (#18584)
## Description

This PR improves the RBAC package by refactoring the policy, enhancing
documentation, and adding utility scripts.

## Changes

* Refactored `policy.rego` for clarity and readability
* Updated README with OPA section
* Added `benchmark_authz.sh` script for authz performance testing and
comparison
* Added `gen_input.go` to generate input for `opa eval` testing
2025-06-27 12:05:34 +01:00
09cc906981 chore: remove unnecessary redeclarations in for loops (part 2) (#18593) 2025-06-26 12:28:00 -06:00
c6e0ba12d3 feat: graduate prebuilds to general availability (#18607)
This PR removes the prebuilds experiment and allows the use of prebuilds
without opting into an experiment.
2025-06-26 15:54:52 +02:00
f2d229eed3 fix!: use devcontainer ID when rebuilding a devcontainer (#18604)
This PR replaces the use of the **container** ID with the
**devcontainer** ID. This is a breaking change. This allows rebuilding a
devcontainer when there is no valid container ID.
2025-06-26 11:41:57 +01:00
6c713d5c20 fix(coderd/agentapi): make sub agent slugs more unique (#18581)
The incorrect assumption that slugs were unique per-agent was made when
the subagent API was implemented. Whilst this PR doesn't completely
enforce that, we instead compute a stable hash to prefix the slug that
should provide a reasonable level of probability that the slug will be
unique.
2025-06-25 17:36:23 +01:00
e396b06c25 feat: allow new immutable parameters for existing workspaces (#18579)
Closes https://github.com/coder/coder/issues/18578
2025-06-25 15:41:53 +00:00
688d2ee3eb chore: remove chats experiment (#18535) 2025-06-25 13:03:32 +00:00
c4e4fe85f9 fix(agent): start devcontainers through agentcontainers package (#18471)
Fixes https://github.com/coder/internal/issues/706

Context for the implementation here
https://github.com/coder/internal/issues/706#issuecomment-2990490282

Synchronously starts dev containers defined in terraform with our
`DevcontainerCLI` abstraction, instead of piggybacking off of our
`agentscripts` package. This gives us more control over logs, instead of
being reliant on packages which may or may not exist in the
user-provided image.
2025-06-25 11:52:50 +01:00
e443f8624d feat(agent/agentcontainers): implement ignore customization for devcontainers (#18530)
Fixes coder/internal#737
2025-06-24 18:52:05 +00:00
06c997a100 chore: make telemetry use_classic_parameter_flow nullable (#18547) 2025-06-24 13:38:12 -05:00
99d124e276 feat(agent): enable devcontainers by default (#18533) 2025-06-24 21:17:04 +03:00
e5eb2a8322 fix: prebuild user without ssh key when fetching owner ctx (#18541) 2025-06-24 16:49:36 +00:00
f44969b689 chore: reorder prebuilt workspace authorization logic (#18506)
## Description

Follow-up from PR https://github.com/coder/coder/pull/18333
Related with:
https://github.com/coder/coder/pull/18333#discussion_r2159300881

This changes the authorization logic to first try the normal workspace
authorization check, and only if the resource is a prebuilt workspace,
fall back to the prebuilt workspace authorization check. Since prebuilt
workspaces are a subset of workspaces, the normal workspace check is
more likely to succeed. This is a small optimization to reduce
unnecessary prebuilt authorization calls.
2025-06-24 16:33:21 +01:00
341b54e604 fix: allow dynamic parameters without requiring org membership (#18531) 2025-06-24 10:33:10 -05:00
a4f1c64a9b fix: allow dynamic parameters to consider the prebuilds user an owner (#18529)
This Pull request allows dynamic parameters to list system users in its
search for workspace owners. This is necessary to allow prebuilds to
reconcile prebuilt workspaces and to delete them.
2025-06-24 16:47:01 +02:00
4ff2254e5f chore: remove ai tasks from experiment (#18511)
Closes https://github.com/coder/internal/issues/661
2025-06-24 16:24:01 +02:00
45ab265df2 chore: add permissions to autobuilder & prebuilder to run wsbuild (#18527)
Read organization member and read files is now required for dynamic
param building.
2025-06-24 08:45:41 -05:00
7b152cdd91 chore: increase fileCache hit rate in autobuilds lifecycle (#18507)
`wsbuilder` hits the file cache when running validation. This solution is imperfect, but by first sorting workspaces by their template version id, the cache hit rate should improve.
2025-06-24 07:36:39 -05:00
670fa4a3cc feat: add the /aitasks/prompts endpoint (#18464)
Add an endpoint to fetch AI task prompts for multiple workspace builds
at the same time. A prompt is the value of the "AI Prompt" workspace
build parameter. On main, the only way our API allows fetching workspace
build parameters is by using the `/workspacebuilds/$build_id/parameters`
endpoint, requiring a separate API call for every build.

The Tasks dashboard fetches Task workspaces in order to show them in a
list, and then needs to fetch the value of the `AI Prompt` parameter for
every task workspace (using its latest build id), requiring an
additional API call for each list item. This endpoint will allow the
dashboard to make just 2 calls to render the list: one to fetch task
workspaces, the other to fetch prompts.

<img width="1512" alt="Screenshot 2025-06-20 at 11 33 11"
src="https://github.com/user-attachments/assets/92899999-e922-44c5-8325-b4b23a0d2bff"
/>

Related to https://github.com/coder/internal/issues/660.
2025-06-24 13:06:02 +02:00
0238f2926d feat: persist AI task state in template imports & workspace builds (#18449) 2025-06-24 10:36:37 +00:00
6cc4cfa346 feat: allow for default presets (#18445) 2025-06-24 12:19:19 +02:00
2afd1a203e chore: disable devtunnel tests on windows (#18521) 2025-06-24 19:01:29 +10:00
d892427b78 fix: do not warn on valid known experiments (#18514)
Fixes https://github.com/coder/coder/issues/18024

* drive-by: renames `handleExperimentsSafe` to
`handleExperimentsAvailable` to better match semantics
* defines list of `codersdk.ExperimentsKnown` and updates
`ReadExperiments` to log on invalid experiments
* typescript-ignores `codersdk.Experiments` so apitypings generates a
valid enum list of possible values of experiment
* updates OverviewPageView to distinguish between known 'hidden'
experiments and unknown 'invalid' experiments
2025-06-24 09:14:41 +01:00
5ed0c7abcb chore: improve dynamic parameter validation errors (#18501)
`BuildError` response from `wsbuilder` does not support rich errors from validation. Changed this to use the `Validations` block of codersdk responses to return all errors for invalid parameters.
2025-06-23 15:08:18 -05:00
f6e4ba6ed9 chore: remove per request dynamic parameters opt in and rely on template (#18505)
When in experimental this was used as an escape hatch. Removed to be
consistent with the template author's intentions

Backwards compatible, removing an experimental api field that is no longer used.
2025-06-23 15:04:09 -05:00
4699393522 fix: upsert coder_app resources in case they are persistent (#18509) 2025-06-23 18:50:44 +00:00
82af2e019d feat: implement dynamic parameter validation (#18482)
# What does this do?

This does parameter validation for dynamic parameters in `wsbuilder`. All input parameters are validated in `coder/coder` before being sent to terraform.

The heart of this PR is [`ResolveParameters`](b65001e89c/coderd/dynamicparameters/resolver.go (L30-L30)).

# What else changes?

`wsbuilder` now needs to load the terraform files into memory to succeed. This does add a larger memory requirement to workspace builds.

# Future work

- Sort autostart handling workspaces by template version id. So workspaces with the same template version only load the terraform files once from the db, and store them in the cache.
2025-06-23 12:35:15 -05:00
7254c08af4 chore: remove parallel queries in the same transaction (#18489)
Parallel concurrent queries cannot be run in the same tx

Was getting this error: https://stackoverflow.com/questions/78472996/go-postgres-pq-unexpected-parse-response-c-with-queryrow
2025-06-23 12:17:58 -05:00
c1b35bf2f6 chore: use database in current context for file cache (#18490)
Using the db.Store when in a TX causes a deadlock for dbmem.
In production, this can cause a deadlock if at the current conn pool
limit.
2025-06-23 11:58:52 -05:00
659b787b9f chore: set wsbuilder to use preview parameters (#18474)
Use richer `previewtypes.Parameter` for `wsbuilder`. This is a pre-requirement to adding dynamic parameter validation.

The richer type contains more information than the `db` parameter, so the conversion is lossless.
2025-06-23 11:31:53 -05:00
2f55e29466 fix: complete job and mark workspace as deleted when no provisioners are available (#18465)
Alternate fix for https://github.com/coder/coder/issues/18080

Modifies wsbuilder to complete the provisioner job and mark the
workspace as deleted if it is clear that no provisioner will be able to
pick up the delete build.

This has a significant advantage of not deviating too much from the
current semantics of `POST /api/v2/workspacebuilds`.
https://github.com/coder/coder/pull/18460 ends up returning a 204 on
orphan delete due to no build being created.

Downside is that we have to duplicate some responsibilities of
provisionerdserver in wsbuilder.

There is a slight gotcha to this approach though: if you stop a
provisioner and then immediately try to orphan-delete, the job will
still be created because of the provisioner heartbeat interval. However
you can cancel it and try again.
2025-06-23 14:07:42 +01:00
66e8dbbe17 feat: persist generated coder_app id (#18487) 2025-06-23 08:46:18 +00:00
49fcffc266 fix!: stop workspace before update (#18425)
Fixes https://github.com/coder/coder/issues/17840

NOTE: calling this out as a breaking change so that it is highly visible
in the changelog.

* CLI: Modifies `coder update` to stop the workspace if already running.
* UI: Modifies "update" button to always stop the workspace if already
running.
2025-06-23 09:12:37 +01:00
0a483ea2b7 feat: add idle app status (#18415)
"Idle" is more accurate than "complete" since:

1. AgentAPI only knows if the screen is active; it has no way of knowing
    if the task is complete.
2. The LLM might be done with its current prompt, but that does not mean
    the task is complete either (it likely needs refinement).

The "complete" state will be reserved for future definition.

Additionally, in the case where the screen goes idle but the LLM never
reported a status update, we can get an idle icon without a message, and
it looks kinda janky in the UI so if there is no message I display the
state text.

Closes https://github.com/coder/internal/issues/699
2025-06-20 14:34:31 -08:00
6e4508e29c chore: assume template versions without tf values to be empty (#18479)
Closes https://github.com/coder/internal/issues/735
2025-06-20 15:05:22 -05:00