Closes#18207
This PR adds license status to support bundle to help with
troubleshooting license-related issues.
- `license-status.txt`, is added to the support bundle.
- it contains the same output as the `coder license list` command.
- license output formatter logic has been extracted into a separate
function.
- this allows it to be reused both in the `coder license list` cmd and
in the support bundle generation.
# What does this do?
This does parameter validation for dynamic parameters in `wsbuilder`. All input parameters are validated in `coder/coder` before being sent to terraform.
The heart of this PR is [`ResolveParameters`](b65001e89c/coderd/dynamicparameters/resolver.go (L30-L30)).
# What else changes?
`wsbuilder` now needs to load the terraform files into memory to succeed. This does add a larger memory requirement to workspace builds.
# Future work
- Sort autostart handling workspaces by template version id. So workspaces with the same template version only load the terraform files once from the db, and store them in the cache.
Alternate fix for https://github.com/coder/coder/issues/18080
Modifies wsbuilder to complete the provisioner job and mark the
workspace as deleted if it is clear that no provisioner will be able to
pick up the delete build.
This has a significant advantage of not deviating too much from the
current semantics of `POST /api/v2/workspacebuilds`.
https://github.com/coder/coder/pull/18460 ends up returning a 204 on
orphan delete due to no build being created.
Downside is that we have to duplicate some responsibilities of
provisionerdserver in wsbuilder.
There is a slight gotcha to this approach though: if you stop a
provisioner and then immediately try to orphan-delete, the job will
still be created because of the provisioner heartbeat interval. However
you can cancel it and try again.
Fixes https://github.com/coder/coder/issues/17840
NOTE: calling this out as a breaking change so that it is highly visible
in the changelog.
* CLI: Modifies `coder update` to stop the workspace if already running.
* UI: Modifies "update" button to always stop the workspace if already
running.
"Idle" is more accurate than "complete" since:
1. AgentAPI only knows if the screen is active; it has no way of knowing
if the task is complete.
2. The LLM might be done with its current prompt, but that does not mean
the task is complete either (it likely needs refinement).
The "complete" state will be reserved for future definition.
Additionally, in the case where the screen goes idle but the LLM never
reported a status update, we can get an idle icon without a message, and
it looks kinda janky in the UI so if there is no message I display the
state text.
Closes https://github.com/coder/internal/issues/699
## Description
This PR adds support for deleting prebuilt workspaces via the
authorization layer. It introduces special-case handling to ensure that
`prebuilt_workspace` permissions are evaluated when attempting to delete
a prebuilt workspace, falling back to the standard `workspace` resource
as needed.
Prebuilt workspaces are a subset of workspaces, identified by having
`owner_id` set to `PREBUILD_SYSTEM_USER`.
This means:
* A user with `prebuilt_workspace.delete` permission is allowed to
**delete only prebuilt workspaces**.
* A user with `workspace.delete` permission can **delete both normal and
prebuilt workspaces**.
⚠️ This implementation is scoped to **deletion operations only**. No
other operations are currently supported for the `prebuilt_workspace`
resource.
To delete a workspace, users must have the following permissions:
* `workspace.read`: to read the current workspace state
* `update`: to modify workspace metadata and related resources during
deletion (e.g., updating the `deleted` field in the database)
* `delete`: to perform the actual deletion of the workspace
## Changes
* Introduced `authorizeWorkspace()` helper to handle prebuilt workspace
authorization logic.
* Ensured both `prebuilt_workspace` and `workspace` permissions are
checked.
* Added comments to clarify the current behavior and limitations.
* Moved `SystemUserID` constant from the `prebuilds` package to the
`database` package `PrebuildsSystemUserID` to resolve an import cycle
(commit
f24e4ab4b6).
* Update middleware `ExtractOrganizationMember` to include system user
members.
There were some code paths where if we exited early from the function
the postgres connection would never get cleaned up.
This is the mechanism that cleans up the db - it requires the err
variable to be not nil:
118bf98145/cli/server.go (L2319-L2328)
In the past we randomly selected workspace agent if there were multiple.
Unless both are running on the same machine with the same configuration,
this would be very confusing behavior for a user.
With the introduction of sub agents (devcontainer agents), we have now
made this an error state and require the specifying of agent when there
is more than one (either normal agent or sub agent).
This aligns with the behavior of e.g. Coder Desktop.
Fixescoder/internal#696
This is meant to complement the existing task reporter since the LLM
does not call it reliably.
It also includes refactoring to use the common agent flags/env vars.
This PR implements protobuf streaming to handle large module files by:
1. **Streaming large payloads**: When module files exceed the 4MB limit,
they're streamed in chunks using a new UploadFile RPC method
2. **Database storage**: Streamed files are stored in the database and
referenced by hash for deduplication
3. **Backward compatibility**: Small module files continue using the
existing direct payload method
# Add separate token lifetime limits for administrators
This PR introduces a new configuration option `--max-admin-token-lifetime` that allows administrators to create API tokens with longer lifetimes than regular users. By default, administrators can create tokens with a lifetime of up to 7 days (168 hours), while the existing `--max-token-lifetime` setting continues to apply to regular users.
The implementation:
- Adds a new `MaximumAdminTokenDuration` field to the session configuration
- Modifies the token validation logic to check the user's role and apply the appropriate lifetime limit
- Updates the token configuration endpoint to return the correct maximum lifetime based on the user's role
- Adds tests to verify that administrators can create tokens with longer and shorter lifetimes
- Updates documentation and help text to reflect the new option
This change allows organizations to grant administrators extended token lifetimes while maintaining tighter security controls for regular users.
Fixes#17395
fixes#18199
Corrects handling of paths with spaces in the `Match !exec` clause we
use to determine whether Coder Connect is running. This is handled
differently than the ProxyCommand, so we have a different escape
routine, which also varies by OS.
On Windows, we resort to a pretty gnarly hack, but it does work and I
feel the only other option would be to reduce functionality such that we
could not detect the Coder Connect state.
Closes#17982.
The purpose of this PR is to expose network latency via the API used by Coder Desktop.
This PR has the tunnel ping all known agents every 5 seconds, in order to produce an instance of:
```proto
message LastPing {
// latency is the RTT of the ping to the agent.
google.protobuf.Duration latency = 1;
// did_p2p indicates whether the ping was sent P2P, or over DERP.
bool did_p2p = 2;
// preferred_derp is the human readable name of the preferred DERP region,
// or the region used for the last ping, if it was sent over DERP.
string preferred_derp = 3;
// preferred_derp_latency is the last known latency to the preferred DERP
// region. Unset if the region does not appear in the DERP map.
optional google.protobuf.Duration preferred_derp_latency = 4;
}
```
The contents of this message are stored and included on all subsequent upsertions of the agent.
Note that we upsert existing agents every 5 seconds to update the `last_handshake` value.
On the desktop apps, this message will be used to produce a tooltip similar to that of the VS Code extension:
<img width="495" alt="image" src="https://github.com/user-attachments/assets/d8b65f3d-f536-4c64-9af9-35c1a42c92d2" />
(wording not final)
Unlike the VS Code extension, we omit:
- The Latency of *all* available DERP regions. It seems not ideal to send a copy of this entire map for every online agent, and it certainly doesn't make sense for it to be on the `Agent` or `LastPing` message.
If we do want to expose this info on Coder Desktop, we should consider how best to do so; maybe we want to include it on a more generic `Netcheck` message.
- The current throughput (Bytes up/down). This is out of scope of the linked issue, and is non-trivial to implement. I'm also not sure of the value given the frequency we're doing these status updates (every 5 seconds).
If we want to expose it, it'll be in a separate PR.
<img width="343" alt="image" src="https://github.com/user-attachments/assets/8447d03b-9721-4111-8ac1-332d70a1e8f1" />
Relates to https://github.com/coder/coder/issues/17818
Note: due to limitations in `cobra/serpent` I ended up having to use `-`
to signify absence of provisioner tags. This value is not a valid
key-value pair and thus not a valid tag.
Refactor the workspace SSH command syntax across the project to use the
"workspace.coder" format instead of "coder.workspace". This standardizes
the SSH host entries for better consistency and clarity.
This is a follow-up from #17445 and recommends using the suffix-based
format for all new Coder versions.
<img width="418" alt="image"
src="https://github.com/user-attachments/assets/3893f840-9ce1-4803-a013-736068feb328"
/>
Discovered an unhelpful error when running a CLI command without internet (I didn't know I didn't have internet!):
```
$ coder ls
Encountered an error running "coder list", see "coder list --help" for more information
error: <nil>
```
The source of this was that calling `Unwrap()` on `net.DNSError` can return nil, causing the whole error trace to get replaced by it. Instead, we'll just treat a nil `Unwrap()` return value as if there was nothing to unwrap.
The result is:
```
$ coder ls
Encountered an error running "coder list", see "coder list --help" for more information
error: query workspaces: Get "https://dev.coder.com/api/v2/workspaces?q=owner%3Ame": dial tcp: lookup dev.coder.com: no such host
```
Adds telemetry for a _global_ account of prebuilt workspaces created,
failed to build, and claimed.
Partitioning this data by template/preset tuple is not currently in
scope.
---------
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Closes#18088.
The linked issue is misleading -- `coder config-ssh` continues to support the `coder.` prefix. The reason the command
`ssh coder.workspace.agent` fails is because `coder ssh workspace.agent` wasn't supported. This PR fixes that.
We know we used to support `workspace.agent`, as this is what we recommend in the Web UI:

This PR also adds support for `coder ssh agent.workspace.owner`, such that after running `coder config-ssh`, a command like
```
ssh agent.workspace.owner.coder
```
works, even without Coder Connect running. This is done for parity with an existing workflow that uses `ssh workspace.coder`, which either uses Coder Connect if available, or the CLI.
User name, avatar URL, and last seen at, are not required fields so they
can be empty. Instead of returning the 0 values from Go, we want to make
it more agnostic, and omit them when they are empty. This make the docs
and usage way clearer for consumers.
This change introduces a significant refactor to the agentcontainers API
and enables periodic updates of Docker containers rather than on-demand.
Consequently this change also allows us to move away from using a
locking channel and replace it with a mutex, which simplifies usage.
Additionally a previous oversight was fixed, and testing added, to clear
devcontainer running/dirty status when the container has been removed.
Updates coder/coder#16424
Updates coder/internal#621
Relates to https://github.com/coder/coder/issues/17432
### Part 1:
Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.
- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.
- By default `FailureHardLimit` is set to 3.
- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.
### Part 2
Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
'normal', -- Prebuilds are working as expected; this is the default, healthy state.
'hard_limited', -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.
- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>
### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
Fixed environment variable name for app status slug in Claude MCP configuration from `CODER_MCP_CLAUDE_APP_STATUS_SLUG` to `CODER_MCP_APP_STATUS_SLUG` to maintain consistency with other MCP environment variables.
This also caused the User level Claude.md to not contain instructions to report its progress, so it did not receive status reports.
Dynamic params skip parameter validation in coder/coder.
This is because conditional parameters cannot be validated
with the static parameters in the database.