feat: collect database metrics (#17635)

Currently we don't have a way to get insight into Postgres connections
being exhausted.

Using Prometheus' [`DBStats`
collector](https://github.com/prometheus/client_golang/blob/main/prometheus/collectors/dbstats_collector.go),
we get some insight out of the box.

```
# HELP go_sql_idle_connections The number of idle connections.
# TYPE go_sql_idle_connections gauge
go_sql_idle_connections{db_name="coder"} 1
# HELP go_sql_in_use_connections The number of connections currently in use.
# TYPE go_sql_in_use_connections gauge
go_sql_in_use_connections{db_name="coder"} 2
# HELP go_sql_max_idle_closed_total The total number of connections closed due to SetMaxIdleConns.
# TYPE go_sql_max_idle_closed_total counter
go_sql_max_idle_closed_total{db_name="coder"} 112
# HELP go_sql_max_idle_time_closed_total The total number of connections closed due to SetConnMaxIdleTime.
# TYPE go_sql_max_idle_time_closed_total counter
go_sql_max_idle_time_closed_total{db_name="coder"} 0
# HELP go_sql_max_lifetime_closed_total The total number of connections closed due to SetConnMaxLifetime.
# TYPE go_sql_max_lifetime_closed_total counter
go_sql_max_lifetime_closed_total{db_name="coder"} 0
# HELP go_sql_max_open_connections Maximum number of open connections to the database.
# TYPE go_sql_max_open_connections gauge
go_sql_max_open_connections{db_name="coder"} 10
# HELP go_sql_open_connections The number of established connections both in use and idle.
# TYPE go_sql_open_connections gauge
go_sql_open_connections{db_name="coder"} 3
# HELP go_sql_wait_count_total The total number of connections waited for.
# TYPE go_sql_wait_count_total counter
go_sql_wait_count_total{db_name="coder"} 28
# HELP go_sql_wait_duration_seconds_total The total time blocked waiting for a new connection.
# TYPE go_sql_wait_duration_seconds_total counter
go_sql_wait_duration_seconds_total{db_name="coder"} 0.086936235
```

`go_sql_wait_count_total` is the metric I'm most interested in,
but the others are also very useful.
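
To illustrate why the wait count is a useful exhaustion signal, here is a minimal sketch that caps a pool at one connection and then asks for a second. The `stubConn`, `stubDriver`, and `stubConnector` types are hypothetical stand-ins for a real Postgres driver, used so the example runs with the standard library alone:

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"time"
)

// stubConn is a hypothetical no-op connection; it only gives the pool
// something to hand out.
type stubConn struct{}

func (stubConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (stubConn) Close() error                        { return nil }
func (stubConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }

type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) { return stubConn{}, nil }

type stubConnector struct{}

func (stubConnector) Connect(context.Context) (driver.Conn, error) { return stubConn{}, nil }
func (stubConnector) Driver() driver.Driver                        { return stubDriver{} }

// exhaustPool holds the only allowed connection, lets a second request
// time out while waiting, and returns the pool statistics afterwards.
func exhaustPool() sql.DBStats {
	db := sql.OpenDB(stubConnector{})
	defer db.Close()
	db.SetMaxOpenConns(1)

	ctx := context.Background()
	held, err := db.Conn(ctx)
	if err != nil {
		panic(err)
	}
	defer held.Close()

	// The pool is full, so this request blocks until the context expires.
	waitCtx, cancel := context.WithTimeout(ctx, 20*time.Millisecond)
	defer cancel()
	if _, err := db.Conn(waitCtx); err == nil {
		panic("expected the second request to time out")
	}
	return db.Stats()
}

func main() {
	s := exhaustPool()
	fmt.Println("in use:", s.InUse, "wait count:", s.WaitCount, "wait duration:", s.WaitDuration)
}
```

A wait count that keeps climbing while `InUse` sits at `MaxOpenConnections` suggests callers are queueing for connections.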

Changing the prefix is easy (`prometheus.WrapRegistererWithPrefix`), but
getting rid of the `go_` segment is not quite so easy. I've kept the
changeset small for now.

**NOTE:** I imported a library to determine the database name from the
given conn string. It's [not as
simple](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING)
as one might hope. The database name is used for the `db_name` label.
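
As a rough illustration of why this is not trivial, the conn string can be either a URI or a keyword/value list. The sketch below handles only the simplest cases of each; `dbNameFromConnString` is a hypothetical helper, not the library used here, and it ignores quoted values containing spaces, spaces around `=`, environment variables, and service files:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dbNameFromConnString is a hypothetical, deliberately incomplete helper.
// It reports false when no database name is present; libpq itself would
// fall back to the user name in that case.
func dbNameFromConnString(connStr string) (string, bool) {
	// URI form: postgres://user:pass@host:5432/dbname?param=value
	if strings.HasPrefix(connStr, "postgres://") || strings.HasPrefix(connStr, "postgresql://") {
		u, err := url.Parse(connStr)
		if err != nil {
			return "", false
		}
		if name := strings.TrimPrefix(u.Path, "/"); name != "" {
			return name, true
		}
		// The name may also be passed as a query parameter.
		name := u.Query().Get("dbname")
		return name, name != ""
	}

	// Keyword/value form: host=localhost dbname=coder user=coder
	for _, field := range strings.Fields(connStr) {
		key, value, ok := strings.Cut(field, "=")
		if ok && key == "dbname" {
			value = strings.Trim(value, "'")
			return value, value != ""
		}
	}
	return "", false
}

func main() {
	for _, s := range []string{
		"postgres://coder:secret@localhost:5432/coder?sslmode=disable",
		"host=localhost dbname=coder user=coder",
		"host=localhost user=coder",
	} {
		name, ok := dbNameFromConnString(s)
		fmt.Printf("%q -> %q (found: %v)\n", s, name, ok)
	}
}
```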

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Danny Kopping
2025-05-02 12:17:01 +02:00
committed by GitHub
parent e718c3ab2f
commit c278662218

@@ -739,6 +739,15 @@ func (r *RootCmd) Server(newAPI func(context.Context, *coderd.Options) (*coderd.
 				_ = sqlDB.Close()
 			}()
 
+			if options.DeploymentValues.Prometheus.Enable {
+				// At this stage we don't think the database name serves much purpose in these metrics.
+				// It requires parsing the DSN to determine it, which requires pulling in another dependency
+				// (i.e. https://github.com/jackc/pgx), but it's rather heavy.
+				// The conn string (https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) can
+				// take different forms, which make parsing non-trivial.
+				options.PrometheusRegistry.MustRegister(collectors.NewDBStatsCollector(sqlDB, ""))
+			}
+
 			options.Database = database.New(sqlDB)
 			ps, err := pubsub.New(ctx, logger.Named("pubsub"), sqlDB, dbURL)
 			if err != nil {