From c27866221822e4112bddb43f8ad102a5589c98ab Mon Sep 17 00:00:00 2001
From: Danny Kopping
Date: Fri, 2 May 2025 12:17:01 +0200
Subject: [PATCH] feat: collect database metrics (#17635)

Currently we don't have a way to get insight into Postgres connections
being exhausted.

By using the Prometheus [`DBStats` collector](https://github.com/prometheus/client_golang/blob/main/prometheus/collectors/dbstats_collector.go),
we get some insight out-of-the-box.

```
# HELP go_sql_idle_connections The number of idle connections.
# TYPE go_sql_idle_connections gauge
go_sql_idle_connections{db_name="coder"} 1
# HELP go_sql_in_use_connections The number of connections currently in use.
# TYPE go_sql_in_use_connections gauge
go_sql_in_use_connections{db_name="coder"} 2
# HELP go_sql_max_idle_closed_total The total number of connections closed due to SetMaxIdleConns.
# TYPE go_sql_max_idle_closed_total counter
go_sql_max_idle_closed_total{db_name="coder"} 112
# HELP go_sql_max_idle_time_closed_total The total number of connections closed due to SetConnMaxIdleTime.
# TYPE go_sql_max_idle_time_closed_total counter
go_sql_max_idle_time_closed_total{db_name="coder"} 0
# HELP go_sql_max_lifetime_closed_total The total number of connections closed due to SetConnMaxLifetime.
# TYPE go_sql_max_lifetime_closed_total counter
go_sql_max_lifetime_closed_total{db_name="coder"} 0
# HELP go_sql_max_open_connections Maximum number of open connections to the database.
# TYPE go_sql_max_open_connections gauge
go_sql_max_open_connections{db_name="coder"} 10
# HELP go_sql_open_connections The number of established connections both in use and idle.
# TYPE go_sql_open_connections gauge
go_sql_open_connections{db_name="coder"} 3
# HELP go_sql_wait_count_total The total number of connections waited for.
# TYPE go_sql_wait_count_total counter
go_sql_wait_count_total{db_name="coder"} 28
# HELP go_sql_wait_duration_seconds_total The total time blocked waiting for a new connection.
# TYPE go_sql_wait_duration_seconds_total counter
go_sql_wait_duration_seconds_total{db_name="coder"} 0.086936235
```

`go_sql_wait_count_total` is the metric I'm most interested in gaining,
but the others are also very useful.

Changing the prefix is easy (`prometheus.WrapRegistererWithPrefix`), but
getting rid of the `go_` segment is not quite so easy. I've kept the
changeset small for now.

**NOTE:** I imported a library to determine the database name from the
given conn string. It's [not as simple](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING)
as one might hope. The database name is used for the `db_name` label.

---------

Signed-off-by: Danny Kopping
---
 cli/server.go | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/cli/server.go b/cli/server.go
index 39cfa52571..580dae3694 100644
--- a/cli/server.go
+++ b/cli/server.go
@@ -739,6 +739,15 @@ func (r *RootCmd) Server(newAPI func(context.Context, *coderd.Options) (*coderd.
 				_ = sqlDB.Close()
 			}()
 
+			if options.DeploymentValues.Prometheus.Enable {
+				// At this stage we don't think the database name serves much purpose in these metrics.
+				// It requires parsing the DSN to determine it, which requires pulling in another dependency
+				// (i.e. https://github.com/jackc/pgx), but it's rather heavy.
+				// The conn string (https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) can
+				// take different forms, which make parsing non-trivial.
+				options.PrometheusRegistry.MustRegister(collectors.NewDBStatsCollector(sqlDB, ""))
+			}
+
 			options.Database = database.New(sqlDB)
 			ps, err := pubsub.New(ctx, logger.Named("pubsub"), sqlDB, dbURL)
 			if err != nil {