Commit Graph

6 Commits

Author SHA1 Message Date
fade8ba759 fix: fix MeasureLatencyRecvTimeout to accept send=0 (#13477)
Fixes the flake seen here: https://github.com/coder/coder/runs/25832852690

Linux is not a real time operating system, and so there is no guarantee that subsequent `time.Now()` `time.Since()` calls will return a non-zero time.  This assert is mainly there to ensure we don't return `-1`.
2024-06-05 18:27:56 +04:00
85de0e966d chore: fix TestMeasureLatency/MeasureLatencyRecvTimeout flake (#13301) 2024-05-16 13:42:42 -05:00
4671ebb330 feat: measure pubsub latencies and expose metrics (#13126) 2024-05-10 12:31:49 +00:00
5454f4997b chore(ci): clean up databases after test finishes in CI (#12702) 2024-03-21 14:53:16 +00:00
51707446d0 fix: stop holding Pubsub mutex while calling pq.Listener (#12518)
fixes #11950

https://github.com/coder/coder/issues/11950#issuecomment-1987756088 explains the bug

We were also calling into `Unlisten()` and `Close()` while holding the mutex.  I don't believe that `Close()` depends on the notification loop being unblocked, but it's hard to be sure, and the safest thing to do is assume it could block.

So, I added a unit test that fakes out `pq.Listener` and sends a bunch of notifies every time we call into it to hopefully prevent regression where we hold the mutex while calling into these functions.

It also removes the use of a `context.Context` to stop the PubSub -- it must be explicitly `Closed()`.  This simplifies a bunch of the logic, and is how we use the pubsub anyway.
2024-03-12 09:44:12 +04:00
5a359d50dd feat: add metrics to PGPubsub (#11971)
Adds prometheus metrics to PGPubsub for monitoring its health and performance in production.

Related to #11950 --- additional diagnostics to help figure out what's happening
2024-02-01 11:25:03 +04:00