623dcd97dc
fix(cli): fix flakes related to context cancellation when establishing pg connections ( #18246 )
...
Since https://github.com/coder/coder/pull/18195 was merged, we started
running CLI tests with postgres instead of just dbmem. This surfaced
errors related to context cancellation while establishing postgres
connections.
This PR should fix https://github.com/coder/internal/issues/672 . Related
to https://github.com/coder/coder/issues/15109 .
2025-06-05 15:54:13 +02:00
5f3a53f01b
fix: fix race condition in pubsub startup ( #17088 )
...
fixes https://github.com/coder/internal/issues/525
If the context is canceled, the goroutine that is supposed to read from the `errCh` could exit prematurely, leading to a goroutine leak. Refactors this code so it cannot block.
2025-03-25 16:45:45 +04:00
005ea536a5
fix: fix Listen/Unlisten race on Pubsub ( #15315 )
...
Fixes #15312
When we need to `Unlisten()` for an event, instead of immediately removing the event from the `p.queues`, we store a channel to signal any goroutines trying to Subscribe to the same event when we are done. On `Subscribe`, if the channel is present, wait for it before calling `Listen` to ensure the ordering is correct.
2024-11-01 14:35:26 +04:00
bd9151d224
fix(coderd/database/pubsub): prevent listeners read outside mutex lock ( #15303 )
...
https://github.com/coder/coder/actions/runs/11611105362/job/32331771969#logs
```
2024-10-31T11:36:45.9225038Z WARNING: DATA RACE
2024-10-31T11:36:45.9225120Z Write at 0x00c0000d8030 by goroutine 26:
2024-10-31T11:36:45.9225200Z runtime.mapdelete()
2024-10-31T11:36:45.9225412Z /opt/hostedtoolcache/go/1.22.8/x64/src/runtime/map.go:696 +0x0
2024-10-31T11:36:45.9225647Z github.com/coder/coder/v2/coderd/database/pubsub.(*PGPubsub).subscribeQueue.func2()
2024-10-31T11:36:45.9225906Z /home/runner/work/coder/coder/coderd/database/pubsub/pubsub.go:277 +0x131
2024-10-31T11:36:45.9225993Z runtime.deferreturn()
2024-10-31T11:36:45.9226210Z /opt/hostedtoolcache/go/1.22.8/x64/src/runtime/panic.go:602 +0x5d
2024-10-31T11:36:45.9226283Z testing.tRunner()
2024-10-31T11:36:45.9226519Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
2024-10-31T11:36:45.9226603Z testing.(*T).Run.gowrap1()
2024-10-31T11:36:45.9226831Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1742 +0x44
2024-10-31T11:36:45.9226836Z
2024-10-31T11:36:45.9226934Z Previous read at 0x00c0000d8030 by goroutine 112:
2024-10-31T11:36:45.9227159Z github.com/coder/coder/v2/coderd/database/pubsub.(*PGPubsub).subscribeQueue.func2()
2024-10-31T11:36:45.9227462Z /home/runner/work/coder/coder/coderd/database/pubsub/pubsub.go:284 +0x1b6
2024-10-31T11:36:45.9227661Z github.com/coder/coder/v2/enterprise/replicasync.(*Manager).subscribe.func3()
2024-10-31T11:36:45.9227936Z /home/runner/work/coder/coder/enterprise/replicasync/replicasync.go:228 +0x53
2024-10-31T11:36:45.9227941Z
2024-10-31T11:36:45.9228019Z Goroutine 26 (running) created at:
2024-10-31T11:36:45.9228096Z testing.(*T).Run()
2024-10-31T11:36:45.9228318Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1742 +0x825
2024-10-31T11:36:45.9228498Z github.com/coder/coder/v2/enterprise/replicasync_test.TestReplica()
2024-10-31T11:36:45.9228777Z /home/runner/work/coder/coder/enterprise/replicasync/replicasync_test.go:33 +0x4b
2024-10-31T11:36:45.9228847Z testing.tRunner()
2024-10-31T11:36:45.9229063Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
2024-10-31T11:36:45.9229142Z testing.(*T).Run.gowrap1()
2024-10-31T11:36:45.9229366Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1742 +0x44
2024-10-31T11:36:45.9229369Z
2024-10-31T11:36:45.9229443Z Goroutine 112 (finished) created at:
2024-10-31T11:36:45.9229685Z github.com/coder/coder/v2/enterprise/replicasync.(*Manager).subscribe()
2024-10-31T11:36:45.9229952Z /home/runner/work/coder/coder/enterprise/replicasync/replicasync.go:226 +0x568
2024-10-31T11:36:45.9230092Z github.com/coder/coder/v2/enterprise/replicasync.New()
2024-10-31T11:36:45.9230361Z /home/runner/work/coder/coder/enterprise/replicasync/replicasync.go:101 +0x1344
2024-10-31T11:36:45.9230547Z github.com/coder/coder/v2/enterprise/replicasync_test.TestReplica.func1()
2024-10-31T11:36:45.9230836Z /home/runner/work/coder/coder/enterprise/replicasync/replicasync_test.go:48 +0x26a
2024-10-31T11:36:45.9230904Z testing.tRunner()
2024-10-31T11:36:45.9231127Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1689 +0x21e
2024-10-31T11:36:45.9231207Z testing.(*T).Run.gowrap1()
2024-10-31T11:36:45.9231431Z /opt/hostedtoolcache/go/1.22.8/x64/src/testing/testing.go:1742 +0x44
```
2024-11-01 11:24:29 +04:00
b8944074c4
chore: improve coder server
ux ( #14761 )
2024-09-24 13:16:36 +10:00
ded612d3ec
fix: use authenticated urls for pubsub ( #14261 )
2024-08-26 15:04:04 +00:00
4671ebb330
feat: measure pubsub latencies and expose metrics ( #13126 )
2024-05-10 12:31:49 +00:00
51707446d0
fix: stop holding Pubsub mutex while calling pq.Listener ( #12518 )
...
fixes #11950
https://github.com/coder/coder/issues/11950#issuecomment-1987756088 explains the bug
We were also calling into `Unlisten()` and `Close()` while holding the mutex. I don't believe that `Close()` depends on the notification loop being unblocked, but it's hard to be sure, and the safest thing to do is assume it could block.
So, I added a unit test that fakes out `pq.Listener` and sends a bunch of notifies every time we call into it to hopefully prevent regression where we hold the mutex while calling into these functions.
It also removes the use of a `context.Context` to stop the PubSub -- it must be explicitly `Closed()`. This simplifies a bunch of the logic, and is how we use the pubsub anyway.
2024-03-12 09:44:12 +04:00
98b86f3cd6
chore: add logs to pq notification dialer ( #12020 )
2024-02-06 15:21:48 +00:00
5a359d50dd
feat: add metrics to PGPubsub ( #11971 )
...
Adds prometheus metrics to PGPubsub for monitoring its health and performance in production.
Related to #11950 --- additional diagnostics to help figure out what's happening
2024-02-01 11:25:03 +04:00
a34cada09a
feat: add logging to pgPubsub ( #11953 )
...
Should be helpful for #11950
Adds a logger to pgPubsub and logs various events, most especially connection and disconnection from postgres.
2024-01-31 15:49:16 +04:00
387723a596
fix: close pg PubSub listener to avoid race ( #11640 )
...
Fixes flake as seen here: https://github.com/coder/coder/runs/20528529187
2024-01-18 09:18:59 +04:00
e4b6f5695b
chore: separate pubsub into a new package ( #8017 )
...
* chore: rename store to dbmock for consistency
* chore: remove redundant dbtype package
This wasn't necessary and forked how we do DB types.
* chore: separate pubsub into a new package
This didn't need to be in database and was bloating it.
2023-06-14 15:34:54 +00:00