fix: Re-enable parallel run of Postgres-backed tests (#119)

@kylecarbs and I were debugging a gnarly postgres issue over the weekend, and unfortunately it looks like it is still coming up occassionally: https://github.com/coder/coder/runs/5014420662?check_suite_focus=true#step:8:35 - so thought this might be a good testing Monday task.

Intermittently, the test would fail with something like a `401` - invalid e-mail, or a `409` - initial user already created. This was quite surprising, because the tests are designed to spin up their own, isolated database.

We tried a few things to debug this...

## Attempt 1: Log out the generated port numbers when running the docker image.

Based on the errors, it seemed like one test must be connecting to another test's database - that would explain why we'd get these conflicts! However, logging out the port number that came from docker always gave a unique number... and we couldn't find evidence of one database connecting to another.

## Attempt 2: Store the database in unique, temporary folder.

@kylecarbs and I found that the there was a [volume](a83005b407/11/alpine/Dockerfile (L155)) for the postgres data... so @kylecarbs implemented mounting the volume to a unique, per-test temporary folder in https://github.com/coder/coder/pull/89

It sounded really promising... but unfortunately we hit the issue again!

### Attempt 3... this PR

After we hit the failure again, we noticed in the `docker ps` logs something quite strange:
![image](https://user-images.githubusercontent.com/88213859/151913133-522a6c2e-977a-4a65-9315-804531ab7d77.png)

When the docker image is run - it creates two port bindings, an IPv4 and an IPv6 one. These _should be the same_ - but surprisingly, they can sometimes be different. It isn't deterministic, and seems to be more common when there are multiple containers running. Importantly, __they can overlap__ as in the above image. 

Turns out, it seems this is a docker bug: https://github.com/moby/moby/issues/42442 - which may be fixed in newer versions.

To work around this bug, we have to manipulate the port bindings (like you would with `-p`) at the command line. We can do this with `docker`/`dockertest`, but it means we have to get a free port ahead of time to know which port to map.

With that fix in - the `docker ps` is a little more sane:
![image](https://user-images.githubusercontent.com/88213859/151913432-5f86bc09-8604-4355-ad49-0abeaf8cc0fe.png)

...and hopefully means we can safely run the containers in parallel again.
This commit is contained in:
Bryan
2022-02-01 09:22:02 -08:00
committed by GitHub
parent c54d61efdf
commit 38867b0ad3
2 changed files with 31 additions and 1 deletions

View File

@ -4,6 +4,7 @@ import (
"database/sql"
"fmt"
"io/ioutil"
"net"
"os"
"time"
@ -22,6 +23,13 @@ func Open() (string, func(), error) {
if err != nil {
return "", nil, xerrors.Errorf("create tempdir: %w", err)
}
// Pick an explicit port on the host to connect to 5432.
// This is necessary so we can configure the port to only use ipv4.
port, err := getFreePort()
if err != nil {
return "", nil, xerrors.Errorf("Unable to get free port: %w", err)
}
resource, err := pool.RunWithOptions(&dockertest.RunOptions{
Repository: "postgres",
Tag: "11",
@ -33,6 +41,15 @@ func Open() (string, func(), error) {
"PGDATA=/tmp",
"listen_addresses = '*'",
},
PortBindings: map[docker.Port][]docker.PortBinding{
"5432/tcp": {{
// Manually specifying a host IP tells Docker just to use an IPV4 address.
// If we don't do this, we hit a fun bug:
// https://github.com/moby/moby/issues/42442
// where the ipv4 and ipv6 ports might be _different_ and collide with other running docker containers.
HostIP: "0.0.0.0",
HostPort: fmt.Sprintf("%d", port)}},
},
Mounts: []string{
// The postgres image has a VOLUME parameter in it's image.
// If we don't mount at this point, Docker will allocate a
@ -76,3 +93,16 @@ func Open() (string, func(), error) {
_ = os.RemoveAll(tempDir)
}, nil
}
// getFreePort asks the kernel for a free open port that is ready to use.
func getFreePort() (port int, err error) {
// Binding to port 0 tells the OS to grab a port for us:
// https://stackoverflow.com/questions/1365265/on-localhost-how-do-i-pick-a-free-port-number
listener, err := net.Listen("tcp", "localhost:0")
if err != nil {
return 0, err
}
defer listener.Close()
return listener.Addr().(*net.TCPAddr).Port, nil
}