coder/database/postgres/postgres.go
Bryan 38867b0ad3 fix: Re-enable parallel run of Postgres-backed tests (#119)
@kylecarbs and I were debugging a gnarly postgres issue over the weekend, and unfortunately it looks like it is still coming up occasionally: https://github.com/coder/coder/runs/5014420662?check_suite_focus=true#step:8:35 - so I thought this might be a good testing Monday task.

Intermittently, the test would fail with something like a `401` - invalid e-mail, or a `409` - initial user already created. This was quite surprising, because the tests are designed to spin up their own, isolated database.

We tried a few things to debug this...

## Attempt 1: Log out the generated port numbers when running the docker image.

Based on the errors, it seemed like one test must be connecting to another test's database - that would explain why we'd get these conflicts! However, logging out the port number that came from docker always gave a unique number... and we couldn't find evidence of one database connecting to another.

## Attempt 2: Store the database in unique, temporary folder.

@kylecarbs and I found that there was a [volume](a83005b407/11/alpine/Dockerfile (L155)) for the postgres data... so @kylecarbs implemented mounting the volume to a unique, per-test temporary folder in https://github.com/coder/coder/pull/89

It sounded really promising... but unfortunately we hit the issue again!

## Attempt 3... this PR

After we hit the failure again, we noticed in the `docker ps` logs something quite strange:
![image](https://user-images.githubusercontent.com/88213859/151913133-522a6c2e-977a-4a65-9315-804531ab7d77.png)

When the docker image is run - it creates two port bindings, an IPv4 and an IPv6 one. These _should be the same_ - but surprisingly, they can sometimes be different. It isn't deterministic, and seems to be more common when there are multiple containers running. Importantly, __they can overlap__ as in the above image. 
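
To make the overlap concrete, here is a minimal sketch of how colliding bindings could be spotted. The `binding` type and the `findCollisions` function are hypothetical names invented for this illustration (they are not part of Docker or `dockertest`), and the sample data is made up rather than copied from the screenshot. It assumes that the same host port number appearing under two different containers counts as a collision, regardless of address family.

```go
package main

import (
	"fmt"
	"sort"
)

// binding is a hypothetical record of one host-side port binding,
// similar to what `docker ps` lists for a container.
type binding struct {
	Container string // container name
	HostIP    string // "0.0.0.0" for IPv4, "::" for IPv6
	HostPort  int
}

// findCollisions returns every host port claimed by more than one container,
// mapped to the sorted names of the containers claiming it.
func findCollisions(bindings []binding) map[int][]string {
	owners := make(map[int]map[string]bool)
	for _, b := range bindings {
		if owners[b.HostPort] == nil {
			owners[b.HostPort] = make(map[string]bool)
		}
		owners[b.HostPort][b.Container] = true
	}

	collisions := make(map[int][]string)
	for port, set := range owners {
		if len(set) < 2 {
			continue // only one container uses this port
		}
		names := make([]string, 0, len(set))
		for name := range set {
			names = append(names, name)
		}
		sort.Strings(names)
		collisions[port] = names
	}
	return collisions
}

func main() {
	// Made-up data: the IPv6 port of pg-a equals the IPv4 port of pg-b.
	bindings := []binding{
		{Container: "pg-a", HostIP: "0.0.0.0", HostPort: 49153},
		{Container: "pg-a", HostIP: "::", HostPort: 49154},
		{Container: "pg-b", HostIP: "0.0.0.0", HostPort: 49154},
		{Container: "pg-b", HostIP: "::", HostPort: 49155},
	}
	fmt.Println(findCollisions(bindings))
}
```

A collision like this would matter whenever a client reaches the port through a name that can resolve to either address family, which is one plausible way a test could end up talking to another test's database.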

Turns out, it seems this is a docker bug: https://github.com/moby/moby/issues/42442 - which may be fixed in newer versions.

To work around this bug, we have to manipulate the port bindings (like you would with `-p`) at the command line. We can do this with `docker`/`dockertest`, but it means we have to get a free port ahead of time to know which port to map.
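
As a rough sketch of that workaround using only the standard library: ask the OS for a free port, then build the value you would pass to `docker run -p` with the host side pinned to IPv4. The names `freePort` and `publishFlag` are hypothetical helpers for this example, not the functions used in this PR.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the OS for an unused TCP port on the IPv4 loopback interface.
// The listener is closed before returning, so another process could in theory
// claim the port before the caller binds to it.
func freePort() (int, error) {
	listener, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer listener.Close()
	return listener.Addr().(*net.TCPAddr).Port, nil
}

// publishFlag formats a value for docker's -p flag in the form
// ip:hostPort:containerPort/tcp, pinning the host side to IPv4.
func publishFlag(hostPort, containerPort int) string {
	return fmt.Sprintf("0.0.0.0:%d:%d/tcp", hostPort, containerPort)
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not get a free port:", err)
		return
	}
	fmt.Println("docker run -p", publishFlag(port, 5432), "postgres:11")
}
```

Because the port is released before Docker binds it, this approach narrows the window for conflicts but does not eliminate it.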

With that fix in - the `docker ps` is a little more sane:
![image](https://user-images.githubusercontent.com/88213859/151913432-5f86bc09-8604-4355-ad49-0abeaf8cc0fe.png)

...and hopefully means we can safely run the containers in parallel again.
2022-02-01 09:22:02 -08:00


package postgres

import (
	"database/sql"
	"fmt"
	"io/ioutil"
	"net"
	"os"
	"time"

	"github.com/ory/dockertest/v3"
	"github.com/ory/dockertest/v3/docker"
	"golang.org/x/xerrors"
)

// Open creates a new PostgreSQL server using a Docker container.
// It returns the connection URL and a cleanup function that removes the
// container and its temporary directory.
func Open() (string, func(), error) {
	pool, err := dockertest.NewPool("")
	if err != nil {
		return "", nil, xerrors.Errorf("create pool: %w", err)
	}
	tempDir, err := ioutil.TempDir(os.TempDir(), "postgres")
	if err != nil {
		return "", nil, xerrors.Errorf("create tempdir: %w", err)
	}
	// Pick an explicit port on the host to map to the container's port 5432.
	// This is necessary so we can configure the binding to only use IPv4.
	port, err := getFreePort()
	if err != nil {
		return "", nil, xerrors.Errorf("get free port: %w", err)
	}
	resource, err := pool.RunWithOptions(&dockertest.RunOptions{
		Repository: "postgres",
		Tag:        "11",
		Env: []string{
			"POSTGRES_PASSWORD=postgres",
			"POSTGRES_USER=postgres",
			"POSTGRES_DB=postgres",
			// The location for temporary database files!
			"PGDATA=/tmp",
			"listen_addresses = '*'",
		},
		PortBindings: map[docker.Port][]docker.PortBinding{
			"5432/tcp": {{
				// Manually specifying a host IP tells Docker to use only an IPv4 address.
				// If we don't do this, we hit a fun bug:
				// https://github.com/moby/moby/issues/42442
				// where the IPv4 and IPv6 ports might be _different_ and collide with
				// other running Docker containers.
				HostIP:   "0.0.0.0",
				HostPort: fmt.Sprintf("%d", port)}},
		},
		Mounts: []string{
			// The postgres image has a VOLUME parameter in its image.
			// If we don't mount at this point, Docker will allocate a
			// volume for this directory.
			//
			// This isn't used anyways, since we override PGDATA.
			fmt.Sprintf("%s:/var/lib/postgresql/data", tempDir),
		},
	}, func(config *docker.HostConfig) {
		// Set AutoRemove to true so that a stopped container goes away by itself.
		config.AutoRemove = true
		config.RestartPolicy = docker.RestartPolicy{Name: "no"}
	})
	if err != nil {
		return "", nil, xerrors.Errorf("could not start resource: %w", err)
	}
	hostAndPort := resource.GetHostPort("5432/tcp")
	dbURL := fmt.Sprintf("postgres://postgres:postgres@%s/postgres?sslmode=disable", hostAndPort)
	// Docker should hard-kill the container after 120 seconds.
	err = resource.Expire(120)
	if err != nil {
		return "", nil, xerrors.Errorf("could not expire resource: %w", err)
	}
	// Retry connecting for up to 120 seconds while the server starts up.
	pool.MaxWait = 120 * time.Second
	err = pool.Retry(func() error {
		db, err := sql.Open("postgres", dbURL)
		if err != nil {
			return err
		}
		err = db.Ping()
		_ = db.Close()
		return err
	})
	if err != nil {
		return "", nil, xerrors.Errorf("ping postgres: %w", err)
	}
	return dbURL, func() {
		_ = pool.Purge(resource)
		_ = os.RemoveAll(tempDir)
	}, nil
}

// getFreePort asks the kernel for a free open port that is ready to use.
func getFreePort() (port int, err error) {
	// Binding to port 0 tells the OS to grab a port for us:
	// https://stackoverflow.com/questions/1365265/on-localhost-how-do-i-pick-a-free-port-number
	listener, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		return 0, err
	}
	defer listener.Close()
	return listener.Addr().(*net.TCPAddr).Port, nil
}