Skip to content

ci: Run PostgreSQL with a scratch directory to improve CI durability #89

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 30, 2022

Conversation

kylecarbs
Copy link
Member

When using parallel before, multiple PostgreSQL containers would
unintentionally interfere with the other's data. This ensures
both containers have separated data, and don't create a volume.

🌮 @bryphe-coder for the idea!

When using parallel before, multiple PostgreSQL containers would
unintentionally interfere with the other's data. This ensures
both containers have separated data, and don't create a volume.

🌮 @bryphe-coder for the idea!
@kylecarbs kylecarbs self-assigned this Jan 30, 2022
@codecov
Copy link

codecov bot commented Jan 30, 2022

Codecov Report

Merging #89 (58a22ad) into main (2b922b1) will increase coverage by 0.25%.
The diff coverage is 80.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
+ Coverage   71.45%   71.71%   +0.25%     
==========================================
  Files          89       89              
  Lines        3563     3578      +15     
  Branches       55       55              
==========================================
+ Hits         2546     2566      +20     
+ Misses        794      790       -4     
+ Partials      223      222       -1     
Flag Coverage Δ
unittest-go-macos-latest 67.49% <ø> (+0.27%) ⬆️
unittest-go-ubuntu-latest 69.45% <80.00%> (-0.08%) ⬇️
unittest-go-windows-latest 67.01% <ø> (-0.11%) ⬇️
unittest-js 78.30% <ø> (ø)
Impacted Files Coverage Δ
database/postgres/postgres.go 70.00% <80.00%> (+3.33%) ⬆️
peer/conn.go 75.91% <0.00%> (+2.43%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b922b1...58a22ad. Read the comment docs.

Copy link
Contributor

@bryphe-coder bryphe-coder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for fixing this @kylecarbs 👏

@kylecarbs kylecarbs merged commit f9e594f into main Jan 30, 2022
@kylecarbs kylecarbs deleted the psql branch January 30, 2022 00:40
bryphe-coder added a commit that referenced this pull request Feb 1, 2022
@kylecarbs and I were debugging a gnarly postgres issue over the weekend, and unfortunately it looks like it is still coming up occassionally: https://github.com/coder/coder/runs/5014420662?check_suite_focus=true#step:8:35 - so thought this might be a good testing Monday task.

Intermittently, the test would fail with something like a `401` - invalid e-mail, or a `409` - initial user already created. This was quite surprising, because the tests are designed to spin up their own, isolated database.

We tried a few things to debug this...

## Attempt 1: Log out the generated port numbers when running the docker image.

Based on the errors, it seemed like one test must be connecting to another test's database - that would explain why we'd get these conflicts! However, logging out the port number that came from docker always gave a unique number... and we couldn't find evidence of one database connecting to another.

## Attempt 2: Store the database in unique, temporary folder.

@kylecarbs and I found that the there was a [volume](https://github.com/docker-library/postgres/blob/a83005b407ee6d810413500d8a041c957fb10cf0/11/alpine/Dockerfile#L155) for the postgres data... so @kylecarbs implemented mounting the volume to a unique, per-test temporary folder in #89

It sounded really promising... but unfortunately we hit the issue again!

### Attempt 3... this PR

After we hit the failure again, we noticed in the `docker ps` logs something quite strange:
![image](https://user-images.githubusercontent.com/88213859/151913133-522a6c2e-977a-4a65-9315-804531ab7d77.png)

When the docker image is run - it creates two port bindings, an IPv4 and an IPv6 one. These _should be the same_ - but surprisingly, they can sometimes be different. It isn't deterministic, and seems to be more common when there are multiple containers running. Importantly, __they can overlap__ as in the above image. 

Turns out, it seems this is a docker bug: moby/moby#42442 - which may be fixed in newer versions.

To work around this bug, we have to manipulate the port bindings (like you would with `-p`) at the command line. We can do this with `docker`/`dockertest`, but it means we have to get a free port ahead of time to know which port to map.

With that fix in - the `docker ps` is a little more sane:
![image](https://user-images.githubusercontent.com/88213859/151913432-5f86bc09-8604-4355-ad49-0abeaf8cc0fe.png)

...and hopefully means we can safely run the containers in parallel again.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants