chore: improve mysql readiness checks #2397

Open
wants to merge 1 commit into base: main

ramonpetgrave64
Contributor

Summary

Improves the docker-compose.yml file by enforcing healthchecks between mysql and the trillian services, so the trillian services wait for mysql to report healthy before starting. It also increases the timeouts for the healthchecks.

See https://docs.docker.com/reference/dockerfile/#healthcheck
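
As a rough illustration of the pattern described above, a Compose fragment along these lines gates the trillian services on a healthy mysql container. This is a sketch under stated assumptions, not the actual contents of this PR: only `retries: 5` and `start_period: 90s` come from the diff, while the `mysqladmin ping` test command and the `interval` and `timeout` values are hypothetical, and the `image:`/`build:` keys are omitted because the fragment is meant to be merged into existing service definitions.

```yaml
# Fragment only: merge into the existing service definitions.
services:
  mysql:
    healthcheck:
      # Hypothetical check command; the real file may use a different one.
      test: ["CMD", "mysqladmin", "ping", "-h", "127.0.0.1"]
      interval: 10s       # assumed value
      timeout: 3s         # assumed value
      retries: 5          # value from this PR
      start_period: 90s   # value from this PR; failures in this window do not count toward retries

  trillian-log-server:
    depends_on:
      mysql:
        # Start only after the mysql healthcheck passes.
        condition: service_healthy

  trillian-log-signer:
    depends_on:
      mysql:
        condition: service_healthy
```

With `condition: service_healthy`, `docker compose up` holds the dependent containers in the Created state until the mysql healthcheck succeeds, rather than letting them start and crash on a refused connection.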

This improves the developer experience: we see fewer startup errors from the trillian services, such as:

 *  Executing task: docker logs --tail 1000 -f 6791b7d7b4e977cd371144b7a752018d4077d7b3c65a9a7b12a5d3da556bb3ef 

I0307 19:57:55.500513       1 main.go:97] **** Log Server Starting ****
W0307 19:57:55.502723       1 tree_storage.go:89] Failed to set strict mode on mysql db: dial tcp 192.168.8.4:3306: connect: connection refused
F0307 19:57:55.502831       1 main.go:118] Failed to get storage provider: dial tcp 192.168.8.4:3306: connect: connection refused
...
I0307 19:58:01.096252       1 main.go:97] **** Log Server Starting ****
W0307 19:58:01.098489       1 tree_storage.go:89] Failed to set strict mode on mysql db: dial tcp 192.168.8.4:3306: connect: connection refused
F0307 19:58:01.098555       1 main.go:118] Failed to get storage provider: dial tcp 192.168.8.4:3306: connect: connection refused
...

Testing process

On a personal machine, it can take 90s for mysql to be ready.

rekor git:(main) ✗ docker compose up -d
[+] Running 5/6
 ✔ Network rekor_default                  Created                                                0.2s 
 ✔ Container rekor-redis-server-1         Started                                                0.9s 
 ⠴ Container rekor-mysql-1                Waiting                                               74.6s 
 ✔ Container rekor-trillian-log-signer-1  Created                                                0.0s 
 ✔ Container rekor-trillian-log-server-1  Created                                                0.1s 
 ✔ Container rekor-rekor-server-1         Created                                                0.0s

trillian-log-signer starts cleanly

 *  Executing task: docker logs --tail 1000 -f e857bcb48a25bbe038a4100a7a68c50e94bf34e82756edfddf485193232cc151 

I0307 19:50:31.772888       1 main.go:108] **** Log Signer Starting ****
W0307 19:50:31.776053       1 main.go:147] **** Acting as master for all logs ****
I0307 19:50:31.776630       1 operation_manager.go:328] Log operation manager starting
I0307 19:50:31.776821       1 main.go:188] RPC server starting on 0.0.0.0:8090
I0307 19:50:31.777010       1 main.go:149] HTTP server starting on 0.0.0.0:8091
I0307 19:50:31.777957       1 operation_manager.go:285] Acting as master for 0 / 0 active logs: master for:
I0307 19:50:33.287933       1 operation_manager.go:243] create master election goroutine for 4643029436978467864
I0307 19:50:33.938962       1 runner.go:130] 4643029436978467864: Now, I am the master
I0307 19:50:33.995308       1 operation_manager.go:285] Acting as master for 1 / 1 active logs: master for: <log-4643029436978467864>
...
I0307 19:51:05.615560       1 operation_manager.go:453] 4643029436978467864: processed 1 items in 0.02 seconds (60.26 qps)
I0307 19:51:40.781837       1 runner.go:148] 4643029436978467864: queue up resignation of mastership
I0307 19:51:40.821169       1 runner.go:172] 4643029436978467864: deliberately resigning mastership
I0307 19:51:40.821239       1 runner.go:130] 4643029436978467864: Now, I am the master
I0307 19:53:06.594238       1 runner.go:148] 4643029436978467864: queue up resignation of mastership
I0307 19:53:06.655440       1 runner.go:172] 4643029436978467864: deliberately resigning mastership
I0307 19:53:06.655525       1 runner.go:130] 4643029436978467864: Now, I am the master

Release Note

Better timeouts and readiness checks in docker-compose.yml.

Documentation

Signed-off-by: Ramon Petgrave <[email protected]>

codecov bot commented Mar 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 25.23%. Comparing base (488eb97) to head (fa27085).
Report is 337 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2397       +/-   ##
===========================================
- Coverage   66.46%   25.23%   -41.23%     
===========================================
  Files          92      192      +100     
  Lines        9258    24857    +15599     
===========================================
+ Hits         6153     6272      +119     
- Misses       2359    17807    +15448     
- Partials      746      778       +32     
Flag Coverage Δ
e2etests 46.68% <ø> (-0.88%) ⬇️
unittests 16.46% <ø> (-31.23%) ⬇️

@ramonpetgrave64 ramonpetgrave64 marked this pull request as ready for review March 7, 2025 20:52
@ramonpetgrave64 ramonpetgrave64 requested a review from a team as a code owner March 7, 2025 20:52
- retries: 3
- start_period: 10s
+ retries: 5
+ start_period: 90s
Contributor

Are you developing on macOS or Linux? While this is just for developers so I'm not really concerned with bumping this, I want to make sure there's not a bug with the container.

Personally, I noticed that when I switched from developing on Linux to macOS, I saw more startup errors. Saw some mentions online of needing a different MySQL container but didn't dig into this more.

Contributor Author

Yes, the e2e-test.sh works on Linux, but I see some platform-related errors when running it on macOS. I suspect I need to remove the platform: specification and somehow find another multi-platform image for the gcp-pubsub-emulator.
