Skip to content
This repository was archived by the owner on Mar 28, 2025. It is now read-only.

feat: Add HTTP server, remove unhealthy thread monitoring in favour of logging#25

Merged
khvn26 merged 3 commits intomainfrom
feat/remove-thread-monitoring
Feb 28, 2025
Merged

feat: Add HTTP server, remove unhealthy thread monitoring in favour of logging#25
khvn26 merged 3 commits intomainfrom
feat/remove-thread-monitoring

Conversation

@khvn26
Copy link
Member

@khvn26 khvn26 commented Feb 25, 2025

This PR adds a Gunicorn server to the runprocessor command.

Additionally, it removes the checktaskprocessorthreadhealth management command and related code. Threads deemed "unhealthy" are now simply logged to stdout under WARNING level.

@@ -1,17 +0,0 @@
import logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there's something I'm not considering, but do we need to delete this file? Could we not keep it and maybe add some warning logs saying that it's deprecated, and you should use the health endpoints directly instead?

That way we can avoid introducing this only for backwards compatibility in the API's manage.py. There might also be some subtle changes in behaviour which I'm not considering, and I'd rather keep this as unchanged as possible if it's for backwards compatibility.

Copy link
Member Author

@khvn26 khvn26 Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That way we can avoid introducing this only for backwards compatibility

The problem with this command is the fact it's run via Django's management command framework, triggering ready and post migrate signals we cannot control (or, at the very least, have to control, which is an overkill for something like a healthcheck). This is why we need to intercept it earlier than the moment Django kicks in, and it's also why an empty/lightweight command wouldn't be a sufficient solution.

I'd rather keep this as unchanged as possible if it's for backwards compatibility.

I'd agree here, but the check is causing problems in some of the production environments right now. We discussed leaving current health logic with @gagantrivedi, but he mentioned that it's not in fact correct (see https://flagsmith.slack.com/archives/CTF0THS2D/p1740484073370499?thread_ts=1740480199.053439&cid=CTF0THS2D).

TL/DR: the check is too broken to be left as is.

@khvn26 khvn26 requested a review from rolodato February 25, 2025 14:10
@khvn26 khvn26 force-pushed the feat/remove-thread-monitoring branch from ef765ed to 59af2ba Compare February 26, 2025 18:45
@khvn26 khvn26 force-pushed the feat/remove-thread-monitoring branch from 59af2ba to 43387cb Compare February 26, 2025 20:10
@khvn26
Copy link
Member Author

khvn26 commented Feb 27, 2025

In 43387cb, I refactored the task processor main thread to a thread of its own in order to be able to run a Gunicorn server in the main thread instead. This allows to serve HTTP endpoints, which HTTP healtchecks require, and is also important for the upcoming Prometheus endpoint.

@khvn26 khvn26 changed the title feat: Remove unhealthy thread monitoring in favour of logging feat: Add HTTP server, remove unhealthy thread monitoring in favour of logging Feb 28, 2025
@khvn26 khvn26 merged commit 966c02e into main Feb 28, 2025
2 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants