coolify

Author	SHA1	Message	Date
Andras Bacsai	ae6eef3cdb	feat(tests): add comprehensive tests for ContainerStatusAggregator and serverStatus accessor - Introduced tests for ContainerStatusAggregator to validate status aggregation logic across various container states. - Implemented tests to ensure serverStatus accessor correctly checks server infrastructure health without being affected by container status. - Updated ExcludeFromHealthCheckTest to verify excluded status handling in various components. - Removed obsolete PushServerUpdateJobStatusAggregationTest as its functionality is covered elsewhere. - Updated version number for sentinel to 0.0.17 in versions.json.	2025-11-20 17:31:07 +01:00
Andras Bacsai	14bba8ba86	fix: correct Sentinel default health status and remove debug logging This commit addresses container status reporting issues and removes debug logging: Primary Fix: - Changed PushServerUpdateJob to default to 'unknown' instead of 'unhealthy' when health_status field is missing from Sentinel data - This ensures containers WITHOUT healthcheck defined are correctly reported as "unknown" not "unhealthy" - Matches SSH path behavior (GetContainersStatus) which already defaulted to 'unknown' Service Multi-Container Aggregation: - Implemented service container status aggregation (same pattern as applications) - Added serviceContainerStatuses collection to both Sentinel and SSH paths - Services now aggregate status using priority: unhealthy > unknown > healthy - Prevents race conditions where last-processed container would win Debug Logging Cleanup: - Removed all [STATUS-DEBUG] logging statements (25 total) - Removed all ray() debugging calls (3 total) - Removed proof_unknown_preserved and health_status_was_null debug fields - Code is now production-ready Test Coverage: - Added 2 new tests for Sentinel default health status behavior - Added 5 new tests for service aggregation in SSH path - All 16 tests pass (66 assertions) Note: The root cause was identified as Sentinel (Go binary) also defaulting to "unhealthy". That will need a separate fix in the Sentinel codebase. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 11:10:34 +01:00
Andras Bacsai	747a48b933	debug: add detailed Sentinel container processing logging Added comprehensive logging to track why applicationContainerStatuses collection is empty in PushServerUpdateJob. ## Logging Added ### 1. Raw Sentinel Data (line 113-118) Logs: Complete container data received from Sentinel Purpose: See exactly what Sentinel is sending Data: Container count and full container array with all labels ### 2. Container Processing Loop (line 157-163) Logs: Every container as it's being processed Purpose: Track which containers enter the processing loop Data: Container name, status, all labels, coolify.managed flag ### 3. Skipped Containers - Not Managed (line 165-171) Logs: Containers without coolify.managed label Purpose: Identify containers being filtered out early Data: Container name ### 4. Successful Container Addition (line 193-198) Logs: When container is successfully added to applicationContainerStatuses Purpose: Confirm containers ARE being processed Data: Application ID, container name, container status ### 5. Missing com.docker.compose.service Label (line 200-206) Logs: Containers skipped due to missing com.docker.compose.service Purpose: Identify the most likely root cause Data: Container name, application ID, all labels ## Why This Matters User reported applicationContainerStatuses is empty (`[]`) even though Sentinel is pushing updates. This logging will reveal: 1. Is Sentinel sending containers at all? 2. Are containers filtered by coolify.managed check? 3. Is com.docker.compose.service label missing? (most likely) 4. What labels IS Sentinel actually sending? ## Expected Findings Based on investigation, the issue is likely: - Sentinel is NOT sending com.docker.compose.service in labels - Or Sentinel uses a different label format/name - Containers pass all other checks but fail on line 190-206 ## Next Steps After logs appear, we'll see exactly which filter is blocking containers and can fix the root cause (likely need to extract com.docker.compose.service from container name or use a different label source). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 08:34:42 +01:00
Andras Bacsai	d2d9c1b2bc	debug: add comprehensive status change logging Added detailed debug logging to all status update paths to help diagnose why "unhealthy" status appears in the UI. ## Logging Added ### 1. PushServerUpdateJob (Sentinel updates) Location: Lines 303-315 Logs: Status changes from Sentinel push updates Data tracked: - Old vs new status - Container statuses that led to aggregation - Status flags (hasRunning, hasUnhealthy, hasUnknown) ### 2. GetContainersStatus (SSH updates) Location: Lines 441-449, 346-354, 358-365 Logs: Status changes from SSH-based checks Scenarios: - Normal status aggregation - Recently restarted containers (kept as degraded) - Applications not running (set to exited) Data tracked: - Old vs new status - Container statuses - Restart count and timing - Whether containers exist ### 3. Application Model Status Accessor Location: Lines 706-712, 726-732 Logs: When status is set without explicit health information Issue: Highlights cases where health defaults to "unhealthy" Data tracked: - Raw value passed to setter - Final result after default applied ## How to Use ### Enable Debug Logging Edit `.env` or `config/logging.php` to set log level to debug: ``` LOG_LEVEL=debug ``` ### Monitor Logs ```bash tail -f storage/logs/laravel.log | grep STATUS-DEBUG ``` ### Log Format All logs use `[STATUS-DEBUG]` prefix for easy filtering: ``` [2025-11-19 13:00:00] local.DEBUG: [STATUS-DEBUG] Sentinel status change { "source": "PushServerUpdateJob", "app_id": 123, "app_name": "my-app", "old_status": "running:unknown", "new_status": "running:healthy", "container_statuses": [...], "flags": {...} } ``` ## What to Look For 1. Default to unhealthy: Check Application model accessor logs 2. Status flipping: Compare timestamps between Sentinel and SSH updates 3. Incorrect aggregation: Check flags and container_statuses 4. Stale database values: Check if old_status persists across multiple logs ## Next Steps After gathering logs, we can: 1. Identify the exact source of "unhealthy" status 2. Determine if it's a default issue, aggregation bug, or timing problem 3. Apply targeted fix based on evidence 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:52:08 +01:00
Andras Bacsai	6b62847a11	fix: preserve unknown health status in Sentinel updates (PushServerUpdateJob) ## Problem Services with "running (unknown)" status were periodically changing to "running (healthy)" every ~30 seconds when Sentinel pushed updates. This was confusing for users and inconsistent with SSH-based status checks. ## Root Cause `PushServerUpdateJob::aggregateMultiContainerStatuses()` was missing logic to track "unknown" health state. It only tracked "unhealthy" and defaulted everything else to "healthy". When Sentinel pushed updates with "running (unknown)" containers: - The job saw `hasRunning = true` and `hasUnhealthy = false` - It incorrectly returned "running (healthy)" instead of "running (unknown)" ## Solution Updated `PushServerUpdateJob` to match the logic in `GetContainersStatus`: 1. Added `$hasUnknown` tracking variable 2. Check for "unknown" in status strings (alongside "unhealthy") 3. Implement 3-way priority: unhealthy > unknown > healthy This ensures consistency between: - SSH-based updates (`GetContainersStatus`) - Sentinel-based updates (`PushServerUpdateJob`) - UI display logic ## Changes - app/Jobs/PushServerUpdateJob.php: Added unknown status tracking - tests/Unit/PushServerUpdateJobStatusAggregationTest.php: New comprehensive tests - tests/Unit/ExcludeFromHealthCheckTest.php: Updated to match current implementation ## Testing All 31 status-related unit tests passing: - 18 tests in ContainerHealthStatusTest - 8 tests in ExcludeFromHealthCheckTest (updated) - 6 tests in PushServerUpdateJobStatusAggregationTest (new) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:40:58 +01:00
Andras Bacsai	08d257535a	fix(docker): enhance container status aggregation for multi-container applications, including exclusion handling based on docker-compose configuration	2025-09-13 20:32:15 +02:00
Andras Bacsai	0f5c988658	fix(horizon): add silenced jobs	2025-07-12 14:44:32 +02:00
Andras Bacsai	24688b2ad8	fix(jobs): update middleware to use expireAfter for WithoutOverlapping in multiple job classes	2025-07-01 10:50:27 +02:00
Andras Bacsai	f9a0ca2ca6	refactor(proxy): update StartProxy calls to use named parameter for async option	2025-06-16 13:13:01 +02:00
Andras Bacsai	ddcb14500d	refactor(proxy-status): refactored how the proxy status is handled on the UI and on the backend feat(cloudflare): improved cloudflare tunnel automated installation	2025-06-06 14:47:54 +02:00
Andras Bacsai	97ec579910	refactor(push-server-update): enhance application preview handling by incorporating pull request IDs and adding status update protections	2025-06-04 10:03:36 +02:00
Andras Bacsai	9883cef26d	refactor(jobs): update middleware to include job-specific identifiers for WithoutOverlapping	2025-05-29 17:31:43 +02:00
Andras Bacsai	0369909408	fix(PushServerUpdateJob): add null checks before updating application and database statuses	2025-05-29 10:47:26 +02:00
Andras Bacsai	c6278a06ba	refactor(jobs): unify middleware configuration to prevent job release after expiration for DockerCleanupJob and PushServerUpdateJob	2025-05-07 14:42:42 +02:00
Andras Bacsai	b78f2cccff	refactor(jobs): update WithoutOverlapping middleware to use expireAfter for better queue management	2025-04-18 09:52:32 +02:00
Andras Bacsai	b09f0043d1	fix: restrict jobs on cloud fix: restrict sentinel endpoint	2025-01-10 11:54:45 +01:00
Andras Bacsai	7dc65dfd79	fix: make sure important jobs/actions are running on high prio queue	2024-11-22 11:16:01 +01:00
Andras Bacsai	275edb6c1f	put a few things on high queue	2024-11-06 12:33:56 +01:00
Lucas Michot	8e1444eaa7	Get rid of many useless blank lines	2024-10-31 17:44:01 +01:00
Andras Bacsai	96ca72fcdb	refactor server view (phuuu)	2024-10-30 20:03:30 +01:00
Lucas Michot	5b6e466e0c	Remove some useless catch blocks	2024-10-28 14:37:00 +01:00
Lucas Michot	d557a22b91	Remove all ray() calls	2024-10-28 13:51:23 +01:00
Andras Bacsai	8c96ab52d7	feat: notification rate limiter fix: limit server up / down notification limits	2024-10-25 15:13:23 +02:00
Andras Bacsai	621e063bf1	Refactor PushServerUpdateJob to implement ShouldBeEncrypted interface	2024-10-24 15:16:00 +02:00
Andras Bacsai	ac768e5313	feat: limit storage check emails feat: sentinel should send storage usage	2024-10-22 14:01:36 +02:00
Andras Bacsai	537630acc6	Refactor PushServerUpdateJob to handle container restart notifications	2024-10-22 11:42:24 +02:00
Andras Bacsai	d7efe8a6d1	fix: no sentinel for swarm yet	2024-10-22 11:29:43 +02:00
Andras Bacsai	4c95647b96	feat: cleanup sentinel on server deletion fix: Sentinel should not be enabled on build servers	2024-10-17 11:21:43 +02:00
Andras Bacsai	2702fbc284	Refactor logging in PushServerUpdateJob, Application, and SentinelSeeder	2024-10-15 17:03:50 +02:00
Andras Bacsai	d446cd4f31	sentinel updates	2024-10-15 13:39:19 +02:00
Andras Bacsai	81db57002b	Refactor PushServerUpdateJob to handle multiple servers, previews, and emails	2024-10-14 22:53:16 +02:00
Andras Bacsai	fdeb9353be	chore: Update project service configuration view	2024-10-14 19:45:03 +02:00
Andras Bacsai	1f72321681	fix: sentinel	2024-10-14 18:04:36 +02:00
Andras Bacsai	8a2c9f3d44	updates sentinel	2024-10-14 17:54:29 +02:00
Andras Bacsai	b2e515f770	sentinel	2024-10-14 13:32:36 +02:00
Andras Bacsai	1f193d465d	sentinel updates	2024-10-14 12:07:37 +02:00

36 commits
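
Several of the commits above (6b62847a11, 14bba8ba86) describe the same rule: a missing `health_status` defaults to `unknown`, and multi-container status is aggregated with the priority unhealthy > unknown > healthy. The following is a minimal Python sketch of that rule only, not the repository's PHP code. The function names, the `state` field, and the `exited` fallback for a set with no running containers are assumptions made for illustration; the `running:healthy` string format is taken from the log sample in d2d9c1b2bc.

```python
from typing import Iterable, Mapping


def container_status(container: Mapping[str, str]) -> str:
    """Build a 'state:health' string for one container.

    'health_status' is the field named in the commit message; 'state' is a
    hypothetical field name. A missing or empty health value becomes
    'unknown' rather than 'unhealthy'.
    """
    health = container.get("health_status") or "unknown"
    return f"{container['state']}:{health}"


def aggregate_statuses(statuses: Iterable[str]) -> str:
    """Aggregate per-container statuses into one status string.

    Priority among running containers: unhealthy > unknown > healthy.
    Returning 'exited' when nothing is running is an assumption of this
    sketch.
    """
    statuses = list(statuses)
    has_running = any(s.startswith("running") for s in statuses)
    if not has_running:
        return "exited"

    has_unhealthy = any("unhealthy" in s for s in statuses)
    has_unknown = any("unknown" in s for s in statuses)

    if has_unhealthy:
        health = "unhealthy"
    elif has_unknown:
        health = "unknown"
    else:
        health = "healthy"
    return f"running:{health}"


if __name__ == "__main__":
    containers = [
        {"state": "running", "health_status": "healthy"},
        {"state": "running"},  # no healthcheck defined
    ]
    print(aggregate_statuses(container_status(c) for c in containers))
```

Because the result depends on the whole set of statuses rather than on iteration order, the last-processed container cannot override the others, which is the race the 14bba8ba86 message mentions.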