This fixes critical bugs where Stringable objects were used in strict comparisons and collection key lookups, causing service existence checks and domain lookups to fail.
**Changes:**
- Line 539: Added ->value() to $originalServiceName conversion
- Line 541: Added ->value() to $serviceName normalization
- Line 621: Removed redundant (string) cast now that $serviceName is a plain string
**Impact:**
- Service existence check now works correctly (line 606: $transformedServiceName === $serviceName)
- Domain lookup finds existing domains (line 615: $domains->get($serviceName))
- Prevents duplicate domain entries in docker_compose_domains collection
**Tests:**
- Added comprehensive unit test suite in ApplicationParserStringableTest.php
- 9 test cases covering type verification, strict comparisons, collection operations, and edge cases
- All tests pass (24 tests, 153 assertions across related parser tests)
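A minimal sketch of the failure mode the `->value()` calls address (illustrative values, not the actual parser code):

```php
$serviceName = str('my_service'); // Illuminate\Support\Stringable, not a plain string

// Strict comparison between an object and a string is always false:
$serviceName === 'my_service';          // false
$serviceName->value() === 'my_service'; // true

// The docker_compose_domains collection is keyed by plain strings, so the
// lookup only matches once the Stringable is unwrapped:
$domains = collect(['my_service' => 'https://app.example.com']);
$domain = $domains->get($serviceName->value()); // "https://app.example.com"
```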
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Exited containers don't run health checks, so showing "(unhealthy)" is
misleading. This fix ensures exited status displays without health
suffixes across all monitoring systems (SSH, Sentinel, services, etc.)
and at the UI layer for backward compatibility with existing data.
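A hypothetical helper illustrating the rule (the actual method names in the monitoring code differ):

```php
function displayStatus(string $status): string
{
    // Exited containers never run health checks, so any "(unhealthy)" suffix is dropped.
    if (str($status)->startsWith('exited')) {
        return 'exited';
    }

    return $status;
}

displayStatus('exited (unhealthy)'); // "exited"
displayStatus('running (healthy)');  // "running (healthy)"
```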
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove useless conditional check for hyphens in service name normalization
  The conditional `if (str($serviceName)->contains('-'))` never executes because
  $serviceName is already normalized with underscores from parseServiceEnvironmentVariable()
- Always normalize service names explicitly to match docker_compose_domains lookup
  This makes the code clearer and more maintainable
- Remove unused $fqdnWithPort variable assignments in both applicationParser and serviceParser
  The variable is calculated but never used - only $urlWithPort and $fqdnValueForEnvWithPort are needed
These changes are code cleanup only - no behavior changes or breaking changes
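A simplified before/after sketch of the normalization change (the surrounding parser code differs):

```php
// Before: the branch never ran, because $serviceName already arrives with
// underscores from parseServiceEnvironmentVariable():
if (str($serviceName)->contains('-')) {
    $serviceName = str($serviceName)->replace('-', '_');
}

// After: normalize explicitly and unconditionally, so the intent is obvious and
// the value always matches the docker_compose_domains keys:
$serviceName = str($serviceName)->replace('-', '_')->value();
```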
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit enhances the boarding flow to handle prerequisite installation asynchronously with proper retry logic and user feedback:
- Add retry mechanism with max 3 attempts for prerequisite installation
- Display live installation logs via ActivityMonitor during boarding
- Reset ActivityMonitor state when starting new activity to prevent stale event dispatching
- Support dynamic header updates in ActivityMonitor
- Add prerequisitesInstalled event handler to revalidate after installation completes
- Extract validation logic into continueValidation() method for cleaner flow
- Add unit tests for prerequisite installation logic
This improves UX by showing users real-time progress during prerequisite installation and handles installation failures gracefully with automatic retries.
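A hedged sketch of the retry guard and event handler; apart from prerequisitesInstalled() and continueValidation(), the class, property, and method names below are assumptions, not the actual boarding component:

```php
use Livewire\Attributes\On;
use Livewire\Component;

class BoardingStep extends Component
{
    public int $prerequisiteAttempts = 0;

    public function installPrerequisites(): void
    {
        // Give up after the maximum of 3 attempts and surface the failure to the user.
        if ($this->prerequisiteAttempts >= 3) {
            $this->dispatch('error', 'Prerequisite installation failed after 3 attempts.');

            return;
        }
        $this->prerequisiteAttempts++;
        // Start the installation; ActivityMonitor streams the live logs during boarding.
    }

    #[On('prerequisitesInstalled')]
    public function prerequisitesInstalled(): void
    {
        $this->continueValidation(); // revalidate once installation completes
    }

    private function continueValidation(): void
    {
        // Validation steps extracted here for a cleaner flow.
    }
}
```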
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added additional volumes to the Paymenter service in docker-compose:
- Added themes volume for theme storage
- Added extensions volume for external extensions
- Added app_storage_public for public storage path
These changes allow extensions, themes, and images to be stored without issues (previously they were deleted on restart).
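For reference, the shape of the added named volumes; the mount paths here are assumptions and may differ from the shipped paymenter.yaml template:

```yaml
services:
  paymenter:
    volumes:
      - paymenter-themes:/app/themes
      - paymenter-extensions:/app/extensions
      - paymenter-app-storage-public:/app/storage/app/public

volumes:
  paymenter-themes:
  paymenter-extensions:
  paymenter-app-storage-public:
```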
Replace border-based left indicator with inset box-shadow to prevent unwanted layout shifts when focusing or marking fields as dirty. The solution reserves 4px space with transparent shadow in default state and transitions to colored shadow on focus/dirty without affecting the box model. Update all form components (input, textarea, select, datalist) for consistency.
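A minimal sketch of the technique; the selector and color below are illustrative, not the actual component classes:

```css
/* Default state: reserve the 4px indicator space with a transparent inset shadow
   so focus/dirty never changes the box model. */
.input {
  box-shadow: inset 4px 0 0 0 transparent;
  transition: box-shadow 150ms ease-in-out;
}

/* Focus/dirty state: only the shadow color changes, so there is no layout shift. */
.input:focus,
.input.dirty {
  box-shadow: inset 4px 0 0 0 #f59e0b;
}
```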
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Resolved conflicts in bootstrap/helpers/parsers.php by combining:
- ServiceApplication vs ServiceDatabase distinction from 'next' branch
- Case-preserved service name extraction and dual SERVICE_URL/SERVICE_FQDN creation from current branch
The resolution ensures:
- Only ServiceApplication instances have their fqdn column updated (ServiceDatabase does not have this column)
- Both SERVICE_URL and SERVICE_FQDN environment variables are always created with case-preserved service names
- Port-specific environment variables are created when applicable
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
ServiceDatabase doesn't have an fqdn column - only ServiceApplication does.
The parser was attempting to read/write fqdn on both types, causing SQL
errors when SERVICE_FQDN_* or SERVICE_URL_* variables were used with database
services. Now it only persists fqdn to ServiceApplication while still
generating the environment variable values for databases.
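A simplified sketch of the guard; $savedService and $fqdn are illustrative names, not the parser's exact variables:

```php
use App\Models\ServiceApplication;

// SERVICE_FQDN_* / SERVICE_URL_* values are still generated for every service,
// but the column is only written on models that actually have it:
if ($savedService instanceof ServiceApplication) {
    $savedService->fqdn = $fqdn;
    $savedService->save();
}
// ServiceDatabase has no fqdn column, so it is intentionally left untouched.
```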
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The updateCompose() function now correctly detects SERVICE_URL_* and
SERVICE_FQDN_* variables regardless of whether they are defined in
YAML list-style or map-style format.
Previously, the code only worked with list-style environment definitions:
```yaml
environment:
  - SERVICE_URL_APP_3000
```
Now it also handles map-style definitions:
```yaml
environment:
  SERVICE_URL_TRIGGER_3000: ""
  SERVICE_FQDN_DB: localhost
```
The fix distinguishes between the two formats by checking if the array
key is numeric (list-style) or a string (map-style), then extracts the
variable name from the appropriate location.
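A simplified sketch of the detection; the real updateCompose() logic differs in detail:

```php
foreach ($environment as $key => $value) {
    if (is_int($key)) {
        // List-style: "- SERVICE_URL_APP_3000" — the name lives in the value.
        $variableName = str($value)->before('=')->value();
    } else {
        // Map-style: 'SERVICE_URL_TRIGGER_3000: ""' — the name is the key itself.
        $variableName = $key;
    }

    if (str($variableName)->startsWith(['SERVICE_URL_', 'SERVICE_FQDN_'])) {
        // handle the magic variable...
    }
}
```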
Added 5 comprehensive unit tests covering:
- Map-style environment format detection
- Multiple map-style variables
- References vs declarations in map-style
- Abbreviated service names with map-style
- Verification of dual-format handling
This fixes variable detection for service templates like trigger.yaml,
langfuse.yaml, and paymenter.yaml that use map-style format.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Parse template variables directly instead of generating them from container names. Always create both SERVICE_URL and SERVICE_FQDN pairs together. Properly separate scheme handling (URL has a scheme, FQDN doesn't). Add comprehensive test coverage.
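Illustrative values only — the parser derives these from the template:

```php
$fqdn = 'app.example.com';  // SERVICE_FQDN_APP: host only, no scheme
$url  = "https://{$fqdn}";  // SERVICE_URL_APP: same host, with its scheme
// Both variables are always created together as a pair.
```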
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Add Load/Reload Compose File button at the top for easier access
- Add toggle to switch between raw and deployable Docker Compose views
- Improve code formatting and UI consistency
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
This commit adds comprehensive validation improvements and DRY principles for handling Coolify's custom Docker Compose extensions.
## Changes
### 1. Created Reusable stripCoolifyCustomFields() Function
- Added shared helper in bootstrap/helpers/docker.php
- Removes all Coolify custom fields (exclude_from_hc, content, isDirectory, is_directory)
- Handles both long syntax (arrays) and short syntax (strings) for volumes
- Well-documented with comprehensive docblock
- Follows DRY principle for consistent field stripping
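A hedged sketch of the helper's behavior; the shipped implementation in bootstrap/helpers/docker.php may differ in structure:

```php
function stripCoolifyCustomFields(array $compose): array
{
    $customFields = ['exclude_from_hc', 'content', 'isDirectory', 'is_directory'];

    foreach ($compose['services'] ?? [] as $name => $service) {
        // The health-check exclusion flag lives on the service itself.
        unset($service['exclude_from_hc']);

        // Long-syntax volume entries are arrays and may carry custom keys;
        // short-syntax entries are plain strings and are left untouched.
        foreach ($service['volumes'] ?? [] as $i => $volume) {
            if (is_array($volume)) {
                $service['volumes'][$i] = array_diff_key($volume, array_flip($customFields));
            }
        }

        $compose['services'][$name] = $service;
    }

    return $compose;
}
```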
### 2. Fixed Docker Compose Modal Validation
- Updated validateComposeFile() to use stripCoolifyCustomFields()
- Now removes ALL custom fields before Docker validation (previously only removed content)
- Fixes validation errors when using templates with custom fields (e.g., traccar.yaml)
- Users can now validate compose files with Coolify extensions in UI
### 3. Enhanced YAML Validation in CalculatesExcludedStatus
- Added proper exception handling with ParseException vs generic Exception
- Added structure validation (checks if parsed result and services are arrays)
- Comprehensive logging with context (error message, line number, snippet)
- Maintains safe fallback behavior (returns empty collection on error)
### 4. Added Integer Validation to ContainerStatusAggregator
- Validates maxRestartCount parameter in both aggregateFromStrings() and aggregateFromContainers()
- Corrects negative values to 0 with warning log
- Logs warnings for suspiciously high values (> 1000)
- Prevents logic errors in crash loop detection
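A simplified sketch of the parameter guard described above:

```php
use Illuminate\Support\Facades\Log;

if ($maxRestartCount < 0) {
    Log::warning('ContainerStatusAggregator: negative maxRestartCount corrected to 0', [
        'maxRestartCount' => $maxRestartCount,
    ]);
    $maxRestartCount = 0;
}

if ($maxRestartCount > 1000) {
    Log::warning('ContainerStatusAggregator: suspiciously high maxRestartCount', [
        'maxRestartCount' => $maxRestartCount,
    ]);
}
```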
### 5. Comprehensive Unit Tests
- tests/Unit/StripCoolifyCustomFieldsTest.php (NEW) - 9 tests, 43 assertions
- tests/Unit/ContainerStatusAggregatorTest.php - Added 6 tests for integer validation
- tests/Unit/ExcludeFromHealthCheckTest.php - Added 4 tests for YAML validation
- All tests passing with proper Log facade mocking
### 6. Documentation
- Added comprehensive Docker Compose extensions documentation to .ai/core/deployment-architecture.md
- Documents all custom fields: exclude_from_hc, content, isDirectory/is_directory
- Includes examples, use cases, implementation details, and test references
- Updated .ai/README.md with navigation links to new documentation
## Benefits
- Better UX: Users can validate compose files with custom fields
- Better Debugging: Comprehensive logging for errors
- Better Code Quality: DRY principle with reusable validation
- Better Reliability: Prevents logic errors from invalid parameters
- Better Maintainability: Easy to add new custom fields in future
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Introduced tests for ContainerStatusAggregator to validate status aggregation logic across various container states.
- Implemented tests to ensure serverStatus accessor correctly checks server infrastructure health without being affected by container status.
- Updated ExcludeFromHealthCheckTest to verify excluded status handling in various components.
- Removed obsolete PushServerUpdateJobStatusAggregationTest as its functionality is covered elsewhere.
- Updated version number for sentinel to 0.0.17 in versions.json.
Prevents removal and re-download of database images on every restart. Docker cleanup was removing Docker Hub images (postgres, mysql, redis, etc.) that lack the coolify.managed=true label, causing them to be immediately re-pulled. Restart now preserves images while stopping/starting containers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixes inconsistency where Service model used manual state machine logic while
all other components (Application, ComplexStatusCheck, GetContainersStatus)
use the centralized ContainerStatusAggregator service.
Changes:
- Refactored Service::aggregateResourceStatuses() to use ContainerStatusAggregator
- Removed ~60 lines of duplicated state machine logic
- Added comprehensive ServiceExcludedStatusTest with 24 test cases
- Fixed bugs in old logic where paused/starting containers were incorrectly
marked as unhealthy (should be unknown)
Benefits:
- Single source of truth for status aggregation across all models
- Leverages 42 existing ContainerStatusAggregator tests
- Consistent behavior between Service and Application/Database models
- Easier maintenance (state machine changes only in one place)
All tests pass (37 total):
- ServiceExcludedStatusTest: 24/24 passed
- AllExcludedContainersConsistencyTest: 13/13 passed
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add warning helper for 'unknown' health status
  - Clarifies that no health check is configured
  - Explains that Traefik/Caddy will still route traffic
  - Recommends configuring health checks for better reliability
- Update warning helper for 'unhealthy' health status
  - Corrects misleading message that suggested resource might work
  - Clearly states health check is failing
  - Emphasizes that Traefik will NOT work with unhealthy containers
  - Highlights that user action is required
Both warnings include links to documentation and use consistent warning icons.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses container status reporting issues and removes debug logging:
**Primary Fix:**
- Changed PushServerUpdateJob to default to 'unknown' instead of 'unhealthy' when health_status field is missing from Sentinel data
- This ensures containers WITHOUT healthcheck defined are correctly reported as "unknown" not "unhealthy"
- Matches SSH path behavior (GetContainersStatus) which already defaulted to 'unknown'
**Service Multi-Container Aggregation:**
- Implemented service container status aggregation (same pattern as applications)
- Added serviceContainerStatuses collection to both Sentinel and SSH paths
- Services now aggregate status using priority: unhealthy > unknown > healthy
- Prevents race conditions where last-processed container would win
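An illustrative sketch of the per-service grouping; the array keys and variable names are assumptions, not the job's exact implementation:

```php
$serviceContainerStatuses = collect();

foreach ($containers as $container) {
    $serviceId = $container['service_id'] ?? null; // however the owning service is resolved
    if ($serviceId === null) {
        continue;
    }

    // Collect every container's status for the service instead of letting the
    // last-processed container win.
    $statuses = $serviceContainerStatuses->get($serviceId, collect());
    $statuses->push($container['status']);
    $serviceContainerStatuses->put($serviceId, $statuses);
}

// Each service's statuses are then reduced with the unhealthy > unknown > healthy priority.
```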
**Debug Logging Cleanup:**
- Removed all [STATUS-DEBUG] logging statements (25 total)
- Removed all ray() debugging calls (3 total)
- Removed proof_unknown_preserved and health_status_was_null debug fields
- Code is now production-ready
**Test Coverage:**
- Added 2 new tests for Sentinel default health status behavior
- Added 5 new tests for service aggregation in SSH path
- All 16 tests pass (66 assertions)
**Note:** The root cause was identified as Sentinel (Go binary) also defaulting to "unhealthy". That will need a separate fix in the Sentinel codebase.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added comprehensive logging to track why applicationContainerStatuses
collection is empty in PushServerUpdateJob.
## Logging Added
### 1. Raw Sentinel Data (lines 113-118)
**Logs**: Complete container data received from Sentinel
**Purpose**: See exactly what Sentinel is sending
**Data**: Container count and full container array with all labels
### 2. Container Processing Loop (lines 157-163)
**Logs**: Every container as it's being processed
**Purpose**: Track which containers enter the processing loop
**Data**: Container name, status, all labels, coolify.managed flag
### 3. Skipped Containers - Not Managed (lines 165-171)
**Logs**: Containers without coolify.managed label
**Purpose**: Identify containers being filtered out early
**Data**: Container name
### 4. Successful Container Addition (lines 193-198)
**Logs**: When container is successfully added to applicationContainerStatuses
**Purpose**: Confirm containers ARE being processed
**Data**: Application ID, container name, container status
### 5. Missing com.docker.compose.service Label (lines 200-206)
**Logs**: Containers skipped due to missing com.docker.compose.service
**Purpose**: Identify the most likely root cause
**Data**: Container name, application ID, all labels
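The illustrative shape of one of the added log calls; the message text and context keys are paraphrased from the descriptions above, not copied from the job:

```php
use Illuminate\Support\Facades\Log;

foreach ($containers as $container) {
    if (! ($container['labels']['coolify.managed'] ?? false)) {
        Log::debug('[PushServerUpdateJob] Skipping container without coolify.managed label', [
            'container_name' => $container['name'] ?? null,
        ]);

        continue;
    }
    // ... remaining processing and logging points described above ...
}
```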
## Why This Matters
User reported applicationContainerStatuses is empty (`[]`) even though
Sentinel is pushing updates. This logging will reveal:
1. Is Sentinel sending containers at all?
2. Are containers filtered by coolify.managed check?
3. Is com.docker.compose.service label missing? (most likely)
4. What labels IS Sentinel actually sending?
## Expected Findings
Based on investigation, the issue is likely:
- Sentinel is NOT sending com.docker.compose.service in labels
- Or Sentinel uses a different label format/name
- Containers pass all other checks but fail on lines 190-206
## Next Steps
After logs appear, we'll see exactly which filter is blocking containers
and can fix the root cause (likely need to extract com.docker.compose.service
from container name or use a different label source).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added detailed debug logging to all status update paths to help
diagnose why "unhealthy" status appears in the UI.
## Logging Added
### 1. PushServerUpdateJob (Sentinel updates)
**Location**: Lines 303-315
**Logs**: Status changes from Sentinel push updates
**Data tracked**:
- Old vs new status
- Container statuses that led to aggregation
- Status flags (hasRunning, hasUnhealthy, hasUnknown)
### 2. GetContainersStatus (SSH updates)
**Location**: Lines 441-449, 346-354, 358-365
**Logs**: Status changes from SSH-based checks
**Scenarios**:
- Normal status aggregation
- Recently restarted containers (kept as degraded)
- Applications not running (set to exited)
**Data tracked**:
- Old vs new status
- Container statuses
- Restart count and timing
- Whether containers exist
### 3. Application Model Status Accessor
**Location**: Lines 706-712, 726-732
**Logs**: When status is set without explicit health information
**Issue**: Highlights cases where health defaults to "unhealthy"
**Data tracked**:
- Raw value passed to setter
- Final result after default applied
## How to Use
### Enable Debug Logging
Edit `.env` or `config/logging.php` to set log level to debug:
```
LOG_LEVEL=debug
```
### Monitor Logs
```bash
tail -f storage/logs/laravel.log | grep STATUS-DEBUG
```
### Log Format
All logs use `[STATUS-DEBUG]` prefix for easy filtering:
```
[2025-11-19 13:00:00] local.DEBUG: [STATUS-DEBUG] Sentinel status change
{
  "source": "PushServerUpdateJob",
  "app_id": 123,
  "app_name": "my-app",
  "old_status": "running:unknown",
  "new_status": "running:healthy",
  "container_statuses": [...],
  "flags": {...}
}
```
## What to Look For
1. **Default to unhealthy**: Check Application model accessor logs
2. **Status flipping**: Compare timestamps between Sentinel and SSH updates
3. **Incorrect aggregation**: Check flags and container_statuses
4. **Stale database values**: Check if old_status persists across multiple logs
## Next Steps
After gathering logs, we can:
1. Identify the exact source of "unhealthy" status
2. Determine if it's a default issue, aggregation bug, or timing problem
3. Apply targeted fix based on evidence
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Added Documentation
Created detailed documentation in `.ai/core/application-architecture.md`
explaining the container status monitoring system to prevent future bugs.
## Key Sections
### 1. Container Status Monitoring System Overview
- Explains that status is updated through multiple independent paths
- Emphasizes that ALL paths must be updated when changing status logic
### 2. Critical Implementation Locations
Documents all four status calculation locations:
- **SSH-Based Updates**: `GetContainersStatus.php` (scheduled, every ~1min)
- **Sentinel-Based Updates**: `PushServerUpdateJob.php` (real-time, every ~30sec)
- **Multi-Server Aggregation**: `ComplexStatusCheck.php` (on-demand)
- **Service-Level Aggregation**: `Service.php` (service status)
### 3. Status Flow Diagram
Visual representation of how status flows from different sources to UI
### 4. Status Priority System
Documents the required priority: unhealthy > unknown > healthy
### 5. Excluded Containers
Explains `:excluded` suffix handling and behavior
### 6. Developer Guidelines
- Checklist of all locations to update
- Testing requirements
- Edge cases to handle
### 7. Related Tests
Links to all relevant test files
### 8. Common Bugs to Avoid
Real examples from bugs we've fixed, with solutions
## Why This Documentation Matters
The recent bug (unknown → healthy) happened because:
1. `GetContainersStatus.php` was updated to handle "unknown" status
2. `PushServerUpdateJob.php` was NOT updated
3. This caused periodic status flipping
This documentation ensures future developers (and AI assistants like Claude)
will know to update ALL four locations when modifying status logic.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Problem
Services with "running (unknown)" status were periodically changing
to "running (healthy)" every ~30 seconds when Sentinel pushed updates.
This was confusing for users and inconsistent with SSH-based status checks.
## Root Cause
`PushServerUpdateJob::aggregateMultiContainerStatuses()` was missing
logic to track "unknown" health state. It only tracked "unhealthy" and
defaulted everything else to "healthy".
When Sentinel pushed updates with "running (unknown)" containers:
- The job saw `hasRunning = true` and `hasUnhealthy = false`
- It incorrectly returned "running (healthy)" instead of "running (unknown)"
## Solution
Updated `PushServerUpdateJob` to match the logic in `GetContainersStatus`:
1. Added `$hasUnknown` tracking variable
2. Check for "unknown" in status strings (alongside "unhealthy")
3. Implement 3-way priority: unhealthy > unknown > healthy
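A simplified sketch of the corrected aggregation inside the job; previously only `$hasUnhealthy` existed, so "running (unknown)" collapsed to "running (healthy)":

```php
$hasRunning = $hasUnhealthy = $hasUnknown = false;

foreach ($containerStatuses as $status) {
    $hasRunning = $hasRunning || str($status)->contains('running');
    $hasUnhealthy = $hasUnhealthy || str($status)->contains('unhealthy');
    $hasUnknown = $hasUnknown || str($status)->contains('unknown');
}

if ($hasRunning) {
    // 3-way priority: unhealthy > unknown > healthy
    return $hasUnhealthy
        ? 'running (unhealthy)'
        : ($hasUnknown ? 'running (unknown)' : 'running (healthy)');
}
```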
This ensures consistency between:
- SSH-based updates (`GetContainersStatus`)
- Sentinel-based updates (`PushServerUpdateJob`)
- UI display logic
## Changes
- **app/Jobs/PushServerUpdateJob.php**: Added unknown status tracking
- **tests/Unit/PushServerUpdateJobStatusAggregationTest.php**: New comprehensive tests
- **tests/Unit/ExcludeFromHealthCheckTest.php**: Updated to match current implementation
## Testing
All 31 status-related unit tests passing:
- 18 tests in ContainerHealthStatusTest
- 8 tests in ExcludeFromHealthCheckTest (updated)
- 6 tests in PushServerUpdateJobStatusAggregationTest (new)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove the github_runner_sources migration that was accidentally
included in the health check status fix branch. This migration
is unrelated to the health check functionality and should be
developed separately.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit adds:
- Comprehensive docker-compose examples for health check testing
- GitHub runner sources database migration
- UI fix for service heading view
Files Added:
- DOCKER_COMPOSE_EXAMPLES.md - Documentation of health check test cases
- docker-compose.*.yml - Test files for various health check scenarios:
  - excluded.yml: Container with exclude_from_hc flag
  - healthy.yml: All containers healthy
  - unhealthy.yml: All containers unhealthy
  - unknown.yml: Container without healthcheck
  - mixed-healthy-unknown.yml: Mix of healthy and unknown
  - mixed-unhealthy-unknown.yml: Mix of unhealthy and unknown
- database/migrations/2025_11_19_115504_create_github_runner_sources_table.php
Files Modified:
- resources/views/livewire/project/service/heading.blade.php
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>