4630 commits

Author SHA1 Message Date
Andras Bacsai
c7fc0a271c feat(proxy): trigger version check after restart from UI
When a user restarts the proxy from the Navbar UI component, the system now automatically dispatches a version check job immediately after the restart completes. This provides immediate feedback about available Traefik updates without waiting for the weekly scheduled check.

Changes:
- Import CheckTraefikVersionForServerJob in Navbar component
- After successful proxy restart, dispatch version check for Traefik servers
- Version check only runs for servers using Traefik proxy

This ensures users get up-to-date version information right after restarting their proxy infrastructure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-18 10:19:04 +01:00
Andras Bacsai
f8dd44410a refactor(proxy): simplify getNewerBranchInfo method parameters and streamline version checks 2025-11-17 15:03:30 +01:00
Andras Bacsai
1270136da9 merge: merge next branch into feat-traefik-version-checker
Merged latest changes from the next branch to keep the feature branch
up to date. No conflicts were encountered during the merge.

Changes from next branch:
- Updated application deployment job error logging
- Updated server manager job and instance settings
- Removed PullHelperImageJob in favor of updated approach
- Database migration refinements
- Updated versions.json with latest component versions

All automatic merges were successful and no manual conflict resolution
was required.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 14:56:24 +01:00
Andras Bacsai
5d73b76a44 refactor(proxy): implement centralized caching for versions.json and improve UX
This commit introduces several improvements to the Traefik version tracking
feature and proxy configuration UI:

## Caching Improvements

1. **New centralized helper functions** (bootstrap/helpers/versions.php):
   - `get_versions_data()`: Redis-cached access to versions.json (1 hour TTL)
   - `get_traefik_versions()`: Extract Traefik versions from cached data
   - `invalidate_versions_cache()`: Clear cache when file is updated

2. **Performance optimization**:
   - Single Redis cache key: `coolify:versions:all`
   - Eliminates 2-4 file reads per page load
   - 95-97.5% reduction in disk I/O time
   - Shared cache across all servers in distributed setup

3. **Updated all consumers to use cached helpers**:
   - CheckTraefikVersionJob: Use get_traefik_versions()
   - Server/Proxy: Two-level caching (Redis + in-memory per-request)
   - CheckForUpdatesJob: Auto-invalidate cache after updating file
   - bootstrap/helpers/shared.php: Use cached data for Coolify version

## UI/UX Improvements

1. **Navbar warning indicator**:
   - Added yellow warning triangle icon next to "Proxy" menu item
   - Appears when server has outdated Traefik version
   - Uses existing traefik_outdated_info data for instant checks
   - Provides at-a-glance visibility of version issues

2. **Proxy sidebar persistence**:
   - Fixed sidebar disappearing when clicking "Switch Proxy"
   - Configuration link now always visible (needed for proxy selection)
   - Dynamic Configurations and Logs only show when proxy is configured
   - Better navigation context during proxy switching workflow

## Code Quality

- Added comprehensive PHPDoc for Server::$traefik_outdated_info property
- Improved code organization with centralized helper approach
- All changes formatted with Laravel Pint
- Maintains backward compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 14:53:28 +01:00
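
The caching approach described in 5d73b76a44 can be sketched roughly as follows. This is a language-neutral illustration in Python, not the project's PHP code: an in-memory dict stands in for Redis, the `loader` callable stands in for reading versions.json, and the `"traefik"` key is a hypothetical layout rather than the real file schema. Only the cache key, the 1 hour TTL, and the three helper names come from the commit message.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_KEY = "coolify:versions:all"  # key name taken from the commit message
TTL_SECONDS = 3600                  # 1 hour TTL, as stated in the commit message

# In-memory stand-in for Redis (assumption): key -> (expires_at, value).
_store: Dict[str, Tuple[float, Any]] = {}


def get_versions_data(loader: Callable[[], dict], now: Optional[float] = None) -> dict:
    """Return cached data; call loader() only on a miss or after expiry."""
    now = time.time() if now is None else now
    entry = _store.get(CACHE_KEY)
    if entry is not None and entry[0] > now:
        return entry[1]
    data = loader()
    _store[CACHE_KEY] = (now + TTL_SECONDS, data)
    return data


def get_traefik_versions(loader: Callable[[], dict], now: Optional[float] = None) -> dict:
    """Extract the Traefik section; the "traefik" key is a hypothetical layout."""
    return get_versions_data(loader, now).get("traefik", {})


def invalidate_versions_cache() -> None:
    """Drop the cached entry so the next read goes back to the loader."""
    _store.pop(CACHE_KEY, None)
```

Every consumer goes through the same key, so a single invalidation after the file is rewritten is enough to refresh all readers.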
Andras Bacsai
b602fef4db fix(deployment): improve error logging with exception types and hidden technical details
- Add exception class names to error messages for better debugging
- Mark technical details (error type, code, location, stack trace) as hidden in logs
- Preserve original exception types when wrapping in DeploymentException
- Update ServerManagerJob to include exception class in log messages
- Enhance unit tests to verify hidden log entry behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 14:44:39 +01:00
Andras Bacsai
97550f4066 fix(deployment): eliminate duplicate error logging in deployment methods
Wraps rolling_update(), health_check(), stop_running_container(), and
start_by_compose_file() with try-catch to ensure comprehensive error logging
happens in one place. Removes duplicate logging from intermediate catch blocks
since the failed() method already provides full error details including stack trace
and chained exception information.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 10:52:09 +01:00
Andras Bacsai
6593b2a553 feat(proxy): enhance Traefik version notifications to show patch and minor upgrades
- Store both patch update and newer minor version information simultaneously
- Display patch update availability alongside minor version upgrades in notifications
- Add newer_branch_target and newer_branch_latest fields to traefik_outdated_info
- Update all notification channels (Discord, Telegram, Slack, Pushover, Email, Webhook)
- Show upgrade targets in minor-version format (e.g., v3.6) instead of the full patch version
- Enhance UI callouts with clearer messaging about available upgrades
- Remove verbose logging in favor of cleaner code structure
- Handle edge case where SSH command returns empty response

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 09:59:17 +01:00
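
One way to picture the "patch update plus newer minor branch" logic from 6593b2a553 is the sketch below. It is an illustrative Python sketch under stated assumptions: the branch-to-latest mapping and the `latest` key are hypothetical, while `newer_branch_target` and `newer_branch_latest` are the field names mentioned in the commit message.

```python
from typing import Dict, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn "v3.5.2" or "3.5.2" into a comparable tuple such as (3, 5, 2)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def check_outdated(current: str, latest_by_branch: Dict[str, str]) -> Optional[dict]:
    """Report a same-branch patch update and a newer minor branch, if any.

    latest_by_branch maps a branch like "v3.5" to its newest release
    (hypothetical layout, not the real versions.json schema).
    """
    cur = parse(current)
    cur_branch = cur[:2]
    info: dict = {}

    # Patch update: a newer release inside the branch that is already running.
    same = latest_by_branch.get(f"v{cur_branch[0]}.{cur_branch[1]}")
    if same and parse(same) > cur:
        info["latest"] = same  # hypothetical key name

    # Minor upgrade: the highest branch above the running one.
    newer = [branch for branch in latest_by_branch if parse(branch) > cur_branch]
    if newer:
        target = max(newer, key=parse)
        info["newer_branch_target"] = target
        info["newer_branch_latest"] = latest_by_branch[target]

    return info or None
```

Both pieces of information are collected in one pass, so a notification can mention the patch update and the minor upgrade together.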
Andras Bacsai
cc6a538fca refactor(proxy): implement parallel processing for Traefik version checks
Addresses critical performance issues identified in code review by refactoring the monolithic CheckTraefikVersionJob into a distributed architecture with parallel processing.

Changes:
- Split version checking into CheckTraefikVersionForServerJob for parallel execution
- Extract notification logic into NotifyOutdatedTraefikServersJob
- Dispatch individual server checks concurrently to handle thousands of servers
- Add comprehensive unit tests for the new job architecture
- Update feature tests to cover the refactored workflow

Performance improvements:
- Sequential SSH calls replaced with parallel queue jobs
- Scales efficiently for large installations with thousands of servers
- Reduces job execution time from hours to minutes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 11:42:58 +01:00
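
The move from sequential checks to per-server parallel work in cc6a538fca can be illustrated with a small sketch. Note the substitution: the commit uses queue jobs, whereas this Python sketch uses a thread pool purely to show the fan-out and collect pattern. `check_one` is a hypothetical stand-in for the per-server version check.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def check_all(
    servers: Iterable[str],
    check_one: Callable[[str], bool],
    workers: int = 8,
) -> List[str]:
    """Run check_one for every server concurrently.

    Returns the servers for which check_one reported True (outdated),
    in the original input order.
    """
    servers = list(servers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order even though calls run in parallel.
        results = list(pool.map(check_one, servers))
    return [server for server, outdated in zip(servers, results) if outdated]
```

Collecting the results in one place mirrors the split between per-server check jobs and a single notification step.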
Andras Bacsai
7a16938f0c fix(proxy): prevent "container name already in use" error during proxy restart
Add wait loops to ensure containers are fully removed before restarting.
This fixes race conditions where docker compose would fail because an
existing container was still being cleaned up.

Changes:
- StartProxy: Add an explicit stop and a wait loop before docker compose up
- StopProxy: Add wait loop after container removal
- Both actions now poll up to 10 seconds for complete removal
- Add error suppression to handle non-existent containers gracefully

Tests:
- Add StartProxyTest.php with 3 tests for cleanup logic
- Add StopProxyTest.php with 4 tests for stop behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 11:35:22 +01:00
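
The wait loop described in 7a16938f0c (poll up to 10 seconds until the container is gone) might look roughly like this. It is a minimal Python sketch, not the actual StartProxy/StopProxy code: `container_exists` is a hypothetical callable standing in for a Docker lookup, and the 1 second interval is an assumption.

```python
import time
from typing import Callable


def wait_until_removed(
    container_exists: Callable[[], bool],
    timeout_seconds: float = 10.0,   # "up to 10 seconds", per the commit message
    interval_seconds: float = 1.0,   # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until container_exists() is False or the timeout elapses.

    Returns True if the container disappeared in time, False otherwise.
    """
    waited = 0.0
    while container_exists():
        if waited >= timeout_seconds:
            return False
        sleep(interval_seconds)
        waited += interval_seconds
    return True
```

Starting the new container only after this returns True avoids the "container name already in use" race the commit describes.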
Andras Bacsai
11a7f4c8a7 fix(performance): eliminate N+1 query in CheckTraefikVersionJob
This commit fixes a critical N+1 query issue in CheckTraefikVersionJob
that was loading ALL proxy servers into memory then filtering in PHP,
causing potential OOM errors with thousands of servers.

Changes:
- Added scopeWhereProxyType() query scope to Server model for
  database-level filtering using JSON column arrow notation
- Updated CheckTraefikVersionJob to use new scope instead of
  collection filter, moving proxy type filtering into the SQL query
- Added comprehensive unit tests for the new query scope

Performance impact:
- Before: SELECT * FROM servers WHERE proxy IS NOT NULL (all servers)
- After: SELECT * FROM servers WHERE proxy->>'type' = 'TRAEFIK' (filtered)
- Eliminates memory overhead of loading non-Traefik servers
- Critical for cloud instances with thousands of connected servers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 11:35:22 +01:00
Andras Bacsai
8c77c63043 feat(proxy): add Traefik version tracking with notifications and dismissible UI warnings
- Add automated Traefik version checking job running weekly on Sundays
- Implement version detection from running containers and comparison with versions.json
- Add notifications across all channels (Email, Discord, Slack, Telegram, Pushover, Webhook) for outdated versions
- Create dismissible callout component with localStorage persistence
- Display cross-branch upgrade warnings (e.g., v3.5 -> v3.6) with changelog links
- Show patch update notifications within same branch
- Add warning icon that appears when callouts are dismissed
- Prevent duplicate notifications during proxy restart by adding restarting parameter
- Fix notification spam with transition-based logic for status changes
- Enable system email settings by default in development mode
- Track last saved/applied proxy settings to detect configuration drift
2025-11-14 11:35:22 +01:00
Andras Bacsai
318cd18dde fix: remove PullHelperImageJob and mass server scheduling
Stop dispatching PullHelperImageJob to thousands of servers when the helper image version changes. Instead, rely on Docker's automatic image pulling during actual deployments and backups. Inline the helper image pull in UpdateCoolify for the single use case.

This eliminates queue flooding on cloud instances while maintaining all functionality through Docker's built-in image management.
2025-11-14 11:31:08 +01:00
Andras Bacsai
ec30426a2f feat(ServiceDatabase): add support for TimescaleDB detection and database type identification 2025-11-12 00:36:38 +01:00
Andras Bacsai
6202803db2 fix(CleanupRedis): guard against scan() returning false and use lowercase option keys
- Change Redis scan() option keys from uppercase (MATCH, COUNT) to lowercase (match, count) to comply with PhpRedis requirements
- Add guard to handle scan() returning false and display error message
- Add comprehensive test coverage for scan() error handling scenarios
2025-11-11 21:22:29 +01:00
Andras Bacsai
ad69758c56 refactor(CleanupRedis): remove JSON decode error handling from cleanupStuckJobs method 2025-11-11 20:54:25 +01:00
Andras Bacsai
b79aa1b195 refactor(CleanupRedis): optimize key retrieval in cleanupStuckJobs using Redis scan 2025-11-11 15:41:05 +01:00
Andras Bacsai
a95e92f098 feat(CleanupRedis): add error handling for JSON decode failures in cleanupStuckJobs method 2025-11-11 15:40:11 +01:00
Andras Bacsai
49a3bb0daf refactor(DatabaseBackupJob): remove retry attempts and backoff logic for job execution 2025-11-11 15:39:01 +01:00
Andras Bacsai
644df223dc fix(ScheduledTaskJob): make server property nullable and update logging to handle null values 2025-11-11 15:38:55 +01:00
Andras Bacsai
eb70fe00ff feat(CleanupRedis): add error handling for JSON decode failures in cleanupStuckJobs method 2025-11-11 15:36:34 +01:00
Andras Bacsai
4fa0c581c8 fix(ScheduledTask): change timeout property type to int for consistency in syncData method 2025-11-11 15:30:10 +01:00
Andras Bacsai
334892d1ff feat(BackupNotification): include database name in BackupFailed notification for better context 2025-11-11 15:27:57 +01:00
Andras Bacsai
684a08bf75 feat(CleanupRedis): improve stuck job cleanup logic by prioritizing reserved_at timestamp 2025-11-11 15:27:52 +01:00
Andras Bacsai
133d6a0349 feat(DeploymentException): add custom exception for deployment errors and update handler to exclude from reporting 2025-11-11 15:08:26 +01:00
Andras Bacsai
0d14bc1df7 feat(EmailChannel): enhance error handling with user-friendly messages for Resend API errors 2025-11-11 13:23:45 +01:00
Andras Bacsai
0cfce06869 feat(Cleanup): implement failure marking for stuck scheduled tasks and database backups during startup 2025-11-11 12:32:52 +01:00
Andras Bacsai
64c7d301ce feat(DatabaseBackupJob, ScheduledTaskJob): enforce minimum timeout and add execution ID for timeout handling 2025-11-11 12:07:39 +01:00
Andras Bacsai
104e68a9ac Merge branch 'next' into improve-scheduled-tasks 2025-11-11 11:38:04 +01:00
Andras Bacsai
e79316c8b5 Update app/Jobs/DeleteResourceJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-11-11 11:35:30 +01:00
Andras Bacsai
75e8605c3d Merge branch 'next' into port-detection-lol 2025-11-11 11:32:17 +01:00
Andras Bacsai
f9ab2a7ca8 Merge branch 'next' into improve-scheduled-tasks 2025-11-11 11:32:15 +01:00
Andras Bacsai
0039be49b2 fix(DeleteResourceJob): escape deployment UUID and stack name in Docker commands 2025-11-11 11:30:17 +01:00
Andras Bacsai
45ab79f292 Merge branch 'next' into port-detection-lol 2025-11-11 11:21:26 +01:00
Andras Bacsai
a12dd98f64 Merge branch 'next' into fix-deployment-skipped-message 2025-11-10 21:33:10 +01:00
Andras Bacsai
fd50f72889 fix: remove duplicate deployment queue call causing false error messages
Removed duplicate queue_application_deployment() call in Heading.php deploy method that was causing "Deployment already queued for this commit" error to display even though deployment was successfully queued.

Also changed notification type from 'success' to 'error' when deployment is actually skipped for proper user feedback.
2025-11-10 21:31:06 +01:00
Andras Bacsai
f1d80d6776 fix: enhance error handling in initialization and cleanup process 2025-11-10 15:29:26 +01:00
Andras Bacsai
23c165d4d1 fix: wrap database updates in a transaction for consistency in GetContainersStatus 2025-11-10 15:07:44 +01:00
Andras Bacsai
761f177b1e fix: move restart count reset logic to the correct position in the restart method 2025-11-10 14:59:29 +01:00
Andras Bacsai
cefb425492 Update app/Livewire/Project/Application/Heading.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-11-10 14:58:08 +01:00
Andras Bacsai
18a14037c7 fix: improve logging for PORT environment variable mismatch and ensure .env file is created in the correct directory 2025-11-10 14:56:27 +01:00
Andras Bacsai
0b8d3d395e fix: remove redundant process termination logic from deployment methods 2025-11-10 14:46:02 +01:00
Andras Bacsai
9507f602df fix: ensure service state is refreshed and compose configurations are saved after submission 2025-11-10 14:44:11 +01:00
Andras Bacsai
f5fa09790e refactor: improve command handling and ensure correct working directory for Docker operations 2025-11-10 14:40:03 +01:00
Andras Bacsai
71c89d9ba8 Merge branch 'next' into improve-scheduled-tasks 2025-11-10 14:21:03 +01:00
Andras Bacsai
6decad2e96 refactor: streamline required port retrieval in EditDomain and ServiceApplicationView; add environment_variables method in ServiceApplication 2025-11-10 14:15:53 +01:00
Andras Bacsai
e63a270fea Enhance container status tracking and improve user notifications (#7182) 2025-11-10 13:58:22 +01:00
Andras Bacsai
194d023f70 Enhance port detection and improve user notifications (#7184) 2025-11-10 13:56:09 +01:00
Andras Bacsai
99e97900a5 feat: add automated PORT environment variable detection and UI warnings
Add detection system for PORT environment variable to help users configure applications correctly:

- Add detectPortFromEnvironment() method to Application model to detect PORT env var
- Add getDetectedPortInfoProperty() computed property in General Livewire component
- Display contextual info banners in UI when PORT is detected:
  - Warning when PORT exists but ports_exposes is empty
  - Warning when PORT doesn't match ports_exposes configuration
  - Info message when PORT matches ports_exposes
- Add deployment logging to warn about PORT/ports_exposes mismatches
- Include comprehensive unit tests for port detection logic

The ports_exposes field remains authoritative for proxy configuration, while
PORT detection provides helpful suggestions to users.
2025-11-10 13:43:27 +01:00
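
The three banner cases listed in 99e97900a5 can be sketched as a small classification function. This is an illustrative Python sketch, not the Application model's `detectPortFromEnvironment()` method. It assumes `ports_exposes` is a comma-separated string of port numbers, and the state names `empty`, `mismatch`, and `match` are hypothetical.

```python
from typing import Dict, Optional


def detect_port_info(env: Dict[str, str], ports_exposes: str) -> Optional[dict]:
    """Compare a PORT environment variable against the exposed ports.

    Returns None when PORT is absent or not numeric; otherwise a dict with
    the detected port and one of three states.
    """
    raw = env.get("PORT", "").strip()
    if not raw.isdigit():
        return None
    port = int(raw)

    # Assumed format: "3000,8080" (comma-separated port numbers).
    exposed = [int(p.strip()) for p in ports_exposes.split(",") if p.strip().isdigit()]

    if not exposed:
        state = "empty"      # warning: PORT exists but ports_exposes is empty
    elif port not in exposed:
        state = "mismatch"   # warning: PORT does not match ports_exposes
    else:
        state = "match"      # info: PORT matches ports_exposes
    return {"port": port, "state": state}
```

The function only reports what it finds and never changes `ports_exposes`, which stays authoritative for proxy configuration.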
Andras Bacsai
68a9f2ca77 feat: add container restart tracking and crash loop detection
Track container restart counts from Docker and detect crash loops to provide better visibility into application health issues.

- Add restart_count, last_restart_at, and last_restart_type columns to applications table
- Detect restart count increases from Docker inspect data and send notifications
- Show restart count badge in UI with warning icon on Logs navigation
- Distinguish between crash restarts and manual restarts
- Implement 30-second grace period to prevent false "exited" status during crash loops
- Reset restart count on manual stop, restart, and redeploy actions
- Add unit tests for restart count tracking logic

This helps users quickly identify when containers are in crash loops and need attention, even when the container status flickers between states during Docker's restart backoff period.
2025-11-10 13:04:31 +01:00
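
The restart tracking and grace-period behaviour described in 68a9f2ca77 can be approximated with the sketch below. It is a minimal Python sketch under stated assumptions, not the project's implementation: `AppState` and `apply_inspect` are hypothetical names, and the inputs stand in for values read from Docker inspect data. The 30 second grace period and the `restart_count` / `last_restart_at` fields come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_SECONDS = 30  # grace period stated in the commit message


@dataclass
class AppState:
    restart_count: int = 0
    last_restart_at: Optional[float] = None
    status: str = "running"


def apply_inspect(state: AppState, docker_restart_count: int,
                  docker_status: str, now: float) -> bool:
    """Update state from one inspect reading.

    Returns True when the restart count increased, which is the point
    where a notification would be sent.
    """
    increased = docker_restart_count > state.restart_count
    if increased:
        state.restart_count = docker_restart_count
        state.last_restart_at = now

    in_grace = (
        state.last_restart_at is not None
        and now - state.last_restart_at < GRACE_SECONDS
    )
    # During the grace period, ignore a transient "exited" reading so the
    # status does not flicker while Docker backs off between restarts.
    if not (docker_status == "exited" and in_grace):
        state.status = docker_status
    return increased


def reset_restart_count(state: AppState) -> None:
    """Reset on manual stop, restart, or redeploy."""
    state.restart_count = 0
    state.last_restart_at = None
```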
Andras Bacsai
1580c0d3ad Update app/Jobs/ScheduledTaskJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-11-10 11:41:50 +01:00