Commit graph

1300 commits

Author SHA1 Message Date
Andras Bacsai
49a3bb0daf refactor(DatabaseBackupJob): remove retry attempts and backoff logic for job execution 2025-11-11 15:39:01 +01:00
Andras Bacsai
644df223dc fix(ScheduledTaskJob): make server property nullable and update logging to handle null values 2025-11-11 15:38:55 +01:00
Andras Bacsai
334892d1ff feat(BackupNotification): include database name in BackupFailed notification for better context 2025-11-11 15:27:57 +01:00
Andras Bacsai
133d6a0349 feat(DeploymentException): add custom exception for deployment errors and update handler to exclude from reporting 2025-11-11 15:08:26 +01:00
Andras Bacsai
64c7d301ce feat(DatabaseBackupJob, ScheduledTaskJob): enforce minimum timeout and add execution ID for timeout handling 2025-11-11 12:07:39 +01:00
Andras Bacsai
104e68a9ac
Merge branch 'next' into improve-scheduled-tasks 2025-11-11 11:38:04 +01:00
Andras Bacsai
e79316c8b5
Update app/Jobs/DeleteResourceJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-11-11 11:35:30 +01:00
Andras Bacsai
f9ab2a7ca8
Merge branch 'next' into improve-scheduled-tasks 2025-11-11 11:32:15 +01:00
Andras Bacsai
0039be49b2 fix(DeleteResourceJob): escape deployment UUID and stack name in Docker commands 2025-11-11 11:30:17 +01:00
Andras Bacsai
18a14037c7 fix: improve logging for PORT environment variable mismatch and ensure .env file is created in the correct directory 2025-11-10 14:56:27 +01:00
Andras Bacsai
0b8d3d395e fix: remove redundant process termination logic from deployment methods 2025-11-10 14:46:02 +01:00
Andras Bacsai
71c89d9ba8
Merge branch 'next' into improve-scheduled-tasks 2025-11-10 14:21:03 +01:00
Andras Bacsai
194d023f70
Enhance port detection and improve user notifications (#7184) 2025-11-10 13:56:09 +01:00
Andras Bacsai
99e97900a5 feat: add automated PORT environment variable detection and UI warnings
Add detection system for PORT environment variable to help users configure applications correctly:

- Add detectPortFromEnvironment() method to Application model to detect PORT env var
- Add getDetectedPortInfoProperty() computed property in General Livewire component
- Display contextual info banners in UI when PORT is detected:
  - Warning when PORT exists but ports_exposes is empty
  - Warning when PORT doesn't match ports_exposes configuration
  - Info message when PORT matches ports_exposes
- Add deployment logging to warn about PORT/ports_exposes mismatches
- Include comprehensive unit tests for port detection logic

The ports_exposes field remains authoritative for proxy configuration, while
PORT detection provides helpful suggestions to users.
2025-11-10 13:43:27 +01:00
Andras Bacsai
1580c0d3ad
Update app/Jobs/ScheduledTaskJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-11-10 11:41:50 +01:00
Andras Bacsai
0ea27ce37a
Cancel active deployments when a pull request is closed (#7164) 2025-11-10 11:16:54 +01:00
Andras Bacsai
b22e79caec feat(jobs): improve scheduled tasks with retry logic and queue cleanup
- Add retry configuration to CoolifyTask (3 tries, 600s timeout)
- Add retry configuration to ScheduledTaskJob (3 tries, configurable timeout)
- Add retry configuration to DatabaseBackupJob (2 tries)
- Implement exponential backoff for all jobs (30s, 60s, 120s intervals)
- Add failed() handlers with comprehensive error logging to scheduled-errors channel
- Add execution tracking: started_at, retry_count, duration (decimal), error_details
- Add configurable timeout field to scheduled tasks (60-3600s, default 300s)
- Update UI to include timeout configuration in task creation/editing forms
- Increase ScheduledJobManager lock expiration from 60s to 90s for high-load environments
- Implement safe queue cleanup with restart vs runtime modes
  - Restart mode: aggressive cleanup (marks all processing jobs as failed)
  - Runtime mode: conservative cleanup (only marks jobs >12h as failed, skips deployments)
- Add cleanup:redis --restart flag for system startup
- Integrate cleanup into Dev.php init() for development environment
- Increase scheduled-errors log retention from 7 to 14 days
- Create comprehensive test suite (unit and feature tests)
- Add TESTING_GUIDE.md with manual testing instructions

Fixes issues with jobs failing after single attempt and "attempted too many times" errors
2025-11-10 11:11:18 +01:00
Andras Bacsai
67605d50fc fix(deployment): prevent base deployments from being killed when PRs close (#7113)
- Fix container filtering to properly distinguish base deployments (pullRequestId=0) from PR deployments
- Add deployment cancellation when PR closes via webhook to prevent race conditions
- Prevent CleanupHelperContainersJob from killing active deployment containers
- Enhance error messages with exit codes and actual errors instead of vague "Oops" messages
- Protect status transitions in finally blocks to ensure proper job failure handling
2025-11-09 14:41:35 +01:00
Andras Bacsai
712d60c75b feat: ensure .env file exists for docker compose and auto-inject in payloads 2025-11-07 15:20:10 +01:00
Andras Bacsai
d0ee7d0412
Merge branch 'next' into feat-add-dockerfile-from-instruction-par 2025-11-06 09:24:54 +01:00
Andras Bacsai
88aa24057b fix: update environment variable mapping in deployment job 2025-11-06 09:21:41 +01:00
Andras Bacsai
dbf7957795 fix: inserting ARG statements in Dockerfile after FROM instructions 2025-11-06 08:54:35 +01:00
Andras Bacsai
f315e4bd9c feat: add dev_helper_version to instance settings and update related functionality 2025-11-03 08:38:43 +01:00
Andras Bacsai
1f158b9b35 fix: Improve custom_network_aliases handling and testing
The `is_array` check for `custom_network_aliases_array` was too strict and could lead to issues when the value was an empty string or null. This commit changes the check to `!empty()` for more robust handling.

Additionally, the unit tests for `custom_network_aliases` have been refactored to directly use the `Application::isConfigurationChanged()` method. This provides a more accurate and integrated test of the configuration change detection logic, rather than relying on a manual hash calculatio
2025-11-01 13:24:05 +01:00
Andras Bacsai
9a664865ee refactor: Improve handling of custom network aliases
The custom_network_aliases attribute in the Application model was being cast to an array directly. This commit refactors the attribute to provide both a string representation (for compatibility with older configurations and hashing) and an array representation for internal use. This ensures that network aliases are correctly parsed and utilized, preventing potential issues during deployment and configuration updates.
2025-11-01 13:13:14 +01:00
Andras Bacsai
aeba914bda refactor: remove deprecated next() method
The backward-compatible next() method is no longer needed since all
call sites have been updated to use the clearer method names:
- completeDeployment()
- failDeployment()
- transitionToStatus()

This completes the refactoring to make status transitions more explicit
and maintainable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 09:20:30 +01:00
Andras Bacsai
42f916dce2 fix: ensure deployment failure notifications are sent reliably
**Problem:**
Deployment failure notifications were not being sent due to two bugs:

1. **Timing Issue in next() function:**
   - When failed() called next(FAILED), the database still had status "in_progress"
   - The notification check looked for ALREADY failed status (not found yet)
   - Status was updated AFTER the check, losing the notification

2. **Direct Status Update:**
   - Healthcheck failures directly updated status to FAILED
   - Bypassed next() entirely, no notification sent

**Solution:**
Refactored status transition logic with clear separation of concerns:

- Moved notification logic AFTER status update (not before)
- Created transitionToStatus() as single source of truth
- Added completeDeployment() and failDeployment() for clarity
- Extracted status-specific side effects into dedicated methods
- Updated healthcheck failure to use failDeployment()

**Benefits:**
-  Notifications sent for ALL failure scenarios
-  Clear, self-documenting method names
-  Single responsibility per method
-  Type-safe using enum instead of strings
-  Harder to bypass notification logic accidentally
-  Backward compatible (old next() preserved)

**Changed:**
- app/Jobs/ApplicationDeploymentJob.php (+101/-21 lines)

Fixes #6911

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 09:06:59 +01:00
Andras Bacsai
c6a2d1fe0a Fix stale lock issue causing scheduled tasks to stop (#4539)
## Problem
Scheduled tasks, backups, and auto-updates stopped working after 1-2 months
with error: MaxAttemptsExceededException: App\Jobs\ScheduledJobManager has
been attempted too many times.

Root cause: ScheduledJobManager used WithoutOverlapping with only
releaseAfter(60), causing locks without expiration (TTL=-1) that persisted
indefinitely when jobs hung or processes crashed.

## Solution

### Part 1: Prevention (Future Locks)
- Added expireAfter(60) to ScheduledJobManager middleware
- Lock now auto-expires after 60 seconds (matches everyMinute schedule)
- Changed from releaseAfter(60) to expireAfter(60)->dontRelease()
- Follows Laravel best practices and matches other Coolify jobs

### Part 2: Recovery (Existing Locks)
- Enhanced cleanup:redis command with --clear-locks flag
- Scans Redis for stale locks (TTL=-1) and removes them
- Called automatically during app:init on startup/upgrade
- Provides immediate recovery for affected instances

## Changes
- app/Jobs/ScheduledJobManager.php: Added expireAfter(60)->dontRelease()
- app/Console/Commands/CleanupRedis.php: Added cleanupCacheLocks() method
- app/Console/Commands/Init.php: Auto-clear locks on startup
- tests/Unit/ScheduledJobManagerLockTest.php: Test to prevent regression
- STALE_LOCK_FIX.md: Complete documentation

## Testing
- Unit tests pass (2 tests, 8 assertions)
- Code formatted with Pint
- Matches pattern used by CleanupInstanceStuffsJob

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 10:07:33 +02:00
Andras Bacsai
587517394b Changes auto-committed by Conductor 2025-10-22 13:03:17 +02:00
Andras Bacsai
d8c89a1abf Changes auto-committed by Conductor 2025-10-21 20:39:39 +02:00
Andras Bacsai
aada45d856
Merge pull request #6876 from thereis/feat/update-applicationpullrequestupdatejob-documentation
feat: include service name in preview deployment updates
2025-10-16 10:10:03 +02:00
Andras Bacsai
41afa9568d fix: handle null environment variable values in bash escaping
Previously, the bash escaping functions (`escapeBashEnvValue()` and `escapeBashDoubleQuoted()`) had strict string type hints that rejected null values, causing deployment failures when environment variables had null values.

Changes:
- Updated both functions to accept nullable strings (`?string $value`)
- Handle null/empty values by returning empty quoted strings (`''` for single quotes, `""` for double quotes)
- Added 3 new tests to cover null and empty value handling
- All 29 tests pass

This fix ensures deployments work correctly even when environment variables have null values, while maintaining the existing behavior for all other cases.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-15 13:35:58 +02:00
Lucas Reis
23250d53c4
Merge branch 'next' into feat/update-applicationpullrequestupdatejob-documentation 2025-10-15 00:59:02 +02:00
Lucas Reis
232e030838 Include service name in preview deployment updates 2025-10-15 00:53:23 +02:00
Andras Bacsai
a9d899334f
Merge branch 'next' into allow-at-sign-in-git-urls 2025-10-14 20:46:08 +02:00
Andras Bacsai
7bdd53b3fb
Merge pull request #6871 from coollabsio/fix-static-publish-dir-slash
Fix static site publish directory double slash in build logs
2025-10-14 20:45:49 +02:00
Andras Bacsai
933a67645f
Update app/Jobs/ApplicationDeploymentJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-10-14 20:45:40 +02:00
Andras Bacsai
b81baff4b1 fix: improve logging and add shell escaping for git ls-remote
Two improvements to Git deployment handling:

1. **ApplicationDeploymentJob.php**:
   - Fixed log message to show actual resolved commit SHA (`$this->commit`)
   - Previously showed `$this->application->git_commit_sha` which could be "HEAD"
   - Now displays the actual 40-character commit SHA that will be deployed

2. **Application.php (generateGitLsRemoteCommands)**:
   - Added `escapeshellarg()` for repository URL in 'other' deployment type
   - Prevents shell injection in git ls-remote commands
   - Complements existing shell escaping in `generateGitImportCommands`
   - Ensures consistent security across all Git operations

**Security Impact:**
- All Git commands now use properly escaped repository URLs
- Prevents command injection through malicious repository URLs
- Consistent escaping in both ls-remote and clone operations

**User Experience:**
- Deployment logs now show exact commit SHA being deployed
- More accurate debugging information for deployment issues

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-14 20:44:35 +02:00
Andras Bacsai
91e070b2c3 fix: add missing save_runtime_environment_variables() in deploy_simple_dockerfile
Fixes pure Dockerfile deployment failing with 'env file not found' error.

The deploy_simple_dockerfile() method was missing the call to
save_runtime_environment_variables() which creates the .env file
needed during the rolling update phase. This call is present in
all other deployment methods (dockerfile, dockercompose, nixpacks,
static) but was missing here.

This ensures the .env file exists when docker compose tries to
use --env-file during the rolling update.
2025-10-14 20:43:11 +02:00
Andras Bacsai
1aea813b71 Fix static site publish directory double slash in build logs
- Strip leading slashes from publish_directory to prevent /app// paths
- Only add slash prefix if directory is not empty
- Ensures clean Docker COPY paths in build output
2025-10-14 17:15:41 +02:00
Andras Bacsai
893093fad3
Update app/Jobs/ApplicationDeploymentJob.php
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-10-14 15:21:38 +02:00
Andras Bacsai
bf00405971 fix(git): handle Git redirects and improve URL parsing for tangled.sh and other Git hosts
Fixes deployment failures when Git repositories redirect (e.g., tangled.sh → tangled.org)
and improves security by adding proper shell escaping for repository URLs.

**Root Cause:**
Git redirect warnings can appear on the same line as ls-remote output with no newline:
`warning: redirecting to https://tangled.org/...196d3df...	refs/heads/master`

The previous parsing logic split by newlines and extracted text before tabs, which
included the entire warning message instead of just the 40-character commit SHA.

**Changes:**

1. **Fixed commit SHA extraction** (ApplicationDeploymentJob.php):
   - Changed from line-based parsing to regex pattern matching
   - Uses `/([0-9a-f]{40})\s*\t/` to find valid 40-char hex commit SHA before tab
   - Handles warnings on same line, separate lines, multiple warnings, and whitespace
   - Added comprehensive Ray debug logs for troubleshooting

2. **Added security fix** (Application.php):
   - Added `escapeshellarg()` for repository URLs in 'other' deployment type
   - Prevents shell injection and fixes parsing issues with special characters like `@`
   - Added Ray debug logs for deployment type tracking

3. **Comprehensive test coverage** (GitLsRemoteParsingTest.php):
   - Tests normal output without warnings
   - Tests redirect warning on separate line
   - Tests redirect warning on same line (actual tangled.sh format)
   - Tests multiple warning lines
   - Tests extra whitespace handling

**Resolves:**
- Linear issue COOLGH-53: Valid git URLs are rejected as being invalid
- GitHub issue #6568: tangled.sh deployments failing
- Handles Git redirects universally for all Git hosting services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-14 11:55:17 +02:00
Andras Bacsai
8408faf897 Handle all ProcessStatus values in ApplicationPullRequestUpdateJob
- Add support for QUEUED, KILLED, and CANCELLED statuses
- Replace if-elseif chain with match expression for better exhaustiveness
- Add appropriate emoji indicators for each status
- Ensure all ProcessStatus enum values are handled

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-14 11:05:42 +02:00
Andras Bacsai
635af44539
Merge pull request #6837 from coollabsio/andrasbacsai/custom-webhooks
feat: add custom webhook notification support
2025-10-12 10:57:47 +02:00
Andras Bacsai
95fe04c484
Merge pull request #6817 from coollabsio/hetzner-do
Hetzner integration
2025-10-11 19:23:19 +02:00
Andras Bacsai
f50201152f refactor(backup): make backup_log_uuid initialization lazy
Changed backup_log_uuid property to nullable and removed eager initialization in constructor. This allows the ID to be generated when actually needed rather than upfront.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-11 13:42:50 +02:00
Andras Bacsai
dc15bee980 feat: implement actual webhook delivery with Ray debugging
Added actual HTTP POST delivery for webhook notifications and comprehensive Ray debugging for development.

Changes:
- Updated Team model to implement SendsWebhook interface
- Added routeNotificationForWebhook() method to Team
- Enhanced SendWebhookJob with Ray logging for request/response
- Added Ray debugging to WebhookChannel for dispatch tracking
- Added Ray debugging to Webhook Livewire component

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-10 18:07:04 +02:00
Andras Bacsai
413dee5d8c feat: implement actual webhook delivery
Implement full webhook delivery functionality:
- Create SendWebhookJob to handle HTTP POST requests
- Update WebhookChannel to dispatch webhook jobs
- Configure retry logic (5 attempts, 10s backoff)
- Update Test notification payload with success/message structure

Webhook payload structure:
{
  "success": true/false,
  "message": "notification message",
  "event": "event_type",
  "url": "coolify_dashboard_url"
}

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-10 17:59:17 +02:00
Andras Bacsai
00cb06150e fix: improve error logging and handling in ServerConnectionCheckJob for Hetzner server status 2025-10-10 10:12:59 +02:00
Andras Bacsai
f4e5c195fe refactor: replace direct SslCertificate queries with server relationship methods for consistency 2025-10-09 17:00:05 +02:00