Commit graph

5 commits

Author SHA1 Message Date
Andras Bacsai
b22e79caec feat(jobs): improve scheduled tasks with retry logic and queue cleanup
- Add retry configuration to CoolifyTask (3 tries, 600s timeout)
- Add retry configuration to ScheduledTaskJob (3 tries, configurable timeout)
- Add retry configuration to DatabaseBackupJob (2 tries)
- Implement exponential backoff for all jobs (30s, 60s, 120s intervals)
- Add failed() handlers with comprehensive error logging to scheduled-errors channel
- Add execution tracking: started_at, retry_count, duration (decimal), error_details
- Add configurable timeout field to scheduled tasks (60-3600s, default 300s)
- Update UI to include timeout configuration in task creation/editing forms
- Increase ScheduledJobManager lock expiration from 60s to 90s for high-load environments
- Implement safe queue cleanup with restart vs runtime modes
  - Restart mode: aggressive cleanup (marks all processing jobs as failed)
  - Runtime mode: conservative cleanup (only marks jobs >12h as failed, skips deployments)
- Add cleanup:redis --restart flag for system startup
- Integrate cleanup into Dev.php init() for development environment
- Increase scheduled-errors log retention from 7 to 14 days
- Create comprehensive test suite (unit and feature tests)
- Add TESTING_GUIDE.md with manual testing instructions

Fixes issues with jobs failing after single attempt and "attempted too many times" errors
2025-11-10 11:11:18 +01:00
Andras Bacsai
c6a2d1fe0a Fix stale lock issue causing scheduled tasks to stop (#4539)
## Problem
Scheduled tasks, backups, and auto-updates stopped working after 1-2 months
with error: MaxAttemptsExceededException: App\Jobs\ScheduledJobManager has
been attempted too many times.

Root cause: ScheduledJobManager used WithoutOverlapping with only
releaseAfter(60), causing locks without expiration (TTL=-1) that persisted
indefinitely when jobs hung or processes crashed.

## Solution

### Part 1: Prevention (Future Locks)
- Added expireAfter(60) to ScheduledJobManager middleware
- Lock now auto-expires after 60 seconds (matches everyMinute schedule)
- Changed from releaseAfter(60) to expireAfter(60)->dontRelease()
- Follows Laravel best practices and matches other Coolify jobs

### Part 2: Recovery (Existing Locks)
- Enhanced cleanup:redis command with --clear-locks flag
- Scans Redis for stale locks (TTL=-1) and removes them
- Called automatically during app:init on startup/upgrade
- Provides immediate recovery for affected instances

## Changes
- app/Jobs/ScheduledJobManager.php: Added expireAfter(60)->dontRelease()
- app/Console/Commands/CleanupRedis.php: Added cleanupCacheLocks() method
- app/Console/Commands/Init.php: Auto-clear locks on startup
- tests/Unit/ScheduledJobManagerLockTest.php: Test to prevent regression
- STALE_LOCK_FIX.md: Complete documentation

## Testing
- Unit tests pass (2 tests, 8 assertions)
- Code formatted with Pint
- Matches pattern used by CleanupInstanceStuffsJob

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 10:07:33 +02:00
Andras Bacsai
ed93031a39 feat(docker): implement Docker cleanup processing in ScheduledJobManager; refactor server task scheduling to streamline cleanup job dispatching 2025-08-26 14:43:57 +02:00
Andras Bacsai
11341d7c2c refactor(jobs): remove logging for ScheduledJobManager and ServerResourceManager start and completion 2025-07-18 23:15:51 +02:00
Andras Bacsai
80fae306e6 feat(scheduling): introduce ScheduledJobManager and ServerResourceManager for enhanced job scheduling and resource management 2025-07-12 14:44:44 +02:00