coolify/app
Andras Bacsai c6a2d1fe0a Fix stale lock issue causing scheduled tasks to stop (#4539)
## Problem
Scheduled tasks, backups, and auto-updates stopped working after 1-2 months
with error: MaxAttemptsExceededException: App\Jobs\ScheduledJobManager has
been attempted too many times.

Root cause: ScheduledJobManager used WithoutOverlapping with only
releaseAfter(60), causing locks without expiration (TTL=-1) that persisted
indefinitely when jobs hung or processes crashed.

## Solution

### Part 1: Prevention (Future Locks)
- Replaced releaseAfter(60) with expireAfter(60)->dontRelease() in the ScheduledJobManager middleware
- Lock now auto-expires after 60 seconds (matches everyMinute schedule)
- An overlapping run is now deleted instead of being released back onto the queue, so it no longer counts toward the attempt limit
- Follows Laravel best practices and matches other Coolify jobs
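
The middleware change described above might look roughly like the fragment below. This is a sketch, not the actual contents of app/Jobs/ScheduledJobManager.php: the lock key `scheduled-job-manager` and the empty `handle()` body are assumptions made for illustration, and the class only runs inside a Laravel application.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\Middleware\WithoutOverlapping;

class ScheduledJobManager implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function middleware(): array
    {
        return [
            // The key 'scheduled-job-manager' is a hypothetical lock name.
            (new WithoutOverlapping('scheduled-job-manager'))
                // Give the lock a 60-second TTL so a crashed worker cannot hold it forever.
                ->expireAfter(60)
                // Delete an overlapping job instead of releasing it back onto the queue.
                ->dontRelease(),
        ];
    }

    public function handle(): void
    {
        // Dispatching of the scheduled work is omitted from this sketch.
    }
}
```

With `expireAfter(60)` the lock key carries a TTL, so even if the process dies before the lock is released, the next minute's run can acquire it again.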

### Part 2: Recovery (Existing Locks)
- Enhanced cleanup:redis command with --clear-locks flag
- Scans Redis for stale locks (TTL=-1) and removes them
- Called automatically during app:init on startup/upgrade
- Provides immediate recovery for affected instances
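
The stale-lock check in Part 2 comes down to filtering keys by TTL. The sketch below shows only that filter in plain PHP, with an in-memory array standing in for Redis; the function name `findStaleLocks` and the key names are hypothetical and are not taken from CleanupRedis.php. In a real command the keys would come from a Redis SCAN and the TTL from the Redis TTL command.

```php
<?php

declare(strict_types=1);

/**
 * Return the keys whose TTL is -1, i.e. keys that exist but never expire.
 *
 * @param iterable<string>      $keys  candidate lock keys
 * @param callable(string): int $ttlOf returns a TTL in seconds, using Redis TTL semantics
 * @return list<string>
 */
function findStaleLocks(iterable $keys, callable $ttlOf): array
{
    $stale = [];
    foreach ($keys as $key) {
        // Redis TTL returns -1 for a key without an expiry and -2 for a missing key.
        if ($ttlOf($key) === -1) {
            $stale[] = $key;
        }
    }

    return $stale;
}

// In-memory stand-in for Redis; the key names are made up for this example.
$ttls = [
    'lock:a' => -1, // no expiry: stale
    'lock:b' => 42, // expires in 42 seconds: healthy
    'lock:c' => -2, // key no longer exists
];

$stale = findStaleLocks(array_keys($ttls), fn (string $key): int => $ttls[$key]);
print_r($stale);
```

Only keys reported with TTL -1 are treated as stale, so locks created with an expiry after the fix are left alone.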

## Changes
- app/Jobs/ScheduledJobManager.php: Added expireAfter(60)->dontRelease()
- app/Console/Commands/CleanupRedis.php: Added cleanupCacheLocks() method
- app/Console/Commands/Init.php: Auto-clear locks on startup
- tests/Unit/ScheduledJobManagerLockTest.php: Test to prevent regression
- STALE_LOCK_FIX.md: Complete documentation

## Testing
- Unit tests pass (2 tests, 8 assertions)
- Code formatted with Pint
- Matches pattern used by CleanupInstanceStuffsJob

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 10:07:33 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| Actions | Changes auto-committed by Conductor | 2025-10-16 17:13:47 +02:00 |
| Console | Fix stale lock issue causing scheduled tasks to stop (#4539) | 2025-10-23 10:07:33 +02:00 |
| Contracts | refactor: streamline job status retrieval and clean up repository interface | 2025-01-10 19:53:13 +01:00 |
| Data | Fix styling | 2024-06-19 06:59:46 +00:00 |
| Enums | Add new role enum and apply authorization | 2024-10-28 17:08:24 +01:00 |
| Events | work work on hetzner integration | 2025-10-09 16:54:13 +02:00 |
| Exceptions | feat(exceptions): introduce NonReportableException to handle known errors and update Handler for selective reporting | 2025-09-08 09:18:25 +02:00 |
| Helpers | feat(ssh-multiplexing): add connection age metadata handling to improve multiplexed connection management | 2025-09-10 08:38:36 +02:00 |
| Http | Changes auto-committed by Conductor | 2025-10-20 12:59:57 +02:00 |
| Jobs | Fix stale lock issue causing scheduled tasks to stop (#4539) | 2025-10-23 10:07:33 +02:00 |
| Listeners | refactor(proxy): streamline proxy status handling and improve dashboard availability checks | 2025-06-11 12:02:39 +02:00 |
| Livewire | fix: filter deprecated server types for Hetzner | 2025-10-22 00:13:55 +02:00 |
| Models | Changes auto-committed by Conductor | 2025-10-22 12:41:17 +02:00 |
| Notifications | Merge pull request #6837 from coollabsio/andrasbacsai/custom-webhooks | 2025-10-12 10:57:47 +02:00 |
| Policies | Changes auto-committed by Conductor | 2025-10-17 23:04:24 +02:00 |
| Providers | fix: register WebhookNotificationSettings with NotificationPolicy | 2025-10-10 17:48:14 +02:00 |
| Repositories | refactor: streamline job status retrieval and clean up repository interface | 2025-01-10 19:53:13 +01:00 |
| Rules | feat: add YAML validation for cloud-init scripts | 2025-10-11 13:56:55 +02:00 |
| Services | fix: filter deprecated server types for Hetzner | 2025-10-22 00:13:55 +02:00 |
| Support | feat(validation): centralize validation patterns for names and descriptions | 2025-08-19 12:14:48 +02:00 |
| Traits | Changes auto-committed by Conductor | 2025-10-16 09:51:37 +02:00 |
| View/Components | refactor: replace random ID generation with Cuid2 for unique HTML IDs in form components | 2025-10-16 12:54:14 +02:00 |