- Add retry configuration to CoolifyTask (3 tries, 600s timeout)
- Add retry configuration to ScheduledTaskJob (3 tries, configurable timeout)
- Add retry configuration to DatabaseBackupJob (2 tries)
- Implement exponential backoff for all jobs (30s, 60s, 120s intervals)
- Add failed() handlers with comprehensive error logging to scheduled-errors channel
- Add execution tracking: started_at, retry_count, duration (decimal), error_details
- Add configurable timeout field to scheduled tasks (60-3600s, default 300s)
- Update UI to include timeout configuration in task creation/editing forms
- Increase ScheduledJobManager lock expiration from 60s to 90s for high-load environments
- Implement safe queue cleanup with restart vs runtime modes
  - Restart mode: aggressive cleanup (marks all processing jobs as failed)
  - Runtime mode: conservative cleanup (only marks jobs >12h as failed, skips deployments)
- Add cleanup:redis --restart flag for system startup
- Integrate cleanup into Dev.php init() for development environment
- Increase scheduled-errors log retention from 7 to 14 days
- Create comprehensive test suite (unit and feature tests)
- Add TESTING_GUIDE.md with manual testing instructions

Fixes issues with jobs failing after single attempt and "attempted too many times" errors
Testing Guide: Scheduled Tasks Improvements
Overview
This guide covers testing all the improvements made to the scheduled tasks system, including retry logic, timeout handling, and error logging.
Jobs Modified
- CoolifyTask - Infrastructure job for SSH operations (3 tries, 600s timeout)
- ScheduledTaskJob - Scheduled container commands (3 tries, configurable timeout)
- DatabaseBackupJob - Database backups (2 tries, existing timeout)
Quick Test Commands
Run Unit Tests (No Database Required)
./vendor/bin/pest tests/Unit/ScheduledJobsRetryConfigTest.php
Run Feature Tests (Requires Database - Run in Docker)
docker exec coolify php artisan test --filter=CoolifyTaskRetryTest
Manual Testing
1. Test ScheduledTaskJob ✅ (You tested this)
How to test:
- Create a scheduled task in the UI
- Set a short frequency (every minute)
- Monitor execution in the UI
- Check logs:
storage/logs/scheduled-errors-2025-11-09.log
What to verify:
- Task executes successfully
- Duration is recorded (in seconds with 2 decimal places)
- Retry count is tracked
- Timeout configuration is respected
2. Test DatabaseBackupJob ✅ (You tested this)
How to test:
- Create a scheduled database backup
- Set frequency to manual or very short interval
- Trigger backup manually or wait for schedule
- Check logs for any errors
What to verify:
- Backup completes successfully
- Retry logic works if there's a transient failure
- Error logging is consistent
- Backoff timing is correct (60s, 300s)
3. Test CoolifyTask ⚠️ (IMPORTANT - Not tested yet)
CoolifyTask is used throughout the application for ALL SSH operations. Here are multiple ways to test it:
Option A: Server Validation (Easiest)
- Go to Servers in Coolify UI
- Select any server
- Click "Validate Server" or "Check Connection"
- This triggers CoolifyTask jobs
- Check Horizon dashboard for job processing
- Check logs:
storage/logs/scheduled-errors-2025-11-09.log
Option B: Container Operations
- Go to any Application or Service
- Try these actions (each triggers CoolifyTask):
  - Restart container
  - View logs
  - Execute command in container
- Monitor Horizon for job processing
- Check logs for errors
Option C: Application Deployment
- Deploy or redeploy any application
- This triggers MANY CoolifyTask jobs
- Watch Horizon dashboard - you should see:
  - Jobs being dispatched
  - Jobs completing successfully
  - If any fail, they should retry (check "Failed Jobs")
- Check logs for retry attempts
Option D: Docker Cleanup
- Wait for or trigger Docker cleanup (runs on schedule)
- This uses CoolifyTask for cleanup commands
- Check logs:
storage/logs/scheduled-errors-2025-11-09.log
Monitoring & Verification
Horizon Dashboard
- Open Horizon: /horizon
- Watch these sections:
  - Recent Jobs - See jobs being processed
  - Failed Jobs - Jobs that failed permanently after retries
  - Monitoring - Job throughput and wait times
Log Monitoring
# Watch scheduled errors in real-time
tail -f storage/logs/scheduled-errors-2025-11-09.log
# Check for specific job errors
grep "CoolifyTask" storage/logs/scheduled-errors-2025-11-09.log
grep "ScheduledTaskJob" storage/logs/scheduled-errors-2025-11-09.log
grep "DatabaseBackupJob" storage/logs/scheduled-errors-2025-11-09.log
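The commands above hard-code one day's file name, while the log name in this guide carries a date suffix. A minimal sketch, assuming the `scheduled-errors-YYYY-MM-DD.log` pattern holds for other days, that builds the current path and counts matching lines per job. The helper names `scheduled_errors_log` and `count_job_errors` are hypothetical and not part of the project.

```shell
#!/bin/sh
# Build the path of the daily scheduled-errors log for a given date
# (defaults to today). The file name pattern is taken from this guide.
scheduled_errors_log() {
    day="${1:-$(date +%Y-%m-%d)}"
    printf 'storage/logs/scheduled-errors-%s.log\n' "$day"
}

# Count log lines that mention a job class; prints 0 if the file is missing.
count_job_errors() {
    job="$1"
    log="$2"
    if [ -f "$log" ]; then
        # grep -c prints 0 and exits non-zero on no match, so ignore the status
        grep -c -- "$job" "$log" || true
    else
        echo 0
    fi
}

# Summarize today's log for the three jobs covered by this guide.
for name in CoolifyTask ScheduledTaskJob DatabaseBackupJob; do
    printf '%s: %s\n' "$name" "$(count_job_errors "$name" "$(scheduled_errors_log)")"
done
```

Run it from the application root so the relative storage/logs path resolves. Pass a date such as `scheduled_errors_log 2025-11-09` to inspect an earlier day.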
Database Verification
-- Check execution tracking
SELECT * FROM scheduled_task_executions
ORDER BY created_at DESC
LIMIT 10;
-- Verify duration is decimal (not throwing errors)
SELECT id, duration, retry_count, started_at, finished_at
FROM scheduled_task_executions
WHERE duration IS NOT NULL;
-- Check for tasks with retries
SELECT * FROM scheduled_task_executions
WHERE retry_count > 0;
Expected Behavior
✅ Success Indicators
- Jobs Complete Successfully
  - Horizon shows completed jobs
  - No errors in scheduled-errors log
  - Execution records in database
- Retry Logic Works
  - Failed jobs retry automatically
  - Backoff timing is respected (30s, 60s, etc.)
  - Jobs marked failed only after all retries exhausted
- Timeout Enforcement
  - Long-running jobs terminate at timeout
  - Timeout is configurable per task
  - No hanging jobs
- Error Logging
  - All errors logged to storage/logs/scheduled-errors-2025-11-09.log
  - Consistent format with job name, attempt count, error details
  - Trace included for debugging
- Execution Tracking
  - Duration recorded correctly (decimal with 2 places)
  - Retry count incremented on failures
  - Started/finished timestamps accurate
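To check backoff timing against timestamps in the logs, it helps to know the expected delay before each retry. The sketch below reproduces the 30s, 60s, 120s intervals by doubling a 30-second base. The doubling rule and the function name `backoff_delay` are assumptions made for illustration; the jobs may simply hard-code a list of delays, and the DatabaseBackupJob section of this guide lists different values (60s, 300s).

```shell
#!/bin/sh
# Print the expected delay in seconds before retry number "attempt",
# doubling from a base delay (default 30s). Illustrative only: the
# real jobs may define their backoff differently.
backoff_delay() {
    attempt="$1"
    base="${2:-30}"
    delay="$base"
    i=1
    while [ "$i" -lt "$attempt" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    echo "$delay"
}

# Show the schedule for the first three retries.
for n in 1 2 3; do
    printf 'retry %s after %ss\n' "$n" "$(backoff_delay "$n")"
done
```

Compare the gaps between consecutive attempt entries for one job in the log with these values. Allow some slack for queue wait time.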
Troubleshooting
Issue: Jobs fail immediately without retrying
Check:
- Verify $tries property is set on the job
- Check if exception is being caught and re-thrown correctly
- Look for maxExceptions being reached
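The first check can be done from the command line. This is a minimal sketch that greps a job class for a public `$tries` property. The helper name `job_has_tries` is hypothetical, and the app/Jobs/ paths are an assumption about where the job classes live; adjust them to the actual locations.

```shell
#!/bin/sh
# Succeed if the given PHP file declares a public $tries property,
# with or without a type hint (e.g. "public $tries = 3;").
job_has_tries() {
    grep -Eq 'public[[:space:]]+([^[:space:]$]+[[:space:]]+)?\$tries[[:space:]]*=' "$1"
}

# Assumed file locations; files that do not exist are skipped.
for f in app/Jobs/CoolifyTask.php app/Jobs/ScheduledTaskJob.php app/Jobs/DatabaseBackupJob.php; do
    [ -f "$f" ] || continue
    if job_has_tries "$f"; then
        echo "$f: \$tries is set"
    else
        echo "$f: \$tries not found"
    fi
done
```

A "not found" result only means the property is not declared in that file; it could still be inherited or set elsewhere.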
Issue: "Invalid text representation" errors
Fix Applied:
- Duration field changed from integer to decimal(10,2)
- If you see this, run migrations again
Issue: Jobs not appearing in Horizon
Check:
- Horizon is running (php artisan horizon)
- Queue workers are active
- Job is dispatched to correct queue ('high' for these jobs)
Issue: Timeout not working
Check:
- Timeout is set on job (CoolifyTask: 600s, ScheduledTask: configurable)
- PHP max_execution_time allows job timeout
- Queue worker timeout is higher than job timeout
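The timeout relationships above can be sanity-checked with plain arithmetic before you go through the configuration. A small sketch, assuming the 60-3600s range from the change description applies to the task timeout field. The function names are hypothetical, and the values passed in are ones you read from your own configuration.

```shell
#!/bin/sh
# Succeed if the value is a whole number inside the 60-3600s range.
valid_task_timeout() {
    t="$1"
    case "$t" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$t" -ge 60 ] && [ "$t" -le 3600 ]
}

# Succeed if the worker timeout (second argument) is strictly greater
# than the job timeout (first argument), so the job's own timeout
# fires before the worker is killed.
worker_outlives_job() {
    [ "$2" -gt "$1" ]
}

# Example: CoolifyTask's 600s timeout against a 660s worker timeout
# (the 660s figure is an illustrative value, not a project default).
if worker_outlives_job 600 660; then
    echo "worker timeout is high enough"
else
    echo "raise the worker timeout above the job timeout"
fi
```

If the second check fails, the worker can be terminated before the job's own timeout handling runs.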
Test Checklist
- Unit tests pass: ./vendor/bin/pest tests/Unit/ScheduledJobsRetryConfigTest.php
- ScheduledTaskJob tested manually ✅
- DatabaseBackupJob tested manually ✅
- CoolifyTask tested manually (server validation, container ops, or deployment)
- Retry logic verified (force a failure, watch retry attempts)
- Timeout enforcement tested (create long-running task with short timeout)
- Error logs checked: storage/logs/scheduled-errors-2025-11-09.log
- Horizon dashboard shows jobs processing correctly
- Database execution records show duration as decimal
- UI shows timeout configuration field for scheduled tasks
Next Steps After Testing
- If all tests pass, run migrations on production/staging: php artisan migrate
- Monitor logs for the first 24 hours: tail -f storage/logs/scheduled-errors-2025-11-09.log
- Check Horizon for any failed jobs needing attention
- Verify existing scheduled tasks now have retry capability
Questions?
If you encounter issues:
- Check storage/logs/scheduled-errors-2025-11-09.log first
- Check storage/logs/laravel.log for general errors
- Look at Horizon "Failed Jobs" for detailed error info
- Review database execution records for patterns