For bonus points, this failure is in a cron job that sends out recently queued messages. It runs once every ten minutes - last weekend we had 12 failures: four were in a cluster on their own, one was in a run of two, and six were in a single continuous run.
Please note that this server is unused by our business so no messages ever get naturally queued. Every day we sync the live production server to this server at about 9 PM - assuming an employee was queuing up a message before the snapshot is taken there might be a number of unsent messages in the snapshot - those messages will all be sent by the first cron job after the sync.
It is a wonderfully awful problem that has me wanting to pull out my luscious locks.
For bonus points, this failure is in a cron job that sends out recently queued messages. It runs once every ten minutes - last weekend we had 12 failures: four were in a cluster on their own, one was in a run of two, and six were in a single continuous run.
Please note that this server is unused by our business so no messages ever get naturally queued. Every day we sync the live production server to this server at about 9 PM - assuming an employee was queuing up a message before the snapshot is taken there might be a number of unsent messages in the snapshot - those messages will all be sent by the first cron job after the sync.
It is a wonderfully awful problem that has me wanting to pull out my luscious locks.
I wish you luck. Also, maybe you could get someone else to get a fresh view of an issue