I work for a small local government where we have 40+ scheduled batch jobs that run at various intervals. One of our weekly tasks in IT is to check that all the required production jobs are running and that none have timed out. Is there a good way to automate this?
The criteria are that for each of these 40 jobs, there should be an instance of the job in "waiting" status with a scheduled start time in the future. I would like to get an email once a week notifying me that the jobs are healthy (or not).
The built-in alerts aren't helping. They only tell us when a job ends or fails, which is not useful for a job that runs every 4 minutes, or for a job that gets stuck in Executing or Waiting status and never fires.
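
To make the criteria concrete, here is a minimal Python sketch of the check described above. It assumes the job instances have already been exported from the scheduler into simple records. The `JobInstance` shape and the names `find_unhealthy_jobs` and `build_report` are hypothetical and not part of any scheduler's API, and how the records are fetched is left out because it depends on the system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobInstance:
    # Hypothetical record shape; real field names depend on the scheduler.
    name: str
    status: str
    scheduled_start: datetime


def find_unhealthy_jobs(required_jobs, instances, now):
    """Return sorted names of required jobs that have no instance in
    'waiting' status with a scheduled start later than `now`."""
    healthy = {
        inst.name
        for inst in instances
        if inst.status.lower() == "waiting" and inst.scheduled_start > now
    }
    return sorted(set(required_jobs) - healthy)


def build_report(required_jobs, instances, now):
    """Build the plain-text body for the weekly health email."""
    unhealthy = find_unhealthy_jobs(required_jobs, instances, now)
    if not unhealthy:
        return f"All {len(set(required_jobs))} required jobs are healthy."
    lines = [f"- {name}" for name in unhealthy]
    return "Jobs needing attention:\n" + "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    now = datetime.now()
    sample = [
        JobInstance("Job A", "Waiting", now + timedelta(minutes=4)),
        JobInstance("Job B", "Executing", now - timedelta(hours=3)),
        JobInstance("Job C", "Waiting", now - timedelta(days=1)),
    ]
    print(build_report(["Job A", "Job B", "Job C"], sample, now))
```

In this sketch, a job stuck in Executing and a job sitting in Waiting with a start time already in the past are both reported, since neither has a waiting instance scheduled in the future. The report text could then be mailed with Python's standard `smtplib` from a weekly scheduled task.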