
Job monitoring

Monitor the execution of any job (scripts, executables, cron jobs, etc.).

Preface

Together with cagent 1.3.0, a command-line utility named jobmon has been introduced. It is included in all cagent packages. Jobmon can be used as a wrapper around any executable script or binary. Jobmon catches the exit code and submits it to CloudRadar.

At a glance

Just wrap any command into jobmon and verify the exit status. Examples:
On Linux: jobmon -id myBackup -- rsync -aq /etc /var/backups/
On Windows: jobmon -id myBackup -- robocopy /MIR C:\Users\Administrator\Documents C:\Backups
Supervise the results of a job on my.cloudradar.io

The jobmon wrapper

Jobmon executes an executable (script, binary, etc.) and catches the output, the errors and the exit code. It stores the results in a queue, and cagent sends the data during its next cycle.
If the exit code is not 0, CloudRadar triggers an alert automatically. You must assign each job a unique id so it can be identified later. Jobmon can be used with the following options.
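To make the idea of "catching" a job's result concrete, here is a minimal shell sketch of the general technique: run a command, keep only the last 4 KB of stdout and stderr, and hand back the exit code. This is an illustration only and not jobmon's actual implementation; the function name run_and_capture and its arguments are hypothetical.

```shell
#!/bin/sh
# Illustration only: this is NOT how jobmon is implemented.
# run_and_capture (hypothetical helper) runs a command, stores the last
# 4 KB of its stdout and stderr in the given files, and returns the
# command's exit code to the caller.
run_and_capture() {
    out_file=$1
    err_file=$2
    shift 2

    tmp_out=$(mktemp) || return 1
    tmp_err=$(mktemp) || { rm -f "$tmp_out"; return 1; }

    # Run the wrapped command and remember its exit code.
    code=0
    "$@" >"$tmp_out" 2>"$tmp_err" || code=$?

    # Keep only the last 4096 bytes of each stream.
    tail -c 4096 "$tmp_out" >"$out_file"
    tail -c 4096 "$tmp_err" >"$err_file"
    rm -f "$tmp_out" "$tmp_err"

    return "$code"
}
```

A non-zero return value here corresponds to the case in which CloudRadar would raise an alert for a job wrapped in jobmon.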

Usage of jobmon

jobmon -id <JOB_ID> {ARGS} -- <COMMAND_TO_EXECUTE> {COMMAND_ARGS}
-c string Config file path (default "/etc/cagent/cagent.conf")
-f Force run of a job even if the job with the same ID is already running or its termination wasn't handled successfully.
-id string id of the job, required, maximum 100 characters
-me duration Maximum execution time for the job. Use the suffix h, m or s, or a plain number for seconds.
-nr duration h|m Indicates when the job should run the next time. Allows triggering alerts for jobs that did not run. The shortest interval is 5 minutes.
-re or -re=true|false Record errors from stderr. Overrides the default settings of cagent.conf. Limited to the last 4 KB. Use '-re=false' to disable the recording.
-ro or -ro=true|false Record output from stdout. Overrides the default settings of cagent.conf. Limited to the last 4 KB.
-s string alert|warning|none If the job fails (exit code != 0), trigger an event with this severity. 'alert' is used by default. Overrides the default severity of cagent.conf. Severity 'none' suppresses all messages.
-version Show the jobmon version
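Because the job id is required and limited to 100 characters, a deployment script may want to check it before calling jobmon. The following is a small sketch of such a check; the function name valid_job_id is hypothetical and not part of jobmon, and it only tests the two constraints stated above.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of jobmon.
# Succeeds (exit code 0) if the id is non-empty and at most 100 characters.
valid_job_id() {
    id=$1
    [ -n "$id" ] && [ "${#id}" -le 100 ]
}
```

It could be used as a guard, for example: valid_job_id "$JOB_ID" && jobmon -id "$JOB_ID" -- /usr/local/bin/script.sh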

Examples

jobmon -id my-rsync-job -- rsync -a /etc /var/backups
jobmon -id my-robocopy-job -- robocopy C:\Users\nobody\Downloads "C:\My Backups" /MIR
Record only the errors (stderr) and trigger an alert if exit code is not 0.
jobmon -id my-rsync-job -ro -- rsync -av /etc /var/backups
Record all output (stdout + stderr)
jobmon -id my-rsync-job -re=false -s none -- rsync -av /etc /var/backups
Don't record errors (-re=false), just the standard output (stdout), and don't trigger any alerts (-s none).
jobmon -id my-robocopy-job -nr 24h -- robocopy C:\Users\nobody\Downloads "C:\My Backups" /MIR
Indicate that this job runs every 24 hours (-nr 24h). my.cloudradar.io will trigger an alert if no data is sent with the same job id within the given period of time.
Jobmon is also well suited to supervising your cron jobs.
The following example wraps a script executed by cron into jobmon. You will receive all output (stderr+stdout) on CloudRadar. The job runs every 5 minutes, and CloudRadar will fire an alert if the script has not run for more than 6 minutes. This way you are alerted if your cron jobs stop running.
*/5 * * * * jobmon -id my-cron -nr 6m -ro -- /usr/local/bin/script.sh
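Since jobmon judges success by the exit code, a script run from cron should pass failures of its inner commands on to the caller. The sketch below shows one way a script such as /usr/local/bin/script.sh might do that; the function run_job and its messages are hypothetical and only illustrate the pattern.

```shell
#!/bin/sh
# Hypothetical body for a cron script wrapped by jobmon.
# run_job runs the given command and returns that command's exit code,
# so a failure is visible to whatever invoked the script.
run_job() {
    echo "job started"

    code=0
    "$@" || code=$?

    if [ "$code" -ne 0 ]; then
        # Report the failure on stderr and propagate the exit code.
        echo "step failed with exit code $code" >&2
        return "$code"
    fi

    echo "job finished"
}
```

A script that ends with a call like run_job rsync -a /etc /var/backups then exits with rsync's exit code, which is the value jobmon reports.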
Example of a failed job.
Email Alert about a failed job.