The samples folder contains sample batch jobs that
demonstrate raising and deleting alerts automatically.
They are documented in the README file in the samples
directory, which is probably more up to date than this page.
The samples that existed when this page was created are
listed below.
========= File system space usage monitoring ========
1. Raise alerts if a filesystem is above n% full.
   filesystem_check.sh   (script)
   filesystem_check.cfg  (script config)
   (I run sample 3)

======== Running process monitoring ============
2. Raise alerts if a process is not running.
   process_check.sh   (script)
   process_check.cfg  (script config)
   (I run sample 3)

======= The above two combined ==============
3. Do both of the above in a single script.
   system_check.sh  (script)
   Requires the config files from both 1. and 2. to work.
   (I run it every 15 minutes)

======= Process busy checking ===============
4. Raise alerts if processes are using unexpectedly large
   amounts of CPU.
   process_busy_check.sh  (script)
   (I run it every 5 minutes)

======= System load average checks ==========
5. Raise alerts if system load averages are too high.
   system_load_average_check.sh  (script)
   (I run it every 5 minutes)

======== Log parsing utilities samples =============
6. Raise alerts based on specific text from a log file,
   in this sample from /var/log/messages.
   monitor_var_log_control.sh  (script to start/stop tasks)
   log_parser_processor.sh     (script to process log msgs)
   log_parser.sample_config    (sample parser configuration file)
   Requires the alert tools ../log_parser program, which is
   provided as part of the alert toolkit.
   Notes: Uses [installdir]/run to store information.
          The stop script is provided because on systems where
          the messages file is rolled daily you need to do a
          stop and start when the log is rolled, so that the
          tail switches to the new log. Include the stop and
          start in the script that rolls your logs, or run a
          stop and start just after the log is rolled.
   (I leave this running as a background task)

========= Remote server ping checks ============
7. Check that all servers are still running, using ping.
   Checks are made via a specified interface, so if you have
   multiple interfaces that should reach a server, each can
   be checked explicitly.
   server_ping_check.sh   (script)
   server_ping_check.cfg  (script config)
   (I run it every 15 minutes)

========= Use the JAVA console as an applet =========
8. Sample of how to run the Java console as an applet
   instead of as a standalone Java application (refer to
   the manual).
   java_console_gui_applet_run.html

=============== File growth monitoring ==============
9. Sample of monitoring file sizes and alerting when they
   grow too large.
   filesize_check.sh
   Requires config filesize_check.cfg

==================== Automation ===================
10. Sample of using automation rather than raising an alert.
    Uses a customised filesize_check.sh and cfg file, and the
    other files in the automation directory.
    Requires automation_filesize_check.sh
             automation_filesize_check.cfg
             automation/*
    See also the README in the automation directory.
    THIS IS A VERY CLEAN AND CONCISE RULES BASED SYSTEM
    I USE MYSELF.

-------------- Configuration files used by samples ---------------
The individual process and filesystem checks are provided
because they are easier to read; I use the combined
system_check one. All scripts read the configuration files
for the appropriate checks.

process_check.cfg - read by process_check.sh and system_check.sh.
    One line for each entry: a command line to be searched
    for using ps.
    e.g.: syslogd

filesystem_check.cfg - read by filesystem_check.sh and system_check.sh.
    One line for each entry: a file system mount point and a
    number giving the threshold above which an alert will be
    raised.
    e.g.: /var 50

log_parser.sample_config - sample config for the log parser
    program. It has events for samba browser contention and
    logon failures through PAM. Sorry it's a bit short; my
    servers don't give me many errors, so I have nothing to
    test with apart from these. This file has comments to
    explain its usage.

server_ping_check.cfg - sample interface and server list for the
    ping checks of remote hosts. The entry format is the IP
    address of the interface to use, followed by the server to
    try to ping from that interface.
    e.g.: 169.254.198.7 firewall_7

filesize_check.cfg - sample of files to check.
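
To illustrate the filesystem_check.cfg format described above (a mount
point followed by a threshold), here is a minimal shell sketch of a check
loop. It is not the shipped filesystem_check.sh. The raise_alert function
is a hypothetical placeholder for whatever command the alert toolkit
really provides, and skipping blank lines and '#' comment lines in the
config is an assumption, not something this page states.

```shell
#!/bin/sh
# Sketch only. raise_alert is a hypothetical stand-in for the real
# alert toolkit command, which this page does not show.
raise_alert() {
    echo "ALERT: $*"
}

# Compare one usage figure against its threshold.
# $1 = mount point, $2 = percent used (integer), $3 = threshold percent
check_usage() {
    if [ "$2" -gt "$3" ]; then
        raise_alert "filesystem $1 is ${2}% full (threshold ${3}%)"
    fi
}

# Read "mountpoint threshold" lines such as "/var 50" from the file in $1.
# Blank lines and '#' comment lines are skipped (an assumption).
check_filesystems() {
    while read -r mount threshold; do
        case "$mount" in
            ''|'#'*) continue ;;
        esac
        # df -P prints one POSIX-format line per filesystem;
        # field 5 is the capacity, e.g. "42%".
        used=$(df -P "$mount" 2>/dev/null |
               awk 'NR==2 { sub("%", "", $5); print $5 }')
        if [ -z "$used" ]; then
            raise_alert "filesystem $mount could not be checked"
            continue
        fi
        check_usage "$mount" "$used" "$threshold"
    done < "$1"
}

# Run against the config file only if one is present and readable.
CFG="${1:-filesystem_check.cfg}"
if [ -r "$CFG" ]; then
    check_filesystems "$CFG"
fi
```

Run from cron at whatever interval suits you, passing the config file as
the first argument. An alert is raised only when usage is strictly above
the configured number, which matches the "above n% full" wording of
sample 1.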