The samples folder contains sample batch jobs
that demonstrate raising and deleting
alerts automatically. They are documented in the
README file in the samples directory, which is probably
more up to date than this page.
The samples that existed when this page was created are listed below.
========= File system space usage monitoring ========
1. Raise alerts if a filesystem is above n% full
filesystem_check.sh (script)
filesystem_check.cfg (script config)
(I run sample 3)
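
As a rough illustration of the idea (not the shipped filesystem_check.sh), the
sketch below reads "mount point, threshold" lines in the format used by
filesystem_check.cfg and compares each against the figure reported by df. The
function names, the "#" comment handling and the echo standing in for the real
alert-raising step are all assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: the real filesystem_check.sh may work quite differently.

# usage_pct MOUNTPOINT -> prints the percent-used figure from df, without "%".
usage_pct() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_line MOUNTPOINT THRESHOLD USED -> prints a message if USED > THRESHOLD.
# The echo is a placeholder for whatever raises the alert in the toolkit.
check_line() {
    if [ "$3" -gt "$2" ]; then
        echo "ALERT: $1 is ${3}% full (threshold ${2}%)"
    fi
}

# check_filesystems CONFIGFILE -> one "mountpoint threshold" entry per line.
check_filesystems() {
    while read -r mount threshold; do
        # Skip blank lines; treating "#" as a comment is an assumption.
        case "$mount" in ''|'#'*) continue ;; esac
        used=$(usage_pct "$mount")
        if [ -n "$used" ]; then
            check_line "$mount" "$threshold" "$used"
        fi
    done < "$1"
}

# Example use: check_filesystems filesystem_check.cfg
```

With an entry of "/var 50", a message is printed only once /var is more than
50% full.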
======== Running process monitoring ============
2. Raise alerts if a process is not running
process_check.sh (script)
process_check.cfg (script config)
(I run sample 3)
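
A minimal sketch of this kind of check, assuming one command-line fragment per
line in the config file (as in process_check.cfg) and a plain substring match
against the ps output. The function names and the echo used in place of the
real alert call are hypothetical; the shipped process_check.sh may do this
differently.

```shell
#!/bin/sh
# Sketch only: not the shipped process_check.sh.

# is_running PATTERN PSOUTPUT -> succeeds if any line of PSOUTPUT contains PATTERN.
is_running() {
    printf '%s\n' "$2" | grep -qF -- "$1"
}

# check_processes CONFIGFILE -> one pattern per line, e.g. "syslogd".
check_processes() {
    # Capture ps output once, before any grep runs, so grep never sees itself.
    ps_out=$(ps -e -o args=)
    while IFS= read -r pattern; do
        [ -z "$pattern" ] && continue
        if ! is_running "$pattern" "$ps_out"; then
            # Placeholder for the toolkit's real alert-raising step.
            echo "ALERT: process not running: $pattern"
        fi
    done < "$1"
}

# Example use: check_processes process_check.cfg
```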
======= The above two combined ==============
3. Do both of the above in a single script
system_check.sh (script)
requires config files from both 1. and 2. to work
(I run it every 15 minutes)
======= Process busy checking ===============
4. Raise alerts if processes are using unexpectedly
large amounts of CPU.
process_busy_check.sh (script)
(I run it every 5 minutes)
======= System load average checks ==========
5. Raise alerts if system load averages are too high
system_load_average_check.sh (script)
(I run it every 5 minutes)
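
For illustration only, a load average check can be sketched as below. It
assumes a Linux system (it reads /proc/loadavg) and a limit passed as an
argument; neither is taken from system_load_average_check.sh, and the
function names and the echo in place of the real alert call are made up for
this example.

```shell
#!/bin/sh
# Sketch only: not the shipped system_load_average_check.sh.

# load_exceeds LOAD LIMIT -> succeeds if LOAD is greater than LIMIT.
# awk is used because sh arithmetic cannot compare decimals.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# current_load -> prints the 1-minute load average (Linux-specific source).
current_load() {
    cut -d ' ' -f 1 /proc/loadavg
}

# check_load LIMIT -> prints a message if the current load is above LIMIT.
check_load() {
    load=$(current_load)
    if load_exceeds "$load" "$1"; then
        # Placeholder for the toolkit's real alert-raising step.
        echo "ALERT: load average $load exceeds $1"
    fi
}

# Example use: check_load 4.0
```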
======== Log parsing utilities samples =============
6. Raise alerts based on specific text from a log file,
in this sample from /var/log/messages.
monitor_var_log_control.sh (script to start/stop tasks)
log_parser_processor.sh (script to process log msgs)
log_parser.sample_config (sample parser configuration file)
requires the ../log_parser program, which is
provided as part of the alert toolkit.
Notes: uses [installdir]/run to store information. The
stop script is provided because on systems where the
messages file is rolled daily you will need to do
a stop and start when the log is rolled, so that the tail
switches to the new log. Include the stop and start in
the script that rolls your logs, or run a stop and start
just after the log is rolled.
(I leave this running as a background task)
========= Remote server ping checks ============
7. Check that all servers are still running, using ping. Each
check goes out via a specified interface, so if you have
multiple interfaces that should reach a server, each can be
checked explicitly.
server_ping_check.sh (script)
server_ping_check.cfg (script config)
(I run it every 15 minutes)
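
A rough sketch of the per-interface ping idea, assuming config lines of the
form "interface address, server" as in server_ping_check.cfg. It assumes the
Linux iputils ping, where -I selects the source address or interface and -W
is a timeout in seconds; other ping implementations use different options.
The function names and the echo in place of the real alert call are
hypothetical.

```shell
#!/bin/sh
# Sketch only: not the shipped server_ping_check.sh.

# ping_via SOURCE_ADDR HOST -> succeeds if HOST answers one ping sent via SOURCE_ADDR.
# Option meanings assume Linux iputils ping.
ping_via() {
    ping -c 1 -W 2 -I "$1" "$2" >/dev/null 2>&1 </dev/null
}

# check_servers CONFIGFILE -> one "interface-address server" entry per line.
check_servers() {
    while read -r iface host; do
        # Skip blank lines; treating "#" as a comment is an assumption.
        case "$iface" in ''|'#'*) continue ;; esac
        if ! ping_via "$iface" "$host"; then
            # Placeholder for the toolkit's real alert-raising step.
            echo "ALERT: $host not reachable via $iface"
        fi
    done < "$1"
}

# Example use: check_servers server_ping_check.cfg
```

Listing the same server once per interface gives a separate check, and a
separate message, for each path to it.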
========= Use the Java console as an applet =========
8. Sample of how to run the Java console as an applet instead
of as a standalone Java application (refer to the manual).
java_console_gui_applet_run.html
=============== File growth monitoring ==============
9. Sample of monitoring file sizes and alerting when
they grow too large.
filesize_check.sh
requires config filesize_check.cfg
==================== Automation ===================
10. Sample of using automation rather than raising an alert.
Uses a customised filesize_check.sh and cfg file, and the
other files in the automation directory.
requires
automation_filesize_check.sh
automation_filesize_check.cfg
automation/*
See the automation directory README also.
THIS IS A VERY CLEAN AND CONCISE RULES BASED SYSTEM
I USE MYSELF.
-------------- Configuration files used by samples ---------------
The individual process and filesystem checks are provided because they are
easier to read; I use the combined system_check one myself.
All scripts read the configuration files for the appropriate checks:
process_check.cfg - read by process_check.sh and system_check.sh
one line for each entry, a command line to
look for in the ps output.
e.g.: syslogd
filesystem_check.cfg - read by filesystem_check.sh and system_check.sh
one line for each entry, a file system mount point
and a number indicating the threshold after which
an alert will be raised.
e.g.: /var 50
log_parser.sample_config - sample config for the log parser program, has
events for samba browser contention and logon
failures through PAM.
Sorry it's a bit short, my servers don't give me
too many errors, so I've nothing to test with
apart from these.
This file has comments to explain its usage.
server_ping_check.cfg - sample interface and server list for the
ping checks of remote hosts. Entry format is the
IP address of the interface to use, then the server
to try to ping from that interface.
e.g.: 169.254.198.7 firewall_7
filesize_check.cfg - sample of files to check.