3. Server Operation

3.1 Functionality overview

The server collects events forwarded to the port it is listening on and stores them in memory where possible. It holds up to 50 events in memory.

When the memory tables are full and additional alerts are received, they are recorded in an overflow file and processed sequentially as the in-memory alerts are corrected and deleted.

This has an impact on alert deletions. If a remote server sends a request to delete an alert once it determines the cause is corrected, one of two things can happen if the overflow file is in use.
  1. The record is found in memory and deleted (ideal), which allows the next record to be pulled from the overflow file.
  2. The alert to be deleted is not found in memory (less than ideal), in which case the delete request is also appended to the overflow file. Eventually the original alert will reach the memory buffer, but it will not be deleted until later, when enough space has become available in the memory buffer for the delete request to be read from the overflow file as well.

The second condition can only happen if you have over 50 outstanding alerts (fire your operators). It will eventually clear unless you have an application raising/deleting alerts at a frantic rate.
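The interaction between the memory table and the overflow file can be easier to follow as a small model. The following is only a sketch of the behaviour described above, not the server's actual code: the class and method names are made up for illustration, a deque stands in for the overflow file, and a delete request for an unknown alert is assumed to be ignored when the overflow file is not in use.

```python
from collections import deque

MEMORY_LIMIT = 50  # this chapter states up to 50 events are held in memory


class AlertBuffer:
    """Toy model of the in-memory alert table plus the overflow file."""

    def __init__(self, limit=MEMORY_LIMIT):
        self.limit = limit
        self.memory = []         # alert ids currently held in memory
        self.overflow = deque()  # stands in for the overflow file

    def raise_alert(self, alert_id):
        # A full memory table sends the new alert to the overflow file.
        if len(self.memory) >= self.limit:
            self.overflow.append(("raise", alert_id))
        else:
            self.memory.append(alert_id)

    def delete_alert(self, alert_id):
        if alert_id in self.memory:
            # Case 1: found in memory, so a slot is freed straight away.
            self.memory.remove(alert_id)
            self._drain()
        elif self.overflow:
            # Case 2: not in memory yet, so the delete is queued as well.
            self.overflow.append(("delete", alert_id))
        # Otherwise there is nothing to delete (assumed to be ignored).

    def _drain(self):
        # Read overflow records in order while memory has a free slot.
        while self.overflow and len(self.memory) < self.limit:
            action, alert_id = self.overflow.popleft()
            if action == "raise":
                self.memory.append(alert_id)
            elif alert_id in self.memory:
                self.memory.remove(alert_id)


buf = AlertBuffer(limit=2)   # tiny limit so the effect is visible
for alert in ("a", "b", "c"):
    buf.raise_alert(alert)
buf.delete_alert("c")        # "c" is still in overflow, so the delete is queued
buf.delete_alert("a")        # frees a slot; "c" is pulled into memory
print(buf.memory)            # ['b', 'c']
buf.delete_alert("b")        # frees a slot; the queued delete of "c" is applied
print(buf.memory)            # []
```

Note how "c" sits in memory for a while even though its delete request arrived long before; that is the less-than-ideal second case.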

3.2 Starting the server

3.2.1 How to start the server - recommended method

The server should be started from /etc/rc3.d, after you have gone multiuser but before any other applications that are likely to generate alerts.

There is a supplied script named start_alertserver.sh, installed into the scripts directory under the path you installed the application in. This can be used as a base for your rc script.
You will need to check the start of the script to ensure the config.txt file it uses has the correct path for where you installed the application, and customise config.txt as required for your desired setup. This was covered in the chapter on configuration; if you have not read it yet, read it now before proceeding any further.

The syntax for the supplied start_alertserver.sh is
start_alertserver.sh start|stop|restart|status

I have also included my own startup script, start_server_and_tailtask.sh, which starts both the alert server and a tail task against /var/log/messages using the supplied sample message processors. You can use this if you will always be processing the /var/log/messages file. The advantage of this script, and why I use it, is that it can also stop and start the tail task on its own, which I do when I roll the messages file each day.

The syntax for start_server_and_tailtask.sh is
start_server_and_tailtask.sh start | stop | status [ all | only | tailtask ]
The default if one of the optional parameters is not provided is all.

If you use this latter script you will have to manually copy it over the script that was installed in /etc/init.d (if you elected to install the startup script as part of the installation), as the installation script installs the normal default startup script, not mine.

3.2.2 How to start the server - manual method

You can if you wish write your own startup script. The alert collector server can be manually started as below.
Syntax: alert_collector_server [ port [ ipaddrtolistenon ] ]

Defaults:
  If no port is provided the port will default to 9003.
  If no ipaddrtolistenon is provided the server will listen
  on all interfaces installed in the server.

Parameters:
  port - specify the TCP/IP port the server will listen on
         for incoming alerts and console sessions.
  ipaddrtolistenon - if an IP address is provided here the
         server will only listen on the interface configured
         with that IP address. This is for sites that may have
         interfaces to both internal and external networks;
         it allows the server to run listening only on the
         internal network.
 

For example: ./alert_collector_server 9003 169.218.154.30
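If you are writing your own startup wrapper, the defaulting rules above can be sketched as follows. This is an illustration only and not the server's own source: the function names are hypothetical, and the empty host string is simply how Python's socket module expresses "listen on all interfaces".

```python
import socket

DEFAULT_PORT = 9003   # default port stated in this chapter
ALL_INTERFACES = ""   # Python convention: an empty host binds every interface


def parse_listen_args(argv):
    """Return (host, port) for: alert_collector_server [ port [ ipaddrtolistenon ] ]"""
    if len(argv) > 2:
        raise ValueError("usage: alert_collector_server [ port [ ipaddrtolistenon ] ]")
    port = int(argv[0]) if len(argv) >= 1 else DEFAULT_PORT
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    host = argv[1] if len(argv) == 2 else ALL_INTERFACES
    return host, port


def open_listener(argv):
    """Open a listening TCP socket on the parsed address (hypothetical helper)."""
    host, port = parse_listen_args(argv)
    return socket.create_server((host, port))


print(parse_listen_args([]))                          # ('', 9003)
print(parse_listen_args(["9003", "169.218.154.30"]))  # ('169.218.154.30', 9003)
```

With no arguments the sketch listens everywhere on port 9003; with both arguments it is restricted to the one interface, which is the internal-network case described above.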

3.3 Stopping the server

Why would you want to?

If you have customised the supplied config.txt file correctly for the recommended method of starting the server, you may use the same startup script to stop the server cleanly with
start_alertserver.sh stop
A benefit of using this method is that the supplied script will only stop the alert collector server if run on the same machine as the alert collector server.

If you manually started the server, you can stop it cleanly using the supplied shutdown tool as long as you know the IP address and port number. The clean_shutdown binary should be in the installation directory of the toolkit. The syntax is
clean_shutdown port ipaddress
CAUTION: the clean_shutdown program can shut down remote alert servers, so you should tightly secure it or remove it from remote servers.

Otherwise, you will have to do it manually. Find the process id of the server with
ps -eo pid,args | grep "alert_collector_server"
and then kill that pid.
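The manual stop can also be scripted. The sketch below is one possible approach rather than a supplied tool: the function names are hypothetical, it assumes a ps that accepts -eo pid,args, and it sends the default kill signal (SIGTERM). It matches on the program name so that a grep for the server name is not mistaken for the server.

```python
import os
import signal
import subprocess


def find_server_pids(ps_output, name="alert_collector_server"):
    """Extract pids of matching processes from `ps -eo pid,args` output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # skips the header line and any blank lines
        pid, args = int(fields[0]), fields[1]
        # Compare the program name only, ignoring its path and parameters.
        program = os.path.basename(args.split()[0])
        if program == name:
            pids.append(pid)
    return pids


def stop_server():
    """Send SIGTERM to every alert_collector_server process on this machine."""
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    for pid in find_server_pids(listing):
        os.kill(pid, signal.SIGTERM)


sample = (
    "  PID COMMAND\n"
    "  101 ./alert_collector_server 9003 169.218.154.30\n"
    "  202 grep alert_collector_server\n"
)
print(find_server_pids(sample))  # [101]
```

Calling stop_server() would perform the kill for real, so it is deliberately not invoked in the example.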

3.4 Log and control files used by the server

The server generates two types of files.
  1. A daily log file. This is of the form log_.log and is rolled over to the new date the first time an alert is received for the new date. You should put jobs in place to clean these up.
  2. The file evt_buffer.dat, which is the alert buffer overflow file. This will grow as alerts are appended to it, but as your efficient operations staff are managing alerts, all entries will eventually be read into memory. When the last record from the overflow file is safely in memory the file is deleted and recreated as an empty file, so it manages itself.
    If this file gets extremely large then alerts are not being deleted (fire the operators or fix your batch jobs). On the latter point, batch jobs can cancel alerts as well as raise them, so use that facility when a problem goes away.
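A cleanup job for the daily log files could look something like the sketch below. Everything site-specific here is an assumption: the log directory is whatever you pass in, the log_*.log pattern is only a guess at how the daily files are named on your system, and the 30 day retention is arbitrary. Check all three before scheduling anything like this.

```python
import fnmatch
import os
import time

SECONDS_PER_DAY = 86400


def old_log_files(log_dir, max_age_days, pattern="log_*.log", now=None):
    """Return matching files last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for entry in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, entry)
        if (fnmatch.fnmatch(entry, pattern)      # assumed naming pattern
                and os.path.isfile(path)
                and os.path.getmtime(path) < cutoff):
            stale.append(path)
    return stale


def purge_old_logs(log_dir, max_age_days=30):
    """Delete the stale log files and return the paths that were removed."""
    removed = old_log_files(log_dir, max_age_days)
    for path in removed:
        os.remove(path)
    return removed
```

Because the pattern only matches the daily logs, evt_buffer.dat is never touched; the server manages that file itself.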

End of Chapter 3