Software Event Monitoring for Maintenance (#00003)


Selling points: Maintaining large software systems today requires the administrator to continuously monitor the system for faults, poor performance, and security breaches. We propose a generic "push" model for event monitoring, which admins can use to monitor software status automatically and take appropriate action as necessary. Consider the following:

Events can be generated by the software directly (via hooks inside the source code that can be relinked), or by a monitoring process that polls shared data structures reflecting the state of the software system.
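The two event sources described above can be sketched as follows. This is a minimal illustration, not a proposed implementation: the names Event, EventSink, write_block, and poll_once, the DISK_95_FULL event name, and the 0.95 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    name: str    # hypothetical event name, e.g. "DISK_95_FULL"
    source: str  # which node or component the event refers to


class EventSink:
    """Stand-in for the event logger's intake; it only collects events."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def push(self, event: Event) -> None:
        self.events.append(event)


def write_block(sink: EventSink, node: str, used_fraction: float) -> None:
    """Hook style: the instrumented code path pushes the event itself."""
    # ... the real work of the software would happen here ...
    if used_fraction >= 0.95:
        sink.push(Event("DISK_95_FULL", node))


def poll_once(sink: EventSink, shared_state: Dict[str, float]) -> None:
    """Polling style: a separate monitor reads shared state and pushes events."""
    for node, used_fraction in shared_state.items():
        if used_fraction >= 0.95:
            sink.push(Event("DISK_95_FULL", node))
```

Either source pushes into the same sink, so the logger does not need to know whether an event came from a hook or from the polling monitor.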

The event logger system records events and generates new events composed through boolean logic defined by the admin (e.g. "95% disk full on node 1" and "95% disk full on node 2" = NEED_NEW_NODE_EVENT). Admin-defined events form a hierarchical event system that can also be used for programmatic interfacing with the event logger. For example, to implement automated response policies, we can write an event response process that receives events from the logger and responds according to the admin's preset response policy. When NEED_NEW_NODE_EVENT is fired, for instance, the process searches for a free network-attached disk partition and mounts it on the large software system.
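A minimal sketch of the composition and response steps follows, assuming events are identified by plain strings and each admin-defined rule is a boolean function over the set of events seen so far. The EventLogger class, its define, subscribe, and record methods, and the "mount_free_partition" action label are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Set

# A rule is a boolean function over the set of event names seen so far.
Rule = Callable[[Set[str]], bool]


class EventLogger:
    def __init__(self) -> None:
        self.active: Set[str] = set()
        self.rules: Dict[str, Rule] = {}
        self.subscribers: List[Callable[[str], None]] = []
        self.log: List[str] = []

    def define(self, name: str, rule: Rule) -> None:
        """Register an admin-defined composite event."""
        self.rules[name] = rule

    def subscribe(self, handler: Callable[[str], None]) -> None:
        """Register an event response process."""
        self.subscribers.append(handler)

    def record(self, name: str) -> None:
        if name in self.active:
            return  # each event is recorded once in this sketch
        self.active.add(name)
        self.log.append(name)
        for handler in self.subscribers:
            handler(name)
        # Derived events are recorded recursively, so composite events can
        # themselves feed further rules (the hierarchy).
        for derived, rule in list(self.rules.items()):
            if derived not in self.active and rule(self.active):
                self.record(derived)


logger = EventLogger()
logger.define(
    "NEED_NEW_NODE_EVENT",
    lambda seen: "DISK_95_FULL:node1" in seen and "DISK_95_FULL:node2" in seen,
)

actions: List[str] = []


def respond(name: str) -> None:
    # Preset policy: a real responder would search for a free
    # network-attached partition and mount it; here we only note the action.
    if name == "NEED_NEW_NODE_EVENT":
        actions.append("mount_free_partition")


logger.subscribe(respond)
logger.record("DISK_95_FULL:node1")
logger.record("DISK_95_FULL:node2")
```

In this sketch the response process sees primitive and composite events alike and picks out the ones its policy covers, which keeps the policy separate from the logic that composes events.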

In addition to automating the task of monitoring a software system, this functionality can be used for testing, to verify assertions about program behavior, and to measure performance regressions over the course of the work performed during a software production cycle. Essentially, writing this feedback loop into a self-maintaining software system allows the writer of the event receiver to test how the software system behaves in response to workload before it is shipped, in addition to providing a framework for automated maintenance once the software system is deployed in a customer environment.
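As one illustration of the testing use, the sketch below shows an event receiver that flags performance regressions against a recorded baseline. It assumes that events carry a duration in milliseconds; the RegressionReceiver class, the QUERY_DONE event name, and the 10% tolerance are hypothetical.

```python
from typing import Dict, List, Tuple


class RegressionReceiver:
    """Collects timed events during a test run and compares them to a baseline."""

    def __init__(self, baseline_ms: Dict[str, float], tolerance: float = 0.10) -> None:
        self.baseline_ms = baseline_ms
        self.tolerance = tolerance
        self.seen: List[Tuple[str, float]] = []

    def on_event(self, name: str, duration_ms: float) -> None:
        self.seen.append((name, duration_ms))

    def regressions(self) -> List[str]:
        """Names of events whose duration exceeded baseline * (1 + tolerance)."""
        slow = {
            name
            for name, duration_ms in self.seen
            if name in self.baseline_ms
            and duration_ms > self.baseline_ms[name] * (1 + self.tolerance)
        }
        return sorted(slow)
```

A test harness could feed this receiver from the event logger while driving a workload, then fail the build if regressions() returns a non-empty list.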

Prior Art Research: TODO

Patent Filing Documents: TODO