Software failures are now known to be a dominant source of system outages. Several studies and much anecdotal evidence point to "software aging" as a common phenomenon in which the state of a software system degrades with time. Exhaustion of system resources, data corruption, and numerical error accumulation are the primary symptoms of this degradation, which may eventually lead to performance degradation of the software, crash/hang failure, or other undesirable effects.
Software often exhibits an increasing failure rate over time, typically because of increasing and unbounded resource consumption, data corruption, and numerical error accumulation. This phenomenon, called software aging, may be caused by errors in the application, middleware, or operating system. Under aging conditions, the state of the software degrades gradually with time, inevitably resulting in undesirable consequences. Some typical causes of this degradation are memory bloating and leaking, unterminated threads, unreleased file locks, data corruption, storage-space fragmentation, and accumulation of round-off errors.
When software aging is caused by errors in the code, software manufacturers regularly provide patches for these bugs. Applying the patches reduces the likelihood that the bugs will cause failures in your computer systems or that attackers will exploit them to gain access to your systems and data.
Software aging has been observed not only in specialized software, but also in widely used software, where rebooting to clear a problem is a common practice.
Aging occurs because software is extremely complex and never wholly free of errors. It is almost impossible to fully test and verify that a piece of software is bug-free. The situation is exacerbated by the fact that software development tends to be driven by time to market, which results in applications that meet short-term market needs yet do not account well for long-term concerns such as reliability. Hence, residual faults have to be tolerated in the operational phase. These residual faults can take various forms, but the ones that concern us cause long-term depletion of system resources such as memory, threads, and kernel tables. The essentially economic problem of developing and producing bug-free code is not the problem at hand. Instead, we address one of the problems that arise from the prevailing approach to developing software; one way to attack it is a proactive program of maintaining the software running on your systems.
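To make the idea of long-term resource depletion concrete, here is a minimal sketch of how an operator might spot an aging trend in periodic measurements of a resource such as memory. It is an illustration under stated assumptions, not a method described above: the function names `fit_trend` and `time_to_exhaustion` are hypothetical, the samples are assumed to be (time, usage) pairs collected by some external monitor, and growth is assumed to be roughly linear, which real systems may not obey.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (time, resource usage); units are the caller's choice


def fit_trend(samples: Sequence[Sample]) -> Tuple[float, float]:
    """Fit a least-squares line to the samples; return (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    slope = cov / var_t
    return slope, mean_u - slope * mean_t


def time_to_exhaustion(samples: Sequence[Sample], capacity: float) -> Optional[float]:
    """Estimate time remaining after the last sample until usage reaches capacity.

    Returns None when the fitted trend is flat or decreasing (no sign of aging).
    Assumes linear growth, which is a simplification.
    """
    slope, intercept = fit_trend(samples)
    if slope <= 0:
        return None
    t_full = (capacity - intercept) / slope  # time at which the line hits capacity
    return max(0.0, t_full - samples[-1][0])


if __name__ == "__main__":
    # Hypothetical readings: memory in MB sampled once per hour.
    readings = [(0, 100), (1, 110), (2, 120), (3, 130)]
    print(time_to_exhaustion(readings, capacity=200))
```

A positive estimate suggests a window in which a patch or restart could be scheduled before the resource runs out; a `None` result means the samples show no upward trend. The same sketch applies to thread counts or kernel-table entries if those are what the monitor records.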