Abstract:
|
Fault detection techniques have been studied by the statistics community since the late 1970s, encouraged by the rapid development of software technologies for system monitoring. The increased availability of complex and detailed information has led to several kinds of data structures involving time, such as data streams, temporal networks and time series data. Among these, event logs are becoming widely used for monitoring because of their high reliability in determining the health status of a system. Our work provides a new log-based statistical methodology for fault prediction. The model consists of two phases: pattern identification and feature extraction. For the first phase, we assume an unobservable process of breakpoints that defines patterns within the log file. The key feature of this process is its direct dependency on the observable series of events through functions such as the rate of occurrence of the events. Once the breakpoints are inferred, a new approach derived from the word space methodology is applied to extract features. These features serve as inputs to several prediction methods, such as penalised regression and neural networks.
|