Algorithms and Approaches for Configuration-less Log Analysis
System logs contain messages from a wide range of applications. They are the natural starting point when troubleshooting a system. The usual approach for analysing system logs is to write a number of regular expressions to match specific keywords and events. When the number of expressions grows large, the analysis solution becomes unmaintainable. In addition, the use of regular expressions requires the system administrator to have extensive knowledge of the system at hand. This thesis presents methods for performing log analysis without regular expressions. This is an area of system administration that has attracted very few researchers, so little published research is available on the subject. Much effort has been put into the task of generating patterns from log files. These patterns are an important prerequisite for statistical analysis. Patterns could also be used to identify transactions for use in Markov models. None of the existing pattern mining algorithms for system logs produces satisfactory results. To solve the task at hand, a new method for mining patterns is developed. Several different approaches were tested. An approach based on inserting log lines into a tree structure turned out to be very promising. It outputs good quality patterns and its resource use is moderate. Log analysis without prior knowledge of the system at hand has proven difficult. This thesis shows that methods which exploit some basic knowledge of systems in general are the most promising ones. Other approaches based on Markov models and neural networks are suggested in this thesis, but they have not been tested to their full extent and require more work before being useful.
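
The abstract does not describe the tree-based method in detail. Purely as an illustration of the general idea, and not as the thesis's actual algorithm, the following minimal sketch assumes that log lines are split on whitespace and inserted token by token into a prefix tree, and that any node with more than `max_children` distinct children is treated as a variable field and collapsed into a `*` wildcard. All names here (`Node`, `insert`, `merge`, `patterns`, `mine`, `max_children`) and the sample log lines are hypothetical.

```python
from collections import defaultdict

WILDCARD = "*"  # assumed placeholder for a variable field


class Node:
    """One token position in the prefix tree (hypothetical structure)."""

    def __init__(self):
        self.children = {}  # token -> Node
        self.count = 0      # number of log lines ending at this node


def insert(root, line):
    """Insert one log line into the tree, one whitespace-separated token per level."""
    node = root
    for token in line.split():
        node = node.children.setdefault(token, Node())
    node.count += 1


def merge(nodes):
    """Merge several subtrees into a single new subtree, summing line counts."""
    merged = Node()
    grouped = defaultdict(list)
    for n in nodes:
        merged.count += n.count
        for token, child in n.children.items():
            grouped[token].append(child)
    for token, kids in grouped.items():
        merged.children[token] = merge(kids)
    return merged


def patterns(node, max_children=2, prefix=()):
    """Walk the tree and return (pattern, line_count) pairs.

    A node with more than max_children distinct children is assumed to be
    a variable field: its children are merged under a single wildcard.
    """
    results = []
    if node.count:
        results.append((" ".join(prefix), node.count))
    children = node.children
    if len(children) > max_children:
        children = {WILDCARD: merge(list(children.values()))}
    for token in sorted(children):
        results.extend(patterns(children[token], max_children, prefix + (token,)))
    return results


def mine(lines, max_children=2):
    """Build the tree from all lines, then extract patterns."""
    root = Node()
    for line in lines:
        insert(root, line)
    return patterns(root, max_children)


if __name__ == "__main__":
    sample = [
        "sshd accepted password for alice",
        "sshd accepted password for bob",
        "sshd accepted password for carol",
        "cron job started",
    ]
    for pattern, count in mine(sample):
        print(count, pattern)
```

With the sample lines above, the three `sshd` lines differ only in the final token, so they collapse into the single pattern `sshd accepted password for *` with a count of 3, while `cron job started` is kept as its own pattern. The fixed threshold is the simplest possible design choice; a real miner would likely need a more careful criterion for deciding which positions are variable.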