Automating Problem Analysis Using Knowledge Extracted from Text
In the knowledge economy, it is important to utilize existing knowledge effectively. Most of the knowledge people generate is captured in textual form, which makes the ability to read text and apply the knowledge it contains to real-life problems essential. Despite their processing power and storage capacity, computers have a very limited ability to understand natural language, so their ability to utilize existing knowledge is severely constrained. Researchers in artificial intelligence (AI), and in particular natural language processing (NLP), have been working on this problem for decades. Today, the products of this research, including search engines and question-answering systems, serve as essential tools for knowledge workers. Still, the problem of understanding text and utilizing the knowledge it contains for problem solving is far from solved. Many existing systems that deal with textual information are aimed at finding relevant information, leaving the task of using this information in problem solving to users. The research described in this thesis attempts to go beyond retrieval of relevant information by introducing novel approaches for applying knowledge contained in text to a complex, real-life task that usually requires human experts. The empirical part of our work focuses on the incident analysis task, where the goal is to explain why an incident happened by identifying its root causes. The main source of knowledge for this task is a collection of textual incident reports that describe analyses of previous incidents. Automating this task requires methods to acquire the necessary knowledge from the reports, a representation to capture this knowledge in a structured form, and reasoning methods to apply it in the analysis process. Our research investigates two approaches to accomplish this task.
These approaches are based on the notion of reasoning knowledge, which is the knowledge of how a particular problem is analyzed and solved. The main idea behind the proposed approaches is to acquire reasoning knowledge from text and then reuse it for the analysis of a new problem. We distinguish between two types of reasoning knowledge: explanations and associations. Explanations are chains of causal relations contained in narratives, while associations are statistical correlations between pieces of information in text. Acquisition of explanations from text makes extensive use of NLP methods. Extracted explanations are captured in a graph-based representation where text fragments are connected through causal and entailment relations. These explanations are then reused as cases in a case-based reasoning system, where a new problem is explained by adapting, i.e., modifying and combining, explanations from previous similar problems. We develop and evaluate such a system on the incident analysis task, demonstrating the advantage of this approach over traditional information retrieval approaches. Acquisition of association relations from text is accomplished using association rule mining, a well-known technique for discovering association relations between items in structured data, e.g., databases. We propose a method that enables mining of association relations between text fragments, e.g., phrases and sentences, in a collection of text documents. Extracted associations are then applied for association-based retrieval in the incident analysis task, which retrieves pieces of information associated with the incident description. We show that association-based retrieval performs better for the analysis task than the similarity-based retrieval used in traditional information retrieval approaches.
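To make the idea of association-based retrieval more concrete, the following is a minimal sketch of pairwise association rule mining over text fragments, using the standard support and confidence measures. It is an illustration only and not the method developed in the thesis: the function names, the thresholds, and the tiny set of incident fragments are all hypothetical, and each report is assumed to have already been reduced to a list of normalized fragments.

```python
from collections import Counter
from itertools import combinations


def mine_pairwise_rules(documents, min_support=0.5, min_confidence=0.7):
    """Mine rules (antecedent, consequent, support, confidence) between single fragments.

    `documents` is a list of fragment lists, one per report (hypothetical input format).
    """
    n_docs = len(documents)
    item_counts = Counter()
    pair_counts = Counter()
    for fragments in documents:
        unique = sorted(set(fragments))  # count each fragment once per report
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n_docs  # fraction of reports containing both fragments
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence = P(consequent | antecedent), estimated from counts
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


def retrieve_associated(rules, observed):
    """Return fragments associated with the observed ones, ranked by confidence."""
    observed = set(observed)
    scores = {}
    for antecedent, consequent, _support, confidence in rules:
        if antecedent in observed and consequent not in observed:
            scores[consequent] = max(scores.get(consequent, 0.0), confidence)
    return sorted(scores, key=lambda fragment: (-scores[fragment], fragment))


# Hypothetical, hand-made fragments standing in for processed incident reports.
reports = [
    ["pump failure", "seal worn", "inadequate maintenance"],
    ["pump failure", "seal worn"],
    ["valve leak", "inadequate maintenance"],
    ["pump failure", "seal worn", "operator error"],
]

rules = mine_pairwise_rules(reports)
associated = retrieve_associated(rules, ["pump failure"])
```

Unlike similarity-based retrieval, which would return fragments that resemble the query text, this kind of lookup returns fragments that tend to co-occur with it in past reports, which is closer to what an analyst looking for likely causes needs. A realistic system would also have to decide how fragments are extracted and matched, which this sketch leaves out.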