Management of Large Scale NetFlow Data by Distributed Systems
Now that networks permeate almost every aspect of people's lives, administering network quality and security has become essential. An important part of this work is monitoring and analyzing network traffic. NetFlow is an important technique for collecting network traffic information and is used extensively in the network industry. As networks keep expanding rapidly in both size and complexity, managing the large volumes of collected NetFlow data poses new challenges, and new, efficient tools are needed. This thesis investigates distributed NoSQL databases suitable for handling large-scale NetFlow data, focusing mainly on their capabilities for quickly searching for information of interest and for analyzing and troubleshooting network traffic. There are many different NoSQL databases, which can be broadly grouped into four types: key-value, column-family, document-oriented and graph-based. In this thesis, the features and uses of the different types of NoSQL databases are studied first. Based on that study, a suitable NoSQL database is selected according to four criteria: data storage, search ability, aggregation ability, and additional features useful for data analysis. An integrated toolset consisting of Elasticsearch, Logstash and Kibana (ELK) stands out as a very promising solution. The three components of ELK work together and cover the complete NetFlow data analysis process, from data collection, storage and processing to visualization. To further evaluate the capabilities and performance of the selected ELK system, practical experiments using ELK to manage real NetFlow data are carried out through three use cases: monitoring traffic statistics, surveying suspicious flows and detecting common attacks. The results show that the powerful search and aggregations of Elasticsearch, the advanced data pipeline of Logstash and the rich visualizations of Kibana together provide a very good solution. Some usage recommendations and further work are also discussed.