Traffic data processing using large scale graph processing systems
Abstract
Anomaly detection in internet traffic today is largely based on quantifying traffic data. This thesis proposes a new algorithm, SpreadRank, which detects the spreading of internet traffic as an additional metric for traffic anomaly detection. SpreadRank uses large-scale graph processing to calculate spreading from multiple gigabytes of NetFlow data obtained from core routers. Studying spreading is a useful tool for determining the role of an end-host and for identifying malicious behaviour.