Identifying Proficient Cybercriminals Through Text and Network Analysis
Original version
https://doi.org/10.1109/ISI49825.2020.9280523
Abstract
A few highly skilled cybercriminals run the Crime as a Service business model. These expert hackers provide entry-level criminals with tools that significantly enhance their cybercrime operations. Disrupting highly proficient cybercriminals effectively and efficiently is therefore a high priority for law enforcement. Such individuals can be found in vast underground forums, but identifying and profiling individual users there is particularly challenging. We tackle this problem by combining two analysis methods: text analysis with Latent Dirichlet Allocation (LDA) and Social Network Analysis with centrality measures. In this paper, we use LDA to eliminate around 79% of hacker forum users, namely those with very low to no technical skills, while also inferring the forum roles held by the remaining users. Furthermore, we use centrality measures to identify users with hugely popular public posts, including users who have very few public posts yet receive much attention from their peers. We study various preprocessing methods and achieve our results by following a series of rigorous preprocessing steps. Our proposed method works towards overcoming current challenges in identifying and interrupting highly proficient cybercriminals.
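To make the network-analysis idea concrete, the following is a minimal sketch of how a centrality measure can surface users who post rarely but attract many replies. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which centrality measures were used, so in-degree centrality on a reply graph is assumed here, and the user names, reply pairs, post counts, and the attention-per-post ratio are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: one (replier, original_poster) pair per reply.
replies = [
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
    ("alice", "bob"), ("carol", "bob"),
    ("erin", "dave"),
]

# Hypothetical number of public posts per user.
post_counts = {"alice": 2, "bob": 15, "carol": 30, "dave": 8, "erin": 4}


def in_degree_centrality(edges):
    """Return, per user, the fraction of other users who replied to them.

    Assumption: in-degree centrality stands in for the unspecified
    centrality measures mentioned in the abstract.
    """
    users = {user for edge in edges for user in edge}
    repliers = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-replies
            repliers[dst].add(src)
    denom = max(len(users) - 1, 1)  # avoid division by zero for tiny graphs
    return {user: len(repliers[user]) / denom for user in users}


centrality = in_degree_centrality(replies)

# Illustrative ratio (an assumption): attention received per public post,
# which favours users with few posts and many distinct repliers.
attention_per_post = {
    user: centrality[user] / post_counts[user] for user in centrality
}
ranked = sorted(attention_per_post, key=attention_per_post.get, reverse=True)

for user in ranked:
    print(f"{user}: centrality={centrality[user]:.2f}, "
          f"attention_per_post={attention_per_post[user]:.4f}")
```

In this toy graph, the user with only two posts ranks first because three of the four other users replied to them. On real forum data, such a ranking would only be a starting point and would be combined with the LDA-based filtering the abstract describes.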