General TCP state inference model from passive measurements using machine learning techniques
Journal article, Peer reviewed
MetadataShow full item record
Original versionIEEE Access. 2018, 6 28372-28387. 10.1109/ACCESS.2018.2833107
Many applications in the Internet use the reliable end-to-end Transmission Control Protocol (TCP) as a transport protocol due to practical considerations. There are many different TCP variants widely in use, and each variant uses a specific congestion control algorithm to avoid congestion, while also attempting to share the underlying network capacity equally among the competing users. This paper shows how an intermediate node (e.g., a network operator) can identify the transmission state of the TCP client associated with a TCP flow by passively monitoring the TCP traffic. Here, we present a robust, scalable and generic machine learning-based method which may be of interest for network operators that experimentally infers Congestion Window (cwnd) and the underlying variant of loss-based TCP algorithms within a flow from passive traffic measurements collected at an intermediate node. The method can also be extended to predict other TCP transmission states of the client. We believe that our study also has a potential benefit and opportunity for researchers and scientists in the networking community from both academia and industry who want to assess the characteristics of TCP transmission states related to network congestion. We validate the robustness and scalability approach of our prediction model through a large number of controlled experiments. It turns out, surprisingly enough, that the learned prediction model performs reasonably well by leveraging knowledge from the emulated network when it is applied on a real-life scenario setting. Thus, our prediction model is general bearing similarity to the concept of transfer learning in the machine learning community. The accuracy of our experimental results both in an emulated network, realistic and combined scenario settings and across multiple TCP congestion control variants demonstrate that our model is reasonably effective and has considerable potential.