Analyzing HTTPS Encrypted Traffic to Identify User Operating System, Browser and Application

Published on Mar 15, 2016 in arXiv: Cryptography and Security
Jonathan Muehlstein
Estimated H-index: 3
Yehonatan Zion
Estimated H-index: 3
+ 4 Authors
Ofir Pele
Estimated H-index: 12
Desktops and laptops can be maliciously exploited to violate privacy. There are two main types of attack scenarios: active and passive. In this paper, we consider the passive scenario, where the adversary does not interact actively with the device but is able to eavesdrop on its network traffic from the network side. Most Internet traffic is encrypted, and thus passive attacks are challenging. We show that an external attacker can identify the operating system, browser and application from encrypted HTTP traffic (HTTPS). To the best of our knowledge, this is the first work that shows this. We provide a large data set of more than 20,000 examples for this task. Additionally, we suggest new features for this task. We run a thorough set of experiments, which show that our classification accuracy is 96.06%.
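The abstract does not list the features used. As a rough illustration of the kind of side-channel information a passive eavesdropper can still observe in encrypted traffic, the following minimal Python sketch computes a few generic flow statistics from packet timestamps, sizes and directions. The function name `flow_features`, the tuple layout and the chosen statistics are assumptions made for illustration only; they are not the paper's feature set.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarise one flow observed by a passive eavesdropper.

    `packets` is a list of (timestamp_seconds, size_bytes, direction)
    tuples, where direction is +1 for client-to-server and -1 for
    server-to-client. This layout is a hypothetical choice for the sketch.
    """
    if not packets:
        raise ValueError("flow must contain at least one packet")

    # Order packets by capture time so inter-arrival gaps are meaningful.
    ordered = sorted(packets, key=lambda p: p[0])
    times = [t for t, _, _ in ordered]
    sizes = [s for _, s, _ in ordered]

    # Payload is encrypted, but sizes and directions remain visible.
    fwd_bytes = sum(s for _, s, d in ordered if d > 0)
    bwd_bytes = sum(s for _, s, d in ordered if d < 0)

    # Inter-arrival times between consecutive packets.
    gaps = [b - a for a, b in zip(times, times[1:])]

    return {
        "n_packets": len(ordered),
        "fwd_bytes": fwd_bytes,
        "bwd_bytes": bwd_bytes,
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "duration": times[-1] - times[0],
    }


if __name__ == "__main__":
    # Three made-up packets: request, large response, follow-up request.
    example = [(0.0, 100, 1), (0.5, 1500, -1), (1.0, 300, 1)]
    print(flow_features(example))
```

A vector like this, computed per flow and paired with a label (operating system, browser or application), is the sort of input a supervised classifier would be trained on; which classifier and which features achieve the reported accuracy is specified in the paper itself, not in this sketch.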
#1 Eric Rescorla, H-Index: 24
This document specifies version 1.3 of the Transport Layer Security (TLS) protocol. TLS allows client/server applications to communicate over the Internet in a way that is designed to prevent eavesdropping, tampering, and message forgery. This document updates RFCs 4492, 5705, and 6066 and it obsoletes RFCs 5077, 5246, and 6961. This document also specifies new requirements for TLS 1.2 implementations.
268 Citations
#1 Ran Dubin, H-Index: 7
#2 Amit Dvir, H-Index: 11
Last. Ofir Trabelsi, H-Index: 2
The increasing popularity of HTTP adaptive video streaming services has dramatically increased bandwidth requirements on operator networks, which attempt to shape their traffic through Deep Packet Inspection (DPI). However, Google and certain content providers have started to encrypt their video services. As a result, operators often encounter difficulties in shaping their encrypted video traffic via DPI. This highlights the need for new traffic classification methods for encrypted HTTP adaptive...
9 Citations
#1 Shan Suthaharan (UNCG: University of North Carolina at Greensboro), H-Index: 15
Support Vector Machine is one of the classical machine learning techniques that can still help solve big data classification problems. Especially, it can help the multidomain applications in a big data environment. However, the support vector machine is mathematically complex and computationally expensive. The main objective of this chapter is to simplify this approach using process diagrams and data flow diagrams to help readers understand theory and implement it successfully. To achieve this o...
759 Citations
#2 Tiago Rodrigues (UFMG: Universidade Federal de Minas Gerais), H-Index: 17
Last. Virgilio Almeida (UFMG: Universidade Federal de Minas Gerais), H-Index: 58
Understanding how users navigate and interact when they connect to social networking sites creates opportunities for better interface design, richer studies of social interactions, and improved design of content distribution systems. In this paper, we present an in-depth analysis of user workloads in online social networks. This study is based on detailed clickstream data, collected over a 12-day period, summarizing HTTP sessions of 37,024 users who accessed four popular social networks: Orkut, ...
81 Citations
#1 Pablo Ameigeiras (UGR: University of Granada), H-Index: 16
#2 Juan J. Ramos-Munoz (UGR: University of Granada), H-Index: 14
Last. Juan M. Lopez-Soler (UGR: University of Granada), H-Index: 16
YouTube currently accounts for a significant percentage of the Internet's global traffic. Hence, understanding the characteristics of the YouTube traffic generation pattern can provide a significant advantage in predicting user video quality and in enhancing network design. In this paper, we present a characterisation of the traffic generated by YouTube when accessed from a regular PC. On the basis of this characterisation, a YouTube server traffic generation model is proposed, which, for exampl...
88 Citations
May 22, 2011 in S&P (IEEE Symposium on Security and Privacy)
#1 Andrew M. White (UNC: University of North Carolina at Chapel Hill), H-Index: 7
#2 Austin Matthews (UNC: University of North Carolina at Chapel Hill), H-Index: 8
Last. Fabian Monrose (UNC: University of North Carolina at Chapel Hill), H-Index: 47
In this work, we unveil new privacy threats against Voice-over-IP (VoIP) communications. Although prior work has shown that the interaction of variable bit-rate codecs and length-preserving stream ciphers leaks information, we show that the threat is more serious than previously thought. In particular, we derive approximate transcripts of encrypted VoIP conversations by segmenting an observed packet stream into subsequences representing individual phonemes and classifying those subsequences by t...
87 Citations
#1 Chih-Chung Chang (NTU: National Taiwan University), H-Index: 8
#2 Chih-Jen Lin (NTU: National Taiwan University), H-Index: 64
LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in deta...
34.4k Citations
#1 Riyad Alshammari (Dal: Dalhousie University), H-Index: 12
#2 A. Nur Zincir-Heywood (Dal: Dalhousie University), H-Index: 12
The classification of Encrypted Traffic, namely Skype, from network traffic represents a particularly challenging problem. Solutions should ideally be both simple — therefore efficient to deploy — and accurate. Recent advances to team-based Genetic Programming provide the opportunity to decompose the original problem into a subset of classifiers with non-overlapping behaviors. Thus, in this work we have investigated the identification of Skype encrypted traffic using Symbiotic Bid-Based (SBB) pa...
24 Citations
Nov 4, 2009 in IMC (Internet Measurement Conference)
#1 Fabian Schneider (Technical University of Berlin), H-Index: 45
#2 Anja Feldmann (Technical University of Berlin), H-Index: 73
Last. Walter Willinger (AT&T Labs), H-Index: 75
Online Social Networks (OSNs) have already attracted more than half a billion users. However, our understanding of which OSN features attract and keep the attention of these users is poor. Studies thus far have relied on surveys or interviews of OSN users or focused on static properties, e.g., the friendship graph, gathered via sampled crawls. In this paper, we study how users actually interact with OSNs by extracting clickstreams from passively monitored network traffic. Our characterization o...
236 Citations
#1 Dario Bonfiglio, H-Index: 4
#2 Marco Mellia, H-Index: 54
Last. Dario Rossi (ENST: Télécom ParisTech), H-Index: 36
Skype is beyond any doubt the VoIP application in the current Internet application spectrum. Its amazing success has drawn the attention of telecom operators and the research community, both interested in knowing its internal mechanisms, characterizing its traffic, understanding its users' behavior. In this paper, we investigate the characteristics of traffic streams generated by voice and video communications, and the signaling traffic generated by Skype. Our approach is twofold, as we make use...
142 Citations
Cited By (2)
#1 Xinlei Fan (CAS: Chinese Academy of Sciences)
#2 Gaopeng Gou (CAS: Chinese Academy of Sciences), H-Index: 4
Last. Gang Xiong (CAS: Chinese Academy of Sciences), H-Index: 7
More and more security vulnerabilities are closely related to operating system (OS) information, but how to accurately identify OS versions on a real-world dynamic network in encrypted traffic is still a challenge. In this paper, we propose a comprehensive passive OS identification method based on encrypted traffic. It takes advantage of several features in TLS headers and TCP/IP headers. Moreover, we also consider flow statistic features for each session. We collect a large dataset of more than...
#1 Mao Tian (CAS: Chinese Academy of Sciences), H-Index: 2
#2 Peng Chang (CAS: Chinese Academy of Sciences), H-Index: 2
Last. Shuhao Li (CAS: Chinese Academy of Sciences), H-Index: 4
The expanding volume of HTTPS traffic (both legitimate and malicious) creates even more challenges for mobile network security and management. In this work, we propose AIBMF (Application Identification Based on Multi-view Features), a fine-grained approach to classify HTTPS traffic by application type. The key idea of AIBMF is to combine three kinds of features: payload convolution features, packet size sequence and packet content type sequence. Based on these different view features, a deep...
1 Citation