Analyzing HTTPS encrypted traffic to identify user's operating system, browser and application

Published on Jan 1, 2017 in CCNC (Consumer Communications and Networking Conference)
· DOI: 10.1109/CCNC.2017.8013420
Jonathan Muehlstein (Ariel University), estimated H-index: 3
Yehonatan Zion (Ariel University), estimated H-index: 3
+ 4 authors
Ofir Pele (Ariel University), estimated H-index: 12
Desktops and laptops can be maliciously exploited to violate privacy. There are two main types of attack scenarios: active and passive. In this paper, we consider the passive scenario, where the adversary does not interact actively with the device but is able to eavesdrop on the device's network traffic from the network side. Most Internet traffic is encrypted, and thus passive attacks are challenging. We show that an external attacker can identify the operating system, browser and application from encrypted HTTP traffic (HTTPS). To the best of our knowledge, this is the first work that shows this. We provide a large data set of more than 20,000 examples for this task. Additionally, we suggest new features for this task. We run a thorough set of experiments, which shows that our classification accuracy is 96.06%.
#1 Ran Dubin (BGU: Ben-Gurion University of the Negev), H-Index: 7
#2 Amit Dvir (Ariel University), H-Index: 11
Last. Ofer Hadar (Ariel University), H-Index: 18
(4 authors in total)
Desktops can be exploited to violate privacy. There are two main types of attack scenarios: active and passive. We consider the passive scenario where the adversary does not interact actively with the device, but is able to eavesdrop on the network traffic of the device from the network side. In the near future, most Internet traffic will be encrypted and thus passive attacks are challenging. Previous research has shown that information can be extracted from encrypted multimedia streams. This in...
27 Citations
#1 Martin Husák (Masaryk University), H-Index: 7
#2 Milan Čermák (Masaryk University), H-Index: 6
Last. Pavel Čeleda (Masaryk University), H-Index: 15
(4 authors in total)
The encryption of network traffic complicates legitimate network monitoring, traffic analysis, and network forensics. In this paper, we present real-time lightweight identification of HTTPS clients based on network monitoring and SSL/TLS fingerprinting. Our experiment shows that it is possible to estimate the User-Agent of a client in HTTPS communication via the analysis of the SSL/TLS handshake. The fingerprints of SSL/TLS handshakes, including a list of supported cipher suites, differ among cl...
28 Citations
#1 Ran Dubin, H-Index: 7
#2 Amit Dvir, H-Index: 11
Last. Ofir Trabelsi, H-Index: 2
(6 authors in total)
The increasing popularity of HTTP adaptive video streaming services has dramatically increased bandwidth requirements on operator networks, which attempt to shape their traffic through Deep Packet Inspection (DPI). However, Google and certain content providers have started to encrypt their video services. As a result, operators often encounter difficulties in shaping their encrypted video traffic via DPI. This highlights the need for new traffic classification methods for encrypted HTTP adaptive...
9 Citations
#1 Mauro Conti (UNIPD: University of Padua), H-Index: 60
#2 Luigi V. Mancini (Sapienza University of Rome), H-Index: 23
Last. Nino Vincenzo Verde, H-Index: 15
(4 authors in total)
Mobile devices can be maliciously exploited to violate the privacy of people. In most attack scenarios, the adversary takes the local or remote control of the mobile device, by leveraging a vulnerability of the system, hence sending back the collected information to some remote web service. In this paper, we consider a different adversary, who does not interact actively with the mobile device, but he is able to eavesdrop the network traffic of the device from the network side (e.g., controlling ...
120 Citations
#1 Shan Suthaharan (UNCG: University of North Carolina at Greensboro), H-Index: 15
Support Vector Machine is one of the classical machine learning techniques that can still help solve big data classification problems. Especially, it can help the multidomain applications in a big data environment. However, the support vector machine is mathematically complex and computationally expensive. The main objective of this chapter is to simplify this approach using process diagrams and data flow diagrams to help readers understand theory and implement it successfully. To achieve this o...
759 Citations
#1 Tomasz Bujlow (AAU: Aalborg University), H-Index: 10
#2 Valentín Carela-Español (UPC: Polytechnic University of Catalonia), H-Index: 9
Last. Pere Barlet-Ros (UPC: Polytechnic University of Catalonia), H-Index: 16
(3 authors in total)
Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classification. However, the actual performance of DPI is still unclear to the research community, since the lack of public datasets prevents the comparison and reproducibility of results. This paper presen...
91 Citations
#1 Zigang Cao (CAS: Chinese Academy of Sciences), H-Index: 5
#2 Gang Xiong (CAS: Chinese Academy of Sciences), H-Index: 17
Last. Li Guo (CAS: Chinese Academy of Sciences), H-Index: 29
(5 authors in total)
With the widespread use of encryption techniques in network applications, encrypted network traffic has recently become a great challenge for network management. Studies on encrypted traffic classification not only help to improve the network service quality, but also assist in enhancing network security. In this paper, we first introduce the basic information of encrypted traffic classification, emphasizing the influences of encryption on current classification methodology. Then, we summarize t...
22 Citations
Aug 28, 2014 in DCNET (International Conference on Data Communication Networking)
#1 Petr Matousek (Brno University of Technology), H-Index: 7
#2 Ondrej Rysavy (Brno University of Technology), H-Index: 6
Last. Martin Vymlatil (Brno University of Technology), H-Index: 2
(4 authors in total)
This paper deals with identification of operating systems (OSs) from Internet traffic. Every packet injected into the network carries specific information in its packet header that reflects the initial settings of a host's operating system. The set of such features forms a fingerprint. The OS fingerprint usually includes an initial TTL value, a TCP initial window size, a set of specific TCP options, and other values obtained from IP and TCP headers. Identification of OSs can be useful for mon...
11 Citations
#1 Walter de Donato, H-Index: 14
#2 Antonio Pescape, H-Index: 45
Last. Alberto Dainotti (UCSD: University of California, San Diego), H-Index: 28
(3 authors in total)
The availability of open source traffic classification systems designed for both experimental and operational use can facilitate collaboration, convergence on standard definitions and procedures, and reliable evaluation of techniques. In this article, we describe Traffic Identification Engine (TIE), an open source tool for network traffic classification, which we started developing in 2008 to promote sharing common implementations and data in this field. We designed TIE's architecture and funct...
41 Citations
Jan 1, 2013 in TMA (Traffic Monitoring and Analysis)
#1 Silvio Valenti (ENST: Télécom ParisTech), H-Index: 12
#2 Dario Rossi (ENST: Télécom ParisTech), H-Index: 36
Last. Marco Mellia (Polytechnic University of Turin), H-Index: 54
(6 authors in total)
Traffic classification has received increasing attention in recent years. It aims to automatically recognize the application that has generated a given stream of packets from the direct and passive observation of the individual packets, or stream of packets, flowing in the network. This ability is instrumental to a number of activities that are of extreme interest to carriers, Internet service providers and network administrators in general. Indeed, traffic classificati...
49 Citations
Cited By 16
Wearable devices such as smartwatches, fitness trackers, and blood-pressure monitors process, store, and communicate sensitive and personal information related to the health, life-style, habits and interests of the wearer. This data is exchanged with a companion app running on a smartphone over a Bluetooth connection. In this work, we investigate what can be inferred from the metadata (such as the packet timings and sizes) of encrypted Bluetooth communications between a wearable device and its c...
Last. Bhupesh Kumar Dewangan, H-Index: 3
(4 authors in total)
Network traffic classification has become an important basis for computer networks. However, the emergence of new applications, which generate unknown traffic constantly, has brought new challenges. The most critical challenge is how to divide the mixed unknown traffic into clusters containing only one category. In this paper, we propose a transfer learning approach using Deep Adaptation Network (DAN). This approach utilizes a few labeled samples from known traffic to improve the clustering puri...
#1 Hua Qu (Xi'an Jiaotong University), H-Index: 14
Last. Jinli Yang (China Mobile), H-Index: 1
(5 authors in total)
#1 Vasilios Mavroudis (UCL: University College London), H-Index: 4
#2 Jamie Hayes (UCL: University College London), H-Index: 13
The widespread adoption of encrypted communications (e.g., the TLS protocol, the Tor anonymity network) fixed several critical security flaws and shielded the end-users from adversaries intercepting their transmitted data. While these protocols are very effective in protecting the confidentiality of the users' data (e.g., credit card numbers), it has been shown that they are prone (to different degrees) to adversaries aiming to breach the users' privacy. Traffic fingerprinting attacks allow an a...
Traffic monitoring is essential for network management tasks that ensure security and QoS. However, the continuous increase of HTTPS traffic undermines the effectiveness of current service-level monitoring that can only rely on unreliable parameters from the TLS handshake (X.509 certificate, SNI) or must decrypt the traffic. We propose a new machine learning-based method to identify HTTPS services without decryption. By extracting statistical features on TLS handshake packets and on a small numb...
Last. Shuyi Guo
(4 authors in total)
The development of smartphones and social networks has brought great convenience to our lives. Due to increasing user privacy requirements, user data are protected by encryption protocols, but this also makes it difficult to regulate malicious behavior. Existing user behavior identification relies on statistical features of encrypted traffic, which fluctuate greatly across transmission environments. In this paper, we propose a method to obtain the stable features of encrypted tra...
#1 Omar Richardson (Karlstad University), H-Index: 5
#2 Johan Garcia (Karlstad University), H-Index: 8
High level traffic characteristics have the potential to be useful for inference of various host characteristics. This work proposes the novel Flow-Discretize Order (FDO) approach for describing session characteristics in an intuitive manner, while also retaining flow ordering information. The FDO approach allows for flexible construction of flow descriptors, by using different flow properties and applying appropriate discretization. The individual flow descriptors are concatenated to form sessi...
HTTPS is gaining widespread popularity for performing secure transactions. Most popular sites have made HTTPS their default choice. This paper therefore surveys the studies done in the area and comprehensively explores the tools, technologies, and mechanisms for dealing with secured networks in a robust way. We make a complete analysis and evaluation of the HTTPS protocol: is it ensuring security, or are we entering a vicious cycle of finding weaknesses and trying to fill...