I Know What You Saw Last Minute - Encrypted HTTP Adaptive Video Streaming Title Classification

Published on Feb 1, 2016 in arXiv: Multimedia
· DOI: 10.1109/TIFS.2017.2730819
Ran Dubin
Estimated H-index: 7
(BGU: Ben-Gurion University of the Negev),
Amit Dvir
Estimated H-index: 11
(Ariel University)
+ 1 author
Ofer Hadar
Estimated H-index: 18
(Ariel University)
Abstract
Desktops can be exploited to violate privacy. There are two main types of attack scenarios: active and passive. We consider the passive scenario, where the adversary does not interact actively with the device but is able to eavesdrop on the device's network traffic from the network side. In the near future, most Internet traffic will be encrypted, which makes passive attacks challenging. Previous research has shown that information can be extracted from encrypted multimedia streams, including video title classification of non-HTTP adaptive streams. This paper presents algorithms for title classification of encrypted HTTP adaptive video streams. We show that an external attacker can identify the video title from HTTP adaptive video streaming sites, such as YouTube. To the best of our knowledge, this is the first work that shows this. We provide a large data set of 15,000 YouTube video streams of 2,100 popular video titles, collected under real-world network conditions. We present several machine learning algorithms for the task and run a thorough set of experiments, which show that our classification accuracy is higher than 95%. We also show that our algorithms are able to classify video titles that are not in the training set as unknown, and that some of the algorithms are also able to eliminate false predictions of video titles and report unknown instead. Finally, we evaluate the robustness of our algorithms to delays and packet losses at test time and show that our solution is robust to these changes.
References: 37
Cited By: 5
#1 Haipeng Li (UC: University of Cincinnati), H-Index: 2
#2 Ben Niu (CAS: Chinese Academy of Sciences), H-Index: 44
Last: Boyang Wang (UC: University of Cincinnati), H-Index: 19
In stream fingerprinting, an attacker can compromise user privacy by leveraging side-channel information (e.g., packet size) of encrypted traffic in streaming services. By taking advantage of machine learning, especially neural networks, an adversary can reveal which YouTube video a victim watches with extremely high accuracy. While effective defense methods have been proposed, extremely high bandwidth overheads are needed. In other words, building an effective defense with low overheads remain...
#1 Roei Schuster (TAU: Tel Aviv University), H-Index: 7
#2 Vitaly Shmatikov (Cornell University), H-Index: 69
Last: Eran Tromer (TAU: Tel Aviv University), H-Index: 39
The MPEG-DASH streaming video standard contains an information leak: even if the stream is encrypted, the segmentation prescribed by the standard causes content-dependent packet bursts. We show that many video streams are uniquely characterized by their burst patterns, and classifiers based on convolutional neural networks can accurately identify these patterns given very coarse network measurements. We demonstrate that this attack can be performed even by a Web attacker who does not directly ob...
47 Citations
HTTP response size is a well-known side channel. With the deployment of HTTP/2.0, response size estimation attacks are generally dismissed with the argument that pipelining and response multiplexing prevent eavesdroppers from finding out response sizes. Yet the impact that pipelining and response multiplexing actually have in estimating HTTP response sizes has not been adequately investigated. In this paper we set out to help understand the effect of pipelining and response multiplexing i...
3 Citations
#2 Yehonatan Zion, H-Index: 3
Last: Ofir Pele, H-Index: 12
view all 7 authors...
Desktops and laptops can be maliciously exploited to violate privacy. There are two main types of attack scenarios: active and passive. In this paper, we consider the passive scenario, where the adversary does not interact actively with the device but is able to eavesdrop on the device's network traffic from the network side. Most Internet traffic is encrypted, which makes passive attacks challenging. In this paper, we show that an external attacker can identify the operating syst...
2 Citations