Predicting Self‐declared Movie Watching Behavior Using Facebook Data and Information‐Fusion Sensitivity Analysis

Published on Jun 1, 2021 in Decision Sciences
DOI: 10.1111/DECI.12406
Matthias Bogaert (UGent: Ghent University), Estimated H-index: 4
Michel Ballings (UT: University of Tennessee), Estimated H-index: 11
+ 1 author
Dirk Van den Poel (UGent: Ghent University), Estimated H-index: 52
The main purpose of this paper is to evaluate the feasibility of predicting whether a Facebook user has self-reported having watched a given movie genre. To this end, we apply a data-analytical framework that (1) builds and evaluates several predictive models explaining self-declared movie watching behavior, and (2) provides insight into the importance of the predictors and their relationship with self-reported movie watching behavior. For the first outcome, we benchmark several algorithms (logistic regression, random forest, adaptive boosting, rotation forest, and naive Bayes) and evaluate their performance using the area under the receiver operating characteristic curve. For the second outcome, we evaluate variable importance and build partial dependence plots using information-fusion sensitivity analysis for different movie genres. To gather the data, we developed a custom native Facebook app. We resampled our dataset to make it representative of the general Facebook population with respect to age and gender. The results indicate that adaptive boosting outperforms all other algorithms. Time- and frequency-based variables related to media (movies, videos, and music) consumption constitute the list of top variables. To the best of our knowledge, this study is the first to fit predictive models of self-reported movie watching behavior and provide insights into the relationships that govern these models. Our models can be used as a decision tool for movie producers to target potential movie-watchers and market their movies more efficiently.
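The two evaluation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below assumes one common formulation: AUC computed as the share of positive/negative pairs ranked correctly, and fused importance computed as an AUC-weighted average of each model's normalized importance scores. All data, model names, and variable names (`movies_liked`, `videos_posted`, `age`) are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fuse_importance(
    model_auc: Dict[str, float],
    model_importance: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Assumed fusion rule: weight each model's normalized importances by its share of total AUC."""
    total_auc = sum(model_auc.values())
    fused: Dict[str, float] = {}
    for model, importances in model_importance.items():
        weight = model_auc[model] / total_auc
        norm = sum(importances.values())
        for var, value in importances.items():
            fused[var] = fused.get(var, 0.0) + weight * value / norm
    return fused


# Hypothetical toy data: 1 = user self-reported watching the genre, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    "model_b": [0.7, 0.6, 0.2, 0.5, 0.4, 0.1],
}
# Hypothetical per-model importance scores for three made-up predictors.
importance = {
    "model_a": {"movies_liked": 0.6, "videos_posted": 0.3, "age": 0.1},
    "model_b": {"movies_liked": 0.5, "videos_posted": 0.2, "age": 0.3},
}

model_auc = {name: auc(labels, s) for name, s in scores.items()}
fused = fuse_importance(model_auc, importance)
for var, value in sorted(fused.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{var}: {value:.3f}")
```

Because the weights and the per-model importances are both normalized, the fused scores sum to one, which makes them comparable across movie genres in this sketch. A real benchmark would estimate AUC on held-out data rather than on a six-row toy sample.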
#1 Kyuhan Lee (SNU: Seoul National University), H-Index: 1
#2 Jinsoo Park (SNU: Seoul National University), H-Index: 13
Last. Youngseok Choi (Brunel University London), H-Index: 6
Previous studies on predicting the box-office performance of a movie using machine learning techniques have shown practical levels of predictive accuracy. These works are technically and methodologically oriented, focusing mainly on which algorithms are better at predicting movie performance. However, the accuracy of a prediction model can also be improved by taking other perspectives, such as introducing unexplored features that might be related to the prediction of the outcomes. In this paper...
17 Citations
#1 Dominique M. Hanssens (UCLA: University of California, Los Angeles), H-Index: 46
16 Citations
#1 Matthias Bogaert, H-Index: 4
#2 Michel Ballings, H-Index: 11
Last. Dirk Van den Poel, H-Index: 52
This study assesses the feasibility of identifying self-reported sports practitioners (soccer players) on Facebook. The main goal is to develop a system to support marketers with the decision as to which prospects to target for advertising purposes. To do so, we benchmark several algorithms (i.e., random forest, logistic regression, adaboost, rotation forest, neural networks, and kernel factory) using five times twofold cross-validation. To evaluate performance and variable importances, we build...
4 Citations
#1 Chao Ding (HKU: University of Hong Kong), H-Index: 2
#2 Hsing Kenneth Cheng (College of Business Administration), H-Index: 21
Last. Yong Jin (PolyU: Hong Kong Polytechnic University), H-Index: 5
Mainstream research on social factors and box office performance has concentrated on post-consumption opinion mining and sentiment analysis, which are difficult to operationalize for the benefit of industry practitioners whose objective is to maximize box office sales. In this study, we propose the Facebook like as an effective social marketing tool before the release of movies for several reasons. Firstly, people's prerelease liking of movies can be influenced by marketing campaigns. Se...
51 Citations
#1 Ali Dag (USD: University of South Dakota), H-Index: 8
#2 Asil Oztekin (University of Massachusetts Lowell), H-Index: 24
Last. Fadel M. Megahed (Miami University), H-Index: 16
Predicting the survival of heart transplant patients is an important, yet challenging problem since it plays a crucial role in understanding the matching procedure between a donor and a recipient. Data mining models can be used to effectively analyze and extract novel information from large/complex transplantation datasets. The objective of this study is to predict the 1-, 5-, and 9-year patient's graft survival following a heart transplant surgery via the deployment of analytical models that ar...
54 Citations
#1 Chong Oh (UofU: University of Utah), H-Index: 6
#2 Yaman Roumani (EMU: Eastern Michigan University), H-Index: 7
Last. Han-fen Hu (UNLV: University of Nevada, Las Vegas), H-Index: 9
Online consumer engagement behavior (CEB) affects future economic performance. CEB on Facebook and YouTube positively correlate with movie box-office revenue. Social media-based CEB is critical to improve economic performance of movie firms. This study examines the effects of social media, from the perspective of consumer engagement behavior (CEB), to investigate how CEB is associated with economic performance. Based on social media activities surrounding US movies, we used ordinary least square (...
96 Citations
Purpose: The prediction of graduation rates of college students has become increasingly important to colleges and universities across the USA and the world. Graduation rates, also referred to as completion rates, directly impact university rankings and represent a measurement of institutional performance and student success. In recent years, there has been a concerted effort by federal and state governments to increase the transparency and accountability of institutions, making “graduation rates”...
13 Citations
#1 Asil Oztekin (University of Massachusetts Lowell), H-Index: 24
#2 Recep Kizilaslan (Fatih University), H-Index: 2
Last. Ali İşeri (Gazi University), H-Index: 5
Forecasting stock market returns is a challenging task due to the complex nature of the data. This study develops a generic methodology to predict daily stock price movements by deploying and integrating three data analytical prediction models: adaptive neuro-fuzzy inference systems, artificial neural networks, and support vector machines. The proposed approach is tested on the Borsa Istanbul BIST 100 Index over an 8-year period from 2007 to 2014, using accuracy, sensitivity, and specificity as ...
49 Citations
#1 Matthijs Meire (UGent: Ghent University), H-Index: 3
#2 Michel Ballings (UT: University of Tennessee), H-Index: 11
Last. Dirk Van den Poel (UGent: Ghent University), H-Index: 52
The purpose of this study is to (1) assess the added value of information available before (i.e., leading) and after (i.e., lagging) the focal post's creation time in sentiment analysis of Facebook posts, (2) determine which predictors are most important, and (3) investigate the relationship between top predictors and sentiment. We build a sentiment prediction model, including leading information, lagging information, and traditional post variables. We benchmark Random Forest and Suppor...
31 Citations
#1 Michel Ballings (UT: University of Tennessee), H-Index: 11
#2 Dirk Van den Poel (UGent: Ghent University), H-Index: 52
Last. Matthias Bogaert (UGent: Ghent University), H-Index: 4
This paper aims to create an expert system that yields an optimal strategy for increasing network size on Facebook. Data were obtained from 5488 Facebook users by means of a custom-built Facebook application. We computed a total of 426 variables. Using these data we estimated a predictive model of network size which is subsequently used in a prescriptive model. The former is estimated with Random Forest and the latter with a Genetic Algorithm. The results indicate that the proposed expert system...
13 Citations
Cited By: 2
Last. Chien-Hsing Wu (NUK: National University of Kaohsiung), H-Index: 10
#1 Matthias Bogaert (UGent: Ghent University), H-Index: 4
#2 Michel Ballings (UT: University of Tennessee), H-Index: 11
Last. Asil Oztekin (University of Massachusetts Lowell), H-Index: 24
This paper aims to determine the power of social media data (Facebook and Twitter) in predicting box office sales, and to identify which platforms, data types, and variables are the most important and why. To do so, we compare several models based on movie data, Facebook data, and Twitter data. We benchmark these model comparisons using various prediction algorithms. Next, we apply information-fusion sensitivity analysis to evaluate which variables are driving the predictive performance. Our analysis sh...