Risk prediction for malignant intraductal papillary mucinous neoplasm of the pancreas: logistic regression versus machine learning.

Published on Nov 18, 2020 in Scientific Reports (impact factor 3.998)
DOI: 10.1038/S41598-020-76974-7
Jae Seung Kang (SNU: Seoul National University),
Chanhee Lee (SNU: Seoul National University),
+ 58 authors,
Jin Young Jang (SNU: Seoul National University)
Abstract
Most models for predicting malignant pancreatic intraductal papillary mucinous neoplasms were developed using logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and to compare their performance. This was a multinational, multi-institutional, retrospective study. The clinical variables considered for model development (MD) were age, sex, main duct diameter, cyst size, mural nodule, and tumour location. After the data were divided into an MD set and a test set (2:1), the best ML and LR models were developed by training on the MD set with tenfold cross-validation. The areas under the receiver operating characteristic curves (AUCs) of the two models were then calculated on the independent test set. A total of 3,708 patients were included. Across 200 repetitions, the stacked ensemble algorithm was the most frequently selected ML model, and the variable combination containing all variables was the most frequently selected LR model. After 200 repetitions, the mean test AUCs of the ML and LR models were comparable (0.725 vs. 0.725). Because the two performed comparably, the LR model was more practical than its ML counterpart, given its convenience in clinical use and simple interpretability.
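The evaluation protocol described in the abstract (a repeated 2:1 split with the test AUC averaged over repetitions) can be illustrated with a minimal sketch. This is not the authors' code. The data are synthetic, the two features and their coefficients are invented for illustration, the LR is fitted by plain gradient descent, and all function names (`auc`, `fit_logistic`, `predict`, `make_data`, `repeated_holdout_auc`) are hypothetical. The tenfold cross-validation used for model selection and the ML arm of the comparison are omitted for brevity.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Logistic regression fitted by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, X):
    """Predicted probabilities for each row of X."""
    out = []
    for xi in X:
        z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def make_data(n, rng):
    """Synthetic binary-outcome data; the coefficients 1.0 and 0.5 are assumptions."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = 1.0 / (1.0 + math.exp(-(1.0 * x[0] + 0.5 * x[1])))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def repeated_holdout_auc(X, y, repetitions, rng):
    """Mean test AUC over repeated random 2:1 development/test splits."""
    idx = list(range(len(X)))
    aucs = []
    for _ in range(repetitions):
        rng.shuffle(idx)
        cut = (2 * len(idx)) // 3  # two thirds for development, one third for test
        dev, test = idx[:cut], idx[cut:]
        w = fit_logistic([X[i] for i in dev], [y[i] for i in dev])
        scores = predict(w, [X[i] for i in test])
        aucs.append(auc([y[i] for i in test], scores))
    return sum(aucs) / len(aucs)


rng = random.Random(0)
X, y = make_data(300, rng)
mean_auc = repeated_holdout_auc(X, y, repetitions=20, rng=rng)
print(f"mean test AUC over 20 repetitions: {mean_auc:.3f}")
```

Averaging over many random splits, as the study does with 200 repetitions, reduces the dependence of the reported AUC on any single partition of the data. To compare two model families under this scheme, both would be fitted and scored on the same splits.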