Diagnosis of Cardiovascular Diseases using Hybrid Feature Selection and Classification Algorithms
Published on Dec 9, 2017
Current systems for diagnosing cardiovascular diseases (CVDs), such as echocardiography, require highly skilled physicians to evaluate complex combinations of clinical and pathological data. Inaccurate decision making is the main challenge in this process and cannot be tolerated in the healthcare industry. Data mining methodologies can be applied to large medical datasets to extract insights that aid healthcare professionals in the diagnosis of cardiovascular diseases. In CVD data mining, classification categorizes a patient as having a CVD or being free of it, based on the patient's similarity to previously seen patients. The classification accuracy rate is highly influenced by the feature selection technique, which eliminates features (attributes) that carry little or no information from the dataset. Choosing a feature selection technique and a classification algorithm together is therefore treated as a global combinatorial optimization problem. The aim of this research is to identify the optimal hybrid model of feature selection and classification algorithms for the diagnosis of cardiovascular diseases, based on three performance metrics: accuracy, sensitivity and specificity. The study follows the Cross Industry Standard Process for Data Mining (CRISP-DM). The effect of hybrid feature selection and classification algorithms is examined on the heart disease dataset acquired from the University of California, Irvine Machine Learning Repository (UCI-ML). The feature selection algorithm used is Particle Swarm Optimization (PSO). The classification algorithms used are Support Vector Machines (SVM), Artificial Neural Network (ANN), Naive Bayes, K-Nearest Neighbour (KNN), Random Forest and C5.0 Decision Tree. Each hybrid model is evaluated on accuracy, sensitivity and specificity, with the objective of achieving superior predictive performance.
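The three performance metrics named above can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming labels coded as 1 (CVD present) and 0 (CVD absent); the function names and the example labels are hypothetical and made up for illustration, and they are not data or results from this study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Share of all patients classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of diseased patients correctly flagged (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of healthy patients correctly cleared (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Made-up labels for ten hypothetical patients (1 = CVD, 0 = no CVD).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# Here tp=3, fn=1, tn=4, fp=2, so accuracy is 7/10,
# sensitivity is 3/4 and specificity is 4/6.
metrics = diagnostic_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because a missed diagnosis (false negative) and a false alarm (false positive) carry different costs, and accuracy alone hides that distinction.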
The results show that the hybrid combination of PSO with SVM (PSO_SVM) achieves better predictive performance than the other models. The research can thus empower physicians to diagnose cardiovascular diseases and initiate timely treatment without the intervention of a trained cardiologist.
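To illustrate how PSO can act as a wrapper around a classifier for feature selection, the sketch below implements a generic binary PSO in plain Python. It is not the study's implementation: the parameter values, the sigmoid transfer function, the choice of 13 candidate features and the `toy_fitness` function are all assumptions made for this example. In a hybrid such as PSO_SVM, the fitness function would instead return a cross-validated score of the classifier trained on the selected feature subset.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize `fitness` over 0/1 feature masks with binary PSO."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
                  for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                v = (w * velocities[i][d]
                     + c1 * r1 * (pbest[i][d] - positions[i][d])
                     + c2 * r2 * (gbest[d] - positions[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                velocities[i][d] = v
                # Sigmoid maps the velocity to the probability of bit = 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit


# Hypothetical stand-in for a classifier score: rewards selecting the
# (made-up) informative features and penalizes every selected feature.
INFORMATIVE = {0, 2, 5, 8}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits / len(INFORMATIVE) - 0.02 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_features=13)
selected = [i for i, bit in enumerate(best_mask) if bit]
```

Each particle encodes one candidate feature subset as a bit mask, so the swarm searches the space of subsets directly. The per-feature penalty in `toy_fitness` mirrors a common design choice in wrapper methods: prefer smaller subsets when predictive performance is otherwise equal.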