Applying a New Feature Selection Method for Accurate Prediction of Earthquakes Using a Soft Voting Classifier
Abstract
Earthquakes are among the most hazardous natural disasters, posing significant threats to infrastructure, property and human life, largely because they strike suddenly and leave little or no time for preparation. Earthquake prediction is therefore crucial for human safety. A reliable and accurate prediction model built with machine learning (ML) methods can improve our understanding of these complex natural phenomena and ultimately help to preserve lives and mitigate earthquake-related damage. In this study, we propose a novel feature selection approach, CLR-AVCH, that integrates two methods: normalisation based on analysis of variance (ANOVA) and the Chi-squared technique, and correlation based on Logistic Regression. The approach aims to identify the most relevant features in order to speed up model training, minimise errors and optimise outcomes. We employ three algorithms (Support Vector Machine, Decision Tree and Random Forest) to identify patterns in the collected data. A soft voting classifier is then constructed from the two best-performing models (Decision Tree and Random Forest), creating a unified model that leverages the strengths of both and improves prediction accuracy. The proposed methodology achieves an accuracy of 0.99 and an F1 score, recall and precision of 0.98 each. Future work will focus on implementing new feature selection techniques alongside hybrid algorithms with soft voting classifiers to enhance diagnostic capabilities.
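The pipeline outlined in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the earthquake dataset, and combines the ANOVA F-test and Chi-squared rankings by taking the union of each method's top `K` features. The abstract does not specify how CLR-AVCH combines its parts, so that union rule, the value of `K` and all hyperparameters are assumptions, and the Logistic Regression correlation step is omitted. The soft voting stage averages the class probabilities of a Decision Tree and a Random Forest.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): NOT the earthquake dataset from the study.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale features to [0, 1] using training statistics only;
# the chi-squared test requires non-negative feature values.
scaler = MinMaxScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Rank features with the ANOVA F-test and with chi-squared, then keep the
# union of each method's top-K features (K and the union rule are assumptions).
K = 6
anova_mask = SelectKBest(f_classif, k=K).fit(X_train_s, y_train).get_support()
chi2_mask = SelectKBest(chi2, k=K).fit(X_train_s, y_train).get_support()
selected = np.flatnonzero(anova_mask | chi2_mask)

# Soft voting: average the predicted class probabilities of both tree models
# and predict the class with the highest mean probability.
voter = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
voter.fit(X_train_s[:, selected], y_train)

proba = voter.predict_proba(X_test_s[:, selected])
pred = voter.predict(X_test_s[:, selected])
accuracy = float((pred == y_test).mean())
print(f"selected feature indices: {selected.tolist()}")
print(f"hold-out accuracy on synthetic data: {accuracy:.3f}")
```

Because the data here are synthetic, the printed accuracy says nothing about the figures reported in the abstract; the sketch only shows how filter-based selection and a soft voting ensemble fit together.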
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.
Journal of Studies in Science and Engineering is licensed under a Creative Commons Attribution License 4.0 (CC BY-4.0).