Earthquakes are major natural disasters that can cause large numbers of casualties and leave many more people traumatized. Analyzing the consequences of such events leaves one better prepared for potential future catastrophes. It is important to establish a methodology that can assist in forecasting these earthquakes, as such forecasts can help reduce the severity of the damage. This paper discusses a machine learning model that predicts the severity of damage (the damage grade) caused by the life-threatening earthquake that hit Nepal in 2015. The dataset is derived from the live competition hosted by Driven Data. The data was collected through surveys conducted by Kathmandu Living Labs and the Central Bureau of Statistics, which operates under the National Planning Commission Secretariat of Nepal. To accomplish the defined goal, we used the Random Forest Classifier and the Gradient Boosting Classifier. The Random Forest Classifier was outperformed by the Gradient Boosting Classifier in this study. With the necessary parameter tuning, the Random Forest Classifier achieved an F1-score of 72.95%. The next technique, winsorizing some attributes to handle outliers and then training a Gradient Boosting Classifier, improved the F1-score to 74.33%. The last technique, which involved only hyper-parameter tuning of the Gradient Boosting Classifier, achieved the best F1-score of 74.42%.
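The two building blocks named in the abstract, winsorization and the F1-score, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the percentile cut-offs (10th and 90th), the linear-interpolation percentile rule, the toy values, and the choice of micro-averaging for the F1-score are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence


def percentile(sorted_vals: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0 <= q <= 100) of pre-sorted values,
    using linear interpolation between neighbouring ranks."""
    if not sorted_vals:
        raise ValueError("percentile() needs at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def winsorize(values: Sequence[float],
              lower_pct: float = 10.0,
              upper_pct: float = 90.0) -> List[float]:
    """Clamp values outside [lower_pct, upper_pct] percentiles to those
    percentile values, so outliers are capped rather than dropped."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    return [min(max(v, low), high) for v in values]


def micro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over all classes, then compute 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical numeric attribute with one extreme outlier (995).
ages = [0, 5, 10, 10, 15, 20, 25, 30, 35, 40, 995]
capped = winsorize(ages, lower_pct=10.0, upper_pct=90.0)

# Hypothetical damage grades (1 = low, 2 = medium, 3 = high).
actual = [1, 2, 3, 2, 3]
predicted = [1, 2, 2, 2, 3]
score = micro_f1(actual, predicted)
```

Here the outlier 995 is pulled down to the 90th-percentile value and the smallest value is raised to the 10th-percentile value, while every other observation is left unchanged. In practice the capped attributes would then be passed to the classifier in place of the raw ones.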
Published in: American Journal of Biological and Environmental Statistics (Volume 6, Issue 3)
DOI: 10.11648/j.ajbes.20200603.14
Page(s): 58-63
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2020. Published by Science Publishing Group
Keywords: Random Forest Classifier, Gradient Boosting Classifier, Winsorizing, Earthquake
APA Style
Sourav Pandurang Adi, Vivek Bettadapura Adishesha, Keshav Vaidyanathan Bharadwaj, Abhinav Narayan. (2020). Earthquake Damage Prediction Using Random Forest and Gradient Boosting Classifier. American Journal of Biological and Environmental Statistics, 6(3), 58-63. https://doi.org/10.11648/j.ajbes.20200603.14
ACS Style
Sourav Pandurang Adi; Vivek Bettadapura Adishesha; Keshav Vaidyanathan Bharadwaj; Abhinav Narayan. Earthquake Damage Prediction Using Random Forest and Gradient Boosting Classifier. Am. J. Biol. Environ. Stat. 2020, 6(3), 58-63. doi: 10.11648/j.ajbes.20200603.14
AMA Style
Sourav Pandurang Adi, Vivek Bettadapura Adishesha, Keshav Vaidyanathan Bharadwaj, Abhinav Narayan. Earthquake Damage Prediction Using Random Forest and Gradient Boosting Classifier. Am J Biol Environ Stat. 2020;6(3):58-63. doi: 10.11648/j.ajbes.20200603.14
@article{10.11648/j.ajbes.20200603.14,
  author = {Sourav Pandurang Adi and Vivek Bettadapura Adishesha and Keshav Vaidyanathan Bharadwaj and Abhinav Narayan},
  title = {Earthquake Damage Prediction Using Random Forest and Gradient Boosting Classifier},
  journal = {American Journal of Biological and Environmental Statistics},
  volume = {6},
  number = {3},
  pages = {58-63},
  doi = {10.11648/j.ajbes.20200603.14},
  url = {https://doi.org/10.11648/j.ajbes.20200603.14},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajbes.20200603.14},
  year = {2020}
}
TY  - JOUR
T1  - Earthquake Damage Prediction Using Random Forest and Gradient Boosting Classifier
AU  - Sourav Pandurang Adi
AU  - Vivek Bettadapura Adishesha
AU  - Keshav Vaidyanathan Bharadwaj
AU  - Abhinav Narayan
Y1  - 2020/10/21
PY  - 2020
N1  - https://doi.org/10.11648/j.ajbes.20200603.14
DO  - 10.11648/j.ajbes.20200603.14
T2  - American Journal of Biological and Environmental Statistics
JF  - American Journal of Biological and Environmental Statistics
JO  - American Journal of Biological and Environmental Statistics
SP  - 58
EP  - 63
PB  - Science Publishing Group
SN  - 2471-979X
UR  - https://doi.org/10.11648/j.ajbes.20200603.14
VL  - 6
IS  - 3
ER  -