Evaluating and Forecasting the Probability of Lightning Occurrence in Rasht City

Document Type : Research Paper

Authors

Department of Environmental Sciences and Engineering, Faculty of Natural Resources, University of Kurdistan, Sanandaj, Iran

Abstract

Lightning is one of the most severe weather hazards and causes significant economic, social and environmental damage each year. Predicting lightning is very difficult because the weather conditions that produce it vary widely in space and time, both physically and dynamically. Timely forecasting of lightning, and identification of the best data mining model for doing so, therefore helps to reduce damage. In this research, data from the Rasht Meteorological Station for the years 2012-2018 were used. The dependent variable was the occurrence or non-occurrence of lightning over the 7 years, and the independent variables were factors affecting lightning: temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning. After preprocessing, data mining models were built in SPSS Modeler version 20: the Classification and Regression Tree (CART), Chi-squared Automatic Interaction Detector (CHAID) and C5 decision trees, and the neural network models Radial Basis Function (RBF), Multi-Layer Perceptron (MLP) and Support Vector Machine (SVM). The models were compared using comparative criteria and the receiver operating characteristic (ROC) curve. According to the results, the probability of lightning occurrence is higher in May, June and July than in other months, and the rate of occurrence decreases from spring to winter, reaching its minimum in winter. The CHAID tree, with a specificity of 0.794 and the lowest false positive rate of 0.205, and the SVM model, with an accuracy of 0.773, a root mean square error of 0.475 and a precision of 0.855, performed best among the models.
Extended Abstract
1-Introduction
Lightning is the ionization of the atmosphere caused by an increased potential difference between cloud and ground, followed by a rapid electrical discharge that is seen as light and heard as sound. Intensifying lightning activity can be accompanied by thunderstorms, heavy rain, floods and tornadoes. Rasht is the largest rice-growing city in the country and produces 11% of the country's required rice. In recent years, lightning-related incidents have doubled the importance of predicting lightning: lodging of rice stalks and the risk of paddy disease, roads blocked by floods, traffic congestion, damage to buildings, a bridge collapse and deaths from electric shock. The main purpose of this study was to use ground records of lightning occurrence and non-occurrence (binary data), together with related meteorological parameters (temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning), to estimate the probability of future lightning occurrence with data mining (tree and neural network models), and to evaluate the models and determine the optimal one in order to reduce future damage.
2-Materials and Methods
In this study, binary lightning data and atmospheric parameters (temperature, relative humidity, cloudiness, wind speed, wind direction, air pressure and the previous day's lightning) were obtained from the Rasht Meteorological Station for the years 2012-2018. The data were then normalized to the range zero to one according to Eq. (1), and the data classes were balanced using the random undersampling (RUS) and random oversampling (ROS) algorithms in RapidMiner software.
$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$
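The normalization in Eq. (1) and the two balancing steps can be illustrated with a minimal sketch in plain Python. The study itself used RapidMiner; the function names, the fixed random seed and the choice to balance the classes to exactly equal size are assumptions made here for illustration only.

```python
import random

def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: Eq. (1) is undefined, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def _class_indices(labels):
    """Return (minority, majority) index lists for binary 0/1 labels."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return (pos, neg) if len(pos) <= len(neg) else (neg, pos)

def random_undersample(rows, labels, seed=0):
    """RUS: randomly drop majority-class rows until the classes are equal."""
    rng = random.Random(seed)
    minority, majority = _class_indices(labels)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]

def random_oversample(rows, labels, seed=0):
    """ROS: randomly duplicate minority-class rows until the classes are equal.

    Assumes both classes are present in the data.
    """
    rng = random.Random(seed)
    minority, majority = _class_indices(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    kept = sorted(minority + majority + extra)
    return [rows[i] for i in kept], [labels[i] for i in kept]

# Hypothetical example: three temperature readings and six daily labels.
print(min_max_normalize([10.0, 20.0, 30.0]))
rows, labels = list(range(6)), [1, 0, 0, 0, 1, 0]
print(random_undersample(rows, labels)[1])
print(random_oversample(rows, labels)[1])
```

Undersampling discards majority-class days, while oversampling repeats minority-class days; which of the two (or which combination) fed the final models is not stated in the text.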
Variable transformation, with determination of statistical properties and correlations, was performed in SPSS software to reduce errors. Finally, SPSS Modeler software was used to predict the future occurrence and non-occurrence of lightning with the CART, CHAID and C5 trees and with the Multi-Layer Perceptron (MLP), Radial Basis Function (RBF) and Support Vector Machine (SVM) models. The training data set contained 70% of the data and the testing data set 30%. The model outputs were then evaluated with the confusion matrix, the comparative criteria in Eqs. (2)-(9) and the ROC curve.
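$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{3}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{4}$$
$$\text{Harmonic Mean} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{5}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{6}$$
$$\text{False Positive Rate} = \frac{FP}{FP + TN} \tag{7}$$
$$\text{False Negative Rate} = \frac{FN}{FN + TP} \tag{8}$$
$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2} \tag{9}$$
where $O_i$ is the observed value and $P_i$ the predicted value for record $i$; $TP$, $TN$, $FP$ and $FN$ are the numbers of true positives, true negatives, false positives and false negatives; and $N$ is the number of data records.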
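The 70/30 split and the criteria in Eqs. (2)-(9) can be sketched in plain Python as follows. This is an illustration, not the SPSS Modeler procedure used in the study: the function names are hypothetical, and the random (rather than chronological) split is an assumption, since the text does not say how the records were partitioned.

```python
import math
import random

def split_70_30(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def evaluate(observed, predicted):
    """Compute Eqs. (2)-(9) for binary labels (1 = lightning, 0 = none)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    n = len(pairs)
    # Assumes both classes occur and at least one positive is predicted;
    # otherwise the ratios below divide by zero.
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / n,                                   # Eq. (2)
        "precision": precision,                                      # Eq. (3)
        "sensitivity": sensitivity,                                  # Eq. (4)
        "harmonic_mean": 2 * precision * sensitivity
                         / (precision + sensitivity),                # Eq. (5)
        "specificity": tn / (tn + fp),                               # Eq. (6)
        "false_positive_rate": fp / (fp + tn),                       # Eq. (7)
        "false_negative_rate": fn / (fn + tp),                       # Eq. (8)
        "rmse": math.sqrt(sum((p - o) ** 2 for o, p in pairs) / n),  # Eq. (9)
    }

# Hypothetical labels for ten test days (not data from the study).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
for name, value in evaluate(observed, predicted).items():
    print(f"{name}: {value:.3f}")
```

Note that specificity and the false positive rate always sum to one, as do sensitivity and the false negative rate, so only one index of each pair carries independent information.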
3-Results and Discussion
Lightning is one of the most important environmental hazards, and data mining is a suitable method for predicting it. The results show that prediction with data mining techniques is possible and effective. The probability of lightning occurrence is highest in spring (May and June) and summer (July); it then decreases and reaches its minimum in winter. The results also indicate that future lightning occurrence is more probable than non-occurrence. Among the three trees (CART, CHAID and C5), the CART and C5 trees had less satisfactory indices and did not reach the highest accuracy and precision in predicting future lightning. The CHAID tree, by contrast, made a correct prediction in a proportion of 0.76 of cases with a precision of 0.85, and predicted a lightning occurrence rate of 0.54, which is similar to the observed value of 0.62. Among the artificial network models, the Support Vector Machine (SVM) model had the highest utility, with an accuracy of 0.77, a precision of 0.85 and a predicted probability of lightning occurrence of 0.60, and it outperformed the Radial Basis Function (RBF) and Multi-Layer Perceptron (MLP) models. In terms of classification and the area under the ROC curve (AUC), the CHAID tree (0.829) was the best of the trees and the Support Vector Machine model (0.853) the best of the network models. The numerical results and their similarity to the observed values show that trees and artificial networks are effective in predicting the probability of future lightning occurrence, and that the CHAID tree and the Support Vector Machine model perform best among the models tested.
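The AUC values quoted above summarize how well a model's scores rank lightning days above non-lightning days. A minimal sketch of one standard way to compute it is given below; it uses the pairwise-ranking definition of AUC, which may differ from how SPSS Modeler computes it, and the function name and example scores are hypothetical.

```python
def roc_auc(observed, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive record receives
    a higher score than a randomly chosen negative one (ties count half).
    """
    pos = [s for o, s in zip(observed, scores) if o == 1]
    neg = [s for o, s in zip(observed, scores) if o == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for four days (not data from the study).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The double loop is quadratic in the number of records, which is acceptable for a few thousand daily observations; a sort-based rank formulation would be preferable for much larger data sets.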
4-Conclusion
According to the model outputs, the probability of lightning occurring in Rasht city is very high. The models show that the probability of lightning in April follows the same trend, but the maximum occurs in spring (May and June), owing to unstable weather conditions, and the probability in summer (July) is higher than in autumn and winter. The probability decreases from spring to winter, when it reaches its minimum. In the comparison of the CHAID tree and the Support Vector Machine model, the Support Vector Machine model was identified as the optimal model for predicting future lightning by a slight margin in the utility indices: accuracy = 0.773, precision = 0.855, harmonic mean = 0.813, root mean square error = 0.475 and false negative rate = 0.198. Given its reliable outputs, with the highest accuracy and precision and the lowest prediction error, the Support Vector Machine model performs well and can be used to forecast the probability of lightning occurrence in Rasht City. According to the models, the parameters that affect lightning occurrence are, in order of importance, the previous day's lightning, temperature, air pressure, relative humidity and cloudiness; the other parameters are less important. Predicting the probability of future lightning occurrence with data mining techniques, and in particular with the Support Vector Machine model as the most accurate and precise model, supports more accurate meteorological forecasting and more effective action to reduce future damage.
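As a consistency check on the indices above, suppose (the text does not say) that the root mean square error was computed on hard 0/1 class labels rather than on predicted probabilities. Each squared difference in Eq. (9) is then 1 for a misclassified record and 0 otherwise, so the RMSE reduces to the square root of the misclassification rate:

```latex
\text{RMSE} = \sqrt{\frac{FP + FN}{N}} = \sqrt{1 - \text{Accuracy}} = \sqrt{1 - 0.773} \approx 0.476
```

This is close to the reported value of 0.475, which suggests, but does not confirm, that the error was computed in this way.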
 

