Flood Prediction Using Logistic Regression
Ezhillin Freeda. S
Department of Computer Science and
Engineering
Sri Ramakrishna Engineering College
Coimbatore, India
ezhilinfreeda@srec.ac.in
2023 International Conference on Circuit Power and Computing Technologies (ICCPCT) | 979-8-3503-3324-4/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICCPCT58313.2023.10245832
Abstract— Flood prediction is crucial in mitigating the impact of floods on human life, property, and the environment. This paper proposes a machine learning approach using Logistic Regression for flood prediction, which uses rainfall data to determine the probability of flood occurrence. By analyzing the average rainfall and training the model on a large dataset, our model can generate more accurate predictions of flood occurrence. Our mobile application provides an early warning system through a flood risk map that is accessible to users. Our dataset comprises 37 regions in Tamil Nadu, which are the meteorological stations and also the base stations of dams. These allow for notification in the event of a flood and of a dam overflow, respectively. Furthermore, our mobile app enables users to easily identify camps and shelters established during flood periods and even add information about them. In addition, the app provides users with the necessary helpline numbers to reach out for assistance during floods.

Keywords—logistic regression, rainfall, flood, mobile application.

I. INTRODUCTION

One of the most common natural catastrophes is flooding, which is brought on by heavy downpours, overflowing rivers, breached dams, cyclones, and storm surges. It occurs when water runs onto normally dry land, damaging infrastructure and property, disrupting travel, and even claiming human lives. In India, floods are a recurring problem, and Tamil Nadu is one of the states most prone to flooding. Machine learning models have become increasingly popular in flood prediction because of their ability to analyze large datasets and identify patterns in flood data [1]. By using these models, we can examine a range of environmental as well as socioeconomic parameters, including rainfall and weather patterns, soil moisture, vegetation cover, land use, and terrain. These models can aid in identifying regions with a high risk of flooding and in creating an early warning system [2]. Our logistic regression model is trained to identify patterns and predict the likelihood of flooding in a particular area of Tamil Nadu. Using logistic regression, flood risk maps are created that can be used to allocate resources more effectively, evacuate residents from high-risk locations, and plan disaster response activities. A mobile application has been created that makes the above-mentioned flood prediction model accessible to everyone.

The foundation of machine learning is the notion that computers are capable of learning from data, recognizing patterns, and making decisions without human intervention. The process involves three key components: data, algorithms, and models. It is a rapidly growing field transforming how we use technology in our daily lives [4]. It is a branch of artificial intelligence that gives computers the capacity to learn from their experiences and advance without explicit programming. Finance, healthcare, and marketing [3] are just a few of the industries and applications that use machine learning algorithms. Supervised learning and unsupervised learning are two subcategories of machine learning. The flow diagram in figure 1 explains the workflow of machine learning algorithms. First, data are collected and then analyzed to identify patterns and relationships among them. Then, algorithms are developed to process the data and create models that can make predictions or decisions. Finally, the models are tested and refined to improve their accuracy and performance.

Fig 1 Flow diagram of Machine Learning model.

The use of machine learning models for flood prediction has grown significantly in recent years due to their effectiveness and potential. Recent statistics show that these models have achieved high accuracy rates in predicting floods, surpassing traditional methods. This has led to a rapid expansion in the adoption of machine learning for flood prediction, opening up new possibilities for managing and mitigating the impacts of floods. With ongoing advancements, machines are poised to accomplish tasks that were once considered impossible, promising a bright future for the application of machine learning in flood prediction and beyond.

A statistical technique called logistic regression [5] can be used to examine a dataset in which one or more independent factors affect the outcome. The result has only two possible values, hence it is measured by a binary (dichotomous) variable. It is an efficient tool that can aid in understanding and forecasting the behavior of systems or processes. One benefit of logistic regression is that it can handle a high number of predictor variables and is simple to execute and analyze. With suitably transformed or interaction terms, it can also capture non-linear relationships between variables. The objective is to identify the link between the independent variables and the binary dependent variable, and to use this relationship to generate predictions about future events. Logistic regression is thus a useful technique for assessing and forecasting binary outcomes [6].

The primary focus of [9] is to create an optimal flood detection model. In this study, a decision tree model is constructed, and machine learning algorithms such as Random Forest and Gradient Boosting are employed. The developed model performs multiple computations on datasets, incorporating an AI algorithm specifically designed for flood prediction [9]. By utilizing these techniques, the model aims to forecast and detect flood events with improved precision.

In [10], a Flood Prediction Model (FPM) is introduced to forecast river floods using an Artificial Neural Network (ANN) approach, chosen for its ability to address nonlinear problems. The FPM uses rainfall data to predict river water levels. While various factors contribute to water level fluctuations, this model considers two specific factors in the prediction process. By exploiting the ANN's nonlinear capabilities and incorporating relevant variables, the FPM aims to enhance flood prediction accuracy and contribute to effective river management strategies. Table 1 summarizes additional strategies [9] now in use and their drawbacks.
Table 1 Comparison of existing ML models for flood prediction
Fig 3 Flood Prediction using Logistic Regression workflow

The dataset includes a range of parameters that are relevant to flood prediction, including the name of each station, its latitude and longitude, the average rainfall over the preceding 10 days, the annual rainfall, and a binary indicator [6] of whether or not a flood occurred. A series of preprocessing steps is conducted to clean and prepare the data for analysis. These include removing any incomplete or erroneous entries, normalizing the data to ensure consistency across the different parameters, and splitting the data into training and test datasets at a ratio of 4:1. The training data [9] is used to refine and validate the model, and the test data to evaluate its accuracy and effectiveness in predicting future floods in Tamil Nadu. The Logistic Regression model is trained using data collected over the previous 10 days from 37 meteorological stations across the region.

Fig 4 System architecture

First, the user logs in to the application with a mobile number (figure 5). After logging in, users are directed to the home screen, which presents a variety of options to choose from, as shown in figure 6. The home screen offers users a centralized platform to access crucial information and services related to meteorological conditions and emergency response. It aims to provide convenience, timely updates, and essential resources so that users can navigate weather-related challenges effectively.
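The dataset preparation and training steps described for the Logistic Regression model (dropping incomplete or erroneous entries, normalizing, splitting 4:1, then fitting) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper does not state which library or optimizer was used, so plain batch gradient descent is assumed here, the rainfall values are synthetic placeholders rather than the Tamil Nadu data, and every function and variable name is hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def clean(rows):
    # Drop incomplete (None) or erroneous (negative rainfall) entries.
    return [r for r in rows if None not in r and r[0] >= 0 and r[1] >= 0]

def min_max_normalize(features):
    # Scale every feature column to the range [0, 1].
    cols = list(zip(*features))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)]
            for row in features]

def split_4_to_1(samples, seed=0):
    # Shuffle reproducibly, then keep 4/5 for training and 1/5 for testing.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

def flood_probability(w, b, x):
    # Estimated probability of a flood for one normalized feature vector.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the logistic (cross-entropy) loss.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = flood_probability(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Synthetic placeholder rows: (10-day average rainfall, annual rainfall, flood).
raw_rows = [(5.0 * i, 800.0 + 10.0 * i, 1 if 5.0 * i > 100 else 0)
            for i in range(40)]
raw_rows += [(None, 900.0, 0), (-5.0, 900.0, 0)]  # entries that must be dropped

rows = clean(raw_rows)
X = min_max_normalize([[r[0], r[1]] for r in rows])
y = [r[2] for r in rows]
train_set, test_set = split_4_to_1(list(zip(X, y)))
X_train, y_train = zip(*train_set)
w, b = train(X_train, y_train)

# Fraction of held-out samples classified correctly at a 0.5 threshold.
accuracy = sum((flood_probability(w, b, x) >= 0.5) == bool(t)
               for x, t in test_set) / len(test_set)
print(f"test accuracy: {accuracy:.2f}")
```

In this sketch the normalization is computed over the whole dataset before the split, following the order in which the steps are listed; computing the scaling parameters on the training portion only would avoid leaking test-set statistics into training.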
REFERENCES