Final Project Report


A MACHINE LEARNING MODEL FOR WEATHER FORECASTING

A PROJECT REPORT

Submitted by

NEETIKA NIGAM (CSJMA17001390151)
NIDHI JAISWAL (CSJMA17001390152)
SURABHI GUPTA (CSJMA17001390176)

in partial fulfillment for the award of the degree

of

BACHELOR OF TECHNOLOGY

in
INFORMATION TECHNOLOGY

UNIVERSITY INSTITUTE OF ENGINEERING AND TECHNOLOGY

CHHATRAPATI SHAHU JI MAHARAJ UNIVERSITY, KANPUR NAGAR (U.P.)
JULY 2021
Project Report: Machine Learning Model for Weather Forecasting

CERTIFICATE

Certified that this project report “A MACHINE LEARNING MODEL FOR WEATHER
FORECASTING” is the bonafide work of “NEETIKA NIGAM, NIDHI JAISWAL, and
SURABHI GUPTA” who carried out the project work under my supervision.

Er. Prateek Srivastava          Dr. Rashi Agrawal
Guide                           Head of the Department

INFORMATION TECHNOLOGY

UNIVERSITY INSTITUTE OF ENGINEERING AND TECHNOLOGY

CHHATRAPATI SHAHU JI MAHARAJ UNIVERSITY, KANPUR NAGAR (U.P.)


ACKNOWLEDGEMENT

We would like to express our special thanks of gratitude to our project guide,
Er. Prateek Srivastava Sir, as well as our HOD, Dr. Rashi Agrawal Ma'am, who gave
us the golden opportunity to do this wonderful project on the topic “A Machine
Learning Model for Weather Forecasting”. The project also led us to do a lot of
research, through which we came to know about many new things, and we are really
thankful to them.
Secondly, we would also like to thank our parents and friends, who helped us a
lot in finalizing this project within the limited time frame.

Date : 12-07-2021
Place : Sitapur

Neetika Nigam (CSJMA17001390151)
Nidhi Jaiswal (CSJMA17001390152)
Surabhi Gupta (CSJMA17001390176)


TABLE OF CONTENTS

ABSTRACT
BACKGROUND
OBJECTIVE
1. INTRODUCTION
1.1 Introduction
1.2 Machine Learning
1.3 Use of Algorithms
2. METHODOLOGY
3. EXPERIMENTATION
4. RESULT AND DISCUSSION
4.1 Multiple Linear Regression
4.2 Decision Tree Regression
4.3 Random Forest Regression
5. CONCLUSION


ABSTRACT

Traditionally, weather forecasting has been performed by treating the atmosphere
as a fluid. The current state of the atmosphere is observed, and its future state is
computed by numerically solving the equations of fluid dynamics and thermodynamics.
However, the system of differential equations that governs these physical models is
unstable under perturbations, and uncertainties in the initial measurements of
atmospheric conditions, together with an incomplete understanding of atmospheric
processes, limit weather forecasts to about 10 days, beyond which they become
significantly unreliable. Machine learning, by contrast, is relatively robust to
perturbations compared with traditional techniques. Another advantage of machine
learning is that it does not depend on the physical laws governing atmospheric
processes.

Background
At present, weather forecasting in India is carried out with traditional methods.
There are four common methods of predicting the weather. The first is the climatology
method, which reviews weather statistics gathered over multiple years and calculates
their averages. The second is the analog method, which finds a day in the past with
weather similar to the current forecast. The third is the persistence and trends
method, which requires no forecasting skill because it relies on past trends. The
fourth is numerical weather prediction, which makes predictions based on multiple
conditions in the atmosphere such as temperature, wind speed, high- and low-pressure
systems, rainfall, snowfall, and other conditions. These traditional methods have
several limitations: they forecast the temperature for the current month at most, and
they make no use of machine learning algorithms. Therefore, our project aims to
improve accuracy and to predict the weather at least one month ahead by applying
machine learning techniques.
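
The climatology and persistence methods described above are simple enough to sketch in a few lines, which also makes them useful baselines for any learned model. The following is a minimal illustration only: the function names and the temperature values are hypothetical and do not come from the project.

```python
from collections import defaultdict
from statistics import mean


def climatology_forecast(history, month, day):
    """Climatology: average the temperatures seen on this calendar day in past years.

    `history` maps (year, month, day) to an observed temperature in Celsius.
    """
    by_calendar_day = defaultdict(list)
    for (_, m, d), temp in history.items():
        by_calendar_day[(m, d)].append(temp)
    return mean(by_calendar_day[(month, day)])


def persistence_forecast(recent_temps):
    """Persistence: assume tomorrow's temperature equals the latest observation."""
    return recent_temps[-1]


# Illustrative values only, not taken from the Kanpur dataset.
history = {
    (2018, 7, 1): 34.0,
    (2019, 7, 1): 36.0,
    (2020, 7, 1): 35.0,
    (2020, 7, 2): 33.0,
}
clim = climatology_forecast(history, 7, 1)     # mean of 34, 36 and 35
pers = persistence_forecast([31.5, 32.0, 33.5])  # last observed value
```

A learned model is only worth its added complexity if its error is lower than that of baselines like these on the same test days.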

Objective (Brief)
The purpose of this project is to predict the temperature using different algorithms:
linear regression, random forest regression, and decision tree regression. The output
should be a numerical value based on multiple input factors such as maximum
temperature, minimum temperature, cloud cover, humidity, sun hours per day,
precipitation, pressure, and wind speed.


1. INTRODUCTION

Weather prediction is the task of predicting the state of the atmosphere at a future
time and a given location. In the early days, this was done with physical equations in
which the atmosphere is treated as a fluid. The current state of the atmosphere is
observed, and the future state is predicted by solving those equations numerically.
However, this approach cannot produce very accurate forecasts more than about 10 days
ahead, and this can be improved with the help of science and technology.

Machine learning can be used to compare historical weather forecasts directly with
observations. With the use of machine learning, weather models can better account for
prediction inaccuracies, such as overestimated rainfall, and produce more accurate
predictions. Temperature prediction is of major importance in a large number of
applications, including climate-related studies, energy, agriculture, and medicine.

There are numerous machine learning algorithms, including Linear Regression,
Polynomial Regression, Random Forest Regression, Artificial Neural Networks, and
Recurrent Neural Networks. These models are trained on the historical data available
for an area. When predicting temperature, for example, the input to these models
consists of the minimum temperature, mean atmospheric pressure, maximum temperature,
mean humidity, and weather classification for 2 days. Based on this input, the minimum
and maximum temperatures for the next 7 days are predicted.

Machine Learning

Machine learning is relatively robust to perturbations and does not need a detailed
physical model of the atmosphere for prediction. Therefore, machine learning is a
promising direction in the evolution of weather forecasting. Before the advancement of
technology, weather forecasting was a hard nut to crack. Weather forecasters relied on
satellites and data models of atmospheric conditions, with limited accuracy. Over the
last 40 years, weather prediction and analysis have improved greatly in accuracy and
predictability, helped by technologies such as the Internet of Things. With the
advancement of data science and artificial intelligence, scientists now forecast the
weather with high accuracy and predictability.


USE OF ALGORITHMS:

There are different methods of predicting temperature using regression and variants of
functional regression, in which datasets are used to perform the calculations and
analysis. To train the algorithms, 80% of the data is used, and the remaining 20% is
set aside as a test set. For example, to predict the temperature of Kanpur, India
using these machine learning algorithms, we would use 8 years of data to train the
algorithms and 2 years of data as a test dataset. In contrast to traditional weather
forecasting, which depends essentially on simulation based on physics and differential
equations, artificial intelligence is also used for predicting temperature, with models
such as linear regression, decision tree regression, and random forest regression. To
conclude, machine learning has greatly changed the paradigm of weather forecasting,
offering high accuracy and predictability. In the next few years, further progress will
be made using these technologies to predict the weather accurately and to help avoid
disasters caused by typhoons, tornadoes, and thunderstorms.
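
The 80/20 split described above (8 years for training, 2 years for testing) can be sketched as follows. This is a minimal illustration that assumes the split is chronological, so the test years come after the training years; the function name `chronological_split` and the sample values are hypothetical.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows so that every test row is later than every training row.

    `rows` is a list of (timestamp, value) pairs; any sortable timestamp works.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ordered = sorted(rows, key=lambda row: row[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Ten illustrative yearly records: 8 go to training, 2 to testing.
rows = [(year, 20.0 + year % 5) for year in range(2010, 2020)]
train, test = chronological_split(rows)
```

Splitting by time rather than at random keeps information about later days out of the training set, which matters when the model is meant to forecast days it has not seen.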


2. METHODOLOGY

The dataset used in this project was collected from Kaggle: “Historical Weather Data
for Indian Cities”, from which we chose the data for “Kanpur City”. The dataset was
created with the need for such historical weather data in the community in mind. It
covers the top 8 Indian cities by population. The data was retrieved with the help of
the worldweatheronline.com API and the wwo_hist package. The datasets contain hourly
weather data from 01-01-2009 to 01-01-2020, so the data for each city spans more than
10 years. This data can be used to visualize changes due to global warming or to
predict the weather for upcoming days, weeks, months, seasons, etc.
Note: The data was extracted with the help of the worldweatheronline.com API, and we
cannot guarantee the accuracy of the data.
The main use of this dataset is to predict the weather for the next day or week from
the large amount of data it provides. Furthermore, the data can be used to make
visualizations that help in understanding the impact of global warming on various
aspects of the weather such as precipitation, humidity, and temperature.

In this project, we concentrate on predicting the temperature of Kanpur city with the
help of several machine learning regression algorithms. We apply them in turn to the
historical weather dataset of Kanpur city: first Multiple Linear Regression, then
Decision Tree Regression, and finally Random Forest Regression.
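
Fitting the three regressors in sequence might look like the sketch below. It is not the project's actual code: it assumes scikit-learn and NumPy are available, uses synthetic stand-in features instead of the Kaggle data, and the hyperparameters (`n_estimators=100`, `random_state=0`) are illustrative choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for four of the input factors:
# maximum temperature, minimum temperature, humidity, pressure.
X = np.column_stack([
    rng.uniform(20, 45, n),
    rng.uniform(5, 30, n),
    rng.uniform(10, 100, n),
    rng.uniform(990, 1020, n),
])
# Synthetic target temperature; the real target comes from the dataset.
y = 0.5 * X[:, 0] + 0.5 * X[:, 1] - 0.02 * X[:, 2] + rng.normal(0, 1, n)

# First 80% of the rows for training, last 20% for testing.
cut = n * 8 // 10
X_train, X_test = X[:cut], X[cut:]
y_train, y_test = y[:cut], y[cut:]

models = {
    "Multiple Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and record its mean absolute error on the held-out rows.
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))
```

Because the target here is generated from a linear formula, the ranking of the models on this synthetic data says nothing about their ranking on the Kanpur dataset.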

Table 2.1: Historical Weather Dataset of Kanpur City


Figure 2.1: Plot for each factor for 10 years

Figure 2.2: Plot for each factor for 1 year


3. EXPERIMENTATION

The dataset is first divided into a training set and a test set, and every record is
labeled with its observed temperature. We start with the training set and explore it
with the help of histograms and plots of each feature. We then build our models. The
models we consider are Linear Regression, Decision Tree Regression, and Random Forest
Regression, and each is trained on the training set. The most important part of this
process is tuning the model parameters so that we get the most accurate results. Once
training is complete, we take the test set. For each record in the test set, we extract
the same features and pass them to the trained model, which predicts the output for
each test day. To measure accuracy, we compare each predicted value with the labeled
value. The metrics we use include the mean absolute error and the R2 score.
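
The R2 score named above, and the mean absolute error reported in the results, both compare predicted values with labeled values. The following is a minimal sketch of the standard definitions in plain Python; the sample temperatures are illustrative and not taken from the project.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between labeled and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative temperatures in Celsius.
actual = [30.0, 32.0, 34.0, 36.0]
predicted = [31.0, 32.0, 33.0, 37.0]

mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 1 + 1) / 4 = 0.75
r2 = r2_score(actual, predicted)              # 1 - 3 / 20, about 0.85
```

A lower mean absolute error and an R2 score closer to 1 both indicate predictions that track the labeled temperatures more closely.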


4. RESULT AND DISCUSSION

The results of the implementation of the project are demonstrated below.

Multiple Linear Regression:


This regression model has a high mean absolute error and hence turned out to be the
least accurate model. Given below is a snapshot of the actual result from the project
implementation of multiple linear regression.


Decision Tree Regression:


This regression model has a medium mean absolute error and hence turned out to be
moderately accurate. Given below is a snapshot of the actual result from the project
implementation of decision tree regression.


Random Forest Regression:


This regression model has a low mean absolute error and hence turned out to be the
most accurate model. Given below is a snapshot of the actual result from the project
implementation of random forest regression.


5. CONCLUSION

All the machine learning models (linear regression, multiple linear regression,
decision tree regression, and random forest regression) were outperformed by
professional weather forecasting services, although the error in their predictions
decreased significantly for later days, indicating that over longer time frames our
models may outperform the professional ones.

Linear regression proved to be a low-bias, high-variance model, whereas polynomial
regression proved to be a high-bias, low-variance model. Linear regression is
inherently a high-variance model because it is sensitive to outliers, so one way to
improve the linear regression model is to collect more data. Functional regression,
however, had high bias, indicating that the choice of model was poor and that its
predictions cannot be improved by collecting more data. This bias could be due to the
design decision to forecast temperature based on the weather of the previous two days,
which may be too short to capture the trends in the weather that functional regression
requires. If the forecast were instead based on the weather of the past four or five
days, the bias of the functional regression model could probably be reduced. However,
this would require considerably more computation time along with retraining of the
weight vector w, so it is deferred to future work.
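
The difference between forecasting from the previous two days and from the previous four or five days comes down to how many lagged values each training sample contains. The sketch below illustrates that idea only; the helper `make_lagged_samples` and the temperatures are hypothetical, and it uses temperature alone rather than the full set of weather features.

```python
def make_lagged_samples(temps, n_lags):
    """Pair each day's temperature with the temperatures of the n_lags days before it.

    Returns a list of (previous_days, target) tuples in chronological order.
    """
    if n_lags < 1 or n_lags >= len(temps):
        raise ValueError("n_lags must be at least 1 and smaller than len(temps)")
    samples = []
    for i in range(n_lags, len(temps)):
        samples.append((temps[i - n_lags:i], temps[i]))
    return samples


# Illustrative daily temperatures in Celsius.
temps = [30.0, 31.0, 33.0, 32.0, 34.0, 35.0]

two_day = make_lagged_samples(temps, 2)   # 4 samples with 2 inputs each
four_day = make_lagged_samples(temps, 4)  # 2 samples with 4 inputs each
```

A longer window gives each sample more context but leaves fewer samples and more inputs to fit, which is consistent with the extra computation and retraining noted above.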

Random Forest Regression proved to be the most accurate regression model in this
project. It is also one of the most popular regression models, since it is accurate
and versatile. Below is a snapshot of the implementation of Random Forest in the project.

Weather forecasting faces the major challenge of producing accurate results, which are
used in numerous real-time systems such as power departments, airports, and tourism
centers. The difficulty of this forecasting lies in the complex nature of its
parameters: every parameter has a different range of values.
