CropPredictionUsingMLPythonReport
The classifier models used here include Logistic Regression, Naive Bayes and
Random Forest, of which Random Forest provides the maximum accuracy. The
predictions made by the machine learning algorithms will help farmers decide
which crop to grow to produce the most yield, by considering factors like
temperature, rainfall, area, etc. This bridges the gap between technology and
the agriculture sector.
INTRODUCTION
Agriculture, since its invention and inception, has been the prime and
pre-eminent activity of every culture and civilization throughout the history
of mankind. It is not only an enormous aspect of the growing economy, but it
is essential for us to survive. It is also a crucial sector for the Indian
economy and for the human future, and it contributes an outsized portion of
employment. As time passes, the requirement for production has increased
exponentially. In order to produce in mass quantity, people are using
technology in the wrong way. New sorts of hybrid varieties are produced day
by day; however, these varieties do not provide the essential contents of
naturally produced crops. These unnatural techniques spoil the soil and end
up causing further environmental harm. Most of these unnatural techniques
are used to avoid losses.
But when the producers of the crops know accurate information on the crop
yield, the loss is minimized. Machine learning is a fast-growing approach
that is spreading out and helping every sector make viable decisions and get
the most out of its applications. Most devices nowadays are facilitated by
models that are analyzed before deployment. The main concept is to increase
the throughput of the agriculture sector with machine learning models.
Another factor that affects the prediction is the amount of data given
during the training period, as the number of parameters is comparatively
high. The core emphasis is on precision agriculture, where quality is
ensured despite undesirable environmental factors. In order to perform
accurate prediction and handle the inconsistent trends in temperature and
rainfall, various machine learning classifiers like Logistic Regression,
Naive Bayes, Random Forest, etc. are applied to extract a pattern. By
applying the above machine learning classifiers, we came to the conclusion
that the Random Forest algorithm provides the most accurate value. The
system predicts the crop from a collection of past data, using past
information on weather, temperature and a number of other factors. The
application we developed runs the algorithm and shows the list of crops
suitable for the entered data, along with the predicted yield value.
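To make the classifier comparison above concrete, the following is a minimal
sketch of how the three classifiers could be trained and compared with
scikit-learn; the CSV file name and the feature columns are illustrative
assumptions, not the project's exact dataset.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical dataset: climatic/soil features plus a 'label' column naming the crop.
df = pd.read_csv('crop_data.csv')
X = df[['temperature', 'humidity', 'rainfall', 'ph']]
y = df['label']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Naive Bayes': GaussianNB(),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))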
LITERATURE SURVEY
PROJECT DESCRIPTION
Data preprocessing is a method used to convert raw data into a clean data
set. The data are gathered from different sources and collected in a raw
format that is not feasible for analysis. By applying techniques like
replacing missing values and null values, we can transform the data into an
understandable format. The final step of data preprocessing is splitting the
data into training and testing sets. The data usually tend to be split
unequally because training the model usually requires as many data points as
possible. The training dataset is the initial dataset used to train ML
algorithms to learn and produce accurate predictions (here 80% of the
dataset is taken as the training dataset). Fig. 1 shows a few rows of the
preprocessed data.
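As a rough illustration of the preprocessing described above, the sketch
below fills missing numeric values, drops rows with null labels and performs
the 80/20 split; the file and column names are assumptions for illustration
only.

import pandas as pd
from sklearn.model_selection import train_test_split

raw = pd.read_csv('raw_crop_data.csv')           # hypothetical raw dataset

# Replace missing numeric values with the column mean and drop rows with null labels.
numeric_cols = raw.select_dtypes(include='number').columns
raw[numeric_cols] = raw[numeric_cols].fillna(raw[numeric_cols].mean())
clean = raw.dropna(subset=['label'])

# 80% of the dataset is used for training, as stated above.
train_df, test_df = train_test_split(clean, test_size=0.2, random_state=42)
print(train_df.head())                           # the kind of rows shown in Fig. 1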
There are many factors that affect the yield of any crop and its production.
These are basically the features that help in predicting the production of any
crop over the year. In this paper we include factors like Temperature,
Rainfall, Area, Humidity and Windspeed.
Random Forest: Random Forest has the ability to analyze crop growth in
relation to the current climatic conditions and biophysical changes. The
random forest algorithm creates decision trees on different data samples,
predicts the data from each subset and then, by voting, gives a better
solution for the system. Random Forest uses the bagging method to train the
data, which increases the accuracy of the result.
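The sketch below illustrates the bagging and voting behaviour described
above: each tree in the forest is fit on a bootstrap sample of the training
data and the forest predicts by majority vote across the trees. The dataset
and feature names are the same illustrative assumptions used in the earlier
sketch, not the project's exact data.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv('crop_data.csv')                        # hypothetical dataset
X = df[['temperature', 'humidity', 'rainfall', 'ph']]    # illustrative features
X_train, X_test, y_train, y_test = train_test_split(
    X, df['label'], test_size=0.2, random_state=42)

forest = RandomForestClassifier(
    n_estimators=100,   # number of bagged decision trees
    bootstrap=True,     # each tree is trained on a sample drawn with replacement
    random_state=42,
)
forest.fit(X_train, y_train)
print('accuracy:', accuracy_score(y_test, forest.predict(X_test)))

# Feature importances show which factors the forest's vote relied on most.
for name, score in zip(X.columns, forest.feature_importances_):
    print(name, round(score, 3))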
SPECIFIC REQUIREMENT
EXISTING SYSTEM
5. IoT (Internet of Things): IoT devices, such as sensors and drones, are
deployed in the field to collect real-time data on soil moisture, temperature,
and other environmental conditions. This information is then used to
optimize irrigation, fertilization, and other farming practices.
PROPOSED SYSTEM
1. Historical Data: Include past crop yields, growth patterns, and
environmental conditions for training machine learning models.
2. Machine Learning Models: Crop Classification Models: Train machine
learning models to classify satellite imagery and identify different crops in
a given area. Yield Prediction Models: Develop models to predict crop yield
based on historical data, current environmental conditions, and satellite
imagery (a minimal sketch follows this list). Pest and Disease Prediction
Models: Implement models that predict the likelihood of pest and disease
outbreaks based on weather and soil conditions.
3. GIS Integration: Geographical Information System (GIS): Use GIS for
spatial analysis and mapping of crop distribution, allowing for a better
understanding of regional variations.
5. IoT Devices: Smart Sensors: Install IoT devices and smart sensors in the
field to collect real-time data on temperature, humidity, and other
environmental factors. Automated Irrigation Systems: Integrate systems that
automate irrigation based on soil moisture levels and weather predictions.
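As referenced in item 2 above, a minimal sketch of a yield-prediction model
is given below; it trains a regressor on historical yields and environmental
conditions. The CSV file and column names are assumptions used only for
illustration.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

hist = pd.read_csv('historical_yield.csv')       # hypothetical historical data
features = ['temperature', 'rainfall', 'humidity', 'area']
X_train, X_test, y_train, y_test = train_test_split(
    hist[features], hist['yield'], test_size=0.2, random_state=42)

reg = RandomForestRegressor(n_estimators=200, random_state=42)
reg.fit(X_train, y_train)
print('MAE:', mean_absolute_error(y_test, reg.predict(X_test)))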
REQUIREMENT ANALYSIS
HARDWARE REQUIREMENTS
SYSTEM ANALYSIS
System analysis is the most essential part of the development of the project.
The analyst has to understand the functions and concepts in detail before
designing the appropriate computer-based system. He has to carry out a
customary approach that includes the following steps:
• Requirement specification
• Preliminary investigation
• Feasibility study
• Detailed investigation
• Design and coding
• Testing
• Implementation
FEASIBILITY STUDY
A feasibility study is a high-level capsule version of the entire system
analysis and design process. The study begins by classifying the problem
definition. Feasibility determines whether the system is worth building. Once
an acceptable problem definition has been generated, the analyst develops a
logical model of the system, and a search for alternatives is analyzed
carefully. There are three parts to a feasibility study.
Operational Feasibility
Technical Feasibility
This involves questions such as whether the technology needed for the
system exists, how difficult it will be to build, and whether the firm has
enough experience using that technology. The assessment is based on an
outline design of system requirements in terms of input, processes, output,
fields, programs and procedures. This can be quantified in terms of volume
of data, trends and frequency of updating in order to give an introduction to
the technical system. The application has been developed on the Windows XP
platform with a configuration of 1 GB RAM on an Intel Pentium Dual Core
processor, which makes it technically feasible. The technical feasibility
assessment is focused on gaining an understanding of the present technical
resources of the organization and their applicability to the expected needs
of the proposed system. It is an evaluation of the hardware and software and
how they meet the needs of the proposed system.
Economic Feasibility
DESIGN
Introduction:
UML Diagrams:
Actor:
A coherent set of roles that users of use cases play when interacting
with the use cases.
Use case:
A description of a sequence of actions, including variants, that a system
performs and that yields an observable result of value to an actor.
UML stands for Unified Modeling Language. UML is a language for
specifying, visualizing and documenting the system. This is the step taken
while developing any product after analysis. The goal is to produce a model
of the entities involved in the project which later need to be built. The
representation of the entities that are to be used in the product being
developed needs to be designed.
Use case diagrams model behavior within a system and help the developers
understand what the user requires. The stick man represents what is called
an actor.
A use case diagram can be useful for getting an overall view of the system
and clarifying who can do what and, more importantly, what they cannot do.
A use case diagram consists of use cases and actors and shows the
interaction between the use cases and actors.
Use case diagrams: the Admin actor can add new staff and view staff, users
and reports; the Staff actor can view plans and view their profile; the
Customer actor can view their profile, make crop predictions and view
reports.
Sequence Diagram
DFD LEVEL 0
DFD LEVEL 1
DFD LEVEL 2
DFDs: the Admin interacts with the crop detection and view reports processes.
E-R Diagrams:
The E-R and page-navigation diagrams cover the following pages: Home, About,
Services, Gallery, Admin Login, Staff Login, User Login, New User, Contact,
Admin Main, New Staff, Admin View Users, Admin View Staffs, Admin View
Contacts, User Main, User View Profile, User Make Prediction, User View
Reports, Staff Main, Staff View Profile, Staff View Users and Staff View
Reports.
Code:
Index.html
{% extends 'commonheader.html' %}
{% block content %}
<html lang="en">
<head>
<meta charset="UTF-8">
</head>
<body>
<!-- The carousel wrapper markup was truncated in the original listing; the
     container id and classes below are assumptions following the usual
     Bootstrap 5 pattern. -->
<div id="homeCarousel" class="carousel slide" data-bs-ride="carousel">
<div class="carousel-inner">
<div class="carousel-item active">
<!-- first slide content -->
</div>
<div class="carousel-item">
<!-- second slide content -->
</div>
</div>
<button class="carousel-control-prev" type="button"
data-bs-target="#homeCarousel" data-bs-slide="prev">
<span class="carousel-control-prev-icon" aria-hidden="true"></span>
<span class="visually-hidden">Previous</span>
</button>
<button class="carousel-control-next" type="button"
data-bs-target="#homeCarousel" data-bs-slide="next">
<span class="carousel-control-next-icon" aria-hidden="true"></span>
<span class="visually-hidden">Next</span>
</button>
</div>
</body>
</html>
{% endblock %}
Main.py
import datetime
import io
import pickle
import random

import firebase_admin
import numpy as np
import pandas as pd
import requests
import torch
# The imports below were omitted from the original listing but are required by
# the routes that follow (an assumption based on how the names are used).
from firebase_admin import credentials, firestore
from flask import Flask, redirect, render_template, request, session
from markupsafe import Markup
from PIL import Image
from torchvision import transforms

import config

# The service-account key path is a placeholder; it was not shown in the listing.
cred = credentials.Certificate('serviceAccountKey.json')
firebase_admin.initialize_app(cred)

app = Flask(__name__)
app.secret_key = "AgriCrop@12345"
df = pd.read_csv('plant(IBM - Z).csv')
crops = df['label'].unique()
disease_classes = ['Apple___Apple_scab',
'Apple___Black_rot',
'Apple___Cedar_apple_rust',
'Apple___healthy',
'Blueberry___healthy',
'Cherry_(including_sour)___Powdery_mildew',
'Cherry_(including_sour)___healthy',
'Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot',
'Corn_(maize)___Common_rust_',
'Corn_(maize)___Northern_Leaf_Blight',
'Corn_(maize)___healthy',
'Grape___Black_rot',
'Grape___Esca_(Black_Measles)',
'Grape___Leaf_blight_(Isariopsis_Leaf_Spot)',
'Grape___healthy',
'Orange___Haunglongbing_(Citrus_greening)',
'Peach___Bacterial_spot',
'Peach___healthy',
'Pepper,_bell___Bacterial_spot',
'Pepper,_bell___healthy',
'Potato___Early_blight',
'Potato___Late_blight',
'Potato___healthy',
'Raspberry___healthy',
'Soybean___healthy',
'Squash___Powdery_mildew',
'Strawberry___Leaf_scorch',
'Strawberry___healthy',
'Tomato___Bacterial_spot',
'Tomato___Early_blight',
'Tomato___Late_blight',
'Tomato___Leaf_Mold',
'Tomato___Septoria_leaf_spot',
'Tomato___Spider_mites Two-spotted_spider_mite',
'Tomato___Target_Spot',
'Tomato___Tomato_Yellow_Leaf_Curl_Virus',
'Tomato___Tomato_mosaic_virus',
'Tomato___healthy']
disease_model_path = 'models/plant_disease_model.pth'
# The CNN object `disease_model` is constructed elsewhere in the project; only
# the loading of its trained weights was included in this listing.
disease_model.load_state_dict(torch.load(
    disease_model_path, map_location=torch.device('cpu')))
disease_model.eval()
crop_recommendation_model_path = 'models/RandomForest.pkl'
crop_recommendation_model = pickle.load(
open(crop_recommendation_model_path, 'rb'))
def weather_fetch(city_name):
    """Fetch current temperature and humidity for a city from OpenWeatherMap.

    :params: city_name
    """
    api_key = config.weather_api_key
    base_url = "http://api.openweathermap.org/data/2.5/weather?"
    # The URL construction and return values were truncated in the original
    # listing; this completion follows the usual OpenWeatherMap query pattern.
    complete_url = base_url + "appid=" + api_key + "&q=" + city_name
    response = requests.get(complete_url)
    x = response.json()
    if x["cod"] != "404":
        y = x["main"]
        temperature = round((y["temp"] - 273.15), 2)  # Kelvin to Celsius
        humidity = y["humidity"]
        return temperature, humidity
    else:
        return None
"""
:params: image
"""
transform = transforms.Compose([
transforms.Resize(256),
transforms.ToTensor(),
])
image = Image.open(io.BytesIO(img))
img_t = transform(image)
img_u = torch.unsqueeze(img_t, 0)
yb = model(img_u)
prediction = disease_classes[preds[0].item()]
return prediction
# The route decorator was missing from the listing; the path is an assumption.
@app.route('/disease-predict', methods=['GET', 'POST'])
def disease_prediction():
    if request.method == 'POST':
        # Redirect back if the form was submitted without a file part.
        if 'file' not in request.files:
            return redirect(request.url)
        file = request.files.get('file')
        if not file:
            return render_template('userdiseasepredictions.html')
        try:
            img = file.read()
            prediction = predict_image(img)
            # disease_dic maps a class label to an HTML description; it is
            # defined elsewhere in the project.
            prediction = Markup(str(disease_dic[prediction]))
            return render_template('userdiseasepredictions.html',
                                   prediction=prediction)
        except Exception:
            pass
    return render_template('userdiseasepredictions.html')
@app.route('/fertilizer-predict')
def fertilizer_predict():
    try:
        return render_template('userfertilizerpredictions.html',
                               recommendation="")
    except Exception as e:
        return str(e)
@app.route('/fertilizer-predict1', methods=['POST'])
def fert_recommend1():
    crop_name = str(request.form['cropname'])
    N = int(request.form['nitrogen'])
    P = int(request.form['phosphorous'])
    K = int(request.form['pottasium'])
    # ph = float(request.form['ph'])
    df = pd.read_csv('Data/fertilizer.csv')
    nr = df[df['Crop'] == crop_name]['N'].iloc[0]
    pr = df[df['Crop'] == crop_name]['P'].iloc[0]
    kr = df[df['Crop'] == crop_name]['K'].iloc[0]
    # Differences between the ideal and supplied N, P and K values.
    n = nr - N
    p = pr - P
    k = kr - K
    # The dictionary below was missing from the original listing; it selects the
    # nutrient with the largest absolute deviation.
    temp = {abs(n): "N", abs(p): "P", abs(k): "K"}
    max_value = temp[max(temp.keys())]
    if max_value == "N":
        if n < 0:
            key = 'NHigh'
        else:
            key = "Nlow"
    elif max_value == "P":  # branch restored; it was truncated in the listing
        if p < 0:
            key = 'PHigh'
        else:
            key = "Plow"
    else:
        if k < 0:
            key = 'KHigh'
        else:
            key = "Klow"
    # fertilizer_dic maps the key to an HTML recommendation; it is defined
    # elsewhere in the project.
    response = Markup(str(fertilizer_dic[key]))
    return render_template('userfertilizerpredictions.html',
                           recommendation=response)
@app.route('/crop-predict', methods=['POST'])
def crop_prediction():
    if request.method == 'POST':
        N = int(request.form['nitrogen'])
        P = int(request.form['phosphorous'])
        K = int(request.form['pottasium'])
        ph = float(request.form['ph'])
        rainfall = float(request.form['rainfall'])
        # state = request.form.get("stt")
        city = request.form.get("city")
        final_prediction = "None"
        temperature, humidity = '', ''
        if weather_fetch(city) != None:
            temperature, humidity = weather_fetch(city)
            # Feature order follows the training data: N, P, K, temperature,
            # humidity, ph, rainfall (restored; the line was truncated in the listing).
            data = np.array([[N, P, K, temperature, humidity, ph, rainfall]])
            my_prediction = crop_recommendation_model.predict(data)
            final_prediction = my_prediction[0]
        now = datetime.datetime.now()
        userid = session['userid']
        id = str(random.randint(1000, 9999))
        # The Firestore record was only partially shown in the original listing;
        # the exact set of fields below is an assumption.
        json = {'id': id, 'userid': userid, 'date': str(now),
                'Nitrogen': N, 'Phosphorus': P, 'Potassium': K,
                'Prediction': str(final_prediction)}
        db = firestore.client()
        newuser_ref = db.collection('newprediction')
        id = json['id']
        newuser_ref.document(id).set(json)
        return render_template('usermakepredictions1.html',
                               prediction=final_prediction)
SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of
trying to discover every conceivable fault or weakness in a work product. It
provides a way to check the functionality of components, sub-assemblies,
assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its
requirements and user expectations and does not fail in an unacceptable
manner. There are various types of tests, and each test type addresses a
specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the
internal program logic is functioning properly, and that program inputs
produce valid outputs. All decision branches and internal code flow should be
validated. It is the testing of individual software units of the application;
it is done after the completion of an individual unit and before integration.
This is structural testing that relies on knowledge of the unit's
construction and is invasive. Unit tests perform basic tests at the component
level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly
defined inputs and expected results.
Integration testing
Integration tests are designed to test integrated software components
to determine if they actually run as one program. Testing is event driven
and is more concerned with the basic outcome of screens or fields.
Integration tests demonstrate that although the components were
individually satisfactory, as shown by successful unit testing, the
combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination
of components.
Functional test
Functional tests provide systematic demonstrations that the functions
tested are available as specified by the business and technical
requirements, system documentation, and user manuals.
System Test
System testing ensures that the entire integrated software system
meets requirements. It tests a configuration to ensure known and predictable
results. An example of system testing is the configuration oriented system
integration test. System testing is based on process descriptions and flows,
emphasizing pre-driven process links and integration points.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit
test phase of the software lifecycle, although it is not uncommon for coding
and unit testing to be conducted as two distinct phases.
Test objectives
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format.
• No duplicate entries should be allowed.
• All links should take the user to the correct page.
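Given the objectives above, a minimal pytest-style unit test is sketched
below. It loads the trained crop-recommendation model and checks that a
well-formed entry yields a known crop label; the model path, dataset name and
feature order mirror Main.py but are assumptions here, not verified
behaviour.

import pickle
import numpy as np
import pandas as pd

def test_crop_model_returns_known_label():
    # Load the pickled Random Forest model used by the crop-predict route.
    model = pickle.load(open('models/RandomForest.pkl', 'rb'))
    known_crops = pd.read_csv('plant(IBM - Z).csv')['label'].unique()
    # One well-formed entry: N, P, K, temperature, humidity, ph, rainfall.
    sample = np.array([[90, 42, 43, 25.0, 80.0, 6.5, 200.0]])
    assert model.predict(sample)[0] in known_crops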
Integration Testing
Software integration testing is the incremental integration testing of
two or more integrated software components on a single platform to produce
failures caused by interface defects.
Test Results:
All the test cases mentioned above passed successfully. No defects
encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system
meets the functional requirements.
Test Results:
All the test cases mentioned above passed successfully. No defects
encountered.
TestCase Number | Testing Scenario | Expected Result | Result
Registration Testing
TC – 01 | Clicking submit without entering details | Alert "Please fill all details" | Pass
Login Testing
CONCLUSION
The system was successfully developed to meet the needs of the clients. It was found to provide
all the features that are required for the organization. The accuracy and complexity of the software are
also ensured, and the system provides benefits such as a user-friendly environment, which serves to verify
the integrity of a remotely-hosted requirements monitor.
The primary objective of this study was to classify heart disease using different models and a
real-world dataset. The k-modes clustering algorithm was applied to a dataset of patients with heart
disease to predict the presence of the disease. The dataset was preprocessed by converting the age
attribute to years and dividing it into bins of 5-year intervals, as well as dividing the diastolic and systolic
blood pressure data into bins of 10 intervals. The dataset was also split on the basis of gender to take into
account the unique characteristics and progression of heart disease in men and women.
The elbow curve method was utilized to determine the optimal number of clusters for both the male
and female datasets. The results indicated that the MLP model had the highest accuracy of 87.23%.
These findings demonstrate the potential of k-modes clustering to accurately predict heart disease and
suggest that the algorithm could be a valuable tool in the development of targeted diagnostic and
treatment strategies for the disease. The study utilized the Kaggle cardiovascular disease dataset with
70,000 instances, and all algorithms were implemented on Google Colab. The accuracies of all algorithms
were above 86% with the lowest accuracy of 86.37% given by decision trees and the highest accuracy
given by multilayer perceptron, as previously mentioned.
FUTURE ENHANCEMENT
Limitations. Despite the promising results, there are several limitations that should be noted. First,
the study was based on a single dataset and may not be generalizable to other populations or patient
groups. Furthermore, the study only considered a limited set of demographic and clinical variables and
did not take into account other potential risk factors for heart disease, such as lifestyle factors or genetic
predispositions. Additionally, the performance of the model on a held-out test dataset was not evaluated,
which would have provided insight on how well the model generalizes to new, unseen data. Lastly, the
interpretability of the results and the ability to explain the clusters formed by the algorithm was not
evaluated. In light of these limitations, it is recommended to conduct further research to address these
issues and to better understand the potential of k-modes clustering.
Future research. Future research could focus on addressing the limitations of this study by
comparing the performance of the k-modes clustering algorithm with other commonly used clustering
algorithms, such as k-means or hierarchical clustering, to gain a more comprehensive understanding of
its performance. Additionally, it would be valuable to evaluate the impact of missing data and outliers on
the accuracy of the model and develop strategies for handling these cases. Furthermore, it would be
beneficial to evaluate the performance of the model on a held-out test dataset in order to establish its
generalizability to new, unseen data. Ultimately, future research should aim to establish the robustness
and generalizability of the results and the interpretability of the clusters formed by the algorithm, which
could aid in understanding the results and support decision making based on the study’s findings.
We plan to formalize our approach that will allow us to provide more rigorous evaluation. This
would include developing a core calculus for the TPM’s machine model based on the cryptographic
protocol Spi calculus. This semantics would account for the authentication, secrecy, and integrity
properties of the TPM. Furthermore, a formal semantics for our approach can be built on top of this core
calculus similar to the techniques
CHAPTER -7
REFERENCES