This study explores the use of machine learning techniques to detect SQL injection attacks (SQLIAs) in web applications, highlighting the effectiveness of various models such as decision trees, support vector machines, and neural networks. By analyzing a dataset of legitimate and malicious SQL queries, the research demonstrates high accuracy in distinguishing between benign and harmful queries, emphasizing the importance of feature selection and real-time detection capabilities. The findings suggest that machine learning can significantly enhance the security of database-driven applications against evolving SQLIAs.

International Journal of Scientific Research in Science and Technology

Available online at : www.ijsrst.com

Print ISSN: 2395-6011 | Online ISSN: 2395-602X doi : https://doi.org/10.32628/IJSRST24114323

Detection of SQL Injection Attack Using Machine Learning Techniques
Bhanu Pratap Singh1, Prof. Manish Kumar Singhal2
1M.Tech Scholar, NRI Institute of Information Science and Technology, Bhopal, Madhya Pradesh, India
2Associate Professor & H.O.D, Department of Information Technology (IT), NRI Institute of Information Science and Technology, Bhopal, Madhya Pradesh, India

ARTICLE INFO
Article History:
Accepted : 27 Nov 2024
Published : 27 Dec 2024
Publication Issue :
Volume 11, Issue 6
November-December-2024
Page Number :
780-790

ABSTRACT
SQL injection attacks (SQLIAs) remain a prevalent threat to web applications, exploiting vulnerabilities in database interactions to compromise data security. Detecting such attacks effectively is crucial for ensuring robust application security. This study investigates the use of machine learning techniques to identify SQLIAs by analyzing patterns and features in SQL queries. A dataset comprising both legitimate and malicious SQL queries is utilized to train and evaluate various machine learning models, including decision trees, support vector machines, and neural networks. The proposed approach achieves high accuracy in distinguishing between benign and malicious queries, showcasing the potential of machine learning for proactive SQLIA detection. The findings highlight the importance of feature selection, algorithm choice, and real-time detection capabilities in mitigating the risk of SQL injection attacks. This research provides a foundation for developing intelligent, automated systems to enhance the security of database-driven applications.
Keywords: SQL Injection, Cross-Site Scripting, Denial of Service Attack, Naïve Bayes, Gradient Boosting, etc.

Copyright © 2024 The Author(s): This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY-NC 4.0)
Bhanu Pratap Singh et al Int J Sci Res Sci & Technol. November-December-2024, 11 (6) : 780-790

I. INTRODUCTION

The proliferation of web applications and databases has made data security a critical concern. One of the most prevalent and dangerous threats to database systems is the SQL Injection (SQLi) attack. These attacks exploit vulnerabilities in applications to manipulate database queries, potentially exposing sensitive information or compromising system integrity. Traditional defensive mechanisms, such as input validation and parameterized queries, often fall short in detecting and mitigating sophisticated SQLi attempts.
With the advent of advanced technologies, machine learning (ML) has emerged as a promising solution for enhancing cyber security. Machine learning algorithms can analyze vast amounts of data, identify patterns, and detect anomalous behaviors indicative of SQLi attacks. Unlike traditional methods, ML-based approaches adapt to evolving attack strategies, offering a proactive and dynamic defense mechanism.
This study explores the application of machine learning techniques in detecting SQL Injection attacks, emphasizing their accuracy, efficiency, and adaptability. By leveraging supervised and unsupervised learning models, the proposed approach aims to strengthen database security and minimize the risks associated with SQLi threats.
The growing sophistication of SQL injection techniques, such as blind, error-based, and time-based injections, has made it increasingly difficult for conventional security measures to keep pace. Attackers continuously refine their methods to evade detection, exploiting even minor vulnerabilities in web applications. This has prompted a shift towards more intelligent and adaptive defense strategies, such as machine learning, which can learn from existing data and detect previously unseen attack vectors.
Machine learning models, especially classification algorithms like decision trees, support vector machines (SVM), and deep learning networks, can be trained on large datasets containing both normal and malicious queries. These models can then classify new, unseen SQL queries based on their features, such as syntax patterns, database operations, and user input characteristics. By training on labeled attack data, ML models can learn to distinguish between legitimate user inputs and potentially malicious ones, providing a dynamic defense against evolving attack techniques.
Furthermore, machine learning techniques offer the advantage of automation, enabling real-time detection and response to SQLi attempts. This reduces the burden on human analysts, allowing for faster mitigation of threats. The ability to continuously improve the accuracy of detection by feeding new attack data into the models ensures that the defense system remains robust against emerging SQL injection variants.

Fig-1 Securing web applications against SQLi attacks using a novel deep learning approach

II. LITERATURE SURVEY

Laila Aburashed et al. (2024) - This work observes that SQL injection is one of the most common vulnerabilities exploited for both privacy breaches and financial damage. The authors describe it as the top vulnerability on the most recent OWASP Top 10 list, with the number of such attacks on the rise. They address SQL injection detection with machine learning, using a classification method to label communications as either SQL injection or plain text, and propose a framework to assess the feasibility of a machine learning classifier for this task. Random Forest, Gradient Boosting, SVM, and ANN classifiers are evaluated; ANN demonstrated superior performance and required less time to detect SQL injection attacks [01].
Hakan Can Altunay et al. (2023) - This work describes the SQL injection attack as a type of cyber attack that puts individuals and institutions in a difficult situation in terms of data disclosure and material damage. This attack type, which is frequently preferred due to its ease of use, has emerged with different usage features in recent years. In this study,
various machine learning algorithms were tested to detect SQL injection attacks. In the data pre-processing stage, feature extraction was performed using Natural Language Processing techniques. The relevance of expressions to each other was calculated with the word-level TF-IDF method, and term search was also performed [02].
Maha Alghawazi et al. (2022) - This work notes that SQL injection attacks, in which attackers modify, delete, read, or copy data from database servers, are among the most damaging web application attacks. A successful SQL injection attack can affect all aspects of security, including confidentiality, integrity, and data availability. SQL (Structured Query Language) is used to represent queries to database management systems. Detection and deterrence of SQL injection attacks, to which techniques from different areas can be applied to improve detectability, is not a new area of research but is still relevant. Artificial intelligence and machine learning techniques have been tested and used to control SQL injection attacks, showing promising results. The main contribution of the paper is a systematic review of work on the machine learning and deep learning models used to detect SQL injection attacks, intended to keep researchers up to date and to contribute to the understanding of the intersection between SQL injection attacks and the artificial intelligence field [03].
Ravi Raj Choudhary et al. (2021) - This work considers web components and web applications that are accessible to the general public over the Internet and are therefore vulnerable to attack by an adversary. It is not uncommon for web and mobile applications to have a careless flaw that adversely affects their security and privacy. Database vulnerability attacks are becoming more common and harmful, so it is critical to understand software defects and, more importantly, to prevent these security issues. SQL injection and XSS scan the same security code, which is often employed in online applications developed in a weak and stable typed language. The authors therefore looked for points where an attacker may use a native notion or existing ways to access or hack data from the database, and asked what kind of issue arises if an attacker tries to make a database vulnerable by injecting SQL that affects the results [04].
Binh An Pham et al. (2020) - This work notes that SQL injection attacks (SQLi attacks) have proven their danger on several website types, such as social media and e-shopping. To prevent such attacks, the research investigates efficient ways of detection and prevention so as to preserve each user's right to privacy, and examines different ways to protect websites from SQL injection attacks. Machine learning (ML) algorithms, which learn from the data provided and infer results from the dataset, were used to detect SQLi attacks. SQL code and user input served as the data, and ML algorithms were used to detect malicious code [05].
Tareek Pattewar et al. (2019) - This work describes the SQL injection attack as a very serious problem for web applications, for which finding an efficient solution is essential. Researchers have developed many techniques to detect and prevent this vulnerability, but no single solution prevents all types of SQL injection attacks, and they remain one of the top concerns for cyber security researchers. Signature-based SQL injection detection methods are no longer reliable, as attackers use new types of SQL injection each time. There is a need for detection mechanisms capable of identifying new, never-before-seen attacks, and many researchers are considering applying machine learning to cyber security. Two machine learning classification algorithms are implemented on the problem: the Naïve Bayes classifier and the Gradient Boosting classifier. The Naïve Bayes classifier
machine learning model provides results with an accuracy of 92.8%. Ensemble learning methods are said to provide results with better accuracy, as they combine multiple simple classifiers to reduce error and improve accuracy [06].

III. PROPOSED METHOD

The goal of this project is to build a model that can detect SQL Injection (SQLi) attacks in SQL queries. SQLi attacks exploit vulnerabilities in database query processing, leading to unauthorized access and data breaches. The model classifies SQL queries as either malicious (SQLi) or benign (safe), helping to prevent security vulnerabilities in web applications.
A. Dataset Description
The dataset used for this project consists of SQL queries labeled as:
• Malicious (SQLi): Labeled as 1, representing a harmful SQL Injection attempt.
• Benign (Safe): Labeled as 0, representing a normal, safe SQL query.
Each row in the dataset represents a single SQL query, and the associated label indicates whether the query is benign or malicious. The data is text-based, and preprocessing is necessary to convert the raw SQL queries into a usable form for machine learning.
B. Data Preprocessing
• Text Cleaning: The raw SQL queries may contain noise such as special characters, extra whitespace, or null values. This will be cleaned to ensure uniformity and to eliminate unwanted elements that might interfere with model performance.
• Text Normalization: All text will be converted to lowercase to maintain consistency across all queries, as SQL queries may have different case conventions.
• Tokenization: The cleaned text will be split into smaller units, such as words or tokens, to better understand the structure of the query.
• Vectorization: The text data will be converted into numerical form. Techniques such as TF-IDF (Term Frequency-Inverse Document Frequency) will be used to represent the importance of each word or token in the context of the entire dataset. This allows the machine learning models to process text effectively.
Once the data is cleaned and vectorized, it will be divided into training, validation, and testing sets to evaluate the models' performance.
C. Feature Engineering
• TF-IDF Representation: The primary feature engineering technique involves transforming the SQL queries into numerical vectors using the TF-IDF method. This method assigns a weight to each word based on its frequency in the document and its rarity across the entire dataset.
• N-grams: In addition to individual words, sequences of words (n-grams) will also be considered as features. This helps capture contextual information in the queries, which may be crucial for identifying attack patterns in SQLi.
• Embeddings (Optional): If necessary, pre-trained word embeddings like Word2Vec or GloVe could be used to capture semantic meaning between words, improving model performance by considering word relationships.
D. Model Development
Random Forest:
Random Forest is an ensemble learning technique that builds multiple decision trees and aggregates their results. It works by learning patterns from the data, such as identifying which features (e.g., words or n-grams) contribute to the classification of a query as benign or malicious. This model is robust, less prone to overfitting, and can handle complex relationships in data.
How it Works:
Random Forest is an ensemble learning technique that combines multiple decision trees to make predictions. Each decision tree is trained on a subset
of the data and features, which introduces randomness to improve the model's generalization ability.
• Training: Random Forest creates multiple decision trees by using bootstrap sampling, where each tree is trained on a random sample of the training data drawn with replacement. At each node of the tree, the algorithm selects a random subset of features to split the data, ensuring that the trees are diverse. This randomness helps reduce overfitting.
• Prediction: Once all the trees are trained, each tree makes a prediction. The final prediction is determined by taking a majority vote from all the trees. This aggregation improves the model's robustness and makes it less sensitive to fluctuations or noise in the data.
Strengths of Random Forest:
• Robustness: Random Forest reduces overfitting by aggregating predictions from multiple decision trees, leading to more stable and accurate results.
• Handles High-dimensional Data: It works well with data containing many features (e.g., a large vocabulary in SQL queries) and can effectively learn complex patterns and relationships.
• Feature Importance: Random Forest can rank features based on their importance in predicting the target variable, helping us identify which words or tokens are most indicative of SQLi attacks.
Naive Bayes:
Naive Bayes is a probabilistic classifier based on Bayes' theorem. It works well for text classification tasks, especially when features (in this case, words or tokens in a query) are conditionally independent. Despite its simplicity, Naive Bayes is often effective for detecting patterns in textual data. In this project, it will help in distinguishing between benign and malicious queries based on word frequencies and probabilities.
How it Works:
Naive Bayes is a probabilistic classifier based on Bayes' Theorem, which calculates the probability of a class (benign or malicious) given the features (words or tokens in a SQL query). The model assumes that each feature is conditionally independent given the class label.
• Bayes' Theorem: The probability of a class C given the features X is calculated using Bayes' theorem:
$$P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)}$$
where
P(C | X) is the probability of the class given the features,
P(X | C) is the likelihood of the features given the class,
P(C) is the prior probability of the class, and
P(X) is the probability of the features.
Training: During training, Naive Bayes calculates the likelihood of each word or token appearing in benign and malicious queries. The model then uses these probabilities to classify new queries.
Classification: For a new SQL query, Naive Bayes computes the likelihood of the query belonging to each class (benign or malicious) based on the probabilities of the individual words and selects the class with the highest probability.
Strengths of Naive Bayes:
Simplicity and Speed: Naive Bayes is computationally efficient, making it well-suited for large datasets where quick predictions are required.
Effective for Text Classification: The model performs well in text classification tasks, where the goal is to classify a document (in this case, an SQL query) based on word frequencies or patterns.
Convolutional Neural Networks (CNN):
CNNs, a type of deep learning model, are powerful at detecting local patterns in data. For text classification, CNNs can identify important sequences of words or n-grams in SQL queries. This is especially useful for detecting the structure of SQLi attacks, which often involve specific patterns or keywords in SQL queries. CNNs can automatically learn these features and make predictions based on learned patterns in the data.

How it Works:
Convolutional Neural Networks (CNNs) are deep learning models originally designed for image classification but have proven effective in text classification tasks, such as detecting SQL Injection attacks.
Convolutional Layers: The CNN applies filters (also called kernels) to input sequences of words or characters. These filters slide over the text data to detect local patterns, such as specific phrases or n-grams indicative of SQLi attacks (e.g., "OR 1=1" or "DROP TABLE"). These patterns may represent SQL commands commonly used in injections.
Pooling Layer: After convolution, a pooling layer reduces the size of the data by taking the maximum (or average) value from a set of features, which helps in capturing the most important patterns while reducing computational complexity.
Fully Connected Layers: Once the features have been extracted by the convolutional and pooling layers, the data is passed through fully connected layers that perform the final classification, determining whether the SQL query is benign or malicious.
Training: CNNs are trained using backpropagation, where the weights of the filters are adjusted based on the errors made in predictions. This allows the model to learn which sequences of words are most indicative of SQLi attacks.
Strengths of CNN:
Automatic Feature Extraction: CNNs automatically learn relevant features from raw text data, eliminating the need for manual feature engineering. They are particularly good at detecting specific patterns in sequences of words.
Pattern Detection: CNNs excel at recognizing complex, local patterns in data. This is important for SQLi detection, where malicious queries often contain specific sequences of keywords.
Scalability: CNNs can handle large amounts of data and can improve their performance as more labeled data is provided, making them suitable for large-scale applications.

Fig-2 Simple CNN model

Each model has unique strengths:
Random Forest: An ensemble method that builds multiple decision trees to capture complex relationships in the data. It is robust and handles high-dimensional data well. It also provides insights into feature importance, which is valuable for understanding what makes a query malicious.
Naive Bayes: A probabilistic model that is fast and effective for text classification tasks. It works by calculating the probability of a query being benign or malicious based on the frequencies of words in the query. It is particularly efficient for large datasets.
Convolutional Neural Networks (CNN): A deep learning model that automatically learns patterns from raw data. CNNs are effective at detecting local, sequential patterns in SQL queries, making them powerful for detecting sophisticated SQLi attacks.
Hyperparameter Tuning and Optimization
After initial training, hyperparameter tuning will be performed using methods such as Grid Search or Random Search to optimize the performance of the models. The goal is to find the best combination of hyperparameters (such as the number of trees in Random Forest or the smoothing parameter in Naive Bayes) that maximizes model accuracy.
Cross-validation will be used to evaluate the models and ensure that they generalize well to unseen data, minimizing overfitting.
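As a rough illustration of how the preprocessing, feature-engineering, and Naive Bayes steps described in this section fit together, the following sketch uses only the Python standard library. It is not the authors' implementation: the six labeled queries in `TRAIN`, the function names (`tokenize`, `ngrams`, `fit_idf`, `tfidf`, `train_nb`, `predict`), and the smoothing value `alpha` are illustrative assumptions. Symbols such as quotes and dashes are deliberately kept as tokens here, which is a choice of this sketch rather than something the text specifies.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (query, label) with 1 = malicious, 0 = benign.
TRAIN = [
    ("SELECT name FROM users WHERE id = 7", 0),
    ("SELECT * FROM orders WHERE status = 'open'", 0),
    ("UPDATE users SET email = 'a@b.c' WHERE id = 3", 0),
    ("a' OR 1 = 1; --", 1),
    ("x' OR '1'='1' --", 1),
    ("1; DROP TABLE users; --", 1),
]

def tokenize(query):
    """Normalize to lowercase and split into word tokens and single symbols."""
    return re.findall(r"[a-z0-9_]+|[^\sa-z0-9_]", query.lower())

def ngrams(tokens, n_max=2):
    """Return unigrams up to n_max-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def fit_idf(docs):
    """Smoothed inverse document frequency for every feature seen in training."""
    df = Counter(f for doc in docs for f in set(doc))
    return {f: math.log((1 + len(docs)) / (1 + c)) + 1.0 for f, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights for one document; features unseen in training are dropped."""
    tf = Counter(f for f in doc if f in idf)
    return {f: count * idf[f] for f, count in tf.items()}

def train_nb(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing alpha."""
    weight = defaultdict(Counter)  # class -> feature -> summed weight
    for vec, y in zip(vectors, labels):
        weight[y].update(vec)
    vocab = set().union(*vectors)
    log_prior, log_like = {}, {}
    for y, w in weight.items():
        log_prior[y] = math.log(labels.count(y) / len(labels))
        total = sum(w.values()) + alpha * len(vocab)
        log_like[y] = {f: math.log((w[f] + alpha) / total) for f in vocab}
    return log_prior, log_like

def predict(query, idf, model):
    """Return the class with the highest posterior log-score."""
    log_prior, log_like = model
    vec = tfidf(ngrams(tokenize(query)), idf)
    scores = {y: log_prior[y] + sum(wt * log_like[y][f] for f, wt in vec.items())
              for y in log_prior}
    return max(scores, key=scores.get)

docs = [ngrams(tokenize(q)) for q, _ in TRAIN]
labels = [y for _, y in TRAIN]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels)

for q in ["admin' OR 1 = 1 --", "SELECT email FROM users WHERE id = 5"]:
    print(q, "->", predict(q, idf, model))
```

In this sketch `alpha` and `n_max` play the role of tunable hyperparameters; a grid search would simply repeat the training and evaluation steps for each candidate value on held-out folds.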

IV. SIMULATION RESULT

A. Results and Discussion
The performance of three models, Random Forest, Naive Bayes, and Convolutional Neural Networks (CNN), was evaluated on the task of SQL Injection (SQLi) detection using several key metrics: Precision, Recall, F1-Score, and Accuracy. The results of the evaluation are summarized in Table 1 below:

Table-1 Model Performance Comparison Table
Metric      Random Forest   Naive Bayes   CNN
F1-Score    0.845124        0.956710      0.956710
Precision   1.000000        1.000000      1.000000
Recall      0.731788        0.917012      0.917012
Accuracy    0.903571        0.976190      0.976190

Fig-3 Model Performance Comparison: Random Forest, Naive Bayes, and CNN

B. Metric Descriptions and Analysis
F1-Score:
The F1-Score balances the trade-off between precision and recall, offering a single measure of performance.
Random Forest achieved an F1-Score of 0.845124, which is lower than that of Naive Bayes and CNN (both scoring 0.956710). This indicates that Random Forest struggles to maintain a balance between precision and recall.
Precision:
Precision measures the proportion of true positives among all predicted positives.
All three models achieved perfect precision (1.000000), showing that they were able to avoid false positives entirely. This is crucial in minimizing the misclassification of benign queries as malicious.
Recall:
Recall measures the proportion of true positives identified among all actual positives.
Random Forest had the lowest recall at 0.731788, indicating it missed a significant number of malicious queries. In contrast, both Naive Bayes and CNN achieved a recall of 0.917012, demonstrating their ability to capture a larger proportion of malicious queries.
Accuracy:
Accuracy reflects the overall correctness of the model in classifying both benign and malicious queries.
Naive Bayes and CNN achieved the highest accuracy of 97.6%, significantly outperforming Random Forest, which achieved 90.4%. This highlights the effectiveness of Naive Bayes and CNN in providing reliable predictions.
C. Result
From the results, it is evident that Naive Bayes and CNN outperform Random Forest in all metrics except precision, where all models performed equally well. While Random Forest demonstrated acceptable performance with an accuracy of 90.4%, it was less effective in identifying all malicious queries, as shown by its lower recall and F1-Score.
Naive Bayes and CNN emerged as the best-performing models with:
• A high F1-Score of 0.956710, indicating a strong balance between precision and recall.
• A high recall of 0.917012, showing their ability to detect malicious queries effectively.
• The highest accuracy of 97.6%, highlighting their overall reliability in classifying queries.
Between Naive Bayes and CNN, the choice depends on the application's requirements. CNN is better suited for tasks requiring deep learning's ability to capture complex patterns, while Naive Bayes offers simplicity, speed, and comparable performance, making it ideal for resource-constrained environments.

Fig-4 Training and Validation Accuracy

Fig-5 Training and Validation Loss

Training and Validation Accuracy
This plot depicts the accuracy of the CNN model on the training and validation datasets over 10 epochs of training.
Training Accuracy (Blue Line):
The training accuracy improves consistently over the 10 epochs and reaches a high value of approximately 0.98 by the final epoch.
Validation Accuracy (Orange Line):
The validation accuracy follows a similar trend, reaching a comparable high value (~0.97), but starts to plateau or fluctuate slightly after 5-6 epochs.
Training and Validation Loss
This plot tracks the loss (error) for the training and validation datasets over 10 epochs of training. Loss measures how well (or poorly) the model's predictions align with the actual target values, with lower values indicating better performance.

Fig-6 Confusion Matrix Explanation

D. Confusion Matrix Explanation
The confusion matrix is a powerful tool to evaluate the performance of a classification model. It provides detailed insights into how well the model classifies both the positive and negative classes. Below is a breakdown of the confusion matrix for the given results:
Confusion Matrix Overview:
                   Predicted benign   Predicted malicious
Actual benign      599 (TN)           20 (FP)
Actual malicious   0 (FN)             221 (TP)
Detailed Analysis of the Confusion Matrix Components:
True Negatives (Top Left - 599):
• These are cases where the model correctly identified negative instances.
• Out of all negative instances in the dataset, 599 were correctly predicted as negative.
• This high count of true negatives indicates the model's ability to effectively distinguish benign SQL queries.
False Positives (Top Right - 20):
• These are cases where the model incorrectly classified a negative instance as positive.
• 20 benign queries were misclassified as malicious.
• While this number is low relative to the total dataset, minimizing false positives is critical in real-world applications to avoid unnecessary flags or interruptions in benign operations.

False Negatives (Bottom Left - 0):
• These are cases where the model failed to identify a positive instance.
• 0 malicious queries were misclassified as benign.
• The absence of false negatives is a significant strength of the model, as it ensures that no malicious queries are missed, which is crucial in security systems.
True Positives (Bottom Right - 221):
• These are cases where the model correctly identified positive instances.
• 221 malicious queries were accurately classified as malicious.
• A high count of true positives demonstrates the model's capability to accurately detect attacks.
E. Performance Interpretation:
• Exceptional Detection Rate: The model's ability to achieve no false negatives (FN = 0) ensures that every malicious SQL query is detected. This is critical for security systems where missing even a single malicious query could lead to potential breaches or compromises.
• Low False Positives: The model produced only 20 false positives. While this is a small fraction of the benign queries, efforts to further reduce false positives can enhance system efficiency and prevent benign queries from being flagged unnecessarily.
• High True Positives and True Negatives: The model correctly classified 599 negative queries and 221 positive queries, showcasing its reliability in both identifying attacks and recognizing safe queries.
F. Explanation of Frontend and Backend Implementation for SQL Query Validation Form
Frontend Design
The frontend of the SQL query validation system is built using HTML, CSS, and Bootstrap. These technologies are used to create a user-friendly and responsive interface where users can input SQL queries for validation.
1. HTML (Hypertext Markup Language):
o Used to define the structure of the form.
o Includes fields for user input and buttons to submit the query.
2. CSS (Cascading Style Sheets):
o Used to style the form, making it visually appealing and enhancing user experience.
3. Bootstrap:
o A responsive design framework that ensures the form looks good across devices.
o Provides prebuilt components (e.g., buttons, alerts) and responsive grid layouts.
G. Implementation Overview
Frontend Implementation:
• Create an HTML form with a text field for users to input their SQL query and a submit button.
• Use Bootstrap classes to style the form (e.g., form-control for input fields and btn-primary for buttons).
• Include alerts to show validation results (e.g., a Bootstrap alert to warn about SQL injection).
Flask Backend:
• Define routes to handle the form submission (/validate endpoint).
• Process the input query by:
o Checking for common SQL injection patterns (or, --, ;, etc.).
o Displaying appropriate warnings if malicious input is detected.
• Render the results on the same page or redirect to a results page.

Fig-7 SQL Injection Detected

Steps in the Application Workflow
User Input:
The user inputs a query, such as a' or 1 = 1; --,1.
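The pattern check outlined under Flask Backend can be sketched as a plain function, independent of any web framework. This is a minimal illustration and not the authors' code: the names `SUSPICIOUS_PATTERNS`, `find_suspicious_patterns`, and `validate_query`, the regular expressions, and the message strings are all assumptions. In the described application, such a function would presumably be called from the handler behind the /validate endpoint.

```python
import re

# Hypothetical pattern table covering the three markers named in the text.
SUSPICIOUS_PATTERNS = {
    "or": re.compile(r"\bor\b", re.IGNORECASE),  # standalone OR keyword
    "--": re.compile(r"--"),                     # SQL line comment
    ";": re.compile(r";"),                       # statement separator
}

def find_suspicious_patterns(query):
    """Return the names of all suspicious patterns that occur in the query."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(query)]

def validate_query(query):
    """Build the message a results page could display for this query."""
    hits = find_suspicious_patterns(query)
    if hits:
        return "Warning: possible SQL injection (found: " + ", ".join(hits) + ")"
    return "No SQL injection detected"

print(validate_query("a' or 1 = 1; --"))
print(validate_query("SELECT name FROM users WHERE id = 7"))
```

Keyword matching of this kind flags any legitimate query that contains OR or a semicolon and can miss obfuscated payloads, so it is best treated as a coarse first filter rather than a complete defense.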

Validation Logic:
• The backend Flask app receives the query.
• It parses and checks for suspicious patterns:
   • or operator: Often used to bypass conditions.
   • ; character: Allows execution of multiple SQL commands.
   • -- sequence: Comments out the remainder of the SQL statement.
Output:
• If a potential SQL injection is detected, a warning message is displayed on the page.
• The warning highlights the risky components in the query (e.g., or, --, ;).
• Safe input results in a confirmation message indicating no SQL injection was detected.
Recommendations:
• The page displays best practices for query safety, including sanitizing inputs and using parameterized queries.

Fig 8: SQL Injections Detected Safe Query

The image indicates a potential security vulnerability in a web application that validates SQL queries. It shows a form where a user can input an SQL query, and despite the entered query "x' or full_name like '%bob%,1" being a known SQL injection attempt, the system incorrectly marks it as "safe." This highlights the importance of robust security measures and accurate validation processes to prevent SQL injection attacks.

V. CONCLUSION

The use of machine learning techniques for detecting SQL injection attacks demonstrates a promising solution to one of the most prevalent web application security threats. By leveraging algorithms such as decision trees, support vector machines, or neural networks, the system can efficiently identify and mitigate malicious inputs in real time. The research emphasizes the importance of feature selection, dataset quality, and algorithm optimization to achieve high accuracy and low false-positive rates. Furthermore, the study underscores the adaptability of machine learning models to evolving attack patterns, making them a robust choice for dynamic security environments. The confusion matrix highlights the model's overall excellent performance, with zero false negatives, ensuring maximum security by detecting all malicious queries, and a low false positive count, indicating efficient classification of benign queries. These results make the model highly suitable for deployment in SQL injection detection tasks, where both accuracy and reliability are paramount.

VI. REFERENCES

[1]. Laila Aburashed, Marah AL Amoush, Wardeh Alrefai. "SQL Injection Attack Detection Using Machine Learning Algorithms." ISSN: 3030-5241, 15 June 2024.
[2]. Hakan Can Altunay. "Detection of SQL Injection Attacks Using Machine Learning Algorithms Based on NLP-Based Feature Extraction." 11 December 2023.
[3]. Maha Alghawazi, Daniyal Alghazzawi, and Suaad Alarifi. "Detection of SQL Injection Attack Using Machine Learning Techniques." Volume 2, Issue 4, 20 September 2022.
[4]. Ravi Raj Choudhary, Susheela Verma, Gaurav Meena. "Detection of SQL Injection attack Using Machine Learning." 17-19 December 2021.
[5]. Binh An Pham, Vinitha Hannah Subburaj. "An Experimental setup for Detecting SQLi Attacks using Machine Learning Algorithms." Volume 8, No. 1, 2020.


[6]. Tareek Pattewar, Hitesh Patil, Harshada Patil, Neha Patil, Muskan Taneja, Tushar Wadile. "Detection of SQL Injection using Machine Learning." Volume: 06, Issue: 11, ISSN: 2395-0072, Nov 2019.
[7]. S. Steiner, D. Conte de Leon, and J. Alves-Foss. (2017). A Structured Analysis of SQL Injection Runtime Mitigation Techniques. Proc. 50th Hawaii Int. Conf. Syst. Sci., 2887-2895. Doi: 10.24251/hicss.2017.349.
[9]. W. G. J. Halfond, J. Viegas, and A. Orso. (2008). A Classification of SQL Injection Attacks and Countermeasures. Prev. Sql Code Inject. By Comb. Static Runtime Anal., 53.
[10]. P. Kumar and R. K. Pateriya. (2012). A Survey on SQL Injection Attacks, Detection and Prevention Techniques. 2012 3rd Int. Conf. Comput. Commun. Netw. Technol. ICCCNT 2012. Doi: 10.1109/ICCCNT.2012.6396096.
[11]. G. Wassermann and Z. Su. (2004). An Analysis Framework for Security in Web Applications. SAVCBS 2004 Specif. Verif. Component-Based Syst., 70. [Online]. Available: http://web.cs.ucdavis.edu/~su/publications/savcbs.pdf%0Ahttp://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.2255&rep=rep1&type=pdf#page=82.
[12]. C. Gould, Z. Su, and P. Devanbu. (2004). JDBC Checker: A Static Analysis Tool for SQL/JDBC Applications. Proc. - Int. Conf. Softw. Eng., 26, 697-698. Doi: 10.1109/icse.2004.1317494.
[13]. Y. Kosuga, K. Kono, M. Hanaoka, M. Hishiyama, and Y. Takahama. (2007). Sania: Syntactic and Semantic Analysis for Automated Testing Against SQL Injection. Proc. - Annu. Comput. Secur. Appl. Conf. ACSAC, 107-116. Doi: 10.1109/ACSAC.2007.20.
[14]. X. Fu, X. Lu, B. Peltsverger, S. Chen, K. Qian, and L. Tao. (2007). A Static Analysis Framework for Detecting SQL Injection Vulnerabilities. Proc. - Int. Comput. Softw. Appl. Conf., 1(August), 87-94. Doi: 10.1109/COMPSAC.2007.43.
[15]. D. Appelt, C. D. Nguyen, L. C. Briand, and N. Alshahwan. (2014). Automated Testing for SQL Injection Vulnerabilities: An Input Mutation Approach. 2014 Int. Symp. Softw. Test. Anal. ISSTA 2014 - Proc., May, 259-269. Doi: 10.1145/2610384.2610403.
[16]. A. Ciampa, C. A. Visaggio, and M. Di Penta. (2010). A Heuristic-based Approach for Detecting SQL-injection Vulnerabilities in Web Applications. Proc. - Int. Conf. Softw. Eng., January, 43-49. Doi: 10.1145/1809100.1809107.
[17]. Y. Shin. (2004). Improving the Identification of Actual Input Manipulation Vulnerabilities, 1-4.
W. G. J. Halfond and A. Orso. (2005). AMNESIA: Analysis and Monitoring for Neutralizing SQL-injection Attacks. 20th IEEE/ACM Int. Conf. Autom. Softw. Eng. ASE 2005, 174-183. Doi: 10.1145/1101908.1101935.
[18]. R. Mui and P. Frankl. (2010). Preventing SQL Injection through Automatic Query Sanitization with ASSIST. Electron. Proc. Theor. Comput. Sci., 35, 27-38. Doi: 10.4204/eptcs.35.3.
[19]. R. Dharam and S. G. Shiva. (2012). Runtime Monitoring Technique to handle Tautology based SQL Injection Attacks. Int. J. Cyber-Security Digit. Forensics (IJCSDF), 1(3), 189-203.
[20]. W. Qing and C. He. (2016). The Research of an AOP-based Approach to the Detection and Defense of SQL Injection Attack, 731-737. Doi: 10.2991/aest-16.2016.98.
[21]. A. Ghafarian. (2018). A Hybrid Method for Detection and Prevention of SQL Injection Attacks. Proc. Comput. Conf. 2017, 833-838. Doi: 10.1109/SAI.2017.8252192.
