Phani Internship 2
ECS51801
INTERNSHIP
NAME: Konka Phanikumar
Korukonda Harish Kumar
CLASS: CSE-3C
CSE-E
B.TECH CSE
BONAFIDE CERTIFICATE
Certified that this internship report "Landmark Recognition Technology Internship" is the bonafide work of 22112125 and 22112280, who carried out the internship during the academic year 2022-2023.
MENTOR: Radhika
CLASS INCHARGE: Mrs. Vasugi Paulraj
HEAD OF DEPARTMENT: Dr. Thangakumar J
INTERNAL EXAMINER
Name: ________________________
Designation: ___________________

EXTERNAL EXAMINER
Name: ________________________
Designation: ___________________
Institution Name: _______________
ABOUT
1 STOP
ACCENTURE
OFFICE ADDRESS:
17E, 18TH CROSS ROAD, SECTOR 3, HSR LAYOUT, BENGALURU, KARNATAKA 560102
Internship Domain: AI / Machine Learning
Project Methodology:
Week 1:
• Project understanding
• Learn the basics
• Choose a framework
• Identify the requirements that need to be delivered for this project.
• Identify which tasks to focus on as an AI/machine learning intern.
Week 2:
• Learning dataset exploration.
• Understanding how to use algorithms and machine learning.
• Understanding the attributes and their relation to the model given by the client.
Week 3:
CONCLUSION:
Overall, my internship experience has provided me with a deep
understanding of Landmark Recognition Technology using Excel, and
I have developed valuable skills that I can apply to future projects. I
enjoyed working with the Accenture team and appreciated the
opportunities to learn and grow throughout the internship. Based on my
experience, I recommend that future interns focus on developing their
Excel skills and exploring different data analytics techniques to better
understand the data and its insights. I also recommend that Accenture
continues to invest.
PROOF OF WORK
CERTIFICATE OF INTERNSHIP
PROOF OF WORK
PROJECT REPORT
RESEARCH INTERNSHIP SCHEME - MAY 2023
(Online Mode)
Reg. No.: 22112125, 22112280
CONTENTS
CHAPTER 1
1.1 INTRODUCTION
CHAPTER 2
2.1 PROJECT REVIEW
2.2 LITERATURE REVIEW
2.3 PROBLEM STATEMENT
2.4 FUNCTIONAL REQUIREMENTS
2.4.1 NON-FUNCTIONAL REQUIREMENTS
2.5 SOURCE DATA
CHAPTER 3
3.1 SYSTEM ARCHITECTURE
3.2 INTERFACE PROTOTYPING
3.3 DATA FLOW DESIGN
CHAPTER 4
IMPLEMENTATION
4.1 DATABASE DESIGN
4.2 USER SCREEN AFTER PREDICTION
CHAPTER 5
CONCLUSION
5.1 EXPERIMENT AND RESULTS
5.2 CONCLUSION AND FUTURE WORK
ABSTRACT
Landmark Recognition is the technology that can predict landmark names directly from image pixels, to help people better understand and organize their photo collections, and to help law enforcement officials estimate the location of images submitted as evidence. Image classification techniques have shown remarkable improvements over the last few years. To further improve computer vision technologies and methodologies, researchers are now concentrating on highly specific types of classification. Instead of classifying cats, cars or buildings, researchers are trying to classify among different types of landmarks, both natural and man-made. At present, a tremendous roadblock in landmark recognition research is the lack of large, well-labelled datasets. To rectify this, Google has come up with the Google Landmark Recognition Dataset. The dataset contains 1.2 million images of 15,000 categories of landmarks. For the project, a subset of the Google Landmark Recognition dataset has been used. Various recent classification algorithms, such as AlexNet, ResNet, SE-ResNet, VGG-16 and Inception v3, have been implemented to classify the images. Among them, the SE-ResNet architecture achieves the lowest loss value of 0.0985 and an accuracy of 98% on the training set.
LIST OF ABBREVIATIONS
2. ML - Machine Learning
3. COCOMO - Constructive Cost Model
7. URL - Uniform Resource Locator
CHAPTER 1
INTRODUCTION
Some of the reasons for using Deep Learning for image recognition tasks are as follows. Previous image recognition techniques involved the use of feature extraction techniques such as SIFT and SURF. These techniques required programmers to manually extract features from images and store them in a database for comparison with test images. Deep Learning methods allow for automatic feature extraction using Convolutional Neural Networks. Since the dataset has a large number of images, it would require a lot of effort to manually extract features for so many images. Thus, neural networks and Deep Learning algorithms are the better option.
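To make the contrast concrete, the following is a minimal sketch, assuming a 128*128 RGB input and an arbitrary placeholder class count, of how a small convolutional network in Keras learns its features directly from pixels instead of relying on hand-crafted SIFT/SURF descriptors; it is illustrative only and is not the project's actual model.

from keras import layers, models

# Illustrative CNN: the convolutional layers learn the feature extractor,
# so no manual descriptor engineering is required.
def build_small_cnn(num_classes=10, input_shape=(128, 128, 3)):  # num_classes is a placeholder
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),             # learned feature vector
        layers.Dense(num_classes, activation='softmax')   # landmark probabilities
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model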
CHAPTER 2 PROJECT OVERVIEW
For the project, the task is to classify images based on the landmarks contained in them. For example, if the sample image is an image of the Taj Mahal, the output of the system should be Taj Mahal. For this task, Convolutional Neural Networks and their various architectures have been chosen. Previously, landmark recognition used to be done by using the GPS information present in the image itself; without an active internet connection, however, this information is not recorded by the camera.
The proposed system will be trained on the Google Landmark Dataset, which contains 1.2 million images of around 15,000 types of landmarks. This system does not need an internet connection to determine the location or the landmark present in the image. By running the trained model on the test image, the landmark present in the image is predicted. Also, this system will return results within 1-2 ms. To decide upon the architecture to be used for the model, various research papers were read, and experiments were run on a subset of the dataset with the architectures proposed in those papers.
A method for classifying images using Deep Learning has been proposed, which does not require an Internet connection. The system will be faster than existing systems and can additionally be trained to learn images which it does not currently recognize. The system will be tested on the dataset using various existing algorithms, and the architecture which gives the best results will be chosen.
Barret Zoph et al. [6]: NASNet was built using the Neural Architecture Search (NAS) framework. The objective of NAS is to use a data-driven, automated strategy to arrive at the network design rather than intuition and experimentation. The Inception paper demonstrated that a complex combination of filters within a cell can considerably improve results. The NAS framework treats the structure of such a cell as an optimization problem and then stacks multiple copies of the best cell to build a large network. Finally, two unique cells are produced and used to train the full model.
For the system, CNNs and their various architectures have been chosen. Five potential algorithms have been shortlisted: AlexNet, VGG-16, ResNet, SE-ResNet and Inception v3. The algorithm which provides the best results will be chosen as the base for the system.
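As an illustration of how such a comparison could be set up, the following sketch instantiates the three shortlisted architectures that ship with keras.applications (VGG-16, ResNet-50 and Inception v3); AlexNet and SE-ResNet are not bundled with Keras and would have to be defined by hand, as in the annexure code. The input shapes and the placeholder class count are assumptions, and this is not the project's actual experiment script.

from keras.applications import VGG16, ResNet50, InceptionV3

# Candidate off-the-shelf architectures, built with random weights for an
# assumed number of landmark classes (placeholder value).
NUM_CLASSES = 100  # placeholder; the real class count comes from the dataset

candidates = {
    'VGG-16': VGG16(weights=None, input_shape=(128, 128, 3), classes=NUM_CLASSES),
    'ResNet': ResNet50(weights=None, input_shape=(128, 128, 3), classes=NUM_CLASSES),
    'Inception v3': InceptionV3(weights=None, input_shape=(139, 139, 3), classes=NUM_CLASSES),
}

# Each candidate would be compiled and trained on the same subset, and the
# architecture with the best validation results would be kept.
for name, model in candidates.items():
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])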
2.3 REQUIREMENTS GATHERING
● Questionnaire:
○ The preliminary requirements will be found by asking the stakeholders to fill in a questionnaire. The questionnaire will provide vital information on what the stakeholders are seeking and the area of focus that needs to be tackled.
● Interview:
● Brainstorming:
2.4 REQUIREMENTS ANALYSIS
2.4.1 FUNCTIONAL REQUIREMENTS
● The system should accept any image and predict the landmark in it.
● The system should not take more than 0.5 seconds to predict the landmark.
● The system should have good accuracy (~90%).
2.4.2 NON-FUNCTIONAL REQUIREMENTS
● Reliability: The system should not crash when many people are using it at once (downtime less than 5 seconds).
● Security: A user should have access to their images only.
2.5 SOURCE DATA
The dataset contains 1.2 million images of 30,000 categories. The following image depicts the geographical distribution of the landmarks:
URL for dataset-
https://www.kaggle.com/c/landmarkrecognitionchallenge/data
2.6 COCOMO Model
Here,
Productivity:
P = Size / Effort = KLOC / man-months = 2 / 8.27 = 0.242
Staffing:
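As a hedged reconstruction of where these figures could come from, the Basic COCOMO embedded-mode constants (a = 3.6, b = 1.20, c = 2.5, d = 0.32) reproduce the 8.27 man-months quoted above for a 2 KLOC product; the choice of mode is an inference, not something the report states explicitly.

# Basic COCOMO sketch. The embedded-mode constants reproduce the report's
# 8.27 man-months for a 2 KLOC product; the mode is an assumption.
A, B = 3.6, 1.20      # effort constants (embedded mode, inferred)
C, D = 2.5, 0.32      # schedule constants (embedded mode, inferred)
kloc = 2.0            # product size from the report

effort = A * (kloc ** B)        # ~8.27 man-months
tdev = C * (effort ** D)        # development time in months
productivity = kloc / effort    # ~0.242 KLOC per man-month, the report's value
staffing = effort / tdev        # average staffing level

print(f"Effort:       {effort:.2f} man-months")
print(f"Dev time:     {tdev:.2f} months")
print(f"Productivity: {productivity:.3f} KLOC/man-month")
print(f"Staffing:     {staffing:.2f} persons")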
Prediction Result
After the model has been trained on the training set, it will be saved in a Hierarchical Data Format (HDF, .h5) file. For predicting the landmark in a given image, the saved model will be loaded and the test image will be fed to the model, which will output the landmark and the accuracy of prediction.
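A minimal sketch of that prediction step is shown below, assuming a model saved as model.h5, 128*128 RGB inputs scaled to [0, 1], and a hypothetical class_names list that maps output indices to landmark names; none of these names are taken from the project's code.

import cv2
import numpy as np
from keras.models import load_model

# Load the trained model saved earlier in HDF5 (.h5) format.
model = load_model('model.h5')             # assumed file name

# Read and preprocess the test image the same way as during training.
img = cv2.imread('test_image.jpg')         # assumed test image path
img = cv2.resize(img, (128, 128))
img = np.expand_dims(img / 255.0, axis=0)  # batch of one, pixels scaled to [0, 1]

# The highest-probability class is the predicted landmark; its probability
# is reported as the confidence of the prediction.
probs = model.predict(img)[0]
class_names = ['Taj Mahal', '...']         # hypothetical index-to-landmark mapping
print(class_names[int(np.argmax(probs))], float(np.max(probs)))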
CHAPTER 3 ARCHITECTURE AND DESIGN
3.1 SYSTEM ARCHITECTURE
The flow of the system starts from the CSV file. Using a download script, all the images are downloaded from the URLs provided in the file. Next, the images are sorted into the various folders with respect to the landmark ID. Using another Python script, the images are resized into 128*128 pixels. Next, the images are split into a training and test dataset. The training images are fed into the model and the accuracy and loss values are monitored. The trained model is saved into a .h5 file for future testing. In the GUI, the user selects the image to be classified and clicks on the predict button. Using the saved model, the system predicts the landmark of the selected image.
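The resizing step described above could look roughly like the following sketch; the raw_images and resized directory names are placeholders, and the project's actual resize.py may differ.

import os
import cv2

SRC_DIR = 'raw_images'   # placeholder: downloaded images sorted by landmark ID
DST_DIR = 'resized'      # placeholder: output folder for 128*128 images

for landmark_id in os.listdir(SRC_DIR):
    os.makedirs(os.path.join(DST_DIR, landmark_id), exist_ok=True)
    for fname in os.listdir(os.path.join(SRC_DIR, landmark_id)):
        img = cv2.imread(os.path.join(SRC_DIR, landmark_id, fname))
        if img is None:
            continue                       # skip unreadable or corrupt downloads
        img = cv2.resize(img, (128, 128))  # resize to the 128*128 input size
        cv2.imwrite(os.path.join(DST_DIR, landmark_id, fname), img)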
3.1 Architecture Diagram for Landmark Recognition
System Modules:
● Image Downloader
It is used to download the dataset used for training and testing the model. The links for the various images are provided in a CSV file. Using a Python script, we automatically download the images.
● Data preprocessor
This module is used to preprocess the images for optimum training and testing. It involves extracting, resizing, and compressing the images. We use OpenCV to resize the images to a size of 128*128 pixels. Once resized, we use the Keras flow_from_directory API to load images in batches of 32 to the model (a sketch of this is given after this list).
● Data splitter
This is used to split the dataset into training and test sets. The ratio used is 80:20.
● Machine Learning model- Residual Block
● Prediction Result
After the model has been trained on the training set, it will be saved
in a Hierarchical Data Format (HDF, .h5) file. For predicting the landmark in a given image,
the saved model will be loaded and the test image will be fed to the model, which
will output the landmark and the accuracy of prediction.
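Tying the preprocessor, splitter and prediction modules together, the sketch below shows how the resized class-per-folder layout could be streamed to a model with flow_from_directory (batch size 32, 128*128 inputs) and the trained model saved to a .h5 file. The directory names, the epoch count and the build_model() helper are illustrative assumptions rather than the project's exact code.

from keras.preprocessing.image import ImageDataGenerator

# Generators stream the resized images from the class-per-folder layout
# in batches of 32, rescaling pixel values to [0, 1].
datagen = ImageDataGenerator(rescale=1.0 / 255)

train_gen = datagen.flow_from_directory(
    'dataset/train',                # placeholder: the 80% split produced by split.py
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical')

test_gen = datagen.flow_from_directory(
    'dataset/test',                 # placeholder: the remaining 20% split
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical')

# build_model() is a hypothetical helper returning a compiled Keras model
# (for example the SE-ResNet defined in the annexure).
model = build_model(num_classes=train_gen.num_classes)
model.fit(train_gen, epochs=10, validation_data=test_gen)  # epoch count is illustrative;
                                                           # older Keras uses fit_generator
model.save('model.h5')              # saved in HDF5 format for later prediction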
3.2 INTERFACE PROTOTYPING
1. The main component of the user interface will be the image uploader
where the user will be able to upload any image of their choice
2. After this, the system will run the prediction model on the given image
to predict which landmark is present in the image
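One way such an uploader screen could be prototyped is sketched below with Tkinter; the report does not state which GUI toolkit was used, and predict_landmark() is a hypothetical stand-in for the saved model's prediction routine.

import tkinter as tk
from tkinter import filedialog

# Hypothetical helper wrapping the saved model's prediction (see the earlier sketch).
def predict_landmark(image_path):
    return 'Taj Mahal', 0.98   # placeholder result for illustration

root = tk.Tk()
root.title('Landmark Recognition')

result_label = tk.Label(root, text='Upload an image to identify the landmark')
result_label.pack(pady=10)

def on_upload():
    # 1. The user uploads any image of their choice.
    path = filedialog.askopenfilename(filetypes=[('Images', '*.jpg *.png')])
    if not path:
        return
    # 2. The prediction model is run on the selected image.
    landmark, confidence = predict_landmark(path)
    result_label.config(text=f'{landmark} ({confidence:.0%})')

tk.Button(root, text='Upload and Predict', command=on_upload).pack(pady=10)
root.mainloop()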
3.3 DATA FLOW DESIGN
LEVEL 0
The Level 0 diagram depicts the overview of the system. The image dataset
goes to the machine learning model which ultimately gives the result.
LEVEL 1
CHAPTER 4
4.1.1 ER DIAGRAM
ER diagram attributes: Image URL, Landmark Category, Train/Test, Predicted Category.
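As an illustration of how these attributes could be stored, the following sketch creates a simple SQLite table covering them; the report does not show an actual schema, so the table name, column types and database file are assumptions.

import sqlite3

# Illustrative schema mirroring the ER attributes: the image URL, its true
# landmark category, whether it belongs to the train or test split, and the
# category predicted by the model.
conn = sqlite3.connect('landmarks.db')     # assumed database file name
conn.execute("""
    CREATE TABLE IF NOT EXISTS images (
        image_url          TEXT PRIMARY KEY,
        landmark_category  TEXT,
        split              TEXT CHECK (split IN ('train', 'test')),
        predicted_category TEXT
    )
""")
conn.commit()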
4.2 User Screen Before Prediction
1. The main component of the user interface will be the image uploader, where the user will be able to upload any image of their choice.
2. After this, the system will run the prediction model on the given image to predict which landmark is present in the image.
CHAPTER 5
5.1 EXPERIMENT AND WORK
The flow of the system starts from the CSV file. Using a download script, all the images are downloaded from the URLs provided in the file. Next, the images are sorted into the various folders with respect to the landmark ID. Using another Python script, the images are resized into 128*128 pixels. Next, the images are split into a training and test dataset. The training images are fed into the model and the accuracy and loss values are monitored. The trained model is saved into a .h5 file for future testing. In the GUI, the user selects the image to be classified and clicks on the predict button. Using the saved model, the system predicts the landmark of the selected image.
The system is made up of these four files: resize.py, download.py, split.py and app.py. download.py is used to download all the images in the dataset, after which resize.py resizes the images to 128*128 pixels. The dataset is split into a training and test set by split.py, and the app.py file is used to build and run the model.
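A hedged sketch of what split.py might do is given below; the 80:20 ratio comes from the report, while the resized and dataset directory names and the use of random shuffling are assumptions about the implementation.

import os
import random
import shutil

SRC_DIR = 'resized'    # placeholder: preprocessed images, one folder per landmark ID
DST_DIR = 'dataset'    # placeholder: output with train/ and test/ subfolders
SPLIT = 0.8            # 80:20 train/test ratio from the report

for landmark_id in os.listdir(SRC_DIR):
    files = os.listdir(os.path.join(SRC_DIR, landmark_id))
    random.shuffle(files)
    cut = int(len(files) * SPLIT)
    for subset, names in (('train', files[:cut]), ('test', files[cut:])):
        out_dir = os.path.join(DST_DIR, subset, landmark_id)
        os.makedirs(out_dir, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(SRC_DIR, landmark_id, name), out_dir)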
The user enters the image into the system. The image downloader downloads the image dataset. The data processor resizes the images and extracts the features. Then the dataset is split into the train and test sets in an 80:20 ratio. The machine learning module is trained. The user's image is then classified and the result is displayed.
ANNEXURE - I
from keras import layers
from keras import models
from keras import backend as K
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.utils.np_utils import to_categorical
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt


def squeeze_excite_block(input, ratio=16):
    # Squeeze-and-Excitation block: learns per-channel weights that
    # recalibrate the input feature maps.
    init = input
    channel_axis = -1
    filters = K.int_shape(init)[channel_axis]
    se_shape = (1, 1, filters)

    # Squeeze: global average pooling collapses each channel to a single value.
    se = layers.GlobalAveragePooling2D()(init)
    se = layers.Reshape(se_shape)(se)
    # Excitation: a two-layer bottleneck produces the channel weights.
    se = layers.Dense(filters // ratio, activation='relu',
                      kernel_initializer='he_normal', use_bias=False)(se)
    se = layers.Dense(filters, activation='sigmoid',
                      kernel_initializer='he_normal', use_bias=False)(se)

    # Scale: reweight the original feature maps channel-wise.
    x = layers.multiply([init, se])
    return x


def residual_network(x):
    """
    Builds the residual (SE-ResNet style) network on top of the input tensor x.
    """

    def add_common_layers(y):
        # Batch normalisation followed by a LeakyReLU activation.
        y = layers.BatchNormalization()(y)
        y = layers.LeakyReLU()(y)
        return y

    # Width of each group in the grouped (cardinality-wise) convolution.
    _d = nb_channels // cardinality