GOOGLE AI-ML VIRTUAL INTERNSHIP

Internship-3 report submitted in partial fulfilment of
requirements for the award of degree of
Bachelor of Technology
In
Computer Science and Engineering
By

DASARI SIREESHA (20131A0555)

UNDER THE ESTEEMED GUIDANCE OF

Course Coordinator: Dr. CH. SITA KUMARI (Associate Professor)
Course Mentor: Ms. B. PRANALINI (Assistant Professor)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING(A)
(Affiliated to JNTUK-Kakinada)
VISAKHAPATNAM
2023 – 2024

Gayatri Vidya Parishad College of Engineering (Autonomous)
Visakhapatnam

CERTIFICATE
This report on
“GOOGLE AI-ML VIRTUAL INTERNSHIP”
is a bonafide record of the Internship work submitted by

DASARI SIREESHA (20131A0555)

In their VII semester in partial fulfilment of the requirements for the Award of Degree of
Bachelor of Technology in Computer Science and Engineering

During the academic year 2023-2024.

Course Coordinator
Dr. CH. SITA KUMARI
(Associate Professor)

I/C Head of Department
Dr. D. UMA DEVI
Associate Professor and Associate Head of CSE with AI & ML

Internship Mentor
Ms. B. PRANALINI
(Assistant Professor)

CERTIFICATE:
20131A0555

ACKNOWLEDGMENT

We would like to express our deep sense of gratitude to our esteemed institute Gayatri
Vidya Parishad College of Engineering (Autonomous), which has provided us an
opportunity to fulfil our cherished desire.

We thank our Internship Mentor, Ms. B. Pranalini, Assistant Professor,
Department of Computer Science and Engineering, for the kind suggestions and
guidance towards the successful completion of our internship.

We thank our Course Coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and
guidance towards the successful completion of our internship.

We are very thankful and highly indebted to the Associate Head of our CSE
Department, Dr. D. UMA DEVI, Associate Professor and I/C Head of the Department
of Computer Science and Engineering, Gayatri Vidya Parishad College of Engineering
(Autonomous), for giving us the opportunity to do the internship in college.

We express our sincere thanks to our Principal, Dr. A.B. KOTESWARA RAO,
Gayatri Vidya Parishad College of Engineering (Autonomous), for his
encouragement during this project and for giving us a chance to explore and learn
new technologies through this internship.

We are very thankful to AICTE, Edu-skills and Google for giving us an internship
and for helping to resolve every issue regarding the internship.

Finally, we are indebted to the teaching and non-teaching staff of the Computer
Science and Engineering Department for all their support in completion of our project.

D.Sireesha(20131A0555)

ABSTRACT

This internship report presents an in-depth exploration of my journey with Google AI ML,
where I delved into the practical application of machine learning through TensorFlow.
TensorFlow, a widely used open-source library, served as the foundation for building and
deploying various machine learning models. The program encompassed a range of modules,
from programming neural networks to object detection and image-based tasks like search and
classification. Throughout these modules, I gained hands-on experience in designing, training,
and evaluating machine learning models for real-world scenarios. This included identifying
objects within images and retrieving product images based on user queries.

The report delves into the core concepts explored in each module, highlighting the
technical skills I acquired during the internship. Understanding these concepts and mastering
the tools provided a strong foundation for building and manipulating machine learning models.
The report also explores any projects undertaken and the challenges encountered along the
way. Finally, the report concludes by summarizing the invaluable experience gained during
this internship. It emphasizes the key takeaways, particularly the acquired technical skills, and
discusses how this experience can be leveraged for future endeavors in the field of Artificial
Intelligence and Machine Learning.

The knowledge and skills acquired during this internship at Google AI ML hold significant
promise for the future. The ability to design and implement machine learning models using
TensorFlow opens doors to various applications across diverse industries. This experience
equips me to contribute to advancements in areas like computer vision, image retrieval
systems, and potentially even develop innovative solutions for problems yet to be encountered.
The internship not only fostered technical expertise but also nurtured a deeper understanding
of the potential and challenges within the ever-evolving landscape of Artificial Intelligence.

CONTENTS

Introduction

Chapter 1: Program neural networks with TensorFlow


1.1: The Hello World of Machine Learning
1.2: Introduction to Computer Vision
1.3: Introduction to Convolutions and CNN

Chapter 2: Get started with object detection


2.1: Introduction to Object Detection
2.2: Integrate an object detector using ML Kit object detection API

Chapter 3: Go further with object detection

Chapter 4: Get started with product image search

Chapter 5: Go further with product image search

Chapter 6: Go further with image classification

Case Study

Conclusion

References

INTRODUCTION

Artificial Intelligence (AI) is a rapidly advancing technology that has
reshaped many industries. Machine Learning (ML), a subset of AI, and Deep
Learning (DL), in turn a subset of ML, are driving innovation across industries
including healthcare, finance, transportation, and entertainment.

Artificial Intelligence:
Artificial Intelligence (AI) is a branch of computer science that aims to create
systems capable of performing tasks that typically require human intelligence. These
tasks include reasoning, learning, problem-solving, perception, and language
understanding.

Machine Learning:
Machine Learning (ML) is a subset of AI that focuses on developing
algorithms and statistical models that enable computers to learn from and make
predictions or decisions based on data. ML algorithms can be categorized into three
main types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning.

Deep Learning:
Deep Learning (DL) is a subfield of ML that uses neural networks with
multiple layers (hence the term "deep") to learn representations of data. DL
algorithms have shown remarkable success in various tasks such as image and
speech recognition, natural language processing, and autonomous driving.

TensorFlow:
TensorFlow is an open-source machine-learning framework developed by
Google Brain for building and training machine-learning models. It is widely used
in AI research and production applications due to its scalability, flexibility, and
extensive ecosystem of tools and libraries.

Key features of TensorFlow include:


1. Graph-based computation
2. Automatic differentiation
3. High-level APIs
4. Scalability
5. Extensive Library ecosystem

Comparison between Machine Learning and Deep Learning:

No. | Machine Learning | Deep Learning
1 | ML can work with a smaller amount of data provided by users | DL requires a large amount of training data to reach correct conclusions
2 | In ML, features are identified by the user | DL creates new features by itself
3 | ML divides a larger problem into sub-problems and combines the results | DL solves problems on an end-to-end basis
4 | ML needs less time to train | DL needs much more time to train
5 | ML requires feature engineering | DL does not require manual feature engineering
6 | ML can give good results on both large and small data | DL gives its best results on large data

CHAPTER-1: PROGRAM NEURAL NETWORKS WITH TENSORFLOW

"Program Neural Networks with TensorFlow" serves as a pivotal entry
point into the world of machine learning and deep learning. This foundational
chapter introduces TensorFlow, Google's powerful open-source library, and its
application in building neural networks. Beginning with an overview of
TensorFlow's core concepts like tensors and computational graphs, learners
gradually delve into constructing neural networks. They gain practical insights into
defining network architecture, selecting activation functions, and configuring
parameters crucial for model design. As they progress, the chapter seamlessly
transitions into the training phase, where learners grasp the mechanics of data
feeding, backpropagation for gradient computation, and iterative optimization
algorithms such as stochastic gradient descent (SGD). Through hands-on exercises,
they cultivate a deep understanding of training neural networks and fine-tuning
model parameters to achieve optimal performance.

Following training, the chapter equips learners with essential evaluation
techniques, emphasizing metrics like accuracy, precision, and recall for assessing
model performance. Learners also explore strategies for testing models on unseen
data to ensure robustness. Furthermore, they delve into debugging neural networks
and optimizing performance through techniques like regularization and dropout. By
mastering these skills, learners not only develop a profound understanding of
TensorFlow's capabilities but also gain the confidence to tackle real-world
challenges in machine learning and deep learning domains. This chapter sets the
stage for further exploration into advanced topics, empowering learners to leverage
TensorFlow effectively for building and deploying neural networks across various
applications.
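
The evaluation metrics named above can be illustrated with a small sketch. It is not taken from the course material: the function name classification_metrics and the label lists are hypothetical, and a binary task with 1 as the positive class is assumed.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision: of everything predicted positive, how much really was positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything really positive, how much the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Accuracy alone can be misleading when one class dominates the data, which is why precision and recall are usually reported alongside it.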

1.1 The Hello World of Machine Learning


Consider a system that recognizes activities, using a person's speed as one
input. For instance, if the speed is below 5 km/h, the person is categorized as
walking; if it is above 5 km/h but still within the range of a runner, as running;
otherwise, the person is presumed to be in a vehicle. But what if we wish to add
another activity, such as golfing, which speed alone cannot identify? In such
scenarios, hand-written rules break down and machine learning becomes indispensable.
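
A rule-based version of this activity recognizer might look like the sketch below. The 5 km/h walking threshold comes from the text; the upper limit for running (20 km/h here) is an assumption made only for illustration, since the report does not give one.

```python
def classify_activity(speed_kmh, running_limit_kmh=20.0):
    """Hand-written rules that map a speed to an activity.

    running_limit_kmh is a hypothetical threshold, not a value from the report.
    """
    if speed_kmh < 5.0:
        return "walking"
    if speed_kmh < running_limit_kmh:
        return "running"
    return "in a vehicle"


activities = [classify_activity(s) for s in (3.0, 10.0, 60.0)]
```

No threshold on speed can recognize golfing, which is the limitation that motivates learning the rules from labeled data instead.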

What is Machine Learning?


In traditional programming, we express rules in code. We supply data, the program
applies the rules to it, and it returns answers.

With ML, rather than defining the rules explicitly in a programming language, we
supply the answers, referred to as labels, alongside the data. The machine then
deduces the rules that relate the data to the provided answers. In our example,
the labels indicate which activity is taking place.
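
The idea of deducing rules from data and labels can be sketched without any library. The example below assumes six hypothetical data points whose labels follow y = 2x - 1 and fits a single weight and bias by gradient descent on the mean squared error. It stands in for what a one-neuron network does during training and is not the course's own code.

```python
# Data and answers (labels); the hidden rule is y = 2x - 1.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]


def fit_line(xs, ys, learning_rate=0.05, epochs=1000):
    """Learn w and b so that w * x + b approximates y."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
prediction = w * 10.0 + b  # the learned rule applied to an unseen input
```

The program is never told the rule; it recovers values close to w = 2 and b = -1 from the examples alone.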

As an alternative to traditional programming, the ML approach opens up
scenarios that would be impractical to cover with hand-written rules.

1.2 Introduction to Computer Vision

In this section, we built a simple ML model that identifies clothing items from
the Fashion MNIST dataset.

The dataset contains 70,000 images of clothing in 10 different categories and is
readily available in the Keras library. We do not train on all 70,000 images. The
dataset is divided into two parts: one for training and one for testing, so that
the model's performance can be assessed on data it has not seen during training.
The Keras loader returns the data already divided into 60,000 training images and
10,000 test images; for other datasets, a function such as scikit-learn's
train_test_split() can perform the split.
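
The hold-out idea behind such a split can be sketched with the standard library alone. The helper split_dataset below is hypothetical and only mimics the behaviour of a train/test split on a list of samples; it is not the function used in the course.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=42):
    """Shuffle the samples and hold out a fraction of them for testing."""
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 100 hypothetical sample ids: 80 go to training, 20 are held out.
train, test = split_dataset(range(100))
```

Because the test samples never appear in the training set, the score measured on them estimates how the model behaves on new data.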

Subsequently, to build the neural network, we use the keras.layers.Dense class,
which creates a fully connected layer with the preferred number of outputs. Each
layer can be given an activation function, which introduces non-linearity into
the model. The optimizer and the loss function are the primary components of the
training process: the loss measures how wrong the current predictions are, and
the optimizer adjusts the weights to reduce it. Training iterates over the data
a specified number of times, referred to as epochs.
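
What a fully connected layer and its activation function compute can be shown in plain Python. This is a minimal sketch of the forward pass only, with made-up weights; it is not how Keras implements its layers, and the names dense, relu and softmax are defined here for illustration.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: output_j = sum_i(inputs[i] * weights[i][j]) + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-pixel input, one hidden layer (2 units), three output classes.
pixels = [0.0, 0.5, 1.0]
hidden_w = [[0.2, -0.4], [0.1, 0.3], [-0.5, 0.6]]
hidden_b = [0.1, 0.0]
out_w = [[1.0, -1.0, 0.5], [0.5, 1.0, -0.5]]
out_b = [0.0, 0.0, 0.0]

hidden = relu(dense(pixels, hidden_w, hidden_b))
probs = softmax(dense(hidden, out_w, out_b))
predicted_class = probs.index(max(probs))
```

Training consists of adjusting the weight and bias values so that the highest probability lands on the correct class more often.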

For model testing, we can use the model's evaluate() method in Keras, which
reports the loss and metrics on the test data.

1.3 Introduction to Convolutions and CNN (Convolutional Neural Networks)
Convolution serves as a fundamental operation in signal processing and image
analysis, frequently employed in deep learning for tasks such as image recognition
and feature extraction. This process entails applying a filter, also referred to as a
kernel, to an input image to generate an output feature map.

Convolution operations aid in extracting features like edges, textures, and
patterns from images through a process of sliding the filter across the input image
and performing element-wise multiplications, followed by summation.
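
The sliding-window operation described above can be sketched as follows. The image and the vertical-edge kernel are made-up values, no padding is used and the stride is 1. As is usual in deep learning, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; multiply element-wise and sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 4x5 image that is dark on the left and bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
# A filter that responds to dark-to-bright transitions from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = convolve2d(image, vertical_edge)
```

The resulting feature map is zero over the flat dark region and large where the window covers the edge, which is how a filter highlights a feature.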

Convolutional Neural Networks (CNNs):


CNNs, or Convolutional Neural Networks, constitute a category of deep
neural networks crafted specifically for handling structured grid-like data, such as
images. They comprise several layers, including convolutional layers, pooling
layers, and fully connected layers. CNNs have transformed computer vision tasks by
autonomously acquiring hierarchical representations of visual data, thereby
facilitating applications like object detection, image segmentation, and image
generation.

Key Components of CNN include:
1. Convolutional layers
2. Pooling layers
3. Fully connected layers
4. Activation functions
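
Of the components listed, the pooling layer is the simplest to illustrate. The sketch below assumes max pooling with a 2x2 window and a stride of 2 on a made-up feature map; other pooling types and window sizes exist.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)
```

Pooling halves each spatial dimension here while keeping the strongest response in every region, which reduces computation in the later layers.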

CHAPTER-2: GET STARTED WITH OBJECT DETECTION

"Get Started with Object Detection" marks a pivotal chapter within the
machine learning landscape, focusing on object detection. It serves as an
initiation into object detection using the Object Detection API of Google's
ML Kit. With a focus on practical implementation, learners are introduced to the
foundational concepts of object detection, including bounding boxes, object
localization, and model evaluation metrics.

The chapter commences with an overview of the ML Kit Object Detection API,
highlighting its significance in simplifying the development of object detection
features. Learners are guided through the process of dataset preparation,
where annotated images with labeled bounding boxes are curated to train the model
effectively. Leveraging the power of transfer learning, learners harness pre-trained
models to expedite the training process, optimizing model performance even with
limited data.

2.1 Introduction to Object Detection


Object detection is a computer vision task that entails recognizing and locating
objects of interest within an image or video frame. Unlike image classification,
which assigns a label to the image as a whole, object detection goes further by
drawing a bounding box around each detected object and assigning it a class label.

Key Components and techniques used in object detection include:
1. Bounding box regression
2. Anchor Boxes
3. Feature extraction
4. Two-stage detectors
5. One-stage detectors
6. Evaluation metrics
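
Most evaluation metrics for object detection build on Intersection over Union (IoU), which measures how well a predicted bounding box matches the ground truth. The sketch below assumes boxes given as (left, top, right, bottom) pixel coordinates; the two example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (left, top, right, bottom)."""
    # Corners of the overlapping rectangle, if there is one.
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    intersection = max(0, right - left) * max(0, bottom - top)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0


predicted = (10, 10, 50, 50)
ground_truth = (30, 30, 70, 70)
overlap = iou(predicted, ground_truth)
```

A detection is commonly counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, with 0.5 being a frequent choice.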

Object detection finds utility across diverse domains, spanning autonomous
driving, surveillance, robotics, medical imaging, and retail. Its capabilities empower
systems to recognize and monitor objects in real-time, facilitating tasks such as
object counting, localization, tracking, and instance segmentation.

2.2 Integrate an object detector using ML Kit object detection API

In this module, the focus is on integrating an object detection API into an
Android app, built in Android Studio, using Google's technology stack. The module
provides a structured approach: it supplies boilerplate code that captures images
and passes them on for object detection.

1. Introduction to Object Detection API Integration:


The module begins with an overview of the objectives, emphasizing the
integration of an object detection API into an Android Studio application.
2. Boilerplate Code Deployment:
The module guides learners through the deployment of boilerplate code within
the Android Studio environment. This code serves as a foundation for implementing
the object detection functionality.
3. Camera Integration:
Learners are introduced to the integration of camera functionality within the
Android app. The code enables users to either capture a photo using the device's
camera or select a pre-existing image from the device's gallery.

4. Object Detection Method Implementation:
Within the boilerplate code, an object detection method is deployed. This
method is responsible for analyzing images to detect objects within them.
5. Execution of the Code:
Upon running the integrated code within the Android Studio environment,
users are presented with options to capture images using the device's camera or select
from preset images already available.
6. Object Detection Process:
Once an image is selected, the object detection API implemented in the code
executes. This API is designed to identify and label objects present within the
selected image.
7. Object Detection Results:
The module concludes with the presentation of results obtained from the
object detection process. Learners can observe the detected objects along with their
corresponding labels, providing insight into the effectiveness of the integration.

By following this module, I gained hands-on experience in integrating object
detection capabilities into Android applications, thereby expanding my skill set in
mobile development and machine learning integration.

CHAPTER 3: GO FURTHER WITH OBJECT DETECTION

This chapter introduced a new library called TensorFlow Lite.
TensorFlow Lite serves as a cross-platform machine learning library optimized for
executing machine learning models on edge devices, including Android and iOS
mobile devices.

TensorFlow Lite is the core engine used inside ML Kit to execute machine
learning models. The TensorFlow Lite ecosystem comprises two key components that
streamline the training and deployment of machine-learning models on mobile
devices:

1. Model Maker: A Python library designed to simplify the training of
TensorFlow Lite models using your data with minimal code, reducing the
need for machine-learning expertise.
2. Task Library: A cross-platform library that facilitates the deployment of
TensorFlow Lite models with minimal code integration into your mobile apps.

A pre-trained TensorFlow Lite object detection model is also available for object
detection tasks. The code used in the module works with methods like
run_object_detection(), draw_detection_result(), and more.

Within the TensorFlow Lite ecosystem, Model Maker simplifies the creation of
custom machine learning models for tasks such as image classification, text
classification, and, most relevant to this chapter, object detection.

For object detection, Model Maker offers a structured workflow for defining the
object classes, configuring the data inputs, and tuning the model parameters. The
Task Library then takes over on the deployment side, running the resulting model
inside the mobile app.

Dataset preparation is a critical phase in model development. Each image in the
dataset must be annotated with labeled bounding boxes before it is handed to
Model Maker. Leveraging transfer learning, Model Maker starts from a pre-trained
model rather than training from scratch. This approach not only expedites training
but also enhances model performance, particularly in scenarios with limited data.

During training, Model Maker runs through the epochs automatically, iteratively
refining the model parameters. Alongside, users monitor key metrics such as
accuracy and loss to gain insight into model performance. Once the metrics are
satisfactory, Model Maker's export utilities make the transition from training to
inference straightforward, so that the trained model is readily available for
real-world applications.

CHAPTER 4: GET STARTED WITH PRODUCT IMAGE SEARCH

This chapter covers the functionality and integration of Vision API Product
Search, which enables the kind of in-image product search seen in applications
like Google Lens. Using machine learning, Vision API Product Search lets
developers analyze image content and retrieve matching product details.

The project is started in Android Studio, an IDE for Android app development,
where the ML Kit Object Detection and Tracking API is integrated. This integration
gives developers the tools to add object detection to their Android applications,
enhancing user experience and interactivity.

Within the provided codebase, developers encounter boilerplate code
facilitating image capture from either the device's camera or pre-loaded images,
streamlining the process of object detection. This functionality ensures a
user-friendly experience, enabling effortless interaction with the object detection
feature.

Central to the project is the object detection method demonstrated in the
starter app. In this method, a detector instance is created, enabling the
identification and selection of objects within captured images. This step supplies
the object that is later passed to Vision API Product Search.

The image processing follows a structured workflow: first an input image is
created, then a detector instance is initialized, and finally the image is passed
through the detector for analysis and object identification. This systematic
approach keeps object detection within the Android application efficient and
accurate.

CHAPTER 5: GO FURTHER WITH PRODUCT IMAGE SEARCH

The preceding chapter of our project journey involved the loading of the
starter app within Android Studio, where we successfully detected objects present
in the provided images and conducted subsequent product searches. This initial
phase laid the groundwork for our exploration into more advanced functionalities
within the application.

The starter app exhibited the capability to detect objects within images
autonomously. However, considering scenarios where multiple objects may exist
within an image or where the detected object occupies a small portion of the overall
image, user interaction becomes essential. To address this, the app required
enhancements that enable users to tap on a detected object, thereby indicating their
selection for product search purposes.
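
The tap-to-select behaviour can be sketched as a hit test over the detected bounding boxes. This is an illustration of the logic only, not the app's Android code: the detection dictionaries, labels and coordinates are hypothetical, and choosing the smallest box when several contain the tap is an assumed tie-breaking rule.

```python
def box_area(box):
    """Area of a (left, top, right, bottom) box."""
    return (box[2] - box[0]) * (box[3] - box[1])


def select_tapped_object(detections, tap_x, tap_y):
    """Return the detection whose box contains the tap, or None if there is none."""
    hits = [
        d for d in detections
        if d["box"][0] <= tap_x <= d["box"][2] and d["box"][1] <= tap_y <= d["box"][3]
    ]
    if not hits:
        return None
    # If boxes overlap, prefer the smallest one: it is the most specific object.
    return min(hits, key=lambda d: box_area(d["box"]))


# Hypothetical detector output for one image.
detections = [
    {"label": "table", "box": (0, 100, 400, 300)},
    {"label": "bottle", "box": (120, 120, 160, 220)},
    {"label": "shoe", "box": (300, 320, 380, 380)},
]
selected = select_tapped_object(detections, tap_x=140, tap_y=150)
```

The region of the selected box is what would then be cropped from the image and sent on as the product search query.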

Transitioning to the current chapter, we set out to establish our own custom
backend, a crucial step towards more advanced functionality. This process began
with the Vision API Product Search quick start guide, which provided valuable
insights into creating a backend tailored to our specific requirements.

A significant aspect of this chapter involved acquainting ourselves with
various classes, including the Product Search API client class, which is designed
to handle interaction with the product search backend. We examined the
components of this class to understand their respective functions and capabilities.

CHAPTER 6: GO FURTHER WITH IMAGE CLASSIFICATION

In this chapter, we built a custom image classifier using Model Maker, part of
TensorFlow Lite, a toolset for deploying machine learning models on mobile and
embedded devices. Model Maker streamlines neural network design by abstracting
away details such as network architecture, convolutional layers, activation
functions like ReLU, flattening, loss functions, and optimizers. This abstraction
lets developers focus on model training and evaluation.

One of the standout features of Model Maker is its ability to generate default
models with remarkable ease. With just a single line of code, developers can initiate
the model creation process, leveraging the underlying neural network to undergo
training based on provided datasets. This streamlined approach significantly
expedites the model development process, enabling rapid iteration and
experimentation.

To facilitate the model training process, pre-existing datasets were sourced
from the Keras library and meticulously divided into distinct training and validation
sets. This separation ensures the model's ability to generalize well to unseen data, a
critical aspect of robust machine learning model development.

Upon successful completion of the training phase, the trained model is ready
for deployment within mobile applications. Utilizing the TensorFlow Lite
framework, the trained model is exported into a file format known as .tflite,
optimized for efficient execution on resource-constrained mobile and embedded
platforms. This exportation process paves the way for seamless integration of the
custom image classifier into mobile applications, enabling real-world deployment
and utilization.

CASE STUDY

Smart Shopping Assistant


Imagine developing a smart shopping assistant app that helps users identify
and find products based on images captured with their smartphones. The app
integrates object detection, product image search, and image classification
functionalities to provide a seamless shopping experience.

Object Detection:

• When users launch the app and point their smartphone camera at a scene,
the app utilizes object detection to identify and outline various objects
within the camera view.

• For instance, if the user points the camera at a table with different items,
the app detects and outlines individual products like bottles, fruits, and
electronics.

Product Image Search:

• Upon detecting products in the scene, the app prompts users to select a
specific product they're interested in purchasing.

• Once the user selects a product, the app initiates a product image search
using the selected item's image.

• Leveraging the Vision API Product Search, the app sends the product
image to the backend, which returns visually similar products from the
app's product catalog.
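
The report does not describe how the backend ranks products, so the sketch below is only a generic illustration of similarity search and not the Vision API Product Search algorithm. It assumes that each image has already been reduced to a short feature vector; the vectors, product names and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_catalog(query_vector, catalog, top_k=2):
    """Return the names of the top_k catalog items most similar to the query."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical feature vectors for three catalog products.
catalog = {
    "red sneaker": [0.9, 0.1, 0.0],
    "blue sneaker": [0.7, 0.1, 0.6],
    "glass bottle": [0.0, 0.9, 0.3],
}
query = [1.0, 0.2, 0.1]  # vector for the product the user selected
results = search_catalog(query, catalog)
```

In the case study, the list returned by such a search is what the app would show to the user as visually similar products.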

Image Classification:

• As the app retrieves visually similar products, it performs image
classification to refine the search results further.

• Each retrieved product is classified into specific categories or
subcategories, such as "fruits," "electronics," or "home appliances."

• This classification allows the app to present more accurate and relevant
product suggestions to the user based on their preferences and browsing
history.

Personalization and Recommendations:

• The app leverages machine learning algorithms to analyze user interactions
and preferences over time.

• By tracking user behavior, such as past purchases and product views, the
app offers personalized product recommendations tailored to each user's
interests.

• For example, if a user frequently searches for electronic gadgets, the app
prioritizes showing similar products in the search results.

Integration with E-Commerce Platforms:

• To enable seamless purchasing, the app integrates with popular
e-commerce platforms or retailers' APIs.

• Users can view detailed product information, reviews, and pricing from
various online stores directly within the app.

• Additionally, users can add products to their shopping carts and complete
purchases without leaving the app.

Feedback and Continuous Improvement:

• The app collects user feedback and engagement metrics to continuously
improve its performance and user experience.

• Machine learning models are periodically updated and retrained using the
latest data to enhance accuracy and relevance in product detection, search,
and classification.

• By combining object detection, product image search, and image
classification, the smart shopping assistant app offers users a convenient
and personalized shopping experience, empowering them to effortlessly
disco

CONCLUSION

The internship was a transformative learning experience that immersed me in
the intricacies of machine learning with TensorFlow.
Beginning with the fundamentals of neural networks, I gained a comprehensive
understanding of model architecture, parameter tuning, and the training process.
This foundational knowledge laid the groundwork for my exploration into more
specialized areas of machine learning.

In summary, my internship experience has been an enriching journey through
various facets of machine learning, leveraging the powerful capabilities of
TensorFlow. Beginning with foundational concepts in neural networks, I acquired
essential skills in model construction, training, and evaluation. As I progressed, I
delved into practical applications such as object detection and product image search,
where I learned to curate datasets, train custom models, and deploy them effectively.

Exploring advanced techniques in subsequent chapters further deepened my
understanding of machine learning methodologies. From optimizing model
performance to enhancing scalability, each chapter provided invaluable insights into
the complexities of real-world applications.

Overall, this internship has equipped me with a diverse skill set and a profound
appreciation for the potential of machine learning in addressing complex problems.
As I transition to the next phase of my career, I am confident that the knowledge and
experiences gained during this internship will serve as a solid foundation for future
endeavors in the dynamic field of machine learning.

REFERENCES

https://developers.google.com/learn/pathways/tensorflow

https://developers.google.com/learn/pathways/get-started-object-detection

https://developers.google.com/learn/pathways/going-further-object-detection

https://developers.google.com/learn/pathways/get-started-image-product-search

https://developers.google.com/learn/pathways/going-further-image-product-search

https://developers.google.com/learn/pathways/going-further-image-classification
