
“AIR CANVAS USING OPENCV-PYTHON”

A mini project report submitted to

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD

in partial fulfillment of the requirement for the award of the degree of


BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
Submitted by

1. S. Architha (H.T.No: 19N01A0597)
2. Khuteja Nazlee (H.T.No: 19N01A0563)
3. S. Pooja (H.T.No: 19N01A0598)
4. P. Laxmi (H.T.No: 19N01A0580)
5. S. Sai Surya (H.T.No: 19N01A0599)

Under the Supervision of


Mr. R. RAMESH
Associate Professor

Department of Computer Science and Engineering


SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)

THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
DECEMBER-2022
SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

CERTIFICATE

We, S. ARCHITHA (19N01A0597), KHUTEJA NAZLEE (19N01A0563), S. POOJA
(19N01A0598), P. LAXMI (19N01A0580), S. SAI SURYA (19N01A0599), students of
Bachelor of Technology in Computer Science and Engineering during the academic year
2022-2023, hereby declare that the work presented in this Project Work entitled “AIR CANVAS
USING OPENCV-PYTHON” is the outcome of our own bona fide work and is correct to the best
of our knowledge. This work has been undertaken with due regard for Engineering Ethics and
carried out under the supervision of Mr. R. Ramesh, Assistant Professor.

It contains no material previously published or written by another person nor material which
has been accepted for the award of any other degree or diploma of the university or other institute
of higher learning, except where due acknowledgment has been made in the text.

Project Guide Head of the Department

Mr. R. RAMESH Mr. KHAJA ZIAUDDIN


Associate Professor Associate Professor & HOD
Department of CSE Department of CSE

EXTERNAL EXAMINER

CIN NUM: U80900TG2019PTC134293

Date: 25-07-2022
TO WHOMSOEVER IT MAY CONCERN

This is to certify that S. ARCHITHA, KHUTEJA NAZLEE, S. POOJA, P. LAXMI, and S. SAI
SURYA, bearing H.T. Nos. 19N01A0597, 19N01A0563, 19N01A0598, 19N01A0580 and 19N01A0599,
B.Tech IV Year CSE-B, Sree Chaitanya College of Engineering, Karimnagar, have done their
project work/internship on the subject of “Air Canvas Using OpenCV-Python”.
They have done the project/internship under the guidance of Mr. Surya Chandana, Asst.
Manager, at MSR EDUSOFT PVT LTD, Hyderabad.

During the period of their project work with us, we found their conduct and character to be good.
We wish them all the best in their future endeavors.

Managing Director

(D.MAHESWARA REDDY)

SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

DECLARATION

We, S. ARCHITHA (19N01A0597), KHUTEJA NAZLEE (19N01A0563),
S. POOJA (19N01A0598), P. LAXMI (19N01A0580), S. SAI SURYA (19N01A0599), students
of Bachelor of Technology in Computer Science and Engineering during the
academic year 2022-2023, hereby declare that the work presented in this Project Work entitled
“AIR CANVAS USING OPENCV-PYTHON” is the outcome of our own bona fide work and
is correct to the best of our knowledge. This work has been undertaken with due regard for
Engineering Ethics and carried out under the supervision of Mr. R. Ramesh, Assistant
Professor.

It contains no material previously published or written by another person nor material


which has been accepted for the award of any other degree or diploma of the university or other
institute of higher learning, except where due acknowledgment has been made in the text.

1. S. Architha (H.T.No: 19N01A0597)
2. Khuteja Nazlee (H.T.No: 19N01A0563)
3. S. Pooja (H.T.No: 19N01A0598)
4. P. Laxmi (H.T.No: 19N01A0580)
5. S. Sai Surya (H.T.No: 19N01A0599)

Date:

Place:

SREE CHAITANYA COLLEGE OF ENGINEERING
(Affiliated to JNTUH, HYDERABAD)
THIMMAPUR, KARIMNAGAR, TELANGANA-505 527
Department of Computer Science and Engineering

ACKNOWLEDGEMENTS
The satisfaction that accompanies the successful completion of any task would be
incomplete without the mention of the people who made it possible and whose constant
guidance and encouragement crown all efforts with success.
We would like to express our sincere gratitude and indebtedness to our project supervisor,
Mr. R. Ramesh, Assistant Professor, Department of Computer Science and Engineering, Sree
Chaitanya College of Engineering, LMD Colony, Karimnagar, for his valuable suggestions
and interest throughout the course of this project.
We are also thankful to the Head of the Department, Mr. Khaja Ziauddin, Associate Professor
& HOD, Department of Computer Science and Engineering, Sree Chaitanya College of
Engineering, LMD Colony, Karimnagar, for providing excellent infrastructure and a nice
atmosphere for completing this project successfully.
We sincerely extend our thanks to Dr. G. Venkateswarlu, Principal, Sree Chaitanya
College of Engineering, LMD Colony, Karimnagar, for providing all the facilities required for
the completion of this project.
We convey our heartfelt thanks to the lab staff for allowing us to use the required equipment
whenever needed.
Finally, we would like to take this opportunity to thank our families for their support throughout
this work.
We sincerely acknowledge and thank all those who directly or indirectly gave their support in
the completion of this work.

1. S. Architha (H.T.No: 19N01A0597)
2. Khuteja Nazlee (H.T.No: 19N01A0563)
3. S. Pooja (H.T.No: 19N01A0598)
4. P. Laxmi (H.T.No: 19N01A0580)
5. S. Sai Surya (H.T.No: 19N01A0599)

ABSTRACT

Writing in the air has been one of the most fascinating and challenging research areas in the
field of image processing and pattern recognition in recent years. It contributes immensely to the
advancement of automation and can improve the interface between man and machine
in numerous applications. Generally, a video analysis procedure has three major steps: first,
detecting the object; second, tracking its movement from frame to frame; and lastly,
analysing the behaviour of that object. For object tracking, four different issues are taken into
account: selection of a suitable object representation, feature selection for tracking, object
detection and object tracking. In the real world, object tracking algorithms are a primary part
of different applications such as automatic surveillance, video indexing and vehicle navigation.
The project takes advantage of this gap and focuses on developing a motion-to-text
converter that can potentially serve as software for intelligent wearable devices for writing
in the air. The project recognises occasional gestures and uses computer vision to
trace the path of the finger. The generated text can also be used for various purposes, such as
sending messages, emails, etc. It can be a powerful means of communication for the deaf, and
an effective communication method that reduces mobile and laptop usage by eliminating
the need to write.

CONTENTS

TITLE PAGE NO
CERTIFICATE i
DECLARATION ii
ACKNOWLEDGEMENT iii
ABSTRACT iv
CONTENTS v-vi
LIST OF FIGURES vii
CHAPTER–1 : INTRODUCTION 01-09
1.1 Overview 02
1.2 Motivation 03-04
1.3 Existing System 05
1.4 Proposed System 05-06
1.4.1 Features 06
1.4.2 Scope 07
1.4.3 Applications 07
1.4.4 Modules of Proposed System 08
1.5 Objective 08-09
CHAPTER–2 : LITERATURE SURVEY 10-12
CHAPTER–3 : PROBLEM DEFINITION 13-14
CHAPTER–4 : SOFTWARE AND HARDWARE REQUIREMENTS 15-23
4.1 Hardware Requirements 15-17
4.2 Software Requirements 24-23
CHAPTER-5 : DESIGN 24-29
5.1 Architecture 24-25
5.2 Module Description 25-26
5.3 System Workflow 27-28

5.4 UML diagrams 29

5.5 Sequence diagram 30
5.6 Interaction among all modules 31

CHAPTER-6 : IMPLEMENTATION 32-54
6.1 Sample Code 32-36

6.2 Algorithm 36-37


6.3 Code 37-54
CHAPTER-7 : RESULTS AND OUTPUT SCREENS 55-56
CHAPTER-8 : CONCLUSION AND FUTURE WORK 57-59
8.1 Conclusion 57
8.2 Future Enhancement 58-59
REFERENCES 60

LIST OF FIGURES

FIGURE NO. NAME OF THE FIGURE PAGE NO.

Fig.1.1 flow chart of proposed system 06


Fig.4.1 Object Identification 21

Fig.4.2 Representation of Gray Scale image 22

Fig.4.3 Representation of RGB pixel values 22


Fig.4.4 Picture representation of Air Canvas 23

Fig.5.1 Flow chart of Air Canvas 24


Fig.5.2 Working of Air Canvas 28
Fig.5.3 Use Case Diagram 29

Fig5.4 Sequence Diagram 30


Fig.6.1 Output to draw blue circles on an image 33
Fig.6.2 Output on a webcam 34

Fig.6.3 Output to draw on an image using the mouse 34


Fig.6.4 Output to convert video into individual channels 36
of the HSV image in separate windows
Fig.7.1 Live frame window is opened 55
Fig.7.2 Colour is detected in mask window 55
Fig.7.3 Contour is detected 56
Fig.7.4 Output of Air Canvas 56

CHAPTER 1
INTRODUCTION

Air Canvas, a project built in Python, is a computer vision project that uses a cross-platform
framework for building multimodal applied machine learning pipelines. Computer Vision is a
field of Artificial Intelligence (AI) that enables computers and systems to derive meaningful
information from digital images, videos and other visual inputs, and to take actions or make
recommendations based on that information.

Air Canvas provides a white canvas on which one can draw without physically touching
any input devices. Yes, you read it right: Air Canvas does not require any physical contact with
input devices in order to draw on the canvas. Using the power of computer vision and deep
learning, one can interact with the canvas using the tip of a finger. All the user needs to do is
show their hand in front of the web camera. The system detects the user's palm and tracks 21
different landmarks on it. Using the tip of the pointing finger, one can interact with the canvas.
The aim of building this project is to make virtual classes effective and easy for teachers who
face difficulties drawing or writing with a mouse and have neither a touchscreen laptop nor any
other pen input device. They can simply draw on the board using the webcam and the tip of
their finger; no additional hardware is required.
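As an illustration of the 21-landmark hand tracking described above, the following is a minimal sketch assuming the MediaPipe Hands solution is used; it is not the project's own code, and only landmark 8 (the index fingertip) is read and highlighted on the live frame.

# Minimal sketch (not the project's code): track the index fingertip with
# MediaPipe Hands (21 landmarks per hand) on a webcam stream.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                     # mirror for natural interaction
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # MediaPipe expects RGB input
    result = hands.process(rgb)
    if result.multi_hand_landmarks:
        tip = result.multi_hand_landmarks[0].landmark[8]   # index fingertip
        h, w, _ = frame.shape
        x, y = int(tip.x * w), int(tip.y * h)              # normalised -> pixel coordinates
        cv2.circle(frame, (x, y), 8, (0, 255, 0), -1)      # mark the fingertip
    cv2.imshow("Fingertip tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()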

An air canvas is a type of interactive display that allows users to draw or paint in mid-air,
using gestures or other input devices. While it is possible to use objects, such as pens or sticks,
to create an air canvas, this would likely require some additional technology, such as a
depth-sensing camera or another input device that can detect the movement of objects in 3D space.

This technology could then be used to track the movement of the objects and convert it
into digital brush strokes or other drawing tools. However, creating an air canvas using objects
would likely be a complex and challenging task, and there are likely to be many technical
challenges involved. It is also worth noting that using objects to create an air canvas may not
be as intuitive or user-friendly as using gestures or other input devices designed specifically
for this purpose.

A camera can be used as part of an air canvas system, but it is not the only technology
required. In order to create an air canvas that allows users to draw or paint in mid-air using
gestures, a depth-sensing camera or other input device is needed that can detect the movement
of objects in 3D space. This technology could then be used to track the movement of users'
hands or other objects, and convert them into digital brush strokes or other drawing tools.
However, creating an air canvas using a camera is likely to be a complex and challenging task,
and there are likely to be many technical challenges involved. It would also require some
knowledge of computer vision and image processing.

Air Canvas is a software tool that allows users to create and manipulate digital images
using hand gestures. It uses a combination of computer vision and machine learning algorithms
to track hand movements and interpret them as drawing commands. Air Canvas allows users
to draw, paint, and erase with their hands, using a webcam or other video input device as the
input source. It can be used for a variety of creative and artistic purposes, as well as for
educational and research applications.

1.1 OVERVIEW

An air canvas is a virtual whiteboard that allows users to draw or write in the air using
gestures. It can be implemented using the OpenCV (Open Source Computer Vision) library,
which is a popular open-source library for computer vision tasks. To implement an air
canvas using OpenCV, you will need a camera to capture the gestures made by the user.
You will also need some way of detecting and tracking the gestures, such as using color
tracking or motion tracking algorithms. Once the gestures have been detected and tracked,
you can use OpenCV to draw the corresponding lines or text on a virtual canvas. This can
be displayed on a screen or projected onto a surface, allowing the user to see their drawings
in real-time. Overall, implementing an air canvas using OpenCV requires a combination of
computer vision techniques and interactive graphics. It can be a challenging project, but
can also be very rewarding as it allows users to interact with technology in a natural and
intuitive way.

An "air canvas" is a computer vision project that allows users to draw in the air using hand
gestures, which are captured by a camera and processed using computer vision algorithms. The
resulting drawings can be displayed on a screen in real-time. One way to implement an air
canvas using OpenCV (OpenCV is a computer vision library that provides tools for image and
video analysis) and Python is to first set up a camera to capture video frames of the user's hand
gestures. Then, the video frames can be processed using OpenCV to detect and track the
movement of the user's hand. This can be done using techniques such as object detection, image
segmentation, or contour detection.

Once the hand movement has been detected and tracked, the air canvas can use this
information to generate a corresponding drawing on a display screen. This can be done by
mapping the movement of the hand to the movement of a cursor on the screen, or by using the
hand movement to draw lines or shapes directly onto the screen. Overall, an air canvas project
using OpenCV and Python involves using computer vision algorithms to detect and track hand
gestures in real-time, and using this information to generate a corresponding drawing on a
screen.
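To make that last step concrete, the sketch below shows one way the tracked fingertip positions could be turned into strokes and overlaid on the live frame. It is a hedged illustration: track_fingertip() is a hypothetical placeholder for whichever detection method is actually used (colour tracking, contour detection or hand landmarks).

# Hypothetical sketch: join successive fingertip positions into strokes on a
# canvas image and blend the canvas over the live frame.
import cv2
import numpy as np

def track_fingertip(frame):
    """Placeholder: return (x, y) of the fingertip, or None if it is not found."""
    return None

cap = cv2.VideoCapture(0)
canvas = None
prev_point = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if canvas is None:
        canvas = np.zeros_like(frame)                 # blank canvas, same size as the frame
    point = track_fingertip(frame)
    if point is not None and prev_point is not None:
        cv2.line(canvas, prev_point, point, (255, 0, 0), 4)   # draw the stroke segment
    prev_point = point                                # a lost fingertip breaks the stroke
    blended = cv2.addWeighted(frame, 0.7, canvas, 0.3, 0)     # overlay the drawing on video
    cv2.imshow("Air canvas", blended)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()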

1.2 MOTIVATION

There are several motivations for creating an "air canvas" using OpenCV and Python:

• Fun and entertainment: An air canvas can be a fun and interactive way for users to
create art or play games using hand gestures.
• Accessibility: An air canvas can be used to create an interface that is more accessible
to users with disabilities, who may have difficulty using a traditional input device such
as a mouse or keyboard.
• Educational purposes: An air canvas project can be a good opportunity to learn about
and experiment with computer vision algorithms and techniques, such as object
detection and image segmentation.
• Novelty: An air canvas can be a unique and attention-grabbing way to display and
interact with digital content, such as at a trade show or exhibition.

• Improved usability: An air canvas can provide a more natural and intuitive way for
users to interact with digital content, compared to traditional input devices like a mouse
or keyboard.
• Virtual reality (VR) and augmented reality (AR) applications: An air canvas can be
used to create interactive VR or AR experiences that allow users to draw or interact
with virtual objects using hand gestures.
• Interactive installations: An air canvas can be used to create interactive installations or
exhibits that allow users to interact with digital content using hand gestures.
• Gesture-based control: An air canvas can be used to create a gesture-based control
interface for devices such as smart home systems or gaming consoles.
• Advertising and marketing: An air canvas can be used as an attention-grabbing tool for
advertising and marketing campaigns, allowing users to interact with digital content in
a unique and memorable way.
• Medical rehabilitation: An air canvas can be used as a therapeutic tool for patients
undergoing medical rehabilitation, helping them to improve their hand-eye
coordination and fine motor skills.
• Art therapy: An air canvas can be used as a tool for art therapy, allowing users to express
themselves creatively and emotionally through the medium of hand gestures.
• Gaming: An air canvas can be used to create innovative and immersive gaming
experiences that allow users to interact with the game using hand gestures.
• Virtual meetings and presentations: An air canvas can be used to enhance virtual
meetings and presentations, allowing users to draw or annotate digitally in real-time
using hand gestures.
• Music performance: An air canvas can be used as a tool for music performance,
allowing users to create music or control audio effects using hand gestures.

Overall, an air canvas project using OpenCV and Python can be a fun and educational way
to explore the capabilities of computer vision, and can have applications in a variety of
contexts.

1.3 EXISTING SYSTEM

The pen in the existing system consists of a tri-axial accelerometer, a microcontroller, and an
RF wireless transmission module for sensing and collecting the accelerations of handwriting and
motion trajectories. The embedded system first extracts time- and frequency-domain features
from the acceleration signals and then transmits the signals using an RF transmitter. On the
receiver side, the RF signals are received by an RF receiver and passed to a microcontroller. The
controller processes the data, and the results are finally shown on a graphical LCD.

Fingertip detection:
The existing system only works with your fingers; there are no highlighters, markers, or the like.
Identifying and characterizing an object such as a finger from an RGB image without
a depth sensor is a great challenge.
Lack of pen-up and pen-down motion:
The system uses a single RGB camera viewing from above. Since depth sensing is not
possible, up-and-down pen movements cannot be followed. Therefore, the fingertip's entire
trajectory is traced, and the resulting image would be abstract and not recognized by the model.
Controlling the real-time system:
Using real-time hand gestures to change the system from one state to another requires a lot
of care in code. Also, the user must know many movements to control the system adequately.

1.4 PROPOSED SYSTEM

This computer vision experiment uses an air canvas, which allows you to draw on a screen
by waving a finger equipped with a colorful tip or a basic colored cap. These computer vision
projects would not have been possible without OpenCV's help. No keypads, styluses, pens, or
gloves are needed for character input in the suggested technique.

In this proposed framework, we are going to use a webcam and a display unit (monitor screen).
Here, a pen or hand is used to draw patterns in front of the camera; we then track those
patterns and show them on the display unit. Our embedded framework is suited to decoding
time-series acceleration signals into meaningful feature vectors. Users can make use of the pen
to compose digits or make hand motions, and the result is shown on the display unit.

Figure 1.1: flow chart of proposed system

1.4.1 Features of Air Canvas


• Can track any specific colored pointer.
• The user can draw in four different colors and even change them without any hassle.
• The board can be cleared using a single button at the top of the screen.
• No need to touch the computer once the program is running.

1.4.2 Scope
The scope of computer vision, and of OpenCV, is huge. Object (human and non-human)
detection in both commercial and governmental spaces is huge and already happens in
many ways.
Transportation - autonomous driving and ADAS (Advanced Driver Assistance Systems), with
traffic sign detection, pedestrian detection, and safety features such as driver fatigue detection.
Medical imaging - mammography, cardiovascular and microscopic image analysis; a great deal
of computer-imaging-aided decision-making, such as automated detection and counting of
microorganisms, involves the use of OpenCV.
Manufacturing - a ton of computer vision work here as well, such as rotation-invariant detection
on a conveyor belt and detection for robotic gripping. Public order and security -
pedestrian/citizen detection and tracking, crowd management, and prediction of future events.

1.4.3 Applications:
Python may be used to quickly analyze photos and videos and extract meaningful
information from them, thanks to the many methods provided in OpenCV. Other frequent uses
include:
Image Processing:
There are several ways in which OpenCV may be used to process and interpret images,
such as altering their shape or colour, or extracting important information from the supplied
picture and writing it into a new image.
Face Detection:
By employing Haar-Cascade classifiers, either on locally recorded videos or photos or
on a live stream from a web camera.
Face Recognition:
In order to identify faces in videos, face detection is performed using OpenCV by
generating bounding boxes (rectangles), followed by model training using ML methods.
Object Detection:
OpenCV and YOLO, an object detection method, may be used to identify moving or
stationary objects in images and videos.
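For illustration, a minimal face detection sketch with the Haar-cascade classifier bundled in the opencv-python package might look as follows; the parameter values are common defaults rather than values taken from this project.

# Illustrative sketch: detect faces on a webcam stream with a Haar cascade.
import cv2

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # cascades work on grayscale images
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)   # draw bounding boxes
    cv2.imshow("Faces", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()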

1.4.4 Modules of Proposed System

1. Color Tracking:
Understanding the HSV (Hue, Saturation, Value) color space for color tracking, and
tracking the small colored object at the fingertip. The incoming image from the
webcam is converted to the HSV color space for detecting the colored object at the
tip of the finger.
2. Trackbars:
Once the trackbars are set up, we read the real-time values from them and build a range.
This range is a NumPy structure that is passed to the function cv2.inRange(). This function
returns the mask of the colored object: a black-and-white image with white pixels at the
positions of the desired color.
3. Contour Detection:
Detecting the position of the colored object at the fingertip and drawing a circle over it. We
perform some morphological operations on the mask to free it from impurities and to detect
the contour easily. That is contour detection.
4. Frame Processing:
Tracking the fingertip and drawing points at each position for the air canvas effect. That
is frame processing.
5. Algorithmic Optimization:
Making the code efficient so that the program runs smoothly.
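A condensed, illustrative sketch of how these modules can fit together with OpenCV (assuming OpenCV 4.x) is given below; it is not the project's full code, which appears in Chapter 6, and the window name and initial trackbar values are assumptions.

# Sketch of the modules above: HSV trackbars -> cv2.inRange() mask ->
# morphological clean-up -> contour detection -> circle over the coloured cap.
import cv2
import numpy as np

def nothing(x):
    pass

cv2.namedWindow("Detector")
for name, value, maximum in [("Upper Hue", 153, 180), ("Upper Sat", 255, 255),
                             ("Upper Val", 255, 255), ("Lower Hue", 64, 180),
                             ("Lower Sat", 72, 255), ("Lower Val", 49, 255)]:
    cv2.createTrackbar(name, "Detector", value, maximum, nothing)

cap = cv2.VideoCapture(0)
kernel = np.ones((5, 5), np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)     # colour tracking in HSV space
    upper = np.array([cv2.getTrackbarPos("Upper Hue", "Detector"),
                      cv2.getTrackbarPos("Upper Sat", "Detector"),
                      cv2.getTrackbarPos("Upper Val", "Detector")], dtype=np.uint8)
    lower = np.array([cv2.getTrackbarPos("Lower Hue", "Detector"),
                      cv2.getTrackbarPos("Lower Sat", "Detector"),
                      cv2.getTrackbarPos("Lower Val", "Detector")], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)            # white where the colour matches
    mask = cv2.erode(mask, kernel, iterations=1)     # morphological clean-up
    mask = cv2.dilate(mask, kernel, iterations=1)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        c = max(contours, key=cv2.contourArea)       # largest blob = coloured cap
        (x, y), radius = cv2.minEnclosingCircle(c)
        cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 255), 2)
    cv2.imshow("Detector", frame)
    cv2.imshow("Mask", mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()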

1.5 OBJECTIVE

The main objective of an "air canvas" system using OpenCV and Python is to allow users
to create drawings or artworks using hand gestures, which are captured by a camera and
processed in real-time using computer vision algorithms.

To achieve this objective, the system would need to detect and track the movement of the
user's hand in the video frames captured by the camera, and then use this information to
generate a corresponding drawing on a screen. This could be done by mapping the movement
of the hand to the movement of a cursor or drawing tool on the screen, or by using the hand
movement to draw lines or shapes directly onto the screen.

In addition to the main objective of creating drawings or artworks using hand gestures, an
air canvas system using OpenCV and Python could also have the following secondary
objectives:

• To provide a fun and interactive way for users to create art or play games using hand
gestures.
• To create an interface that is more accessible to users with disabilities, who may have
difficulty using a traditional input device such as a mouse or keyboard.
• To provide an opportunity for users to learn about and experiment with computer vision
algorithms and techniques, such as object detection and image segmentation.
• To create a unique and attention-grabbing way to display and interact with digital
content, such as at a trade show or exhibition.
• To provide a more natural and intuitive way for users to interact with digital content,
compared to traditional input devices like a mouse or keyboard.
• To create interactive VR or AR experiences that allow users to draw or interact with
virtual objects using hand gestures.
• To create interactive installations or exhibits that allow users to interact with digital
content using hand gestures.
• To create a gesture-based control interface for devices such as smart home systems or
gaming consoles.
• To use as an attention-grabbing tool for advertising and marketing campaigns, allowing
users to interact with digital content in a unique and memorable way.
• To use as a therapeutic tool for patients undergoing medical rehabilitation, helping them
to improve their hand-eye coordination and fine motor skills.
• To use as a tool for art therapy, allowing users to express themselves creatively and
emotionally through the medium of hand gestures.
• To create innovative and immersive gaming experiences that allow users to interact
with the game using hand gestures.
• To enhance virtual meetings and presentations, allowing users to draw or annotate
digitally in real-time using hand gestures.
• To use as a tool for music performance, allowing users to create music or control audio
effects using hand gestures.

CHAPTER – 2

LITERATURE SURVEY

Automatic object tracking has many uses in computing, such as computer vision and human-
machine interaction. Various applications of tracking algorithms have been suggested in the
literature. One team of researchers used it to interpret sign languages, some to recognise hand
gestures, another group for text tracking and recognition, and others for body monitoring,
tracking the visible movement of objects, and character recognition based on finger tracking.
Bragatto et al. built a system that automatically translates Brazilian Sign Language from video
input. They use a multilayer perceptron neural network with a piecewise-linear activation
function for real-time video processing; this activation function reduces the computational
complexity of the network. Moreover, they use the network in two stages: colour recognition
and a step to check hand shape. Their results show that the proposed method works well, with a
recognition rate of 99.2%. Cooper also introduced a method for sign language recognition on a
standard dataset, developing a process that reduces tracking errors arising from the
segmentation and tracking procedures. Cooper used two processing modalities: one for hand
movement, and the other for classifying hand shape. He used visemes to expand the vocabulary;
a viseme is the position of the mouth while pronouncing a phoneme, i.e. the visual
representation of a phoneme. Over time, this has become a less formal way to identify the
characters.

Araga et al. proposed a hand gesture recognition system that uses a Jordan Recurrent Neural
Network (JRNN). Their system compared 5 and 9 different hand gestures represented by
sequences of still images. It takes the recording as input and begins by segmenting the shape of
the hands. The JRNN receives the gesture input after the temporal behaviour of the sequence of
positions has been detected. In addition, they created a new training method. The proposed
method shows 99.0% accuracy on five different gestures, while reaching 94.3% accuracy on
nine gestures. Yang et al. discussed another solution that matches sequences of images against
patterns, as commonly occurs in hand gesture recognition. The proposed method does not
depend on skin-colour models and can work even with imperfect segmentation. They use a
cross-cluster process for both the classification and the recognition stages. Their results show
roughly 5% better performance on both models. Neumann et al. built a method to detect and
recognise text in real photos. In their article, they use a hypothesis framework that can manage
multiple lines of text. They also use synthetic characters to train the algorithm, and, finally, they
use maximally stable extremal regions (MSER), which provide robustness to geometric
transformations and lighting.

In addition, Wang et al. discussed a colour-based motion detection system for indoor and
outdoor settings. In the proposed approach, a webcam is used and a T-shirt is the tracked item.
The results of the proposed method show that it can be applied to practical, real-world
applications. Jari Hannuksela et al., Toshio Asano et al. and Sharad Vikram et al. describe
finger-based tracking and recognition systems. The authors introduce a motion-based tracking
algorithm that combines two Kalman filtering techniques and the expectation-maximisation
(EM) method for estimating two different motions: finger movement and camera movement.
The motion is estimated from features counted in each image. The idea is to control mobile
phone devices by simply swiping a finger in front of a camera. The authors also discuss visually
recognising Japanese katakana characters written in the air. To follow hand movements, they
use an LED pen and a camera. They convert the pen signal into motion codes; typically up to
100 data items are needed to capture the result at typing speed, in which 46 Japanese characters
are distinguished. With one camera, they obtain 92.9% character recognition accuracy, and with
multiple cameras, directional accuracy within about 9°.

A. Tracking of Brush Tip on Real Canvas:


Authors: Joolekha Bibi Joolee, Ahsan Raza, Muhammad Abdullah, Seokhee Jeon.
Working: The proposed deep ensemble network is trained offline using data captured through
an external tracker (OptiTrack V120) and a contour-based approach. During actual drawing,
the trained network estimates the brush tip position by taking the brush handle pose as its
input, allowing real material to be used with a real brush. During the testing phase, the
framework works continuously: it tracks the brush handle pose (position and orientation), and
the proposed deep ensemble network takes this pose as input and predicts the brush tip
position in real time. For data collection, various strokes were performed for 60 seconds on
the surface within the tracking area.

B. Augmented Airbrush for Computer Aided Painting (CAP):
Author-Roy Shilkrot, Pattie Maes, Joseph A. Paradiso, and Amit Zoran
Working: To operate the augmented airbrush, the user stands before the canvas, free to
work on any part of the composition, use any style, and consult the computer screen if
desired. The reference and canvas are aligned with a calibrated anchor point that corresponds
to the virtual origin. The user can move the device using a guided strategy, a more intuitive
one, or a blend of both. The computer intervenes only when the virtual tracking corresponds
to a paint projection that violates the virtual reference. In such a case, the computer keeps the
user from using the full capacity of the airbrush trigger and applying paint where it is not
needed. The device is based on a Grex Genesis.XT, a pistol-style airbrush relieved of its rear
paint-volume adjustment handle. Since this is a dual-action airbrush, operating the trigger
opens both the compressed-air valve and the paint fluid valve, which is made of a needle and
a nozzle, producing a stream of air mixed with paint particles. The authors developed a
custom-designed augmentation mechanism to permit digital control of the paint mixture. A
Grex air compressor supplies compressed air at 20 PSI, and a Polhemus Fastrak magnetic
motion tracking system positions the device in 6-DOF.

C. 3D Drawing with Augmented Reality:

Author-Sharanya M, Sucheta Kolur , Sowmyashree B V, Sharadhi L, Bhanushree K J


Working: A mobile application that runs on Android devices and lets the user draw on
the world, treating it as a canvas; it implements real-time sync of the drawing across all
instances of the application running in the same network room and provides a tool for creative
content producers to quickly sketch their ideas in 3D space. The freehand procedure permits
the user to draw continuously as directed by hand movements. To begin a line, the user
performs the air-tap gesture. The line is drawn continuously at the index cursor position until
the user ends the line by performing a second air-tap.

CHAPTER-3
PROBLEM DEFINITION

The existing system only works with fingers; there are no highlighters, markers, or the like.
Identifying and distinguishing something like a finger from an RGB image without a depth
sensor is a great challenge. Another problem is the lack of pen-up and pen-down motion. The
system uses a single RGB camera viewing from above; since depth sensing is impossible, the up
and down positions of the pen cannot be traced. So the entire finger path is drawn, and the
resulting image would be abstract and unrecognisable to the model. Using real-time hand
gestures to switch the process from one state to another requires a lot of care in code. In
addition, the user should know many movements to control the system adequately. The project
focuses on solving some of the most important societal problems. First of all, hearing-impaired
people face many problems in everyday life. While hearing and listening are taken for granted,
people with this disability communicate using sign language; most of the world cannot
understand their feelings and emotions without a translator in between. Second, overuse of
smartphones causes accidents, stress, distraction, and other illnesses that we are still
discovering. Although their portability and ease of use make them very popular, their drawbacks
include life-threatening events. Paper wastage is not uncommon either: a lot of paper is wasted
on scribbling, writing, drawing, etc. Producing one A4 sheet requires about 5 litres of water,
93% of the raw material comes from trees, 50% of commercial waste is paper, 25% of landfill is
paper, and the list goes on. Paper wastage harms the environment through the use of water and
trees and produces tons of waste. Writing in the air can solve these problems quickly. It can
serve as a communication tool for the deaf; the air-written text can be displayed in AR or
translated into speech. One can write in the air quickly and continue working without much
interruption. Also, writing in the air does not require paper; everything is stored electronically.
This project focuses on solving some major societal problems –
1. Hearing impairment: Although we take hearing and listening for granted, hearing-impaired
people communicate using sign languages. Most of the world can't understand their feelings
and emotions without a translator in between.
2. Overuse of smartphones: Smartphones cause accidents, depression, distractions, and other
illnesses that we humans are still discovering. Although their portability and ease of use are
profoundly admired, the negatives include life-threatening events.
3. Paper wastage is not scarce news: We waste a lot of paper in scribbling, writing, drawing,
etc. Some basic facts: about 5 liters of water on average are required to make one A4-size sheet
of paper, 93% of writing paper comes from trees, 50% of business waste is paper, 25% of
landfill is paper, and the list goes on. Paper wastage harms the environment by using water and
trees and creates tons of garbage.
4. Enable handwriting recognition: If the air canvas is intended to be used for writing, you
may want to incorporate handwriting recognition functionality to convert the user's
handwriting into digital text. This could involve training a machine learning model on a dataset
of handwritten characters.
5. Allow for multi-user collaboration: You may want to design the air canvas to support
multiple users collaborating on the same canvas simultaneously. This could involve tracking
the movements of multiple styluses or fingers, and ensuring that the drawings or text from each
user are displayed correctly on the screen.
6. Add support for saving and loading canvases: It would be useful to allow the user to save
their work and load it back up at a later time. This could involve storing the digital drawings
or text in a file, and allowing the user to choose which file to load or save (a minimal sketch of
this idea follows this list).
7. Implement gesture recognition: In addition to tracking the position of the stylus or fingers,
you may also want to recognize specific gestures made by the user, such as a swipe or a tap.
This could allow the user to perform actions such as erasing or changing colors without using
additional input devices.
8. Display the drawings or text on a computer screen: You would need to develop a user
interface that displays the digital drawings or text in real-time as the user moves the stylus or
fingers.
9. Allow the user to customize the drawing or writing experience: You may also want to
allow the user to adjust the color, thickness, or font of the digital drawings or text.
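As a minimal illustration of point 6 above, the canvas could be persisted with OpenCV's image I/O functions, assuming the canvas is held as a NumPy image array; the file name is only an example.

# Save the current canvas to disk and load it back later (illustrative only).
import cv2
import numpy as np

canvas = np.full((480, 640, 3), 255, dtype=np.uint8)   # a blank white canvas

cv2.imwrite("my_drawing.png", canvas)                  # save the user's drawing

loaded = cv2.imread("my_drawing.png")                  # load it back in a later session
if loaded is not None:
    canvas = loaded                                    # continue drawing on top of it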

Air Writing can quickly solve these issues. It will act as a communication tool for people
with hearing impairment. Their air-written text can be presented using AR or converted to
speech. One can quickly write in the air and continue with your work without much distraction.
Additionally, writing in the air does not require paper. Everything is stored electronically.

CHAPTER-4
SOFTWARE AND HARDWARE REQUIREMENTS

1. Hardware Requirements
• Dual-core CPU
• Minimum 1 GB of RAM
• Windows 7 or greater
• Web camera
2. Software Requirements
• Python
• NumPy module
• OpenCV module

4.1 HARDWARE REQUIREMENT

➢ Dual Core CPU: A dual core CPU is a type of central processing unit (CPU) that has
two independent cores, or processing units, on the same chip. It is capable of processing
two streams of instructions simultaneously, which can increase the performance of
certain types of tasks.
➢ Minimum 1 GB of RAM: RAM (random access memory) is a type of computer memory
that is used to store data that is actively being used or processed by the system. It is
volatile memory, meaning it is wiped clean when the power is turned off.
The amount of RAM that a computer has can affect its performance. In general, having more
RAM can allow a computer to run more programs concurrently and improve the speed at which
it can complete tasks.
A minimum of 1 GB (gigabyte) of RAM is often recommended for basic tasks such as web
browsing, word processing, and email. However, the actual amount of RAM that is required
for a particular task or application can vary depending on the specific needs of the software and
the operating system. For example, more resource-intensive tasks such as video editing or
gaming may require more RAM to run smoothly.
It's worth noting that the minimum requirements for RAM can vary depending on the
specific operating system and software being used. For example, the minimum RAM
requirement for running the latest version of Microsoft Windows is 2 GB, while the minimum
requirement for running macOS is 4 GB. It is always a good idea to check the system
requirements for the specific software or operating system that you are using to ensure that
your computer has enough RAM to run it effectively.
➢ Windows 7 or greater: Windows 7 is a personal computer operating system that was
produced by Microsoft and released as part of the Windows NT family of operating
systems. It was released to manufacturing on July 22, 2009, and became generally
available on October 22, 2009. Windows 7 was succeeded by Windows 8, which was
released in October 2012.
Some key features of Windows 7 include:
• A redesigned taskbar and start menu
• Improved support for multi-touch input
• Enhanced support for hardware acceleration
• Improved performance and boot time
• Improved security features, including BitLocker encryption and AppLocker
• The ability to create a home network and share files and printers with other computers
• Support for virtual hard disks
• Improved support for different languages and input methods
To use Windows 7, your computer must meet certain hardware and software requirements.
These requirements include:
• Processor: 1 GHz or faster 32-bit (x86) or 64-bit (x64) processor
• Memory: 1 GB RAM for 32-bit or 2 GB RAM for 64-bit
• Hard drive: 16 GB available hard disk space for 32-bit or 20 GB for 64-bit
• Graphics card: DirectX 9 graphics processor with WDDM 1.0 or higher driver
It's worth noting that Microsoft ended mainstream support for Windows 7 on January 13,
2015, and ended extended support on January 14, 2020. This means that Microsoft no longer
provides security updates or technical support for the operating system. If you are still using
Windows 7, it is recommended to upgrade to a newer version of Windows to receive ongoing
security updates and support.
➢ Web Camera: A webcam, also known as a web camera, is a video camera that is used
to capture images and video for transmission over the internet. Webcams are typically
small and portable, making them convenient for use with computers and other devices.

Webcams can be used for a variety of purposes, such as video conferencing, live
streaming, and recording video for social media or other online platforms. They are often
integrated into laptops and desktop computers, but can also be purchased as standalone devices
that can be connected to a computer or other device via USB.
Most webcams have a built-in microphone, which allows them to capture audio as well
as video. Some webcams also have additional features such as a built-in LED light or the ability
to pan, tilt, and zoom to capture a wider field of view.
Webcams can vary in terms of their video quality, with higher-end models typically
offering higher resolution and a more detailed image. They can also vary in terms of their frame
rate, which is the number of frames captured per second. A higher frame rate can result in a
smoother, more realistic video, but may also require more bandwidth and processing power.
It's worth noting that webcams can be vulnerable to security risks, such as the potential
for unauthorized access or surveillance. If you are concerned about the security of your
webcam, you may want to consider using a physical cover to block the camera when it is not
in use, or disabling the webcam in your device's settings.

4.2 SOFTWARE REQUIREMENT


➢ PYTHON:
Python is a general-purpose interpreted, interactive, object-oriented, and high-level
programming language. It was created by Guido van Rossum during 1985-1990. Like Perl,
Python source code is also available under the GNU General Public License (GPL).
Python is a high-level, interpreted, interactive and object-oriented scripting language.
Python is designed to be highly readable. It uses English keywords frequently whereas other
languages use punctuation, and it has fewer syntactical constructions than other languages.
Python is widely used across domains, including web development, and is valuable for
students and working professionals alike. Some of the key advantages of learning Python are
listed below:
• Easy-to-learn − Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read − Python code is more clearly defined and visible to the eyes.
• Easy-to-maintain − Python's source code is fairly easy-to-maintain.
• A broad standard library − Python's bulk of the library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows MFC,
Macintosh, and the X Window system of Unix.
• Scalable − Python provides a better structure and support for large programs than
shell scripting.

➢ NUMPY:
NumPy is the fundamental package for scientific computing in Python. It is a Python library
that provides a multidimensional array object, various derived objects (such as masked arrays
and matrices), and an assortment of routines for fast operations on arrays, including
mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier
transforms, basic linear algebra, basic statistical operations, random simulation and much more.
At the core of the NumPy package, is the ndarray object. This encapsulates n-dimensional
arrays of homogeneous data types, with many operations being performed in compiled code
for performance. There are several important differences between NumPy arrays and the
standard Python sequences:
• NumPy arrays have a fixed size at creation, unlike Python lists (which can grow
dynamically). Changing the size of an ndarray will create a new array and delete the original.
• The elements in a NumPy array are all required to be of the same data type, and thus
will be the same size in memory. The exception: one can have arrays of (Python, including
NumPy) objects, thereby allowing for arrays of different sized elements.

• NumPy arrays facilitate advanced mathematical and other types of operations on large
numbers of data. Typically, such operations are executed more efficiently and with less code
than is possible using Python’s built-in sequences.
• A growing plethora of scientific and mathematical Python-based packages are using
NumPy arrays; though these typically support Python-sequence input, they convert such input
to NumPy arrays prior to processing, and they often output NumPy arrays.
In other words, in order to efficiently use much (perhaps even most) of today’s
scientific/mathematical Python-based software, just knowing how to use Python’s built-in
sequence types is insufficient - one also needs to know how to use NumPy arrays. The points
about sequence size and speed are particularly important in scientific computing. As a simple
example, consider the case of multiplying each element in a 1-D sequence with the
corresponding element in another sequence of the same length.
• Why is NumPy Fast?
Vectorization describes the absence of any explicit looping, indexing, etc., in the code -
these things are taking place, of course, just “behind the scenes” in optimized, pre-compiled C
code. Vectorized code has many advantages, among which are:
• vectorized code is more concise and easier to read; fewer lines of code generally means
fewer bugs
• the code more closely resembles standard mathematical notation (making it easier,
typically, to correctly code mathematical constructs)
• vectorization results in more “Pythonic” code; without vectorization, our code would
be littered with inefficient and difficult-to-read for loops.
Broadcasting is the term used to describe the implicit element-by-element behavior
of operations; generally speaking, in NumPy all operations, not just arithmetic operations, but
logical, bit-wise, functional, etc., behave in this implicit element-by-element fashion, i.e., they
broadcast. Moreover, in the example below, a and b could be multidimensional arrays of the
same shape, or a scalar and an array, or even two arrays with different shapes, provided that
the smaller array is “expandable” to the shape of the larger in such a way that the resulting
broadcast is unambiguous. For detailed “rules” of broadcasting, see the broadcasting basics.
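The example below illustrates this element-wise multiplication, first with an explicit Python loop and then in vectorised form, together with a simple case of broadcasting.

# Element-wise multiplication: explicit Python loop vs. vectorised NumPy code.
import numpy as np

a_list = [1.0, 2.0, 3.0, 4.0]
b_list = [10.0, 20.0, 30.0, 40.0]
c_list = [x * y for x, y in zip(a_list, b_list)]   # explicit Python-level loop

a = np.array(a_list)
b = np.array(b_list)
c = a * b      # vectorised: the loop runs in optimized, pre-compiled C code
d = a * 2.5    # broadcasting: the scalar is "stretched" across the whole array

print(c_list)  # [10.0, 40.0, 90.0, 160.0]
print(c)       # [ 10.  40.  90. 160.]
print(d)       # [ 2.5  5.   7.5 10. ]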

• Who Else Uses NumPy?
NumPy fully supports an object-oriented approach, starting, once again, with ndarray. For
example, ndarray is a class, possessing numerous methods and attributes. Many of its
methods are mirrored by functions in the outer-most NumPy namespace, allowing the
programmer to code in whichever paradigm they prefer. This flexibility has allowed the
NumPy array dialect and NumPy ndarray class to become the de-facto language of multi-
dimensional data interchange used in Python.

➢ OPENCV:
OpenCV is a huge open-source library for computer vision, machine learning and image
processing. OpenCV supports a wide variety of programming languages like Python, C++,
Java, etc. It can process images and videos to identify objects, faces, or even the handwriting
of a human. When it is integrated with various libraries, such as NumPy, which is a highly
optimized library for numerical operations, the number of tools in your arsenal increases,
i.e., whatever operations one can do in NumPy can be combined with OpenCV. This
OpenCV overview covers image processing from basics to advanced topics, such as
operations on images and videos, using a large set of OpenCV programs and projects.

In OpenCV, the CV is an abbreviation of computer vision, which is defined as a
field of study that helps computers understand the content of digital images such
as photographs and videos. The purpose of computer vision is to understand the content of
images. It extracts a description from the pictures, which may be an object, a text description,
a three-dimensional model, and so on. For example, cars can be facilitated with computer
vision, which will be able to identify different objects around the road, such as traffic lights,
pedestrians, traffic signs, and so on, and act accordingly.

Computer vision allows the computer to perform the same kinds of tasks as humans with the
same efficiency. There are two main tasks, which are defined below:

• Object Classification - In the object classification, we train a model on a dataset of


particular objects, and the model classifies new objects as belonging to one or more of
your training categories.

• Object Identification - In object identification, our model will identify a particular instance
of an object - for example, parsing two faces in an image and tagging one as Virat Kohli and
the other as Rohit Sharma.

Fig.4.1: Object Identification

History:

OpenCV stands for Open Source Computer Vision Library, and it is widely used for image
recognition and identification. It was officially launched in 1999 by Intel. It was written in
C/C++ in the early stages, but now it is commonly used from Python for computer vision as
well.

The first alpha version of OpenCV was released for common use at the IEEE Conference on
Computer Vision and Pattern Recognition in 2000, and five betas were released between 2001
and 2005. The first 1.0 version was released in 2006. The second version of OpenCV was
released in October 2009 with significant changes; it contains a major change to the C++
interface, aiming at easier, more type-safe patterns and better implementations. Currently, the
development is done by an independent Russian team, which releases a new version every six
months.

• How does computer recognize the image?

Human eyes provide lots of information based on what they see. Machines, by contrast, are
made to "see" by converting vision into numbers and storing them in memory. Here the
question arises: how does a computer convert images into numbers? The answer is that pixel
values are used to represent images as numbers. A pixel is the smallest unit of a digital image
or graphics that can be displayed and represented on a digital display device.

The picture intensity at a particular location is represented by a number. In the image below,
the pixel values for a grayscale image consist of only one value: the intensity of the black
colour at that location. There are two common ways to represent images:

1. Gray scale:

Grayscale images contain only shades of black and white. In the contrast measurement of
intensity, black is treated as the weakest intensity and white as the strongest intensity. When
we use a grayscale image, the computer assigns each pixel a value based on its level of
darkness.

Fig.4.2: Representation of Gray Scale image

2. RGB:

An RGB image is a combination of red, green and blue colour values which together make a
new colour. The computer retrieves the value from each pixel and puts the results in an array
to be interpreted.

Fig.4.3: Representation of RGB pixel values
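A small sketch of how these pixel values can be inspected with OpenCV is shown below; the image path is only a placeholder.

# A colour image loads as a NumPy array of BGR triplets; a grayscale image
# holds a single intensity value per pixel.
import cv2

img = cv2.imread("example.jpg")                    # shape: (height, width, 3), BGR order
if img is not None:
    b, g, r = img[50, 100]                         # pixel at row 50, column 100
    print("BGR at (50, 100):", b, g, r)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # shape: (height, width)
    print("Intensity at (50, 100):", gray[50, 100])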

Why OpenCV is used for Computer Vision?

• OpenCV is available free of cost.

• Since the OpenCV library is written in C/C++, it is quite fast, and it can now be used with
Python.
• It requires less RAM, roughly 60-70 MB.
• OpenCV is portable and can run on any device that can run C and Python.

Fig.4.4 : Picture representation of Air Canvas

CHAPTER-5
DESIGN

In a general sense, design refers to the process of creating or planning something, often
with the intention of improving upon or solving a problem. This can involve creating a new
product or system, developing a plan or strategy, or even creating a work of art or architecture.
Design often involves a combination of creative problem-solving and technical skills, as well
as an understanding of the needs and preferences of the intended audience or users. It is often
an iterative process, with designers testing and refining their ideas through prototypes and other
forms of experimentation.

5.1 ARCHITECTURE

The flow chart in Fig. 5.1 summarises the pipeline: the camera captures video; frames are
read and converted to the HSV colour space; morphological operations are performed and a
mask is created to detect the object; contours are detected; the x, y coordinates of the object
are calculated and stored in deques; and finally the output is displayed.

Fig.5.1: Flow chart of Air Canvas

The architecture of an air canvas system using OpenCV might involve the following
components:
A camera: This could be a webcam or other video capture device, which is used to capture
images or video of the user's hand or other input device as they move in front of the camera.

Image processing software: This could be OpenCV or another image processing library,
which is used to analyze the images or video captured by the camera and extract relevant data.
This might include detecting the position and movement of the user's hand, as well as any
gestures or other input the user is making.

A display: This could be a computer monitor or other display device, which is used to
display the output of the air canvas system. This might include visualizations of the user's hand
or input device, as well as any other graphics or interactive elements that are part of the air
canvas experience.

An input device: This could be a stylus, finger, or other device that the user uses to interact
with the air canvas system. The input device might be equipped with sensors or other hardware
that can be used to provide additional data to the system, such as pressure or orientation.

A computer or other processing platform: This is the hardware that runs the image
processing software, as well as any other software or applications that are part of the air canvas
system. The computer might also be connected to other hardware or devices, such as speakers
or motors, that can be used to enhance the air canvas experience.

Overall, the architecture of an air canvas system using OpenCV would involve using the
camera and image processing software to capture and analyze the user's input, and then using
a display and other hardware to provide an interactive and immersive experience for the user.

5.2 MODULE DESCRIPTION

The collections module in Python provides different types of containers. A container is an
object that is used to store different objects and provides a way to access the contained objects
and iterate over them. Some of the built-in containers are tuple, list, dictionary, etc.; this project
mainly uses the deque container from the collections module. The main groups of modules in
the application are described below.

• Image processing modules: These modules may be used to perform operations on images
such as resizing, cropping, color space conversion, thresholding, and edge detection.

• Video processing modules: These modules may be used to process video streams, such
as to extract frames, stabilize the video, or track objects.

• Machine learning modules: These modules may be used to train and use machine
learning models for tasks such as object detection or classification.
• User interface modules: These modules may be used to create a user interface for the
application, such as to display images or video, or to allow the user to input commands
or select options.
• Air canvas-specific modules: There may also be specific modules that are designed
specifically for the "air canvas" application, such as modules for tracking hand
movements or detecting gestures.
• Gesture recognition module: This module is responsible for detecting and interpreting
hand gestures or movements made by the user. It may use techniques such as motion
tracking, skeleton tracking, or machine learning algorithms to identify specific gestures.
• Drawing module: This module is responsible for rendering the drawings or paintings
created by the user on the screen. It may use techniques such as image manipulation or
graphics rendering to create the final image.
• User interface module: This module is responsible for creating the interface that the
user interacts with, such as the canvas area, buttons for selecting colors or brush sizes,
and any other controls or options.
• Input module: This module is responsible for capturing input from the user, such as
hand gestures or movements, and passing it on to the appropriate modules for
processing. This may involve using computer vision techniques or sensors to track the
user's hand movements.
• Output module: This module is responsible for displaying the final result to the user,
such as the drawing or painting created using the user's hand gestures. It may use
techniques such as image rendering or video output to display the result on the screen.
• Communication module: This module is responsible for communicating with any
external devices or systems, such as a server or database. It may be used to save or load
drawings, or to share them with other users.

5.3 SYSTEM WORKFLOW
A workflow using OpenCV might involve the following steps (a small code sketch of these steps is given after the list):
1. Importing and setting up the OpenCV library in your project.
2. Loading an image or video from a file or camera into the program.
3. Preprocessing the image or video to improve the accuracy of any subsequent analysis.
This might include steps such as resizing, cropping, or converting the image to a
different color space.
4. Applying image processing or computer vision techniques to the image or video. This
might include tasks such as object detection, face recognition, or feature extraction.
5. Analyzing the results of the image processing or computer vision algorithms to extract
information or make decisions based on the data.
6. Visualizing the results by displaying the processed image or video to the user or saving
the results to a file.
7. Optionally, repeating the process on multiple images or videos in a loop to perform
automated analysis.
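The fragment below is a minimal sketch of steps 2-6 of this workflow; the file name input.jpg and the Canny thresholds are assumed example values, not part of the project code.

import cv2

# Step 2: load an image from disk (assumed file name)
image = cv2.imread("input.jpg")
if image is None:
    raise SystemExit("input.jpg not found")

# Step 3: preprocess - resize and convert to grayscale
image = cv2.resize(image, (640, 480))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Step 4: apply a computer vision technique - Canny edge detection
edges = cv2.Canny(gray, 100, 200)

# Step 5: analyse the result - count how many edge pixels were found
print("Edge pixels:", cv2.countNonZero(edges))

# Step 6: visualise the result
cv2.imshow("Edges", edges)
cv2.waitKey(0)
cv2.destroyAllWindows()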

Working

Here colour detection and tracking are used in order to achieve the objective. The colour
marker is detected and a mask is produced. Two further morphological operations are then
applied to the mask: erosion and dilation. Erosion reduces the impurities present in the mask,
and dilation restores the main mask that was eroded. The air canvas detects blue colour in the
camera frame, and whichever object is detected becomes the pen/stylus used to draw.
(Caution: there should be no other blue-coloured object in the camera frame background for
the air canvas to work smoothly.) Any pen or object of blue colour can therefore be used to
act as a brush on the canvas.
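A minimal sketch of this colour-detection step is shown below. The HSV bounds for blue are assumed example values, and the file name frame.jpg is a placeholder; the full project code in Chapter 6 reads frames from the webcam and derives the bounds from trackbars instead.

import cv2
import numpy as np

frame = cv2.imread("frame.jpg")            # assumed sample frame
if frame is None:
    raise SystemExit("frame.jpg not found")
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Assumed HSV range for a blue marker
lower_blue = np.array([100, 60, 60])
upper_blue = np.array([140, 255, 255])

mask = cv2.inRange(hsv, lower_blue, upper_blue)

# Erosion removes small impurities, dilation restores the eroded marker area
kernel = np.ones((5, 5), np.uint8)
mask = cv2.erode(mask, kernel, iterations=1)
mask = cv2.dilate(mask, kernel, iterations=1)

cv2.imshow("Blue marker mask", mask)
cv2.waitKey(0)
cv2.destroyAllWindows()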

Fig.5.2: Working of Air Canvas

5.4 UML DIAGRAMS

Fig.5.3: Use Case Diagram

5.5 SEQUENCE DIAGRAM

Fig.5.4: Sequence Diagram

5.6 INTERACTION AMONG ALL MODULES

1. The input module captures the user's hand gestures or movements and passes them to
the gesture recognition module.
2. The gesture recognition module interprets the input and determines the appropriate
action to take, such as moving the brush or changing the brush size. It then sends this
information to the drawing module.
3. The drawing module uses the input from the gesture recognition module to update the
canvas image, adding lines or strokes as needed.
4. The output module receives the updated canvas image from the drawing module and
displays it to the user on the screen.
5. The user interface module receives input from the user, such as button clicks or
selections, and passes this input to the appropriate module for processing. For example,
if the user selects a different brush color, the user interface module would pass this
information to the drawing module, which would update the brush color accordingly.
6. The communication module may also be involved in the process, for example if the
user wants to save their drawing or share it with others. In this case, the communication
module would handle the transfer of data to and from external devices or systems.

CHAPTER-6
IMPLEMENTATION

6.1 SAMPLE CODES

import cv2
import numpy as np

# Create a blank image with a black background
image = np.zeros((512, 512, 3), np.uint8)

# Set up the mouse callback function to draw on the image
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        cv2.circle(image, (x, y), 20, (255, 0, 0), -1)

# Create a window and set it up to capture mouse events
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

# Loop until the user exits the window
while(1):
    cv2.imshow('image', image)
    if cv2.waitKey(20) & 0xFF == 27:
        break

# Clean up
cv2.destroyAllWindows()

This code creates a 512x512 black image and sets up a mouse callback function to draw
blue circles on the image when the left mouse button is clicked. It then displays the image in
a window and waits for the user to exit the window.

Output:

Fig.6.1: Output of drawing blue circles on an image

• To display the video stream from a webcam in a window using Open CV-Python:

import cv2

# Open the default camera
cap = cv2.VideoCapture(0)

# Check if the camera is opened
if not cap.isOpened():
    print("Cannot open the camera")
    exit()

# Loop until the user exits the window
while True:
    # Read the frame from the camera
    ret, frame = cap.read()
    # Check if the frame was successfully captured
    if not ret:
        print("Error capturing the frame")
        break
    # Show the frame in a window
    cv2.imshow('Webcam', frame)
    # Wait for the user to press a key
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and destroy all windows
cap.release()
cv2.destroyAllWindows()

This code opens the default camera (usually the built-in webcam on a laptop) and displays the
video stream in a window. The window can be closed by pressing the 'q' key.

Output:

Fig.6.2: Output of the webcam video stream

• To create a virtual paint program that allows the user to change the color of the brush
and draw on an image using the mouse:

import cv2
import numpy as np

# Create a blank image with a white background
image = np.ones((500, 500, 3), np.uint8) * 255

# Set up the mouse callback function to draw on the image
drawing = False
color = (0, 0, 0)

def draw_circle(event, x, y, flags, param):
    global drawing, color
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing:
            cv2.circle(image, (x, y), 10, color, -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False

# Create a window and set it up to capture mouse events
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

# Loop until the user exits the window
while True:
    # Show the image in the window
    cv2.imshow('image', image)
    # Check if the user pressed a key
    k = cv2.waitKey(1) & 0xFF
    # Check if the user pressed the 'r' key to change the color to red
    if k == ord('r'):
        color = (0, 0, 255)
    # Check if the user pressed the 'g' key to change the color to green
    elif k == ord('g'):
        color = (0, 255, 0)
    # Check if the user pressed the 'b' key to change the color to blue
    elif k == ord('b'):
        color = (255, 0, 0)
    # Check if the user pressed the 'q' key to exit the window
    elif k == ord('q'):
        break

# Destroy all windows
cv2.destroyAllWindows()

This code creates a blank white image and sets up a mouse callback function to draw circles
on the image when the left mouse button is clicked and dragged. It also captures key press
events and allows the user to change the color of the brush by pressing the 'r', 'g', or 'b' keys.
The window can be closed by pressing the 'q' key.

Output:

Fig.6.3: Output to draw on an image using the mouse

• To capture the video stream from a webcam, convert it to the HSV color space, and
display the individual channels of the HSV image in separate windows:

import cv2
import numpy as np

# Open the default camera
cap = cv2.VideoCapture(0)

# Check if the camera is opened
if not cap.isOpened():
    print("Cannot open the camera")
    exit()

# Loop until the user exits the window
while True:
    # Read the frame from the camera
    ret, frame = cap.read()
    # Check if the frame was successfully captured
    if not ret:
        print("Error capturing the frame")
        break
    # Convert the frame to the HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Split the frame into the H, S, and V channels
    h, s, v = cv2.split(hsv)
    # Display the H channel
    cv2.imshow('Hue', h)
    # Display the S channel
    cv2.imshow('Saturation', s)
    # Display the V channel
    cv2.imshow('Value', v)
    # Check if the user pressed the 'q' key to exit the window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and destroy all windows
cap.release()
cv2.destroyAllWindows()

Output:

Fig.6.4: Output to convert video into individual channels of the HSV image in separate
windows

6.2 ALGORITHM

STEP 1: Start reading the frames and convert the captured frames to HSV color space (Easy
for color detection).
STEP 2: Prepare the canvas frame and put the respective ink buttons on it.
STEP 3: Adjust the track bar values for finding the mask of the colored marker.
STEP 4: Preprocess the mask with morphological operations (Eroding and dilation).
STEP 5: Detect the contours, find the center coordinates of largest contour and keep storing
them in the array for successive frames (Arrays for drawing points on canvas).
STEP 6: Finally draw the points stored in an array on the frames and canvas
1. Writing Mode - In this state, the system will trace the fingertip coordinates and
store them.
2. Colour Mode - The user can change the colour of the text among the various
available colours.
• Backspace: If the user goes wrong, we need a gesture to perform a quick backspace.
• Motion tracking: This algorithm is responsible for tracking the user's hand
movements in real-time. It may use techniques such as frame differencing, optical
flow, or feature tracking to identify the location and movement of the hand.
• Skeleton tracking: This algorithm is responsible for detecting the bones and joints in
the user's hand and arm, and creating a skeleton model of them. This allows the
application to more accurately interpret the user's hand gestures.
• Gesture recognition: This algorithm is responsible for interpreting the user's hand
gestures and determining the appropriate action to take, such as moving the brush or
changing the brush size. It may use techniques such as hidden Markov models, decision
trees, or machine learning algorithms to classify the gestures.
• Drawing: This algorithm is responsible for rendering the lines or strokes on the screen
as the user moves their hand. It may use techniques such as image manipulation or
graphics rendering to create the final image.
• Machine learning: Depending on the complexity of the gestures that the application
needs to recognize, it may also use machine learning algorithms to train a model to
classify the gestures. This may involve using a dataset of labeled gestures to train the
model, and then using the trained model to classify new gestures in real-time.

6.3 CODE

import numpy as np
import cv2
from collections import deque

• numpy: A library for scientific computing with Python. It provides functions for working
with arrays, matrices, and numerical operations.
• cv2: The OpenCV (Open Source Computer Vision) library. It provides a wide range of
functions and tools for computer vision tasks, including image and video processing,
object detection, and machine learning.

• deque: A double-ended queue implementation from the collections module. It allows you
to add and remove elements from both ends of the queue efficiently.
• The import numpy as np line imports the numpy library and assigns it the alias np, which
is a common convention. This allows you to refer to the library using the shorter np name
instead of typing out numpy every time.
• The import cv2 line imports the cv2 module, which provides access to the functions and
tools in the OpenCV library.
• The from collections import deque line imports the deque class from the collections
module. This allows you to create deque objects, which are double-ended queues that
you can add and remove elements from efficiently.

# Default callback function for the trackbars
def setValues(x):
    print("")

• The setValues function you have defined takes a single parameter x, but it does not do
anything with it. It simply prints an empty string.
• It is likely that this function is intended to be used as a callback function for a trackbar in
the OpenCV library. A trackbar is a graphical widget that allows the user to set a value by
sliding a knob along a range of values. When the trackbar is moved, the callback function
is called with the new value of the trackbar.

# Creating the trackbars needed for adjusting the marker colour


cv2.namedWindow("Color detectors")
cv2.createTrackbar("Upper Hue", "Color detectors", 153, 180,setValues)
cv2.createTrackbar("Upper Saturation", "Color detectors", 255, 255,setValues)
cv2.createTrackbar("Upper Value", "Color detectors", 255, 255,setValues)
cv2.createTrackbar("Lower Hue", "Color detectors", 64, 180,setValues)
cv2.createTrackbar("Lower Saturation", "Color detectors", 72, 255,setValues)
cv2.createTrackbar("Lower Value", "Color detectors", 49, 255,setValues)

In this code snippet, you are creating a window called "Color detectors" and adding six
trackbars to it. The trackbars are used to adjust the upper and lower bounds of the hue,
saturation, and value channels of a color space.

The cv2.createTrackbar() function, for example cv2.createTrackbar("Lower Value", "Color
detectors", 49, 255, setValues), is used to create a trackbar. It takes the following arguments:
• The name of the trackbar.
• The name of the window in which the trackbar will be displayed.
• The initial value of the trackbar.
• The maximum value of the trackbar.
• The callback function to be called when the trackbar is moved.
• In this case, the trackbars are given names like "Upper Hue" and "Lower Saturation",
and they have a range of values from 0 to 255. The setValues function is specified as
the callback function for each trackbar.
• When the user moves any of these trackbars, the setValues function will be called with
the new value of the trackbar as an argument. You can then use this value to adjust the
color detection parameters in your code.

# Giving different arrays to handle colour points of different colour


bpoints = [deque(maxlen=1024)]
gpoints = [deque(maxlen=1024)]
rpoints = [deque(maxlen=1024)]
ypoints = [deque(maxlen=1024)]

• This code appears to be defining four different arrays, each of which is a deque (double-
ended queue) with a maximum length of 1024. The deques are named "bpoints",
"gpoints", "rpoints", and "ypoints", and they are each associated with a different color.
• A deque is a data structure that allows you to add and remove elements from both the
front and the back of the queue. It is similar to a list, but it has more efficient insert and
delete operations for elements at the beginning and end of the queue. The "maxlen"
parameter specifies the maximum number of elements that the deque can hold. If the
deque reaches its maximum length and a new element is added, the oldest element will
be automatically removed to make room for the new one.
• In this code, the four deques are used to store the points drawn in the four different
colours. Each point is the (x, y) coordinate of the detected fingertip/marker in a frame,
and the points are later joined to draw lines on the canvas.
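As a small, self-contained illustration of the deque behaviour described above (the coordinate values here are made up for the example), note how the oldest element is discarded automatically once maxlen is reached:

from collections import deque

points = deque(maxlen=3)        # tiny maxlen so the effect is visible
points.appendleft((10, 10))
points.appendleft((20, 15))
points.appendleft((30, 20))
print(points)                   # deque([(30, 20), (20, 15), (10, 10)], maxlen=3)

points.appendleft((40, 25))     # oldest point (10, 10) is dropped automatically
print(points)                   # deque([(40, 25), (30, 20), (20, 15)], maxlen=3)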

# These indexes will be used to mark the points in particular arrays of specific colour
blue_index = 0
green_index = 0
red_index = 0
yellow_index = 0

• This code appears to be defining four variables: "blue_index", "green_index",


"red_index", and "yellow_index". These variables are all integers with initial values of
0.
• These variables are used to keep track of the current index within the "bpoints",
"gpoints", "rpoints", and "ypoints" arrays that were defined in the previous snippet.

#The kernel to be used for dilation purpose

kernel = np.ones((5,5),np.uint8)

colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 255, 255)]

colorIndex = 0

This code initializes three variables: kernel, colors, and colorIndex.

• ‘kernel’ is a 2D NumPy array of ones, with a shape of (5, 5). The data type of the
elements in the array is ‘np.uint8’, which stands for "unsigned 8-bit integer". This type
of data can represent integer values in the range 0 to 255.
• ‘colors’ is a list of tuples, each tuple representing a colour value in OpenCV's BGR
order. The list contains four elements, one for each of the colours blue, green, red, and
yellow (yellow being a combination of green and red).
• ‘colorIndex’ is a variable that is initialized to 0. It keeps track of which of the four
colours is currently selected for drawing.

Here is a breakdown of each line of code:

1. ‘kernel = np.ones((5,5),np.uint8)’ : This line creates a 2D NumPy array of ones, with


a shape of (5, 5) and a data type of np.uint8. The array is stored in the variable kernel.

2. ‘colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 255, 255)]’ : This line creates a list of
tuples, each tuple representing a colour value in BGR order. The list contains four elements,
one for each of the colours blue, green, red, and yellow (yellow being a combination of green
and red). The list is stored in the variable colors.
3. ‘colorIndex = 0’ : This line initializes the variable colorIndex to 0.

# Here is code for Canvas setup


paintWindow = np.zeros((471,636,3)) + 255
paintWindow = cv2.rectangle(paintWindow, (40,1), (140,65), (0,0,0), 2)
paintWindow = cv2.rectangle(paintWindow, (160,1), (255,65), colors[0], -1)
paintWindow = cv2.rectangle(paintWindow, (275,1), (370,65), colors[1], -1)
paintWindow = cv2.rectangle(paintWindow, (390,1), (485,65), colors[2], -1)
paintWindow = cv2.rectangle(paintWindow, (505,1), (600,65), colors[3], -1)

cv2.putText(paintWindow, "CLEAR", (49, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,


(0, 0, 0), 2, cv2.LINE_AA)
cv2.putText(paintWindow, "BLUE", (185, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
(255, 255, 255), 2, cv2.LINE_AA)
cv2.putText(paintWindow, "GREEN", (298, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
(255, 255, 255), 2, cv2.LINE_AA)
cv2.putText(paintWindow, "RED", (420, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
(255, 255, 255), 2, cv2.LINE_AA)
cv2.putText(paintWindow, "YELLOW", (520, 33), cv2.FONT_HERSHEY_SIMPLEX,
0.5, (150,150,150), 2, cv2.LINE_AA)
cv2.namedWindow('Paint', cv2.WINDOW_AUTOSIZE)

This code sets up a "canvas" for drawing, represented as a NumPy array paintWindow. It
also adds four colored rectangles to the canvas, representing buttons that can be used to select
different colors for drawing and adds text labels to the colored rectangles that were drawn on
the canvas and it creates a window for displaying the canvas.

Here is a breakdown of each line of code:

• ‘paintWindow = np.zeros((471,636,3)) + 255’ : This line creates a NumPy array with


a shape of (471, 636, 3) and fills it with zeros. The 3 in the shape indicates that the
array has three "channels", corresponding to the red, green, and blue channels of an
image. The + 255 part of the expression adds 255 to every element in the array,
effectively setting all elements to 255. This creates a "blank" canvas that is all white.
The array is stored in the variable paintWindow.
• ‘paintWindow = cv2.rectangle(paintWindow, (40,1), (140,65), (0,0,0), 2)’ : This line
uses the cv2.rectangle() function from the OpenCV library to draw a rectangle on the
paintWindow array. The function takes five arguments: the image on which to draw the
rectangle (paintWindow), the top-left corner of the rectangle ((40,1)), the bottom-right
corner of the rectangle ((140,65)), the color of the rectangle (black, represented as
(0,0,0)), and the thickness of the line used to draw the rectangle (2 pixels). The modified
paintWindow array is then stored back in the paintWindow variable.
• ‘paintWindow = cv2.rectangle(paintWindow, (160,1), (255,65), colors[0], -1)’ : This
line is similar to the previous one, but it draws a rectangle with the color specified by
the first element of the colors list (blue, represented as (255, 0, 0) in BGR order). The -1
value for the thickness argument tells OpenCV to fill the rectangle completely with the
specified color.
• ‘paintWindow = cv2.rectangle(paintWindow, (275,1), (370,65), colors[1], -1)’: This
line is similar to the previous one, but it draws a rectangle with the color specified by
the second element of the colors list (green, represented as (0, 255, 0)).
• ‘paintWindow = cv2.rectangle(paintWindow, (390,1), (485,65), colors[2], -1)’ : This
line is similar to the previous one, but it draws a rectangle with the color specified by
the third element of the colors list (red, represented as (0, 0, 255) in BGR order).
• ‘paintWindow = cv2.rectangle(paintWindow, (505,1), (600,65), colors[3], -1)’ : This
line is similar to the previous one, but it draws a rectangle with the color specified by
the fourth element of the colors list (yellow, represented as (0, 255, 255)).
• ‘cv2.putText(paintWindow, "CLEAR", (49, 33), cv2.FONT_HERSHEY_SIMPLEX,
0.5, (0, 0, 0), 2, cv2.LINE_AA)’ : This line uses the cv2.putText() function from the
OpenCV library to add text to the paintWindow array. The function takes eight
arguments: the image on which to draw the text (paintWindow), the text to draw

("CLEAR"), the bottom-left corner of the text ((49, 33)), the font to use
(cv2.FONT_HERSHEY_SIMPLEX), the scale of the font (0.5), the color of the text
(black, represented as (0, 0, 0)), the thickness of the text (2 pixels), and the line type
(cv2.LINE_AA).
• The next lines are similar to the previous one, but it adds the text "BLUE", “GREEN”,
“RED”, “YELLOW” to the canvas and uses white text.
• ‘cv2.namedWindow('Paint', cv2.WINDOW_AUTOSIZE)’ : This line creates a window
with the name "Paint" using the cv2.namedWindow() function from the OpenCV
library. The cv2.WINDOW_AUTOSIZE flag tells OpenCV to automatically adjust the
size of the window to fit the displayed image.

# Loading the default webcam of PC.


cap = cv2.VideoCapture(0)

# Keep looping
while True:
    # Reading the frame from the camera
    ret, frame = cap.read()
    # Flipping the frame to see the same side as yours
    frame = cv2.flip(frame, 1)
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    u_hue = cv2.getTrackbarPos("Upper Hue", "Color detectors")
    u_saturation = cv2.getTrackbarPos("Upper Saturation", "Color detectors")
    u_value = cv2.getTrackbarPos("Upper Value", "Color detectors")
    l_hue = cv2.getTrackbarPos("Lower Hue", "Color detectors")
    l_saturation = cv2.getTrackbarPos("Lower Saturation", "Color detectors")
    l_value = cv2.getTrackbarPos("Lower Value", "Color detectors")
    Upper_hsv = np.array([u_hue, u_saturation, u_value])
    Lower_hsv = np.array([l_hue, l_saturation, l_value])

• ‘cap = cv2.VideoCapture(0)’ : This line creates a new VideoCapture object using the
default webcam of the PC. The parameter 0 specifies the index of the camera to use. If

you have multiple cameras connected to your PC, you can use other values to select a
different camera.
• ‘while True’ : This line starts an infinite loop that will keep running until the program
is stopped or interrupted.
• ‘ret, frame = cap.read()’ : This line reads a frame from the webcam and stores it in the
frame variable. The ret variable is a boolean value that indicates whether or not the
frame was successfully read.
• ‘frame = cv2.flip(frame, 1)’ : This line flips the frame horizontally. This is done to make
the image appear as if it's being seen from the same perspective as the user.
• ‘hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)’ : This line converts the frame
from the BGR color space to the HSV color space. This is often useful for color
detection tasks because it separates the color information from the luminance
(brightness) information.
• The next lines get the current position of the "Upper Hue", "Upper Saturation", "Upper
Value" trackbar on the "Color detectors" window. The trackbar is used to adjust the
upper hue threshold for color detection.
• The next lines get the current position of the "Lower Hue", "Lower Saturation", "Lower
Value" trackbar on the "Color detectors" window. The trackbar is used to adjust the
lower value threshold for color detection.
• ‘Upper_hsv = np.array([u_hue,u_saturation,u_value])’ : This line creates a new NumPy
array with the upper hue, saturation, and value thresholds. This array will be used to
define the upper bound of the color range to detect.
• ‘Lower_hsv = np.array([l_hue,l_saturation,l_value])’ : This line creates a new NumPy
array with the lower hue, saturation, and value thresholds. This array will be used to
define the lower bound of the color range to detect.
• These two arrays will be used in a later step to define the range of colors to detect in
the image. For example, if the upper and lower bounds for the hue channel are set to
180 and 0, respectively, the program will detect all colors within the full range of hues
(from 0 to 180). If the upper and lower bounds for the saturation and value channels are
also set to 255 and 0, respectively, the program will detect all colors with any level of
saturation and value.

# Adding the colour buttons to the live frame for colour access
frame = cv2.rectangle(frame, (40,1), (140,65), (122,122,122), -1)
frame = cv2.rectangle(frame, (160,1), (255,65), colors[0], -1)
frame = cv2.rectangle(frame, (275,1), (370,65), colors[1], -1)
frame = cv2.rectangle(frame, (390,1), (485,65), colors[2], -1)
frame = cv2.rectangle(frame, (505,1), (600,65), colors[3], -1)
cv2.putText(frame, "CLEAR ALL", (49, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,
255), 2, cv2.LINE_AA)

cv2.putText(frame, "BLUE", (185, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255,


255), 2, cv2.LINE_AA)
cv2.putText(frame, "GREEN", (298, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255,
255), 2, cv2.LINE_AA)
cv2.putText(frame, "RED", (420, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255),
2, cv2.LINE_AA)
cv2.putText(frame,"YELLOW",(520,33),cv2.FONT_HERSHEY_SIMPLEX,0.5,
(150,150,150), 2, cv2.LINE_AA)

These lines of code are adding color buttons to the live video feed from the webcam and using
the cv2.putText() function to add text labels to the color buttons that were added.

• ‘frame = cv2.rectangle(frame, (40,1), (140,65), (122,122,122), -1)’ : This line draws a


rectangle on the frame using the cv2.rectangle() function. The first parameter is the
image on which to draw the rectangle (in this case, frame). The next two parameters are
the top-left and bottom-right corners of the rectangle, respectively. The fourth
parameter is the color of the rectangle, which is specified as a tuple of (B, G, R) values.
The final parameter is the thickness of the rectangle's border. A value of -1 means that
the rectangle will be filled with color and have no border.
• The next lines are similar to the first, but it draws a rectangle with the color specified
by the elements in the colors list.
• ‘cv2.putText(frame, "CLEAR ALL", (49, 33), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
(255,255, 255), 2, cv2.LINE_AA)’ : This line adds the text "CLEAR ALL" to the frame
image at the specified coordinates (49, 33). The cv2.FONT_HERSHEY_SIMPLEX

constant specifies the font to use, and the 0.5 parameter specifies the font scale. The
color of the text is specified as a tuple of (B, G, R) values, and the 2 parameter specifies
the thickness of the text. The cv2.LINE_AA constant specifies the type of line used to
render the text.
• The next lines are similar to the first, but it adds the text “BLUE”, “GREEN”, “RED”,
“YELLOW” to the frame image at the specified coordinates. The color of the text is
specified as a tuple of (B, G, R) values, which is different from the other lines.
• These lines of code are adding labels to the color buttons, so the user knows what each
button does. The labels are added in white text to make them easy to read against the
colored backgrounds of the buttons.

# Identifying the pointer by making its mask


Mask = cv2.inRange(hsv, Lower_hsv, Upper_hsv)
Mask = cv2.erode(Mask, kernel, iterations=1)
Mask = cv2.morphologyEx(Mask, cv2.MORPH_OPEN, kernel)
Mask = cv2.dilate(Mask, kernel, iterations=1)

These lines of code are creating a mask that isolates the color of the pointer (e.g. a finger) in
the video feed from the webcam. Here's a breakdown of what each line does:

• ‘Mask = cv2.inRange(hsv, Lower_hsv, Upper_hsv)’ : This line creates a mask using


the cv2.inRange() function. The hsv image is the source image, and the Lower_hsv and
Upper_hsv arrays define the range of colors to include in the mask. All pixels in the hsv
image that fall within this range will be set to 255 (white) in the mask, and all otherpixels
will be set to 0 (black).
• ‘Mask = cv2.erode(Mask, kernel, iterations=1)’ : This line erodes the mask using the
cv2.erode() function. The kernel parameter specifies a kernel (or structuring element)
to use for the erosion, and the iterations parameter specifies the number of times to
apply the erosion. Erosion is an image processing operation that removes pixels from
the boundaries of objects in the image. It is often used to reduce noise or to separate
touching objects.
• ‘Mask = cv2.morphologyEx(Mask, cv2.MORPH_OPEN, kernel)’ : This line performs
a morphological opening operation on the mask using the cv2.morphologyEx()

function. The cv2.MORPH_OPEN constant specifies that an opening operation should
be performed, and the kernel parameter specifies the kernel to use. Morphological
opening is an image processing operation that erodes the image followed by dilation. It
is often used to remove small objects or noise from an image.
• ‘Mask = cv2.dilate(Mask, kernel, iterations=1)’ : This line dilates the mask using the
cv2.dilate() function. The kernel parameter specifies a kernel to use for the dilation, and
the iterations parameter specifies the number of times to apply the dilation. Dilation is
an image processing operation that adds pixels to the boundaries of objects in the image.
It is often used to join broken parts of objects or to fill in small holes.

Together, these operations create a mask that isolates the pointer (e.g. finger) from the rest
of the image. The mask will be used in a later step to identify the pointer and track its
movements.

# Find contours for the pointer after identifying it

cnts,_ = cv2.findContours(Mask.copy(), cv2.RETR_EXTERNAL,

cv2.CHAIN_APPROX_SIMPLE)

center = None

• In this code, the ‘findContours’ function is being used to identify and trace the contours,
or outlines, of the object in the image represented by the Mask binary image. A binary
image is an image that has only two values for each pixel; in OpenCV these are usually
0 (black) and 255 (white). In this case, the ‘Mask’ image has the pixels corresponding to
the object of interest (e.g. the pointer) set to white, and all other pixels set to black.
• The ‘findContours’ function returns two values: a list of contours and a hierarchy. Each
contour is a list of points that define the boundary of the object in the image. The
hierarchy is a list that represents the topological structure of the contours, but it is not
used in this code, so it is discarded using the _ variable.
• The RETR_EXTERNAL parameter specifies that only the extreme outer contours
should be retrieved. The CHAIN_APPROX_SIMPLE parameter specifies that the
contour points should be stored in a more compact representation, by storing only the
corner points of the contour.

• The center variable is being initialized as None, but it will be used later to store the
center point of the object represented by the contours.

# If the contours are formed
if len(cnts) > 0:
    # Sorting the contours to find the biggest
    cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]
    # Get the radius of the enclosing circle around the found contour
    ((x, y), radius) = cv2.minEnclosingCircle(cnt)
    # Draw the circle around the contour
    cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 255), 2)
    # Calculating the center of the detected contour
    M = cv2.moments(cnt)
    center = (int(M['m10'] / M['m00']), int(M['m01'] / M['m00']))

This code block is executed only if there are contours detected in the image, which is
checked by the if statement if len(cnts)>0.

The contours are sorted in descending order by their area using the sorted function and the
cv2.contourArea function as the key. This is done so that the largest contour is processed first.

Then, the minimum enclosing circle of the contour is found using the
cv2.minEnclosingCircle function. This function returns the center coordinates (x, y) and the
radius of the circle as output.

The circle is then drawn on the original image frame using the cv2.circle function, with the
center coordinates and the radius of the enclosing circle as input. The circle is drawn with a
yellow color ((0, 255, 255)) and a thickness of 2 pixels.

Finally, the center of the contour is calculated using the moments of the contour. Moments are
statistical properties of an image that can be used to analyze and describe the shape of an object.

The center of the contour is calculated from its moments. The moment M['m10'] is the sum
of the x-coordinates of the pixels in the contour, M['m01'] is the sum of the y-coordinates, and
M['m00'] is the zero-order moment (the area). The center coordinates are therefore the ratios
M['m10']/M['m00'] and M['m01']/M['m00'], and the center is stored as a tuple of integer
coordinates (int(M['m10'] / M['m00']), int(M['m01'] / M['m00'])).

The cv2.moments() function returns a dictionary of moments that can be used to calculate
various properties of the contour, such as its area, centroid, and orientation. Here the centroid
(center of mass) of the contour is obtained by dividing the first-order spatial moments by the
zero-order moment (the area), and the resulting coordinates are stored in the center variable.

In this application the "pointer" is the coloured marker (for example a blue pen cap or a
finger with a coloured cover) held by the user; the contours and moments of the pointer are
used to identify its position in the video frame.
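As a small worked illustration of the centroid formula described above (cx = m10/m00, cy = m01/m00), the sketch below uses a made-up square contour; the coordinates are example values only.

import cv2
import numpy as np

# A hypothetical square contour with corners at (10, 10) and (50, 50)
square = np.array([[[10, 10]], [[50, 10]], [[50, 50]], [[10, 50]]], dtype=np.int32)

M = cv2.moments(square)
cx = int(M['m10'] / M['m00'])
cy = int(M['m01'] / M['m00'])
print("Centroid:", (cx, cy))   # expected to be the middle of the square, (30, 30)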

    # Now checking if the user wants to click on any button above the screen
    if center[1] <= 65:
        if 40 <= center[0] <= 140:      # Clear Button
            bpoints = [deque(maxlen=512)]
            gpoints = [deque(maxlen=512)]
            rpoints = [deque(maxlen=512)]
            ypoints = [deque(maxlen=512)]

            blue_index = 0
            green_index = 0
            red_index = 0
            yellow_index = 0

            paintWindow[67:, :, :] = 255
        elif 160 <= center[0] <= 255:
            colorIndex = 0              # Blue
        elif 275 <= center[0] <= 370:
            colorIndex = 1              # Green
        elif 390 <= center[0] <= 485:
            colorIndex = 2              # Red
        elif 505 <= center[0] <= 600:
            colorIndex = 3              # Yellow
    else:
        if colorIndex == 0:
            bpoints[blue_index].appendleft(center)
        elif colorIndex == 1:
            gpoints[green_index].appendleft(center)
        elif colorIndex == 2:
            rpoints[red_index].appendleft(center)
        elif colorIndex == 3:
            ypoints[yellow_index].appendleft(center)
# Append the next deques when nothing is detected to avoid messing up
else:
    bpoints.append(deque(maxlen=512))
    blue_index += 1
    gpoints.append(deque(maxlen=512))
    green_index += 1
    rpoints.append(deque(maxlen=512))
    red_index += 1
    ypoints.append(deque(maxlen=512))
    yellow_index += 1

This code block is executed when the center of the object is located within the horizontal
position of the Clear button (40 to 140 pixels) in the top 65 pixels of the screen.

The code resets the bpoints, gpoints, rpoints, and ypoints lists to empty lists using the deque
class, as explained in the previous answer. The blue_index, green_index, red_index, and
yellow_index variables are also reset to 0. The painting window is filled with white color using
slicing and assignment. The painting window is a region of the image where the user can draw
with different colors using the pointer. The painting window is located below the buttons, and
it starts at the 66th row of the image (the top 65 rows are reserved for the buttons).

The paintWindow[67:,:,:] slicing notation selects all the rows starting from the 67th row
(inclusive) and all the columns and channels (color channels) of the image. The : symbol is
used to select all the elements in a particular dimension. The selected region is then filled with
white color using the assignment operator = 255. The value 255 represents the maximum
intensity for each color channel in the image (255 for red, green, and blue in the case of a
standard 8-bit RGB image).

When the center of the object is located within the horizontal range of any of the color
buttons (Blue, Green, Red, or Yellow) in the top 65 pixels of the screen, the colorIndex variable
is set to the corresponding index for the selected color. The index values are 0 for Blue, 1 for
Green, 2 for Red, and 3 for Yellow. The value of the colorIndex variable will be used later to
determine which color the user has selected, and to add the coordinates of the center of the
object to the appropriate list of points (bpoints, gpoints, rpoints, or ypoints).

When the center of the object is not located in the top 65 pixels of the screen (i.e. it is located
below the buttons) and a color has been selected by the user, the code checks the value of the
colorIndex variable to determine which color the user has selected. If the colorIndex is set to 0
(Blue), the coordinates of the center of the object are added to the bpoints list using the
appendleft method of the deque class. If the colorIndex is set to 1 (Green), the coordinates are
added to the gpoints list. If the colorIndex is set to 2 (Red), the coordinates are added to the
rpoints list. If the colorIndex is set to 3 (Yellow), the coordinates are added to the ypoints list.

The appendleft method adds the element to the left end of the queue (the beginning of the
list). This ensures that the most recent points are added at the beginning of the list, while the
older points are pushed to the end of the list. This is useful for drawing the lines with the
pointer, as the lines are drawn in the order that the points were added to the list.

When the center of the object is not detected in the image (the final else case), the code appends
new empty lists to the bpoints, gpoints, rpoints, and ypoints lists using the deque class. The
deque class is a double-ended queue that allows adding and removing elements from both ends
of the queue efficiently. The maxlen parameter specifies the maximum number of elements
that the queue can hold. In this case, the queue is set to hold a maximum of 512 elements. The
blue_index, green_index, red_index, and yellow_index variables are incremented by 1. These
variables are used to keep track of the current index of the lists of points being added to the
bpoints, gpoints, rpoints, and ypoints lists.

The purpose of this code block is to avoid "messing up" the lists of points and the index
variables if the object is not detected. By adding new empty lists to the lists of points and
incrementing the index variables, the code ensures that the lists and the variables are ready to
receive new points when the object is detected again.

# Draw lines of all the colors on the canvas and frame


points = [bpoints, gpoints, rpoints, ypoints]
for i in range(len(points)):
    for j in range(len(points[i])):
        for k in range(1, len(points[i][j])):
            if points[i][j][k - 1] is None or points[i][j][k] is None:
                continue
            cv2.line(frame, points[i][j][k - 1], points[i][j][k], colors[i], 2)
            cv2.line(paintWindow, points[i][j][k - 1], points[i][j][k], colors[i], 2)

This code block is used to draw the lines on the canvas and the original frame using the
points stored in the bpoints, gpoints, rpoints, and ypoints lists. The points list is a list that
contains the four lists of points (bpoints, gpoints, rpoints, and ypoints). The colors list is a list
of colors that corresponds to the colors represented by the lists of points (blue, green, red, and
yellow). The code iterates over the lists of points using two nested for loops. The outer loop
iterates over the indexes of the lists (0 to 3), and the inner loop iterates over the points within
each list.

For each pair of points in the list, the code draws a line connecting them using the
cv2.line function. The function takes as input the image where the line will be drawn (frame or
paintWindow), the coordinates of the starting and ending points of the line, the color of the line
(from the colors list), and the thickness of the line (2 pixels). The continue statement is used to
skip the iteration of the inner loop if either of the points is None. This is done to avoid drawing
lines with points that have not been detected.

The lines are drawn on both the original frame and the painting window. The painting
window is a region of the image where the user can draw with different colors using the pointer.
The painting window is located below the buttons, and it starts at the 66th row of the image
(the top 65 rows are reserved for the buttons).

# Show all the windows


cv2.imshow("Tracking", frame)
cv2.imshow("Paint", paintWindow)
cv2.imshow("mask",Mask)

This code is using the OpenCV (cv2) library to display three images in separate windows. The
first window, "Tracking" shows the original frame with the lines drawn on it. The second
window, "Paint", shows the painting window with the lines drawn on it. The third window,
"mask", shows the mask image that is used to detect the object (e.g. the pointer).

cv2.imshow() is a function that takes in two arguments: the name of the window and
the image to be displayed in that window. When the code is run, it will open three windows
and display the respective images in them. The windows will remain open until the user closes
them or until the program is stopped.

The purpose of this code is likely to allow the user to view and analyze the images in
each window. The "Tracking" window may show the original video or image, while the "Paint"
and "mask" windows may show some kind of processed version of the original image, such as
a mask or an annotated version.

# If the 'q' key is pressed then stop the application
if cv2.waitKey(1) & 0xFF == ord("q"):
    break

This code block is used to stop the application when the user presses the 'q' key. The
cv2.waitKey function waits for a key event and returns the ASCII code of the key pressed. The
function takes as input the delay in milliseconds that the function should wait for a key event
(1 millisecond in this case). The & 0xFF operator is used to extract the lower 8 bits of the
returned value, which corresponds to the ASCII code of the key pressed.

The ord function is used to get the ASCII code of the 'q' key. If the ASCII code of the
key pressed is equal to the ASCII code of the 'q' key, the if statement is True and the break
statement is executed, which breaks the infinite loop and stops the application. This is a
common technique in OpenCV applications to allow the user to stop the execution by pressing
a key.

# Release the camera and all resources

cap.release()

cv2.destroyAllWindows()

This code block is used to release the camera and free the resources used by the
application when the application is stopped. The cap.release function releases the camera
resources. This is important to ensure that the camera is available for use by other applications
and to prevent resource leaks.

The cv2.destroyAllWindows function closes all the windows that have been created
using the cv2.imshow function. This is important to ensure that the windows are closed and the
resources used by the windows are freed.

These lines of code should be executed after the infinite loop is broken (e.g. when the
user presses the 'q' key or when the end of the video file is reached).

CHAPTER-7

RESULTS AND OUTPUT SCREENS

Fig.7.1: Live frame window is opened

Fig.7.2: Colour is detected in mask window


Fig.7.3: Contour is detected

Fig.7.4: Output of Air Canvas

CHAPTER-8
CONCLUSION & FUTURE WORK

8.1 CONCLUSION
Air Canvas is a fun and interactive application that demonstrates the capabilities of
OpenCV for object detection and tracking, and for drawing on images. It can be used as a
creative tool for digital art, or as a learning project for students interested in computer vision
and image processing. This project has the potential to challenge traditional writing methods.
It eliminates the need to hold a cell phone in hand to take notes and gives an easy way to do
the same on the go. It also serves a greater purpose by helping the deaf and hard of hearing to
communicate easily. Even adults who find it difficult to use a keyboard can easily use the
program. With expanded functionality, this program could also be used to control IoT devices,
and air painting can also be made possible. This program would make very good software for
smart wearables, allowing people to interact better with the digital world. An air-writing
program should respond only to the controlling gestures of its user and should not be misled
by people in the surroundings; detection algorithms such as YOLOv3 can improve fingertip
recognition accuracy and speed. In the future, progress in Artificial Intelligence will further
improve the efficiency of writing in the air.

An air-writing approach employing a laptop camera and a video-based pointing
mechanism is shown here. To begin with, the proposed technique tracks the colour of the
fingertip in video frames and then applies OCR to the plotted pictures in order to identify the
written letters. It also allows a natural human-system interface that does not need a keypad,
pen, or glove for character input; all that is needed is a camera and a red-coloured marker to
recognize the fingertip. OpenCV and Python were used to create the application for these
studies. Using the suggested technique, an average accuracy of 92.083 percent may be achieved
in the recognition of the correct alphabets, with a delay of about 50 ms per character in total
writing time. Furthermore, the suggested approach may be applied to any unconnected
(non-cursive) script, but it has one important disadvantage: it is colour sensitive, so the
presence of any red colour in the backdrop before commencing the analysis might lead to
erroneous findings.

8.2. FUTURE ENHANCEMENT:

There are several ways in which the Air Canvas application could be enhanced using
OpenCV and Python:
• Object recognition : Instead of using a color-based mask to detect the object, the
application could use object recognition techniques such as template matching, feature
extraction, or deep learning to recognize the object regardless of its color or appearance
(a small template-matching sketch is given after this list). This would allow the user to
use any object as a pointer, not just objects of a particular color.
• Multiple pointers : The application could be extended to support multiple pointers at
the same time, allowing multiple users to draw on the screen simultaneously. This could
be achieved using object recognition techniques or by assigning unique colors to each
user.
• 3D drawing : The application could be enhanced to allow the user to draw in 3D space
using a depth camera or a stereo camera. This would enable the user to draw lines and
shapes in 3D, and to create 3D art and models.
• Touch screen support : The application could be adapted to work with touch screen
devices, allowing the user to draw on the screen using their fingers or a stylus.
• Network support : The application could be enhanced to support networking, allowing
multiple users to draw on the same canvas remotely using their own devices. This could
be achieved using network protocols such as TCP/IP or HTTP.
• User interface: The application could be enhanced with a more intuitive and interactive
user interface, with features such as a color palette, brush sizes, undo/redo.
• Layer support: The application could be enhanced to support layers, allowing the user
to draw on multiple layers and to rearrange, merge, or delete layers. This would enable
the user to create more complex artworks and to edit their drawings more easily.
• Text input: The application could be enhanced to support text input, allowing the user
to write and draw text on the canvas.
• Image import and export: The application could be enhanced to allow the user to import
and export images from and to the canvas. This would enable the user to use their own
images as a background or to save their drawings as image files.
• Advanced image processing techniques: The application could be enhanced with
advanced image processing techniques such as image segmentation, image
enhancement, or image restoration to improve the accuracy and quality of the object
detection and tracking.
• Augmented reality: The application could be enhanced with augmented reality (AR)
techniques to superimpose the drawings on the real-world environment seen through
the camera. This would enable the user to draw on real-world objects and surfaces.
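As a rough sketch of the object-recognition enhancement mentioned in the first item above, the fragment below locates a pointer in a frame using template matching; the file names frame.jpg and pointer_template.jpg and the 0.8 match threshold are assumptions for this example, not part of the project code.

import cv2

# Assumed file names: a camera frame and a small template image of the pointer object
frame = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("pointer_template.jpg", cv2.IMREAD_GRAYSCALE)
if frame is None or template is None:
    raise SystemExit("sample images not found")

result = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# max_loc is the top-left corner of the best match; a threshold filters weak matches
if max_val > 0.8:
    h, w = template.shape
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    cv2.rectangle(frame, top_left, bottom_right, 255, 2)
    cv2.imshow("Best match", frame)
    cv2.waitKey(0)
    cv2.destroyAllWindows()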

PyGame incorporates a line drawing technique (pygame.draw.line()) that could be valuable
in creating smoother, cleaner lines. In a similar vein, implementing a variety of brush shapes,
textures, and even an eraser would make Air Canvas more powerful as a drawing program.
Permitting the user to save their final work or watch their drawing process as a playback
animation could likewise be remarkable features that resemble genuine creative software.
Perhaps there would even be a way to interface Air Canvas with real digital drawing programs
such as Adobe Photoshop, Clip Studio Paint, or GIMP! At last, we could make significant
strides by working out how multicore processing can be combined with in-order information
processing.
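A minimal sketch of the pygame.draw.line() idea mentioned above is given below; the window size, colours, and coordinates are arbitrary example values, not part of the current Air Canvas code.

import pygame

pygame.init()
screen = pygame.display.set_mode((640, 480))
screen.fill((255, 255, 255))                      # white canvas

# Draw a stroke as a series of short connected line segments
points = [(100, 200), (150, 180), (200, 190), (250, 170)]
for start, end in zip(points, points[1:]):
    pygame.draw.line(screen, (0, 0, 255), start, end, 3)

pygame.display.flip()

# Keep the window open until it is closed
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
pygame.quit()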

Computer Vision is the science of helping computers perceive and interpret digital pictures
such as photos and movies. It's been a decades-long topic of intense investigation. Computer
vision is getting better than the human visual cognitive system at spotting patterns from
pictures. Computer vision-based technologies have surpassed human doctors' pattern
recognition skills in the healthcare industry. Let us examine the status of computer vision
technology now and in the future. There are several aspects to consider when computer vision
expands its effect on the human world. With further study and fine-tuning, computer vision
will be able to do more in the future. The system will be simpler to train and can identify more
from photos than it does presently. Computer vision will be used in conjunction with other
technologies or subsets of AI to generate more attractive applications. Image captioning apps,
for example, may use natural language generation (NLG) to understand things in the
environment for visually impaired persons.
Computer vision may help create artificial general intelligence (AGI) and artificial
superintelligence (ASI) by processing information better than the human visual system.
Computer vision is a growing sector linked to virtual and augmented reality (VR and AR).
Recent market participants have shown a great interest in VR/AR fusion. This significant
growth in attention is mirrored in the release of several cutting-edge technology items.
REFERENCES

[1] Y. Huang, X. Liu, X. Zhang, and L. Jin, "A Pointing Gesture Based Egocentric Interaction
System: Dataset, Approach, and Application," 2016 IEEE Conference on Computer Vision and
Pattern Recognition Workshops (CVPRW), Las Vegas, NV, pp. 370-377, 2016.

[2] P. Ramasamy, G. Prabhu, and R. Srinivasan, "An economical air writing system is converting
finger movements to text using a web camera," 2016 International Conference on Recent Trends
in Information Technology (ICRTIT), Chennai, pp. 1-6, 2016.

[3] Saira Beg, M. Fahad Khan and Faisal Baig, "Text Writing in Air," Journal of Information
Display Volume 14, Issue 4, 2013.

[4] Alper Yilmaz, Omar Javed, Mubarak Shah, "Object Tracking: A Survey", ACM Computer
Survey. Vol. 38, Issue. 4, Article 13, Pp. 1-45, 2006.

[5] Yuan-Hsiang Chang, Chen-Ming Chang, "Automatic Hand-Pose Trajectory Tracking System
Using Video Sequences", INTECH, pp. 132-152, Croatia, 2010.

[6] Erik B. Sudderth, Michael I. Mandel, William T. Freeman, Alan S. Willsky, "Visual Hand
Tracking Using Nonparametric Belief Propagation", MIT Laboratory For Information &
Decision Systems Technical Report P2603, Presented at IEEE CVPR Workshop On Generative
Model-Based Vision, Pp. 1-9, 2004.

[7] T. Grossman, R. Balakrishnan, G. Kurtenbach, G. Fitzmaurice, A. Khan, and B. Buxton,


"Creating Principal 3D Curves with Digital Tape Drawing," Proc. Conf. Human Factors Computing
Systems (CHI' 02), pp. 121- 128, 2002.

[8] T. A. C. Bragatto, G. I. S. Ruas, M. V. Lamar, "Real-time Video-Based Finger Spelling
Recognition System Using Low Computational Complexity Artificial Neural Networks",
IEEE ITS, pp. 393-397, 2006.
