Hand Signs To Audio Converter

PROJECT SYNOPSIS
OF MAJOR PROJECT

BACHELOR OF TECHNOLOGY
CSE

SUBMITTED BY:

Raghwendra Kumar Pandey (2002300100125) (15187)
Anish Thakur (2002300100024) (15083)
Rahul (2002300100127) (15105)
Manoj Kumar Thakur (2002300100093) (15076)
Saurabh Agarwal (2002300100142) (15682)

GUIDED BY:

Ms. Parul Singh

Dronacharya Group of Institutions, Greater Noida


December 2023
TABLE OF CONTENTS

1. Abstract
2. Introduction
   • Overall Description
   • Purpose
   • Motivation and Scope
3. Literature Survey
4. Problem Statement
5. Proposed Model
6. References
1. Abstract

Hand gestures are one of the principal elements of sign language, yet it is very difficult for hearing-impaired people to communicate with the wider world. This project presents a solution that not only automatically recognizes hand gestures but also converts them into speech and text output, so that a hearing-impaired person can easily communicate with hearing people. A camera attached to the computer captures images of the hand, and contour-based feature extraction is used to recognize the person's hand gestures. Based on the recognized gesture, the corresponding recorded soundtrack is played.

The goal of the project is to enhance inclusivity by providing an effective bridge between the deaf and hearing communities through seamless sign-language-to-speech conversion.
2. INTRODUCTION

Effective communication is essential in all aspects of life, and it is especially important for individuals who are deaf or hard of hearing. With the rising number of people suffering from hearing loss, it is crucial to find ways to bridge the communication gap between the hearing and non-hearing populations. To address this issue, we present a new system for converting sign language into text using computer vision and machine learning techniques. The system aims to provide an efficient and accessible way for deaf and hard-of-hearing individuals to communicate with the hearing population. Communication has a great impact in every domain; it carries the meaning of our thoughts and expressions, which is what attracts researchers to bridge this gap between hearing and deaf people.

According to the World Health Organization, by 2050 nearly 2.5 billion people are projected to have some degree of hearing loss and at least 700 million will require hearing rehabilitation. Over 1 billion young adults are at risk of permanent, avoidable hearing loss due to unsafe listening practices. Sign languages vary among regions and countries, with Indian, Chinese, American, and Arabic being some of the major sign languages in use today.

This system focuses on Indian Sign Language. Computer vision is one of the emerging frameworks in object detection and is widely used across many areas of artificial intelligence research. Our system introduces efficient and fast techniques for identifying the hand gestures that carry sign language meaning: we extract MediaPipe Holistic keypoints from the video feed, build a sign language model using an action-detection network powered by LSTM layers, and then predict Indian Sign Language gestures in real time.
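As a concrete illustration, the following sketch shows one common way to extract and flatten MediaPipe Holistic keypoints per frame. The fixed landmark counts (33 pose, 468 face, 21 per hand) follow MediaPipe's documented output; the flattened 1662-value layout is an illustrative assumption rather than a finalized design.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    """Flatten Holistic landmarks into one fixed-length feature vector.
    Missing parts (e.g., a hand out of frame) are zero-filled so the
    vector length stays constant: 33*4 + 468*3 + 21*3 + 21*3 = 1662."""
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    ok, frame = cap.read()
    if ok:
        # MediaPipe expects RGB; OpenCV captures BGR.
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        features = extract_keypoints(results)
        print(features.shape)  # (1662,)
cap.release()
```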
2.1 Overall Description

Sign language is the most natural and expressive way for deaf individuals to communicate. It comprises gestures made with the hands and other body parts, including facial expressions, and is primarily used by people who are deaf or mute. Sign language recognition refers to the process of converting the user's signs and gestures into text. Each country has its own sign language; Indians communicate using Indian Sign Language (ISL). Other sign languages, such as ASL (American Sign Language) and BSL (British Sign Language), are generally single-handed, whereas ISL uses both hands to make signs. A deaf-mute person's attempts to speak are rarely understood by the average person, so deaf-mute people use gestures to communicate their needs. They communicate with others through sign language, but they have difficulty with those who do not understand it. Building on the new technology available to our generation, we have created a machine learning model that translates sign language into text and thereby reduces the communication gap between hearing and deaf people.
1. OpenCV:
Video input: OpenCV handles capturing the live video feed from a camera, or processing pre-recorded videos.
Hand detection: OpenCV is used to detect and track the hand in the video frames. This can be achieved with techniques such as background subtraction and contour analysis, or with more advanced methods such as MediaPipe, as sketched below.
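For instance, a minimal OpenCV-only sketch of the contour-based route might look like the following; the HSV skin thresholds are rough illustrative values and would need tuning for real lighting conditions.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Crude skin segmentation in HSV; thresholds are illustrative only.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array([0, 40, 60]), np.array([25, 255, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        # Assume the largest skin-coloured blob is the hand.
        hand = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(hand)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Hand detection", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```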
2. MediaPipe:
Hand gesture recognition: MediaPipe's hand tracking model recognizes and interprets the hand gestures captured by OpenCV. MediaPipe offers pre-trained models for hand detection and landmark estimation.
Extracting hand features: the hand landmarks provided by MediaPipe are processed to identify specific hand gestures and their corresponding meanings, as in the sketch below.
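A minimal sketch of this step using MediaPipe's legacy Hands solution is shown below; drawing the landmarks and reading the index fingertip are just illustrative uses of the landmark output.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.5) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                # Overlay the 21-point hand skeleton on the frame.
                mp_drawing.draw_landmarks(frame, hand,
                                          mp_hands.HAND_CONNECTIONS)
                # Landmark 8 is the index fingertip (normalized coordinates).
                tip = hand.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
                print(f"index tip: ({tip.x:.2f}, {tip.y:.2f})")
        cv2.imshow("MediaPipe Hands", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```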
3. Tkinter:
User interface: a graphical user interface (GUI) built with Tkinter lets users interact with the application. The interface can include buttons, menus, and other controls for functionality such as starting/stopping the conversion and selecting audio/video files; a minimal shell is sketched below.
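A minimal Tkinter shell along these lines might look like the following; the start/stop callbacks are hypothetical hooks into the recognition loop, not implemented behaviour.

```python
import tkinter as tk

root = tk.Tk()
root.title("Hand Signs to Audio Converter")

status = tk.StringVar(value="Idle")
tk.Label(root, textvariable=status, font=("Arial", 14)).pack(pady=10)

def start_conversion():
    # Hypothetical hook: start the OpenCV capture loop here.
    status.set("Recognizing...")

def stop_conversion():
    # Hypothetical hook: release the camera here.
    status.set("Stopped")

tk.Button(root, text="Start",
          command=start_conversion).pack(side=tk.LEFT, padx=20, pady=10)
tk.Button(root, text="Stop",
          command=stop_conversion).pack(side=tk.RIGHT, padx=20, pady=10)

root.mainloop()
```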
2.2 Purpose

The purpose of this research is to construct a machine learning model that can recognize hand gestures from a camera and then turn each recognized gesture into voice, so that hearing people can understand what deaf and mute people are saying. We use a deep convolutional neural network (CNN) to train on hand gesture images, and we use that trained model to predict the learnt hand gestures from the webcam. The suggested investigation originally used an SVM technique, but the Python SVM proved unreliable for distinguishing hand motion.

This project aims to develop a system that can convert hand gestures into text. The objective is to add photographs to a database, match incoming gestures against them, and convert the matches to text. As part of the detection process, the hands are observed in motion. The method generates text output, reducing the communication gap between hearing people and deaf-mutes.

Not all signs can be expressed in a single image, and a system that recognizes sign language exclusively from images inherits the limitations of image classification. To compensate, we use a CNN and an RNN together to classify videos: the CNN extracts the spatial properties of the hand signs, and its output is fed into the RNN for sequence modelling, which determines which sign is shown in the video. The detected sign is then translated into text and speech. A minimal sketch of this architecture follows.
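A minimal Keras sketch of this CNN+RNN idea is given below. The sequence length (30 frames), frame size (64x64 grayscale), layer sizes, and class count are illustrative assumptions, not a finalized design.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                                     Flatten, LSTM, Dense)

NUM_CLASSES = 10  # hypothetical number of signs

model = Sequential([
    # The CNN is applied to each frame independently (TimeDistributed)
    # to extract spatial features of the hand sign.
    TimeDistributed(Conv2D(32, (3, 3), activation='relu'),
                    input_shape=(30, 64, 64, 1)),
    TimeDistributed(MaxPooling2D((2, 2))),
    TimeDistributed(Conv2D(64, (3, 3), activation='relu')),
    TimeDistributed(MaxPooling2D((2, 2))),
    TimeDistributed(Flatten()),
    # The RNN models how the per-frame features evolve over the video.
    LSTM(64),
    Dense(64, activation='relu'),
    Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```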
2.3 Motivation and Scope

Hearing-impaired people communicate through hand signs, which makes it difficult for hearing people to understand their language. As a result, systems that recognize the various signs and relay the information to hearing people are required.

The fundamental issue is that many signs cannot be expressed in still images, whereas video sequences can capture them. The key aim here is therefore to detect the sign in a video sequence and translate it into text and speech that hearing people can understand.
3. Literature Survey

There has been a great deal of research into hand sign language gesture recognition in recent years. The main technologies used to recognize gestures are outlined below.

A. Vision-based

In vision-based approaches, a computer camera is used to observe information from the hands or fingers. Vision-based approaches require only a camera, allowing natural human-computer interaction without any additional hardware. These systems complement biological vision with artificial vision systems implemented in software and/or hardware. This is a difficult challenge: to attain real-time performance, such systems must be insensitive to background and illumination, and agnostic to the person and camera. Furthermore, they must be tailored to satisfy requirements including accuracy and resilience.
The work "Automatic Indian Sign Language Recognition for Continuous Video Sequence" [2] proposes a system with four primary modules: Data Acquisition, Pre-processing, Feature Extraction, and Classification. Skin filtering and histogram matching are applied in the pre-processing step, followed by eigenvector-based feature extraction and an eigenvalue-weighted Euclidean distance classification technique. In that work, 24 different alphabets were considered, and a 96 percent recognition rate was achieved.
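One plausible reading of the eigenvalue-weighted Euclidean distance classifier described in [2] is sketched below; the exact weighting used by the paper may differ, so this is an interpretation rather than a reproduction.

```python
import numpy as np

def eigen_weighted_distance(test_vec, train_vec, eigenvalues):
    """Euclidean distance between eigen-feature vectors, with each
    component weighted by its eigenvalue. This is our reading of the
    classifier in [2], not a verified reproduction."""
    diff = test_vec - train_vec
    return np.sqrt(np.sum(eigenvalues * diff ** 2))

def classify(test_vec, train_vecs, labels, eigenvalues):
    # Nearest-neighbour decision under the weighted distance.
    dists = [eigen_weighted_distance(test_vec, tv, eigenvalues)
             for tv in train_vecs]
    return labels[int(np.argmin(dists))]
```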

4. Problem Statement

Developing a hand-sign-to-audio-and-video converter using OpenCV, Tkinter, and MediaPipe involves creating a system capable of real-time hand gesture recognition from a live camera feed, translating the recognized gestures into textual or audible output, and presenting the interpreted signs through a graphical user interface. The key components and challenges to address include:

• Real-time hand gesture recognition: utilize MediaPipe's hand tracking capabilities within OpenCV to accurately detect and track hand movements under varying conditions, such as different hand shapes, positions, and orientations.

• Hand gesture interpretation: develop algorithms to interpret recognized hand gestures, mapping them to corresponding text or audio representations. This involves creating a comprehensive database or mapping system for the different sign language gestures (see the sketch after this list).

• User interface integration: design and implement a user-friendly GUI using Tkinter to display textual output, or play audio/video representing the interpreted hand signs, in real time.
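As an illustration of the interpretation step, a minimal sketch of the gesture-to-output mapping might look like the following; the gesture labels and phrases are hypothetical placeholders, and pyttsx3 is one common offline text-to-speech option for Python.

```python
import pyttsx3

# Hypothetical mapping from recognized gesture labels to spoken phrases.
GESTURE_TO_TEXT = {
    "hello": "Hello!",
    "thanks": "Thank you.",
    "help": "I need help.",
}

engine = pyttsx3.init()

def speak_gesture(gesture_label):
    """Look up the recognized gesture and speak the mapped phrase."""
    text = GESTURE_TO_TEXT.get(gesture_label)
    if text is None:
        return  # unrecognized gesture; stay silent
    engine.say(text)
    engine.runAndWait()

speak_gesture("hello")  # plays "Hello!" through the speakers
```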
5. Proposed Model

• Hand detection and tracking: utilize MediaPipe's hand tracking capabilities via OpenCV to detect and track hand landmarks in real time from the camera feed. This step involves locating and identifying the landmarks of the hand, such as the fingertips, palm, and joints.

• Gesture recognition and mapping: develop an algorithm to recognize and interpret these hand landmarks, mapping them to predefined gestures or sign language symbols. The mapping could be based on a pre-built dataset or on machine learning models trained on sign language gestures; a sketch of such a model follows.
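Putting the pieces together, a minimal sketch of the LSTM-based action-detection model mentioned in the Introduction is shown below. The 30-frame sliding window, the 1662-value keypoint vector (matching the earlier Holistic extraction sketch), the vocabulary, and the layer sizes are all illustrative assumptions.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

ACTIONS = ["hello", "thanks", "help"]  # hypothetical sign vocabulary

# Each sample: 30 frames x 1662 Holistic keypoint values per frame.
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(len(ACTIONS), activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Real-time use: keep a sliding window of the last 30 keypoint vectors
# and predict once the window is full.
window = []

def on_new_frame(keypoints):  # keypoints: shape (1662,)
    window.append(keypoints)
    if len(window) > 30:
        window.pop(0)
    if len(window) == 30:
        probs = model.predict(np.expand_dims(window, axis=0), verbose=0)[0]
        return ACTIONS[int(np.argmax(probs))]
    return None
```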

6. References

• educative.jo/answers/sign-language-translator-using-opencv