Indian Sign Language Generator and Detector Team Qstar

Deccan Education Society's
NAVINCHANDRA MEHTA INSTITUTE OF

TECHNOLOGY AND DEVELOPMENT
NAAC Accredited “B++”
“Indian Sign Language Generator and Detector”

SUBMITTED BY
C23085 – Nikhil Patil
C23122 - Siddharth Suryavanshi
C23131 - Suraj Ramshakal Vishwakarma
C23080 - Pal Deepak Ramesh
C23110 - Mohd Talha Imtiyaz Ahamad Shaikh
C23016 - Rohini Yashavant Bhavar
Mentor: Dr.Swapnali Mahadik
[2023-2025]
Submitted to University of Mumbai

in partial fulfilment of the requirements for qualifying
MASTER OF COMPUTER APPLICATION
Examination
1
Deccan Education Society’s
NAVINCHANDRA MEHTA INSTITUTE OF TECHNOLOGY AND DEVELOPMENT
PROJECT CERTIFICATE
This is to certify that the Project done at Deccan Education Society by Mr. Nikhil Satyawan
Patil (C23085), Mr. Siddharth Suryavanshi(C23122), Mr. Suraj Ramshakal
Vishwakarma(C23131), Mr. Pal Deepak Ramesh(C23080), Mohd Talha Imtiyaz
Ahamad Shaikh(C23110) & Miss. Rohini Yashavant Bhavar (C23016) in partial fulfilment
for MCA Degree Examination has been found satisfactory. This report had not been submitted
for any other examination and does not form part of any other course undergone by the
candidate.
Internal Guide Director
EXAMINED BY
EXTERNAL EXAMINERBOT
…………………………
College Stamp
2
ACKNOWLEDGEMENT
Achievement is finding out what you would be doing rather than what you must do. It is not until
you undertake such a project that you realize how much effort and hard work it really is, what are your
capabilities and how well you can present yourself or other things. It gives me immense pleasure to present
this report towards the fulfilment of my project.
It has been rightly said that we are built on the shoulder of others. For everything I have
achieved, the credit goes to all those who had helped me to complete this project successfully.
I take this opportunity to express my profound gratitude to management of DES NMITD for giving me
this opportunity to accomplish this project work.
I am very thankful to Dr. Rasika Mallya, Incharge Director of DES NMITD College, for her kind
cooperation in the completion of my project.
Thanks to my project mentor and guide, Dr. Swapnali Mahadik, for helping and guiding me
throughout my project. A special vote of thanks to Dr. Sulakshana Vispute, Head of the Department, MCA.
I would like to thank all my classmates & the entire MCA department who directly or indirectly helped me
in completion of this project.
3
ABSTRACT
This project proposes an advanced AI-powered system to address the communication barriers between the deaf
community and those unfamiliar with Indian Sign Language (ISL). The solution is a multi-platform tool that
provides real-time translation between spoken languages (Hindi/English) and ISL, empowering deaf and hard-
of-hearing individuals to interact seamlessly in diverse social, educational, and professional settings. Leveraging
cutting-edge technologies such as speech recognition, natural language processing (NLP), computer vision, and
motion synthesis, the system ensures highly accurate and dynamic ISL gesture generation while maintaining
adaptability to regional dialects and nuances.
Key innovations include an AI engine for contextual translation, dynamic motion-capture ISL gestures,
and a mobile application that supports bidirectional translation. This mobile app, equipped with speech and
camera inputs, ensures users can switch between spoken and sign languages effortlessly. Additional integrations,
such as offline functionality, extend its usability non-connected environments, further promoting inclusivity.
The project also incorporates strategies to address technical and financial challenges, including the use
of pre-trained models, open-source tools, and continuous engagement with the deaf community for feedback.
The solution has the potential to foster inclusivity and understanding between communities, enhancing
communication in public spaces, education, healthcare, and other critical sectors. The anticipated impact includes
improved accessibility, economic empowerment, and technological advancement, all while promoting
environmental sustainability by reducing reliance on physical resources. By creating a transformative
communication tool, this project aims to make society more equitable and inclusive for the deaf community.
4
INDEX
Chapter Contents Page

No.
I Introduction
1.1 Survey/ Exploratory Study for
Requirements Identification
1.2 Problem statement
1.3 Existing System
1.4 Proposed System
1.5 System Requirement
1.6 Scope of the project
II Project Planning and Scheduling

2.1 Gnatt Chart
III Analysis and Design

3.1 Use Case Diagram
3.2 Activity Diagram
3.3 Detailed flow charts
3.4 Sequence Diagram
3.5 Class Diagram
IV User Interface
5
V Testing & Evaluation of the System
(Approaches / Test Cases)
VI Limitations and Future Enhancements
VII Conclusion
VIII References
6
CHAPTER 1 :– INTRODUCTION
Indian Sign Language (ISL) is the primary form of communication for the deaf and hard-of-hearing
community in India, serving as a vital medium to express thoughts, ideas, and emotions. However,
a significant communication gap persists between ISL users and individuals who do not understand
this language. This divide creates barriers that restrict access to opportunities and services in areas
such as education, healthcare, employment, and public services, thereby limiting the full
participation of the deaf community in mainstream society. This lack of inclusivity also affects social
integration and mutual understanding between the deaf and hearing populations.
To address this critical challenge, this project proposes the development of an innovative, AI-driven
solution: an Indian Sign Language Generator. This system is envisioned as a multi-platform tool
capable of facilitating seamless real-time translation between spoken languages (Hindi and English)
and ISL. By utilizing advanced technologies like speech recognition, natural language processing
(NLP), computer vision, and motion synthesis, the tool aims to provide accessible, effective, and
reliable communication support across diverse settings.
The proposed solution goes beyond a simple translation tool. It integrates dynamic gesture generation
through motion-capture technology, enables bi-directional communication through gesture
recognition, and supports offline functionality to make it accessible in remote or low-connectivity
areas. This comprehensive approach seeks to empower the deaf community, promote inclusivity, and
foster understanding, creating a society where everyone can communicate effectively regardless of
their linguistic abilities or disabilities.
Key Objectives:
1. Enable Real-Time Translation: Design and develop a robust system that can translate
spoken languages such as Hindi and English into Indian Sign Language (ISL) gestures in
real-time. The system will also support the reverse process, translating ISL gestures into
spoken or written languages to ensure two-way communication. Real-time translation is
critical to enable smooth, natural interactions in various real-world scenarios, such as
classrooms, hospitals, workplaces, and public spaces.
2. Ensure Multi-Platform Accessibility: Create a versatile tool that functions across multiple
platforms, including mobile applications, desktop environments, and browser extensions like
Chrome. This multi-platform design ensures users can access the tool from their preferred
device or environment, making it adaptable for use in educational institutions, workplaces,
public transportation, and even personal communication.
3. Integrate Dynamic Gesture Generation: Use motion-capture technology and advanced
2D/3D animation systems to generate highly dynamic ISL gestures. These gestures will adapt
to context-specific meanings and regional variations within ISL, ensuring accurate and
culturally appropriate communication. This feature will help maintain the integrity and clarity
of the intended messages during translation.
4. Facilitate Bi-Directional Communication: Enable two-way translation by incorporating
gesture recognition technology. The system will analyze ISL gestures performed by users
and translate them into spoken or written language, allowing deaf individuals to express
themselves effectively and participate in conversations without barriers.
7
5. Provide Offline Functionality: Equip the tool with offline capabilities to ensure it can be
used in areas with limited or no internet connectivity. By employing pre-trained models and
leveraging local storage, the system will deliver accurate translations without requiring
constant access to cloud-based resources, enhancing accessibility for users in rural or remote
locations.
6. Overcome Technical and Economic Challenges: Address potential challenges, such as
achieving high accuracy in real-time speech and gesture recognition, optimizing processing
speeds, and managing development costs. The project will rely on pre-trained AI models,
open-source libraries, and incremental development strategies to make the solution
technically feasible and economically viable.
7. Promote Scalability and Versatility: Design the system to be scalable and adaptable to a
wide range of use cases, including educational tools, assistive technologies, public service
interfaces, and more. By ensuring the solution's flexibility, it can evolve to address additional
needs and contexts as they arise, further expanding its impact.
8. Foster Community Engagement and Collaboration: Engage with the deaf and hard-of-
hearing community, ISL experts, linguists, and other stakeholders during the development
and testing phases. Their input will ensure the system is accurate, inclusive, and user-friendly.
Regular feedback and updates will help refine the tool over time, addressing real-world
challenges faced by the target audience.
9. Promote Social Inclusivity and Awareness: By addressing the communication gap, the
project seeks to foster a more inclusive society where individuals from the deaf and hearing
communities can interact without barriers. The tool will promote awareness about the
importance of accessible technologies and ISL, encouraging a broader acceptance and
understanding of the needs of the deaf community.
10. Enhance Accessibility and Empowerment: Empower the deaf community by providing
them with a reliable tool for communication. This will enable them to access education,
healthcare, employment, and public services more effectively, leading to greater
independence, confidence, and participation in societal activities.
Through these detailed objectives, this project aspires to create a transformative tool that not only
addresses immediate communication barriers but also contributes to building a more inclusive,
equitable, and connected society for all.
8
Survey/ Exploratory Study for Requirements Identification:
To ensure the successful development of an Indian Sign Language (ISL) Generator, it is essential
to conduct a comprehensive survey and exploratory study. This step will help identify the specific
requirements, preferences, and challenges of the target audience, including both the deaf
community and individuals unfamiliar with ISL. The study will also involve relevant stakeholders
such as educators, healthcare professionals, linguists, and technical experts. Below are the key
components and approaches for conducting the study:
1. Purpose of the Study

The purpose of the survey and exploratory study is to:
• Understand the communication barriers faced by the deaf community in various settings
(education, healthcare, public services, etc.).
• Identify user expectations, usability requirements, and specific functionalities desired in the
ISL generator.
• Assess the technical, linguistic, and cultural challenges associated with ISL translation.
• Gather insights from stakeholders to align the solution with real-world needs and scenarios.
2. Target Audience
The study will target the following groups to capture diverse perspectives:
• Primary Users:
o Deaf and hard-of-hearing individuals who rely on ISL for communication.
o Family members and caregivers of deaf individuals.
• Secondary Users:
o Teachers, interpreters, and social workers involved with the deaf community.
o Healthcare professionals, public service workers, and employers who interact with
ISL users.
• Subject Matter Experts:
o ISL linguists and researchers.
o AI and machine learning experts familiar with gesture recognition and NLP.
3. Methodologies
a. Online Surveys
• Objective: Collect quantitative and qualitative data on user requirements, challenges, and
expectations.
• Target Audience: Deaf individuals, their caregivers, and ISL interpreters.
• Key Questions:
o What challenges do you face in communicating with non-ISL users?
o Which features (e.g., real-time translation, offline use) are most important to you?
o What types of content do you want the tool to translate (e.g., educational materials,
public announcements)?
o What platform would you prefer (mobile, web, desktop)?
b. Focus Group Discussions
• Objective: Obtain in-depth insights into the daily communication experiences and
expectations of the deaf community.
• Participants: Deaf individuals, ISL interpreters, educators, and social workers.
• Discussion Topics:
o Common scenarios where communication barriers occur.
o Preferred formats for ISL gestures (2D/3D avatars, videos, etc.).
o Usability challenges with existing tools (if any).
c. One-on-One Interviews
• Objective: Gather detailed input from ISL experts, educators, and technical professionals.
• Focus Areas:
9
o Linguistic nuances of ISL, including regional variations.
o Limitations of current ISL translation technologies.
o Feasibility of implementing features like motion-capture gestures and bi-directional
translation.
d. Observational Studies
• Objective: Observe real-life communication scenarios where ISL is used to identify gaps
and opportunities.
• Settings: Schools for the deaf, hospitals, public service counters, and workplaces.
• Key Observations:
o How deaf individuals currently communicate with non-ISL users.
o The role of ISL interpreters and challenges in their absence.
4. Data Analysis
• Quantitative Analysis:
Use statistical tools to analyze survey responses, identifying common challenges and high-
priority features.
• Qualitative Analysis:
Transcribe and analyze focus group discussions and interviews to extract recurring themes,
unique requirements, and contextual nuances.
5. Expected Outcomes
• Clear understanding of the critical needs and expectations of ISL users.
• Insights into technological and linguistic challenges in ISL translation.
• Identification of key functionalities to prioritize in the ISL generator, such as:
o Real-time speech-to-gesture translation.
o Gesture recognition for ISL-to-text/speech conversion.
o Cross-platform usability with offline capabilities.
• Awareness of design preferences (e.g., user-friendly interfaces, avatar design).
• Alignment with real-world scenarios and social contexts.
6. Importance of the Study

This study ensures that the ISL generator is not only technologically advanced but also aligned
with the practical needs of its users. By involving stakeholders from diverse backgrounds, the
project will create a tool that is inclusive, accurate, and impactful, fostering better communication
and accessibility for the deaf community.
10
System Design
Existing System Currently, various tools and technologies exist that help with the communication
of the deaf community, but none offer the seamless, real-time translation between spoken
languages (Hindi/English) and Indian Sign Language (ISL) across multiple platforms.
Some existing systems include:

1. Speech-to-Text Applications: These apps convert spoken language into text but do
not address the gap between text and ISL. They are useful for some settings but
cannot bridge communication between hearing and non-hearing users when ISL is
required.
2. Video-based ISL Learning Platforms: Platforms like YouTube or dedicated
mobile apps provide educational content to help users learn ISL. However, they
require users to manually search for and learn gestures, which isn't efficient for real-
time, context-based translation.
3. Dedicated ISL Translators/Interpreters: ISL interpreters can translate spoken
language into ISL in real-time, but this relies heavily on human resources and is
often unavailable in everyday scenarios, creating a gap in public spaces, education,
healthcare, and other real-time environments.
4. Existing Gesture Recognition Systems: Systems like Google’s TensorFlow and
OpenPose support gesture recognition but do not specifically cater to ISL. These
systems lack the fine-grained understanding needed for regional variations and the
context-driven accuracy required for full communication.
While these systems provide partial solutions, none offer a complete, accessible, and real-time
solution for translating between spoken languages and ISL in a fully automated manner.
Proposed System
The proposed system is an AI-powered ISL Generator capable of real-time translation between
spoken languages (Hindi/English) and ISL, using advanced technologies such as speech
recognition, natural language processing (NLP), gesture recognition, and motion synthesis. Key
components of the proposed system include:
1. Real-Time Speech-to-ISL Translation:
The system will convert spoken Hindi or English into ISL gestures in real-time, using a
combination of speech recognition and NLP techniques. The ISL gestures will be
dynamically generated through 2D or 3D avatars or motion-capture animations.
2. Bi-Directional Translation:
The system will also support ISL-to-spoken-language translation. By using gesture
recognition, it will convert ISL signs into written text or speech, allowing communication
from deaf individuals to non-deaf individuals.
3. Cross-Platform Support:
The system will be accessible across various platforms, including mobile apps (for Android
and iOS), web-based interfaces (like a Chrome extension), and desktop environments. This
ensures that it can be used in multiple settings such as education, healthcare, public
services, and personal communication.
4. Motion Synthesis for Realistic ISL Gestures:
The system will generate lifelike ISL gestures using motion synthesis and animation
technologies. These gestures will be contextually accurate, adapting to regional dialects and
user-specific requirements.
5. User-Friendly Interface:
11
The system will feature an intuitive, easy-to-use interface designed for both hearing and
deaf individuals, ensuring accessibility for all users. Features such as voice commands,
gesture recognition, and adjustable UI components will enhance usability.
Advantages of the Proposed System:

The proposed system offers several advantages over existing solutions, which can be summarized as
follows:
1. Real-Time, Context-Aware Translation: Unlike existing speech-to-text or
gesture recognition systems, the proposed system provides real-time, context-
sensitive translation from spoken languages to ISL and vice-versa. This real-
time feature ensures smooth communication between deaf and hearing
individuals in various scenarios, such as classrooms, public spaces, and
hospitals.
2. Bi-Directional Communication: The system supports bi-directional
translation, which is a significant improvement over current systems that only
support one-way communication. Deaf individuals can use ISL to communicate
with hearing individuals, while hearing individuals can communicate back
using spoken language or written text.
3. Cross-Platform Accessibility: The system is designed to be used across
multiple platforms, including mobile apps, web-based interfaces, and desktop
environments. This ensures that users can access the tool from their preferred
device and in different contexts, such as during travel, in the workplace, or at
home.
4. Offline Functionality: Unlike existing systems that rely heavily on internet
connectivity, the proposed system will allow offline use. This is particularly
beneficial for users in rural or remote areas, where internet access may be
inconsistent or unavailable.
Conclusion
The proposed ISL Generator provides a transformative solution for the deaf community and society
at large. With its innovative approach to real-time, bi-directional communication, cross-platform
support, and offline capabilities, it addresses the current limitations of existing systems. By creating
a user-centric, accessible, and scalable tool, the system not only improves communication but also
contributes to a more inclusive and equitable society.
12
Software Requirements: -
Front End: Flutter, dart, Python
Backend:Python
Minimum Hardware Requirements

Processor : Intel(R) 2.10GHz
Installed memory (RAM) : 4 GB
Hard Disk : 160 GB
Operating System : Windows (10)
13
Scope of the Indian Sign Language (ISL) Generator and Detector
1. Functional Scope
1. Dual Language Translation:
• Speech-to-ISL Translation:
Translate spoken Hindi/English into Indian Sign Language (ISL) gestures in real-time using
speech recognition and natural language processing (NLP).
• ISL-to-Spoken Language Translation:
Detect ISL gestures and translate them into spoken Hindi/English using gesture recognition
and computer vision.
2. Platform Compatibility:
• Mobile Application:
Create an app for Android and iOS platforms that provides real-time, bi-directional
translation between spoken language and ISL.
• Web Interface:
Develop a Chrome extension and web-based platform for easy access on desktops and
laptops, supporting real-time translations in various environments.
3. Gesture Generation and Motion Synthesis:
• Dynamic ISL Gesture Creation:
Use motion-capture technology or 2D/3D avatars to generate lifelike and region-specific
ISL gestures, ensuring context-based accuracy.
• Real-Time Animation:
Generate real-time animated ISL gestures based on speech inputs, supporting effective
communication in both educational and everyday settings.
4. Offline Functionality:
• Local Storage for Offline Use:
Enable offline functionality by storing essential models and translation data locally,
allowing users to access the system without internet connectivity in areas with poor
network access.
5. User Interaction:
• Gesture Recognition Interface:
Provide an intuitive interface that allows users to input ISL gestures and receive translations
in spoken language or text.
• Voice Command Integration:
Allow users to interact with the system using voice commands for easy, hands-free control.
6. Cross-Platform Synchronization:
• Synchronization Across Devices:
Ensure that user data, such as saved translations and preferences, sync seamlessly across
devices, including mobile and desktop platforms.
2. Non-Functional Scope
1. Performance:
• Speed and Responsiveness:
Ensure the system translates spoken language into ISL and vice-versa with minimal delay
for seamless real-time communication.
• Scalability:
Design the system to handle increasing user base and translation requests without
compromising performance or accuracy.
2. Security:
• Data Privacy:
14
Implement secure user authentication and encryption to protect personal and usage data,
ensuring compliance with privacy regulations.
• Secure Gesture Data Handling:
Ensure that sensitive gesture data and models are stored securely and are not exposed to
unauthorized access.
3. Usability:
• User-Friendly Interface:
Develop an easy-to-use interface that is intuitive for both deaf and hearing users, ensuring
accessibility for all users regardless of technical proficiency.
• Customizable Settings:
Allow users to adjust settings such as text size, gesture speed, and language preferences to
improve accessibility and ease of use.
4. Maintainability:
• Modular Codebase:
Maintain a clean and modular codebase, enabling easy updates, bug fixes, and feature
additions without major overhauls.
• Clear Documentation:
Provide detailed documentation for developers, ensuring future updates and maintenance
can be carried out efficiently.
5. Reliability:
• High Availability and Stability:
Ensure the system remains operational with minimal downtime, providing a stable service
for real-time translation at all times.
• Error-Free Translation:
Focus on minimizing translation errors and ensuring the system provides contextually
accurate translations in various real-world scenarios.
6. Analytics and Feedback:
• Usage Analytics:
Implement analytics tools to track user behavior, identify common issues, and optimize the
system’s performance based on user data.
• User Feedback:
Integrate feedback mechanisms within the app to collect suggestions and critiques, enabling
continuous improvement of the system.
3. Out-of-Scope
1. Social Features:
• Full Social Networking Integration:
The system will not include complex social networking features like friend connections or
message feeds. It will focus solely on communication tools for translating spoken language
to ISL and vice-versa.
2. E-commerce Functionality:
• In-App Purchases or Sales:
The app will not involve any direct sales or e-commerce activities, such as selling ISL-
related learning materials or products.
3. Advanced AI Features:
• Custom AI Model Development:
The system will not include custom AI model development for tasks beyond speech
recognition, gesture recognition, and translation. It will rely on existing technologies like
TensorFlow and Google’s Cloud Speech API for AI-related functions.
4. Large-Scale Educational Content:
• ISL Learning Modules:
The system will not provide extensive ISL learning content or educational courses beyond
15
the core translation functions. The focus is on real-time communication and translation.
4. Future Considerations
1. Advanced Personalization:
• User-Centric Translation Engine:
The system could be enhanced with a more sophisticated recommendation engine,
personalizing the translation process based on user behavior, regional dialects, and specific
communication needs.
2. Interactive Learning Features:
• ISL Training and Practice Mode:
Add features for users to practice and learn ISL, such as interactive quizzes, games, and
real-time practice with the system's gestures.
3. Enhanced Gesture Database:
• Expand Gesture Library:
Continuously expand the ISL gesture database, incorporating new signs, regional dialects,
and more sophisticated motion capture animations to improve translation accuracy.
4. Integration with Other Assistive Technologies:
• Smart Devices and Wearables:
Explore potential integration with wearable devices (e.g., smart glasses, AR/VR headsets)
to provide more immersive and hands-free ISL translation experiences.
16
CHAPTER 2:- Gnatt Chart:
Concept and Planning Phase (3 weeks)

• Define project scope and objectives
• Research target market and competitors
• Identify key features and functionalities
• Create user personas and use cases
Design Phase (3 weeks)

• UI/UX design for mobile (Flutter).
• Wireframing and prototyping
• Design feedback and iterations
Development Phase (9 weeks)

• Set up development environment
• Implement core functionalities:
o Habit tracking (daily and weekly habits)
o Journaling features (weight, finance, sleep, calories)
o Data storage and synchronization
• Testing and bug fixing
Integration and Testing(Quality Assurance) (3 weeks)

• Integrate backend services (if applicable)
• Conduct unit testing and integration testing
• Gather user feedback for adjustments
17
CHAPTER 3 – Analysis and Design
System Diagram:
18
Use-case Diagram:
19
Activity Diagram:
Start
End
20
Data Flow Diagram:
21
Sequence Diagram:
22
Class Diagram:
23
CHAPTER 4:– User Interface
App Screenshots:
24
25
26
27
Source Code:
test.py:
import cv2
from cvzone.HandTrackingModule import HandDetector
from cvzone.ClassificationModule import Classifier
import numpy as np
import math
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
classifier = Classifier("Model/keras_model.h5", "Model/labels.txt")
offset = 20
imgSize = 300
counter = 0
labels = ["Hello", "I love you", "No", "Okay", "Please", "Thank you", "Yes"]
while True:
success, img = cap.read()
imgOutput = img.copy()
hands, img = detector.findHands(img)
if hands:
hand = hands[0]
x, y, w, h = hand['bbox']
imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255
# Crop the image around the detected hand

imgCrop = img[y - offset:y + h + offset, x - offset:x + w + offset]
# Check if imgCrop is valid (not None and has non-zero size)

if imgCrop is not None and imgCrop.size > 0:
imgCropShape = imgCrop.shape
aspectRatio = h / w
# Resize and position the cropped image on a white background

if aspectRatio > 1:
k = imgSize / h
wCal = math.ceil(k * w)
imgResize = cv2.resize(imgCrop, (wCal, imgSize))
imgResizeShape = imgResize.shape
wGap = math.ceil((imgSize - wCal) / 2)
28
imgWhite[:, wGap: wCal + wGap] = imgResize
prediction, index = classifier.getPrediction(imgWhite, draw=False)
print(prediction, index)
else:
k = imgSize / w
hCal = math.ceil(k * h)
imgResize = cv2.resize(imgCrop, (imgSize, hCal))
hGap = math.ceil((imgSize - hCal) / 2)
imgWhite[hGap: hCal + hGap, :] = imgResize
prediction, index = classifier.getPrediction(imgWhite, draw=False)
# Display predictions and bounding boxes

cv2.rectangle(imgOutput, (x - offset, y - offset - 70),
(x - offset + 400, y - offset + 60 - 50), (0, 255, 0), cv2.FILLED)
cv2.putText(imgOutput, labels[index], (x, y - 30),
cv2.FONT_HERSHEY_COMPLEX, 2, (0, 0, 0), 2)
cv2.rectangle(imgOutput, (x - offset, y - offset),
(x + w + offset, y + h + offset), (0, 255, 0), 4)
# Display cropped and processed images

cv2.imshow('ImageCrop', imgCrop)
cv2.imshow('ImageWhite', imgWhite)
else:
print("Warning: imgCrop is empty or None. Check cropping coordinates.")
cv2.imshow('Image', imgOutput)
cv2.waitKey(1)
29
data_collection.py:
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import time
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)
offset = 20
imgSize = 300
counter = 0
folder = "Data/Okay"
while True:
success, img = cap.read()
hands, img = detector.findHands(img)
if hands:
hand = hands[0]
x, y, w, h = hand['bbox']
imgWhite = np.ones((imgSize, imgSize, 3), np.uint8)*255
imgCrop = img[y-offset:y + h + offset, x-offset:x + w + offset]

imgCropShape = imgCrop.shape
aspectRatio = h / w
if aspectRatio > 1:
k = imgSize / h
wCal = math.ceil(k * w)
imgResize = cv2.resize(imgCrop, (wCal, imgSize))
wGap = math.ceil((imgSize-wCal)/2)
imgWhite[:, wGap: wCal + wGap] = imgResize
else:
k = imgSize / w
hCal = math.ceil(k * h)
imgResize = cv2.resize(imgCrop, (imgSize, hCal))
hGap = math.ceil((imgSize - hCal) / 2)
30
imgWhite[hGap: hCal + hGap, :] = imgResize
cv2.imshow('ImageCrop', imgCrop)
cv2.imshow('ImageWhite', imgWhite)
cv2.imshow('Image', img)
key = cv2.waitKey(1)
if key == ord("s"):
counter += 1
cv2.imwrite(f'{folder}/Image_{time.time()}.jpg', imgWhite)
print(counter)
signLanguageTranslator.py:
import speech_recognition as sr
import numpy as np
import matplotlib.pyplot as plt
import cv2
from easygui import *
import os
from PIL import Image, ImageTk
from itertools import count
import tkinter as tk
import string
def func():
r = sr.Recognizer()
isl_gif=['all the best', 'any questions', 'are you angry', 'are you busy', 'are you
hungry', 'are you sick', 'be careful',
'can we meet tomorrow', 'did you book tickets', 'did you finish homework',
'do you go to office', 'do you have money',
'do you want something to drink', 'do you want tea or coffee', 'do you
watch TV', 'dont worry', 'flower is beautiful',
'good afternoon', 'good evening', 'good morning', 'good night', 'good
question', 'had your lunch', 'happy journey',
'hello what is your name', 'how many people are there in your family', 'i
am a clerk', 'i am bore doing nothing',
'i am fine', 'i am sorry', 'i am thinking', 'i am tired', 'i dont understand
anything', 'i go to a theatre', 'i love to shop',
'i had to say something but i forgot', 'i have headache', 'i like pink colour',
'i live in nagpur', 'lets go for lunch', 'my mother is a homemaker',
'my name is john', 'nice to meet you', 'no smoking please', 'open the door',
'please call an ambulance', 'please call me later',
'please clean the room', 'please give me your pen', 'please use dustbin dont
31
throw garbage', 'please wait for sometime', 'shall I help you',
'shall we go together tommorow', 'sign language interpreter', 'sit down',
'stand up', 'take care', 'there was traffic jam', 'wait I am thinking',
'what are you doing', 'what is the problem', 'what is todays date', 'what is
your age', 'what is your father do', 'what is your job',
'what is your mobile number', 'what is your name', 'whats up', 'when is
your interview', 'when we will go', 'where do you stay',
'where is the bathroom', 'where is the police station', 'you are
wrong','address','agra','ahemdabad', 'all', 'april', 'assam', 'august', 'australia', 'badoda',
'banana', 'banaras', 'banglore',
'bihar','bihar','bridge','cat', 'chandigarh', 'chennai', 'christmas', 'church', 'clinic',
'coconut', 'crocodile','dasara',
'deaf', 'december', 'deer', 'delhi', 'dollar', 'duck', 'febuary', 'friday', 'fruits', 'glass',
'grapes', 'gujrat', 'hello',
'hindu', 'hyderabad', 'india', 'january', 'jesus', 'job', 'july', 'july', 'karnataka', 'kerala',
'krishna', 'litre', 'mango',
'may', 'mile', 'monday', 'mumbai', 'museum', 'muslim', 'nagpur', 'october', 'orange',
'pakistan', 'pass', 'police station',
'post office', 'pune', 'punjab', 'rajasthan', 'ram', 'restaurant', 'saturday', 'september',
'shop', 'sleep', 'southafrica',
'story', 'sunday', 'tamil nadu', 'temperature', 'temple', 'thursday', 'toilet', 'tomato',
'town', 'tuesday', 'usa', 'village',
'voice', 'wednesday', 'weight','please wait for sometime','what is your mobile
number','what are you doing','are you busy']
arr=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r',
's','t','u','v','w','x','y','z']
with sr.Microphone() as source:
r.adjust_for_ambient_noise(source)
i=0
while True:
print('Say something')
audio = r.listen(source)
try:
a=r.recognize_sphinx(audio)
print("you said " + a.lower())
for c in string.punctuation:
a= a.replace(c,"")
if(a.lower()=='goodbye'):
print("oops!Time To say good bye")
32
break
elif(a.lower() in isl_gif):
class ImageLabel(tk.Label):
def load(self, im):
if isinstance(im, str):
im = Image.open(im)
self.loc = 0
self.frames = []
try:
for i in count(1):
self.frames.append(ImageTk.PhotoImage(im.copy()))
im.seek(i)
except EOFError:
pass
try:
self.delay = im.info['duration']
except:
self.delay = 100
if len(self.frames) == 1:
self.config(image=self.frames[0])
else:
self.next_frame()
def unload(self):
self.config(image=None)
self.frames = None
def next_frame(self):
if self.frames:
self.loc += 1
self.loc %= len(self.frames)
self.config(image=self.frames[self.loc])
self.after(self.delay, self.next_frame)
root = tk.Tk()
lbl = ImageLabel(root)
lbl.pack()
lbl.load(r'ISL_Gifs/{0}.gif'.format(a.lower()))
33
root.mainloop()
else:
for i in range(len(a)):
if(a[i] in arr):
ImageAddress = 'letters/'+a[i]+'.jpg'
ImageItself = Image.open(ImageAddress)
ImageNumpyFormat = np.asarray(ImageItself)
plt.imshow(ImageNumpyFormat)
plt.draw()
plt.pause(0.8)
else:
continue
except:
print("Could not listen")
plt.close()
while 1:
image = "signlang.png"
msg="HEARING IMPAIRMENT ASSISTANT"
choices = ["Live Voice","All Done!"]
reply = buttonbox(msg,image=image,choices=choices)
if reply ==choices[0]:
func()
if reply == choices[1]:
quit()
34
CHAPTER 5 :– Testing and Evaluation of System
35
36
CHAPTER 6: – Limitations and Future Enhancements
Limitations of the System:
1. Limited Vocabulary:
The system can only recognize a predefined set of phrases and letters, restricting its
usability for conversations beyond the programmed inputs.
2. Dependency on Quality of Input:
The system struggles with low-quality audio input or unclear speech, affecting its ability to
recognize phrases accurately.
3. Resource Limitations:
Missing GIFs or images for some phrases or characters result in errors, reducing user
experience.
4. Language Restriction:
The system only supports English, making it inaccessible for non-English speakers.
5. Static Gesture Recognition:
It can only interpret static gestures or phrases with predefined GIFs and cannot process
dynamic gestures in real-time.
6. Hardware Dependence:
The system requires a microphone and camera for full functionality, limiting its use in
environments where these devices are unavailable or of poor quality.
7. Lack of Context Understanding:
The system cannot infer context or respond dynamically to inputs outside its training set.
8. High System Requirements:
Real-time gesture recognition and GIF processing may require significant computational
power, limiting its usability on low-spec devices.
Future Enhancements:
1. Expand Vocabulary:
Incorporate a broader range of phrases and words to improve versatility and cover a wider
range of communication scenarios.
2. Multilingual Support:
Add support for multiple languages to cater to users from diverse linguistic backgrounds.
3. Dynamic Gesture Recognition:
Develop capabilities for recognizing and interpreting dynamic hand gestures in real-time.
4. AI-Powered Context Understanding:
Use AI models to understand the context and provide more relevant and adaptive responses.
5. Cloud Integration:
Move resource-heavy processing to the cloud for better performance and lower hardware
dependency.
6. Offline Mode:
Implement an offline mode by storing key models and resources locally, allowing use
without internet connectivity.
7. Interactive User Interface:
Enhance the graphical interface for better user engagement, including options for
customization and visual feedback.
37
8. Error Recovery Mechanism:
Introduce mechanisms to handle errors, such as suggesting similar words or re-prompting
the user in case of unrecognized input.
9. Integration with Assistive Devices:
Connect the system to assistive devices like hearing aids or smart glasses to improve
accessibility for hearing-impaired users.
10. Learning Mode for Customization:
Add a learning mode where users can train the system to recognize new phrases, gestures,
or languages.
These enhancements aim to make the system more robust, accessible, and user-friendly for a
broader audience.
38
CHAPTER 7 :– Conclusion
The Sign Language Detector and Generator project serves as a vital tool in bridging the
communication gap between individuals with hearing or speech impairments and the broader
community. By leveraging computer vision, natural language processing, and interactive
interfaces, the system provides a platform for recognizing gestures and translating them into
meaningful outputs. The integration of speech recognition further enhances its usability, allowing
seamless two-way communication.
Although the current system has limitations, such as a restricted vocabulary, dependency on
predefined datasets, and hardware requirements, it lays a solid foundation for future advancements.
With enhancements like dynamic gesture recognition, multilingual support, and AI-driven context
understanding, the system has the potential to become a transformative solution in assistive
technology.
This project not only demonstrates the application of technology in solving real-world challenges
but also highlights the importance of inclusivity and accessibility in innovation. By continuing to
evolve and expand its capabilities, this system can significantly contribute to improving the quality
of life for individuals with communication barriers.
39
CHAPTER 8 – Bibliography or References
Bibliography/References
1. Books and Research Papers
o Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
o Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
2. Online Tutorials and Documentation
o OpenCV Documentation: https://docs.opencv.org/
o TensorFlow and Keras Documentation: https://www.tensorflow.org/
o NumPy Documentation: https://numpy.org/doc/
o Matplotlib Documentation: https://matplotlib.org/
3. APIs and Libraries Used
o CVZone Library: https://github.com/cvzone/cvzone
o SpeechRecognition Library: https://pypi.org/project/SpeechRecognition/
o PIL (Python Imaging Library): https://pillow.readthedocs.io/
4. Articles and Tutorials
o “Hand Gesture Recognition Using OpenCV” - GeeksforGeeks:
https://www.geeksforgeeks.org/
o "Creating Sign Language Recognition Systems" - Towards Data Science:
https://towardsdatascience.com/
5. Resources for ISL (Indian Sign Language)
o ISL Gifs and Phrases: Collected from Indian Sign Language Research and Training
Centre (ISLRTC)
6. Software Tools
o Python Programming Language: https://www.python.org/
o Tkinter GUI Toolkit: https://docs.python.org/3/library/tkinter.html
7. Acknowledgments
o Open-source contributors for datasets, tools, and libraries that supported this project.
o Faculty and peers who provided guidance during the development of the project.
These references were instrumental in developing the Sign Language Detector and Generator and
ensuring its technical and practical relevance.
40

Indian Sign Language Generator and Detector Team Qstar

Uploaded by

Copyright:

Available Formats

Indian Sign Language Generator and Detector Team Qstar

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Indian Sign Language Generator and Detector Team Qstar

Uploaded by

Copyright:

Available Formats

Deccan Education Society's

NAVINCHANDRA MEHTA INSTITUTE OF

“Indian Sign Language Generator and Detector”

C23085 – Nikhil Patil

C23122 - Siddharth Suryavanshi

C23131 - Suraj Ramshakal Vishwakarma

C23080 - Pal Deepak Ramesh

C23110 - Mohd Talha Imtiyaz Ahamad Shaikh

C23016 - Rohini Yashavant Bhavar

Mentor: Dr.Swapnali Mahadik

Submitted to University of Mumbai

Internal Guide Director

Chapter Contents Page

II Project Planning and Scheduling

III Analysis and Design

VI Limitations and Future Enhancements

1. Purpose of the Study

6. Importance of the Study

Some existing systems include:

Advantages of the Proposed System:

Minimum Hardware Requirements

Concept and Planning Phase (3 weeks)

Design Phase (3 weeks)

Development Phase (9 weeks)

Integration and Testing(Quality Assurance) (3 weeks)

imgWhite = np.ones((imgSize, imgSize, 3), np.uint8) * 255

# Crop the image around the detected hand

# Check if imgCrop is valid (not None and has non-zero size)

# Resize and position the cropped image on a white background

# Display predictions and bounding boxes

# Display cropped and processed images

imgWhite = np.ones((imgSize, imgSize, 3), np.uint8)*255

imgCrop = img[y-offset:y + h + offset, x-offset:x + w + offset]

Limitations of the System:

You might also like