Report on audio to video dubbing

Innovation/Entrepreneurship/Societal Based Internship 21INT68
Chapter 1
INTRODUCTION TO INNOVATION / INTERNSHIP
On the first day of the internship, we were introduced to innovation in new
technologies and to the concepts and problem statements issued by SIH (Smart
India Hackathon). We learned what innovation is and what it consists of. The
working definition is:
Innovation = Invention + exploitation
Innovation is something new that produces outcomes of value. It involves a
whole process, from opportunity identification and ideation or invention
through development, prototyping, and more. Entrepreneurship, by contrast,
need only involve commercialization.
It is said that innovation comes about through new combinations made by an
entrepreneur, resulting in:
 a new product
 a new process
 the opening of a new market
 a new way of organizing the business
 new sources of supply
fig 1.0 (a) Innovation complexity
fig 1.0 (b) Basic model for innovation management is interactive
Dept of CSE(AI&ML) 2023-2024 1

1.1. Discussion and finalization of title/topics of interest based on various problem statements with the respective guide
TOPIC: Developing software for dubbing videos from English to
other Indian regional languages
Description: This challenge involves the development of software capable of
dubbing videos from English to various Indian regional languages. The software
should provide high-quality dubbing and synchronization to make video content
more accessible and inclusive.
Most people prefer to watch movies or videos without subtitles, ideally in
their own language.
• A-Dubb is software that lets you enjoy a movie in your regional language.
It not only removes the effort of dubbing a movie or video into different
languages but also saves a lot of time and money.
• What makes A-Dubb different from other software is that it dubs in the
voice of the original artist rather than that of a regional artist.
Fig 1.1 (a)
Guides referred:
 Wikipedia
 SlideShare

Developing Software to Dub Video in Regional Languages

 In spite of many languages being spoken in India, it is difficult for the
people to understand foreign languages like English, Spanish, Italian, etc.
The recognition and synthesis of speech are prominent emerging
technologies in natural language processing and communication domains.
 People increasingly look to video as their preferred way to be better
informed, to explore their interests, and to be entertained. And yet a
video’s spoken language is often a barrier to understanding. For example,
a high percentage of YouTube videos are in English but less than 20% of
the world's population speaks English as their first or second language.
Voice dubbing is increasingly being used to transform video into other
languages, by translating and replacing a video’s original spoken
dialogue. This is effective in eliminating the language barrier and is also a
better accessibility option with regard to both literacy and sightedness in
comparison to subtitles.
 The majority of video content is produced in English, leaving those who
do not speak the language excluded from consuming it.
Solution Overview
 Dubbing Software:
We developed easy-to-use software that can convert English audio tracks
into multiple regional languages such as Hindi, Tamil, and more.
 Accuracy and Quality:
Our software uses machine learning models to improve the accuracy and
quality of dubs and to produce them faster.
 Cost Effective:
Our solution provides high-quality dubs at reduced cost and with
increased flexibility for content creators, enabling them to cater to
different regional audiences.

Development Process:
 Design and Conceptualization:
Fig:1.1(b)
We used agile methodologies to develop the software while following design
thinking principles to ensure a seamless user experience.
Collaboration:
Fig:1.1(c)
 Our software provides a scalable and cost-effective solution to the
problem of dubbing videos in regional languages, making video content
accessible to millions of people in India.
 By combining several machine learning libraries, we implemented the
software so that it dubs the audio track of a video.
Testing and Implementation:
Fig:1.1(d)
Multiple iterations of testing and feedback from our users helped us improve the
software and release it into production.
Benefits:
 High Quality Dubs:
Our software provides high-quality dubs that are easy to understand and
watch.
 Caters to Multiple Languages:
Our solution provides dubs in multiple regional languages, catering to a
broader audience.
 Lower Cost:
Our software is cost-effective and provides an alternative to expensive
professional studios and services.
 Increased Accessibility:
Accessible video content in regional languages can help bridge the
digital divide and provide broader access to knowledge and information.

1.2 RESEARCH PAPERS:

Each entry lists: Sl. no., author, year of publication, title of paper,
methodology, and problems.

1. Utkarsh Agrawal (2018)
Title: Automated movie dubbing system (A dub)
Methodology: Provide end-to-end translated movies/clips.
• The translated clips will retain the voices of original artists.
• Companies can save a lot of money by not hiring local artists.
• Also, auto-dubbing will save a lot of time and effort.
• The validations of translated/dubbed scripts will be done by professionals.
Problems: Loss of original actor’s performance and talent. Dubbing is a
common practice in the world of film and television, but it can take away
from the original actor’s performance. Dubbed actors often sound different
than their real-life counterparts, as they are not delivering their own
dialogue in an emotionally charged scenery.

2. Antony P. J. (2013)
Title: Machine Translation Approaches and Survey for Indian Languages
Methodology:
1. The term Machine Translation is a standard name for computerized systems
responsible for the production of translations from one natural language
into another with or without human assistance.
2. The literature shows that there have been many attempts in MT for English
to Indian languages.
Problems: Dubbed audio can be of lower quality and difficult to understand.
A final disadvantage of dubbing is that the audio quality may not always be
as good as the original source material and can sometimes be difficult to
understand. As a result, viewers may end up missing out on certain elements
of the story or even entire lines of dialogue due to poor sound quality.

3. Cliff Weitzman (2017)
Title: Audio dubbing on video
Methodology: Audio dubbing on video is a crucial element in the field of
multimedia and broadcasting, essential in reaching a global audience. It
refers to the process of replacing the original audio, usually dialogue, of
a video with another set in a different language.
Problems: Dubbed audio also presents an issue with syncing: often dialogue
will occur out of sync with the actors’ mouth movements, resulting in an
unnatural viewing experience for audiences. For these reasons, it’s
important for producers to take steps to ensure the highest possible sound
quality when dubbing their films and shows in order to provide viewers with
an enjoyable and engaging experience.

4. Srikar Kashyap Pulipaka (2019)
Title: Machine Translation of English Videos to Indian Regional Languages
using Open Innovation
Methodology: In spite of many languages being spoken in India, it is
difficult for the people to understand foreign languages like English,
Spanish, Italian, etc. The recognition and synthesis of speech are prominent
emerging technologies in natural language processing and communication
domains. This paper aims to leverage the open source applications of these
technologies, machine translation, text-to-speech.
Problems: Dubbing films or shows into another language can face challenges
in matching lip movements, preserving cultural nuances, and maintaining the
original emotional context. Sometimes, the translated dialogue might not
fully capture the intended meaning, leading to a loss of authenticity or
humor. Additionally, finding skilled voice actors who can convey the same
emotions and tone as the original actors can also be a hurdle.

5. B. S. Harish & R. Kasturi Rangan (2020)
Title: A comprehensive Survey on Indian regional language processing
Methodology: Any language that has evolved naturally in humans through its
usage over the time is called natural language. In this survey, the various
approaches and techniques contributed by the researchers for Indian regional
language processing are reviewed. The tasks like machine translation, Named
Entity Recognition, Sentiment Analysis and Parts-Of-Speech tagging are
reviewed with respect to Rule, Statistical and Neural based approaches. The
challenges which motivate to solve language processing problems are
presented. The sources of dataset for the Indian regional languages are
described.
Problems: Tokenization of the text. Some of the regional languages don't
have common delimiters like white space or punctuations.
Language structure, i.e. order of the words in the sentences, will differ
from one language to another.
Ambiguity in translation or transliteration of regional language words.

6. Varshul Gupta, Anuja Dhawan (2021)
Title: Dubbing video content in real time
Methodology: Dubbing involves several steps. First, a translated script is
created, aiming to match the lip movements as closely as possible while
retaining the essence of the original dialogue. Then, skilled voice actors
record the translated lines in a studio. Sound engineers synchronize these
recordings with the visuals, adjusting timing and pacing to match the lip
movements.
Problems: Dubbing videos from English to other regional languages presents
specific challenges. Beyond issues with lip-syncing and cultural nuances,
finding voice actors fluent in both languages and capable of delivering
performances that resonate with local audiences can be challenging.
Moreover, certain idiomatic expressions or wordplay in English might not
directly translate into other languages.

7. Wilson Wongso, Brandon Scott Buana, Ananto Joyoadikusumo (2023)
Title: Many to many multilingual translation model for languages of
Indonesia
Methodology: Indonesia is home to over 700 languages and most people speak
their respective regional languages aside from the lingua franca. In this
paper, we focus on the task of multilingual machine translation for 45
regional Indonesian languages and introduced Indo-T5 which leveraged the mT5
sequence-to-sequence language model as a baseline.
Problems: Sound engineers synchronize these recordings with the visuals,
adjusting timing and pacing to match the lip movements. Finally, quality
checks are done to ensure coherence and accuracy before the dubbed version
is released. This process requires precision, linguistic expertise, and
attention to detail to maintain the integrity of the original content in a
new language.

8. Frederic Chaume (2020)
Title: Dubbing
Methodology: Dubbing involves linguistic, cultural, technical and creative
team effort for the translation, adaptation and lip synchronisation of an
audiovisual text. Research on dubbing has grown exponentially, in parallel
with scholarship in audiovisual translation in general, ushering in a wide
array of works in professional and sociological studies, linguistic studies,
descriptive studies, ideological studies, cognitive studies and case studies
in the new field of audiovisual translation.
Problems: Adapting dialects or regional accents in dubbing might be
challenging, affecting the authenticity of the translation.
Risk of Mistranslation: Errors in translation or interpretation during the
dubbing process can result in inaccuracies and misunderstandings.
Dubbed versions might not be as widely available as the original content
with subtitles, reducing accessibility.

9. Aarati H. Patil; Snehal S. Patil; Shubham M. Patil; Tatwadarshi P.
Nagarhalli (2022)
Title: Real time machine translation system between Indian languages
Methodology: Language understanding is one of those perpetual challenges
which has dogged research from many decades. As a means of connecting with
others, but communication is not limited to a single language. India
contains around 121 different languages. As a result, there is a linguistic
barrier. Natural Language Processing is the process of developing a
communicational interface between machines and humans.
Problems: Script Adaptation: Translating and adapting scripts can be
challenging, potentially altering the intended meaning or message.
Limited Availability: Dubbed versions might not be as widely available as
the original content with subtitles, reducing accessibility.
Audience Preference: Some viewers prefer subtitles as they maintain the
authenticity of the original language and performances.

10. Ravinder Kumar, Inderveer Chana, Mukaan Singh (2020)
Title: Machine Translation system for Indian Languages
Methodology: With the advancement in computer language technology in a
multilingual country like India, numerous linguistics require technology for
translation. It aids in research for ancient languages like Sanskrit, Tamil,
Telugu, Malayalam to be available for society. Machine translation is one of
its essential areas. It plays a significant role in breaking the language
barrier and facilitating inter-lingual communication.
Problems: Dubbing might not be economically viable for all regional markets,
limiting the availability of content.
Educational Value: Subtitles allow viewers to learn the original language,
providing an educational aspect that dubbing does not offer.
The challenge in this communication process arises with the restriction that
language inflicts both at word and sentence level.


1.3 Supporting Software and its comparisons
1. What is dubbing software?

Dubbing software enables content creators and filmmakers to replace the audio
in their videos with professional-quality dubbed audio. Dubbing is useful for
localizing video content for audiences that speak different languages.
2. Supporting software for dubbing videos:
1. Wavel - An AI studio that offers an online video editing experience. It
advertises scaling videos 11x faster by generating natural-sounding voices.
2. Filmora - A video editor for all creators. Craft new worlds by layering clips
and using simple green screen effects.
3. Adobe Audition - A professional audio workstation. Create, mix, and design
sound effects with digital audio editing software.
4. Audacity - Free, open source, cross-platform audio software. Audacity is an
easy-to-use, multi-track audio editor and recorder for Windows, macOS,
GNU/Linux and other operating systems.
5. VideoDubber - AI-powered video translation, dubbing, voice cloning and
text-to-speech services. It advertises support for 30+ languages.
3. How to Choose the Right Dubbing Software
Selecting the right dubbing software will depend on your specific needs and
preferences.
 The first step is to determine what type of dubbing you plan to do. Are
you creating a podcast, adding voice-over to a video, or creating an audio
book? This will help narrow down your options. Different types of
dubbing software may be better suited for different types of projects.
 Next, consider the features each program offers. For example, how many
tracks are available? Does the program have sound effects or background
music options? What type of editing tools does it include? Do you need
support for multiple languages or formats? Make sure that any desired
features are available with the software before making your decision.

 Finally, compare prices and read reviews from other users who have tried
out the software before. Reading user feedback can give you valuable
insight into potential limitations or bugs in the program and how easy it is
to use. Once you've found a few promising programs that meet your
needs and budget, take some time to test them out yourself so that you
can make an informed decision about which one is best for your project.
 Compare dubbing software according to cost, capabilities,
integrations, and user feedback.
4. Advantages:
 Increased Efficiency: Dubbing software increases efficiency by
automating the process of creating multiple recordings from a single
master audio file. This makes it easier and faster for audio engineers to
create multiple versions of a project without having to manually mix each
one.
 Improved Quality: Dubbing software also has the potential to improve the
quality of recordings, as it can be used to apply effects such as
equalization, compression, and reverb to get the desired sound.
 Reduced Costs: Since dubbing software streamlines the creation process
and reduces the need for additional hardware or personnel, it can save
time and money on projects.
 More Control over Final Product: Dubbing software allows audio
professionals more control over their final product, enabling them to
make quick adjustments if needed. They can also preview their work
before committing to a final version, which helps catch errors before
release.
 Multi-language Capability: With dubbing software, you can easily record
or recreate audio in different languages.
 Versatility: Dubbing software can be used for a variety of applications,
from creating video games to creating soundtracks for films and
television shows.
Disadvantages:
 Lip Sync Issues: Dubbing may not perfectly match the lip movements
of the actors, leading to a disconnect between audio and visual elements.
 Loss of Authenticity: Original performances and nuances in language
or tone may be lost, impacting the authenticity of the content.
 Cultural Misinterpretation: Dubbing can sometimes alter cultural
references or nuances, potentially leading to misinterpretation or loss of
intended meaning.
 Voice Mismatch: The dubbed voice might not suit the original actor's
appearance or character, affecting the overall viewing experience.
 Quality Variation: Dubbing quality can vary, and poorly executed
dubbing may result in a distracting viewing experience.
 Emotional Impact: Dubbed voices may not convey the same
emotional depth as the original performances, diminishing the impact of
dramatic scenes.
 Limited Availability of Voice Talents: Finding suitable voice actors
for dubbing can be challenging, leading to a restricted pool of available
talents.
 Technical Challenges: Achieving seamless synchronization between
dubbed audio and video can be technically challenging, especially in
complex scenes.
 Loss of Original Language Experience: Dubbing eliminates the
opportunity for viewers to experience the original language and its
cultural nuances.
1.4 Methodology and related Block diagram/Flowchart

1. Machine Learning approach
Video dubbing using machine learning involves automatic transcription
and translation of the original audio. Machine learning models can
generate new audio tracks in the target language or use text-to-speech
synthesis. The generated audio is then synchronized with the video
through lip movement analysis. Quality assessment and adjustments may
be necessary.
Fig:1.4(a)
 Speech Recognition:
- Use a pre-trained speech recognition model to convert the English audio in
the video into text.
- Libraries such as the SpeechRecognition package for Python, or cloud
services such as Google Cloud Speech-to-Text, can be employed.
1. Selecting a Speech Recognition Library:
- Choose a speech recognition library based on your project requirements.
SpeechRecognition is a user-friendly option for Python developers; it wraps
several recognition engines, including Google's web speech API, which
supports English.
2. Installing the Library:
- Install the chosen library using the appropriate package manager. For
SpeechRecognition, you can use pip (`pip install SpeechRecognition`).
3. Importing the Library:
- In your Python script, import the package under its module name,
`speech_recognition`.
4. Loading the Audio File:
- Read the audio file containing the English speech that you want to
transcribe. SpeechRecognition reads WAV, AIFF, and FLAC files directly; other
formats such as MP3 must be converted first.
5. Initializing the Recognizer:
- Create an instance of the `Recognizer` class provided by the library.
6. Speech Recognition Process:
- Use the recognizer to perform the speech-to-text conversion. This involves
reading the audio file into memory and passing the audio to a recognition
method.
7. Handling Exceptions:
- Implement error handling to manage potential issues, such as a missing audio
file, unintelligible speech, or a failed request to the recognition service.
8. Extracted Text Result:
- The transcribed text is returned as a string, referred to here as the
`text_result` variable.
9. Transcribing with Cloud Service:
- Alternatively, use the cloud service to transcribe audio from a file.
10. Accessing Cloud Results:
- Extract the transcribed text from the cloud service response.
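The speech recognition steps above can be sketched as follows. This is a minimal illustration, not the code used in the internship: the function names are hypothetical, `transcribe_wav` assumes the SpeechRecognition package is installed and a network connection is available, and `transcribe_with_cloud` assumes the google-cloud-speech package, configured Google Cloud credentials, and short 16-bit PCM audio. The third-party imports are placed inside the functions so that the duration helper works with the standard library alone.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds (standard library only)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_wav(path, language="en-US"):
    """Transcribe a WAV file with SpeechRecognition; return None on failure."""
    import speech_recognition as sr  # pip install SpeechRecognition

    recognizer = sr.Recognizer()
    try:
        # AudioFile reads WAV, AIFF, and FLAC; convert MP3 beforehand.
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source)  # load the whole file
        # Sends the audio to Google's web speech endpoint.
        return recognizer.recognize_google(audio, language=language)
    except FileNotFoundError:
        print(f"Audio file not found: {path}")
    except sr.UnknownValueError:
        print("Speech could not be understood.")
    except sr.RequestError as err:
        print(f"Recognition service unavailable: {err}")
    return None


def transcribe_with_cloud(path, language="en-US", sample_rate=16000):
    """Transcribe short audio with Google Cloud Speech-to-Text."""
    from google.cloud import speech  # pip install google-cloud-speech

    client = speech.SpeechClient()  # requires Google Cloud credentials
    with open(path, "rb") as handle:
        audio = speech.RecognitionAudio(content=handle.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate,
        language_code=language,
    )
    response = client.recognize(config=config, audio=audio)
    # Each result carries ranked alternatives; keep the top one.
    return " ".join(r.alternatives[0].transcript for r in response.results)
```

Checking the duration first is a cheap way to decide whether a file is short enough for a single synchronous request or should be split into segments.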
 Translation:
- Apply a machine translation model to translate the English text into the
target language.
- Models based on the Transformer architecture, which can be trained with
toolkits such as OpenNMT, handle complex language structures well.
Detailed explanation of the "Translation" step in video dubbing, focusing
on applying a machine translation model:
1. Choosing the Machine Translation Model:
- Select a machine translation model based on the target language and the
specific requirements of your video dubbing project. Of the many models
available, transformer-based models are widely used for their ability to
handle complex language structures.
2. Preprocessing the English Text:
- Prepare the English text obtained from the previous step (speech
recognition) for translation. Ensure that the text is cleaned and formatted to
improve the quality of the translation. This may involve removing unnecessary
punctuation or special characters and splitting the text into sentences or
segments.
3. Selecting a Transformer Model:
- For translation from English into another language, pretrained transformer
models are a suitable choice. The Transformer is an architecture introduced
by Google researchers, and OpenNMT is an open-source toolkit for training
such models. Publicly released checkpoints have been pretrained on large
parallel corpora, and multilingual ones cover many language pairs.
4. Loading the Pretrained Model:
- Use libraries such as Hugging Face Transformers, or TensorFlow-based
sequence-to-sequence implementations, to load pretrained transformer models.
These libraries provide easy access to a wide range of transformer-based
models.
5. Performing Translation:
- Feed the preprocessed English text into the selected transformer model,
which generates the translation in the target language. Hugging Face's
Transformers library is one option for this step.
6. Postprocessing the Translation:
- After translation, postprocess the translated text to ensure coherence and
readability. This may involve adjusting word order, handling idiomatic
expressions, and making minor corrections for context.
7. Quality Assessment:
- Use automatic metrics, human reviewers, or both to evaluate the accuracy and
fluency of the translation. The BLEU score is a common automatic metric, and
human evaluation complements it.
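The preprocessing and translation steps above might look like the sketch below. It is illustrative only: the function names are hypothetical, the sentence splitter is a simple regular-expression heuristic, and the checkpoint `Helsinki-NLP/opus-mt-en-hi` (English to Hindi) is an assumption chosen for the example rather than the model used in the project. `translate_sentences` needs the transformers package and downloads the model on first use.

```python
import re


def split_sentences(text):
    """Collapse whitespace, then split after '.', '!' or '?'."""
    cleaned = re.sub(r"\s+", " ", text).strip()
    if not cleaned:
        return []
    return re.split(r"(?<=[.!?])\s+", cleaned)


def translate_sentences(sentences, model_name="Helsinki-NLP/opus-mt-en-hi"):
    """Translate English sentences with a pretrained translation checkpoint."""
    # pip install transformers sentencepiece torch
    from transformers import pipeline

    translator = pipeline("translation", model=model_name)
    outputs = translator(sentences)  # one dict per input sentence
    return [item["translation_text"] for item in outputs]
```

Translating sentence by sentence keeps each input short, which also makes it easier to map every translated line back to its time slot in the video.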
 Voice Synthesis:
- Use a text-to-speech (TTS) model to convert the translated text into
synthesized speech in the target language.
- Tacotron and WaveNet are examples of TTS models that can generate
natural-sounding speech.
Detailed explanation of the "Voice Synthesis" step in video dubbing,
including the use of text-to-speech (TTS) models:
1. Selecting a Text-to-Speech (TTS) Model:
- Choose a TTS model that can generate natural-sounding speech in the target
language. TTS models have evolved significantly in recent years, and two
popular choices are Tacotron and WaveNet.
2. Setting Up the TTS Model:
- Depending on your choice of TTS model, you may need to install and
configure the necessary libraries and dependencies. For Tacotron and WaveNet,
open-source TensorFlow implementations are available on GitHub.
3. Loading the Pretrained TTS Model:
- Download or load pretrained Tacotron and WaveNet models. Pretrained models
and weights are published on platforms like GitHub and can be loaded with
TensorFlow.
4. Processing Translated Text:
- Prepare the translated text obtained from the previous step (translation) for
voice synthesis. Ensure that the text is correctly formatted, and consider
segmenting it into smaller units, such as sentences or phrases, to improve the
synthesis quality.
5. Voice Synthesis with Tacotron:
- If you are using Tacotron, feed the processed text into the model to obtain
mel spectrograms, which represent the acoustic features of the speech. A
Tacotron 2 implementation in TensorFlow can be used for this.
6. Voice Synthesis with WaveNet:
- Once you have the mel spectrograms, use the WaveNet model as a vocoder to
convert them into natural-sounding speech waveforms.
7. Quality Assessment:
- Evaluate the synthesized speech for naturalness, fluency, and clarity. Listen
to the generated speech and, if necessary, collect listener ratings to compute
a Mean Opinion Score (MOS).
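As a lightweight stand-in for the Tacotron and WaveNet steps, the sketch below uses gTTS, which this report names later as its text-to-speech library. It is not a Tacotron implementation. The function names are hypothetical, `synthesize_speech` assumes the gTTS package and network access, and the MOS helper assumes the common convention of listener ratings from 1 (bad) to 5 (excellent).

```python
from statistics import mean


def synthesize_speech(text, out_path, lang="hi"):
    """Save `text` as spoken MP3 audio using gTTS (requires network access)."""
    from gtts import gTTS  # pip install gTTS

    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def mean_opinion_score(ratings):
    """Average listener ratings given on a 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)
```

A score is only meaningful when several listeners rate the same clips under similar conditions, so the helper is best applied per clip or per language rather than across unrelated samples.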
 Synchronization:
Align the synthesized speech with the original video by considering lip
movements and natural pauses.
Use techniques like forced alignment or advanced lip synchronization models to
achieve accurate synchronization.
Detailed explanation of the "Synchronization" step in video dubbing,
which involves aligning the synthesized speech with the original video:
1. Import Original Video and Synthesized Speech:
- Start by loading the original video and the synthesized speech (audio)
generated in the previous steps. You should have both the video and audio files
ready for synchronization.
2. Lip Movement and Viseme Analysis:
- Analyze the original video to identify key features, such as lip movements
and visemes (visually distinguishable mouth shapes associated with speech
sounds). This step is crucial for achieving accurate lip synchronization.
3. Forced Alignment (Phoneme-Level Synchronization):
- Use forced alignment to obtain phoneme-level timing. Forced alignment tools
like Penn Forced Aligner or Montreal Forced Aligner align an audio recording
with its transcript and return timestamps for each word and phoneme. Comparing
these timestamps for the original and the synthesized speech shows where the
dubbed audio must be shifted or stretched to follow the mouth movements.
4. Frame-Level Alignment:
- Implement frame-level alignment techniques to ensure that the speech and
video are accurately synchronized. This involves mapping each frame of the
video to specific timestamps in the audio.
5. Handling Natural Pauses:
- Identify natural pauses in the speech and video, such as pauses between
sentences or phrases. Adjust the synchronization to allow for these pauses,
ensuring that the dubbing sounds natural.
6. Advanced Lip Synchronization Models (Optional):
- Consider using advanced lip synchronization models, which use deep
learning techniques to predict the precise lip movements for a given audio input.
These models are trained on large datasets of videos and their corresponding
speech. They can be more accurate in matching lip movements to speech,
resulting in a more lifelike dubbing.
7. Visual Feedback and Manual Adjustment:
- Visually review the synchronized video to check for any misalignments or
artifacts. Manually adjust the synchronization if necessary, particularly for cases
where automated methods might not capture subtle nuances.
8. Iterative Refinement:
- Continuously refine the synchronization process based on feedback and
quality assessment. Consider user feedback and make improvements as needed
to achieve the highest level of accuracy and naturalness.
9. Export the Synchronized Video:
- Once you are satisfied with the synchronization, export the final video with
the synchronized audio. Make sure the video file format and settings are
appropriate for your intended distribution platform.
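The timing side of synchronization can be illustrated with a small calculation. The sketch below is an assumption-laden simplification, not a lip synchronization model: given the time slot of an original utterance and the duration of its synthesized replacement, it decides whether to pad with silence or speed the dub up. The `Segment` class, the `fit_segment` function, and the 1.25 speed-up cap are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float         # slot start in the original video, in seconds
    end: float           # slot end in the original video, in seconds
    dub_duration: float  # length of the synthesized clip, in seconds


def fit_segment(seg, max_speedup=1.25):
    """Return (playback_rate, trailing_silence, overflow) for one segment."""
    slot = seg.end - seg.start
    if slot <= 0 or seg.dub_duration <= 0:
        raise ValueError("durations must be positive")
    if seg.dub_duration <= slot:
        # Dub fits: play at normal speed and pad the rest with silence.
        return 1.0, slot - seg.dub_duration, 0.0
    ideal = seg.dub_duration / slot
    if ideal <= max_speedup:
        # Speeding up by `ideal` makes the dub fill the slot exactly.
        return ideal, 0.0, 0.0
    # Cap the rate to keep speech intelligible; report what still overflows.
    overflow = seg.dub_duration / max_speedup - slot
    return max_speedup, 0.0, overflow
```

For example, a 3-second dub in a 4-second slot is played unchanged with 1 second of trailing silence, while a 6-second dub in the same slot is capped at the maximum rate and reports the remaining overflow so that the translation can be shortened or the next pause borrowed.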
2. Traditional approach
Dubbing a video using traditional methods typically involves the
following steps:
1. Translation and Script Preparation: Translate the original script and
dialogue from the source language to the target language. Make sure the
translated script matches the timing and lip movements of the original
video as closely as possible.
2. Voiceover Recording: Hire voice actors who are native speakers of
the target language. They will read the translated script while watching
the original video. Ensure that the voice actors' timing matches the lip
movements of the original actors.
3. Audio Editing: Edit the voiceover recordings to match the timing and
pace of the original video. This may involve cutting, splicing, and
synchronizing the audio with the video.
4. Mixing and Sound Design: Balance the audio tracks, add background
music, sound effects, and adjust audio levels for clarity and coherence.

5. Video Editing: Replace the original audio track with the dubbed
audio. Ensure that the dubbed audio is properly synchronized with the
video.
6. Quality Control: Review the dubbed video to ensure that the lip-sync
is accurate, the audio quality is good, and there are no noticeable
discrepancies.
7. Exporting and Distribution: Once you are satisfied with the dubbed
video, export it in the desired format and resolution. You can then
distribute the dubbed video to your intended audience.
 The traditional method of dubbing is labor-intensive and often requires
professional expertise in translation, voice acting, audio editing, and
video editing. It is commonly used in the film and television industry to
create high-quality localized versions of content for different regions.
1.5 Discussing its Implementation (via method one – using machine learning):
1. Using API
2. Collecting data and training a Machine Learning model to dub the video
 Using API
Using an API for video dubbing is not always a requirement, but it can
offer several advantages, depending on our specific needs and constraints.
Here are some reasons why we might consider using an API for video
dubbing:
1. Efficiency and Speed: APIs can automate and streamline the dubbing
process. They can handle tasks like transcription, translation, and
text-to-speech synthesis more quickly than manual methods. This can save a
significant amount of time, especially when dealing with large volumes of
content.
2. Accuracy: Machine learning-based APIs can provide reasonably accurate
transcriptions, translations, and synthesized speech. They avoid some
manual slips and can handle multiple languages and accents, which may be
challenging for manual dubbing, although their output should still be
reviewed.
3. Cost-Effectiveness: In some cases, using an API can be cost-effective,
especially for smaller projects or when we do not have access to a team of
professional voice actors and audio engineers. API usage typically
involves pay-as-you-go pricing.
4. Consistency: APIs can provide a consistent quality of dubbing across
multiple videos. This is important for maintaining a high standard of
content quality, which can be challenging to achieve with manual
dubbing.
5. Scalability: APIs can easily scale to handle a large number of videos,
making them suitable for projects with high volume requirements.
6. Multilingual Support: Many APIs support multiple languages,
making it possible to dub content into various languages without the need
for multilingual voice actors.
7. Ease of Integration: Most APIs come with software development kits
(SDKs) and libraries that make it relatively easy to integrate their services
into our applications or workflows.
 Training a Machine Learning model to dub the video
Video dubbing using Machine Learning (ML) is a complex and technically
challenging task.
The example for this approach is for educational purposes and is highly
simplified. We use the following libraries:
- `SpeechRecognition` for automatic speech recognition (ASR).
- `pydub` for audio processing.
- `moviepy` for video processing.
- `gTTS` (Google Text-to-Speech) for text-to-speech synthesis.
Each library handles one stage of the pipeline, in the order listed.
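The original listing is not reproduced in this report, so the following is a reconstruction sketch rather than the internship code. It shows how the stages could be chained: the audio is extracted, transcribed, translated, synthesized, and merged back into the video. All function names are hypothetical. The two moviepy helpers assume moviepy 1.x (the import path and `set_audio` differ in 2.x), and the transcription, translation, and synthesis stages are passed in as callables so that any implementation (for example SpeechRecognition and gTTS) can be plugged in.

```python
import os


def extract_audio(video_path, wav_path):
    """Write the video's audio track to a WAV file (assumes moviepy 1.x)."""
    from moviepy.editor import VideoFileClip  # pip install "moviepy<2"

    clip = VideoFileClip(video_path)
    try:
        clip.audio.write_audiofile(wav_path)
    finally:
        clip.close()
    return wav_path


def replace_audio(video_path, audio_path, out_path):
    """Save a copy of the video that uses the dubbed audio (moviepy 1.x)."""
    from moviepy.editor import AudioFileClip, VideoFileClip

    video = VideoFileClip(video_path)
    try:
        dubbed = video.set_audio(AudioFileClip(audio_path))
        dubbed.write_videofile(out_path)
    finally:
        video.close()
    return out_path


def dub_video(video_path, out_path, transcribe, translate, synthesize,
              extract=extract_audio, mux=replace_audio, workdir="."):
    """Run the dubbing stages in order and return the translated text."""
    wav_path = os.path.join(workdir, "original.wav")
    dub_path = os.path.join(workdir, "dubbed.mp3")
    extract(video_path, wav_path)      # 1. video -> English audio
    english = transcribe(wav_path)     # 2. audio -> English text (ASR)
    translated = translate(english)    # 3. English -> target language
    synthesize(translated, dub_path)   # 4. text -> dubbed audio (TTS)
    mux(video_path, dub_path, out_path)  # 5. dubbed audio -> video
    return translated
```

Passing the stages as arguments keeps the orchestration independent of any one library and makes it possible to test the flow with simple stand-in functions before wiring in real models.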
 Note that this example is oversimplified; a production implementation
requires assistance from a machine learning engineer.

1.6 Internship Conclusion

Overall, this internship has been an excellent and rewarding experience. I have
learnt a great deal from my work. Needless to say, the technical aspects of the
work I’ve done are not flawless and could be improved given enough time. As
someone with no prior experience with this concept whatsoever, I believe the
time I spent researching and exploring it was well worth it and contributed to
finding an acceptable solution for building a fully functional service. The two
main things whose importance I have learned are time-management skills and
self-motivation.
The internship also helped me find out what my strengths and weaknesses are,
which in turn helped me define the skills and knowledge I have to improve in
the coming time. It would be better if my command of the language were
sufficient to contribute fully to projects. After my master's degree, I think I
could start my working career. However, I could perform certain research tasks
better if I practised and learned more about the research methodologies applied
in this field. It would also be better if I could present and express myself more
confidently.
Finally, this internship has given me new insights and the motivation to pursue a
career in the software world.


1.8 VISIT: Visvesvaraya Museum
The Visvesvaraya Industrial and Technological Museum (VITM),
Bangalore, India, a constituent unit of the National Council of Science
Museums (NCSM), Ministry of Culture, Government of India, was established
in memory of Sir M. Visvesvaraya. The 4,000 m² (43,000 sq ft) building was
constructed in Cubbon Park, and was inaugurated by the first Prime Minister of
India, Pandit Jawaharlal Nehru, on July 14, 1962. The museum displays
industrial products, scientific models and engines.
Fig:1.8(a) VITM Fig:1.8(b)
This splendid museum, dedicated to the memory of Bharat Ratna M.
Visvesvaraya, the architect of modern Karnataka, houses various technical
inventions and offers a glimpse of the history of technological development
in the country. A major attraction here is a 1:1 scale replica of the Wright
Brothers’ Flyer, the world’s first piloted aircraft, and a ‘Flyer Simulator’ that
offers a delightful experience to visitors. At the Dinosaur Corner, one can also
witness the movement and sound of a life-sized animated Spinosaurus in a
recreated environment.

1.8.1 :HISTORY:
Honouring Sir M. Visvesvaraya, the All-India Manufacturers Organization and
Mysore State Board decided to create a science and technology museum in
Bangalore. The foundation stone was laid by Shri B. D. Jatti , Chief Minister of
Mysore, on 15 September 1958. The Visvesvaraya Industrial Museum Society
(VIMS) came to be registered as the nodal agency in order to pool resources
from various industrial houses. It was inaugurated by the first Prime Minister of
India, Pandit Jawaharlal Nehru, on July 14, 1962.
The first exhibition, 'Electricity', was opened to the public on July 27, 1965.
In the year 1970, VITM launched the Mobile/Moving Science Exhibition
(MSE) with 24 participatory exhibits mounted on a bus. The MSE Bus travels
throughout Southern India.
In 1978, many science museums, including VITM, parted from CSIR and were
brought under a newly formed society registered on 4 April 1978 as the National
Council of Science Museums (NCSM). In 1979 an extension was added to the
building, increasing the total area of the museum to 6,900 m² (74,000 sq ft).
NCSM set up additional science centres at Gulbarga (Karnataka) in
1984, Tirunelveli (Tamil Nadu) in 1987, Tirupati (Andhra Pradesh) in 1993 and
Kozhikode (Kerala) in 1997, which function under the direct
administrative control of VITM. Thus, VITM has become the southern zone
headquarters of NCSM.
The museum attracts nearly one million visitors a year, and is open on all days
(except Deepavali and Ganesha Chaturthi) from 09:30 to 18:00.
Fig 1.8.1 Dr. M. Visvesvaraya Statue

1.8.2 :GALLERY:
1. Visvesvaraya Museum Entrance:
Stepping into the Visvesvaraya Industrial & Technological Museum is an
entrancing journey into the realms of scientific marvels and technological
innovations. The museum serves as a vibrant tapestry, intricately woven with
exhibits that showcase the evolution of science and technology. Visitors are
greeted with a chronological odyssey through the history of machinery,
witnessing the transformation from rudimentary tools to sophisticated, high-tech
systems. Interactive physics displays bring abstract principles to life, allowing
hands-on exploration of concepts like gravity and electricity.
Robotics exhibits showcase the progression from early prototypes to advanced
artificial intelligence, offering a glimpse into the future of automation.
Aeronautics and aerospace displays transport visitors into the skies and beyond,
while communication technology exhibits illustrate the incredible evolution from
telegraph to the interconnected digital age.
The museum's dedication to education is epitomized in biotechnological
showcases, shedding light on groundbreaking advancements in medicine and
genetics. Industrial processes and automation exhibits provide insight into the
optimization of manufacturing.
In this captivating environment, the wonders of science and technology are not
just observed but experienced. The Visvesvaraya Industrial & Technological
Museum stands as a testament to human ingenuity, inspiring a deep appreciation
for the incredible progress that continues to shape our world.

2. Engine Hall:
The Engine Hall at Visvesvaraya Industrial and Technological Museum is a
captivating space that celebrates the marvels of engineering and technology.
Named after Sir M. Visvesvaraya, a distinguished engineer and statesman, the
museum is in Bangalore, India. The Engine Hall was set up in the year 1994. Over
50 exhibits arranged across 1000 sq m explain the evolution of mechanisms,
machines and devices that form the very foundation of modern technology.
The hall delves into the evolution of engines and machinery pivotal to
industrial and technological progress.
Upon entering, visitors are greeted by a diverse array of exhibits, spanning
different eras and technologies. The hall displays various engines, including
steam engines that powered factories during the industrial revolution, internal
combustion engines that revolutionized transportation, and modern turbines used
in aviation and power generation.
Fig:1.8.2(a) Engine hall Fig:1.8.2(b) Engine hall

Each exhibit is meticulously curated, offering detailed information about the
specific engine's history, mechanics, and its impact on society and industry.
Interactive displays, models, and informative panels provide an immersive
experience, making complex engineering concepts accessible to all visitors,
from enthusiasts to students and families.

The Engine Hall aims to educate and inspire curiosity about the development
and significance of engines throughout history. It serves as an
educational platform where visitors can witness firsthand the evolution of
engines and appreciate their crucial role in shaping the modern world.
3. Fun Science Gallery:
The ‘Fun Science’ Gallery was inaugurated on 23rd May 2008. The gallery, set up in
an area of 700 sq. metres, has 66 exciting hands-on exhibits.
The Fun Science Gallery at the Visvesvaraya Industrial and Technological
Museum (VITM) is an engaging and interactive space designed to make science
enjoyable and accessible for visitors of all ages. This gallery within the museum
aims to ignite curiosity and foster a love for science through hands-on exhibits
and demonstrations.
Filled with interactive displays, puzzles, optical illusions, and engaging
experiments, the Fun Science Gallery encourages visitors to explore various
scientific principles in a playful manner. Visitors can engage in activities that
demonstrate concepts related to physics, biology, mathematics, and more.
The exhibits are designed to stimulate the mind and provoke curiosity, allowing
visitors to actively participate in experiments and learn through direct
interaction. Whether it's understanding the laws of motion, experimenting with
light and sound, or exploring the wonders of magnetism, the Fun Science
Gallery offers an immersive experience that makes learning science an
enjoyable adventure.
It's an excellent space for children, families, and school groups, providing a
platform for experiential learning and fostering an appreciation for the wonders
of science and technology. The emphasis on hands-on activities ensures that
visitors not only learn scientific principles but also have fun while doing so.

Fig:1.8.2(c) fun science Fig:1.8.2(d) fun science
3. Electro Technical Gallery:
The "Electrotechnic" gallery was thrown open to the visitors of the museum by
Shri. Jawhar Sircar, Secretary, Ministry of Culture, Govt. of India on 8th April
2010. The gallery, which has been set up in an area of about 780 sq. metres, houses
fascinating exhibits on various topics in electrical technology. The gallery is a
journey through the spectacular world of electricity, from the classical
experiments to state-of-the-art technology.

Fig:1.8.2(e) Electro Technical gallery
Fig:1.8.2(f) Spark Theatre
4. Biotechnological Gallery:
The Visvesvaraya Industrial and Technological Museum (VITM) primarily
focuses on various aspects of science, technology, and industry, including
galleries showcasing fundamental scientific principles, engineering, and
historical technological advancements.
The Biotechnological Gallery focuses on illustrating the principles and
applications of biotechnology, covering areas such as genetic engineering,
bioinformatics, healthcare advancements, agricultural biotechnology, and more.
It aims to educate visitors on the significant role biotechnology plays in
various aspects of our lives.
Fig:1.8.2(g) Bio technology Fig:1.8.2(h)
5. Space Technology Gallery:

The “Space Technology” gallery was inaugurated by Shri. A S Kiran Kumar,
Chairman, Indian Space Research Organisation (ISRO) on November 28th, 2017
at 3 P.M. in the presence of Dr. K Kasturirangan, Chairman, National Education
Policy & Chairman, Karnataka Jnana Ayog and Ms. Riddhi Mishra, Deputy
Secretary (Museums), Ministry of Culture, Govt. of India. The gallery on Space
Technology, set up in an area of 700 sq. metres on the second floor of the Visvesvaraya
Industrial and Technological Museum, Bengaluru, presents the various facets of
Space Technology in an easy-to-comprehend way through several interactive and
immersive exhibits.
Fig:1.8.2(i) Sculpture of Rakesh Sharma
Fig:1.8.2(j) Space Technology Gallery
The Space Technology Gallery at Visvesvaraya Industrial and Technological
Museum is an awe-inspiring journey through the cosmos, highlighting the
marvels of space exploration and technology. This captivating gallery offers
visitors a comprehensive and interactive experience, showcasing the
advancements that have propelled humanity into the vast realm of outer space.
As visitors enter the gallery, they are greeted by displays featuring the history of
space exploration, from the early days of rocketry to the contemporary
achievements of space agencies around the world. Engaging multimedia
presentations provide insight into the challenges and triumphs of space missions,
capturing the imagination of visitors of all ages. The Space Technology Gallery
is likely to feature life-sized models of spacecraft, satellites, and rovers, allowing
visitors to get up close and personal with the engineering marvels that have
expanded our understanding of the cosmos. Interactive exhibits may simulate
aspects of space travel, such as zero gravity environments or the experience of
landing on other planets.
6. Chandrayaan-3:
Chandrayaan-3 is India's ambitious lunar exploration mission, following the
achievements of Chandrayaan-1 and Chandrayaan-2. Spearheaded by the Indian
Space Research Organisation (ISRO), the mission is designed to continue the
exploration of the Moon, emphasizing enhanced capabilities and technological
advancements. Chandrayaan-3 aims to deploy a lander and rover on the lunar
surface, conducting scientific experiments to deepen our understanding of Earth's
celestial neighbour.
Fig:1.8.2(k) Chandrayaan-3 model

The mission draws on the lessons learned from Chandrayaan-2 and incorporates
improvements to ensure greater success. Chandrayaan-3 is expected to have an
increased payload capacity, enabling more sophisticated scientific instruments
and experiments. It represents a crucial step in India's space exploration
endeavors, showcasing the nation's commitment to advancing its space
capabilities and contributing valuable data to the global scientific community.
The mission aligns with India's long-term space exploration goals, emphasizing
both technological innovation and scientific discovery on the lunar landscape.
Fig:1.8.2(l) Chandrayaan-3 model

7. Wright Brothers aeroplane:
The Wright brothers' aircraft at the Visvesvaraya Industrial and Technological
Museum (VITM) is a replica of their original 1903 Wright Flyer. VITM's replica
serves as a tribute to the pioneering aviation work of Orville and Wilbur Wright.
It is meticulously constructed to replicate the design and mechanics of the
original flyer, showcasing the historic significance of their achievements in
aviation. The replica serves educational purposes, allowing people to experience
and understand the groundbreaking technology and innovation behind the first
powered flight.

Fig:1.8.2(m)-Wright brothers aeroplane model

The Wright brothers built and flew the first successful powered airplane in 1903,
making history with their innovative design and determination. Their aircraft, the
Wright Flyer, made its historic flight at Kitty Hawk, North Carolina, achieving a
distance of 120 feet in 12 seconds.
8. Satellite Launching Station:
India's primary satellite launching station is the Satish Dhawan Space
Centre (SDSC), located on Sriharikota Island in the Indian state of Andhra
Pradesh. Named after Dr. Satish Dhawan, one of the pioneers of the Indian
space program, the centre serves as the main launch site for the Indian
Space Research Organisation (ISRO). The Satish Dhawan Space Centre
has multiple launch pads, including the First Launch Pad (FLP) and the
Second Launch Pad (SLP), each designed for specific types of launches.
The FLP is primarily used for launching smaller vehicles, while the SLP is
equipped to handle larger launch vehicles, including the Geosynchronous
Satellite Launch Vehicle (GSLV) and the Geosynchronous Satellite Launch
Vehicle Mark III (GSLV Mk III). GSLV Mk III, one of the most powerful
rockets in ISRO's fleet, is instrumental in launching heavier payloads,
including communication satellites and lunar missions such as
Chandrayaan-2 and Chandrayaan-3.
Fig:1.8.2(n) model of satellite launching station
Fig:1.8.2(o) Satellite launching station, Sriharikota, India
The Satish Dhawan Space Centre plays a crucial role in India's space exploration
endeavors, providing a strategic base for the launch of satellites for
communication, Earth observation, navigation, and scientific research. The
centre's geographic location on Sriharikota Island also enhances safety by
providing a clear trajectory over the Bay of Bengal for rocket launches.

1.8.3 :Conclusion:
The Visvesvaraya Industrial and Technological Museum (VITM) in Bangalore,
India, stands as a testament to the visionary engineer Sir M. Visvesvaraya.
Spanning various domains of science, technology, and innovation, the museum
celebrates India's scientific heritage. Through its interactive exhibits, the VITM
provides a hands-on learning experience, fostering curiosity and interest in
visitors of all ages. It's a hub for exploration, housing displays on physics,
electronics, aerospace, and other scientific disciplines, inspiring the pursuit of
knowledge and discovery.
The museum serves as an educational resource, bridging the gap between
theoretical knowledge and practical applications. Its efforts to engage visitors
through workshops, demonstrations, and interactive sessions reflect a
commitment to scientific literacy and technological advancement. The VITM
encapsulates Visvesvaraya's ideals of innovation, ingenuity, and progress,
inspiring future generations to contribute to India's scientific and technological
landscape. It stands as a symbol of tribute to Sir M. Visvesvaraya's legacy and a
beacon of inspiration for aspiring scientists, engineers, and innovators.

1.9 Industry Visit (Rakuten)

Rakuten Technology Conference
Topic: AI-nization
The Rakuten Technology Conference (RTC) of 2023, with the exciting theme
“AI-nization”, was conducted on November 18, 2023. It was the 15th edition of
the event. Tech leaders, engineers, researchers and tech enthusiasts from around
the world gathered in Bengaluru for talks and discussions on AI and to hear
about the latest trends in technology.
The event not only underscored Rakuten’s steadfast commitment
to technological advancement but also highlighted its pivotal role as a
pioneer and collaborator in shaping the future of the tech industry.
Fig 1.9
Guest Speakers:
1. Pankaj Rai, Chief Data Analytics Officer, Aditya Birla Group
2. Sameer Dhanrajani, CEO, AIQRATE
3. Goda Ramkumar, VP-Data Science, Swiggy

Fig 1.9
The agenda of the programme was divided into sessions:

1.9.1. Connect with the RTC Mainstage:
The different branches of Rakuten from all around the world
were connected with the RTC mainstage in Japan. The opening remarks
were delivered by Yasufumi Hirai, Group Executive Vice President,
Chief of Staff to the CEO, Rakuten Group. Here the theme of the
conference, AI-nization, was discussed: the "AI" in it should be read as
Augmented Intelligence, a subset of artificial intelligence in which AI
technologies assist humans rather than replace them.
A clear picture was given of the difference between Augmented
Intelligence and AI. Artificial intelligence (AI) is a self-learning
technology that can mimic and replace human cognitive functions.
Simply put, the machine takes the place of human intervention and
contact. Augmented Intelligence, on the other hand, supplements
human intelligence with AI to augment rather than replace it.
We were given the top 10 strategic technology trends of 2024, and
they are as follows:
1. AI trust, risk and security management.
2. Continuous threat exposure management.
3. Sustainable technology.
4. Platform Engineering.
5. AI-augmented development.
6. Industry cloud platforms.
7. Intelligent applications.
8. Democratized generative AI.
9. Augmented connected workforce.
10. Machine customers.
Fig 1.9.1
Example: If you've used Alexa, Siri, or another virtual assistant, you've used
augmented intelligence. Virtual assistants don't make decisions for us.
Instead, they provide the data when we need it.
The sponsors of the event were also presented.
1.9.2. Welcome Note of AI-nization:

The introduction to AI was given by Sunil Gopinath,
CEO, Rakuten India. While a number of definitions of artificial
intelligence (AI) have surfaced over the last few decades, John McCarthy
offers the following definition: “It is the science and engineering of
making intelligent machines, especially intelligent computer programs”.
Artificial intelligence existed before, but OpenAI's ChatGPT and DALL-E 2
and Google Bard are revolutionary.

Fig 1.9.2(a)
Evolution of AI:
Artificial Intelligence (AI) has transformed from a concept in the realm of
science fiction to an everyday reality impacting every facet of our lives. The
journey of AI has been marked by significant milestones and breakthroughs,
each reflecting the convergence of theoretical foundations, technological
advancements, and practical implementations. Here, we chronologically
trace the evolution of AI, showcasing its most important developments.
1. Pre-1950s: Theoretical Foundations
The roots of AI can be traced back to antiquity, with philosophers
attempting to explain the human mind as a symbolic system. However, the
modern field of AI truly began to take shape in the mid-20th century.
• 1843: Ada Lovelace, known as the world's first computer
programmer, proposed the idea that machines could manipulate
symbols in addition to numbers, laying a fundamental concept for AI.
• 1936: Alan Turing proposed the concept of a "universal machine"
(later known as the Turing Machine), a theoretical device that could
solve any computation given enough time and resources. This forms
the basis of the digital computer and the principle of computability.
• 1943: Warren McCulloch and Walter Pitts proposed the first
mathematical model of a neural network, opening up the possibility of
learning machines.
• 1949: Donald Hebb proposed a learning theory, now known as
Hebbian learning, which became a fundamental concept in the
development of artificial neural networks.

2. 1950s-1960s: Birth and Early Developments

The field of AI was officially born in the mid-20th century, starting with
the coining of the term "Artificial Intelligence."
• 1950: Alan Turing proposed the Turing Test to determine a
machine's ability to exhibit intelligent behaviour. In the same
year, Claude Shannon published a paper on programming a
computer to play chess.
• 1956: The Dartmouth Conference officially coined the term
"Artificial Intelligence." The participants, including John
McCarthy, Marvin Minsky, Allen Newell, and Herbert Simon,
became the leaders of AI research for several decades.
• 1957: Frank Rosenblatt invented the perceptron, an early
artificial neural network that could learn from data.
• 1958: John McCarthy developed LISP, a programming language
that became popular in AI research. LISP is a functional
programming language that was designed for easy manipulation
of data strings. As one of the oldest programming languages still
in use, Lisp offers several different dialects and has influenced
the development of other languages.
• 1959: Arthur Samuel developed a self-learning program to play
checkers, demonstrating the power of machine learning.
• 1965: Joseph Weizenbaum created ELIZA, a natural language
processing computer program, demonstrating the potential of AI
in understanding and generating human language.
3. 1970s-1980s: AI Winter and the Rise of Expert Systems

Despite the initial excitement, the lack of significant progress led
to a period known as the "AI Winter." The AI Winter was characterized by
reduced funding and interest in AI research due to its failure to achieve its
ambitious goals.
• 1972: Dendral, one of the first expert systems, was developed,
marking a shift in AI research towards solving specific problems.
• 1980: The Japanese Fifth Generation Computer Systems project
aimed to develop an "intelligent" computer, but ultimately it failed to
meet its objectives.
• 1986: The backpropagation algorithm was reintroduced, leading to a
resurgence in neural network research.
Fig 1.9.2(b)
4. 1990s-2000s: The Internet Era and Machine Learning

The advent of the internet provided a massive amount of data,
fueling the development of machine learning algorithms.
• 1997: IBM's Deep Blue defeated world chess champion Garry
Kasparov, marking a significant moment in the development of AI.
• 1999: Sony released AIBO, a robotic pet, demonstrating the
capabilities of AI.
• 2011: IBM's Watson showcased the power of AI in understanding
natural language by winning the TV quiz show Jeopardy!, beating
human champions.
• 2016: Google's AlphaGo program beat the world champion Go player,
Lee Sedol, marking a significant milestone in AI's ability to learn and
make decisions.
• 2018: The journey of Generative pre-trained transformers (GPT)
began in this year when OpenAI, a leading AI company in the United
States, introduced the first GPT model. This marked a significant
milestone in the field of generative artificial intelligence.
5. 2020s: GPT-3 and Beyond
• GPT Characteristics: GPT models are a type of large language
model (LLM) that utilise the transformer architecture. They are
trained on vast amounts of unlabelled text data, which enables them
to generate content that closely resembles human writing. As of 2023,
most LLMs share these characteristics and are often broadly referred
to as GPTs.
• GPT-n Series: OpenAI has released a series of increasingly advanced
GPT models, known as the "GPT-n" series. Each model in this series
has been more capable than its predecessor, thanks to increases in size
(number of trainable parameters) and training. These models have
formed the foundation for more task-specific GPT systems, including
models that are fine-tuned for following instructions. One such
application of these models is the ChatGPT chatbot service.
Fig 1.9.2(c)
• GPT-4 (March 2023): The latest model in the series, GPT-4, was
released. This model represents the current pinnacle of GPT
development by OpenAI. It can process up to 25,000 words versus
4,000 with GPT-3. It is 40% more likely to generate accurate
responses.
• Other GPT Models: The term "GPT" has also been adopted by other
organizations in their model names and descriptions. For instance,
EleutherAI has created a series of GPT foundation models, and
Cerebras has recently developed seven models. Additionally,
companies across various industries have developed GPT models
tailored to their specific needs. Examples include Salesforce's
"Einstein GPT" for customer relationship management (CRM) and
Bloomberg's "BloombergGPT" for finance.

6. Future Developments
• Launch Timeline: The debut of GPT-5 is predicted to occur in 2024.
Earlier, rumours about a 2023 release were refuted by OpenAI's
CEO, Sam Altman. However, a transitional model, GPT-4.5, is likely to be
introduced by October 2023.
• Anticipated Capabilities: GPT-5 is expected to exhibit less
hallucination, making it more reliable. It's also predicted to be more
efficient in terms of computation, which would lower the cost and
duration of operating the model. It could potentially be a multisensory
AI model, handling a variety of data types such as text, audio, images,
videos, depth data, and temperature. Another anticipated feature is the
support for long-term memory through a larger context length.
• Concerns about AGI: There's a belief that GPT-5 might achieve
Artificial General Intelligence (AGI), a form of AI that surpasses
human intelligence. This has sparked worries about potential risks and
the need for regulatory measures.
• OpenAI's Future Direction: OpenAI has become more reserved
about its operations and is less likely to share its models like GPT-4 or
GPT-5 with the open-source community. However, reports suggest
that OpenAI is developing a new open-source AI model for public
release.
Everyone is talking about deploying AI!
According to the Harvard Business Review, “AI is most
valuable when it is operationalized at scale. For business leaders who wish
to maximize business value using AI, scale refers to how deeply and widely
AI is integrated into an organization’s core product or service and business
processes”.
• 35% of companies globally report using AI in their business.
• 83% of organizations say that AI is a top priority in their future
business plans.
Rakuten’s AI-nization Plan
1. Vision: The vision is to augment human creativity with the power of AI.
2. Strategy: The base of the strategy is building a Solid Data Foundation
with customer context, enterprise knowledge with Rakuten and world
knowledge with OpenAI.
• Leveraging Rakuten Channels with online and offline Ecosystem.
• Driving growth with a rapid “Flywheel”. The cycle starts with AI
first, followed by professional review and finally customer validation.
3. Roadmap: A roadmap is a strategic plan that defines a goal.
• Wave 1: Rapidly prototype to experiment, test, learn.
• Wave 2: Focus on Rakuten AI for businesses and corporate functions.
• Wave 3: Deliver consumer-facing AI services at scale.
Applied AI at Rakuten
1. Credit Scoring: Customer Insights, Geo Data and Social Graph to
improve predictability across bank, card, insurance.
2. Super personalized recommendations in E-com: Knowledge Graphs.
3. Bioinformatics: Leverage the power of AI to analyze complex biological
data including cancer cells, genome, blood and tissue.
4. Computer Vision: It enables computers to interpret and analyze the visual
world, simulating the way humans see and understand their environment.
GenAI at Rakuten
1. Intelligent ChatBots: Mobile, Securities, Search.
2. Creative AI: Ad creatives and digital marketing, code generation
3. Engineering: No code/low code solutions democratize BI
dashboard development.
4. Customer Success: AI chatbots with new features have emerged to
improve the customer experience, such as chatbots that use machine learning to
predict what help a customer would need and provide proactive support.

1.9.3. AI Led Transformation at Aditya Birla Group
Fig 1.9.3(a)
The next part of the session was held by guest speaker Pankaj Rai, Chief
Data Analytics Officer, Aditya Birla Group. Pankaj Rai shared his insights
on the importance of having a well-defined strategy to achieve success in
life. He emphasized that a good strategy is essential for managing things and
adapting to new technologies that are rapidly changing our lives. He also
spoke about the finance and accounting area, strategy area, and economics
area. Pankaj Rai’s experience in strategy consulting and financial services
was evident in his discussion of data analytics and its role in
decision-making. He shared how data-driven decision-making can help businesses
gain a competitive edge. He also discussed his experience of working with
Wells Fargo and how he established digital capabilities to enable the bank’s
digital channels. Overall, Pankaj Rai’s session was an excellent opportunity
to learn from a seasoned professional in the field of data analytics. His
insights on strategy and data analytics were enlightening, and I am grateful
for the knowledge I gained from his session.
Key Insights:
1. Financial Guidance Beyond Numbers: Mr. Rai's exploration of
financial guidance was not confined to the numerical intricacies of balance
sheets and investments. He delved into the psychological and emotional
dimensions of financial decision-making, emphasizing the need for a
comprehensive understanding. His anecdotes and real-life examples
illustrated how financial wisdom goes beyond mere calculations to become
a tool for achieving life's broader goals.
2. Strategic Planning as a Life Skill: The session seamlessly transitioned
into the broader concept of strategic planning. Drawing parallels between
corporate strategy and life choices, Mr. Rai provided tangible examples of
how strategic thinking can empower individuals to navigate challenges
effectively. This holistic approach positioned strategic planning not just as a
business necessity but as a fundamental life skill.
3. Adaptability in the Technological Epoch: Mr. Rai's role as an AI
engineer added a layer of relevance to the discussion on adaptability in the
technological era. He shared personal experiences of witnessing and driving
technological advancements. His message resonated strongly with the
audience, urging aspiring professionals to not only embrace technological
changes but also to see them as opportunities for personal and professional
growth.
4. Effective Handling of New Topics: The session's focus on learning
extended to the art of effectively handling new topics. Mr. Rai's insights
offered a roadmap for acquiring knowledge in diverse domains such as
finance, accounting, and economics. He emphasized the importance of
curiosity, strategic engagement, and continuous learning, essential
components for success in an ever-evolving academic and professional
landscape.
5. Harmony of Strategy in AI Engineering: In a captivating segment, Mr.
Rai seamlessly merged the discourse on strategy with the realm of AI
engineering. He illuminated how strategic planning plays a pivotal role in the
development and deployment of artificial intelligence. For those aspiring to
navigate the complexities of cutting-edge technology, this intersection of
strategy and AI engineering provided invaluable guidance.
Enterprise Decision Making enabled by AI: The new next in Business growth and transformation
The fourth session was held by Sameer Dhanrajani, CEO,
AIQRATE & 3AI. The session began with the topic AI & Analytics. The
author for this edition is Sameer Dhanrajani.
AI and Analytics:
We are ushering in an AI era: an algorithm-driven economy
where algorithms sit at the core of every business model and in
the organizational DNA.
The first book of its kind in its genre, a must have primer for CXOs
for curating, developing and executing AI strategies in their
enterprises for end-to-end transformative impact. A valuable guide
for executives and aspiring professionals on how AI can transform
business, with deep focus on key industries and exponential
technologies. The book also showcases the immense AI adoption and
consumption scenarios in high impact and rapidly changing
industries.
Consider, for example, a real-life story. It is in the public domain and is
aimed squarely at the naysayers of AI. We often believe that AI is complex,
not for the masses, and an expensive technology. This is the story of a
gentleman in Japan.
Fig 1.9.3(b)
Makoto Koike is the young man standing at the back of the picture, with his
parents, in Japan. His parents are farmers who grow cucumbers. The cucumber
is an exotic variety in Japan, graded on about 89 variables such as shape,
texture, colour and dimensions. In one conversation with Makoto, his mother
said that their income from growing cucumbers had been going down over the
years, even though cucumbers are sold in every market, say fortnightly, in an
auction market. He had done a bit of analytics in his engineering days and
decided this was the best time to put that knowledge to use. So he went back
home, picked up his iPhone and, using its camera, took about 2500 pictures of
the cucumbers on the farm, covering different dimensions and angles so that
they could be told apart. Through Amazon he purchased a Raspberry Pi 3, which
can be bought for $35 to $40, and on it he set up Google TensorFlow, an
open-source, freely available tool for building machine learning models. The
idea was to label and annotate the pictures and correlate them with the
auction patterns of the cucumbers. His intent was to create an algorithm that
could identify the variety. At first it was right only about 30% to 35% of
the time, and he kept training it until it reached about 77%. He could then
tell his parents exactly what variety the cucumbers should be. According to
his parents, 18 months down the line their income had gone up by 400%, and
the cost associated with using AI was just $35. That is the magic, speakers
keep saying, of what AI becomes when it goes mainstream, and it is actually
not as complex as it is made out to be.
Algorithm Economy:
 Democratization of AI and Data as a service (DaaS) will lead to creation
of new marketplaces to buy and sell advanced analytics algorithms.
 Niche startups dealing with plug and play AI led algorithm portfolio and
a service-oriented end to end approach to unravel intelligence and insights.
 Algorithm economy will consist of millions of algorithms, each one
representing a piece of software code that solves a business problem or
creates new opportunity.
 AI platforms will be defined by the sophistication of their algorithms.
AI: The new normal in business strategy
1. Reimagining Customer Experiences.
2. Innovating products & services.
3. Transforming Business.
Decision making scale with AI: The Framework
1. Augment Intelligence:
 Bring data+algorithms+compute to every decision.
 Use AI to make BI contextual, personalized and real time.
 Find few signals in new data.

2. Automate and learn:

 Never send a human to do a machine’s job.
 Build learning systems.
 Leverage unsupervised learning.
3. Incorporate human behaviour:
 Reduce information overload.
 Account for human biases.
 “Nudge” the right behaviour.
Fig 1.9.3(c)
 Sentiment analysis of analyst reports, earnings releases, corporate
filings.
 Image processing to interpret body language of presenter.
 Voice analytics to determine tone.
 Satellite imagery to determine customer footfalls.
 Nitrous oxide emitted by a manufacturer.
 Women in management position.
What can AI do for India?

Swachh Bharat Abhiyan: AI, and specifically computer vision, can help
substantially improve success of this GOI initiative:
 Help quickly identify areas of the country with low access to
sanitation facilities.

 Allow citizens to photograph and report broken/unclean/unusable
sanitation facilities; this can be put through an AI-based system to
assess damage and speed up approvals for repair work.
 In case of natural disasters, drone images could be used to survey
areas affected and computer vision technology can again be used to
assess damage and interventions required.
Using Deep Learning to determine brand exposure from live
stream videos
 Use deep learning to translate relative position and on-screen
duration of logos into a structured data.
 Potential to link this data to social media activity and utilize into
marketing mix models.
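The first bullet can be illustrated with a minimal sketch. It assumes that a logo detector (not shown here, and not described in the session) has already produced per-frame detections. The `Detection` record, its field names and the 25 fps frame rate are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Detection:
    """One logo sighting in one video frame (hypothetical record format)."""
    frame: int   # frame index in the stream
    brand: str   # logo label from an assumed upstream detector
    x: float     # box centre, 0.0 (left) to 1.0 (right)
    y: float     # box centre, 0.0 (top) to 1.0 (bottom)


def summarize_exposure(detections, fps=25.0):
    """Aggregate raw detections into one structured row per brand."""
    grouped = defaultdict(list)
    for det in detections:
        grouped[det.brand].append(det)

    rows = []
    for brand, dets in sorted(grouped.items()):
        frames = {d.frame for d in dets}  # count each frame only once
        rows.append({
            "brand": brand,
            "seconds_on_screen": len(frames) / fps,
            "mean_x": sum(d.x for d in dets) / len(dets),
            "mean_y": sum(d.y for d in dets) / len(dets),
        })
    return rows


if __name__ == "__main__":
    sample = [
        Detection(0, "BrandA", 0.50, 0.50),
        Detection(1, "BrandA", 0.50, 0.50),
        Detection(1, "BrandB", 0.25, 0.75),
    ]
    for row in summarize_exposure(sample):
        print(row)
```

Rows of this shape could then be joined with social media activity by brand and time window, which is the link the second bullet points to.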
The session ended with a thought from Arthur C. Clarke, “Any
sufficiently advanced technology is indistinguishable from magic”, and AI is
that magic.
1.9.4. AI Demo Booth

a) LIP SYNC ALGORITHM:
Virtual characters changed the way we interact with computers.
The underlying key for a believable virtual character is accurate
synchronization between the visual (lip movements) and the audio
(speech) in real-time. This work develops a 3D model for the virtual
character and implements the rule-based lip-syncing algorithm for
the virtual character's lip movements. We use the Jacob voice
chatbot as the platform for the design and implementation of the
virtual character. Thus, audio-driven articulation and manual
mapping methods are considered suitable for real-time applications
such as Jacob. We evaluate the proposed virtual character using
hedonic motivation system adoption model (HMSAM) with 70
users. The HMSAM result for behavioural intention to use is
91.74%, and for immersion it is 72.95%.

Fig 1.9.4(a)
The average score for all aspects of the HMSAM is 85.50%. The
rule-based lip-syncing algorithm accurately synchronizes the lip movements
with the Jacob voice chatbot's speech in real-time.
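A rule-based lip-syncing step of this kind can be sketched as a lookup from phonemes to mouth shapes (visemes). The mapping table, the viseme names and the input format below are assumptions made for illustration; they are not the rules used in the demo.

```python
# Illustrative phoneme-to-viseme rules (assumed, not the booth's actual table).
VISEME_RULES = {
    "p": "closed", "b": "closed", "m": "closed",
    "f": "lip_teeth", "v": "lip_teeth",
    "a": "open", "e": "spread", "i": "spread",
    "o": "round", "u": "round",
}
DEFAULT_VISEME = "neutral"


def phonemes_to_keyframes(timed_phonemes):
    """Map (phoneme, start_seconds) pairs to (start_seconds, viseme) keyframes.

    Consecutive identical visemes are merged so the mouth is not re-posed
    when its shape does not change.
    """
    keyframes = []
    for phoneme, start in timed_phonemes:
        viseme = VISEME_RULES.get(phoneme, DEFAULT_VISEME)
        if keyframes and keyframes[-1][1] == viseme:
            continue  # same mouth shape as the previous keyframe
        keyframes.append((start, viseme))
    return keyframes


if __name__ == "__main__":
    speech = [("m", 0.00), ("a", 0.10), ("p", 0.20), ("b", 0.25)]
    print(phonemes_to_keyframes(speech))
```

Because each phoneme needs only a dictionary lookup, this style of rule fits the real-time requirement mentioned above.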
b) GEN AI 3D MODELLING
Creating and visualizing 3D models has been made more
accurate, accessible, and efficient through high-powered AI 3D object
generators. Whether you’re a graphic designer or a game developer, which
AI 3D object generator is best for you depends on your requirements.
These tools have revolutionized the process: 3D models can be crafted from
scratch using only images, text, or videos.
Fig 1.9.4(b)

c) Personalised RTC Banner Generation

Personalized Real-Time Communication (RTC) Banner Generation is a
dynamic strategy that tailors digital banners to individual users or specific
audience segments in real-time. This method is designed to elevate user
engagement, boost conversion rates, and enhance overall user experience.
The heart of this technology lies in its ability to harness advanced algorithms
and artificial intelligence for dynamic content customization. By leveraging
real-time data analysis, these systems can adapt and tailor banner content
on-the-fly. A critical component of personalized RTC banner generation is the
collection of user data. Ethical and consensual data gathering ensures the
creation of user profiles that serve as the foundation for crafting personalized
banners. This data encompasses user preferences, behaviour patterns, and
demographic information. The dynamism of personalized RTC banners is
maintained through real-time updates. Content is continuously adjusted
based on user interactions, preferences, and other pertinent real-time data
sources. This ensures that the content remains relevant and engaging,
contributing to a more personalized user journey.
Numerous industries have embraced personalized RTC banners with
notable success. In e-commerce, for instance, personalized product
recommendations significantly elevate user engagement and drive sales.
These real-world applications underscore the versatility and effectiveness of
personalized RTC banner strategies. However, like any technology,
personalized RTC banner generation comes with challenges. Privacy
concerns, data security, and ethical considerations in data usage must be
carefully navigated. Striking a balance between personalization and user
privacy is pivotal for the sustained success of this approach.
Fig 1.9.4(c)
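The selection step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `UserProfile` fields, the banner texts and the rule "latest interaction wins over stored preference" are all hypothetical and are not taken from the demo.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Consented profile data (all fields are illustrative)."""
    preferred_category: str = ""
    recent_views: list = field(default_factory=list)


BANNERS = {
    "electronics": "Deals on headphones and phones",
    "books": "New releases picked for you",
    "default": "Welcome! See today's top offers",
}


def choose_banner(profile):
    """Pick banner text, weighting recent behaviour over stored preference."""
    # The latest interaction is the most real-time signal available.
    for category in reversed(profile.recent_views):
        if category in BANNERS:
            return BANNERS[category]
    return BANNERS.get(profile.preferred_category, BANNERS["default"])


if __name__ == "__main__":
    shopper = UserProfile(preferred_category="books",
                          recent_views=["electronics"])
    print(choose_banner(shopper))
```

Calling `choose_banner` again after each interaction is what keeps the banner updated in real time; a production system would replace the fixed rule with a learned model.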

1.9.5. Food delivery times at Swiggy

The last session was held by Goda Ramkumar, Vice
President, Data Science, Swiggy.
Fig 1.9.5(a)
She emphasizes the critical role of accurately predicting food delivery times
in the customer's decision-making process. She highlights the delicate
balance required in setting customer expectations, as both delays and overly
conservative delivery time estimates can impact customer satisfaction and
order placement.
The accuracy of delivery time predictions extends beyond customer
experience; it influences downstream systems within Swiggy, such as the
assignment system and customer care ecosystem. To tackle the complex
problem, Goda and her team employ a multi-input, multi-output (MIMO)
deep learning model with entity embeddings. Elaborating further on solving
delivery executive (DE) issues, she says, “Let’s take a recent example: during the
pandemic, crowding around restaurants wasn’t allowed. So here’s where data
science came into play; we had to work on controlling that issue.
Another instance is how we make sure that a DE is busy during their work
hours. Most DEs don’t like to be idle for long because it affects their goals,
so we use data science to help them reach these targets.” Data science works
wonders for companies, but Goda wants people to understand that it isn’t a
“magic wand”. She says, “When it comes to data science there’s this
expectation that it solves for everything and quickly. The biggest challenge
is to set the right expectation of what data science can and cannot do in how
much time. The models that you build, the data handling that you do, are all
tools that have an impact on the business. But in order to make it really work,
setting the right expectations and having the right strategy to collect the right
data is essential”.
Key Elements of Goda's Approach:

1. Entity Embeddings: Goda leverages entity embeddings to encode
categorical variables with high cardinality, efficiently capturing semantics
and reducing dimensionality.
2. MIMO Model Architecture: The MIMO model handles multiple inputs
and outputs, enabling the model to learn relationships between different
aspects of the delivery process, such as order to assignment time, first mile
time, wait time, last mile time, and order to reached time.
3. Feature Considerations: The features used in the model account for
various factors influencing delivery times, including restaurant type, order
size, kind of dishes, availability of delivery executives, and distance from the
restaurant to the customer.
4. Performance Improvements: Goda's approach has led to significant
improvements in model training efficiency and accuracy. The MIMO
network design reduced training memory footprint by 50%, resulting in
faster training times. The shared optimization of interdependent outcomes
improved the mean absolute error (MAE) for order to assignment (O2A)
predictions by nearly 30%, enhancing the efficiency of exception
management systems.
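The ideas in points 1 and 2 can be sketched in plain Python. This is not Swiggy's model: it is an untrained, linear toy with random weights, and the restaurant IDs, feature choice and output names are assumptions made for illustration. It shows only the structure, with one embedding vector per entity and several output heads reading the same shared features, plus the MAE metric quoted in point 4.

```python
import random

random.seed(0)
EMBED_DIM = 4
OUTPUTS = ["order_to_assignment", "first_mile", "wait", "last_mile"]

# Entity embeddings: one small dense vector per restaurant ID instead of a
# very wide one-hot vector. Values are random here; in training they are learned.
restaurant_ids = ["r001", "r002", "r003"]
embeddings = {
    rid: [random.uniform(-1, 1) for _ in range(EMBED_DIM)]
    for rid in restaurant_ids
}

# One weight vector per output head, all reading the same shared features.
FEATURE_DIM = EMBED_DIM + 2  # embedding + order size + distance
heads = {
    name: [random.uniform(-1, 1) for _ in range(FEATURE_DIM)]
    for name in OUTPUTS
}


def predict(restaurant_id, order_size, distance_km):
    """Return one (untrained) prediction per leg of the delivery."""
    features = embeddings[restaurant_id] + [order_size, distance_km]
    return {
        name: sum(w * x for w, x in zip(weights, features))
        for name, weights in heads.items()
    }


def mean_absolute_error(actual, predicted):
    """MAE: the average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    print(predict("r001", order_size=2, distance_km=3.5))
    print(mean_absolute_error([10, 20], [12, 17]))
```

In a real multi-output network the heads would sit on top of shared hidden layers and be trained jointly, which is what allows interdependent outcomes to be optimized together.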
Goda Ramkumar acknowledges the ongoing efforts to further enhance the
model. This includes exploring custom loss functions tailored to business
metrics and addressing challenges related to data sparsity and missing
feature values across different cohorts.
In summary, Goda Ramkumar's innovative approach to predicting food
delivery times at Swiggy involves a sophisticated combination of entity
embeddings, MIMO model architecture, and feature-rich inputs. The results
showcase improved accuracy and efficiency, laying the foundation for
continued advancements in the field.

1.9.6. Conclusion
The conference ended with a Vote of Thanks from
Subbu Swaminathan, Senior Vice President, Product and
Engineering, Rakuten India. He expressed his appreciation to all the
chief guests, the staff and the delegates present at the conference.
He summarized the topic, AI-nization, chosen for the Rakuten
Technology Conference 2023.
We, the students of the CSE(AI&ML) Department of Don Bosco
Institute of Technology, were part of the Rakuten Technology
Conference 2023.
Fig:1.9.6

CHAPTER – 2
ENTREPRENEURSHIP
2.0 Characteristics of an Entrepreneur

 An entrepreneur is an individual who takes on financial risks in the pursuit of
creating and managing a new business venture.
 Entrepreneurs are often characterized by their innovation, vision, and willingness
to embrace uncertainty.
 They may identify opportunities, develop business ideas, and organize resources
to turn their vision into a reality.
 India has a rich history of entrepreneurship, and there are several famous Indian
entrepreneurs who have made significant contributions to various industries.
2.1 Characteristics and Qualities
 Innovative and Creative Thinking
 Risk-Taking
 Vision
 Adaptability
 Resilience
 Passion
 Leadership Skills
 Networking Skills
 Financial Literacy
2.2 Entrepreneurial thinking
 Independent – Honey bee
 Responsible – Hanuman
 Abundant – Karna
 Goal-oriented – Ambani
 Not afraid of failure – Thomas Alva Edison
 Growth-oriented – Sony (Cooker to sound system)
 Feedback-seeking – Conversation between Hanuman & Sugriva
 Learning-oriented – Conversation between Krishna & Arjuna
 Forward-thinking – Sudha Murthy
 Self-accepting – Lord Gowthama Buddha
 Self-aware – Lord Gowthama Buddha
 Collaborative – Human & Vanara
 Courageous – Ravana
2.3 INTRAPRENEUR
 An intrapreneur is an employee within a large corporation or organization who
behaves like an entrepreneur but does so within the confines of that organization.
 In other words, an intrapreneur is someone who takes the initiative to develop and
implement new business ideas or innovative projects within the existing structure
of their employer.
 The term "intrapreneurship" combines "intra-" (meaning within) and
"entrepreneurship".
 Intrapreneurship can lead to the development of new products, services, or
processes, fostering innovation within larger organizations.
 It is seen as a way for companies to stay competitive and responsive to market
changes. Recognizing and nurturing intrapreneurial spirit can be beneficial for
both employees and the organization as a whole.

2.4 Forms of Business organization

 Sole Proprietorship
 Partnership firm
 Limited Liability Company (LLC)
 Joint Stock Company
1. Sole Proprietorship
Description: A business owned and operated by a single individual.
Advantages: Simple to establish, complete control by the owner, and direct tax
benefits.
Disadvantages: Limited resources, unlimited personal liability, and potential
difficulty in raising capital.
2. Partnership
Description: A business owned by two or more individuals who share profits and
liabilities.
Advantages: Shared responsibilities, potential for more capital and skills, and
simplified taxation (in the case of a limited liability partnership).
Disadvantages: Shared profits, potential for conflicts among partners, and unlimited
personal liability in general partnerships.
3. Limited Liability Company (LLC)
Description: A flexible form of business organization that combines elements of a
corporation and a partnership.
Advantages: Limited liability for owners, flexibility in management, and pass-
through taxation.
Disadvantages: Complexity in some aspects, potential for disputes among members,
and specific regulations varying by jurisdiction.
4. Joint Stock Company
A joint-stock company, also known as a corporation, is a form of business
organization that is owned by shareholders.
It is a legal entity separate from its owners, and its ownership is represented by
shares of stock.
Joint-stock companies are widely used for both small and large businesses and are
known for providing certain advantages, such as limited liability for shareholders and
ease of transferability of ownership.
2.5 Types of Financial Assistance for Startup Businesses

1. Grants
2. Loans
3. Equity Investments
4. Personal Savings
5. Credit Cards
6. Family and Friends
7. Crowdfunding
8. Government Programs
9. Angel Investors
2.6 Various Incentives, Subsidies and Grants provided by Govt of India
The Indian government offers various incentives, subsidies, and grants to promote
economic growth, support specific industries, and encourage entrepreneurship. These
programs aim to boost investment, create employment, and foster innovation. It's
important to note that the availability and details of these incentives can change,
and eligibility criteria may vary.
Here are some examples of incentives, subsidies, and grants provided by the Indian
government:
1. Pradhan Mantri Mudra Yojana (PMMY):
Objective: Providing financial support to micro-enterprises in the form of loans.
Beneficiaries: Small business owners, particularly those in the unorganized sector.
2. Stand-Up India Scheme:
Objective: Encouraging entrepreneurship among women, SCs, and STs by providing
loans for starting greenfield enterprises.
Beneficiaries: Scheduled Castes (SCs), Scheduled Tribes (STs), and women
entrepreneurs.
3. Startup India:
Objective: Fostering the growth of startups and promoting entrepreneurship.
Benefits: Tax exemptions, self-certification, and a Startup India Hub for mentorship
and guidance.
Eligibility: Recognized startups with innovative solutions.

4. Credit Linked Capital Subsidy Scheme (CLCSS):

Objective: Providing capital subsidy for technology upgradation of Micro, Small, and
Medium Enterprises (MSMEs).
Beneficiaries: MSMEs in the manufacturing sector.
5. Export Promotion Capital Goods (EPCG) Scheme:
Objective: Facilitating import of capital goods for the promotion of exports.
Benefits: Import duty exemption on certain capital goods.
Eligibility: Exporters with a specific export obligation.
6. Make in India:
Objective: Encouraging domestic manufacturing and attracting foreign direct
investment (FDI).
Benefits: Various incentives for investors, simplification of procedures, and
infrastructure support.
7. National Manufacturing Competitiveness Programme (NMCP):
Objective: Enhancing the competitiveness of the manufacturing sector.
Components: Technology upgradation, marketing assistance, quality management
standards, and more.
8. National Rural Livelihood Mission (NRLM):

Objective: Alleviating rural poverty by promoting self-employment and
entrepreneurship.
Components: Skill development, access to credit, and market linkages for rural
entrepreneurs.
9. Pradhan Mantri Employment Generation Programme (PMEGP):
Objective: Generating employment opportunities in rural and urban areas through
the establishment of micro-enterprises.
Benefits: Subsidy for project costs and training assistance.
Eligibility: Individuals, self-help groups, and registered institutions.

2.7 Conclusion
The term entrepreneur not only refers to the creator, owner and
manager of a business, but also to the project leader of a business. To define the
entrepreneur, two problems relating to the behaviour of economic agents must be
combined: methodological individualism, according to which economic agents
are calculators, and the theory of resource potential, according to which the
rationality of economic agents is embedded in a network of social relationships.
In other words, the entrepreneur is an economic agent whose ultimate goal is to
create a business from a well-defined project. To realize his project, he mobilizes
a number of resources (knowledge-based, financial and relationship-based), from
which he produces other resources (employment, innovation, etc.), interacting
with his environment. In this sense, the entrepreneur is rational, because he
maximizes his resources in order to achieve a goal, which is to create his own job.
In this sense, his behaviour is opportunistic, because he seeks to take advantage
of all the opportunities presented to him (a social relationship, a grant, a
requirement, etc.).
Entrepreneurship is a mind-set, an attitude; it is taking a particular
approach to doing things. The motivations for becoming an entrepreneur are
diverse and can include the potential for financial reward, the pursuit of personal
values and interests, and the interest in social change. At the end of the session,
a book recommended for entrepreneurs to read was discussed.
Fig 2.3

CHAPTER – 3
INTERNSHIP’S SOCIETAL ACTIVITY
Topic – ChatGPT | OpenAI
3.1 Introduction
The session began with an introduction to ChatGPT for the students:
ChatGPT is a language model developed by OpenAI, and it's based on the GPT
(Generative Pre-trained Transformer) architecture, specifically GPT-3.5.
Here are some key points about ChatGPT:
 Generative Pre-trained Transformer (GPT): GPT is a type of artificial
intelligence model designed for natural language processing tasks. The
"pre-trained" aspect means that the model is initially trained on a vast amount of
diverse internet text data before being fine-tuned for specific applications.
 GPT-3.5: The version number "3.5" indicates that ChatGPT is built on a refinement
of the third iteration of the GPT architecture. GPT-3 is known for its large scale,
having 175 billion parameters, which are the internal variables the model uses to
generate human-like text.
 Language Generation: ChatGPT's primary function is to generate human-like text
based on the input it receives. It can be used for a variety of tasks, including
answering questions, providing information, generating creative content, and
more.
 No Internet Access: It's important to note that ChatGPT does not have the ability
to access real-time information or browse the internet. Its responses are based on
the knowledge available up to its last training cut-off in January 2022.
 Limitations: While ChatGPT can be quite versatile, there are limitations. It might
not always provide accurate or up-to-date information, and it doesn't have personal
experiences or opinions. Additionally, its responses are based on patterns
learned from diverse data and may not always reflect a single "correct" answer.

3.2 Doubts Regarding ChatGPT
 Bias and Fairness: Users may be concerned about the potential biases present
in the training data, leading to biased or unfair responses. Language models like
ChatGPT learn from diverse internet text, and this can inadvertently include
biased content.
 Accuracy of Information: Users might question the accuracy of information
provided by ChatGPT. While it is designed to offer helpful and accurate information,
its responses are generated based on patterns learned from data and may not
always reflect the most up-to-date or reliable information.
 Lack of Real-time Information: Some users might expect ChatGPT to have real-time
access to the internet for the latest information. However, it doesn't have this
capability and can only provide information based on its training data up until
January 2022.
 Understanding Context: Users may encounter challenges in ensuring that ChatGPT
understands and maintains context during a conversation. While it can generate
coherent responses, its understanding is based on patterns in the data, and it
doesn't have true comprehension or awareness.
 Security and Privacy Concerns: Users might express concerns about the
security and privacy implications of interacting with language models. It's
important to be cautious about sharing sensitive or personal information during
conversations.

3.3 How To Use ChatGPT?
 Type Your Input: Start by typing your question, prompt, or statement. You can
ask for information, clarification, assistance with a task, or engage in casual
conversation.
 Receive the Response: After you've entered your input, ChatGPT generates a
response. The response is based on patterns learned from a diverse range of
internet text during its training.
 Continue the Conversation: Feel free to continue the conversation by asking
follow-up questions or providing additional context. ChatGPT will try to provide
coherent and relevant responses.
 Be Specific: If you have a specific question, provide details to help ChatGPT
understand your inquiry better.
 Experiment: Feel free to experiment with different queries and prompts to see
how ChatGPT responds, whether you're seeking information, creative writing
assistance, or just having a chat.
 Use Keywords: Using specific keywords in your questions can help guide the
conversation in the direction you want.
 Set the Tone: You can specify the tone or style you prefer in your prompts,
such as asking for a formal response or a more casual one.
 Experiment with Prompt Length: Depending on the complexity of your
request, you can try varying the length and detail of your prompts.
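The tips on being specific, using keywords and setting the tone can be combined when composing a prompt. The helper below is a hypothetical illustration of that idea: `build_prompt` and its parameters are invented for this sketch and are not part of ChatGPT or any OpenAI API. It only assembles the text that a user would then type into ChatGPT.

```python
def build_prompt(question, details="", keywords=(), tone=""):
    """Assemble a prompt from a question plus optional guidance (hypothetical helper)."""
    parts = [question.strip()]
    if details:
        parts.append(f"Details: {details.strip()}")    # be specific
    if keywords:
        parts.append("Keywords: " + ", ".join(keywords))  # use keywords
    if tone:
        parts.append(f"Please answer in a {tone} tone.")  # set the tone
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "How does video dubbing work?",
        details="Explain it for school students.",
        keywords=["speech", "translation"],
        tone="casual",
    ))
```

Leaving out the optional arguments gives a short prompt, and adding them gives a longer, more detailed one, which is one simple way to experiment with prompt length.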


3.4 CONCLUSION
The interactive session at Govt model primary school served as an illuminating
opportunity to introduce students to the realm of cutting-edge technology. The
engagement with ChatGPT and exploration of other contemporary technological
trends showcased the immense potential and possibilities that the digital landscape
holds for the future.
Witnessing the enthusiasm and curiosity of the students during the session was truly
inspiring. By providing them with insights into ChatGPT and other emerging
technologies, we have not only imparted practical knowledge but have also planted
seeds of innovation and technological literacy. Empowering these young minds with
the tools and understanding of the latest trends equips them to navigate the evolving
digital world with confidence.
As we strive to bridge the technological divide, it is essential to recognize the
importance of such initiatives in fostering a generation of not just consumers but
active contributors to the ever-growing field of technology. The experience at Govt
model primary school underscores the significance of outreach programs in
demystifying complex technologies and making them accessible to all.
Moving forward, it is our collective responsibility to continue supporting educational
initiatives that promote technological awareness. By doing so, we contribute to a more
inclusive and empowered society where every individual, regardless of background,
has the opportunity to engage with and shape the technological landscape.

CHAPTER - 4
SUMMARY
In the implementation of Chatbot for Mining Related Queries the focus was on the
practical application of technology in addressing specific industry challenges. The
implementation of a chatbot tailored for mining-related queries demonstrated the
potential of artificial intelligence in streamlining information retrieval and
communication processes within a specialized domain. The chapter delved into the
technical aspects of developing and deploying the chatbot, highlighting its role in
improving efficiency and accessibility for stakeholders in the mining sector.
The second chapter, on entrepreneurship, provided a broader perspective on
the subject, exploring key principles and practices essential for aspiring business
leaders. It emphasized the significance of traits such as resilience, adaptability, and
vision in the entrepreneurial journey. The analysis covered aspects ranging from
ideation and business planning to the importance of market research, customer-centric
approaches, and the role of mentorship. The chapter underscored the dynamic nature
of entrepreneurship and its pivotal role in driving innovation, economic growth, and
societal advancement.
The third chapter, on societal activity, documented a hands-on outreach initiative,
recounting the experience of visiting a government school to educate students about
current technological trends. The session included practical demonstrations, with a
focus on ChatGPT and other emerging technologies. The report highlighted the
positive impact of such initiatives in fostering technological literacy among students
and inspiring a new generation of innovators. The chapter concluded by emphasizing
the importance of continued support for educational programs that bridge the
technological divide and empower individuals to navigate the evolving digital
landscape.
Collectively, these three chapters showcase a diverse range of experiences and insights
gained during the internship. From the practical application of technology in a specific
industry context to a broader exploration of entrepreneurial principles and the hands-
on interaction with students in a government school, the report encapsulates the
multifaceted nature of the internship. The common thread throughout is the
transformative power of technology and education, whether applied to industry
problem-solving or shared with the next generation to shape a more technologically
literate and empowered society.

