Report on audio to video dubbing
Chapter 1
INTRODUCTION TO INNOVATION / INTERNSHIP
On the first day of the internship, we were introduced to innovation in new
technologies and to the concepts and problem statements provided by SIH
(Smart India Hackathon). We learned what innovation is and what it consists
of. The meaning of innovation can be summarised as:
Innovation = Invention + exploitation
Innovation is something new that produces outcomes of value. It involves a
whole process of opportunity identification, ideation or invention,
development, prototyping, and more. Entrepreneurship, by contrast, only
needs to involve commercialization.
It is said that innovation comes about through new combinations made by an
entrepreneur, resulting in:
A new product
A new process
Opening of a new market
A new way of organizing the business
New sources of supply
Guides referred:
Wikipedia
SlideShare
Solution Overview
Dubbing Software:
We developed easy-to-use software that can convert English audio tracks
into multiple regional languages such as Hindi and Tamil.
Cost Effective:
Development Process:
Design and Conceptualization:
Fig:1.1(b)
Collaboration:
Fig:1.1(c)
Fig:1.1(d)
Multiple iterations of testing and feedback from our users helped us improve the
software and release it into production.
Benefits:
High Quality Dubs:
Our software provides high-quality dubs that are easy to understand and
watch.
Lower Cost:
Increased Accessibility:
3. 2017, Cliff Weitzman, "Audio dubbing on video"
Summary: Audio dubbing on video is a crucial element in the field of
multimedia and broadcasting, essential in reaching a global audience. It
refers to the process of replacing the original audio, usually dialogue, of a
video with another set in a different language.
Challenges: Dubbed audio also presents an issue with syncing: often, dialogue
will occur out of sync with the actors’ mouth movements, resulting in an
unnatural viewing experience for audiences. For these reasons, it’s important
for producers to take steps to ensure the highest possible sound quality when
dubbing their films and shows in order to provide viewers with an enjoyable
and engaging experience.
4. 2019, Srikar Kashyap Pulipaka, "Machine Translation of English Videos to
Indian Regional Languages using Open Innovation"
Summary: In spite of many languages being spoken in India, it is difficult for
the people to understand foreign languages like English, Spanish, Italian,
etc. The recognition and synthesis of speech are prominent emerging
technologies in natural language processing and communication domains. This
paper aims to leverage the open source applications of these technologies,
machine translation, text-to-speech.
Challenges: Dubbing films or shows into another language can face challenges
in matching lip movements, preserving cultural nuances, and maintaining the
original emotional context. Sometimes, the translated dialogue might not fully
capture the intended meaning, leading to a loss of authenticity or humor.
Additionally, finding skilled voice actors who can convey the same emotions
and tone as the original actors can also be a hurdle.
5. 2020, B. S. Harish & R. Kasturi Rangan, "A comprehensive Survey on Indian
regional language processing"
Summary: Any language that has evolved naturally in humans through its usage
over time is called natural language. In this survey, the various approaches
and techniques contributed by the researchers for Indian regional language
processing are reviewed. The tasks like machine translation, Named Entity
Recognition, Sentiment Analysis and Parts-Of-Speech tagging are reviewed with
respect to Rule, Statistical and Neural based approaches. The challenges which
motivate to solve language processing problems are presented. The sources of
dataset for the Indian regional languages are described.
Challenges:
Tokenization of the text. Some of the regional languages don't have common
delimiters like white space or punctuation.
Language structure, i.e. the order of the words in the sentences, will differ
from one language to another.
Ambiguity in translation or transliteration of regional language words.
6. 2021, Varshul Gupta, Anuja Dhawan, "Dubbing video content in real time"
Summary: Dubbing involves several steps. First, a translated script is
created, aiming to match the lip movements as closely as possible while
retaining the essence of the original dialogue. Then, skilled voice actors
record the translated lines in a studio. Sound engineers synchronize these
recordings with the visuals, adjusting timing and pacing to match the lip
movements.
Challenges: Dubbing videos from English to other regional languages presents
specific challenges. Beyond issues with lip-syncing and cultural nuances,
finding voice actors fluent in both languages and capable of delivering
performances that resonate with local audiences can be challenging. Moreover,
certain idiomatic expressions or wordplay in English might not directly
translate into other languages.
7. 2023, Wilson Wongso, Brandon Scott Buana, Ananto Joyoadikusumo, "Many to
many multilingual translation model for languages of Indonesia"
Summary: Indonesia is home to over 700 languages and most people speak their
respective regional languages aside from the lingua franca. In this paper, we
focus on the task of multilingual machine translation for 45 regional
Indonesian languages and introduced Indo-T5 which leveraged the mT5
sequence-to-sequence language model as a baseline.
Challenges: Sound engineers synchronize these recordings with the visuals,
adjusting timing and pacing to match the lip movements. Finally, quality
checks are done to ensure coherence and accuracy before the dubbed version is
released. This process requires precision, linguistic expertise, and attention
to detail to maintain the integrity of the original content in a new language.
8. 2020, Frederic Chaume, "Dubbing"
Summary: Dubbing involves linguistic, cultural, technical and creative team
effort for the translation, adaptation and lip synchronisation of an
audiovisual text. Research on dubbing has grown exponentially, in parallel
with scholarship in audiovisual translation in general, ushering in a wide
array of works in professional and sociological studies, linguistic studies,
descriptive studies, ideological studies, cognitive studies and case studies
in the new field of audiovisual translation.
Challenges:
Adapting dialects or regional accents in dubbing might be challenging,
affecting the authenticity of the translation.
Risk of Mistranslation: Errors in translation or interpretation during the
dubbing process can result in inaccuracies and misunderstandings.
Dubbed versions might not be as widely available as the original content with
subtitles, reducing accessibility.
9. 2022, Aarati H. Patil; Snehal S. Patil; Shubham M. Patil; Tatwadarshi P.
Nagarhalli, "Real time machine translation system between Indian languages"
Summary: Language understanding is one of those perpetual challenges which has
dogged research from many decades. As a means of connecting with others, but
communication is not limited to a single language. India contains around 121
different languages. As a result, there is a linguistic barrier. Natural
Language Processing is the process of developing a communicational interface
between machines and humans.
Challenges:
Script Adaptation: Translating and adapting scripts can be challenging,
potentially altering the intended meaning or message.
Limited Availability: Dubbed versions might not be as widely available as the
original content with subtitles, reducing accessibility.
Audience Preference: Some viewers prefer subtitles as they maintain the
authenticity of the original language and performances.
10. 2020, Ravinder Kumar, Inderveer Chana, Mukaan singh, "Machine Translation
system for Indian Languages"
Summary: With the advancement in computer language technology in a
multilingual country like India, numerous linguistics require technology for
translation. It aids in research for ancient languages like Sanskrit, Tamil,
Telugu, Malayalam to be available for society. Machine translation is one of
its essential areas. It plays a significant role in breaking the language
barrier and facilitating inter-lingual communication.
Challenges:
Dubbing might not be economically viable for all regional markets, limiting
the availability of content.
Educational Value: Subtitles allow viewers to learn the original language,
providing an educational aspect that dubbing does not offer.
The challenge in this communication process arises with the restriction that
language inflicts both at word and sentence level.
Next, consider the features each program offers. For example, how many
tracks are available? Does the program have sound effects or background
music options? What type of editing tools does it include? Do you need
support for multiple languages or formats? Make sure that any desired
features are available with the software before making your decision.
Finally, compare prices and read reviews from other users who have tried
out the software before. Reading user feedback can give you valuable
insight into potential limitations or bugs in the program and how easy it is
to use. Once you've found a few promising programs that meet your
needs and budget, take some time to test them out yourself so that you
can make an informed decision about which one is best for your project.
Improved Quality: Dubbing software also has the potential to improve the
quality of recordings, as it can be used to apply effects such as
equalization, compression, and reverb to get the desired sound.
Disadvantages:
Lip Sync Issues: Dubbing may not perfectly match the lip movements
of the actors, leading to a disconnect between audio and visual elements.
Voice Mismatch: The dubbed voice might not suit the original actor's
appearance or character, affecting the overall viewing experience.
Fig:1.4(a)
Speech Recognition:
- Utilize a pre-trained speech recognition model to convert English audio in
the video into text.
7. Handling Exceptions:
- Implement error handling to manage potential issues, such as audio file not
found or recognition errors:
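The speech recognition step and its error handling could be sketched as
follows. This is a minimal illustration, assuming the third-party
SpeechRecognition package and its Google Web Speech backend (which needs
network access); the function name transcribe_english is hypothetical, and the
report does not state which recognizer was actually used.

```python
from pathlib import Path
from typing import Optional


def transcribe_english(audio_path: str) -> Optional[str]:
    """Return the transcript of a WAV file, or None if it cannot be produced."""
    path = Path(audio_path)
    if not path.is_file():
        # Covers the "audio file not found" case before any heavy work starts.
        print(f"Audio file not found: {audio_path}")
        return None

    # Imported lazily so the file check above works without the package.
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    try:
        with sr.AudioFile(str(path)) as source:
            audio = recognizer.record(source)  # read the whole file into memory
        return recognizer.recognize_google(audio, language="en-US")
    except sr.UnknownValueError:
        # Raised when the speech could not be understood.
        print("Speech was unintelligible; no transcript produced.")
    except sr.RequestError as exc:
        # Raised when the recognition service cannot be reached.
        print(f"Recognition service unavailable: {exc}")
    return None
```

Returning None instead of raising keeps the rest of the dubbing pipeline
simple: the caller can skip a clip that has no transcript.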
Translation:
- Apply a machine translation model to translate the English text into the
target language.
- Transformer-based models such as Google's Transformer or OpenNMT can
handle complex language structures.
5. Performing Translation:
- Feed the preprocessed English text into the selected transformer model for
translation. The model will generate translations in the target language. Here's
an example using Hugging Face's Transformers library:
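A minimal sketch of this step, assuming the transformers package is installed
and using the publicly available English-to-Hindi checkpoint
Helsinki-NLP/opus-mt-en-hi as a stand-in model. The helper names
make_hf_translator and translate_transcript are hypothetical and are not taken
from the project code.

```python
from typing import Callable, List

Translator = Callable[[List[str]], List[str]]


def make_hf_translator(model_name: str = "Helsinki-NLP/opus-mt-en-hi") -> Translator:
    """Wrap a Hugging Face translation pipeline as a plain function.

    The model name is an assumption (an English-to-Hindi checkpoint);
    it is downloaded from the Hugging Face Hub on first use.
    """
    from transformers import pipeline  # third-party: pip install transformers

    translator = pipeline("translation", model=model_name)

    def translate(sentences: List[str]) -> List[str]:
        # The pipeline returns one dict per input with a "translation_text" key.
        return [out["translation_text"] for out in translator(sentences)]

    return translate


def translate_transcript(sentences: List[str], translate: Translator) -> List[str]:
    """Drop empty lines, then translate the remaining sentences in one batch."""
    cleaned = [s.strip() for s in sentences if s.strip()]
    return translate(cleaned) if cleaned else []
```

Typical use would be translate_transcript(lines, make_hf_translator()).
Passing the translator in as a function makes it easy to swap in another
model for Tamil or any other target language.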
7. Quality Assessment:
- Implement quality assessment metrics or human reviewers to evaluate the
accuracy and fluency of the translation. Metrics like BLEU score and human
evaluation can be used to measure the translation quality.
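To make the BLEU idea concrete, here is a simplified sentence-level version
written from scratch. It is only a sketch: it uses a single reference,
whitespace tokenization, n-grams up to length 2 and no smoothing, so its
scores will not match standard BLEU tools. The function name simple_bleu is
hypothetical.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Geometric mean of clipped n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A score near 1 means the machine translation shares most of its word
sequences with the reference; human review is still needed for fluency and
meaning.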
Voice Synthesis:
- Use a text-to-speech (TTS) model to convert the translated text into
synthesized speech in the target language.
- Tacotron and WaveNet are examples of TTS models that can generate
natural-sounding speech.
Detailed explanation of the "Voice Synthesis" step in video dubbing,
including the use of text-to-speech (TTS) models:
1. Selecting a Text-to-Speech (TTS) Model:
- Choose an appropriate TTS model that can generate natural-sounding speech
in the target language. TTS models have evolved significantly in recent years,
and two popular choices are Tacotron and WaveNet.
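Running Tacotron or WaveNet directly requires trained model weights, so the
sketch below swaps in the third-party gTTS package, which calls an online
text-to-speech service. It only illustrates the shape of the step; the
function name synthesize_speech and the choice of gTTS are assumptions, not
the project's actual setup.

```python
def synthesize_speech(text: str, lang: str, out_path: str) -> str:
    """Save synthesized speech for `text` to `out_path` (MP3) and return the path.

    `lang` is a language code understood by gTTS, for example "hi" for Hindi
    or "ta" for Tamil.
    """
    if not text.strip():
        raise ValueError("Nothing to synthesize: text is empty.")

    # Imported lazily so the input check works without the package installed.
    from gtts import gTTS  # third-party: pip install gTTS (needs network access)

    gTTS(text=text, lang=lang).save(out_path)
    return out_path
```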
7. Quality Assessment:
- Evaluate the synthesized speech for naturalness, fluency, and clarity. Listen
to the generated speech and employ objective metrics like Mean Opinion Score
(MOS) if necessary to assess quality.
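The Mean Opinion Score is simply the average of listener ratings on a 1 to 5
scale. A small sketch of the calculation, with an approximate 95% confidence
interval added as an assumption about how the result might be reported (the
function name is hypothetical):

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def mean_opinion_score(ratings: List[int]) -> Tuple[float, float]:
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if not ratings:
        raise ValueError("At least one rating is required.")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("Ratings must be on the 1 to 5 scale.")
    mos = float(mean(ratings))
    if len(ratings) < 2:
        return mos, 0.0  # spread cannot be estimated from a single rating
    # Normal approximation: 1.96 standard errors on either side of the mean.
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width
```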
Synchronization:
Align the synthesized speech with the original video by considering lip
movements and natural pauses.
Use techniques like forced alignment or advanced lip synchronization models to
achieve accurate synchronization.
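Forced alignment and lip-sync models need specialised tooling. As a much
simpler illustration of the timing problem, the sketch below computes how much
a synthesized clip would have to be sped up or slowed down to fit the time
slot of the original line. This is plain duration matching, not forced
alignment, and the names and rate limits are assumptions chosen for the
example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Time span of one spoken line in the original video, in seconds."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def tempo_factor(original: Segment, synthesized_seconds: float,
                 min_rate: float = 0.8, max_rate: float = 1.25) -> float:
    """Playback-rate multiplier that fits synthesized speech into the original slot.

    A value above 1 speeds the speech up and a value below 1 slows it down.
    The result is clamped so the voice does not become unnaturally fast or slow.
    """
    if original.duration <= 0 or synthesized_seconds <= 0:
        raise ValueError("Durations must be positive.")
    rate = synthesized_seconds / original.duration
    return max(min_rate, min(max_rate, rate))
```

When the required rate hits the clamp, the translated line is probably too
long or too short for the slot and is better rephrased than stretched.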
8. Iterative Refinement:
- Continuously refine the synchronization process based on feedback and
quality assessment. Consider user feedback and make improvements as needed
to achieve the highest level of accuracy and naturalness.
3. Audio Editing: Edit the voiceover recordings to match the timing and
pace of the original video. This may involve cutting, splicing, and
synchronizing the audio with the video.
4. Mixing and Sound Design: Balance the audio tracks, add background
music, sound effects, and adjust audio levels for clarity and coherence.
5. Video Editing: Replace the original audio track with the dubbed
audio. Ensure that the dubbed audio is properly synchronized with the
video.
6. Quality Control: Review the dubbed video to ensure that the lip-sync
is accurate, the audio quality is good, and there are no noticeable
discrepancies.
7. Exporting and Distribution: Once you are satisfied with the dubbed
video, export it in the desired format and resolution. You can then
distribute the dubbed video to your intended audience.
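The video editing and export steps above can be sketched with the ffmpeg
command-line tool, assuming it is installed and on the PATH. The function
names and file names are hypothetical; the command copies the video stream
unchanged and swaps in the dubbed audio track.

```python
import subprocess
from typing import List


def build_replace_audio_command(video_in: str, dubbed_audio: str,
                                video_out: str) -> List[str]:
    """Build an ffmpeg command that replaces the audio track of a video."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", video_in,        # input 0: original video
        "-i", dubbed_audio,    # input 1: dubbed audio
        "-map", "0:v:0",       # take the video stream from input 0
        "-map", "1:a:0",       # take the audio stream from input 1
        "-c:v", "copy",        # do not re-encode the video
        "-shortest",           # stop at the end of the shorter stream
        video_out,
    ]


def replace_audio(video_in: str, dubbed_audio: str, video_out: str) -> None:
    """Run the command; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(build_replace_audio_command(video_in, dubbed_audio, video_out),
                   check=True)
```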
1. Using API
2. Collecting data and training a Machine Learning model to dub the
video
Using API
Using an API for video dubbing is not always a requirement, but it can
offer several advantages, depending on our specific needs and constraints.
Here are some reasons why we might consider using an API for video
dubbing:
1. Efficiency and Speed: APIs can automate and streamline the dubbing
process. They can handle tasks like transcription, translation, and
text-to-speech synthesis more quickly and efficiently than manual methods.
This can save us a significant amount of time, especially when dealing with
large volumes of content.
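A minimal sketch of how such API calls might be chained. The three callables
stand in for whichever speech-to-text, translation and text-to-speech APIs are
chosen; all names here are hypothetical and no specific vendor API is implied.

```python
from typing import Callable


def dub_audio(
    audio_path: str,
    target_language: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    """Run transcription, translation and speech synthesis in sequence.

    Each callable wraps one API call, so a provider can be swapped
    without changing the pipeline itself.
    """
    english_text = transcribe(audio_path)                       # audio -> English text
    translated_text = translate(english_text, target_language)  # English -> target
    return synthesize(translated_text, target_language)         # text -> audio bytes
```

For example, dub_audio("clip.wav", "hi", my_stt, my_mt, my_tts) would return
the Hindi audio for one clip, where my_stt, my_mt and my_tts are thin wrappers
around the chosen services.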
Note that the above code is oversimplified; further details of the code
require assistance from a machine learning engineer.
The internship was also a good opportunity to find out what my strengths and
weaknesses are. This helped me define which skills and knowledge I need to
improve in the coming time. It would be better if my knowledge of the language
were sufficient to contribute fully to projects. After my master's degree, I
think I could start my working career. However, I could perform certain
research tasks better if I practised and knew more about the research
methodologies applied in cetacean studies. It would also be better if I could
present and express myself more confidently.
Finally, this internship has given me new insights and the motivation to
pursue a career in the software world.
1.7 Reference:
Mukaan singh, “Machine translation systems for Indian Languages”, page no. 1,
2019-2020.
https://link.springer.com/article/10.1007/s11831-020-09449-7
Antony P G, “Machine Translation approaches and Survey for Indian
Languages”, Vol. 18, page no. 1, 2013.
https://aclanthology.org/O13-2003.pdf?shem=ssusba
Utkarsh Agrawal, “Automated moving dubbing system”, page 1, 2018.
https://www.slideshare.net/UtkarshAgrawal35/voice-dubbing-automation
Srikar Kashyap Pulipaka, Chaitanya Krishana Kasaraneni, “Machine translation
of English videos to Indian Regional languages using open innovation”, page 1,
2019.
https://ieeexplore.ieee.org/document/8937988/authors#authors
B S Harish, R Kasturi Rangan, “A Comprehensive survey on Indian Regional
languages processing”, Vol 1204, page 1, 2020.
https://link.springer.com/article/10.1007/s42452-020-2983-x
https://gamma.app/public/Developing-Software-to-Dub-Video-in-Regional-Languages-jnk5avghvlp04vb?mode=doc
https://www.researchgate.net/publication/338177583_Machine_Translation_of_English_Videos_to_Indian_Regional_Languages_using_Open_Innovation
1.8.1 :HISTORY:
Honouring Sir M. Visvesvaraya, the All-India Manufacturers Organization and
Mysore State Board decided to create a science and technology museum in
Bangalore. The foundation stone was laid by Shri B. D. Jatti, Chief Minister of
Mysore, on 15 September 1958. The Visvesvaraya Industrial Museum Society
(VIMS) came to be registered as the nodal agency in order to pool resources
from various industrial houses. It was inaugurated by the first Prime Minister of
India, Pandit Jawaharlal Nehru, on July 14, 1962.
The first exhibition, 'Electricity', was opened to the public on July 27, 1965.
In the year 1970, VITM launched the Mobile/Moving Science Exhibition
(MSE) with 24 participatory exhibits mounted on a bus. The MSE Bus travels
throughout Southern India.
In 1978, many science museums, including VITM, parted from CSIR and were
brought under a newly formed society registered on 4 April 1978 as the National
Council of Science Museums (NCSM). In 1979 an extension was added to the
building, increasing the total area of the museum to 6,900 m² (74,000 sq ft).
NCSM set up four additional science centres, at Gulbarga (Karnataka) in
1984, Tirunelveli (Tamil Nadu) in 1987, Tirupati (Andhra Pradesh) in 1993 and
Kozhikode (Kerala) in 1997, which function under the direct
administrative control of VITM. Thus, VITM has become the southern zone
headquarters of NCSM.
The museum attracts nearly one million visitors a year, and is open on all days
(except Deepavali and Ganesha Chaturthi) from 09:30 to 18:00.
1.8.2 :GALLERY:
1. Visvesvaraya Museum Entrance:
Stepping into the Visvesvaraya Industrial & Technological Museum is an
entrancing journey into the realms of scientific marvels and technological
innovations. The museum serves as a vibrant tapestry, intricately woven with
exhibits that showcase the evolution of science and technology. Visitors are
greeted with a chronological odyssey through the history of machinery,
witnessing the transformation from rudimentary tools to sophisticated, high-tech
systems. Interactive physics displays bring abstract principles to life, allowing
hands-on exploration of concepts like gravity and electricity.
In this captivating environment, the wonders of science and technology are not
just observed but experienced. The Visvesvaraya Industrial & Technological
Museum stands as a testament to human ingenuity, inspiring a deep appreciation
for the incredible progress that continues to shape our world.
2. Engine Hall:
The Engine Hall at Visvesvaraya Industrial and Technological Museum is a
captivating space that celebrates the marvels of engineering and technology.
Named after Sir M. Visvesvaraya, a distinguished engineer and statesman, the
museum is in Bangalore, India. The Engine Hall was set up in the year 1994. Over
50 exhibits arranged across 1000 sq m explain the evolution of mechanisms,
machines and devices that form the very foundation of modern technology.
The hall traces the evolution of engines and machinery pivotal to industrial
and technological progress.
Upon entering, visitors are greeted by a diverse array of exhibits, spanning
different eras and technologies. The hall displays various engines, including
steam engines that powered factories during the industrial revolution, internal
combustion engines that revolutionized transportation, and modern turbines used
in aviation and power generation.
The Engine Hall aims to educate and inspire curiosity about the development
and significance of engines throughout history. It serves as an
educational platform where visitors can witness firsthand the evolution of
engines and appreciate their crucial role in shaping the modern world.
3. Fun Science Gallery:
The ‘Fun Science’ Gallery was inaugurated on 23rd May 2008. The gallery, set up
in an area of 700 sq m, has 66 exciting hands-on exhibits.
The Fun Science Gallery at the Visvesvaraya Industrial and Technological
Museum (VITM) is an engaging and interactive space designed to make science
enjoyable and accessible for visitors of all ages. This gallery within the museum
aims to ignite curiosity and foster a love for science through hands-on exhibits
and demonstrations.
Filled with interactive displays, puzzles, optical illusions, and engaging
experiments, the Fun Science Gallery encourages visitors to explore various
scientific principles in a playful manner. Visitors can engage in activities that
demonstrate concepts related to physics, biology, mathematics, and more.
The exhibits are designed to stimulate the mind and provoke curiosity, allowing
visitors to actively participate in experiments and learn through direct
interaction. Whether it's understanding the laws of motion, experimenting with
light and sound, or exploring the wonders of magnetism, the Fun Science
Gallery offers an immersive experience that makes learning science an
enjoyable adventure.
It's an excellent space for children, families, and school groups, providing a
platform for experiential learning and fostering an appreciation for the wonders
of science and technology. The emphasis on hands-on activities ensures that
visitors not only learn scientific principles but also have fun while doing so.
The "Electrotechnic" gallery was thrown open to the visitors of the museum by
Shri. Jawhar Sircar, Secretary, Ministry of Culture, Govt. of India on 8th April
2010. The gallery, set up in an area of about 780 sq m, houses
fascinating exhibits on various topics in electrical technology. The gallery is a
journey through the spectacular world of electricity, from classical
experiments to state-of-the-art technology.
4. Biotechnological Gallery:
5. Space Technology Gallery:
As visitors enter the gallery, they are greeted by displays featuring the history of
space exploration, from the early days of rocketry to the contemporary
achievements of space agencies around the world. Engaging multimedia
presentations provide insight into the challenges and triumphs of space missions,
capturing the imagination of visitors of all ages. The Space Technology Gallery
is likely to feature life-sized models of spacecraft, satellites, and rovers, allowing
visitors to get up close and personal with the engineering marvels that have
expanded our understanding of the cosmos. Interactive exhibits may simulate
aspects of space travel, such as zero gravity environments or the experience of
landing on other planets.
6. Chandrayaan-3:
Chandrayaan-3 is India's ambitious lunar exploration mission, following the
achievements of Chandrayaan-1 and Chandrayaan-2. Spearheaded by the Indian
Space Research Organisation (ISRO), the mission is designed to continue the
exploration of the Moon, emphasizing enhanced capabilities and technological
advancements. Chandrayaan-3 aims to deploy a lander and rover on the lunar
surface, conducting scientific experiments to deepen our understanding of Earth's
celestial neighbour.
The mission draws on the lessons learned from Chandrayaan-2 and incorporates
improvements to ensure greater success. Chandrayaan-3 is expected to have an
increased payload capacity, enabling more sophisticated scientific instruments
and experiments. It represents a crucial step in India's space exploration
endeavors, showcasing the nation's commitment to advancing its space
capabilities and contributing valuable data to the global scientific community.
The mission aligns with India's long-term space exploration goals, emphasizing
both technological innovation and scientific discovery on the lunar landscape.
The Satish Dhawan Space Centre plays a crucial role in India's space exploration
endeavors, providing a strategic base for the launch of satellites for
communication, Earth observation, navigation, and scientific research. The
center's geographic location on Sriharikota Island also enhances safety by
providing a clear trajectory over the Bay of Bengal for rocket launches.
1.8.3 :Conclusion:
The Visvesvaraya Industrial and Technological Museum (VITM) in Bangalore,
India, stands as a testament to the visionary engineer Sir M. Visvesvaraya.
Spanning various domains of science, technology, and innovation, the museum
celebrates India's scientific heritage. Through its interactive exhibits, the VITM
provides a hands-on learning experience, fostering curiosity and interest in
visitors of all ages. It's a hub for exploration, housing displays on physics,
electronics, aerospace, and other scientific disciplines, inspiring the pursuit of
knowledge and discovery.
Fig 1.9
Guest Speakers:
1. Pankaj Rai, Chief Data Analytics Officer, Aditya Birla Group
2. Sameer Dhanrajani, CEO, AIQRATE
3. Goda Ramkumar, VP-Data Science, Swiggy
Fig 1.9
4. Platform Engineering.
5. AI-augmented development.
6. Industry cloud platforms.
7. Intelligent applications.
8. Democratized generative AI.
9. Augmented connected workforce.
10. Machine customers.
Fig 1.9.1
Example: If you've used Alexa, Siri, or another virtual assistant, you've used
augmented intelligence. Virtual assistants don't make decisions for us.
Instead, they provide the data when we need it.
And the sponsors of this great event:
Fig 1.9.2(a)
Evolution of AI:
Artificial Intelligence (AI) has transformed from a concept in the realm of
science fiction to an everyday reality impacting every facet of our lives. The
journey of AI has been marked by significant milestones and breakthroughs,
each reflecting the convergence of theoretical foundations, technological
advancements, and practical implementations. Here, we chronologically
trace the evolution of AI, showcasing its most important developments.
1. Pre-1950s: Theoretical Foundations
The roots of AI can be traced back to antiquity, with philosophers
attempting to explain the human mind as a symbolic system. However, the
modern field of AI truly began to take shape in the mid-20th century.
1843: Ada Lovelace, known as the world's first computer
programmer, proposed the idea that machines could manipulate
symbols in addition to numbers, laying a fundamental concept for AI.
1936: Alan Turing proposed the concept of a "universal machine"
(later known as the Turing Machine), a theoretical device that could
solve any computation given enough time and resources. This forms
the basis of the digital computer and the principle of computability.
1943: Warren McCulloch and Walter Pitts proposed the first
mathematical model of a neural network, opening up the possibility of
learning machines.
1949: Donald Hebb proposed a learning theory, now known as
Hebbian learning, which became a fundamental concept in the
development of artificial neural networks.
Fig 1.9.2(b)
Fig 1.9.3( c)
GPT-4 March 2023: The latest model in the series, GPT-4, was
released. This model represents the current pinnacle of GPT
development by OpenAI. It can process up to 25,000 words versus
4,000 with GPT-3. It is 40% more likely to generate accurate
responses.
Other GPT Models: The term "GPT" has also been adopted by other
organizations in their model names and descriptions. For instance,
EleutherAI has created a series of GPT foundation models, and
Cerebras has recently developed seven models. Additionally,
companies across various industries have developed GPT models
tailored to their specific needs. Examples include Salesforce's
"Einstein GPT" for customer relationship management (CRM) and
Bloomberg's "Bloomberg GPT" for finance.
1. Future Developments
Launch Timeline: The debut of GPT-5 is predicted to occur in 2024.
Earlier, rumours about a 2023 release were refuted by OpenAI's
CEO, Sam Altman. However, a transitional model, GPT-4.5, is likely to be
introduced by October 2023.
Anticipated Capabilities: GPT-5 is expected to exhibit less
hallucination, making it more reliable. It's also predicted to be more
efficient in terms of computation, which would lower the cost and
duration of operating the model. It could potentially be a multisensory
AI model, handling a variety of data types such as text, audio, images,
videos, depth data, and temperature. Another anticipated feature is the
support for long-term memory through a larger context length.
Concerns about AGI: There's a belief that GPT-5 might achieve
Artificial General Intelligence (AGI), a form of AI that surpasses
human intelligence. This has sparked worries about potential risks and
the need for regulatory measures.
OpenAI's Future Direction: OpenAI has become more reserved
about its operations and is less likely to share its models like GPT-4 or
GPT-5 with the open-source community. However, reports suggest
that OpenAI is developing a new open-source AI model for public
release.
Everyone is talking about deploying AI!
According to the Harvard Business Review, “AI is most valuable when it is
operationalized at scale. For business leaders who wish to maximize business
value using AI, scale refers to how deeply and widely AI is integrated into an
organization’s core product or service and business processes”.
35% of companies globally report using AI in their business.
83% of organizations say that AI is a top priority in their future
business plans.
Rakuten’s AI-nization Plan
1. Vision: The vision is to augment human creativity with the power of AI.
2. Strategy: The base of the strategy is building a Solid Data Foundation
with customer context, enterprise knowledge with Rakuten and world
knowledge with OpenAI.
GenAI at Rakuten
1. Intelligent ChatBots: Mobile, Securities, Search.
2. Creative AI: Ad creatives and digital marketing, code generation
3. Engineering: No code/low code solutions democratize BI
dashboard development.
4. Customer Success: AI chatbots with new features have emerged to
improve the customer experience, such as chatbots that use machine learning to
predict what help a customer will need and provide proactive support.
Fig 1.9.3(a)
The next part of the session was held by guest speaker, Pankaj Rai, Chief
Data Analytics officer, Aditya Birla Group. Pankaj Rai shared his insights
on the importance of having a well-defined strategy to achieve success in
life. He emphasized that a good strategy is essential for managing things and
adapting to new technologies that are rapidly changing our lives. He also
spoke about the finance and accounting area, strategy area, and economics
area. Pankaj Rai’s experience in strategy consulting and financial services
was evident in his discussion of data analytics and its role in
decision-making. He shared how data-driven decision-making can help businesses
gain a competitive edge. He also discussed his experience of working with
Wells Fargo and how he established digital capabilities to enable the bank’s
digital channels. Overall, Pankaj Rai’s session was an excellent opportunity
to learn from a seasoned professional in the field of data analytics. His
insights on strategy and data analytics were enlightening, and I am grateful
for the knowledge I gained from his session.
Key Insights:
1. Financial Guidance Beyond Numbers: Mr. Rai's exploration of
financial guidance was not confined to the numerical intricacies of balance
sheets and investments. He delved into the psychological and emotional
dimensions of financial decision-making, emphasizing the need for a
comprehensive understanding. His anecdotes and real-life examples
Fig 1.9.3(b)
Makoto Koike is the young-looking man standing at the back of the picture,
with his parents, in Japan. His parents are farmers who grow cucumbers.
Cucumber again is an exotic variety in Japan, distinguished by about 8 to 9
variables such as shape, texture, colour and dimensions. In one conversation
with Makoto, his mother said that their income from growing cucumbers had been
going down over the years, and that cucumbers are sold regularly, let's say
fortnightly, in an auction market. He had done a bit of analytics in his
engineering days, and he decided this was the best time to put that knowledge
to use. So he went back home, picked up his iPhone, and with its camera took
about 2500 pictures of the cucumbers on the farm, covering different dimensions
and angles so that the varieties could be told apart. He then bought a
Raspberry Pi 3 on Amazon, which costs about $35 to $40, and on it he set up
Google TensorFlow, an open source, freely available tool for building machine
learning models. The idea was to label and annotate the pictures and correlate
them with the auction patterns of the cucumbers, and his intent was to create
an algorithm that could do this. The first results were only 30% to 35%
accurate. He kept training the model until it reached about 77%. He could then
tell his parents exactly which variety they should grow. His parents said that
18 months down the line their income had gone up by 400%, and the cost
associated with using AI was just $35. That is the magic speakers keep talking
about: what AI becomes when it becomes mainstream, and it is actually not as
complex as it is made out to be.
Algorithm Economy:
Democratization of AI and Data as a Service (DaaS) will lead to the creation
of new marketplaces to buy and sell advanced analytics algorithms.
Niche startups will deal in plug-and-play, AI-led algorithm portfolios and take
a service-oriented, end-to-end approach to unravel intelligence and insights.
The algorithm economy will consist of millions of algorithms, each one
representing a piece of software code that solves a business problem or
creates a new opportunity.
AI platforms will be defined by the sophistication of their algorithms.
AI: The new normal in business strategy
1. Reimagining Customer Experiences.
2. Innovating products & services.
3. Transforming Business.
Decision making scale with AI: The Framework
1. Augment Intelligence:
Bring data+algorithms+compute to every decision.
Use AI to make BI contextual, personalized and real time.
Find few signals in new data.
Fig 1.9.3(c)
Sentiment analysis of analyst reports, earnings releases, corporate
filings.
Image processing to interpret body language of presenter.
Voice analytics to determine tone.
Satellite imagery to determine customer footfalls.
Nitrous oxide emitted by a manufacturer.
Women in management position.
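The first signal in the list above, sentiment analysis of reports and filings, can be sketched with a simple word-list score. This is only an illustration under stated assumptions: the word lists, the scoring formula and the function name are hypothetical, and production systems generally use trained language models rather than fixed lexicons.

```python
import re

# Hypothetical mini-lexicons; real finance lexicons are far larger.
POSITIVE = {"growth", "beat", "strong", "improved", "record"}
NEGATIVE = {"decline", "miss", "weak", "loss", "lawsuit"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # Text with no lexicon words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


print(sentiment_score("Strong growth and record revenue"))
print(sentiment_score("Weak quarter with a loss, but improved margins"))
```

A score near 1 suggests mostly positive wording and a score near -1 mostly negative wording; a score like this could then feed a decision framework as one signal among several.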
Fig 1.9.4(a)
The average score for all aspects of the HMSAM is 85.50%. The rule-based
lip-syncing algorithm accurately synchronizes the lip movements
with the Jacob voice chatbot's speech in real time.
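As a rough illustration of what a rule-based lip-syncing step can look like, the sketch below maps timed phonemes to mouth shapes (visemes) through a fixed lookup table and merges consecutive identical shapes. It assumes phoneme timings are already available from the speech engine; the table, the viseme names and the function name are hypothetical and are not taken from the system evaluated above.

```python
# Hypothetical rule table: phoneme symbol -> mouth shape (viseme).
PHONEME_TO_VISEME = {
    "P": "closed", "B": "closed", "M": "closed",
    "F": "teeth_on_lip", "V": "teeth_on_lip",
    "AA": "open", "AE": "open",
    "OW": "rounded", "UW": "rounded",
}


def visemes_for(timed_phonemes):
    """Convert (phoneme, start_s, end_s) tuples into viseme keyframes.

    Unknown phonemes fall back to a neutral mouth shape. Back-to-back
    segments with the same viseme are merged into one keyframe.
    """
    keyframes = []
    for phoneme, start, end in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        if keyframes and keyframes[-1][0] == viseme and keyframes[-1][2] == start:
            # Extend the previous keyframe instead of adding a duplicate.
            keyframes[-1] = (viseme, keyframes[-1][1], end)
        else:
            keyframes.append((viseme, start, end))
    return keyframes


speech = [("M", 0.0, 0.1), ("AA", 0.1, 0.3), ("P", 0.3, 0.35), ("B", 0.35, 0.4)]
for frame in visemes_for(speech):
    print(frame)
```

Because each keyframe keeps the start and end times of the audio segment it came from, the animation can be driven directly from the speech timeline, which is what keeps the lips in step with the voice.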
b) GEN AI 3D MODELLING
Creating and visualizing 3D models has become more accurate,
accessible, and efficient thanks to high-powered AI 3D object
generators. Whether you are a graphic designer or a game developer,
which AI 3D object generator is best for you depends on your
requirements. 3D models can now be crafted from scratch using only
images, text, or videos.
Fig 1.9.4(b)
Fig 1.9.4(c)
Fig 1.9.5(a)
She emphasizes the critical role of accurately predicting food delivery times
in the customer's decision-making process. She highlights the delicate
balance required in setting customer expectations, as both delays and overly
conservative delivery time estimates can impact customer satisfaction and
order placement.
The accuracy of delivery time predictions extends beyond customer
experience; it influences downstream systems within Swiggy, such as the
assignment system and customer care ecosystem. To tackle the complex
problem, Goda and her team employ a multi-input, multi-output (MIMO)
deep learning model with entity embeddings. Elaborating further on solving
delivery executive (DE) issues, she says, “Let’s take a recent example: during
the pandemic, crowding around restaurants wasn’t allowed. So here’s where data
science came into play, we had to work on controlling that issue.
Another instance is how we make sure that a DE is busy during their work
hours. Most DEs don’t like to be idle for long because it affects their goals,
so we use data science to help them reach these targets.” Data science works
wonders for companies, but Goda wants people to understand that it isn’t a
“magic wand”. She says, “When it comes to data science there’s this
expectation that it solves for everything and quickly. The biggest challenge
is to set the right expectation of what data science can and cannot do in how
much time. The models that you build, the data handling that you do, are all
tools that have an impact on the business. But in order to make it really work,
setting the right expectations and having the right strategy to collect the right
data is essential”.
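The multi-input, multi-output model with entity embeddings mentioned above can be pictured with a small forward-pass sketch. This is not Swiggy's model: the inputs, the two output names, the layer sizes and all weights (random and untrained here) are assumptions made purely to show the shape of the idea, which is that categorical IDs are looked up as vectors, combined with numeric inputs, and fed through a shared layer into several output heads.

```python
import random

random.seed(0)  # fixed seed so the untrained weights are reproducible


def rand_vector(n):
    return [random.uniform(-0.1, 0.1) for _ in range(n)]


def rand_matrix(rows, cols):
    return [rand_vector(cols) for _ in range(rows)]


def dense(weights, bias, x):
    """Fully connected layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


EMBED_DIM = 4
# Entity embeddings: one vector per categorical ID (hypothetical IDs).
restaurant_embedding = {rid: rand_vector(EMBED_DIM) for rid in ("r1", "r2")}
zone_embedding = {zid: rand_vector(EMBED_DIM) for zid in ("z1", "z2")}

HIDDEN = 8
INPUT = 2 * EMBED_DIM + 2  # two embeddings plus two numeric inputs
w_hidden, b_hidden = rand_matrix(HIDDEN, INPUT), [0.0] * HIDDEN
# Two output heads that share the same hidden layer (hypothetical targets).
w_prep, b_prep = rand_matrix(1, HIDDEN), [0.0]
w_travel, b_travel = rand_matrix(1, HIDDEN), [0.0]


def forward(restaurant_id, zone_id, distance_km, items):
    """Multi-input forward pass that returns one value per output head."""
    x = (restaurant_embedding[restaurant_id] + zone_embedding[zone_id]
         + [distance_km, float(items)])
    h = relu(dense(w_hidden, b_hidden, x))
    return {
        "prep_time": dense(w_prep, b_prep, h)[0],
        "travel_time": dense(w_travel, b_travel, h)[0],
    }


print(forward("r1", "z2", 3.2, 4))
```

The printed numbers are meaningless until the weights and embeddings are trained on real order data; the sketch only shows how several inputs and several predictions can share one network.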
1.9.6. Conclusion
Fig:1.9.6
CHAPTER – 2
ENTREPRENEURSHIP
2.3 INTRAPRENEUR
An intrapreneur is an employee within a large corporation or organization who
behaves like an entrepreneur but does so within the confines of that organization.
In other words, an intrapreneur is someone who takes the initiative to develop and
implement new business ideas or innovative projects within the existing structure
of their employer.
The term "intrapreneurship" combines "intra-" (meaning within) and
"entrepreneurship."
Intrapreneurship can lead to the development of new products, services, or
processes, fostering innovation within larger organizations.
It is seen as a way for companies to stay competitive and responsive to market
changes. Recognizing and nurturing intrapreneurial spirit can be beneficial for
both employees and the organization as a whole.
2.0 Conclusion
The term entrepreneur not only refers to the creator, owner and
manager of a business, but also to the project leader of a business. To define the
entrepreneur, two problems relating to the behaviour of economic agents must be
combined: methodological individualism, according to which economic agents
are calculators, and the theory of resource potential, according to which the
rationality of economic agents is embedded in a network of social relationships.
In other words, the entrepreneur is an economic agent whose ultimate goal is to
create a business from a well-defined project. To realize his project, he mobilizes
a number of resources (knowledge-based, financial and relationship-based), from
which he produces other resources (employment, innovation, etc.), interacting
with his environment. In this sense, the entrepreneur is rational, because he
maximizes his resources in order to achieve a goal, which is to create his own job.
In this sense, his behaviour is opportunistic, because he seeks to take advantage
of all the opportunities presented to him (a social relationship, a grant, a
requirement, etc.).
Entrepreneurship is a mind-set, an attitude; it is taking a particular
approach to doing things. The motivations for becoming an entrepreneur are
diverse and can include the potential for financial reward, the pursuit of personal
values and interests, and the interest in social change. At the end of the session,
the book to be read as an entrepreneur was discussed.
Fig 2.3
CHAPTER – 3
INTERNSHIP’S SOCIETAL ACTIVITY
Topic – ChatGPT | OpenAI
3.1 Introduction –
Bias and Fairness: Users may be concerned about the potential biases present
in the training data, leading to biased or unfair responses. Language models like
ChatGPT learn from diverse internet text, and this can inadvertently include
biased content.
Accuracy of Information: Users might question the accuracy of information
provided by ChatGPT. While it strives to offer helpful and accurate information,
its responses are generated based on patterns learned from data and may not
always reflect the most up-to-date or reliable information.
Lack of Real-time Information: Some users might expect ChatGPT to have real-time
access to the internet for the latest information. However, it does not have this
capability and can only provide information based on its training data up until
January 2022.
Understanding Context: Users may encounter challenges in ensuring that ChatGPT
understands and maintains context during a conversation. While it can generate
coherent responses, its understanding is based on patterns in the data, and it
does not have true comprehension or awareness.
Security and Privacy Concerns: Users might express concerns about the
security and privacy implications of interacting with language models. It's
important to be cautious about sharing sensitive or personal information during
conversations.
Type Your Input: Start by typing your question, prompt, or statement. You can
ask for information, clarification, assistance with a task, or engage in casual
conversation.
Receive the Response: After you've entered your input, ChatGPT generates a response.
The response is based on patterns learned from a diverse range of internet text
during its training.
Continue the Conversation: Feel free to continue the conversation by asking
follow-up questions or providing additional context. ChatGPT will do its best to
provide coherent and relevant responses.
Be Specific: If you have a specific question, provide details to help ChatGPT
understand your inquiry better.
Experiment: Feel free to experiment with different queries and prompts to see
how ChatGPT responds, whether you're seeking information, creative writing
assistance, or just having a chat.
Use Keywords: Using specific keywords in your questions can help guide the
conversation in the direction you want.
Set the Tone: You can specify the tone or style you prefer in your prompts,
such as asking for a formal response or a more casual one.
Experiment with Prompt Length: Depending on the complexity of your
request, you can try varying the length and detail of your prompts.
3.4 CONCLUSION
The interactive session at the Government Model Primary School served as an illuminating
opportunity to introduce students to the realm of cutting-edge technology. The
engagement with ChatGPT and exploration of other contemporary technological
trends showcased the immense potential and possibilities that the digital landscape
holds for the future.
Witnessing the enthusiasm and curiosity of the students during the session was truly
inspiring. By providing them with insights into ChatGPT and other emerging
technologies, we have not only imparted practical knowledge but have also planted
seeds of innovation and technological literacy. Empowering these young minds with
the tools and understanding of the latest trends equips them to navigate the evolving
digital world with confidence.
CHAPTER - 4
SUMMARY
In the implementation of the Chatbot for Mining Related Queries, the focus was on the
practical application of technology in addressing specific industry challenges. The
implementation of a chatbot tailored for mining-related queries demonstrated the
potential of artificial intelligence in streamlining information retrieval and
communication processes within a specialized domain. The chapter delved into the
technical aspects of developing and deploying the chatbot, highlighting its role in
improving efficiency and accessibility for stakeholders in the mining sector.
Collectively, these three chapters showcase a diverse range of experiences and insights
gained during the internship. From the practical application of technology in a specific
industry context to a broader exploration of entrepreneurial principles and the hands-
on interaction with students in a government school, the report encapsulates the
multifaceted nature of the internship. The common thread throughout is the
transformative power of technology and education, whether applied to industry
problem-solving or shared with the next generation to shape a more technologically
literate and empowered society.