AI-based Workout Assistant and Fitness Guide

Published by : International Journal of Engineering Research & Technology (IJERT)
http://www.ijert.org ISSN: 2278-0181

Vol. 10 Issue 11, November-2021

Gourangi Taware[1], Rohit Agarwal[2], Pratik Dhende[3],
Prathamesh Jondhalekar[4], Prof. Shailesh Hule[5]
[1,2,3,4,5] Computer Department, Pimpri Chinchwad College of Engineering, Pune, India
Abstract: Virtual assistants now play an important role in our daily activities and have become an inseparable part of our lives. According to the Clutch survey report published in 2019, almost 27% of people use AI virtual assistants for their day-to-day activities. AI is an emerging field that we aim to explore through this project on AI-based workout assistants. In our work, we introduce Fitcercise, an application that detects the user’s exercise pose, counts the specified exercise repetitions and provides personalized, detailed recommendations on how the user can improve their form. The application uses MediaPipe to detect a person’s pose, then analyses the geometry of the pose from the dataset and the real-time video and counts the repetitions of the particular exercise.

Keywords: AI, Virtual assistant, CNN, workout assistant, Pose estimation, Blazepose, OpenCV.

1. INTRODUCTION:
In our work, we introduce Fitcercise, an application that detects the user’s exercise pose, counts the specified exercise repetitions and provides a personalized, detailed analysis of how to improve the user’s body posture. It is an AI-based Workout Assistant and Fitness guide for people who do not have access to a gym but still want to work out at home to maintain their physique and fitness and keep their body in good shape. It helps them perform the exercises correctly and protects them from chronic and immediate injuries. It also provides a personalised health guide and diet plan along with a personalised daily workout calorie count. The application also displays the health insurances and policies provided by the government of India for the common people and checks the eligibility criteria using API and Web services. Staying at home for long periods of time can become boring, especially when most fun activities are done outdoors, which is difficult considering the current scenario of pandemics and lockdown. But this cannot be a relevant excuse for being unproductive, because it is an excellent idea to invest the extra time we get in our own health.
Most gyms have a wide variety of exercise equipment and also have trainers who guide us about each exercise and its correct posture. The unavailability of this equipment and these trainers can be an important reason that stops us from exercising at home.
We aim to build an AI-based trainer that helps you exercise more efficiently in your own home. The project focuses on creating an AI algorithm that helps you exercise by determining the quality and quantity of repetitions, which is done by using pose estimation running on the CPU.
This project, which will have a non-distractive interface, intends to make exercising easier and more fun. We give an overview of the contribution of these families of approaches, their algorithms, advantages, disadvantages and efficiency compared to other existing technologies, as well as applications and possible future work.

2. LITERATURE REVIEW:
There are numerous applications available in the market which guide the user about the exercises to be performed. Through our application, we not only guide the user regarding which exercise to perform but also check the correct posture and count the repetitions using computer vision. This application can be considered a workout assistant which provides real-time posture detection and diet recommendations. The application can be used not only by individuals at home but, with an increased scope, also in gyms as a smart trainer, thus reducing human intervention.

In [1], the objective was to provide a bottom-up approach to pose estimation and real-time instance segmentation of people in multi-person images by implementing an efficient single-shot model. The proposed idea trains a convolutional neural network (CNN) to detect and classify the key points and to study their relative displacements, so that the key points can be clustered into groups that belong to individual pose instances. Using single-scale inference, the model obtained a COCO[6] keypoint average precision of 0.665, and 0.687 using multi-scale inference. Because it uses part-based modelling, it depends on key-point-level structure for training the real-time segmentation activity. In the future, there might be ways to overcome this limitation.
IJERTV10IS110154 www.ijert.org 471

(This work is licensed under a Creative Commons Attribution 4.0 International License.)
In research paper [2], the objective was to create BlazePose, a mobile-optimized lightweight convolutional neural network architecture for human posture prediction. On a Pixel 2 phone, the network produces 33 body key points (as shown in Fig 1) for a single individual during inference and operates at over 30 frames per second. This makes it ideal for real-time applications such as fitness tracking and sign language recognition. The two most significant contributions of that work are a novel body posture tracking method and a lightweight body pose prediction neural network. Both use heatmaps and regression to find the key points. The authors built a robust technique to estimate the posture using BlazePose, which uses a CNN and a dataset of up to 25K images showing distinct body key points, enhancing the accuracy. On a mobile CPU, this model runs in near real-time, and on a mobile GPU, it can run in super real-time.

The 33-keypoint topology is compatible with BlazeFace and BlazePalm. In that paper, the authors developed a system mainly for upper-body key points. A solution that shows lower-body analysis of pose will also be integrated.

In research paper [3], the researchers proposed an efficient solution to the multi-person problem, in which poses must be detected when there are multiple people in the real-time frame. In this approach, the model is trained to detect the key points of all people and then to group them based on the affinity between different points in the frame. This is considered a bottom-up approach and is efficient in both accuracy and performance, regardless of the number of people in the frame. On a dataset of 288 frame images, this approach outperforms the other approaches discussed above by 8.5% mAP. The approach achieves higher accuracy and precision in real-time. The earlier solutions were refined in the training stages. The disadvantages of OpenPose are that it does not return any depth data and that it needs high computing power.

In research paper [4], the authors aimed to obtain the precise location of the key points by using a deep neural network. In this approach, they presented DNN-based estimators. This improved the precision with which the pose is predicted, and the overall efficiency increases with this approach.

3. DATASET USED:
As most of the solutions use key points and heatmaps, we first require pose alignment data for each pose. We consider the test cases in which the complete body is visible and there are detectable key points for the body parts. To make sure that the pose detector also performs under heavy occlusion, which differs from these normal cases, we make use of occlusion-simulating augmentation. The training dataset has 60000 images, with a few images of the same pose having different key points, and 25000 frames in which the user performs the actual exercise.

Fig 1: 33 Body Key Points[6]

4. IMPLEMENTATION AND ALGORITHM:
We have used JavaScript (Node.js) and different libraries such as OpenCV and MediaPipe, a library that uses ML algorithms along with different numerical algorithms.

The MediaPipe pose estimation tool uses a 33-key-point approach: it detects the key points and, based on what it has learned from the dataset, estimates the pose. It tracks the pose from the real-time camera frame or RGB video by using the BlazePose tool, which takes a machine learning approach to pose detection.

This approach uses a two-step detector-tracker machine learning pipeline, which has proven efficient in MediaPipe solutions. The detector first locates the region of interest of the activity or posture in the real-time video. The tracker then predicts the key points in the region of interest using the real-time video frame as input. Note that the detector is invoked only at the start or whenever the model is unable to detect the body key points in the frame.

We have created a module named PoseModule.js, defined various functions in it and imported this module into our main project file aiTrainer.js to use these functions.
We first detect the landmark positions on the body in the video with the help of MediaPipe[9]. Then the angle between the points is calculated and a range is determined. This range is shown as a 0-100% efficiency bar on the output video frame. We also calculate the number of repetitions of the exercise and display the count in the output video.
Formula for calculating the angle formed by 3 points (x1, y1), (x2, y2) and (x3, y3), with the vertex at (x2, y2):
Angle = math.degrees(math.atan2(y3-y2, x3-x2) - math.atan2(y1-y2, x1-x2))

The output displays the following data: fps rate, counter for repetitions, landmark points, the angle between landmark points and a status bar.

This project can be run on pre-recorded videos as well as in real-time through a webcam.
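The angle, efficiency bar and repetition count described above can be illustrated with a short sketch. It is written in Python only because the angle formula in the text uses Python's math module; it is not the code of PoseModule.js. The names joint_angle, effort_percent and RepCounter, and the 160 and 40 degree thresholds, are hypothetical choices made for this illustration, and the landmarks are assumed to be 2D (x, y) positions already returned by the pose estimator.

```python
import math


def joint_angle(p1, p2, p3):
    """Return the angle in degrees at vertex p2, folded into [0, 180]."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Same expression as the angle formula given in the text.
    angle = math.degrees(
        math.atan2(y3 - y2, x3 - x2) - math.atan2(y1 - y2, x1 - x2)
    )
    # The raw difference lies in (-360, 360); fold it to the interior angle.
    angle = abs(angle)
    if angle > 180:
        angle = 360 - angle
    return angle


def effort_percent(angle, start_angle=160.0, end_angle=40.0):
    """Map an angle linearly to 0-100 %, clamped.

    start_angle and end_angle are assumed thresholds (e.g. an extended
    and a fully bent elbow); they are not values taken from the paper.
    """
    percent = (start_angle - angle) / (start_angle - end_angle) * 100
    return max(0.0, min(100.0, percent))


class RepCounter:
    """Count one repetition per full 0 % -> 100 % movement (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self._armed = False  # becomes True once the start position is seen

    def update(self, percent):
        if percent <= 0:
            # Back at the start position: the next full movement may count.
            self._armed = True
        elif percent >= 100 and self._armed:
            self.count += 1
            self._armed = False
        return self.count
```

For a bicep curl, for example, p1, p2 and p3 would be the shoulder, elbow and wrist landmarks of one arm. Each frame's angle is passed through effort_percent to drive the bar and then through RepCounter.update. Requiring a return to 0 % before the next count is one simple way to avoid counting the same repetition twice when the angle jitters near a threshold.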

Fig 2: Inference Pipeline

NEURAL NETWORK ARCHITECTURE:

The estimator in our application first estimates the position of the 33 key points of the user and later utilises the user alignment. We utilise a combination of the heatmap and regression approaches. In the training model, we utilise both approaches and then prune the corresponding output layers from the inference model. We used the heatmap to supervise the lightweight embedding, which is then used by the regression encoder. The solution is inspired by the Stacked Hourglass solution[12]. We used skip-connections at all levels in order to balance high-level and low-level features. The gradients from the regression encoder are not propagated back to the heatmap-trained features during training.
For their last post-processing stage, the bulk of current object detection methods use the Non-Maximum Suppression (NMS) algorithm. For rigid objects with few degrees of freedom, this method works effectively. This algorithm, however, fails in cases that feature highly articulated human postures, such as individuals waving or hugging. This is because several ambiguous boxes satisfy the NMS algorithm's intersection over union (IoU) threshold. Refer to Fig 3 for the system implementation plan.

Fig 3: System Implementation plan

MATHEMATICAL MODELING OF CNN MODEL:
KERNEL CONVOLUTION:
It is a method in which we take a small matrix of numbers (known as a kernel or filter), pass it over our image, and transform the image using the values from the filter. The feature map values are determined using the formula below, where f denotes the input picture and h denotes our kernel. The indexes of the result matrix's rows and columns are denoted by m and n, respectively.

$$G[m, n] = (f * h)[m, n] = \sum_{j} \sum_{k} h[j, k] \, f[m - j, n - k]$$

Relu:
$$\mathrm{ReLU}(x) = \max(0, x)$$

Softmax:
$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$

The remaining quantities of the model are the padding, the output matrix dimension, the resultant tensor after multiple filters, the activation and backpropagation.

5. BLOCK DIAGRAM:
1. User Login: The user has to enter valid credentials to log in to the system; the personal data of the user is saved to the respective account.
2. Exercise Routines: The application contains different exercise routines, each with different exercises that the user can do in real-time, along with pose correction and repetition counter tools.
3. Repetition counter: It counts the repetitions the user does of a particular exercise in real-time by identifying the position of the user.

4. Pose corrector: It helps the user detect and correct their exercise posture in real-time by using different pose detection algorithms and computer vision techniques.
5. Diet Recommendation: The system prepares a diet plan for the user depending upon their health issues and areas of interest.
6. Personal Recommendation and record log: The system monitors the daily calorie loss and the exercise repetition counts of the user and analyses the data to give relevant reports to the person, thereby increasing the precision of the recommendations.

Fig 4: Block Diagram of the system

6. FUNCTIONAL REQUIREMENTS:
1. A pose estimator that detects the pose and counts the repetitions, along with a posture guide.
2. A personalised calorie counter depending upon the exercises performed.
3. Diet planners that present different diets depending upon the health conditions and calorie intake.
4. A platform to display the different health insurances and policies provided by the Indian government along with their benefits and eligibility criteria.
5. Display of different exercise routines according to the health conditions, focusing mainly on fitness and weight loss.

7. SYSTEM ARCHITECTURE:
Fig 5: System Architecture

8. ADVANTAGES:
1. There are numerous applications available in the market which guide the user about the exercises to be performed. Through this application, we not only guide the user regarding which exercise to perform but also check the correct posture and count the repetitions using computer vision.
2. The application monitors the user in real-time, keeping track of the quality repetitions of a particular exercise and thus keeping the user's form correct throughout the workout. This educates beginners about different exercise routines and their correct postures to prevent injuries.
3. The application also offers personalised health advice and nutrition ideas while keeping the daily calorie log in the database.
4. The application can be used not only by individuals at home but, with an increased scope, also in gyms as a smart trainer, thus reducing human intervention.
5. Our main motive is to spread awareness about the importance of good health and fitness among common people.

9. LIMITATIONS:
1. The application can estimate the poses and count repetitions for only a limited number of exercises, as pose estimation using computer vision can be difficult for some exercises and postures.
2. The application is developed as a web application and is not available as a mobile Android/iOS application.
3. The application cannot capture multiple people in the frame in the real-time system.

10. APPLICATION:
The application can be used indoors at home or in gyms to get pose detection and correction suggestions. It can also be used to keep the daily log of calories of each user and to suggest changes and exercises accordingly.
Apart from this, the application can be used to spread awareness about the different health-related government schemes and health insurance-related information.

11. CONCLUSION AND FUTURE WORK:
Nowadays our lives are becoming busier and we hardly find time in our schedules to stay healthy and fit and to exercise daily. This has caused many diseases and health issues. Implementation of Artificial Intelligence in the field of fitness can solve many problems. Health-related applications and devices are making our lives easier and easing our fitness journey. Individuals can use this application in their own workouts, making them more efficient and less error-prone. In this process, we learnt how to use the OpenCV library and how the application of machine learning can be beneficial to humans.

There is a lot of scope for development in this project. The project can be upgraded to support more exercises. A user interface can be added for easy navigation through the exercises. The data collected by the AI trainer can be saved and processed for the next sessions. A daily steps tracker can also be added. The trainer will suggest a workout plan and its intensity according to the user's body type and weight. This application can be developed into a complete Android/iOS application for ease of use.
From the brief insight provided above, it follows that the “AI-based workout assistant and fitness guide” uses some concepts of BlazePose, requires a camera to capture the body pose as input to the system and, with the help of the pose estimator, provides the stats of calories burnt and exercise count as output in human-readable form.
Future work may include moving the camera vertically and horizontally to capture a wider variety of exercises, or using multiple cameras to capture the body pose from various angles in order to feed the templates of other exercises.

12. REFERENCES:
[1] “PersonLab: Person Pose Estimation & Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model” G.Papandreou, T.Zhu, L.-C.Chen, S.Gidaris, J.Tompson, K.Murphy.
[2] “BlazePose: On-device Real-time Body Pose tracking.” V.Bazarevsky, I.Grishchenko, K.Raveendran, T.Zhu, F.Zhang, M.Grundmann.
[3] “OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields” Z.Cao, G.Hidalgo, T.Simon, S.-E.Wei, Y.Sheikh.
[4] “DeepPose: Human Pose Estimation via Deep Neural Networks” (August 2014) A.Toshev, C.Szegedy (Google).
[5] COCO 2020 Keypoint Detection Task.
[6] “Deep Learning-based Human Pose Estimation using OpenCV” by V.Gupta.
[7] “Pose Trainer: Correcting Exercise Posture using Pose Estimation” by S.Chen, R.R.Yang, Department of CS, Stanford University.
[8] “BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs” V.Bazarevsky, Y.Kartynnik, A.Vakunov, K.Raveendran, M.Grundmann.
[9] “MediaPipe Hands: On-device Real-time Hand Tracking.” F.Zhang, V.Bazarevsky, A.Vakunov, A.Tkachenka, G.Sung, C.L.Chang, M.Grundmann.
[10] “Composite fields for human pose estimation” by S.Kreiss, L.Bertoni, and A.Alahi. IEEE Conference on Computer Vision and Pattern Recognition, pages 11977–11986, 2019.
[11] “Microsoft COCO: Common objects in context” by T.Y.Lin, M.Maire, S.Belongie, J.Hays, P.Perona, D.Ramanan, P.Dollár, and C.Lawrence Zitnick. Springer, 2014.
[12] “Stacked hourglass networks for human pose estimation” by A.Newell, K.Yang, and J.Deng. In European conference on computer vision, pages 483–499. Springer, 2016.
[13] “Robust 3d hand pose estimation in single depth images: from single-view CNN to multi-view CNNs” by L.Ge, H.Liang, J.Yuan, and D.Thalmann. IEEE conference on computer vision and pattern recognition, 2016.
[14] “Feature pyramid networks for object detection” by T.Y.Lin, P.Dollár, R.Girshick, K.He, B.Hariharan, and S.J.Belongie. CoRR, abs/1612.03144, 2016.
[15] “Robust articulated-icp for real-time hand tracking” by A.Tagliasacchi, M.Schroder, A.Tkach, S.Bouaziz, M.Botsch, and M.Pauly. In Computer Graphics Forum, volume 34. Wiley Online Library, 2015.
[16] “Associative embedding: End-to-end learning for joint detection and grouping” by A.Newell and J.Deng. NIPS, 2017.

