Framework for Autonomous UAV Navigation and Target
Detection in Global-Navigation-Satellite-System-Denied and
Visually Degraded Environments
Sebastien Boiteau, Fernando Vanegas and Felipe Gonzalez
Abstract: Autonomous Unmanned Aerial Vehicles (UAVs) have possible applications in wildlife
monitoring, disaster monitoring, and emergency Search and Rescue (SAR). Autonomous capabilities
such as waypoint flight modes and obstacle avoidance, as well as their ability to survey large areas,
make UAVs the prime choice for these critical applications. However, autonomous UAVs usually rely
on the Global Navigation Satellite System (GNSS) for navigation and normal visibility conditions to
obtain observations and data on their surrounding environment. These two parameters are often
lacking due to the challenging conditions in which these critical applications can take place, limiting
the range of utilisation of autonomous UAVs. This paper presents a framework enabling a UAV to
autonomously navigate and detect targets in GNSS-denied and visually degraded environments.
The navigation and target detection problem is formulated as an autonomous Sequential Decision
Problem (SDP) with uncertainty caused by the lack of the GNSS and low visibility. The SDP is
modelled as a Partially Observable Markov Decision Process (POMDP) and solved using the Adaptive
Belief Tree (ABT) algorithm. The framework is tested in simulations and real life using a navigation
task based on a classic SAR operation in a cluttered indoor environment with different visibility
conditions. The framework is composed of a small UAV with a weight of 5 kg, a thermal camera used
for target detection, and an onboard computer running all the computationally intensive tasks. The
results of this study show the robustness of the proposed framework to autonomously explore and
detect targets using thermal imagery under different visibility conditions. Devising UAVs that are
capable of navigating in challenging environments with degraded visibility can encourage authorities
and public institutions to consider the use of autonomous remote platforms to locate stranded people
in disaster scenarios.
Keywords: partially observable Markov decision process; unmanned aerial vehicles; search and
rescue; low visibility; embedded systems; remote sensing; motion planner
1. Introduction
Over the past 50 years, the number of natural disasters driven by climate change and more
extreme weather has increased by a factor of five [1]. Disaster events caused by natural
hazards affected approximately 100 million people and resulted in 15,082 deaths in 2020
alone [2]. It is therefore important to improve Search and Rescue technologies that could
assist rescuers in locating victims, especially in challenging environments where human
intervention is often dangerous or unfeasible.
Autonomous UAVs can be utilised to operate remotely in challenging environments.
As an example, autonomous UAVs are used in critical applications such as surveillance [3],
SAR [4], and mining [5]. These critical applications increase in complexity in environments
characterised by the absence of the GNSS and the presence of visual obstructions in the form
of low luminosity, smoke, and dust. These particular conditions are a great challenge to
UAV autonomous navigation, decision making, and target detection. Possible applications
for autonomous UAVs are SAR operations such as the UAV Challenge [6] and subterranean
operations like the DARPA SubT Challenge [7]. Australia also started using drones to detect
sharks to learn more about their behaviours and to notify lifeguards of their presence [8].
Autonomous perception and localisation in low-visibility conditions without the
GNSS require the use of multiple sensors, allowing the agent to perceive its surrounding
environment. A framework for UAV navigation under degraded visibility was developed
in [9–12]. The objective was to allow a small UAV to autonomously explore and navigate in
subterranean environments in the presence of visual obstructions. The final framework
combined the use of 3D (three-dimensional) LIDAR (Light Detection and Ranging), a
thermal camera, an RGB (red, green, blue) camera and an IMU to perform odometry and
Simultaneous Localisation and Mapping (SLAM). The LIDAR was used for the LIDAR
odometry and mapping (LOAM) method [13], and both cameras were used for odometry
only. The agent was able to explore cluttered, low-light environments in the presence of
smoke and dust, without a GNSS signal. The main limitation of these papers was that their
framework was restricted to cluttered, indoor environments, where LIDAR
is extremely efficient. Another gap in the knowledge from these papers is the lack of
uncertainty modelling. Subterranean environments can cause uncertainty in observations,
states, and actions due to their challenging conditions. An analysis of the use of multiple
low-cost on-board sensors for ground robots or drones navigating in visually degraded
environments was proposed by Sizintsev et al. [14]. An IMU, stereo cameras with LED
lights, active infrared (IR) cameras, and 2D (two-dimensional) LIDAR were successfully
tested on a ground robot, but not on a UAV.
The use of thermal imagery for object detection is a possible solution for SAR mis-
sions operating in low-visibility conditions such as smoke, fog, or low luminosity (night).
Jiang et al. [15] proposed a UAV thermal infrared object detection framework using You
Only Look Once (YOLO) models. The FLIR Thermal Starter Dataset [16] was used to
train the model, which contains four classes: person, car, bicycle, and dog. Their research
consisted of car and person multi-object detection experiments using YOLOv3, YOLOv4,
and YOLOv5 models. The conclusion drawn from these experiments was that the YOLOv5
model can be used on board a UAV due to its small size, relative precision, and speed of
detection. Similarly, Dong et al. [17] and Lagman et al. [18] used YOLO models for human
detection using thermal imagery.
Decision making under uncertainty is the process of an agent receiving an incomplete
or noisy observation at a precise time and then choosing an action based on this observa-
tion [19]. In SAR missions taking place under challenging conditions, the robotic agent is
subjected to uncertainty due to the possible lack of the GNSS and by the presence of visual
obstructions. Uncertainty and partial observability are modelled with two mathematical
frameworks: the Markov Decision Process (MDP) and the POMDP. POMDPs were proven
to be viable in UAV frameworks for autonomous navigation under partial observability
and uncertainty [20–23]. Sandino et al. [24] proposed a framework for UAV autonomous
navigation under uncertainty and partial observability from imperfect sensor readings in
cluttered indoor scenarios. The navigation problem was modelled as a POMDP and was
solved in real time with the ABT solver. Only colour imagery was used to detect the target.
In later work, Sandino et al. [25] proposed a framework for UAV autonomous navigation in
outdoor environments. The navigation problem was modelled with uncertainty and using
a POMDP. Colour and thermal imagery were used to make the system more robust against
the environmental conditions. Multiple flight modes were tested: a classic motion planner
working with a list of position and velocity waypoints creating a survey zone, the POMDP
motion planner, and a fusion of the two flight modes. The developed framework was able
to find an adult mannequin by reducing the object detection uncertainty. Both research
papers used the ABT online POMDP solver [26]. A gap in the knowledge highlighted by
both these works is the lack of testing in different visibility conditions, as the frameworks
were not tested under poor visibility conditions.
This paper presents a framework for autonomous UAV navigation, exploration and
target finding in low-visibility environments without the GNSS. The navigation problem is
mathematically formulated using a POMDP, which models action and state uncertainty
with probabilistic distributions. The navigation task is inspired by SAR in challenging
environments. A UAV with a weight of 5 kg is deployed in a cluttered indoor environment
to explore and locate a victim. Thermal imagery is used to detect the heat signature of
a person under low-visibility conditions. This work evaluates the performance of the
proposed framework via Software in the Loop (SITL) simulations, Hardware in the Loop
(HIL), and real-life testing (RLT).
2. Background
This section covers the theory and principles of POMDPs and the Adaptive Belief Tree,
which is the POMDP online solver used in this research.
2.1. POMDP
In this research, the navigation problem was modelled as a Sequential
Decision Problem. As the testing environment is characterised by a low luminosity and the
absence of the GNSS, modelling and accounting for uncertainty was a fundamental feature
required in this work. The POMDP, a mathematical framework that models decision making
under uncertainty in motion and action in a non-fully observable environment [27,28],
was selected.
A POMDP is modelled using these parameters: (S, A, O, T, Z, R, γ) [29]. S is the state
space, a finite set of states representing the possible conditions of the agent and of the
environment. A is a finite set of actions that the agent can execute to go from one state
to another. O is a finite set of observations that the UAV can perceive. T is the transition
function, modelling the transition of the agent from one state to the next after performing
an action a. Z is the observation function, modelling the probability of observing o from
state s after executing an action a. R is a finite set of rewards, and γ is the discount factor
with γ ∈ [0, 1]. In a POMDP, the robot state is not represented by a single estimation,
but by a probability distribution called a belief state b, with B being the set of all possible
belief states.
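As an illustration only (not the authors' implementation), such a belief can be approximated by a set of particles and updated through a generative model that samples the transition function T and observation function Z:

```python
import random

class ParticleBelief:
    """Particle approximation of a belief state b (illustrative sketch only)."""

    def __init__(self, particles):
        self.particles = particles  # list of hypothesised states s

    def update(self, action, observation, generative_model,
               n_particles=500, max_tries=100000):
        """Propagate particles through T and keep those whose simulated
        observation matches the received one (rejection sampling over Z)."""
        accepted = []
        for _ in range(max_tries):
            if len(accepted) >= n_particles:
                break
            s = random.choice(self.particles)
            s_next, o_sim, _r = generative_model(s, action)  # samples (T, Z, R)
            if o_sim == observation:
                accepted.append(s_next)
        return ParticleBelief(accepted or self.particles)  # fall back if none matched
```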
The objective of a POMDP is to determine a sequence of actions given a current belief
state b that maximises the discounted accumulated reward. This sequence of actions is
called a policy and is represented by the symbol π. The discounted accumulated reward
is the sum of all the discounted rewards from each action executed during the mission.
The aim of the POMDP is to find an optimal policy π ∗ : B → A that maps belief states to
actions and that maximises the total expected discounted return. Equation (1) represents
the mathematical expression of the optimal policy.
\pi^{*} = \underset{\pi}{\operatorname{argmax}} \; \mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, R\big(s_{t}, \pi(b_{t})\big) \,\Big|\, b_{0}, \pi \right] \qquad (1)
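To make the effect of the discount factor concrete, the short sketch below computes a discounted accumulated reward for a rollout; the value of γ is illustrative, not the one used in the experiments:

```python
def discounted_return(rewards, gamma=0.95):
    """Discounted accumulated reward sum_t gamma^t * r_t for one episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# A policy that reaches the target (large positive reward) earlier scores higher:
print(discounted_return([-1, -1, 100]))           # ~88.3
print(discounted_return([-1, -1, -1, -1, 100]))   # ~77.7
```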
2.2. ABT
In this research, the ABT online POMDP solver [26] was selected. Online POMDP
solvers are characterised by their ability to update the model during the execution, while
only the known part of the environment and its dynamics are modelled. Most of the
current online POMDP solvers have one main limitation: they recompute the policy from
scratch at every time step, wasting computational resources. This restriction heavily
impacts platforms constrained in size and power, such as small UAVs. The ABT solver was
selected for its ability to reuse and improve the previous policy when there are changes
in the POMDP model. Moreover, the states and actions can be modelled in a continuous
representation using a generative model.
The ABT algorithm contains two processes: the preprocess, which generates the first
policy estimation offline, and the runtime process, which updates the previously computed
policies. To start with, the agent executes the first action computed by the offline policy.
Then, observations are collected using onboard sensors, and the ABT updates the online
policy. The next action is then executed based on the updated policy.
3. System Architecture
The system architecture highlighted in Figure 1 was designed to allow a UAV to
perform autonomous navigation in GNSS-denied and visually degraded environments.
Decision making under uncertainty is executed following the POMDP representation
shown in Section 2.1.
Figure 1. System architecture for autonomous UAV navigation in GNSS-denied and visually degraded
environments. It is composed of a detection module processing raw IR frames from a thermal camera,
a localisation module using LIDAR inertial odometry, a decision-making module sending an action
to the flight controller, and a motion module controlling the dynamics of the UAV.
Multiple modules are used to distribute the different functions implemented in this
framework. The detection module processes IR images from the thermal camera using
a YOLOv5 object detector trained to detect the thermal signature of a human being. The
localisation module uses 2D LIDAR and an IMU to perform odometry, allowing the agent
to compute a local pose estimation. In this paper, the localisation module is yet to be
implemented; however, the pose estimation uncertainty, representing the uncertainty
in localisation, was modelled from real-life experiments. This will be covered in more
depth in Section 4.6. The decision-making module is composed of the observations of the
environment made by the detection and localisation modules, which are then used by the
POMDP solver. It then computes the optimal sequence of actions to accomplish the flight
mission. These actions are fed to the motion module, which will manage the speed of the
actuators to control the dynamics of the UAV.
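To make the module boundaries concrete, the sketch below shows how a decision-making node could be wired in ROS 1; the MAVROS topics are standard, whereas the detection topic name and message type are assumptions made for illustration:

```python
#!/usr/bin/env python
# Illustrative wiring of the decision-making module (not the authors' code).
import rospy
from geometry_msgs.msg import PoseStamped
from std_msgs.msg import Bool

class DecisionNode:
    def __init__(self):
        self.pose = None
        self.target_found = False
        rospy.Subscriber("/mavros/local_position/pose", PoseStamped, self.pose_cb)
        rospy.Subscriber("/detection/target", Bool, self.target_cb)  # assumed detector topic
        self.setpoint_pub = rospy.Publisher("/mavros/setpoint_position/local",
                                            PoseStamped, queue_size=1)

    def pose_cb(self, msg):
        self.pose = msg

    def target_cb(self, msg):
        self.target_found = msg.data

    def send_setpoint(self, x, y, z):
        """Publish the next position setpoint selected by the POMDP policy."""
        sp = PoseStamped()
        sp.header.stamp = rospy.Time.now()
        sp.pose.position.x, sp.pose.position.y, sp.pose.position.z = x, y, z
        self.setpoint_pub.publish(sp)

if __name__ == "__main__":
    rospy.init_node("pomdp_decision_node")
    node = DecisionNode()
    rospy.spin()
```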
4. Framework Implementation
This section presents the hardware and software used to implement the framework as
presented in Section 3. To begin with, the UAV frame and payload, as well as operating
systems and communication interface of the system, are highlighted. Then, the decision-
making module, consisting of the POMDP formulation, is presented. Finally, both the
computer vision and decision-making modules are explained.
(a) (b)
Figure 2. Fully implemented UAV frame used in this research. (a) Below view of the UAV highlighting
the frame (1); thermal camera FLIR TAU2 (2); and companion computer Jetson Orin Nano (3).
(b) Above view of the UAV highlighting the Optitrack tracker (4) and Holybro 915 MHz Telemetry
Radio (5).
The problem formulation, including the state space, set of actions, transition function,
rewards and reward function, observation space, and observation model, as well as the
belief states, will be covered below.
4.4.2. Actions
Several actions were chosen for the UAV to interact with its environment as shown
in Table 1. The actions were restricted to pure translation (no rotations) to simplify the
UAV’s dynamics.
Table 1. Set of actions selected in this research problem. Each action will apply a displacement ∆
from time step t to t + 1, impacting the corresponding x, y or z coordinate of the agent.
Action      x_a(t+1)          y_a(t+1)          z_a(t+1)
Forward     x_a(t) + ∆x       y_a(t)            z_a(t)
Backward    x_a(t) − ∆x       y_a(t)            z_a(t)
Left        x_a(t)            y_a(t) + ∆y       z_a(t)
Right       x_a(t)            y_a(t) − ∆y       z_a(t)
Up          x_a(t)            y_a(t)            z_a(t) + ∆z
Down        x_a(t)            y_a(t)            z_a(t) − ∆z
Hover       x_a(t)            y_a(t)            z_a(t)
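Table 1 can be read as a lookup from action to displacement, as in the sketch below; the step size is illustrative (the 0.6 m value follows from the velocity and time step discussed in Section 7):

```python
# Sketch of Table 1 as a mapping from action to displacement (dx, dy, dz).
DX = DY = DZ = 0.6  # metres, illustrative step sizes for Δx, Δy, Δz

ACTIONS = {
    "forward":  ( DX, 0.0, 0.0),
    "backward": (-DX, 0.0, 0.0),
    "left":     (0.0,  DY, 0.0),
    "right":    (0.0, -DY, 0.0),
    "up":       (0.0, 0.0,  DZ),
    "down":     (0.0, 0.0, -DZ),
    "hover":    (0.0, 0.0, 0.0),
}

def apply_action(pose, action):
    """Apply the displacement of an action to an (x, y, z) pose."""
    dx, dy, dz = ACTIONS[action]
    x, y, z = pose
    return (x + dx, y + dy, z + dz)
```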
Table 2 highlights the selected values for each reward. Negative rewards are used
to discourage actions that would put the agent in an unwanted state. Both r_crash and r_out
are negative rewards. The former is used when the UAV crashes into an obstacle, and the
latter when it is out of the ROI. r_f is a positive reward given when the agent finds the target. To
encourage the UAV to explore the environment, r_exp is a negative accumulative reward,
punishing the agent if an action results in an already explored area, and r_new is a positive
reward, rewarding the UAV if an action results in an unexplored area. r_alt is a reward
encouraging the UAV to increase its altitude to facilitate the target detection module. The
reward function is represented by Algorithm 1.
Table 2. Set of reward values used in the reward function R, defined in Algorithm 1.
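The values of Table 2 and the listing of Algorithm 1 are not reproduced here; the sketch below only illustrates the structure of such a reward function, with placeholder values that are not the authors':

```python
# Placeholder reward values, NOT those used in the paper.
R_CRASH, R_OUT, R_FIND = -100.0, -50.0, 200.0
R_EXP, R_NEW, R_ALT = -2.0, 5.0, 1.0

def reward(state, grid):
    """Illustrative reward function in the spirit of Algorithm 1."""
    if state.crashed:
        return R_CRASH                      # collided with an obstacle
    if state.out_of_roi:
        return R_OUT                        # left the region of interest
    if state.target_found:
        return R_FIND                       # target detected
    r = R_ALT if state.z >= state.required_altitude else 0.0
    cell = grid.cell_at(state.x, state.y)
    r += R_NEW if cell.visits == 0 else R_EXP * cell.visits  # encourage exploration
    return r
```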
A cell is marked as explored when the estimated position of the UAV is inside the cell. Figure 3 illustrates how the grid map is initialised
and updated. The illustration represents the environment in which the system was tested
in simulation and in real life. Unexplored cells are white, explored cells are highlighted in
green, and obstacles are highlighted in red. Cells that can be explored also contain in their
centre the number of times they have been explored. A functionality allowing the agent
to keep exploring until the target was detected or until the maximum flight time was reached
was also introduced by keeping track of the number of cells explored and selecting an
arbitrary percentage of exploration. If this percentage is attained, all the cells are marked as
unexplored, and their count is reset.
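A minimal sketch of this exploration bookkeeping, with an assumed reset threshold, is shown below:

```python
class ExplorationGrid:
    """Sketch of the exploration grid: cells count visits and are reset once a
    chosen fraction of the map has been explored (threshold is illustrative)."""

    def __init__(self, n_cells, reset_fraction=0.8):
        self.visits = [0] * n_cells
        self.reset_fraction = reset_fraction

    def visit(self, idx):
        self.visits[idx] += 1
        explored = sum(1 for v in self.visits if v > 0) / len(self.visits)
        if explored >= self.reset_fraction:       # keep exploring until the target is found
            self.visits = [0] * len(self.visits)  # mark all cells as unexplored again
```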
Figure 3. Grid map representation of the testing environment at the start of the mission (a) and
during flight with updated cells (b). The unexplored cells are in white, the explored cells in green,
and each cell contains the number of times they have been explored. The obstacles are represented
in red.
O = \{ o_{pa}, o_{pt}, o_{tf}, o_{od} \} \qquad (6)
where o_pa is the local pose estimation of the agent, o_pt is the local pose estimation of the
target, and o_tf is a discrete observation defining if the target has been detected or not. o_pt is
received only when the target is detected. o_od is a flag describing if the agent is close to
an obstacle. This observation is obtained using an occupancy map object and the agent’s
location within this map.
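For illustration, the observation of Equation (6) could be carried as a simple structure; the field names below are assumptions, not the authors' types:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Observation:
    """Illustrative container for o = (o_pa, o_pt, o_tf, o_od)."""
    o_pa: Tuple[float, float, float]            # local pose estimate of the agent
    o_pt: Optional[Tuple[float, float, float]]  # target pose estimate, only when detected
    o_tf: bool                                  # target detected flag
    o_od: bool                                  # agent close to an obstacle (occupancy map)
```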
The trained model was considered sufficient for full integration into the system. Even with a
low mAP_0.95, the model was able to detect the heated mannequin consistently, with few false
positives and false negatives, in simulation and real life.
Figure 5 shows the heated mannequin used in real-life experiments and the YOLOv5
detection when the UAV was flying above the target. More information about the man-
nequin will be given in Section 5.1.
Figure 4. Visual analysis of YOLOv5 evaluation indicators during training (precision, recall, and mAP plotted against training epochs).
Figure 5. Target detection of the thermal mannequin using the FLIR TAU 2 thermal camera and
YOLOv5. (a) Thermal mannequin side view. (b) Target detected with an 81% confidence during flight
after processing the raw frame from the FLIR TAU 2 thermal camera.
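A detector of this kind can be exercised with the public ultralytics/yolov5 hub interface, as sketched below; the weight and image file names are placeholders, not the authors' artefacts:

```python
import torch
import cv2

# Load a custom-trained YOLOv5 model (placeholder weight file name).
model = torch.hub.load("ultralytics/yolov5", "custom", path="thermal_person.pt")
model.conf = 0.5                                    # confidence threshold

frame = cv2.imread("tau2_frame.png")[:, :, ::-1]    # 8-bit thermal preview, BGR -> RGB
results = model(frame)
for *xyxy, conf, _cls in results.xyxy[0].tolist():  # one row per detection
    print(f"person-like heat signature at {xyxy} with confidence {conf:.2f}")
```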
5. Experiments
The framework presented in Section 4 was tested in a SAR mission in an indoor envi-
ronment with several obstacles in normal and low-visibility conditions. SITL simulations
and real flight tests were carried out. This section covers the environment setup, as well as
the parameters used in the POMDP problem formulation.
Figure 6. Environment setup in SITL with normal visibility conditions. (a) Top view of the environ-
ment simulated in Gazebo. (b) Side view of the SITL environment.
The real-life environment was tested under two different lighting conditions: normal lighting
and darkness, as shown in Figures 8 and 9, respectively. The positions of the target and obstacles were static, and
no external disturbances (wind) were applied during the mission. In the current framework
implementation, LIDAR/inertial odometry is not integrated. The system is also currently
required to use a fixed occupancy map to know the obstacles’ position. For both real-life
and simulation experiments, the UAV path, target and pose estimation belief states, and
occupancy map were visualised on RViz [41].
Figure 7. SITL environment setup with low-visibility conditions. (a) Top view of Gazebo with smoke
at the start of the mission. (b) Top view of Gazebo with smoke during flight with the detection output.
Figure 8. Environment setup for real-life experiments at QUT O1304 with normal visibility conditions.
(a) Angled view of the flying area. (b) Flying area outside the net.
Figure 9. Environment setup for real-life experiments at QUT O1304 with low-visibility conditions.
6. Results
Two different setups were tested in simulation (S) and in real-life testing. Each setup
was tested in normal (NV) and low-visibility (LV) conditions. The first testing configuration
(M1) consists of obstacles of different heights, from 1.65 m to 3.5 m (limit of altitude),
while the second setup (M2) only has obstacles of 3.5 m. The metrics used to evaluate the
success of each setup consist of Success (target found), Crash (if the UAV collides with an
obstacle), ROI Out (if the UAV leaves the ROI), and Timeout (target not found in the time
limit). In these tests, the target position was opposite the starting position of the UAV, with
coordinates of (4.5;1.5). Figure 10 shows the RViz environment of both maps, and Table 4
summarises the results:
Figure 10. RViz environment for both maps at the start of the mission.
Table 4. Performance metrics for SITL simulations and real flight tests.

Setup         Iterations   Success Rate   Crash Rate   ROI Out   Timeout Rate
M1 NV (S)     30           100%           0%           0%        0%
M1 NV (RLT)   8            100%           0%           0%        0%
M1 LV (S)     30           100%           0%           0%        0%
M1 LV (RLT)   8            100%           0%           0%        0%
M2 NV (S)     30           100%           0%           0%        0%
M2 NV (RLT)   8            87.5%          0%           12.5%     0%
M2 LV (S)     30           100%           0%           0%        0%
M2 LV (RLT)   8            100%           0%           0%        0%
S             120          100%           0%           0%        0%
RLT           32           96.87%         0%           3.13%     0%
From all the testing scenarios presented, the proposed framework performed perfectly
in simulations, with a 100% success rate (target found with no crashes), and a 96.87%
success rate in real-life testing. For the simulation, a total of 120 iterations through all the
setups (30 each) were performed, along with 32 iterations for real-life testing (8 each). Overall, the
proposed system performed comparably in simulations and in real life.
The agent found the target in both maps and in different visibility conditions, highlighting
the ability of the decision-making module to explore different environments and the ability
of the thermal camera and object detection module to detect the target in normal and
low-visibility conditions. The differences between the heat signature of a human being and
the heated mannequin used in real-life testing did not impact the operation of the agent.
However, the mannequin was placed in a seated position in simulations to facilitate
detection, as the lying-down model was consistently missed. These missed detections
were caused by differences between the target used in simulation and the actual detection
model. The YOLOv5 model was trained using actual people and was not trained to detect
the simulated mannequin.
The most frequent trajectory for M1 is presented in Figure 11, and the most frequent
trajectory for M2 is shown in Figure 12. In all tested environments, the first series of actions
often resulted in the following steps: meeting the altitude requirement to avoid the negative
reward and going to the centre of the map. The former was part of the POMDP formulation,
with r_alt encouraging the UAV to reach the required altitude. The latter, on the other hand,
was the result of the complete formulation and reward function. The POMDP always outputs
a policy in which the first actions result in the agent’s position being in the centre of the
map to maximise the number of unexplored cells around the UAV. When the centre is
reached, the agent will either keep going toward the top side of the map, as shown in
Figure 11 and Figure 12, or the actions will result in the agent exploring the bottom side of
the map.
Figure 11. Most frequent trajectory of the agent for Map 1 (M1) in normal and low-visibility conditions
with the target position belief as red particles. Top view and side view of the trajectory in RViz with
higher obstacles and walls in light green, and lower obstacles in medium spring green.
Figure 12. Most frequent trajectory of the agent for Map 2 (M2) in normal and low-visibility conditions
with the target position belief as red particles. Top view and side view of the trajectory in RViz with
higher obstacles and walls in light green, and lower obstacles in medium spring green.
M1 was also used to highlight the ability of the agent to go above an obstacle, as
shown in Figure 13. The POMDP solver was able to recognise the lower obstacles as a
“non-threat” and output actions that placed the agent above these smaller obstacles
without causing any collisions.
Figure 13. Agent can fly over the smaller obstacles in medium spring green with the target position
belief as red particles. Top view and side view of the trajectory going over the obstacle in RViz with
higher obstacles and walls in light green.
The “Out of ROI” state that occurred in M2 NV in real-life testing was caused by a
known issue of the framework, which put the agent in an area far from other unexplored
cells, causing it to repeat UP and DOWN actions, forcing the agent to go higher than the
altitude limit. This is represented in Figure 14. This issue was limited by the functionality
of keeping track of the number of cells explored and resetting the explored cells when a
percentage of exploration is attained, as explained in Section 4.4.4.
Figure 14. Representation of the POMDP solver getting stuck in the bottom part of the map. As the
agent is too far from other unexplored cells, the POMDP is not able to compute practical actions and
repeats UP and DOWN commands. The obstacles and walls are in green; the target position belief
states are red particles. Top view and side view of the path on RViz.
The detection model was trained using images from an external dataset detecting
human beings. A mannequin was used in the experiments for safety purposes, hence
creating differences between the model and the target. Figure 15 highlights the differences
between the heat signature of the mannequin and a person. These differences caused a
few missed detections, which are highlighted in Figure 16. These missed detections forced the
agent to keep exploring the environment, often activating the reset functionality and resetting
the states of the cells. On the few occasions this happened, the agent was always
able to explore a larger part of the map and come back to the target area for a successful
detection. These missed detections were also caused by a loss of heat in the heat packs and
heated clothes after a long testing session, creating a noticeable difference between the
mannequin and an actual human being.
Figure 15. YOLOv5 detection output of thermal imagery with a heated mannequin and a human being.
[Figure 17. Box plots of the number of steps taken by the agent for each setup (M1/M2, NV/LV) in simulation and real-life testing; y-axis: Number of Steps.]
The difference between the lengths of the whiskers of the simulation and real-life re-
sults was caused by how the UAV motion in Gazebo is modelled. The equations modelling
the motion in Gazebo approximate the behaviour of a generic quadcopter rather than the
exact frame used in real-life testing, which in this case, was the Holybro X500 V2.
The main difference between the two maps is the height of the obstacles. The first map,
with obstacles of 1.7 m and 3.5 m, offers the agent a larger range of choices to explore
the environment as it has the possibility to fly over some obstacles. On the other hand,
the second map, with obstacles of 3.5 m only, restricts the freedom of the agent. This is
highlighted in Figure 17, with an interquartile range of 90 steps for M1 in both NV and LV,
compared to 65 and 66 steps for M2 in NV and LV, respectively, in simulations.
The problem formulation did not include any change in behaviour between normal and
low-visibility conditions. This is reflected in the medians of the boxplots: 76 steps for both
NV M1 and LV M1 in simulations, and 39 and 36.5 steps in real-life testing (Figure 17a,b);
for NV M2 and LV M2, 81 and 82 steps, respectively, in simulations, and 27 and 28 steps
in real-life testing (Figure 17c,d). The mean altitude in M1 for simulations was 2.97 m and 2.91 m in real-life
testing. For M2, the mean altitude was 2.98 m in simulations and 2.87 m in real-life testing.
7. Discussion
The framework proposed in this research paper offers a viable and interesting solu-
tion for autonomous UAV navigation and decision making in GNSS-denied and visually
degraded environments. This work is a continuation of [42], which itself was an extension
of the contributions from Vanegas and Gonzalez [23] and Sandino et al. [24]. In [42], a ther-
mal camera was added to the framework to improve target detection under low-visibility
conditions, some modifications to the problem formulation and TAPIR parameters were
made to improve the performance (see below), and real-life experiments were performed
to verify the simulation results.
The first version of the framework was modelled with a velocity of 0.3 m/s and a
time step of 2 s, resulting in a displacement of 0.6 m for the forward, backward, left, and
right actions. This displacement was calculated by the transition function, modelling the
transition from one state to another, which in this case, was the drone’s position state. This
transition function is represented by the following equations:
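The equations themselves are not reproduced here; a plausible reconstruction, consistent with Table 1 and the stated velocity and time step (an assumption, not necessarily the authors' exact notation), is:

```latex
x_a(t+1) = x_a(t) + v\,\Delta t\,u_x(a), \qquad
y_a(t+1) = y_a(t) + v\,\Delta t\,u_y(a), \qquad
z_a(t+1) = z_a(t) + v\,\Delta t\,u_z(a)
```

where (u_x(a), u_y(a), u_z(a)) is the unit direction associated with action a, and v·∆t = 0.3 m/s × 2 s = 0.6 m gives the displacement quoted above.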
Figure 18. Top view of the agent’s trajectory for Map 2 (M2) with the target position belief as
red particles. A 0.6 m displacement does not allow the agent to explore some areas, while a 1 m
displacement allows the agent to explore crowded areas.
Author Contributions: Conceptualisation, S.B., F.V. and F.G.; methodology, S.B., F.V. and F.G.;
hardware conceptualisation and integration, S.B. and F.V.; software, S.B. and F.V.; validation, S.B.;
formal analysis, S.B.; investigation, S.B. and F.V.; resources, F.G.; data curation, S.B.; writing—original
draft preparation, S.B.; writing—review and editing, F.V. and F.G.; visualisation, S.B.; supervision, F.V.
and F.G.; project administration, F.G.; funding acquisition, F.G. All authors have read and agreed to
the published version of the manuscript.
Funding: The authors would like to acknowledge The Australian Research Council (ARC) through
the ARC Discovery Project 2020 “When every second counts: Multi-drone navigation in GPS-denied
environments” (GA64830).
Data Availability Statement: The collected data supporting this article’s research findings are available
at https://1drv.ms/u/s!AmUEDov2Mv7ztV3M9V2vavsH-AjU (accessed on 22 November 2023).
Acknowledgments: The authors acknowledge the continued support from Queensland University of
Technology (QUT) through the QUT Centre for Robotics (QCR) and Engineering Faculty for allowing
access and flight tests at O block, QUT. The authors would also like to thank Dennis Brar for his help
with the LIDAR/Inertial odometry testing in low-visibility conditions.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
2D Two-dimensional
3D Three-dimensional
ABT Adaptive Belief Tree
FCU Flight Controller Unit
FOV Field of View
GNSS Global Navigation Satellite System
GPS Global Positioning System
HIL Hardware in the Loop
IoU Intersection over Union
IR Infrared
LIDAR Light Detection and Ranging
M1 LV Map One, Low Visibility
M1 NV Map One, Normal Visibility
M2 LV Map Two, Low Visibility
M2 NV Map Two, Normal Visibility
mAP Mean Average Precision
MDP Markov Decision Process
OS Operating System
POMDP Partially Observable Markov Decision Process
RGB Red, Green, Blue
RLT Real-Life Testing
ROI Region of Interest
ROS Robot Operating System
SAR Search and Rescue
SITL Software in the Loop
SLAM Simultaneous Localisation and Mapping
References
1. WMO. Weather-Related Disasters Increase over Past 50 Years, Causing More Damage but Fewer Deaths. Available on-
line: https://public.wmo.int/en/media/press-release/weather-related-disasters-increase-over-past-50-years-causing-more-
damage-fewer (accessed on 20 March 2023).
2. Jones, R.L.; Guha-Sapir, D.; Tubeuf, S. Human and economic impacts of natural disasters: Can we trust the global data? Sci. Data
2022, 9, 572. [CrossRef] [PubMed]
3. Kwak, J.; Park, J.H.; Sung, Y. Emerging ICT UAV applications and services: Design of surveillance UAVs. Int. J. Commun. Syst.
2021, 34, e4023. [CrossRef]
4. Zimroz, P.; Trybala, P.; Wroblewski, A.; Goralczyk, M.; Szrek, J.; Wojcik, A.; Zimroz, R. Application of UAV in search and rescue
actions in underground mine a specific sound detection in noisy acoustic signal. Energies 2021, 14, 3725. [CrossRef]
5. Dawei, Z.; Lizhuang, Q.; Demin, Z.; Baohui, Z.; Lianglin, G. Unmanned aerial Vehicle (UaV) Photogrammetry Technology for
Dynamic Mining Subsidence Monitoring and Parameter Inversion: A Case Study in China. IEEE Access 2020, 8, 16372–16386.
[CrossRef]
6. UAV Challenge. Search and Rescue. Available online: https://uavchallenge.org/search-and-rescue/ (accessed on 15 January 2024).
7. DARPA. DARPA Subterranean (SubT) Challenge. Available online: https://www.darpa.mil/program/darpa-subterranean-challenge (accessed on 15 January 2024).
8. Department of Agriculture and Fisheries. SharkSmart Drone Trial. Available online: https://www.daf.qld.gov.au/business-
priorities/fisheries/shark-control-program/shark-control-equipment/drone-trial (accessed on 15 January 2024).
9. Tsiourva, M.; Papachristos, C. Multi-modal Visual-Thermal Saliency-based Object Detection in Visually-degraded Environments.
In Proceedings of the 2020 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2020; pp. 1–9. [CrossRef]
10. Dang, T.; Mascarich, F.; Khattak, S.; Papachristos, C.; Alexis, K. Graph-based Path Planning for Autonomous Robotic Exploration
in Subterranean Environments. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), Macau, China, 3–8 November 2019; pp. 3105–3112. [CrossRef]
11. Papachristos, C.; Khattak, S.; Alexis, K. Autonomous exploration of visually-degraded environments using aerial robots. In
Proceedings of the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), Miami, FL, USA, 13–16 June 2017;
pp. 775–780. [CrossRef]
12. Khattak, S.; Nguyen, H.; Mascarich, F.; Dang, T.; Alexis, K. Complementary Multi–Modal Sensor Fusion for Resilient Robot Pose
Estimation in Subterranean Environments. In Proceedings of the 2020 International Conference on Unmanned Aircraft Systems
(ICUAS), Athens, Greece, 1–4 September 2020; pp. 1024–1029. [CrossRef]
13. Zhang, J.; Singh, S. LOAM : Lidar Odometry and Mapping in real-time. In Proceedings of the Robotics: Science and Systems
Conference (RSS), Rome, Italy, 13–15 July 2014; pp. 109–111.
14. Sizintsev, M.; Rajvanshi, A.; Chiu, H.P.; Kaighn, K.; Samarasekera, S.; Snyder, D.P. Multi-Sensor Fusion for Motion Estimation in
Visually-Degraded Environments. In Proceedings of the 2019 IEEE International Symposium on Safety, Security, and Rescue
Robotics (SSRR), Würzburg, Germany, 2–4 September 2019; pp. 7–14. [CrossRef]
15. Jiang, C.; Ren, H.; Ye, X.; Zhu, J.; Zeng, H.; Nan, Y.; Sun, M.; Ren, X.; Huo, H. Object detection from UAV thermal infrared images
and videos using YOLO models. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102912. [CrossRef]
16. FLIR. Free Teledyne FLIR Thermal Dataset for Algorithm Training. Available online: https://www.flir.com/oem/adas/adas-
dataset-form/ (accessed on 15 February 2023).
17. Dong, J.; Ota, K.; Dong, M. Real-Time Survivor Detection in UAV Thermal Imagery Based on Deep Learning. In Proceedings
of the 2020 16th International Conference on Mobility, Sensing and Networking (MSN), Tokyo, Japan, 17–19 December 2020;
pp. 352–359. [CrossRef]
18. Lagman, J.K.D.; Evangelista, A.B.; Paglinawan, C.C. Unmanned Aerial Vehicle with Human Detection and People Counter Using
YOLO v5 and Thermal Camera for Search Operations. In Proceedings of the 2022 IEEE International Conference on Automatic
Control and Intelligent Systems (I2CACIS), Shah Alam, Malaysia, 25 June 2022; pp. 113–118. [CrossRef]
19. Kochenderfer, M.J. Decision Making under Uncertainty: Theory and Application; Lincoln Laboratory Series; MIT Press: Cambridge,
MA, USA, 2015.
20. Ragi, S.; Chong, E.K.P. UAV Path Planning in a Dynamic Environment via Partially Observable Markov Decision Process. IEEE
Trans. Aerosp. Electron. Syst. 2013, 49, 2397–2412. [CrossRef]
21. Galvez-Serna, J.; Vanegas, F.; Gonzalez, F.; Flannery, D. Towards a Probabilistic Based Autonomous UAV Mission Planning for
Planetary Exploration. In Proceedings of the 2021 IEEE Aerospace Conference (50100), Big Sky, MT, USA, 6–13 March 2021;
pp. 1–8. [CrossRef]
22. Eaton, C.M.; Chong, E.K.; Maciejewski, A.A. Robust UAV path planning using POMDP with limited FOV sensor. In Proceedings
of the 2017 IEEE Conference on Control Technology and Applications (CCTA), Maui, HI, USA, 27–30 August 2017; pp. 1530–1535.
[CrossRef]
23. Vanegas, F.; Gonzalez, F. Enabling UAV navigation with sensor and environmental uncertainty in cluttered and GPS-denied
environments. Sensors 2016, 16, 666. [CrossRef] [PubMed]
24. Sandino, J.; Vanegas, F.; Maire, F.; Caccetta, P.; Sanderson, C.; Gonzalez, F. UAV Framework for Autonomous Onboard Navigation
and People/Object Detection in Cluttered Indoor Environments. Remote Sens. 2020, 12, 3386. [CrossRef]
25. Sandino, J.; Caccetta, P.A.; Sanderson, C.; Maire, F.; Gonzalez, F. Reducing Object Detection Uncertainty from RGB and Thermal
Data for UAV Outdoor Surveillance. In Proceedings of the 2022 IEEE Aerospace Conference (AERO), Big Sky, MT, USA,
5–12 March 2022; Volume 2022.
26. Kurniawati, H.; Yadav, V. An Online POMDP Solver for Uncertainty Planning in Dynamic Environment; Springer International
Publishing: Berlin/Heidelberg, Germany, 2016; pp. 611–629. [CrossRef]
27. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach, 4th ed.; Pearson: London, UK, 2021.
28. Kaelbling, L.P.; Littman, M.L.; Cassandra, A.R. Planning and acting in partially observable stochastic domains. Artif. Intell. 1998,
101, 99–134. [CrossRef]
29. Thrun, S.; Burgard, W.; Fox, D. Probabilistic Robotics; The MIT Press: Cambridge, MA, USA, 2006.
30. Optitrack. OptiTrack for Robotics. Available online: https://optitrack.com/applications/robotics/ (accessed on 5 May 2023).
31. OSR Foundation. Robot Operating System. Available online: https://www.ros.org/ (accessed on 5 May 2023).
32. Meier, L.; Honegger, D.; Pollefeys, M. PX4: A node-based multithreaded open source robotics framework for deeply embedded
platforms. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA,
26–30 May 2015; pp. 6235–6240. [CrossRef]
33. Ermakov, V. Mavros. Available online: http://wiki.ros.org/mavros (accessed on 5 May 2023).
34. Koubâa, A.; Allouch, A.; Alajlan, M.; Javed, Y.; Belghith, A.; Khalgui, M. Micro Air Vehicle Link (MAVlink) in a Nutshell: A
Survey. IEEE Access 2019, 7, 87658–87680. [CrossRef]
35. Klimenko, D.; Song, J.; Kurniawati, H. TAPIR: A Software Toolkit for Approximating and Adapting POMDP Solutions Online;
Australian National University: Canberra, Australia, 2014.
36. Chovancová, A.; Fico, T.; Chovanec, L.; Hubinsk, P. Mathematical Modelling and Parameter Identification of Quadrotor (A
Survey). Procedia Eng. 2014, 96, 172–181. [CrossRef]
37. Hornung, A.; Wurm, K.; Bennewitz, M.; Stachniss, C.; Burgard, W. OctoMap: An efficient probabilistic 3D mapping framework
based on octrees. Auton. Robot. 2013, 34, 189–206. [CrossRef]
38. Ultralytics. Yolov5. Available online: https://github.com/ultralytics/yolov5 (accessed on 15 May 2023).
39. Roboflow Universe Projects. People Detection—Thermal Dataset. 2022. Available online: https://universe.roboflow.com/
roboflow-universe-projects/people-detection-thermal (accessed on 3 February 2023).
40. Kohlbrecher, S.; von Stryk, O.; Meyer, J.; Klingauf, U. A flexible and scalable SLAM system with full 3D motion estimation. In
Proceedings of the 2011 IEEE International Symposium on Safety, Security, and Rescue Robotics, Kyoto, Japan, 1–5 November
2011; pp. 155–160. [CrossRef]
41. Faust, J.; Hershberger, D.; Gossow, D.; Woodall, W.; Haschke, R. Rviz. Available online: https://github.com/ros-visualization/
rviz (accessed on 21 November 2023).
42. Boiteau, S.; Vanegas, F.; Sandino, J.; Gonzalez, F.; Galvez-Serna, J. Autonomous UAV Navigation for Target Detection in
Visually Degraded and GPS Denied Environments. In Proceedings of the 2023 IEEE Aerospace Conference, Big Sky, MT, USA,
4–11 March 2023; pp. 1–10. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.