Sensors 22 07920


sensors

Article
WiFi Indoor Location Based on Area Segmentation
Yanchun Wang 1, *, Xin Gao 1 , Xuefeng Dai 2 , Ying Xia 1 and Bingnan Hou 1

1 School of Communication and Electronic Engineering, Qiqihar University, Qiqihar 161000, China
2 School of Computer and Control Engineering, Qiqihar University, Qiqihar 161000, China
* Correspondence: 01480@qqhr.edu.cn; Tel.: +86-153-036-20037

Abstract: Indoor positioning is the basic requirement of future positioning services, and high-precision,
low-cost indoor positioning algorithms are the key technology to achieve this goal. Different from
outdoor maps, indoor data has the characteristic of uneven distribution and close correlation. In
areas with low data density, in order to achieve a high-precision positioning effect, the positioning
time will be correspondingly longer, but this is not necessary. The instability of WiFi leads to the
introduction of noise when collecting data, which reduces the overall performance of the positioning
system, so denoising is very necessary. For the above problems, a positioning system using the
DBSCAN algorithm to segment regions and realize regionalized positioning is proposed. DBSCAN
algorithm not only divides the dataset into core points and edge points, but also divides part of the
data into noise points to achieve the effect of denoising. In the core part, the dimensionality of the
data is reduced by using stacking auto-encoders (SAE), and the localization task is accomplished by
using a deep neural network (DNN) with an adaptive learning rate. At the edge points, the random
forest (RF) algorithm is used to complete the localization task. Finally, the proposed architecture is
verified on the UJIIndoorLoc dataset. The experimental results show that our positioning accuracy
does not exceed 1.5 m with a probability of less than 87.2% at the edge point, and the time is only 32
ms; the positioning accuracy does not exceed 1.5 m with a probability of less than 98.8% at the core
point. Compared with indoor positioning algorithms such as multi-layer perceptron and K Nearest
Neighbors (KNN), good results have been achieved.

Keywords: indoor positioning; area segmentation; deep neural networks; fingerprint database

Citation: Wang, Y.; Gao, X.; Dai, X.; Xia, Y.; Hou, B. WiFi Indoor Location Based on Area Segmentation. Sensors 2022, 22, 7920. https://doi.org/10.3390/s22207920

Academic Editors: Xiansheng Guo, Nirwan Ansari and Gang Wang
Received: 17 August 2022
Accepted: 15 October 2022
Published: 18 October 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

In the information age, people’s demand for positioning keeps growing. Outdoors, many positioning devices and technologies can already meet people’s daily needs, whereas indoor positioning services are still developing [1]. For indoor positioning, the current technology with the highest accuracy is Simultaneous Localization and Mapping (SLAM) [2,3]. Although it can update the map in real time and has high accuracy, this technology is expensive and complicated to implement [4]. Compared with SLAM, the large-scale popularity of WiFi has promoted the rapid entry of WiFi-based positioning systems into people’s field of vision. Because of its advantages of convenient use, low cost, and relatively high accuracy, WiFi has quickly replaced ultra-wide band (UWB) [5,6], Zigbee [7,8] and Bluetooth [9,10] and become the most widely used technology in indoor positioning [11–13]. With the full popularization of 5G technology [14], WiFi-based positioning technology will gradually become the mainstream positioning technology.
At present, indoor positioning technology has been applied in many fields. For example, in the building domain, many interesting localization techniques have been proposed for building energy management [15], occupancy detection [16], and intelligent building control [17]. In the medical field, indoor positioning technology can provide basic technical support for hospitals to realize intelligent perception and processing of medical objects (such as doctors, nurses, patients, equipment, drugs, etc.) [18–20].

Sensors 2022, 22, 7920. https://doi.org/10.3390/s22207920 https://www.mdpi.com/journal/sensors



WiFi-based positioning technology is mainly divided into two categories: the triangulation method and the fingerprint method [21,22]. Triangulation is also the basic operating principle of GPS [23]. It uses the distance between the measured point and the reference points for positioning, so its accuracy largely depends on an accurate model of wireless signal transmission loss. However, WiFi signals are unstable [24]: the movement of objects and people, and even temperature changes, interfere with them. For a long time there has been no particularly accurate model, so the accuracy of this method is mediocre and cannot satisfy people’s daily needs. Although some scholars now propose combining this method with other traditional methods to improve the accuracy, doing so increases the cost [25,26]. Fingerprint-based positioning technology collects Received Signal Strength Indicator (RSSI) values from different access points (APs) in the offline phase to build a database (also known as a fingerprint database) [27–29]. When a user initiates a positioning request, the RSSI values measured at the user’s location are matched with the data in the database to complete the positioning task. Therefore, the technology is suitable for large buildings and can meet the daily needs of the public [30,31].
A good positioning algorithm can have a significant impact on a positioning sys-
tem [32–34], but from the current perspective, there is little scope for improving positioning
accuracy through algorithms. However, it is often overlooked that the quality of the
database also has an important impact on the positioning accuracy, as many external fac-
tors, such as human walking and object movement, can affect the RSSI values collected
when data is being collected. Therefore, it is necessary to filter the data to remove outliers
and create a high-quality database. In addition, if only one positioning algorithm is used for
the positioning task, the positioning time is often too long and time is wasted in uncommon
or relatively open areas, where there is no requirement for positioning accuracy. Therefore,
it is necessary to identify these areas in the dataset and use a different localization algorithm.
Finally, the learning rate is always one of the most difficult parameters to adjust in neural
networks. We incorporated an adaptive learning rate algorithm into the DNN, which
helped to obtain a better localization model in a short time.
Aiming at the above problems, a new positioning system is proposed. Compared with
the previous systems, the main contributions of this paper are as follows:
(1) Since people usually place WiFi in a specific place to obtain a stronger and denser
WiFi signal in commonly used places, this habit is used to divide the dataset into core
points and edge points according to the signal density of WiFi. Different positioning
algorithms are used for different regions, which can achieve higher positioning ac-
curacy in commonly used places and save positioning time in less commonly used
places.
(2) The adaptive learning rate algorithm is integrated into the DNN to solve the problem that the learning rate parameter is difficult to adjust. At the same time, the stacked autoencoder (SAE) is used to reduce the dimension of the data to solve the problem that the dimension of the dataset is too high. Using the DNN with SAE and the adaptive learning rate algorithm as the positioning algorithm at the core points, the positioning accuracy does not exceed 1.5 m with a probability of less than 98.8%.
(3) Since we use a high-precision positioning algorithm at the core point, which sacrifices
the positioning time, we use the random forest algorithm as the positioning algorithm
at the edge point. Although the positioning accuracy does not exceed 1.5 m with a
probability of less than 87.2%, the positioning time is only 32 ms. If the user is at the
edge point, the location information can be obtained in a short time.
The rest of this paper is organized as follows: the second part introduces the basic
framework and related literature of WiFi positioning; the third part introduces the used
dataset and system architecture; the fourth part introduces the experimental procedure and
results; finally, the fifth part describes the conclusions, limitations and future work.

2. Related Work

In this section, the basic framework of WiFi positioning and literature related to this paper are introduced.

2.1. WiFi Positioning System Framework


In the previous fingerprint-based WiFi indoor positioning system, it is mainly divided into two stages: the offline stage and the online stage [35]. As shown in Figure 1 below, the offline phase mainly collects the signal strength of each point and collects the geographic location to establish a fingerprint database. Through data training, a positioning model used in the online phase is obtained, which reflects the mapping relationship between signal strength and physical location. In the online stage, the instantaneous RSSI signal obtained by the user’s online positioning request is input into the positioning model to obtain the physical position, thereby realizing indoor positioning [36–38].

Figure 1. Fingerprint-based WiFi indoor positioning system architecture.
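
The two stages described above can be illustrated with a minimal sketch. It is not the positioning model used in this paper (which trains RF and DNN models); a plain nearest-neighbour match stands in for the trained model, and the function names, the two-AP fingerprints and the coordinates are hypothetical values chosen only for illustration.

```python
import math


def build_database(samples):
    """Offline phase: store (rssi_vector, position) pairs as the fingerprint database."""
    return [(list(rssi), position) for rssi, position in samples]


def locate(database, query_rssi):
    """Online phase: return the position of the stored fingerprint closest to the query."""
    def distance(a, b):
        # Euclidean distance between two RSSI vectors of equal length
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    best_rssi, best_position = min(database, key=lambda entry: distance(entry[0], query_rssi))
    return best_position


# Hypothetical survey: two reference points, RSSI (dBm) from two APs at each point
database = build_database([
    ([-40, -70], (0.0, 0.0)),
    ([-70, -40], (5.0, 0.0)),
])
print(locate(database, [-45, -68]))  # (0.0, 0.0)
```

Replacing `locate` with a trained model keeps the same interface: an RSSI vector goes in, a physical position comes out.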

2.2. Related Literature

At present, the positioning technology based on WiFi has become a popular positioning technology [39,40]. Many scholars have also realized that relying on algorithms alone to improve positioning accuracy has little effect. At present, how to build an efficient and accurate fingerprint database and how to improve the accuracy on the basis of the database have become the main bottlenecks of this technology, so there is more and more research on improving the quality of data and new positioning systems.
Tao Y et al. proposed a positioning system that can automatically build a radio map and locate online step by step. The positioning system can capture the WiFi data packets transmitted in the WiFi traffic, obtain the MAC address, frequency and RSSI of any WiFi access point, and use the Gaussian process regression model based on the firework algorithm to simulate the RSSI distribution of the indoor environment to estimate the location of the AP [41]. To address the issue of inconsistent WiFi signal observations, Du X et al. studied the signal patterns between WiFi signals and coexisting access points under modern enterprise WiFi infrastructure and the correlation of signals with indoor path maps. Firstly, the concept of signal pattern (SSP) is used and processed to generate beacon APs with high positioning reliability. During the positioning process, the estimated position is brought into a limited area through signal coverage constraint (SCC), thereby improving positioning accuracy [42]. Similarly, Zhang W et al. proposed an AP selection algorithm based on multi-objective optimization, which improved the accuracy of the positioning system by constructing an efficient and accurate fingerprint database [43].
Scholars have not only focused on building high-quality databases but also proposed
many new indoor positioning systems. While developing a new sparse Bayesian learning
algorithm to model radio power maps in indoor spaces, Ko C H et al. also proposed a two-stage localization method in which the localization task is divided into coarse localization and fine-grained localization. The average positioning error of this method is 1.98 m, a 22% improvement over the traditional RSSI-based positioning method [44]. While using hierarchical positioning, Tao Y et al. proposed an accurate indoor positioning system (AIPS) based on fingerprint adaptation to address the problem that WiFi is easily affected by the dynamic indoor environment. In the fingerprint database update process, the K-means algorithm is used to divide the data into two categories based on the number of loops and to find out which data change with the dynamic environment; these data are then updated through Gaussian process regression [45]. Belmonte-Fernández Ó et al. proposed directly calculating the WiFi wireless map through the radiosity signal propagation model to replace the manual data collection stage [46].
Neural networks have been widely used in indoor positioning systems and show good
performance. Li L et al. proposed SmartLoc, a smart wireless indoor positioning framework,
to enhance indoor positioning. In the offline phase, multiple machine learning models
are trained using the offline database, and probabilistic alignment is applied to ensure the
prediction probability of each model at the same confidence level. In the online phase, labels
with probabilities greater than a certain threshold are extracted from each model to construct
the size of candidate labels (SCL) determined by using the Dynamic size Determination
(DSD) algorithm. Finally, they also propose a probabilistic model to estimate the user’s
location by simultaneously assessing the trustworthiness of the tags. Experimental results
in a real changing environment verify the superiority of SmartLoc, outperforming the
best among comparative methods by 10.8% in 75th percentile accuracy [47]. Qin F et al.
proposed a localization system based on convolutional denoising autoencoder (CDAE)
and convolutional neural network (CNN). Compared with the traditional RSSI-based
localization method, the localization accuracy of the system reached 1.05 m [48]. Using
channel state information (CSI) as input, Wang X et al. used a deep convolutional neural
network for product localization. After extensive experiments in two representative indoor
environments, it was verified that the method has good performance [49].
Many other wireless positioning technologies, such as RFID [50,51], BLE [52,53] and UWB [54,55], have been adopted by other researchers because of their low cost or high accuracy, and BLE is even considered a candidate to replace WiFi positioning in the future, but at present WiFi technology still has an irreplaceable position. RFID is a wireless technology that retrieves data from nearby transponders. Although it has been widely used in shopping malls, warehouses, and factories because of its energy-saving and durable characteristics, it requires its own dedicated infrastructure (RFID readers and tags) and is not supported by any mobile device, so high cost is a key factor restricting the development of RFID indoor positioning. BLE is more suitable for indoor positioning than WiFi in terms of energy consumption and scanning rate, but its accuracy is proportional to the number of beacons, which means that exceeding the accuracy of WiFi indoor positioning requires higher costs. The BLE signal is also more susceptible to channel gain and rapid fading, so BLE measurements fluctuate strongly over time. Additionally, as mobile beacons are battery powered, ensuring uninterrupted service remains a major challenge. UWB is a wireless technology based on extremely narrow pulses, with a high transmission rate, low transmit power and strong penetrating ability. Its high delay resolution gives it multi-path recovery capabilities, but it is mainly used for locating soldiers on the battlefield, robot motion tracking, etc., and is rarely used in indoor positioning.

3. Proposed System Structure


In this section, after a brief description of the data set used, the algorithm used to process the data is introduced. Then the principles of the RF algorithm and the DNN algorithm are introduced respectively, including the derivation of the adaptive learning rate, and finally the whole positioning process is introduced.

3.1. Dataset Introduction


In the simulation, the effectiveness of the proposed system is verified using the UJIIndoorLoc dataset, which was collected in buildings covering nearly 110,000 square meters at Jaume I University, Spain, by 18 different users with 25 different Android devices [56].
The entire database contains 21,049 records, and each record contains the following
529 elements:
001~520 RSSI Levels
521~523 Real world coordinates of the sample points
524 Building ID
525 Space ID
526 Relative position with respect to Space ID
527 User ID
528 Phone ID
529 Timestamp
(1) RSSI Levels: The most important information in the WiFi data is the detected RSSI value. About 98% of the elements in each record are RSSI values, of which −100 dBm corresponds to a very weak signal, which can be considered as the point where no signal is detected, while 0 dBm means that a very good signal is detected.
(2) Real world coordinates: Vectors 521 to 523 record longitude coordinates, latitude
coordinates, and the floor of the building.
(3) Space identifiers: Building ID (vector 524) is the integer value (from 0 to 2) corresponding to the building from which the RSSI value is obtained. Space ID (vector 525) is used to identify a specific space (office, laboratory, etc.). Relative position with respect to Space ID (vector 526) is used to indicate whether the location where the RSSI value is obtained is inside the space or in the corridor.
(4) User ID: Containing an integer value from 1 to 18, the vector 527 is used to represent
the 18 different users who collected RSSI values.
(5) Phone ID: Vector 528 contains different integer values to represent different Android
devices used to collect RSSI values.
(6) Timestamp: The vector 529 is a timestamp, which is used to represent the time when the RSSI value was collected (in Unix time format).
In actual use, fields such as Space ID and Relative position with respect to Space ID may have a value of 0, which means that the value was not recorded at that time rather than that it does not exist.
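
To make the record layout above concrete, the following sketch splits one 529-element record into named fields. The class name `Fingerprint`, the function `parse_record` and the example values are hypothetical and only follow the field order listed in this section; they are not part of the dataset's own tooling.

```python
from typing import List, NamedTuple


class Fingerprint(NamedTuple):
    rssi: List[float]        # elements 001~520
    longitude: float         # element 521
    latitude: float          # element 522
    floor: int               # element 523
    building_id: int         # element 524
    space_id: int            # element 525 (0 = not recorded)
    relative_position: int   # element 526 (0 = not recorded)
    user_id: int             # element 527
    phone_id: int            # element 528
    timestamp: int           # element 529, Unix time


def parse_record(values):
    """Split one raw record (a sequence of 529 numbers or numeric strings) into fields."""
    if len(values) != 529:
        raise ValueError(f"expected 529 elements, got {len(values)}")
    rssi = [float(v) for v in values[:520]]
    lon, lat, *ids = values[520:]
    floor, building, space, rel_pos, user, phone, ts = (int(float(v)) for v in ids)
    return Fingerprint(rssi, float(lon), float(lat), floor, building,
                       space, rel_pos, user, phone, ts)


# Made-up record: every AP undetected (value 100), followed by the nine metadata fields
example = [100] * 520 + [-7541.26, 4864921.0, 2, 1, 106, 2, 2, 23, 1371713733]
record = parse_record(example)
print(record.floor, record.building_id, len(record.rssi))  # 2 1 520
```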

3.2. Data Processing Algorithms


Since the representation of different RSSI values will affect the positioning accuracy,
the data needs to be normalized and preprocessed first. In the dataset used, the value
range of RSSI data is −100 dBm~0 dBm, and the RSSI value at the AP, where no data
is detected, is marked as 100 dBm. For this, we convert these values to the (0, 1) range
using Equation (1) below.
$$
\mathrm{RSSI}' =
\begin{cases}
\dfrac{\mathrm{RSSI}_i - \min}{\max - \min}, & \text{RSSI exists} \\
0, & \text{otherwise}
\end{cases}
\tag{1}
$$

where RSSIi is the intensity value provided by the ith WAP, min is the smallest RSSI value
in the dataset, and max is the largest RSSI value in the dataset.
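
Equation (1) can be sketched in a few lines. The function name `normalize_rssi` is hypothetical, and the default bounds of −100 dBm and 0 dBm are an assumption taken from the value range stated above; in practice min and max would be computed from the dataset.

```python
NOT_DETECTED = 100  # marker used in the dataset for an AP that was not detected


def normalize_rssi(record, rssi_min=-100.0, rssi_max=0.0):
    """Apply Equation (1) to one fingerprint (a list of RSSI values in dBm)."""
    span = rssi_max - rssi_min
    return [
        0.0 if value == NOT_DETECTED else (value - rssi_min) / span
        for value in record
    ]


print(normalize_rssi([-100, -50, 0, 100]))  # [0.0, 0.5, 1.0, 0.0]
```

Mapping undetected APs to 0 places them at the same level as the weakest detectable signal, which matches the "otherwise" branch of Equation (1).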
Secondly, in the offline phase, for a larger area, a large number of signal strength
values need to be collected to build a database to improve the positioning accuracy. Since
many factors such as temperature, humidity, and people’s movement will affect WiFi, a
large amount of noise will inevitably be introduced during the collection process. Therefore,
a denoising algorithm needs to be used to improve the quality of the database.
The clustering algorithm can divide the data set into different clusters and can achieve the effect of denoising while finding the data with strong correlation. The commonly used algorithms are K-means, BIRCH, CURE, DBSCAN, etc. [57,58]. BIRCH (balanced iterative reducing and clustering using hierarchies) is a hierarchical clustering algorithm, CURE is a clustering algorithm using representative points, K-means is an iterative clustering algorithm, and DBSCAN is a density-based clustering algorithm. Since the clusters in a fingerprint database are irregular in shape, whereas the BIRCH algorithm and the K-means algorithm can only find convex or spherical clusters, the CURE algorithm and the DBSCAN algorithm are more suitable for the research in this paper. As mentioned above, noise is inevitably collected together with the RSSI data. Therefore, we hope to find an algorithm that can de-noise the data while dividing the data, and the DBSCAN algorithm is more efficient in dealing with noise than the CURE algorithm. Considering both the shape of the dataset clusters and the efficiency of noise processing, the DBSCAN algorithm is the more suitable algorithm.
The flow of the DBSCAN algorithm is as follows:
(1) Scan the entire dataset, find any core point, and expand the core point. The expansion method is to find all density-connected data points starting from this core point: traverse all the core points in the neighborhood of the core point and look for points that are density-connected to these data points until there are no data points that can be expanded. The boundary nodes of the final clusters are all non-core data points.
(2) Rescan the remaining data set to find the core points that have not been clustered and repeat the above steps to expand the core points.
(3) Repeat until there are no new core points in the dataset. Data points in the dataset that are not included in any clusters constitute noise.
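
The flow above can be sketched as follows. This is a textbook-style DBSCAN written for illustration, not the implementation used in the experiments; the parameter names `eps` and `min_pts` and the two-dimensional toy points are assumptions (in the positioning system the points would be RSSI vectors).

```python
import math


def dbscan(points, eps, min_pts):
    """Return (cluster_ids, kinds): a cluster id per point (-1 = noise)
    and a label 'core', 'edge' or 'noise' per point."""
    n = len(points)
    # Neighborhood of each point (the point itself is included)
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    cluster = [-1] * n
    next_id = 0
    for i in range(n):
        if not is_core[i] or cluster[i] != -1:
            continue  # steps (1)/(2): start only from core points not yet clustered
        cluster[i] = next_id
        frontier = [i]
        while frontier:  # expand through density-connected points
            p = frontier.pop()
            for q in neighbors[p]:
                if cluster[q] == -1:
                    cluster[q] = next_id
                    if is_core[q]:  # only core points keep expanding the cluster
                        frontier.append(q)
        next_id += 1
    # step (3): points that were never reached remain noise
    kinds = [
        "core" if is_core[i] else ("edge" if cluster[i] != -1 else "noise")
        for i in range(n)
    ]
    return cluster, kinds


points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1), (10, 10)]
cluster, kinds = dbscan(points, eps=1.5, min_pts=4)
print(kinds)  # ['core', 'core', 'core', 'core', 'edge', 'noise']
```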
The DBSCAN algorithm can divide the data into core points, edge points and noise
points according to the density. The difference between core points and edge points is that
the data density of the two is different. There is an evaluation score inside the DBSCAN
algorithm. After the operation is completed, each data will be scored. Data with a score
greater than 0 is divided into core points, data with a score less than 0 is divided into edge
points, and data with a score equal to 0 is divided into noise points. We believe that the
RSSI data of places often used in life must be dense and high intensity, so according to this
feature, we use the DBSCAN algorithm to divide high-density places into core points.
Using the DBSCAN algorithm, the data set can be divided into core points and edge points, and different positioning algorithms are used in different areas. Applying two different positioning algorithms in the system balances positioning time against positioning accuracy and avoids long positioning times in less commonly used areas.

3.3. The Positioning Process of the RF Algorithm


In the selection of the positioning algorithm for discrete points, it is necessary to
choose an algorithm that can meet the needs: not only to meet the needs of relatively high
positioning accuracy and short time, but also when the sample dimension is very high, the
selected algorithm still has a good positioning effect. After a series of investigations and
tests, the RF algorithm is finally used as the edge point positioning algorithm.
The RF algorithm is an ensemble algorithm belonging to the Bagging type. As shown
in Figure 2, by combining multiple weak classifiers, the final result is voted or averaged, so
that the result of the overall model has high accuracy and generalization performance. It
can achieve good results, mainly due to “random” and “forest”; one makes it resistant to
overfitting, and one makes it more accurate [59,60].

Figure 2. The packet structure of the RF algorithm, by combining multiple weak classifiers into a strong classifier, makes the results have high accuracy.

In the construction process of the RF algorithm, there are two main steps:
(1) Assuming that the offline database is the training data set, part of the data is randomly sampled with replacement N times, and a decision tree is constructed from the sample. This randomness ensures that each decision tree has a different focus on data learning and ensures independence between trees.
(2) Assuming that the number of different features of the training data set is D, select E features randomly, ensuring each time that E is less than D, and use the E features as the decision conditions of the decision tree. The number of selected features determines the effectiveness of random forests. In other words, if it is too small, the classification accuracy will be low, and conversely, if it is too large, the independence between trees will be reduced. With this randomness, decision trees have good independence and appropriate classification accuracy.
When the online server receives the RSSI value, each decision tree produces a decision result and votes, and the final result is the mode of the voting results of all decision trees.
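
The two sources of randomness and the final vote can be sketched as below. To stay short, each "tree" here is only a one-level decision stump, which is a simplification and not the full decision trees of a real random forest; the function names, the region labels "A" and "B" and the RSSI values are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Choose the (feature, threshold) split among feature_ids with the fewest errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def train_forest(rows, labels, n_trees, n_features, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step (1): bootstrap sample, drawn with replacement
        idx = [rng.randrange(len(rows)) for _ in rows]
        # Step (2): random subset of E features out of D, with E < D
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Each stump votes; the most common vote (the mode) is the result."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


rows = [[-40, -80], [-42, -78], [-45, -75], [-80, -40], [-78, -42], [-75, -45]]
labels = ["A", "A", "A", "B", "B", "B"]
forest = train_forest(rows, labels, n_trees=15, n_features=1)
print(predict(forest, [-35, -85]), predict(forest, [-85, -35]))
```

Choosing `n_features` reflects the trade-off described in step (2): fewer features per tree give more independent but weaker trees.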
3.4. DNN Localization Algorithm

If the UJIIndoorLoc dataset is directly used without any processing, it will inevitably introduce redundant information due to too many features. The best way to remove redundant information is to use Principal Component Analysis (PCA) for dimensionality reduction, but PCA works well for linear data, and RSSI values are nonlinear data, so this paper adopts stacked autoencoder (SAE) for dimensionality reduction.
An autoencoder can be thought of as a system that restores its original input. As shown in Figure 3, a simple autoencoder model consists of an encoder and a decoder, with numbers such as 520 representing the number of neurons. The process can be understood that the encoder first transforms the input signal X into the encoded signal Y through functional transformation, and the task of the decoder is to represent the original input of the encoded signal Y in another form. If the input signal is encoded with different X, the system can restore the input signal according to Y, then Y has carried all the information of the original data, but it is output in another form, which is the feature extraction.

Figure 3. Encoder and Decoder. The output of each layer is used as the input of the next layer, the encoder can achieve dimensionality reduction processing, and the decoder can achieve dimensionality increase processing.
A single autoencoder is a three-layer network in the shape of X → h → X′, which learns a feature transformation h = f(x) of its input. However, the output X′ is only meaningful for training the autoencoder. Therefore, in practice, a new autoencoder is trained with h as its input to obtain a new feature representation, and so on: the output of each encoder layer is used as the input of the next encoder layer. This forms the so-called stacked autoencoder and completes the dimensionality reduction process.
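The chaining of encoders can be sketched in plain Python. This is only an illustration of the data flow, not the authors' implementation: the weights are random placeholders standing in for pre-trained encoders, the tanh activation and the layer sizes 128 and 64 are assumptions (only 520 and 256 appear in the text), and `make_encoder` and `stacked_encode` are hypothetical helper names.

```python
import math
import random


def make_encoder(n_in, n_out, rng):
    """Build one encoder layer h = tanh(W x + b) with placeholder (untrained) weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out

    def encode(x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(weights, biases)]

    return encode


def stacked_encode(encoders, x):
    """Pass x through the encoders in order; each output is the next encoder's input."""
    h = x
    for encode in encoders:
        h = encode(h)
    return h


rng = random.Random(0)
layer_sizes = [520, 256, 128, 64]  # 128 and 64 are assumed sizes for illustration
encoders = [make_encoder(n_in, n_out, rng)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

rssi = [rng.uniform(0.0, 1.0) for _ in range(520)]  # made-up, already normalized input
code = stacked_encode(encoders, rssi)
print(len(rssi), "->", len(code))
```

In a real system each encoder would first be trained as part of its own X → h → X′ autoencoder, and only the encoder halves would be kept and chained in this way.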
Since the localization task is regarded as a classification task, the DNN classifier is used as the localization algorithm and the forward propagation algorithm is adopted. In a DNN, the layers are fully connected; that is, as in a perceptron, any neuron in layer i is connected to every neuron in layer i + 1. Figure 4 shows a small part of the DNN network.

Figure 4. Part of the DNN network.
Assuming that there are m neurons in the L − 1 layer, the output $a_j^l$ of the jth neuron in the Lth layer can be obtained from Equation (2):

$$a_j^l = \sigma(z_j^l) = \sigma\left(\sum_{k=1}^{m} w_{jk}^l a_k^{l-1} + b_j^l\right) \tag{2}$$

In Equation (2), σ is the activation function, $b_j^l$ is the bias of the jth neuron in the Lth layer, $a_k^{l-1}$ is the output of the kth neuron in the L − 1 layer, and $w_{jk}^l$ is the weight connecting the kth neuron in the L − 1 layer to the jth neuron in the Lth layer. By adjusting the weights between neurons, the difference between the actual output vector and the expected output vector of the network can be minimized.
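As a small worked instance of Equation (2), the following sketch computes the outputs of one fully connected layer. It assumes a sigmoid activation for σ (the text does not fix the activation), and the weights, biases and inputs are made-up numbers chosen so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, a_prev):
    """Equation (2): a_j = sigma(sum_k w[j][k] * a_prev[k] + b[j]) for every neuron j."""
    return [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(w_j, a_prev)) + b_j)
            for w_j, b_j in zip(weights, biases)]


a_prev = [1.0, 2.0, 3.0]            # outputs of the m = 3 neurons in layer L - 1
weights = [[0.5, -1.0, 0.5],        # row j holds the weights into neuron j of layer L
           [1.0, 1.0, 1.0]]
biases = [0.0, 0.0]

a = layer_forward(weights, biases, a_prev)
print(a)
```

For the first neuron the weighted sum is 0.5 − 2.0 + 1.5 = 0, so its output is σ(0) = 0.5; for the second the sum is 6, which gives an output close to 1.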
In our system, as shown in Figure 5, HL stands for hidden layer. The pre-trained stacked autoencoder is connected to the classifier to complete the localization task. The figure shows a classifier with two hidden layers, and the numbers in parentheses, such as 520 and 256, are the number of neurons in each layer.

Figure 5. DNN classifier with SAE. SAE is responsible for dimensionality reduction processing, and DNN is responsible for completing the positioning task.

In neural networks, the learning rate is one of the most difficult parameters to set. If the learning rate is too small, convergence is very slow; if the learning rate is too large, the optimized parameters may become unstable. Therefore, the neural network in this paper adopts the Adam optimizer and integrates the adaptive learning rate algorithm into the system.
Suppose the initial parameters are θ, the step size is ε, the exponential decay rates of the first- and second-moment estimates are ρ1 and ρ2, the small constant for numerical stability is δ, and the updated parameters are θ′. The algorithm steps are as follows:
(1) Initialize the first-order and second-order moment variables S = 0 and R = 0, and initialize the time step t = 0.
(2) Collect a mini-batch of m samples {x_1, ..., x_m} from the training set; the corresponding targets are y_i.
(3) Calculate the gradient on the mini-batch according to Equation (3).

$$g \leftarrow \frac{1}{m} \nabla_{\theta} \sum_{i} L(f(x_i; \theta), y_i) \tag{3}$$

(4) Refresh the time according to Equation (4).

$$t \leftarrow t + 1 \tag{4}$$

The gradient obtained by Equation (3) is substituted into Equations (5) and (6) to update the biased first-order moment estimate and the biased second-order moment estimate.

$$S \leftarrow \rho_1 S + (1 - \rho_1) g \tag{5}$$

$$R \leftarrow \rho_2 R + (1 - \rho_2) g \otimes g \tag{6}$$

(5) The updated biased first-order moment estimate is substituted into Equation (7) to correct its bias.

$$\hat{S} \leftarrow \frac{S}{1 - \rho_1^t} \tag{7}$$

(6) The corrected first-order moment estimate and the second-order moment estimate are substituted into Equation (8) to compute the update, where $\hat{R}$ denotes the bias-corrected second-order moment estimate (in standard Adam, $\hat{R} = R / (1 - \rho_2^t)$).

$$\Delta\theta = -\varepsilon \frac{\hat{S}}{\sqrt{\hat{R}} + \delta} \tag{8}$$

(7) Finally, the parameters are updated through Equation (9).

$$\theta' = \theta + \Delta\theta \tag{9}$$
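The update steps can be sketched as a single function in plain Python. This follows the standard formulation of Adam, in which the first moment is corrected with 1 − ρ1^t and the second moment with 1 − ρ2^t; the default values of ε, ρ1, ρ2 and δ are the commonly used ones rather than values reported in this paper, the toy objective f(θ) = θ² is made up for the demonstration, and `adam_step` is a hypothetical name.

```python
import math


def adam_step(theta, grad, state, eps=0.001, rho1=0.9, rho2=0.999, delta=1e-8):
    """Apply one Adam update to a list of parameters; `state` holds S, R and t."""
    state["t"] += 1                                                # time refresh
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        state["S"][i] = rho1 * state["S"][i] + (1 - rho1) * g      # biased first moment
        state["R"][i] = rho2 * state["R"][i] + (1 - rho2) * g * g  # biased second moment
        s_hat = state["S"][i] / (1 - rho1 ** t)                    # bias-corrected first moment
        r_hat = state["R"][i] / (1 - rho2 ** t)                    # bias-corrected second moment
        new_theta.append(p - eps * s_hat / (math.sqrt(r_hat) + delta))
    return new_theta


# Toy objective f(theta) = theta^2 with gradient 2 * theta (not the paper's loss).
theta = [1.0]
state = {"S": [0.0], "R": [0.0], "t": 0}
history = []
for _ in range(100):
    grad = [2.0 * p for p in theta]
    theta = adam_step(theta, grad, state)
    history.append(theta[0])

print(history[0], history[-1])
```

Because the corrected moments normalize the gradient, the first step moves θ by almost exactly ε regardless of the gradient's magnitude, which is what makes the effective step size adaptive per parameter.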
Adam is a better solution than RMSProp for two reasons. First, in Adam, momentum is directly incorporated into the estimate of the first moment of the gradient, whereas the most intuitive way to add momentum to RMSProp is to apply momentum to the scaled gradient. Second, Adam includes bias correction, which corrects the estimates of the first and second moments initialized from the origin. RMSProp also uses second-order moment estimates; however, the correction factor is missing, so the RMSProp second-order moment estimates may be highly biased at the beginning of training. Therefore, the Adam optimizer is generally considered to be robust to the choice of hyperparameters.

3.5. Positioning Process


Combined with the algorithms mentioned above, a WiFi indoor positioning system
based on area segmentation is proposed.
As shown in Figure 6, the specific steps are as follows:
(1) First normalize the data and then use the DBSCAN algorithm to divide the dataset
into three regions: core points, edge points and noise points.
(2) The data of core points and edge points are split using the hold-out method into a training set and a validation set, which are used for training and validating the improved DNN model and the RF model.
(3) The RSSI values received in the online phase are used as the input, and the Euclidean distances from the input to the core points and the edge points are calculated and compared. If the input belongs to the core points, the improved DNN is used to complete the positioning task. If it belongs to the edge points, the RF algorithm is used to complete the positioning task.
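The online routing in step (3) can be sketched as follows. The text does not spell out the exact decision rule, so this sketch assumes the simplest one: the input is assigned to whichever set (core or edge) contains its nearest stored fingerprint. The functions `assign_region` and `locate` are hypothetical names, and the two predictors are stand-ins for the trained DNN and RF models.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_region(sample, core_points, edge_points):
    """Return 'core' or 'edge', whichever set holds the nearest stored fingerprint."""
    d_core = min(euclidean(sample, p) for p in core_points)
    d_edge = min(euclidean(sample, p) for p in edge_points)
    return "core" if d_core <= d_edge else "edge"


def locate(sample, core_points, edge_points, dnn_predict, rf_predict):
    """Route the sample to the DNN (core region) or the RF model (edge region)."""
    region = assign_region(sample, core_points, edge_points)
    model = dnn_predict if region == "core" else rf_predict
    return region, model(sample)


# Made-up two-dimensional fingerprints; real RSSI vectors have one entry per access point.
core_points = [[0.0, 0.0], [0.1, 0.0]]
edge_points = [[1.0, 1.0]]

result_core = locate([0.2, 0.1], core_points, edge_points,
                     dnn_predict=lambda s: "dnn-estimate",
                     rf_predict=lambda s: "rf-estimate")
result_edge = locate([0.9, 0.9], core_points, edge_points,
                     dnn_predict=lambda s: "dnn-estimate",
                     rf_predict=lambda s: "rf-estimate")
print(result_core, result_edge)
```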

Figure 6. The overall positioning process of the system is divided into offline stage and online stage.

4. Experimental Procedure and Results
In this part, the data processing process, edge point positioning process and core point positioning process are introduced and analyzed respectively, and the results are compared with other WiFi-based positioning systems.
4.1. Data Processing
To verify the effectiveness of the system, the UJIIndoorLoc dataset is used as the original data. The program is written in Python and simulated in PyCharm 2019.
Since the calculation of Euclidean distance is involved in the positioning system, the data is normalized at the beginning. The Z-Score method and the Max-Min method are used in the simulation experiments. As shown in Table 1, the accuracy improvement brought by the Max-Min normalization method is better than that of the Z-Score method.

Table 1. Accuracy corresponding to different normalization methods.

| Normalization method | Core Point | Edge Point |
| --- | --- | --- |
| Z-Score | 94.9% | 82.5% |
| Max-Min | 98.8% | 87.2% |
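For reference, the two normalization methods compared in Table 1 can be written in a few lines of plain Python. This is a generic sketch applied to one feature column; the sample RSSI values are made up, and the handling of constant columns (returning zeros) is an assumption rather than something stated in the paper.

```python
import statistics


def min_max(values):
    """Max-Min normalization: rescale the values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Z-Score normalization: subtract the mean and divide by the standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


rssi = [-100.0, -75.0, -50.0]   # made-up RSSI readings in dBm for one access point
scaled = min_max(rssi)
standardized = z_score(rssi)
print(scaled, standardized)
```

Max-Min scaling bounds every feature to the same interval, which keeps any single access point from dominating a Euclidean distance; Z-Score scaling leaves the range unbounded.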
In this system, data segmentation is the most important step in the data processing process. The DBSCAN algorithm can not only complete the segmentation of irregularly shaped datasets better than the K-means algorithm, but also provides an indicator inside the algorithm. When the indicator is greater than 0, the data points are divided into core points; when the indicator is less than 0, the data points are divided into edge points, and the rest are noise points. The neighborhood radius and the threshold of the number of data objects in the neighborhood are two important parameters of the DBSCAN algorithm. As shown in Figure 7a, different combinations of the neighborhood radius and the threshold of the number of data objects in the neighborhood can bring different segmentation effects. Since the data density of the UJIIndoorLoc dataset itself is not very large, if the neighborhood radius is too small or the threshold for the number of data objects in the neighborhood is too large, the segmentation effect will be poor, which may lead to data loss. Finally, the neighborhood radius is set to the range 100~300, and the threshold of the number of neighbor data objects is set to the range 0~32.
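To make the roles of the two parameters concrete, the sketch below labels points as core, edge or noise using the usual DBSCAN definitions: a core point has at least `min_pts` points (itself included) within radius `eps`, an edge point is not core but lies within `eps` of a core point, and everything else is noise. This is an assumption about how the regions are derived and may differ from the indicator the authors use; it is a brute-force illustration that only labels points and does not build clusters, and the coordinates are made up.

```python
import math


def label_points(points, eps, min_pts):
    """Label each point 'core', 'edge' or 'noise' following the usual DBSCAN definitions."""
    neigh = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
             for p in points]                        # each neighborhood includes the point itself
    is_core = [len(n) >= min_pts for n in neigh]
    labels = []
    for i in range(len(points)):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neigh[i]):
            labels.append("edge")                    # near a core point, but not dense itself
        else:
            labels.append("noise")
    return labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),   # a dense square
          (2.2, 1.0),                                       # close to one corner only
          (10.0, 10.0)]                                     # isolated
labels = label_points(points, eps=1.5, min_pts=4)
print(labels)
```

Shrinking `eps` or raising `min_pts` in this example turns core points into edge or noise points, which is the data-loss effect described in the text.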

Figure 7. (a) The influence of the neighborhood radius and the threshold of the number of data objects in the neighborhood on the indicators of the DBSCAN algorithm. (b) The effect of the threshold of the number of data objects in the neighborhood on the performance metrics. (c) The effect of the neighborhood radius on the effect index.

Figure 7b shows the effect of the threshold of the number of data objects in the neighborhood on the performance metrics. As can be seen from the figure, the closer the parameter is to 0, the higher the effect index; that is, the more data are assigned to core points, the better the segmentation effect. However, the minimum amount of data in each cluster cannot be 1, since this would make every data point an independent cluster, so we set this parameter to 2. Figure 7c shows the effect of the neighborhood radius on the effect index. As can be seen from the figure, the index improves between 200 and 250 and is best at 227, so we set this parameter to 227.
As shown in Table 2, after the data is normalized and segmented into regions, the improvement is obvious. For the DNN algorithm, the probability that the positioning error is less than 1.5 m increases from 72.5% to 98.8%, and for the RF algorithm it increases from 69.6% to 87.2%. We believe that the main reason for the obvious improvement in accuracy is that, through region segmentation, data with the same characteristics are grouped together, so the localization algorithm can more easily find similar regions when performing the classification task that completes the localization.

Table 2. Accuracy before and after data processing.

| Data | Accuracy of DNN | Accuracy of RF |
| --- | --- | --- |
| Processed data | 98.8% | 87.2% |
| Unprocessed data | 72.5% | 69.6% |
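One plausible reading of the accuracy figures reported here is the fraction of test samples whose positioning error is within 1.5 m. The sketch below computes that fraction from predicted and true coordinates; this interpretation of the metric, the function name `accuracy_within` and the sample coordinates are all assumptions for illustration.

```python
import math


def accuracy_within(predicted, actual, threshold=1.5):
    """Fraction of samples whose Euclidean positioning error is at most `threshold` meters."""
    hits = sum(1 for p, a in zip(predicted, actual) if math.dist(p, a) <= threshold)
    return hits / len(actual)


# Made-up (x, y) positions in meters; the errors are 1.0 m, 5.0 m and 0.0 m.
predicted = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
actual = [(0.0, 1.0), (0.0, 0.0), (1.0, 1.0)]

score = accuracy_within(predicted, actual)
print(score)
```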

4.2. Edge Point Location Process


For edge points, we expect to find a positioning algorithm that has a short positioning time but is accurate enough to meet daily use requirements and is stable enough. Simulation experiments are performed on elastic net regression, Lasso regression, Bayesian regression, gradient boosting, multilayer perceptron, KNN and RF algorithms.
Lasso regression is a compressed (shrinkage) estimation. It obtains a more refined model by constructing a penalty function that shrinks some coefficients and sets others to zero. It therefore preserves the advantage of subset shrinkage and is a biased estimator for dealing with data with complex collinearity. Elastic net regression is a combination of ridge regression and Lasso regression. It is good at handling models with correlated parameters: the Lasso penalty selects the relevant parameters and shrinks the irrelevant ones, while the ridge penalty shrinks all the correlated parameters. Bayesian regression is a linear regression model solved using Bayesian inference methods in statistics. It treats the parameters of the linear model as random variables and computes their posteriors from priors on the model parameters (weight coefficients). The KNN algorithm is one of the most basic and simple machine learning algorithms, and it can be used for both classification and regression. The feature space is partitioned according to the distances between the feature values of the training data, and the partition is used as the final model. A multilayer perceptron is a feedforward artificial neural network that stores what it learns in its weights and uses a training algorithm to adjust the weights and reduce the error between actual and predicted values.
Gradient boosting is an ensemble learning technique for regression and classification problems that produces a predictive model in the form of an ensemble of weak predictive models. The main idea is to fit each new model along the gradient descent direction of the loss function of the current ensemble, that is, to build the model by optimizing the loss function.
First, 100 repeated simulation tests are performed for the different positioning algorithms and the average value is taken. As can be seen from Figure 8, for the edge point data, the user positioning accuracy of the RF algorithm, KNN, gradient boosting and multilayer perceptron all reach more than 80%. However, since the positioning time of the KNN algorithm is proportional to the amount of data, in practice its positioning time may be too long when the amount of data is large, so only the RF algorithm, gradient boosting and multilayer perceptron are tested for localization time.

Figure 8. Algorithm accuracy comparison at edge points.

After 100 repeated simulation tests for the three different positioning algorithms, the mean and variance of the positioning time and positioning accuracy of each algorithm are shown in Figure 9 and Table 3 when the positioning error is 1.5 m. The positioning time is the simulation time recorded in Python during simulation. Although the accuracy of the gradient boosting algorithm is slightly higher than that of the RF algorithm, its positioning time is the longest. The positioning time of the RF algorithm is consistent with the expected target, and its positioning accuracy is higher than that of the multilayer perceptron algorithm. The variance of the RF algorithm is the smallest, which shows that it is sufficiently stable, so the RF algorithm is finally used as the localization algorithm for edge points.

Table 3. Algorithm accuracy comparison.

| Kind of Algorithm | Positioning Accuracy | Variance |
| --- | --- | --- |
| RF Algorithms | 87.2% | 0.047 |
| Gradient boosting | 87.7% | 0.056 |
| Multilayer perceptron | 81.28% | 2.05 |

Figure 9. Algorithm positioning time comparison at edge points.
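The repeated-measurement procedure behind these comparisons (run the predictor many times, then report the mean and variance of the recorded time) can be sketched with the standard library alone. The predictor here is a trivial stand-in rather than any of the compared models, `repeat_timing` is a hypothetical helper name, and the use of the population variance is an assumption, since the paper does not say which variance estimator was used.

```python
import statistics
import time


def repeat_timing(predict, sample, runs=100):
    """Run `predict(sample)` `runs` times; return mean and variance of the time in ms."""
    durations_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(durations_ms), statistics.pvariance(durations_ms)


def fake_predict(sample):
    """Stand-in for a trained model's prediction call."""
    return sum(sample)


sample = [0.0] * 520            # one made-up fingerprint with 520 features
mean_ms, var_ms = repeat_timing(fake_predict, sample)
print(f"mean = {mean_ms:.4f} ms, variance = {var_ms:.6f}")
```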
In the RF algorithm, the n_estimators parameter is the number of classifiers (decision trees), which is one of the important parameters of the algorithm. If this parameter is too small, the model tends to under-fit; if it is too large, it causes too much calculation and increases the positioning time, so a moderate value is needed. As shown in Figure 10, once the parameter reaches 14, the accuracy is stable between 86% and 87.2%, so the parameter is set to 22, where the accuracy is 87.2%.

Figure 10. Influence of n_estimators parameter on the accuracy of RF algorithm.

4.3. Core Point Positioning Process
For the selection of the positioning algorithm of the core point, considering that there are too many database features, the database needs to be reduced in dimension, so the DNN algorithm with stacked autoencoder is selected as the positioning algorithm of the core point.
When choosing the optimizer and classifier, we simulated and tested different combinations between the Adam optimizer and the GradientDescent optimizer, as well as the softmax classifier and the sigmoid classifier. Table 4 shows the corresponding user positioning accuracy of each combination when the positioning error is 1.5 m. The final choice is the Adam optimizer and the softmax classifier.
Table 4. Accuracy corresponding to different combinations.

| Optimizer | Softmax | Sigmoid |
| --- | --- | --- |
| Adam | 98.8% | 58% |
| GradientDescent | 96.5% | 53.5% |
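The two output functions compared in Table 4 differ in how they turn the last layer's raw scores into class scores. The sketch below only illustrates that difference on made-up logits; it does not reproduce the paper's network or explain the measured gap.

```python
import math


def softmax(logits):
    """Normalize the scores into a probability distribution over mutually exclusive classes."""
    shift = max(logits)                          # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_scores(logits):
    """Score every class independently; the outputs need not sum to 1."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


logits = [2.0, 1.0, 0.1]                         # made-up scores for three location classes
probs = softmax(logits)
independent = sigmoid_scores(logits)
predicted_class = probs.index(max(probs))
print(probs, independent, predicted_class)
```

Softmax makes the classes compete for a total probability of 1, which matches a setting where each fingerprint belongs to exactly one location class, whereas the sigmoid scores are computed per class without that constraint.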

The learning rate is one of the most difficult parameters to set in the neural network. If the learning rate is too small, the convergence rate may be slow. If the learning rate is too large, the optimized parameters may be unstable. Therefore, in the DNN algorithm, an adaptive learning rate is added so that a different learning rate can be set for each parameter participating in the training. Although the adaptive learning rate solves many problems, the initial learning rate still has a great impact on accuracy. Figure 11 shows the accuracy of different initial learning rates on the training set and validation set. With an initial learning rate of 0.007, the accuracy is 99.7% on the training set and 98.8% on the validation set. Figure 12 shows the loss function values corresponding to different training epochs. After 100 epochs, the loss function gradually converges. Finally, the initial learning rate is set to 0.007 and the number of epochs to 150.

Figure 11. Accuracy corresponding to different learning rates.

Figure 12. Correspondence map of epoch and loss value.

It has been proved by 100 repeated simulation tests that the DNN algorithm has a good effect on positioning. Without processing the data, the user positioning accuracy is only 72% when the positioning error is 1.5 m. When the core point dataset is used as the positioning data of the improved DNN algorithm, the user positioning accuracy reaches 98.8% when the positioning error is 1.5 m. Table 5 compares the accuracy and variance of the improved DNN with the KNN, gradient boosting, and multilayer perceptron algorithms at the core point when the positioning error is 1.5 m. Since the time of the KNN positioning algorithm varies greatly with the amount of data, its time has not been measured. Figure 13 compares the time of the multilayer perceptron algorithm, the gradient boosting algorithm, and the improved DNN algorithm. It can be seen that the improved DNN algorithm is better than the other algorithms in time while ensuring the positioning accuracy. It is worth mentioning that the impact of SAE's dimensionality reduction cannot be seen from the accuracy alone. If the data is not dimensionally reduced, it brings the curse of dimensionality to the DNN network and eventually leads to an exponential increase in the positioning time. To sum up, for the core point, the improved DNN algorithm is clearly better than the other algorithms.

Table 5. Algorithm accuracy at the core point.

| Kind of Algorithm | Positioning Accuracy | Variance |
| --- | --- | --- |
| Improved DNN | 98.8% | 0.128 |
| KNN | 94% | 0.092 |
| Gradient boosting | 93.68% | 0.055 |
| Multilayer perceptron | 95.58% | 0.084 |

Figure 13. Time comparison of core point algorithms.

5. Conclusions
With the improvement of material life, WiFi wireless network has changed from private to public, which provides a foundation for the development of WiFi positioning technology. In people's life, 80% of the time is spent indoors, so it is particularly important to provide a convenient indoor positioning service.
In
In most
most positioning
positioning systems,
systems, only
only one
one positioning
positioning method
method may may bebe used,
used, which
which leads
leads
to
to the phenomenon that the positioning time is too long in uncommon areas, andand
the phenomenon that the positioning time is too long in uncommon areas, the the
po-
positioning accuracy is not required in these areas. In addition, temperature and humidity,
sitioning accuracy is not required in these areas. In addition, temperature and humidity,
the movement of people, and objects can affect WiFi; thereby, affecting the quality of the
the movement of people, and objects can affect WiFi; thereby, affecting the quality of the
database, which may eventually lead to a decrease in positioning accuracy. In this regard,
database, which may eventually lead to a decrease in positioning accuracy. In this regard,
this study proposes a localization method for regional segmentation of the dataset. At the
core point, the DNN network with SAE integrated with the adaptive learning rate is used to
complete the positioning task. SAE can complete the dimensionality reduction processing
well, and the DNN network integrated with the adaptive learning rate can provide high-
precision positioning. The edge point itself does not need too high accuracy, so the RF
algorithm is used to ensure the positioning accuracy and reduce the system positioning
time. Experiments show that the experimental results show that our positioning accuracy
does not exceed 1.5 m with a probability of less than 87.2% at the edge point, and the time
is only 32 ms; the positioning accuracy does not exceed 1.5 m with a probability of less than
98.8% at the core point. In addition, the variance of core points and edge points is generally
small, and the system performance is superior, which can meet the daily positioning needs.
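The region split summarized above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it applies the standard DBSCAN definitions of core, edge (border), and noise points to a toy 2-D dataset, and the function names `classify_points` and `route`, the values of `eps` and `min_samples`, and the sample points are all assumptions made for the example. The real system works on WiFi fingerprints from UJIIndoorLoc, and the model names below are plain strings standing in for the trained SAE+DNN and RF models.

```python
import math


def classify_points(points, eps, min_samples):
    """Label each point 'core', 'edge', or 'noise' using DBSCAN's definitions."""
    n = len(points)
    # Indices of all points within eps of point i (the point itself included).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_samples points in its eps-neighbourhood.
    is_core = [len(nb) >= min_samples for nb in neighbours]

    labels = []
    for i in range(n):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbours[i]):
            # Not dense itself, but reachable from a core point.
            labels.append("edge")
        else:
            # Neither dense nor near a dense point: treated as noise.
            labels.append("noise")
    return labels


def route(labels):
    """Map each label to the (placeholder) model that would localize it."""
    model_for = {"core": "SAE+DNN", "edge": "RF", "noise": None}
    return [model_for[label] for label in labels]


# Toy data: a dense square, one point hanging off it, and one far outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (10.0, 10.0)]
labels = classify_points(points, eps=1.1, min_samples=3)
print(labels)         # ['core', 'core', 'core', 'core', 'edge', 'noise']
print(route(labels))  # ['SAE+DNN', 'SAE+DNN', 'SAE+DNN', 'SAE+DNN', 'RF', None]
```

Samples labeled as noise are mapped to `None` here, mirroring the denoising role that DBSCAN plays in the proposed system. The pairwise distance computation is quadratic in the number of samples, which is acceptable for a sketch but would need an indexed neighbour search on a full fingerprint database.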
At present, the accuracy of the system at the edge point needs to be improved, and
other positioning technologies can be adopted at the edge point, or the WiFi positioning
technology can be integrated with other positioning technologies. Therefore, in future work,
we should also study other low-power or high-precision positioning technologies, such
as BLE technology and RFID technology, while researching fusion positioning technology.
The aim is for the system to obtain better positioning performance while the positioning cost
remains within an acceptable range.

Author Contributions: Conceptualization, Y.W., X.G., X.D. and Y.X.; data curation, X.G. and B.H.;
formal analysis, Y.W. and X.G.; funding acquisition, Y.W. and X.D.; methodology, X.G.; project admin-
istration, Y.W.; software, X.G. and B.H.; supervision, Y.W.; validation, X.G. and Y.X.; visualization,
X.G. and X.D.; writing—original draft, X.G. All authors have read and agreed to the published version
of the manuscript.
Funding: This research was funded by Department of Education’s basic scientific research business
special project of Heilongjiang Province, grant number 145109140; the Natural Fund Joint Guidance
Project of Heilongjiang Province, grant number LH2020F050; the Ministry of Education program,
grant number 135209240; the Degree and Graduate Education Programs of Qiqihar University, grant
number JGXM_QUG_2020011.
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can
be found here: http://archive.ics.uci.edu/ml/datasets/UJIIndoorLoc# (accessed on 13 October 2022).
The data presented in this study are available on request from the corresponding author. The data are
not publicly available due to further research needs.
Acknowledgments: The authors thank the editors and reviewers of this paper for their comments
with which its quality was improved.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Saputra, M.R.U.; Markham, A.; Trigoni, N. Visual SLAM and structure from motion in dynamic environments: A survey. ACM
Comput. Surv. (CSUR) 2018, 51, 1–36. [CrossRef]
2. Sumikura, S.; Shibuya, M.; Sakurada, K. OpenVSLAM: A versatile visual SLAM framework. In Proceedings of the 27th ACM
International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2292–2295.
3. Yang, C.; Shao, H.R. WiFi-based indoor positioning. IEEE Commun. Mag. 2015, 53, 150–157. [CrossRef]
4. Guan, W.; Huang, L.; Wen, S.; Yan, Z.; Liang, W.; Yang, C.; Liu, Z. Robot localization and navigation using visible light positioning
and SLAM fusion. J. Light. Technol. 2021, 39, 7040–7051. [CrossRef]
5. Poulose, A.; Eyobu, O.S.; Kim, M.; Han, D.S. Localization error analysis of indoor positioning system based on UWB measurements.
In Proceedings of the 2019 Eleventh International Conference on Ubiquitous and Future Networks (ICUFN), Zagreb, Croatia, 2–5
July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 84–88.
6. Kunhoth, J.; Karkar, A.G.; Al-Maadeed, S.; Al-Ali, A. Indoor positioning and wayfinding systems: A survey. Hum.-Cent. Comput.
Inf. Sci. 2020, 10, 18. [CrossRef]
7. Dong, Z.Y.; Xu, W.M.; Zhuang, H. Research on ZigBee indoor technology positioning based on RSSI. Procedia Comput. Sci. 2019,
154, 424–429. [CrossRef]
8. Wang, W.; Zhu, Q.; Wang, Z.; Yang, Y. Research on Indoor Positioning Algorithm Based on SAGA-BP Neural Network. IEEE Sens.
J. 2021, 22, 3736–3744. [CrossRef]
9. Chen, R.; Li, Z.; Ye, F.; Guo, G.; Xu, S.; Qian, L.; Liu, Z.; Huang, L. Precise indoor positioning based on acoustic ranging in
smartphone. IEEE Trans. Instrum. Meas. 2021, 70, 21008986. [CrossRef]
10. Lee, S.; Kim, J.; Moon, N. Random forest and WiFi fingerprint-based indoor location recognition system using smart watch.
Hum.-Cent. Comput. Inf. Sci. 2019, 9, 1–14. [CrossRef]
11. Xia, S.; Liu, Y.; Yuan, G.; Zhu, M.; Wang, Z. Indoor fingerprint positioning based on Wi-Fi: An overview. ISPRS Int. J. Geo-Inf.
2017, 6, 135. [CrossRef]
12. Oguntala, G.; Abd-Alhameed, R.; Jones, S.; Noras, J.; Patwary, M.; Rodriguez, J. Indoor location identification technologies for
real-time IoT-based applications: An inclusive survey. Comput. Sci. Rev. 2018, 30, 55–79. [CrossRef]
13. Liu, H.; Darabi, H.; Banerjee, P.; Liu, J. Survey of wireless indoor positioning techniques and systems. IEEE Trans. Syst. Man
Cybern. Part C (Appl. Rev.) 2007, 37, 1067–1080. [CrossRef]
14. Horsmanheimo, S.; Lembo, S.; Tuomimaki, L.; Huilla, S.; Honkamaa, P.; Laukkanen, M.; Kemppi, P. Indoor positioning platform
to support 5G location based services. In Proceedings of the 2019 IEEE International Conference on Communications Workshops
(ICC Workshops), Shanghai, China, 20–24 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
15. Filippoupolitis, A.; Oliff, W.; Loukas, G. Bluetooth low energy based occupancy detection for emergency management. In
Proceedings of the 2016 15th International Conference on Ubiquitous Computing and Communications and 2016 International
Symposium on Cyberspace and Security (IUCC-CSS), Granada, Spain, 1 December 2016; IEEE: Piscataway, NJ, USA, 2016;
pp. 31–38.
Sensors 2022, 22, 7920 19 of 20

16. Tekler, Z.D.; Low, R.; Yuen, C.; Blessing, L. Plug-Mate: An IoT-based occupancy-driven plug load management system in smart
buildings. Build. Environ. 2022, 223, 109472. [CrossRef]
17. Balaji, B.; Xu, J.; Nwokafor, A.; Gupta, R.; Agarwal, Y. Sentinel: Occupancy based HVAC actuation using existing WiFi
infrastructure within commercial buildings. In Proceedings of the 11th ACM Conference on Embedded Networked Sensor
Systems, Roma, Italy, 11–15 November 2013; pp. 1–14.
18. Yang, J.; Wang, Z.; Zhang, X. An ibeacon-based indoor positioning systems for hospitals. Int. J. Smart Home 2015, 9, 161–168.
[CrossRef]
19. Koppar, A.R.; Singh, H.; Navali, L.; Mohan, P. Indoor Positioning System (IPS) in Hospitals. In Intelligent Systems; Springer:
Singapore, 2021; pp. 171–179.
20. Kanan, R.; Elhassan, O. A combined batteryless radio and wifi indoor positioning for hospital nursing. J. Commun. Softw. Syst.
2016, 12, 34–44. [CrossRef]
21. Nuño-Maganda, M.A.; Herrera-Rivas, H.; Torres-Huitzil, C.; Marín-Castro, H.M.; Coronado-Pérez, Y. On-Device learning of
indoor location for WiFi fingerprint approach. Sensors 2018, 18, 2202. [CrossRef] [PubMed]
22. Chebli, M.S.; Mohammad, H.; Amer, K.A. An overview of wireless indoor positioning systems: Techniques, security, and
countermeasures. In Proceedings of the International Conference on Internet and Distributed Computing Systems, Naples, Italy,
10–12 October 2019; Springer: Cham, Switzerland, 2019; pp. 223–233.
23. Ehn, M.; Richardson, M.X.; Stridsberg, S.L.; Redekop, K.; Wamala-Andersson, S. Mobile safety alarms based on gps technology in
the care of older adults: Systematic review of evidence based on a general evidence framework for digital health technologies. J.
Med. Internet Res. 2021, 23, e27267. [CrossRef] [PubMed]
24. Liu, F.; Liu, J.; Yin, Y.; Wang, W.; Hu, D.; Chen, P.; Niu, Q. Survey on WiFi-based indoor positioning techniques. IET Commun.
2020, 14, 1372–1383. [CrossRef]
25. Chen, J.; Song, S.; Yu, H. An indoor multi-source fusion positioning approach based on PDR/MM/WiFi. AEU-Int. J. Electron.
Commun. 2021, 135, 153733. [CrossRef]
26. Álvarez-Merino, C.S.; Luo-Chen, H.Q.; Khatib, E.J.; Barco, R. WiFi FTM, UWB and cellular-based radio fusion for indoor
positioning. Sensors 2021, 21, 7020. [CrossRef]
27. Cao, H.; Wang, Y.; Bi, J.; Xu, S.; Si, M.; Qi, H. Indoor positioning method using WiFi RTT based on LOS identification and range
calibration. ISPRS Int. J. Geo-Inf. 2020, 9, 627. [CrossRef]
28. Ninh, D.B.; He, J.; Trung, V.T.; Huy, D.P. An effective random statistical method for Indoor Positioning System using WiFi
fingerprinting. Future Gener. Comput. Syst. 2020, 109, 238–248. [CrossRef]
29. Gao, J.; Li, X.; Ding, Y.; Su, Q.; Liu, Z. WiFi-based indoor positioning by random forest and adjusted cosine similarity. In
Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; IEEE: Piscataway,
NJ, USA, 2020; pp. 1426–1431.
30. Sun, H.; Zhu, X.; Liu, Y.; Liu, W. WiFi based fingerprinting positioning based on Seq2seq model. Sensors 2020, 20, 3767. [CrossRef]
[PubMed]
31. Wu, Y.; Chen, R.; Li, W.; Zhou, H.; Yan, K. Indoor Positioning Based on Walking-Surveyed Wi-Fi Fingerprint and Corner Reference
Trajectory-Geomagnetic Database. IEEE Sens. J. 2021, 21, 18964–18977. [CrossRef]
32. Zhang, L.; Chen, Z.; Cui, W.; Li, B.; Chen, C.; Cao, Z.; Gao, K. Wifi-based indoor robot positioning using deep fuzzy forests. IEEE
Internet Things J. 2020, 7, 10773–10781. [CrossRef]
33. Jang, B.; Kim, H. Indoor positioning technologies without offline fingerprinting map: A survey. IEEE Commun. Surv. Tutor. 2018,
21, 508–525. [CrossRef]
34. Wu, X.; Soltani, M.D.; Zhou, L.; Safari, M.; Haas, H. Hybrid LiFi and WiFi networks: A survey. IEEE Commun. Surv. Tutor. 2021,
23, 1398–1420. [CrossRef]
35. Zhang, Y.; Qu, C.; Wang, Y. An indoor positioning method based on CSI by using features optimization mechanism with LSTM.
IEEE Sens. J. 2020, 20, 4868–4878. [CrossRef]
36. Shao, W.; Luo, H.; Zhao, F.; Tian, H.; Yan, S.; Crivello, A. Accurate indoor positioning using temporal–spatial constraints based on
Wi-Fi fine time measurements. IEEE Internet Things J. 2020, 7, 11006–11019. [CrossRef]
37. Tsuchida, S.; Takahashi, T.; Ibi, S.; Sampei, S. Machine learning-aided indoor positioning based on unified fingerprints of Wi-Fi
and BLE. In Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
(APSIPA ASC), Lanzhou, China, 18–21 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1468–1472.
38. Jondhale, S.R.; Deshpande, R.S. Kalman filtering framework-based real time target tracking in wireless sensor networks using
generalized regression neural networks. IEEE Sens. J. 2018, 19, 224–233. [CrossRef]
39. Caso, G.; De Nardis, L.; Lemic, F.; Handziski, V.; Wolisz, A.; Di Benedetto, M.G. ViFi: Virtual fingerprinting WiFi-based indoor
positioning via multi-wall multi-floor propagation model. IEEE Trans. Mob. Comput. 2019, 19, 1478–1491. [CrossRef]
40. Blasio, G.S.; Quesada-Arencibia, A.; García, C.R.; Rodríguez-Rodríguez, J.C. Bluetooth Low Energy Technology Applied to Indoor
Positioning Systems: An Overview. In Proceedings of the International Conference on Computer Aided Systems Theory, Palmas
de Gran Canaria, Spain, 17–22 February 2019; Springer: Cham, Switzerland, 2019; pp. 83–90.
41. Tao, Y.; Zhao, L. A novel system for WiFi radio map automatic adaptation and indoor positioning. IEEE Trans. Veh. Technol. 2018,
67, 10683–10692. [CrossRef]
Sensors 2022, 22, 7920 20 of 20

42. Du, X.; Yang, K.; Zhou, D. MapSense: Mitigating inconsistent WiFi signals using signal patterns and pathway map for indoor
positioning. IEEE Internet Things J. 2018, 5, 4652–4662. [CrossRef]
43. Zhang, W.; Yu, K.; Wang, W.; Li, X. A self-adaptive AP selection algorithm based on multiobjective optimization for indoor WiFi
positioning. IEEE Internet Things J. 2020, 8, 1406–1416. [CrossRef]
44. Ko, C.H.; Wu, S.H. A framework for proactive indoor positioning in densely deployed WiFi networks. IEEE Trans. Mob. Comput.
2020, 21, 21440544. [CrossRef]
45. Tao, Y.; Zhao, L. AIPS: An accurate indoor positioning system with fingerprint map adaptation. IEEE Internet Things J. 2021, 9,
3062–3073. [CrossRef]
46. Belmonte-Fernández, Ó.; Montoliu, R.; Torres-Sospedra, J.; Sansano-Sansano, E.; Chia-Aguilar, D. A radiosity-based method to
avoid calibration for indoor positioning systems. Expert Syst. Appl. 2018, 105, 89–101. [CrossRef]
47. Li, L.; Guo, X.; Ansari, N. SmartLoc: Smart wireless indoor localization empowered by machine learning. IEEE Trans. Ind. Electron.
2019, 67, 6883–6893. [CrossRef]
48. Qin, F.; Zuo, T.; Wang, X. Ccpos: Wifi fingerprint indoor positioning system based on cdae-cnn. Sensors 2021, 21, 1114. [CrossRef]
49. Wang, X.; Wang, X.; Mao, S. Deep convolutional neural networks for indoor localization with CSI images. IEEE Trans. Netw. Sci.
Eng. 2018, 7, 316–327. [CrossRef]
50. Hahnel, D.; Burgard, W.; Fox, D.; Fishkin, K.; Philipose, M. Mapping and localization with RFID technology. In Proceedings of
the IEEE International Conference on Robotics and Automation, 2004, Proceedings. ICRA’04. 2004, New Orleans, LA, USA, 26
April–1 May 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 1.
51. Li, N.; Becerik-Gerber, B. Performance-based evaluation of RFID-based indoor location sensing solutions for the built environment.
Adv. Eng. Inform. 2011, 25, 535–546. [CrossRef]
52. Tekler, Z.D.; Low, R.; Gunay, B.; Andersen, R.K.; Blessing, L. A scalable Bluetooth Low Energy approach to identify occupancy
patterns and profiles in office spaces. Build. Environ. 2020, 171, 106681. [CrossRef]
53. Filippoupolitis, A.; Oliff, W.; Loukas, G. Occupancy detection for building emergency management using BLE beacons. In
International Symposium on Computer and Information Sciences; Springer: Cham, Switzerland, 2016; pp. 233–240.
54. Tiemann, J.; Schweikowski, F.; Wietfeld, C. Design of an UWB indoor-positioning system for UAV navigation in GNSS-denied
environments. In Proceedings of the 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), Banff,
AB, Canada, 13–16 October 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–7.
55. Mazhar, F.; Khan, M.G.; Sällberg, B. Precise indoor positioning using UWB: A review of methods, algorithms and implementations.
Wirel. Pers. Commun. 2017, 97, 4467–4491. [CrossRef]
56. Maheepala, M.; Kouzani, A.Z.; Joordens, M.A. Light-based indoor positioning systems: A review. IEEE Sens. J. 2020, 20, 3971–3995.
[CrossRef]
57. Chen, Y.; Tang, S.; Bouguila, N.; Wang, C.; Du, J.; Li, H.L. A fast clustering algorithm based on pruning unnecessary distance
computations in DBSCAN for high-dimensional data. Pattern Recognit. 2018, 83, 375–387. [CrossRef]
58. Zhu, Q.; Tang, X.; Elahi, A. Application of the novel harmony search optimization algorithm for DBSCAN clustering. Expert Syst.
Appl. 2021, 178, 115054. [CrossRef]
59. Maung NA, M.; Lwi, B.Y.; Thida, S. An enhanced rss fingerprinting-based wireless indoor positioning using random forest
classifier. In Proceedings of the 2020 International Conference on Advanced Information Technologies (ICAIT), Yangon, Myanmar,
4–5 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 59–63.
60. Lee, S.; Moon, N. Location recognition system using random forest. J. Ambient. Intell. Humaniz. Comput. 2018, 9, 1191–1196.
[CrossRef]
