1 s2.0 S093336571100056X Main

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

Artificial Intelligence in Medicine 53 (2011) 127–138

Contents lists available at ScienceDirect

Artificial Intelligence in Medicine


journal homepage: www.elsevier.com/locate/aiim

Non-invasive estimate of blood glucose and blood pressure from a


photoplethysmograph by means of machine learning techniques
Enric Monte-Moreno ∗
Department Signal Theory and Communications, Universitat Politècnica de Catalunya, Campus Nord, Edifici D5, C/Jordi Girona, 1-3, 08034, Barcelona, Spain

a r t i c l e i n f o a b s t r a c t

Article history: Objective: This work presents a system for a simultaneous non-invasive estimate of the blood glucose level
Received 14 May 2010 (BGL) and the systolic (SBP) and diastolic (DBP) blood pressure, using a photoplethysmograph (PPG) and
Received in revised form 2 May 2011 machine learning techniques. The method is independent of the person whose values are being measured
Accepted 8 May 2011
and does not need calibration over time or subjects.
Methodology: The architecture of the system consists of a photoplethysmograph sensor, an activity detec-
Keywords:
tion module, a signal processing module that extracts features from the PPG waveform, and a machine
Machine learning
learning algorithm that estimates the SBP, DBP and BGL values. The idea that underlies the system is
Photoplethysmography
Noninvasive measurement
that there is functional relationship between the shape of the PPG waveform and the blood pressure and
Blood glucose estimate glucose levels.
Blood pressure estimate Results: As described in this paper we tested this method on 410 individuals without performing any per-
sonalized calibration. The results were computed after cross validation. The machine learning techniques
tested were: ridge linear regression, a multilayer perceptron neural network, support vector machines
and random forests. The best results were obtained with the random forest technique. In the case of
blood pressure, the resulting coefficients of determination for reference vs. prediction were RSBP2
= 0.91,
RDBP
2
= 0.89, and RBGL
2
= 0.90. For the glucose estimation, distribution of the points on a Clarke error grid
placed 87.7% of points in zone A, 10.3% in zone B, and 1.9% in zone D. Blood pressure values complied
with the grade B protocol of the British Hypertension society.
Conclusion: An effective system for estimate of blood glucose and blood pressure from a photoplethys-
mograph is presented. The main advantage of the system is that for clinical use it complies with the grade
B protocol of the British Hypertension society for the blood pressure and only in 1.9% of the cases did not
detect hypoglycemia or hyperglycemia.
© 2011 Elsevier B.V. All rights reserved.

1. Introduction The need for a non invasive method for estimating both values
is justified by the size of the affected population. According to Wild
Measurements of blood glucose levels (BGL) and systolic (SBP) et al. [1] at least 171 million people worldwide suffer from dia-
and diastolic (DBP) pressures are currently made by means that betes. Monitoring glucose levels requires the use of glucometers,
are either invasive or require mechanical intervention. To mea- disposable needles, and blood test strips, which can be expensive
sure glucose levels current systems require blood sampling and and cumbersome to use. Monitoring blood pressure can help to pro-
use of certain reagents. Blood pressure is assessed by mechanically tect against complications associated with diabetes, such as retinal,
measuring the pressure difference between an external agent (the renal and cardiovascular disease. Moreover, according to Wolf-
cuff squeezing the arm) and the pressure of the blood vessels. An Maier et al. [2], the age and sex adjusted prevalence of hypertension
estimate is made by comparing the pressures. An alternative to (defined as >140/90 mmHg) was 28% in North American countries
these methods is the use of a non invasive device such as a pho- and 44% in European countries. Because hypertension increases the
toplethysmograph (PPG) that can assess the distal heart beat (i.e. risk of heart attack, heart failure, stroke and kidney disease, a cuff-
signal shape as measured on the finger’s capillaries) by means of less and easy-to-use method for monitoring blood pressure might
differences in light absorption. This can be done for instance, with improve daily control of blood pressure levels. Moreover both vari-
a pulse oximeter. ables are routinely measured in intensive care units. Specifically,
a non-invasive device that provides real-time, continuous arterial
pressure and glucose measurements is of interest because of its
lower risk of infection.
∗ Corresponding author. Tel.: +34 93 4016435; fax: +34 93 4016447. Additionally, our interest in simultaneous measurement of BP
E-mail address: enric.monte@upc.edu and BGL is particularly strong because in 2004, 75% of adults with

0933-3657/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.artmed.2011.05.001
128 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

self-reported diabetes had blood pressures greater than or equal to The system was trained and tested with cross validation in a
130/80 mmHg, or used prescription medications for hypertension population of 410 individuals. The measurements used as input
[3], this represents an additional argument in favor of a noninvasive and reference for training the system were obtained as follows:
monitoring system for both quantities.
There are precedents for measurements of BGL, SBP and DBP
• PPG input measurements were made with the fingertip pulse
from the PPG signal. A PPG signal is obtained by illuminating
the skin and measuring changes in light absorption. This signal is oximeter device iPod Digital Oximeter [12]. For details on the accu-
obtained from a photoplethysmograph, which is based on small racy of the pulse oximeter see Appendix A.
• SBP and DBP target measurements were made with an aneroid
light-emitting diodes facing a photodiode. In our case the light con-
sisted on a red beam and a beam in the near infrared band. The sphygmomanometer (conventional sphygmomanometer cuff).
• BGL target measurements were made with the self-monitoring
light absorption depends on the heart beat because the blood ves-
sels in the finger expand and contract with each heart beat. The blood glucometer Accu-Chek Aviva [13].
difference between the minimum absorption and the peak absorp-
tion of the light intensity is proportional to the distal cardiac pulse. It should be stressed that the database consisted of 410 differ-
For non-invasive glucose measurements, most of the systems that ent individuals, with one measurement per individual and that the
were previously developed are based on near infrared spectroscopy design criterion was that the system should not require calibration
and the majority have proved to be unsuccessful [4]. The difficulty either between individuals or over time.
lies in the fact that the light absorption of glucose is two orders Our underlying assumption was that a functional relationship
of magnitude below that of water, making reliable measurements exists between the PPG signal and blood pressure and glucose lev-
difficult and introducing the need for repeated system recalibra- els. This functional relationship can be explained by several facts,
tion either between individuals or over time. Haaland et al. [5] such as the fact that the pulse oximeter yields a waveform that is
describe some of the difficulties associated with this approach. For the result of the propagation of the ECG signal through the arteries,
BP, the PPG signal was used by both Cattivelli and Garudadri [6] and veins and capillaries, also by the fact that the PPG signal reflects the
Chen et al. [7], both are based on the pulse arrival time, i.e. by the hemodynamics of the individual. In addition the state of the auto-
time interval between the QRS apex on an electrocardiogram and nomic nervous system is partly reflected in the measured waveform
on a beat-by-beat finger-tip photoplethysmogram generated by an and also because the shape depends partially on the relationship
oximetric sensor that requires complex equipment and frequent between distal capillary bed and blood viscosity. We also assumed
calibration. Other methods, summarized in Allen [8], are difficult that a machine learning system would be able to infer this func-
to use. tional relationship from a labeled database. Our results bear out
An approach to non invasive systolic blood pressure estima- these assumptions.
tion which specifically uses the shape of the PPG waveform [9], The paper is structured as follows: in Section 2 we summarize
is based on the fact that a pressure cuff on the right arm makes the the physiological facts that support our selection of the features
pressure pulses in the arteries distal to the cuff disappear due to used to estimate pressures and glucose levels. In Section 3.1 we
the collapse of the artery under the cuff. In the paper the authors describe an AD module for filtering out artifacts and extracting a
record the PPG signal from the two index fingers, and by ana- useful window of signal; this window was then used as input to
lyzing the derivative of the signal and by correlating both PPG the SP module. In Section 3.2 we briefy describe the SP module that
signals with the cuff pressure estimate the actual SBP. Note that computed the feature vector used by the ML algorithm. In Section
the method does not estimate the DBP level. Another approach 3.3 we present a description of the ML module and the ML tech-
that makes use of the shape of the PPG [10], found that increases niques. In Section 3.4 we describe the database. In Section 4 we
on the SBP and DBP were reflected on the DC and AC components present our results and finally Section 5 is devoted to conclusions
of the PPG waveform. We do not follow the approach proposed and future work.
in this paper because of the need of calibration of the system
to each individual. Finally, in Ref. [11], the authors measure the
shape of the PPG signal, and compute features from the acceler- 2. Physiological factors relevant to the proposed system
ation pulse wave derived from the PPG signal, which are used as
input of a decision tree followed by a multiple regression analy- In this section we summarize the physiological properties that
sis. A limitation of this method is that it estimates only systolic support a theoretical estimation of blood pressure and glucose level
blood pressures and does not use information about the heart from the PPG signal. We also relate these properties to the signal
rate. processing techniques described in Sections 3.1 (the AD module)
In contrast to these methods we propose an approach that does and 3.2 (the SP module). This section lists physiological facts that
not measure differences in time intervals or in light absorption at are briefly described, along with the signal processing techniques
different wavelengths, but rather measures the effect of physiolog- used to capture their characteristics. Afterwards, in Section 3.2, we
ical alterations on the shape of the PPG waveform and on the heart provide a detailed description of each signal processing technique.
rate. These measurements are related to the individual’s hemody- Blood viscosity and vessel compliance. Blood viscosity increases
namics and the glucose levels. under conditions of insulin resistance in fat cells due to the presence
In this paper we propose a device that is structured into three of elevated plasma levels of free fatty acids. Blood viscosity also
modules. A summary of the system itself is shown in Fig. 1. From the depends on both blood pressure and glucose level [14]. Thus the
signals entering the PPG, the activity detection (AD) module selects viscosity of the blood alters the flux of blood in the capillaries and
a window of 1-min duration containing a clean signal (compris- therefore the shape of the PPG pulse. Vessel compliance is related
ing 4500 consecutive samples). This module eliminates artifacts to the vessel wall response to increases in pressure that are in turn
and avoids loss of signal due to movements of the finger. A sig- determined by the state of the cardiovascular system.
nal processing (SP) module processes the signal in this window and
extracts a vector of features. The output of this module is a fixed
length vector. Finally, a machine learning (ML) module infers the • Both properties are reflected in the degree to which the PPG
function that relates the output of the SP module to the variables, waveform is damped; this information can be obtained via spec-
BGL, SBP and DBP. tral analysis and an autoregressive model of the waveform.
E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138 129

Fig. 1. Diagram of the system.

Hemodynamics. The baroreflex1 relates the heart rate to the been correlated with HR variability and the spectral shape of the
blood pressure [15,16]. Several studies [17] have explored the spec- cardiac tracing. McCraty et al. [26] showed that different emotional
tral properties of the HR and its relationship to blood pressure states are reflected in the shape of the power spectrum of the HR
levels. These studies identified a correlation and a transfer func- variability.
tion between different frequencies and the mean arterial pressure.
Previous investigators have described a relationship between spec- • This also justifies the use of features related to the shape of the
tral bands of the pulse interval and systolic arterial pressure [18], PPG pulse and the power spectrum and statistics of the HR (i.e
and between HR variability and the baroreflex in humans [19]; mean value, standard deviation, skewness, etc.).
the non-linearity of the latter relationship has also been reported
[20,21]. Breathing and the autonomic nervous system. Several studies have
shown that the respiratory frequency can regulate the blood pres-
• This justifies the use of ML techniques flexible enough to infer the sure [27,28] and that in diabetic patients, the two are closely related
nonlinear relationship between the spectral characteristics of the [29]. It is also known that the respiratory rate can be extracted from
HR signal and the blood pressure, as well as the use of spectral the PPG signal[30].
analysis features.
• This justifies the use of the low frequency components of the PPG
Metabolic syndrome. The metabolic syndrome [22] consists of energy, which are a measure of the respiratory rate.
hypertension, obesity, and insulin resistance and other pathologies.
The syndrome causes changes in glucose and blood pressure levels 3. Materials and methods
as well as the state of the autonomic system and general hemody-
namics, and is related to each subject’s individual demographic and This section gives a detailed account of the three parts that con-
morphometric characteristics. stitute the system, shown together in Fig. 1. The recorded signal
from the PPG is digitalized into the sequence SPPG (t) at a sample
• This requires that, we use variables such as age, weight and body rate of 75 samples/s and is the input to the activity detection mod-
mass index, in addition to describing the shape of the pulse. ule. The AD module selects a window of the SPPG (t) containing 4500
consecutive samples of clean signal. The window is then segmented
Relationship between diabetes and the HR variability. There is evi- into frames. The output of the AD module is:
dence that diabetes alters the relationship between HR and blood
pressure [23]. Diabetic neuropathies associated with diabetes mel- • A vector Swindow (t) that consists of 1 min of clean signal (window
litus result from diabetic microvascular injury to the small blood length, Lwindow = 4500 consecutive samples).
vessels that supply the nerves. These injuries are reflected in the • An array Sframe (, n), that consists of frames of 5 s (375 samples)
power spectrum of the HR [24] and manifest as a much lower HR with an overlap of 50% that were extracted from Swindow (t). The
variability. Moreover, they are indirectly reflected in blood pres- index  indicates the sample number in a frame and n is the frame
sure dynamics, due to the fact that the parasympathetic system is number. The number of samples in a frame is Lframe = 375, and the
more affected than the sympathetic. number of frames in a window is Nframe = 24.

• Since there is a functional relationship between altered glucose


Both Swindow (t) and Sframe (, n) are input to the signal processing
levels due to diabetes and HR variability, the features that will module, whose output is the vector XF of dimension 33. This vector
capture this relationship are the power spectrum of the distance contains the set of features used in the machine learning module.
between pulses and statistics about the HR and the HR variability. Finally we present the ML module and justify the technique selected
for inferring the function relating XF and the desired targets, i.e. the
Emotional states. It is known that emotional states [25], such as reference values of BGL, SBP and DBP.
stress, anger, happiness, surprise, etc. change the values of the blood The criterion for deciding Lframe = 375 was that a frame of 5 s was
pressure and glucose by altering the activity of the sympathetic long enough to record several heart beats, and at the same time, it
and parasympathetic nervous systems. These emotional states have was short enough to consider that the change in amplitude due
to the breathing rate was negligible. A frame duration lower than
5 s might capture too few heart beats, and therefore, the detection
1
A negative feedback system that controls short-term changes in blood pressure. of activity might be unreliable. However, a frame duration signifi-
130 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

cantly longer than 5 s might yield unreliable endpoints, i.e., frames • for background noise, transients and clicks, the mean value of the
with a transient. The  index is the local time in the n th frame. KTEn in a frame is low.

3.1. Activity detection module Therefore it is a good candidate for a feature in a classifier that
separates clean signal from noise, transients or artifacts.
In this section we summarize the structure of the AD module. The KTEn (t) sequence for the n th frame was computed as,
The aim of the AD module is to remove the corrupted parts of the
2
signal (i.e. initial transient, spurious clicks, loss of signal, noise, sat- KTEn (t) = Sframe (t, n) − Sframe (t + 1, n)Sframe (t − 1, n) (2)
urations, etc.), giving as output the vector Swindow (t) and the array
for t = 2, . . ., Lframe − 1. Then from KTEn (t) we computed, the
Sframe (, n). The AD module was designed similarly to the tech- 
mean KTEn , variance KTEn , interquartile range KTEn and skewness
iqr
niques used for detecting voice activity in speech technologies [31]. skew
KTEn .
We were interested in a robust and computationally light module.
Qi-Zheng energy. The Qi-Zheng energy (QZEn ) is the result of fil-
This was done by computing sets of features that used complemen-
tering the energy sequence at each frame, using a filter designed
tary information. Also some features had a dual use, as they were
to be optimal for detecting beginning or ending edges of a given
used in both the AD and SP modules.
signal. The coefficients of the filter are presented elsewhere [35].
The activity detection module consists of the following parts:
This model for detecting the edges of a signal assumes exponen-
n for each
tial growth of the edge to be detected, and the model is best for
1. A submodule that computes the feature vector XAD
detecting beginning or ending transients of periodic signals [35].
frame.
It filters out clicks, pops, sudden falls in amplitude, or noise with
2. A decision function based on a linear classifier that for each frame
growth-shapes that differ from an exponential. This feature is com-
outputs one of two classes, either “signal”, or “not signal”.
plementary to the others in the sense that it takes high values at
3. A finite state automaton (FSA).
the endpoints.
Spectral entropy. The spectral entropy sequence HnS is a scalar
This module takes as input the PPG signal, SPPG (t), and sequen- associated to the n th frame (see detailed description in Section
tially computes Sframe (, n). For each frame we compute a feature 3.2, spectral entropy).
n , that is the input to a decision function. The decision
vector XAD The entropy HnS has the property that, for a flat or tilted power
function for each XADn outputs the class of the frame. The class of
spectrum of a frame (i.e. absence of signal or presence of noise or
each frame is the input to a FSA that determines the endpoints clicks) it has a high value, while for a tone or harmonic signal it has
of the clean signal, i.e., determines the 1 min window of consecu- a low value. Therefore, one can use the spectral entropy to charac-
tive samples of clean signal. Once these endpoints are found, the terize the PPG signal at frame level, which will have a low entropy
system outputs the set of frames Sframe (, n) that fall between the for clean signal zones, while transients and artifacts will have high
endpoints along with the signal Swindow (t). Next we describe each entropy (i.e. flat or irregular spectrum).
of these parts in detail. Zero crossing rate. The Zero crossing rate Znc was computed for
the n th frame. First, we filtered Sframe (, n) with a high-pass filter
n
3.1.1. Set of features of the vector XAD in order to remove the mean. Given that the filter was FIR of order
In this subsection we present the set of features that are com- 4 and that the transient was negligible, there was no need to apply
puted from each frame in order to form the vector XAD n . The vector
a window. Then, the Znc was computed from the number of cross-
n is used as the input to a classifier (decision function) that clas-
XAD ings divided by the number of samples in the frame Lframe . In the
sifies the frame as either “signal” or “not signal”. presence of a clean PPG signal, this variable took values between
Below we present a summary of the features used in the vec- 5/Lframe and 10/Lframe , whereas in cases of sudden losses of signal,
n . When a given feature is also used in X , we give a detailed
tor XAD F clicks or transients, the value was either too low or too high.
description in Section 3.2. Kaiser–Teager energy of the PPG signal. The TheXADn vector. Finally the vector X n was constructed by aggre-
AD
Kaiser–Teager energy is a standard tool used in speech processing gating the features described above, i.e.:
for end-point detection, and for finding energy profiles of signals  T

n
= KTEn , KTEn , KTEn , KTEskew , QZEn , HnS , Znc
with periodic signal components. The Kaiser–Teager energy (KTE) iqr
XAD n
was computed at the frame level. The KTEn is based on a harmonic
generation model of a sinusoid [32]. The KTEn model assumes that
the signal is a sinusoid xt of the form x(t) = Acos(t + ) and the
3.1.2. Decision function
energy is computed by the left part of Eq. 1. After simple alge-
The last block was a decision function that was implemented as
bra [33] it can be shown that with the sinusoid model, KTEn (t) is n as input and decides between two classes,
a classifier that takes XAD
approximately proportional to the square of the amplitude times
i.e. class “signal” (PPG signal) and class “no signal” (corruption/loss
the angular frequency A2 2 .
of signal, background noise). The decision function consisted of a
KTE(t) = x(t)2 − x(t + 1)x(t − 1) ≈ A2 2 (1) linear classifier [36], that performs the classification by means of
a hyperplane on the space of the features listed above. This clas-
for t = 1, . . ., Lframe . sifier was trained from a hand-labeled subset of the database. The
This measure can be extended to signals with a richer spec- selection of the training subset and the labeling was done by the
tral content, by computing the value for each bin of the FFT of a author. The signal-to-noise ratio of the useful segments of the PPG
sequence at frame level [34]. The Teager energy computed by Eq. signal was higher than 30 dB and had clear harmonic components,
1 in the case of periodic signals works as if the Teager was com- whereas the transients and segments with a loss of signal had irreg-
puted for each bin of the FFT and then added. Therefore we have a ular shapes and a much lower energy. Therefore a linear classifier
measure sum of the energy for all bins. had a high performance in terms of the detection rate, and the set
Thus the KTEn (t) has the following properties of examples used for training the classifier was selected to have a
representation of the possible transients and artifacts of the signal.
• for a periodic waveform of high amplitude the mean value of the Note that the fraction of the recording time that corresponded to
KTEn in a frame is high. the “no signal” class was much smaller than the one corresponding
E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138 131

The clean signal was considered to be in the states s2 and s3 .


Once a 1 min segment of consecutive samples was found, this clean
signal was assigned to the outputs Swindow (t) and Sframe (, n).

3.1.4. Output of the AD module


Once the FSA has spent 1 min (i.e. 24 consecutive frames) in
states s2 and/or s3 , the module copies the samples of SPPG (t) in this
interval into the vector Swindow (t) and the frames in this interval into
the variable Sframe (, n). Both of these were the output of the mod-
ule. Note that the Kaiser–Teager energy and the spectral entropy
were also used in the SP module.

3.1.5. Robustness to changes in the PPG sensor


Fig. 2. Transitions between states that were allowed by the finite state automaton. Another important aspect of the design is its robustness to
changes in the photoplethysmographic instrument itself. A change
in the sensor affects the frequency response and/or gain in the
to the “signal” class. Therefore the training was biased, increasing amplifier. In the future in order to compensate for the differ-
the prior probability of the “no signal” case. ences between photoplethysmographs, a cepstral subtraction [31]
approach should be used to compensate for changes in the transfer
3.1.3. Finite state automaton function of each photoplethysmographs. This aspect of the prepro-
We used a FSA to segment the PPG signal. The decisions on the cessing is left for future work.
transitions were done based on counts on the number of consec-
utive frames classified by the decision function as “signal” or “no 3.2. Signal processing module
signal” (see Section 3.1.2). Three states were defined,
In this section we summarize the features computed from the
• s1 =Spurious/absent signal, PPG signal. The physiological properties mentioned in Section 2
• s2 =In-ppg, were used as a guide to select the set of features in XF . The output
• s3 =Leaving-ppg, of this module was a vector XF that, consisted of the aggregated
features that partially characterize the subject’s hemodynamic and
autonomic status. A duration of 1 min for Swindow (t) was selected in
in order to model the ambiguous zones between the two order to have sufficient samples for computing the heart rate and
classes. The time spent in each state was used to filter false pos- respiratory rate. For the frames, the time length and overlap were
itives/negatives and allow for a time margin at each endpoint of selected so that each frame would have at least four or five heart
activity. The transitions were decided from counts on the number pulses, and also most of the time lasted less than the respiratory
of frames classified by a decision function, and the allowed transi- rate (i.e. most frames would be shorter than the time required for
tion between states was specified by the directed graph shown in one breath).
Fig. 2. The feature vector XF generated in this module consists of two
Each state consisted of a set of counters and had as input the kinds of features:
class given by the decision function. The counters, were defined as
follows: • global features computed from Swindow (t) and
• features aggregated from traits computed on Sframe (, n). The
• C i was the number of consecutive frames at state i classified as number of frames (Nframe ) was 24. These features model the time
S
“signal”. variation of certain magnitudes across the window being mea-
• C i was the number of consecutive frames at state i classified as sured.
NS
“no signal”.
Autoregressive model of the PPG waveform. The autoregressive
The rules for the transition between states (see Fig. 2) were the (AR) coefficients model the spectral envelope of the PPG signal;
following: this technique was used because it assumes an underlying model
of a signal generated by an impulse that goes through a series of
• resonant tubes [31]. This can act as a coarse model for propagation
s1 → s2 if CS1 ≥ 4, otherwise, stay at s1
• 2 ≥ 2, otherwise, stay at s of the ECG signal through the arteries, veins and capillaries, and
s2 → s3 if CNS 2
reflects the changes in the shape of the signal due to the propagation
• s3 → s2 if CS3 ≥ 2, otherwise, stay at s3
through tubes of different diameters, as well as to the effects of
• s3 → s1 3 ≥ 4, otherwise, stay at s
if CNS 3 viscosity. In addition, this method has the interesting feature that
the AR coefficients also model the shape of the basic pulse. When
The thresholds were decided by inspection from a set of exam- the order of the model was selected, we required that the shape
ples (see Section 4.1) and adjusted by hand on a set of 20 examples; of the AR power spectrum of a given pulse follow correctly the
this, was sufficient to capture a clean segment of the PPG signal. An periodogram of the PPG signal. In our experiments we used an AR
alternative might have been deciding the thresholds by means of model of order 5. The coefficients were computed from the Yule-
an integer optimization technique or even by enumeration as the Walker equations [31] and were derived from sample covariances.
number of possible configurations is low enough using the whole This feature was considered to be a global feature and consisted of
database. Nevertheless as the base classifier had a low error rate, we a vector ARPPG of length 5.
selected the recordings with false negatives and false positives so Kaiser–Teager energy of the PPG signal We computed the
that the thresholds were reliable in adverse conditions, otherwise Kaiser–Teager energy features via two approaches, with one at the
the thresholds would be biased to an optimistic situation. window level and another at the frame level.
132 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

• At the window level we first computed the sequence of instanta- Then we normalized the square of the absolute value of each bin
neous energy KTE(t) as of the FFT:

|X n [k]|2
PXn [k] ← k = 1 . . . LFFT

LFFT

KTE(t) = Swindow (t)2 − Swindow (t + 1)Swindow (t − 1). |X n [j]|2


j=1

Finally, we computed the entropy HSn from PXn :


for t = 2, . . ., Lwindow − 1. From the sequence KTE(t), we estimated
an AR model of order 5. The coefficients of the model were 
LFFT
 
assigned to a vector KTEAR of length 5. HnS ← PXn [k]Log PXn [k] (4)
• At the frame level, we computed the KTEn (t) sequence from Eq. k=1

2, and we characterized each frame by the mean KTEn , variance
 iqr skew
This gave us a sequence of estimates of the entropy at the frame
KTEn , interquartile range KTEn and skewness KTEn . These level Sframe (, n). From the sequence HnS , we computed for each
features are also used at the AD module at the frame level (see  iqr
frame, its mean value HS , variance HS , interquartile range HS and
Section 3.1.1). From these sequences, we summarized the behav-
skewness HSskew .
ior over the whole window by computing averages for each frame.
Energy profile of the PPG signal. The log-energy profile LogEn of
These averages were the mean KTE , variance KTE , interquartile
the PPG signal was computed at frame level from Sframe (, n) in
range KTEiqr and skewness KTEskew .
order to estimate the respiratory rate. Unlike other methods based
on the wavelet transform [39], we decided to use the log-energy
of the sequence of frames, aiming for a low complexity estimate,
Heart rate statistics. The heart rate (HR) statistics were consid- despite the poorer precision. The LogEn was computed as follows:
ered to be a global feature and were computed from the Swindow (t). L 
First, the signal was filtered with a band pass filter in order to

frame
2
LogEn ← Log Sframe (, n)
remove the mean and attenuate the effect of the dicrotic notch. The
=1
specifications of the filter can be found at Appendix B. Afterwards
the HR was computed from the zero crossings of the signal (i.e. the The frame duration of 5 s allowed for computation of the respira-
time interval between alternate crossings). This method was found tory rate of a healthy adult at rest. The feature used for summarizing
to be more robust than computing the HR from signal peaks. Using the spectral variability of the energy profile were the AR parame-
the sequence of time intervals between heart beats, we computed ters of order 5 computed after subtracting the mean. The vector of
the mean value HR , the variance HR , the interquartile range HRiqr AR coefficients LogEAR was computed from the sequence LogEn and
and the skewness HRskew for the whole window. summarizes the spectral content of the sequence of log energy over
Oxygen saturation range. The feature OSRPPG is given by the pulse all of the frames. The AR vector LogEAR was a model of the fluctua-
oximeter (i.e. the PPG sensor) and provides the mean value for tions of the energy of the PPG relative to the respiratory rate. Note
peripheral oxygen saturation over the entire window. that in our case, the objective was not to generate an exact estimate
Information about the individual. The selected variables were age of the respiratory rate, but to find a rough estimate of the rate and
AG, weight WGT and body mass index BMI. Whether an individ- its power spectrum. In addition, we computed as descriptive statis-
ual was taking beta blockers was found to be irrelevant from a tics the log energy profile of Swindow (t), the variance LogE and the
performance point of view, and was therefore discarded. interquartile range LogEiqr .
Spectral entropy statistics. This feature was computed for each TheXF vector. Finally, the vector XF was constructed by aggregat-
frame Sframe (, n) and provides an index of confidence in the pres- ing the features mentioned above. This feature vector was used as
ence of spectral lines and/or the tilt of the spectrum. The spectral the input to a machine learning module.
entropy HnS of a frame is a scalar obtained by computing the entropy
function (see Eq. 4) from the normalized power spectrum [37,38].
Note that as the power spectrum is normalized so that the area is XF = KTEAR , KTE , KTE , KTEiqr , KTEskew , HR ,
equal to one, it can be interpreted as a probability distribution, and 
the use of the entropy function therefore makes sense. × HR , HRiqr , HRskew , ARPPG , OSRPPG , AG, WGT, BMI, HS , HS ,
This measure has the property that for tones (i.e. narrow spectral T
× HS , HSskew ,LogE AR ,LogE  ,LogE iqr
iqr
lines), the entropy will take low values, whereas when the spec-
trum is flat or has a low tilt, the entropy will take high values.
In other words, this feature measures the damping of the pulses,
the spectral shape, harmonic components, and presence of noise.
3.3. Machine learning module
The power spectrum is computed by means of the short time fast
Fourier transform (FFT) of n th frame Sframe (, n). The duration of
In this section we describe the module that infers from labeled
the frame assures that at least several heart pulses are present in
examples a function that relates the feature vector XF and the
the analysis frame.
desired targets (DBP, SBP and BGL). We tested four different ML
This feature was computed for each frame as follows:
techniques, each with different structures and properties.
First, we computed the FFT of Sframe (, n) of length Lframe , which
A requirement for the ML techniques selected was that each
was zero paddled, in order to increase the length to the nearest
should be flexible enough to deal with the peculiarities of the data,
power of two. The final length of the zero paddled frame (LFFT ) was
have ways to control the over-generalization phenomenon, and be
512.
robust enough to deal with noisy measurements. It should be noted
that the reference values have some inherent uncertainty that can
be interpreted as noise (see Section 3.4). This error originates from
X n ← FFT(Sframe (, n), LFFT ) (3) the error tolerances of the measuring devices. On the other hand
E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138 133

the effect of a small amount of additive noise in the references has layer perceptron was selected because it is able to approximate
been shown to improve accuracy when making predictions [40]. non-linear relationships. The structure consisted of a hidden layer
Additional requirements for the module were that it should and an output layer for each value of interest. The outputs shared
generalize from the training examples to unseen inputs, should the information from the hidden units in order to make use of the
not need calibration, should not require too much computational common traits of the functions to be estimated; this is known to
resources to generate the output (note that this was not required improve performance [43]. The conjugate gradient algorithm was
during training), and finally, should be stable, meaning that small selected to enable fast convergence and the capacity for dealing
changes in the training database composition, should not give rise with high dimension inputs. The Levenberg–Marquart algorithm
to large differences in performance. The stability of the algorithm was rejected because it was too slow. In order to deal with the
gives us an indication of the reliability of the trained system and, problem of convergence to local minima, we used a multistarting
together with accuracy, served as the criterion for selecting the scheme (i.e., train several times and select the network with the
appropriate ML technique for this module. Stability was checked best validation performance). In addition, the input data were
by cross-validation, and in Section 4.4, we rank the algorithms by standardized. The mean and variance values of the standard-
the standard deviation of the residual sum of squares (RSS) between ization process were estimated from TrnValDB and were used
different rotations of the training and validation database. to standardize both TrnValDB and TstDB. The best configura-
In this subsection we describe the structure of each ML tech- tion (number of hidden units and type of nonlinear function)
nique, how it was trained, and how the structure and parameters was determined from a 10-fold cross validation on the TrnValDB
were optimized so that results could be compared. database. This configuration had 10 neurons in the hidden layer,
To select the best structure and parameters we used 80% of the three at the output, and a hyperbolic tangent representing the
database as a training and validation database (TrnValDB) and the non-linearity. The quality criterion was the coefficient of deter-
remaining 20% of the database (TstDB) for testing and comparison mination R2 value on the test database TstDB. The software was
purposes. Note that each sample of a database consisted of a unique the neural network toolbox for Matlab.
recording. Selection of the best structure and testing were done in • Support vector machines (SVM). For SVM regression, we used
two phases. the -insensitive loss function [44], and the kernels were the
Gaussian and the polynomial. As the Gaussian kernel assumes
• In the first phase, different architectures and sets of parameters a common sigma for all dimensions, we standardized the data as
were tried for each algorithm, and selection was based on the best explained above. The values of sigma were varied from 0.1 to 5,
performance after 10-fold cross validation on TrnValDB, i.e., 90% and for the polynomial kernel, orders from 1 to 5 were tested.
of the TrnValDB database was used for training and the remainder In all cases different values of the soft margin C and  tube were
for computing the validation results. This was done 10 times by tried. It should be noted that the algorithm was sensitive to out-
rotating the databases. liers and that numerical problems were encountered for some
• In the second phase, once the best configuration was found, the combinations of parameters. The best configuration (kernel type,
performance was evaluated on the TstDB database. soft margin,  and kernel parameters) was determined from a 10-
fold cross validation on the TrnValDB database. The best kernel
The machine learning techniques tested were the following: was a Gaussian kernel. The quality criterion was the R2 value on
the testing database TstDB. The software was the SVM toolbox
• Linear regression (LR). For the linear regression method [41], [45] for Matlab.
j n j • Classification and regression trees (CART). Classification and
we trained a univariate model yj = ˇ0 + ˇ X (i), where j =
i=1 i F regression trees are ML techniques that infer logical trees com-
SPB,DPB,GBL , using the ridge regression algorithm [41]. Ridge posed of disjunctions of conjunctions [46], and at each node an
regression was selected because, we found that estimates of element of the input vector is compared to a threshold. This
model parameters by direct minimization of the RSS were unreli- ML technique was not tested with cross validation because pre-
able. This happened both in the multivariate and univariate cases, liminary testing yielded a rather poor performance. It was also
as the covariance matrix was badly conditioned. This occurred discarded because we tested the random forest technique, which
because the vector XF has features that are highly correlated. is a generalization of the CART algorithm.
The result was a high RSS on the validation and test databases • Random forest (RF). Random forests are a ML technique based on a
j
and high values for some of the ˇi . The ridge regression method set (forest) of classification and regression trees [47], each trained
minimizes a penalized residual sum of squares and reduces the in a different way. The output of the system is an aggregation of
variability due to the high correlation between features. The the outputs of the trees of the forest. The methodology is justified
penalization parameter  of the ridge regression imposed a size by the fact that the mathematical expression of the error of a clas-
constraint on the coefficients, thus correcting the high values sifier depends on the bias and variance of each of the trees of the
of the estimated coefficients. This phenomenon is explained by forest. In order to take advantage of the fact that averages reduce
Hastie et al. [41] on page 59. Weights were computed from the the bias and variance, the algorithm introduces a systematic con-
input feature matrix X = [XF1 , . . . , XFM ], where M was the number trolled variability and bias [47]. This is done by bootstrapping the
of observation vectors used for computing the parameters ˇ of training data and introducing randomness at the selection of the
the linear regression model. The vector of the reference values testing feature at each node. The result is that the mean of the out-
(targets) for each observation was denoted as y and the regu- puts reduces the variance and bias of the estimate; therefore the
larization parameter as . The ridged regression estimate was;
  system has a lower global error rate. The execution time of this
ˆ ridge = X T X + I −1 X T y. The value of  was the mean value of
ˇ method is low because the base classifier is a decision tree, and the
the optimal values computed in each of a 10-fold cross validation only computations are comparisons at each node, in other words,
on the TrnValDB database. The quality criterion was the value of there are no multiplications. Another advantage is related to the
the coefficient of determination R2 on the testing database, TstDB. fact that each node compares only one feature, making the system
The algorithm was programmed in Matlab by the author. robust with respect to the scale of the inputs and the correlations
• Neural networks (NN). For the neural networks method we used a between them. Consequently there is no need to standardize or
multilayer perceptron trained by means of a conjugate gradient whiten the input data. The best configuration (number of trees
algorithm (Fletcher Powell) [42] minimizing the RSS. The multi- in the forest, number of features tested at each node and maxi-
134 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

Table 1 Note that the experiments were done in two phases, in the first
Characteristics for the reference values.
phase we compared the ML algorithms, and in the second phase
Measurement Glucose (mg/dl) SBP (mmHg) DBP (mmHg) we tested the best algorithm. The cross validation methodology was
Min 49 90 60 done at each phase with different proportions of elements assigned
Max 393 180 120 the training, validation and test databases.
Mean 139 123 78
SD 66 21 16
Range 343 90 60
4.1. Training the activity detection module

The activity detection module was trained from labeled exam-


mum number of samples at a given node) was determined from ples. Preliminary tests showed that performance on endpoint
a 10-fold cross-validation on the TrnValDB database. The qual- detection did not depend on the subset used. We trained the clas-
ity criterion was the R2 value on the testing database TstDB. The sifier with one-fifth of the database manually labeled, and left the
performance of the RF was insensitive to variations of the param- parameters fixed for all other experiments. Examples were selected
eters in a certain margin, i.e., for a number of trees higher than 30 so that the training database of the AD module had enough exam-
there was no improvement, and the size of the set at each node ples of possible variations, including clicks, loss of signal, transients,
could vary from 1 to 10 without worsening the performance. The etc. The thresholds of the FSA were determined from 20 examples.
algorithm was programmed in Matlab by the author. Inspection of the results from the database showed that the rest
of the database was well segmented. Note that the final perfor-
3.4. Description of the database mance of the whole system was tested by cross validation on the
segmented data, but the segmentation algorithm was not retrained.
The data base was collected from 213 male and 197 female
individuals, aged from 9 to 80 years (mean±SD=37.97±13.32), (see 4.2. Feature selection
Table 2). The participants were selected from personnel of the Uni-
versitat Politecnica de Catalunya (# 71) and from an ambulatory Several methods exist for selecting a subset of features that
primary care center (# 339). The database was recorded on a suc- maximizes the performance of a ML system. Some are based on
cessive recruitment base without selection. The sampling biases enumeration, such as the sequential search techniques [42], while
come from the fact that a part of the database consisted of healthy other are based on projections, such as the use of PCA or factor
young individuals (university personnel) and persons from the analysis.
ambulatory medical assistance staff. Another bias of the data was We did not attempt feature selection using sequential search
that pediatric and geriatric populations were under-represented. A techniques, due to the computational requirements. On the other
total of 79 people reported suffering from diabetes. The measure- hand, two of the ML techniques that we used (LR-ridge, RF) rank
ments were taken by personnel trained in the technique. University the usefulness of the features, while another technique (SVM), is
measurements were taken by the author, and ambulatory mea- robust to the presence of irrelevant features. In the case of LR, the
surements were taken by a medical doctor. Each subject’s PPG regularization parameter  in the ridge regression algorithm shrank
was measured with the fingertip pulse oximeter device iPod Digital the regression coefficients by their contribution to the final results
Oximeter [12], blood pressure measurements were made an aneroid (see Ref. [41], p. 61). In our case, the coefficients shrank in a sim-
sphygmomanometer and the glucose levels were measured with a j
ilar way as  increased, and therefore from the set of weights ˇi
self-monitoring of blood glucose (SMBG) meter Accu-Chek Aviva of the ridge regression, we could not conclude that given features
[13]. Table 1 summarizes the measured blood pressure and glucose were either irrelevant or highly predictive. On the other hand, the
levels. RF has a method for ranking the features based on how the perfor-
The reference values in the database are not perfect. There are mance degrades when a given input feature is randomly permuted
variations due to estimating error of the SMBG meter, as well as in the test [47]. In this case we observed a similar phenomenon:
uncertainty due to the fact that BP values were measured manually. i.e., the contribution of the features to performance was more or
We found that the manual estimates of BP values were sometimes less similar. The LR-ridge and RF both indicated that the features
rounded off, which resulted in slight clustering that can be seen in contributed in a similar manner to the final result. As such, it was
Figs. 4 and 5. This can be considered to be a rounding error, and has not necessary to do a feature selection process.
a maximum value of 5 mmHg, which is within the grading criteria The age, weight and BMI variables were not predictive when
of the BHS [48] (Table 6) Table 2. used alone (i.e., R2 < 0.1), but nevertheless improved the accuracy
of the results. It was found that with the information related to
4. Training and results the PPG signal alone, the predicted values were reasonable, but
improved when used simultaneously with the information about
The experimental part consisted first of selecting the best the individual.
machine learning technique and then assessing its performance. In
this section we first review the training of the AD module, present
4.3. Selection of the ML algorithm
the ML technique selected and finally discuss the performance of
the definitive system. The final system was implemented in a Mat-
The criterion for selecting the ML algorithm was the value of
lab program entitled SIMPRESGLUC.
the coefficient of determination R2 on the test database. The TstDB
database consisted of 20% of the original database (82 individuals)
Table 2 sampled randomly. The TrnValDB database contained the remain-
Characteristics for the study population.
ing 80% (328 individuals). The ML algorithms were tuned on the
Measurement Weight (kg) Height (cm) Age (years) TrnValDB in order to maximize their performance on a 10-fold
Min 35 138 9
cross validation. Once the best structure and set of parameters
Max 111 192 80 were found, the algorithms were trained again with the TrnValDB
Mean 65.5 164.9 37.9 database and tested with the TstDB database. As shown in Table 3
SD 14.7 9.6 13.3 the RF method give the best performance.
E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138 135

Table 3 Table 5
Performance of the ML algorithms. Results of the Clarke error grid.

LR NN SVM RF Zone A B C D E

RBGL
2
0.52 0.54 0.64 0.88 Percentage (%) 87.71 10.32 0 1.96 0
RSBP
2
0.59 0.65 0.72 0.90 Total 357 42 0 8 0
RDBP
2
0.53 0.63 0.68 0.86

4.5.1. Glucose results


Table 4 In Fig. 3 we present the Clarke error grid, which divides the scat-
Final performance of the system. ter plot into five regions. These regions quantify the accuracy of the
blood glucose reference values compared to the predicted values
BGL SBP DBP
for different types of errors.
RRF
2
0.90 0.91 0.89 The grid is divided into five regions, each of which is associ-
ated with a different interpretation of the prediction error. Details
about the grid have been described previously [49]. In Table 5 we
present the number and percentage of points in each region. Our
4.4. Stability of the ML algorithms
results show that 87.7% of points were in region A, which cov-
ers predictions within 20% of the reference sensor. Another 10.3%
In order to have a reliable idea of how the algorithms per-
of the data went to region B, which contains predictions more
formed on novel data, we computed the standard deviation of the R2
than 20% away from the reference sensor but would not lead to
between the ten rotations of the train and validation database (Trn-
inappropriate treatment, in the sense that although the measure-
ValDB). The variability of the performance gives an indication of the
ment is inaccurate by at most 20% of the value, it does not miss
stability of the predictive method. For each of the three measures
cases of hypoglycemia or hyperglycemia or give false positives for
(BGL, SBP and DBP), Random Forest ranked first, as it displayed
either hypoglycemia or hyperglycemia. No points fell into region
less variability on the cross-validation for all three measures. This
C, which represents the false positives either of hyperglucemia
method was followed by the linear regression algorithm, the SVM
or hypoglucemia. Only 2% of the points fell into region D, which
and finally the NN. In the case of the NN, the performance displayed
typically indicates that the prediction would not be able to detect
significant variance despite multistarting.
hypoglycemia or hyperglycemia. Finally no points fell into region
E, which consists of prediction errors that confuse hypoglycemia
4.5. Final results for hyperglycemia or viceversa.
The resulting value of the coefficient of determination RBGL2 was
The system’s performance was assessed using the random for- 0.90; therefore the RSS was 10% of the sample variance.
est method and was tested on all the 410 individuals. The random The Clarke grid scatterplot was generated via a Matlab function
forest technique was selected as the predictive method because it [50].
had both the best R2 value and the best stability.
The performance was assessed for the whole database by means 4.5.2. Blood pressure results
of a 10-fold cross validation; i.e., we used 90% of the database In Fig. 4 we present the scatter-plot for systolic blood pressures
for training, leaving 10% as testing database, and then rotated ten and in Fig. 5 the plot for diastolic blood pressures. The values of
the resulting coefficients of determination were RSBP 2 = 0.91 and
times. The performance in terms of the R2 is summarized in Table 4.
2
RDBP = 0.89, meaning that in both cases the RSS was approximately
10% of the sample variance.
Table 6 displays the results and the error thresholds for grad-
Clarke’s Error Grid Analysis
400 ing the system according to the British Hypertension Society
EC B (BHS) criteria. The BHS protocol [48] grades devices by cumulative

350
220

300 200
Predicted Concentration [mg/dl]

B 180
250
Predicted SBP [mmHg]

160
200

140
150 D
120
D
100
100

50
80
A C E
0 60
0 50 100 150 200 250 300 350 400 80 100 120 140 160 180 200 220
Reference Concentration [mg/dl] Reference SBP [mmHg]

Fig. 3. Results of the Clarke error grid. Fig. 4. Scatter plot of systolic blood pressures.
136 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

130 system has an accuracy similar to that of the standard measure-


2
ment devices, i.e., RBGL = 0.90 for the glucose levels and grade B of
120
the BHS standard for blood pressures. An interesting property of
the system is that it is subject independent and does not require
110
calibration; therefore a, portable and easy-to-use device could be
100
fabricated that implements the system proposed in this paper [51].
Predicted DBP [mmHg]

Another property is that the response of the system was linear with
90 the values to be estimated, that is, there were no saturation levels,
as shown in Figs. 3–5. This means that the system is reliable even at
80 extreme degrees of hypo/hypertension or glucose levels. We tested
the system on a database with reference values that were not ideal;
70 nevertheless the results indicated that the system worked well for
the ranges of variables shown in Table 1. Also we have devised an
60
activity detection module, which filters out the initial transients,
and the drops in the signal level. Although the pulse oximeter nor-
50
mally produced an immediately good quality signal, but the initial
40
transient was not of a standard duration. There were also drops
50 60 70 80 90 100 110 120 130 in the signal level due to movements of the hand where the pulse
Reference DBP [mmHg] oximeter was attached. Although there was high variability in the
length of the recording to obtain a 1 min signal window, in most
Fig. 5. Scatter plot of diastolic blood pressures.
of the cases it was little more than a minute. The activity detection
is of importance in the continuous estimation, where the drops in
Table 6 signal energy can be rather frequent. In addition the computational
Comparison with the grading criteria of the BHS [48].
resources needed to implement the system proposed in this paper
≤5 mmHg ≤10 mmHg ≤15 mmHg are low, and the processing time for the 1 min window was less than
SBP Totals 256 354 373 10 s, using a standard PC and software programmed in Matlab. Note
DBP Totals 324 370 385 that the results might be biased by the fact that the database used
SBP (%) 62.43% 86.34% 90.97% for selecting the best algorithm, is also used for the validation of
DBP (%) 79.02% 90.24% 93.90%
the system.
Grade A BHS(%) 60% 85% 95%
Grade B BHS(%) 50% 75% 90%
The clinical use of the system we propose would be an easy
Grade C BHS(%) 40% 65% 85% and portable device for continuous measurement of blood pressure
and glucose level for 24 h, with the advantage of a device that does
not have to be calibrated between individuals. The system com-
Table 7
plies with Grade B of the British Hypertension Society’s protocol,
Random permutation of the references.
which is considered the minimum to pass the accepted criteria of
Measurement BGL SBP DBP the International Protocol or the AAMI Protocol [52]. Another clin-
RTrain
2
0.0174 3.7100e-004 0.0070 ical use would be ambulatory blood pressure monitoring, allowing
for the monitoring of the variability of the blood over a day either
at home or at the hospital. This clinical application is important for
percentages of readings, falling within 5, 10, and 15 mmHg of the essential and refractory hypertension, which requires strict control
standard sphygnmanometer measurement. Our predictions agreed of the treatment and medication.
with grade B of the BHS protocol. Note that the system did not As the device measures blood glucose levels, we evaluated the
meet class A criteria due to the presence of 17 outliers with an clinical performance by using the Clark error grid that quantifies the
error greater than 15 mmHg, but it did meet criteria for cumulative clinical accuracy of blood glucose estimates generated by meters as
percentage of readings less than 5 mmHg and 10 mmHg. compared to a reference value. More than 98% of the test points
were in Regions A and B of the Clark error grid, which are the
4.6. A test for overfitting estimates within a 20% margin from the reference value, whereas
only 1.96% (8) were in Region D, which are estimates with an error
It is known that machine learning techniques can overfit. We that fails to detect hypoglycemia or hyperglycemia. In addition, no
performed a test for overfitting as suggested by Smith [4], which points were in Regions C (false positives of hyperglycemia or hypo-
consisted of training with a random permutation of the reference glycemia) or E (wrongly classifies hyperglycemia as hypoglycemia
values. In a case of overfit the result should be an R2 close to one or vice versa). One clinical application could be glucose control with
in the training database. We obtained a random permutation of continuous recording of diabetes II patients. This system would be
the reference values from the entire database and trained the sys- of great usefulness in intensive care units because it allows a 24-h
tem with a training database that consisted of 369 individuals. The period of continuous monitoring of blood glucose levels and blood
results (presented in Table 7) show that the system was not able pressure, without interventions that may have a risk of infection.
to learn the function with the permuted references. These results As future work we propose to use as reference the glucose mea-
indicate that there is a functional dependence between the input surements from standard blood tests and to use a more accurate
and output, and that this system can capture this relationship. measurement of the blood pressure. In addition we propose to
recruit a larger population with a greater range of BGL, SBP and
5. Conclusions and future work DBP. Additionally, the system will be tested on severe cases of
hypotension, and if good results can be obtained it may be useful
This study shows that it is possible to simultaneously estimate for early detection of sepsis in intensive care units. Other improve-
the BGL, SBP and DBP levels from a PPG signal. In Section 2 we pre- ments include making the system robust to changes in the sensor
sented possible physiological explanations for why these variables by using cepstral subtraction [31] (see Section 3.1.5) or estimat-
might be estimated from a PPG signal. Our results show that the ing other physiological measurements related to the state of the
E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138 137

autonomic nervous system that are usually obtained from blood [10] Jeong I, Jun S, Um D, Oh K, Yoon H. Non-invasive estimation of systolic blood
analysis. Finally, a real-time version of the SIMPRESGLUC software pressure and diastolic blood pressure using photoplethysmograph compo-
nents. Yonsei Med J 2010;51(3):345–53.
could be devised that would output the SPB, DBP and BGL with a [11] Suzuki S, Oguri K. Cuffless and non-invasive systolic blood pressure estimation
delay of seconds, instead of the current version’s minute. for aged class by using a photoplethysmograph. In: Engineering in medicine
and biology society. 30th annual international conference of the IEEE. 2008. p.
1327–30.
Acknowledgments [12] ipod (r) digital oximeter. Last accessed May 2, 2011. URL
http://www.nonin.com/OEMSolutions/iPod.
[13] Accu-chek aviva. Last accessed May 2, 2011. URL https://www.accu-
The author would like to thank Jose B. Marin´o, Dr. Jose Anto- chek.com/us/glucose-meters/aviva.html.
nio Fiz, Dr. Pep Roca, Dr. Hu Yue Xhian and specially Pau Bofill for [14] Yildirim C, Mete S, Kamber D. Blood viscosity and blood pressure: role of tem-
their helpful comments that greatly improved the manuscript. The perature and hyperglycemia. Am J Hypertens 2001;14(5):433–8.
[15] Lanfranchi PA, Somers VK. Arterial baroreflex function and cardiovascular vari-
author also thanks the two anonymous referees for many helpful
ability: interactions and implications. Am J Physiol Regul Integr Comp Physiol
comments and suggestion that have greatly improved this paper. 2002;283(4):815–26.
This project was financed by TEC2009-14094-C04-01. [16] Cavalcanti S, Belardinelli E. Modeling of cardiovascular variability using
a differential delay equation. IEEE Trans Biomed Eng 1996;43(10):
982–9.
Appendix A. [17] Cerutti C, Barres C, Paultre C. Baroreflex modulation of blood pressure and heart
rate variabilities in rats: assessment by spectral analysis. Am J Physiol Heart Circ
Physiol 1994;266(5):H1993–2000.
Accuracy of the pulseoximeter. [18] Fazan R, de Oliveira M, Dias da Silva VJ, Joaquim LF, Montano N, Porta A,
SpO2 (70-100%) (1 SD) et al. Frequency-dependent baroreflex modulation of blood pressure and
No motion Adults ±2 digits heart rate variability in conscious mice. Am J Physiol Heart Circ Physiol
Motion Adults ±3 digits 2005;289(5):H1968–75.
Low perfusion Adults ±3 digits [19] deBoer RW, Karemaker JM, Strackee J. Hemodynamic fluctuations and barore-
Heart Rate flex sensitivity in humans: a beat-to-beat model. Am J Physiol Heart Circ Physiol
No motion (18-300 BPM) Adults ±3 digits 1987;253(3):680–9.
Motion (40-240 BPM) Adults ±5 digits [20] Keener JP, Sneyd J. Mathematical physiology. Interdisciplinary applied mathe-
Low perfusion (40–240 BPM) Adults ±3 digits matics, vol. 8. New York: Springer Verlag; 1998.
[21] Di Rienzo M, Parati G, Radaelli A, Castiglioni P. Baroreflex contribution to
blood pressure and heart rate oscillations: time scales, time-variant char-
Appendix B. acteristics and nonlinearities. Philos Trans R Soc A: Math Phys Eng Sci
2009;367(1892):1301–18.
[22] Cornier M-A, Dabelea D, Hernandez TL, Lindstrom RC, Steig AJ, Stob NR, et al.
Specifications of the band-pass filter used for computing the The metabolic syndrome. Endocr Rev 2008;29(7):777–822.
heart rate statistics were as follows: [23] Ducher M, Cerutti C, Gustin MP, Abou-Amara S, Thivolet C, Laville M, et al. Non-
invasive exploration of cardiac autonomic neuropathy. Four reliable methods
for diabetes? Diabetes Care 1999;22(3):388–93.
• Chebyshev Type I filter of order 6. [24] van Ravenswaaij-Arts CMA, Kollee LAA, Hopman JCW, Stoelinga GBA, van HP,
• Lower stop frequency 0.01 Hz at −10 dB. Geijn. Heart rate variability. Ann Intern Med 1993;118(6):436–47.
[25] Sapolsky RM, Romero LM, Munck AU. How do glucocorticoids influence stress
• Lower pass frequency 0.1 Hz. responses? Integrating permissive, suppressive, stimulatory, and preparative
• Upper pass frequency 0.2 Hz. actions. Endocr Rev 2000;21(1):55–89.
• Upper stop frequency 0.4 Hz at −30 dB. [26] McCraty R, Atkinson M, Tiller WA, Rein G, Watkins AD. The effects of emotions
on short-term power spectrum analysis of heart rate variability. Am J Cardiol
1995;76(14):1089–93.
The lower stop frequency was selected to eliminate the DC com- [27] Grossman E, Grossman A, Schein M, Zimlichman R, Gavish B. Breathing-control
lowers blood pressure. J Hum Hypertens 2001;15(5):263–9.
ponent and low frequency components, such as the respiratory rate.
[28] Elliot W, Izzo J, White W, Rosing D, Snyder C, Alter A, et al. Graded blood pressure
The band pass was selected to capture the margin of the heart rate„ reduction in hypertensive outpatients associated with use of a device to assist
and the upper stop frequency was selected in order to attenuate with slow breathing. J Clin Hypertens 2004;10(6):553–9.
off band components. The attenuations were selected as a trade off [29] Schein M, Alter A, Levine S, Baevsky T, Nessing A, Gavish B. High blood
pressure reduction in diabetics with interactive device-guided paced breath-
between the need for eliminating interferences and the need for ing: final results of a randomized controlled study. J Hypertens 2007;25(2):
having a low-order filter. 292.
[30] Leonard P, Beattie TF, Addison PS, Watson JN. Standard pulse oximeters can be
used to monitor respiratory rate. Emerg Med J 2003;20(6):524–5.
References [31] Rabiner LR, Juang B-H. Fundamentals of speech recognition. Englewood Cliffs,
NJ, USA: Prentice Hall; 1993.
[1] Wild S, Roglic G, Green A, Sicree R, King H. Global prevalence of diabetes. [32] Dunn RB, Quatieri TF, Kaiser JF. Detection of transient signals using the energy
Diabetes Care 2004;27(5):1047–53. operator. In: IEEE International conference on Acoustics, Speech, and Signal
[2] Wolf-Maier K, Cooper RS, Banegas JR, Giampaoli S, Hense H-W. Hypertension Processing (ICASSP93). 1993. p. 145–8.
prevalence and blood pressure levels in 6 European Countries, Canada, and the [33] Kaiser JF. Some useful properties of teager’s energy operators. In: IEEE interna-
United States. JAMA 2003;289(18):2363–9. tional conference on acoustics, speech, and signal processing (ICASSP93). 1993.
[3] Diabetes statistics. American Diabetes Asociation. Last accessed May 2, 2011. p. 149–52.
URL http://www.diabetes.org/diabetes-basics/diabetes-statistics. [34] Ying G, Mitchell CD, Jamieson LH. Endpoint detection of isolated utterances
[4] Smith JL. The pursuit of noninvasive glucose: hunting the deceit- based on a modified teager energy measurement. In: IEEE international con-
ful turkey. Last accessed May 2, 2011. URL http://www.mendosa.com/ ference on acoustics, speech, and signal processing (ICASSP93). 1993. p. 732–5.
noninvasive glucose.pdf. [35] Li Q, Zheng J, Tsai A, Zhou Q. Robust endpoint detection and energy normal-
[5] Haaland M, Ries M, Koepp VTW, Eaton R. Reagentless near-infrared determi- ization for real-time speech and speaker recognition. IEEE Trans Speech Audio
nation of glucose in whole blood using multivariate calibration. Appl Spectrosc Process 2002;10(3):146–57.
1992;46(10):1575–8. [36] Duda RO, Hart PE, Stork DG. Pattern classification. New York: Wiley-
[6] Cattivelli FS, Garudadri H. Noninvasive cuffless estimation of blood pressure Interscience Publication; 2000.
from pulse arrival time and heart rate with adaptive calibration. In: Interna- [37] Renevey P, Drygajlo A. Entropy based voice activity detection in very noisy
tional workshop on wearable and implantable body sensor networks. 2009. p. conditions. In: Dalsgaard P, Lindberg B, Benner H, Tan Z-H, editors. In:
114–9. EUROSPEECH-2001. Aalborg, Denmark; 2001:1887–1890.
[7] Chen W, Kobayashi T, Ichikawa S, Takeuchi Y, Togawa T. Continuous estima- [38] Shen J-L, Hung J-W, Lee L-S. Robust entropy-based endpoint detection for
tion of systolic blood pressure using the pulse arrival time and intermittent speech recognition in noisy environments. In: Proc. ICSLP98. Australasian
calibration. Med Biol Eng Comput 2000;5:569–74. Speech Science and Technology Association; 1998. p. 1015–8.
[8] Allen J. Photoplethysmography and its application in clinical physiological mea- [39] Leonard P, Douglas J, Grubb N, Clifton D, Addison P, Watson J. A fully automated
surement. Physiol Meas 2007;28:R1–39. algorithm for the determination of respiratory rate from the photoplethysmo-
[9] Nitzan M, Patron A, Glik Z, Weiss A. Automatic noninvasive measurement gram. J Clin Monit Comput 2006;20(February 4):33–6.
of systolic blood pressure using photoplethysmography. BioMed Eng OnLine [40] Breiman L. Randomizing outputs to increase prediction accuracy. Machine
2009;8(1):28. Learn 2000;40(3):229–42.
138 E. Monte-Moreno / Artificial Intelligence in Medicine 53 (2011) 127–138

[41] Hastie T, Tibshirani R, Friedman J. The elements of statistical learning springer [47] Breiman L. Random forests. Machine Learn 2001;45(1):5–32.
series in statistics. New York: Springer; 2001. [48] O’Brien E, Waeber B, Parati G, Staessen J, Myers MG. Blood pressure measuring
[42] Bishop C. Neural networks for pattern recognition. Oxford: Clarendon Press; devices: recommendations of the European Society of Hypertension. Br Med J
1995. 2001;322(7285):531–6.
[43] Abu-Mostafa YS. Learning from hints in neural networks. J Complex [49] Clarke WL, Cox D, Gonder-Frederick LA, Carter W, Pohl SL. Evaluating clini-
1990;6:192–8. cal accuracy of systems for self-monitoring of blood glucose. Diabetes Care
[44] Cristianini N, Shawe-Taylor J. An introduction to support vector machines and 1987;10(5):622–8.
other kernel-based learning methods. Cambridge, U.K: Cambridge University [50] Codina EG, Clarke error grid analysis; 2008. matlabcentral fileex-
Press; 2000. change. Last accessed May 2, 2011. URL http://www.mathworks.
[45] Canu S, Grandvalet Y, Guigue V, Rakotomamonjy A, SVM and kernel meth- com/matlabcentral/fileexchange/20545-clarke-error-grid-analysis.
ods matlab toolbox. Perception Systmes et Information. INSA de Rouen. [51] Monte E. Descripción de un sistema para estimar el nivel de glucosa y de presión
Rouen, France; 2005. Last accessed May 2, 2011. URL http://asi.insa- sanguínea a partir de las medidas de un pulsioximetro e información auxil-
rouen.fr/enseignants/arakotom/toolbox/index.html. iar. Internal Report 252 Mon. Universitat Politècnica de Catalunya. Barcelona,
[46] Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Spain; 2008 (June).
regression trees. Belmont, CA, USA: Wadsworth and Brooks; [52] Validated blood pressure monitors. Last accessed May 2, 2011. URL
1984. http://www.bhsoc.org/blood pressure list.stm.

You might also like