Singers Voice Range Profile
a
Kungliga Tekniska Högskolan, School of Computer Science and Communication,
Department of Speech, Music and Hearing
b
Royal Conservatory, The Hague / University Utrecht / Voice Quality Systems
Abstract
This work concerns the collection of 30 Voice Range Profiles (VRPs) of female
operatic voice. Objectives: We address the questions: Is there a need for a singer’s
protocol in VRP acquisition? Are physiological measurements sufficient or should the
measurement of performance capabilities also be included? Can we address the female
singing voice in general or is there a case for categorizing voices when studying
phonetographic data? Method: Subjects performed a series of structured tasks
involving both standard speech voice protocols and additional singing tasks. Singers
also completed an extensive questionnaire. Results: Physiological VRPs differ from
performance VRPs. Two new VRP metrics: the voice area above a defined level
threshold, and the dynamic range independent from F0, were found to be useful in the
analysis of singer VRPs. Task design had no effect on performance VRP outcomes.
Voice category differences were mainly attributable to phonation frequency based
information. Conclusion: Results support the clinical importance of addressing the
vocal instrument as it is used in performance. Equally important is the elaboration of a
protocol suitable for the singing voice. The given context and instructions can be more
important than task design for performance VRPs. Yet, for physiological VRP
recordings, task design remains critical. Both types of VRPs are suggested for a singer’s
voice evaluation.
Introduction
The Voice Range Profile (VRP), or phonetogram, is an increasingly popular clinical
tool that produces a two-dimensional image of the range of a voice in frequency and in
amplitude. The appeal of such a tool lies in its capacity to depict subtleties of voice
function and provide both quantitative and qualitative data. Sulter, in a study on
differences in phonetogram features between male and female subjects with and
without vocal training, commented on the scarcity of reliable VRP data studies [1].
Many more VRP data have since been collected [2-10] but only a handful of studies
have focused on VRP recordings of the singing voice [11-14]. These studies are often
based on subject groups that consist mostly of students in training populations,
amateurs, or a mix of choristers and soloists.
The VRP is known to be sensitive to gender and age, as well as to vowels and other
individual characteristics [1, 4, 6, 15-17]. It would follow that the VRP could also be
dependent on training and/or profession [1]. In the case of the singer, the VRP could
ideally be sensitive enough to distinguish subtleties of the professional singer’s voice.
Although a few university music programs in Europe have performed systematic VRP
recordings of their students, few detailed analyses of singer VRPs have been published.
Most VRP studies seem to focus on groups of speakers, and use the singer or trained
group as a comparison point. The VRP seems to hold great potential for describing the
singing voice, but in order for the VRP to become more clinically relevant, a frame of
reference is needed to account for singer-specific issues, the possible impact of task
design, and the possible need for additional or alternative VRP-derived singer specific
metrics. This study’s aim was to investigate whether VRP recording practice needs to
be modified in order to be relevant to the singing voice.
Three main research questions were formulated.
Question 1. Is there a need to subclassify voices by singer category in a
subject/patient VRP group?
Question 2. What tasks should be included in the protocol when the subject or
patient is a singer? More specifically, should the tasks be musically designed to be as
representative as possible of singing or singing exercises?
Question 3. Are there significant differences between the physiological VRP (i.e.,
the standard VRP) and the performance VRP (a VRP entailing singing voice quality
with dynamics appropriate for the stage)? If so, where do these differences
lie?
Method
Data Acquisition
The method for data acquisition was the same as in an earlier study [18]. For the
reader’s convenience, it is briefly restated here. Recordings were performed with a
computerized, 16 bit linear acquisition, phonetograph (Phog, version 2.00.10, Saven
Hitech AB, Sweden). This system accumulates phonation time in 2-D bins, or cells, 1
semitone (ST) wide and 1 dB high. Cells are plotted according to the UEP standard 2/1
(dB/ST) aspect ratio.
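The cell accumulation described above can be illustrated with a minimal sketch. This is not the Phog implementation: the A4 = 440 Hz reference, the 10 ms frame duration, the rounding rule and the function names are all assumptions made for illustration.

```python
import math
from collections import defaultdict

A4_HZ = 440.0    # assumed tuning reference
A4_MIDI = 69     # MIDI note number of A4


def f0_to_semitone(f0_hz):
    """Return the nearest semitone (as a MIDI note number) for an F0 in Hz."""
    return round(A4_MIDI + 12 * math.log2(f0_hz / A4_HZ))


def accumulate(frames, frame_s=0.01):
    """Sum phonation time per cell, 1 semitone wide and 1 dB high.

    frames:  iterable of (f0_hz, spl_db) pairs for voiced frames only.
    frame_s: assumed duration of one analysis frame, in seconds.
    """
    cells = defaultdict(float)
    for f0_hz, spl_db in frames:
        cell = (f0_to_semitone(f0_hz), math.floor(spl_db))
        cells[cell] += frame_s
    return dict(cells)
```

For example, two frames at 440 Hz with levels of 80.4 and 80.9 dB fall into the same cell (semitone 69, 80 dB) and add 20 ms of phonation time to it.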
Since Phog is based on a peak-picking F0 extraction, inevitably there was some
degree of F0 tracking latching onto higher harmonics. The recorded material was
The Singer’s VRP 3
inspected manually and the few instances of mistracking were removed. The recordings
took place in a sound-treated and isolated recording studio (volume 45 m3, ceiling
height 3 m, reverberation time, T30= 0.1 s, reverberation radius >1.2 m across the
spectrum, and 0.5 m deep absorbents). Singers were asked to adopt a singing stance.
Head and body movements were restricted as much as possible without impeding the
freedom of the artist. The microphone-to-mouth distance (30 cm) was measured at the
beginning of each task.
A condenser microphone (Brüel & Kjaer, model 4003, Denmark) was used with a
pre-amplifier (Brüel & Kjaer, model 2812) and a line amplifier (Nyvalla-DSP Audio
Interface Box). Singers were given a single piece earphone (Bassonic-Champion 4939,
USA) to hear prompting tones during one of the tasks. For details concerning the
voicing detection thresholds, the reader is referred to Lamarche et al.[18].
Subjects
Group criteria for this study were strict. The group included three voice categories: 6
contraltos, 8 mezzo-sopranos and 16 sopranos. Inclusion criteria were: female opera
soloist, non-smoking, more than 4 years of training, no ear-nose-throat medical history,
no respiratory problems and no current voice complaints. No laryngoscopic examinations
were performed. At the time of the recordings, all subjects were actively
performing on classical/opera stages.
Thirty female opera singers with a mean age of 33.7 ± 8.8 years were recorded. The
project was ethically vetted by the Regional etikprövningsnämnden i Stockholm
(certificate 1358-31). Subjects were remunerated for their participation. Subjects had on
average a training experience of 13.4 ± 5.9 years. Table 1 lists information and
taxonomy pertinent to the subject group.
For the VRPphys, the objective was the recording of minimum and maximum
productions regardless of phonation type or laryngeal mechanism while for the VRPperf,
we wanted to capture the voice as it is used on stage. All five tasks were recorded in
one session. The subjects could communicate with the investigator by intercom and
visual contact through a window was possible. They could however not see the
phonetogram display to avoid interference with a parallel task studied in Lamarche et
al.[18].
Task 1a: A thematic spontaneous speech task was performed. Subjects were asked to
give a 1-minute description of their warm-up routine.
Task 1b: A counting exercise in which the subject used a soft (but not whispered), a regular
and a loud public speaking voice. A separate Speech Range Profile (SRP) was saved for each task. Subjects
spoke in their native tongue (Swedish, French or German). Henceforth, these profiles will
be referred to as SRPs (1a and 1b).
Task 2: The VRPphys. The aim was to register explicitly the subject’s vocal extremes in
pitch and in level. This was done with a descending glissando (a slow frequency sweep)
and ascending glissando exercise on the vowel [a]. The glissandi were repeated and
modified to acquire the best possible achievement (as deemed by the subject and the
investigator).
For the VRPperf, singers were instructed to sing as they deemed musically acceptable
for the stage. Singing voice quality and vibrato were obligatory and the aim was to
adhere to one’s stage singing ideals at all times, both in pitch and in vocal dynamics.
At the start of each VRPperf task, subjects were asked to sing a messa di voce on a
comfortable tone in order to exercise and explore their full performance-mode dynamic
range.
Task 3: A first VRPperf was recorded with prompted frequencies equivalent to the
musical notes C-E-G-A in several octaves across the singer’s range. Prompted tones
were augmented by semitones at the extremes [28]. Tones were sung on the vowel [a]
in a messa di voce exercise (sustained pitches performed with increasing and decreasing
vocal dynamic).
Task 5: For the third VRPperf, subjects performed their best audition aria with lyrics.
This task served to obtain a minimum of 1 minute of the voice in its most representative
context. This was the only sung task that involved several different vowels. In a
previous study [20], the authors concluded that vowel variation in the high female opera
singing voice VRP was negligible due to formant tuning.
Metrics of Importance
The metrics considered to be of interest for VRP analysis of the singing voice
are enumerated below.
Minimum and Maximum Frequency (fmin/fmax). These values denote the minimum
and maximum values of F0 occurring in a given VRP.
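As a small worked illustration, the F0 range between fmin and fmax can be expressed in octaves or semitones as sketched below. The function name and the example frequencies are hypothetical and are not taken from the recorded data.

```python
import math


def f0_range(fmin_hz, fmax_hz):
    """Return the F0 range between two frequencies as (octaves, semitones)."""
    octaves = math.log2(fmax_hz / fmin_hz)
    return octaves, 12.0 * octaves


# Hypothetical voice spanning 110 Hz to 880 Hz: three octaves, 36 semitones.
octaves, semitones = f0_range(110.0, 880.0)
```

Note that a group mean of individual ranges need not equal the range computed from the group means of fmin and fmax.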
Minimum SPL (SPLmin): Minimum SPL values in the VRPperf can be expected to
be much higher than those expected for SRPs and for the VRPphys. Schultz-Coulon
estimated up to a 10-20 dB difference between a singer’s pianissimo and a speaker’s
soft tone [24]. The main reason is simply that on stage even the quiet tones must be
heard at the back of the hall, where phonation at the physiological threshold would be
inaudible. Another reason is that control of the tone is poor at the threshold.
Maximum SPL (SPLmax): According to previous reports, this metric would also
be expected to vary with the type of VRP recording. However, the direction of this
variation remains unclear. Certain studies claim that physiological VRPs show higher
maximal intensities. Singers might however be inhibited in a laboratory setting, but
more easily draw on their full resources when given the proper context.
SPL Range (SPLrge): Western opera and lyrical vocal music require a substantial
dynamic range. We recall here that SPL covaries strongly with F0 [15,20,25]. It is
acoustically inevitable that low SPL values will be difficult or impossible to produce at
high frequencies, and vice versa for the lower range. Hence a large F0 range will tend to
be associated with a large range in SPL. Therefore the overall SPL range does not
directly reflect the singer’s ability to modify her output power.
Average SPL Extent (SPLext): For a given F0, we define the SPLext as the level
difference between the upper and lower bounds of the contour; in other words as the
height of the phonation area at any given F0. This extent is then averaged from lowest
to highest F0, giving a metric for how much the singer can modify SPL at constant F0.
In this way the dependency of SPL on F0 is compensated for. Since the voices studied
here are trained in maintaining dynamic stability across the frequency range of the
voice, we can expect the SPL extent of a singer to be larger and more consistent than
that found for untrained voices.
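A minimal sketch of the SPLext computation follows, assuming the VRP is available as a collection of (semitone, dB) cells that contain phonation. Taking the contour height as the difference between the highest and lowest occupied dB cell in each semitone column is an assumption, as is the function name.

```python
def spl_extent(cells):
    """Average height of the phonation area across all occupied semitones.

    cells: iterable of (semitone, db) pairs marking cells with phonation.
    """
    bounds = {}
    for semitone, db in cells:
        lo, hi = bounds.get(semitone, (db, db))
        bounds[semitone] = (min(lo, db), max(hi, db))
    heights = [hi - lo for lo, hi in bounds.values()]
    return sum(heights) / len(heights)


# Two semitone columns with heights of 20 dB and 10 dB give an SPLext of 15 dB.
example = [(60, 70), (60, 90), (61, 75), (61, 85)]
extent = spl_extent(example)
```

In this toy example the overall SPL range is 20 dB (70 to 90 dB) while the average extent is 15 dB, which shows how the two metrics can diverge.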
Area above 90 dB: Singers need to be heard when they stand on a stage and are
accompanied. Indeed, classical singing technique develops the ability to produce loud
sounds and also to maintain higher energy in the 2.5-3 kHz region of the spectrum (the
singer’s formant cluster, or spectrum resonance peak). Without amplification, a certain
minimum power is needed to make oneself heard in a given performance situation.
Although the voice spectrum would also be relevant, it is plausible that a rough
criterion for a useable stage voice could be the VRP area above some minimum SPL
(corresponding to a minimum singer power). The question is then how to select a
suitable threshold level. In an earlier unpublished study, data were collected that could
be applied for this purpose. Three sopranos and two mezzo-sopranos were asked to phonate on
a series of different pitches on a /papapa/ exercise. They phonated in piano, mezzo forte
and forte. The SPL range obtained for these five singers, measured at 30 cm from the
mouth and for the MIDI pitches 60, 65, 69, 74, 79 (C4-G5), was 66-112 dB. The mean
SPL for a piano across all singers was 83 dB. The level increment was 6.5 dB between
piano and mezzoforte and 3.6 dB from mezzoforte to forte. A mezzoforte was equivalent
to roughly 90 dB. This agrees well with data from Nawka [26]. The exact value of the
chosen threshold level is not critical, as it is unlikely to have a large effect on the
conclusions arising from comparing VRPs; but in order to be normative, the choice
must be well informed.
For analysis purposes, this area will be related to the total area and a percentage of
vocal presence in the 90 or more dB area will be reported (Percent≥90dB).
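The percentage could be computed along the following lines. This sketch assumes that area is counted in occupied cells (each 1 semitone by 1 dB) and that cells at exactly 90 dB count as above threshold; the function name is hypothetical.

```python
def percent_at_or_above(cells, threshold_db=90):
    """Share of the occupied VRP area at or above threshold_db, in percent.

    cells: iterable of (semitone, db) pairs marking cells with phonation.
    """
    occupied = set(cells)
    if not occupied:
        return 0.0
    loud = sum(1 for _, db in occupied if db >= threshold_db)
    return 100.0 * loud / len(occupied)


# Four occupied cells, two of them at 90 dB or more: 50 percent.
share = percent_at_or_above([(60, 88), (60, 89), (60, 90), (60, 91)])
```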
A. Lamarche et al 8
VRP slope: Slope metrics can be defined in many ways and are not readily
compared from one study to another. Not only do slopes depend on many factors such
as mouth radiation, voice source parameters (mean flow declination rate, pulse rate) and
possibly acoustic strategies (F0-F1 tuning) [25], but they are also very dependent on the
actual VRP shape. Some earlier studies have reported slope values for partial contour
segments [8, 21]; however, such slope values would reflect the total effect of several
underlying mechanisms that would need to be accounted for separately. In producing
group data, many different shapes are averaged to give a group contour, and so a slope
value in this instance becomes less informative. Furthermore, VRP shapes tend to be
rounded and make it difficult to systematically define a tangent. It is also debatable
what the slope value actually represents, when the phonatory modes are not accounted
for separately. For these reasons, slopes will not be reported in this paper.
The SRP recordings (1a and 1b) were analysed with the SRP metrics: minimum,
maximum, range and average in frequency and in SPL. The total area of phonation was
also reported.
Analysis
The normality of the distribution was assessed by examining closely the kurtosis and
skewness levels. Comparative statistical tests were selected to assess SRP and VRP data.
The probability alpha was set to 0.01. A general linear multivariate analysis was
performed for the dependent variables: Rge, fmin, fmax, SPLrge, SPLmin, SPLmax, SPLext,
Area, Percent≥90dB. Fixed factors were Task (4 levels; the continuous
speech tasks were excluded here) and Voice Category (3 levels). In the event that the F test resulted in
significant differences, the Ryan-Einot-Gabriel-Welsch Range test was conducted to
assess the difference among the factors and dependents. The non-parametric Wilcoxon
Signed Rank test for paired samples was performed for SRP data. All analysis was
performed with SPSS 15.0 for Windows, SPSS Inc.
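For readers who wish to reproduce the normality screening outside SPSS, skewness and kurtosis can be sketched from the central moments as below. This uses the plain moment-based definitions; statistical packages often report bias-corrected variants, so values may differ slightly. The function name is hypothetical.

```python
def shape_statistics(values):
    """Return (skewness, excess kurtosis) from the central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# A symmetric sample has zero skewness.
skew, kurt = shape_statistics([1, 2, 3, 4, 5])
```

Both values are close to zero for normally distributed data.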
The Fourier transform (FT) is often used in image processing to detect and assess
shapes. A novel Fourier Descriptor (FD) approach to contour averaging was used here
to compare and depict the collected data. The Fourier descriptor method has several
useful features, including the ability to deal with translation, scale changes and even
rotation. A contour spectrum is calculated, filtered and inverse transformed to yield a
smooth curve that connects each point of the VRP contour. New data points can then be
interpolated over this contour.
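The general idea of Fourier descriptor smoothing can be sketched as follows. This is a generic textbook-style illustration and not the authors' method, whose details are given in the separate methodological paper. It assumes a closed contour given as points in drawing order (x in semitones, y in dB); the function names and the `keep` parameter are hypothetical.

```python
import cmath


def dft(z):
    """Discrete Fourier transform of a list of complex numbers."""
    n = len(z)
    return [sum(z[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft(c):
    """Inverse discrete Fourier transform."""
    n = len(c)
    return [sum(c[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]


def smooth_contour(points, keep=3):
    """Low-pass filter a closed contour given as (x, y) points."""
    z = [complex(x, y) for x, y in points]
    n = len(z)
    coeffs = dft(z)
    # Keep the mean (m = 0) and the lowest 'keep' positive and negative harmonics.
    filtered = [c if (m <= keep or m >= n - keep) else 0
                for m, c in enumerate(coeffs)]
    return [(w.real, w.imag) for w in idft(filtered)]
```

With `keep=0` only the mean term survives, so every output point collapses onto the centroid of the contour; larger values of `keep` retain progressively more shape detail.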
This technique allows for the creation of average contours regardless of their original
sampling (F0/SPL range or area size), and can also depict the co-variation across the
averaged contours (see figure 1). This enables the comparison of multi-source data in
one graph. A methodological paper concerning the detailed description of the FD
technique is currently in review (Pabon, Lamarche & Ternström, in review).
Results
Questionnaire results are tabulated in Table 2. This group of subjects was overall
healthy with moderate physical training habits, healthy weight and very low intake of
medicine. Vocal habits were rated “moderate,” yet extensive voice use and training
experience were noted.
Descriptive statistics for the VRP metrics are reported in a series of tables. Table 3
gives the group means and standard deviations for SRP metrics. Table 4 reports the
statistics per voice category for the sung tasks. The VRPphys was only introduced later in
the experiment, and so the number of subjects for which the VRPphys is available is
smaller (Sopranos=8, Mezzosopranos=2, Contraltos=6).
SRP metrics did not vary substantially from task 1a to 1b, and standard deviations
(SD) were quite small, indicating good agreement within the group. For the other tasks,
differences were more noticeable from one task to another. From the physiological to
the performance VRP, the frequency range was reduced from 3.3 to 2.8 octaves (38.6 to
33.3 semitones). Naturally the Aria performance VRP has a much more reduced range
(constrained by the composition chosen by the singer). In fact, the results in all metrics
but one were constrained when moving from the physiological task to the aria. The
exception was the percentage of the voice use at 90 dB and above, which increased
(from 30% in the physiological profile to 51% in the aria).
Averaged VRPs depict the results for each task while differentiating the voice
categories. Figure 1 illustrates the contour averages and covariation for the counting
task (1b). Clearly, mezzosopranos and contraltos, even in speech, exercise their low
range more than the sopranos. In figure 2 the averages and covariation are displayed for
the speaking task (1a), for which the same observation can be made.
Significant differences between the speaking (1a) and counting (1b) tasks were
observed for speaking fundamental frequency (SFF), SPLmin, SPLrge and Area. Table 5 gives the test results. These results
can also be assessed in Figure 3 where the SRP (1a) for the complete group (N=30) is
superimposed onto the SRP (1b). Figure 4 a) displays the SRP(1b) within the
physiological contour of the group. The speech area covers roughly 37% of the
physiological VRP area. Figure 4 b) shows the corresponding comparison for the
performance VRP. Figures 5 and 6 (a-c) illustrate the results for the contour averaging
of the physiological and performance tasks.
Table 6 (a-b) is an adapted SPSS table of the multivariate analysis results for the
sung tasks. The fixed factors Task and Voice Category both had a significant effect on
VRP metrics. There was no interaction between the factors. In Table 6 a) results for
Pillai’s Trace are reported. With the exception of SPLmax, all metrics varied
significantly with the Task (Table 6 b). Conversely, Voice Category seems to have had
a limited effect, with significant levels of difference obtained for the fmin/fmax and range
metrics only.
                                   Soprano     Mezzosoprano  Contralto   Total
Dispersion                         16          8             6           30
Age group                          20-46       25-55         33-48       20-55
Age mean                           30.9        36.8          36.3        33.7
Age Stdev                          8.0         11.3          6.7         8.8
Voice Training/yr mean             12.1        15.6          14.8        13.4
Voice Training/week
  A-Daily or more                  10/16       5/8           5/6         20/30
  B-4X to 6X                       6/10        3/8                       9/30
  C-Less than 4X                                             1/6         1/30
Training mean length               1:20 hr     1:05 hr       1:15 hr     1:12 hr
Use of spoken voice                moderate    moderate      moderate    moderate
Use of singing voice               mod-great   moderate      moderate    moderate
Tobacco intake                     1           0             0           1
Body Mass Index
  A-healthy                        12/16       5/8           5/6         22/30
  B-overweight                     3/16        2/8           1/6         6/30
  C-obese                          1/16        1/8           0           2/30
Physical Training/week             2X          2X            4X          3X
Medicine intake
  A-prescribed
  B-contraceptives/hormones        2           2             3           7
  C-over the counter/homeopathic   3           2                         5
  D-allergies/asthma               2           2             1           5
  E-none                           9           2             2           13
Table 2. Physical and vocal health questionnaire results for a group of 30 singers.
Frequency of training is denoted with ‘X’. (The one case of tobacco intake reported here
is not associated with smoking but rather with “snuff”.)
Figure 1 Average SRP contours for the counting speech task 1b (N=30; sopranos in black (16),
mezzosopranos in red (8) and contraltos in blue (6)). The insets show the two-dimensional
standard deviation as ellipses, whose orientation also suggests the local covariation of F0
with SPL.
Figure 2 Average SRP contours for the spontaneous speech task 1a (N=30; sopranos in
black (16), mezzosopranos in red (8) and contraltos in blue (6)). Insets show the standard
deviations as for Figure 1.
Task Category
Soprano Mezzosoprano Contralto Total
Mean SD Mean SD Mean Mean SD
fmax Fysio 1315.23 223.86 1176.66 96.01 1186.53 1226.14 80.42
Pitch 1295.13 115.89 1147.25 181.50 982.23 1141.54 39.70
Vocalise 1173.02 79.70 1061.70 195.46 988.00 1074.24 58.66
Excerpt 970.61 128.42 942.07 150.36 800.97 904.55 27.67
fmin Fysio 151.53 11.71 113.66 13.89 128.59 131.26 2.47
Pitch 174.46 19.71 131.71 16.74 160.37 155.51 2.73
Vocalise 169.72 24.23 144.93 35.39 158.20 157.61 7.46
Excerpt 258.22 36.90 241.41 41.46 215.05 238.23 7.61
Rge Fysio 3.14 0.27 3.45 0.07 3.20 3.26 0.16
Pitch 2.92 0.20 3.14 0.25 2.64 2.90 0.15
Vocalise 2.83 0.25 2.91 0.42 2.67 2.80 0.08
Excerpt 1.93 0.18 1.99 0.26 1.90 1.94 0.05
SPLmax Fysio 112.75 8.00 115.50 2.12 113.17 113.81 3.24
Pitch 114.00 3.58 112.38 6.09 114.60 113.66 1.27
Vocalise 113.50 2.92 114.00 5.61 108.83 112.11 1.75
Excerpt 111.50 4.27 110.50 5.98 108.67 110.22 0.90
SPLmin Fysio 56.38 3.46 55.50 0.71 58.00 56.63 1.38
Pitch 63.75 3.75 64.13 5.03 60.00 62.63 0.64
Vocalise 66.38 4.49 65.00 4.11 65.83 65.74 0.22
Excerpt 71.44 6.25 74.00 3.63 67.50 70.98 1.47
SPLrge Fysio 56.38 6.82 60.00 1.41 55.17 57.18 3.96
Pitch 50.25 3.47 48.25 8.99 54.60 51.03 2.80
Vocalise 47.13 3.83 49.00 8.75 43.00 46.38 2.99
Excerpt 40.06 5.28 36.50 5.15 41.17 39.24 0.17
SPLext Fysio 26.86 4.79 29.07 4.41 27.48 27.80 1.02
Pitch 17.34 2.63 15.94 3.83 19.64 17.64 0.60
Vocalise 16.16 4.00 18.24 6.11 13.69 16.03 1.90
Excerpt 14.24 2.75 12.79 3.58 13.86 13.63 0.70
Area Fysio 762.75 275.58 725.00 4.24 734.50 740.75 140.66
Pitch 528.06 104.94 534.63 137.60 522.00 528.23 17.33
Vocalise 541.94 144.60 642.50 215.72 478.33 554.26 57.34
Excerpt 338.44 78.20 312.50 76.90 324.17 325.03 1.39
Percent≥90dB Fysio 30.66 9.57 28.13 1.20 30.87 29.89 4.39
Pitch 46.59 7.32 41.50 8.99 39.66 42.58 1.07
Vocalise 48.74 13.32 41.43 8.66 37.95 42.70 3.51
Excerpt 50.38 15.25 53.99 12.95 46.50 50.29 1.17
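Table 4 Means for VRP metrics per Voice category and per sung Task. The standard deviation is
referred to as SD. Columns, in order: Soprano mean and SD, Mezzosoprano mean and SD,
Contralto mean, Total mean and SD. Task labels: Fysio = physiological VRP (Task 2), Pitch =
discrete pitch task (Task 3), Vocalise = vocalise task (Task 4), Excerpt = aria excerpt
(Task 5). Range (Rge) is indicated in octaves.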
Figure 6 b) Average performance VRP contours for the vocalise task (Task 4) and c) Average
performance VRP contours for the aria excerpt (Task 5). Sopranos are in black (16), mezzosopranos in
blue (8) and contraltos are in grey (6).
Table 6a) Multivariate test results. A significant main effect of both factors, Task and Category, is observed
but no interaction between the two is observable. Significance is determined with p < 0.01. Degree of freedom
is represented by df and significance by Sig.
REGWR Test
Category        fmax subset 1   fmax subset 2   fmin subset 1   fmin subset 2   Rge subset 1
Contralto       989.75                          165.78                          2.60
Mezzosoprano    1060.06         1060.06         168.14                          2.74
Soprano                         1170.39                         193.76          2.64
Table 7a) Table of the R-E-G-W-R multiple comparisons test for Voice category. Means that appear
in the same homogeneous subset are not significantly different from each other (p<0.01).
Effect of Task
Post hoc comparisons revealed no significant differences between the discrete pitch
task (Task 3) and the vocalise exercise (Task 4). These observations are corroborated in
Table 7b). As expected, the sung aria was significantly different from all other tasks,
except in SPLext where it could not be differentiated from the vocalise task, and in
Percent≥90dB where the performance tasks were not distinct from one another. Figure 7
illustrates the contour averages for the three performance tasks. Both the discrete pitch
and vocalise tasks yielded rather similar vocal outputs. The differences that could be
noted were related mostly to lower VRP contour details.
In the frequency metrics, the VRPphys did not differ significantly from the VRPperf.
Rather, a marked distinction between the two types of VRP was associated with intensity
metrics. In the SPLmin, SPLext and Area metrics, there was a clear distinction between
the VRPphys and the VRPperf. No significant differences were found between the
VRPphys and Task 3 data with respect to SPLrge and Percent≥90dB. Figure 8 shows the two
contour averages. Since a greater statistical difference was found between the VRPphys and
Task 4, the vocalise contour was used to represent the VRPperf.
Table 7b) Table of the R-E-G-W-R multiple comparison test for Task. Means obtained for
each metric are tabulated, and means that appear in the same homogeneous subset are not
significantly different from each other ( p<0.01).
Figure 8. Comparison of averaged contours for the physiological (Task 2, black) and
vocalise (Task 4, blue) tasks, for a group of 16 singers.
Discussion
The results reported in this study help elucidate the performance aspects of the
singing voice and how they might impact the VRP. A professional Western opera
soloist has different requirements for his/her instrument than does a speaker [26]. As
seen earlier, some have demonstrated range differences between the physiological
VRPs of untrained and trained voices; however, physiological ranges might not
necessarily differ greatly in practice. Rather, voice control is often considered the
greatest differentiating aspect between trained and untrained voices. The contour of the
VRPphys does not readily lend itself to the interpretation of such a vocal feature. The
VRPphys strives rather to capture the minimum threshold of phonation as well as
unrefined vocal transitions. On the other hand, the VRPperf might enable us to
understand subtleties of what can be considered functional for a singer (considerably
different from the speaker’s need for vocal function). Just as the SRP enables the
clinician to obtain a behavioral type of VRP acquisition, the VRPphys seems to
demonstrate interesting behavioral aspects of the singing voice that are akin to singing
and not necessarily present in non-singing voice use.
Speech
SRP data were included here because they seldom accompany VRP reports in other
studies but are an important part of the total voice evaluation. Our results for the SRPs
(Task 1a and 1b) agreed well with the speech range data of Drew and Sapir [28]. They
reported an increase of SFF in reading when comparing spontaneous speech and
reading tasks. In our study, the reading task was substituted by the counting task (Task
1b). Drew and Sapir reported a mean of 219 Hz for speech and an increased mean of
230 Hz for reading. We found averages of 242 Hz (1a) and 251 Hz (1b) respectively.
(Only our soprano data are discussed here, since the Drew and Sapir study was conducted
with 10 healthy soprano subjects.) When compared to healthy female native Swedish
speakers, the SFFs obtained here (both for 1a and 1b) are quite high. Kitzing reported a
SFF of 193 Hz with a standard deviation of 2.7 semitones for a group of 141 Swedish
female speakers [30]. Yet, when observed per voice category, the SFF averages
obtained (soprano=242 Hz, mezzosopranos=212 Hz, contraltos=220 Hz) relate
somewhat better to Nadoleczny’s results as reported by Drew and Sapir (soprano=262
Hz, mezzosopranos=230 Hz, contraltos=212 Hz). Awan also reports a higher SFF for a
group of trained voices as opposed to untrained voices [22]. See Table 4 a) for detailed
SRP results.
According to Hacki, a speech profile in normal cases should be approximately ⅓ of
the VRP [31]. It is not very clear whether he refers to a VRPperf (like Tasks 3-4 of this
study) or a VRPphys (like Task 2 of this study). Data collected in the present study
suggested that Hacki’s conclusion was most likely based on a VRPphys. Speech and
counting contours had a range of 1.3 octaves while the physiological VRP had a 3.3
octave range. In other words, the SRPs recorded in our study occupied the bottom third
of the VRPphys, covering 30 to 37% of its total area. When related to the performance
profiles, SRPs covered 40-41% of the VRPperf area. Tables 5 a-b) exemplify these
observations. On direct juxtaposition, SRPs were not completely enclosed by VRPperf.
Although there was a good correspondence in minimum frequency for both the SRPs
and the performance profiles, the SRPs displayed lower minimum SPL values than
what was found for the performance voice. This falls in line with other reports.
This last observation might correspond to the nature of both types of phonations.
Coleman claimed that sustained tones would lead to higher intensities than intermittent
phonation such as found in speech [23]. We observed a 7 dB difference in SPL between
the soft spoken tones and the sustained performance-like phonations. While there were
differences in the lower contour, all profiles followed a similar trajectory for the upper
contour. As Pabon has observed (personal communication), the left portion of the upper
contour (the initial rise of the maximum VRP curves) is often a location of convergence
when comparing within individuals, within groups and even across groups.
Concerning the maximum SPL in speech, Hacki stated that values of 80-90 dB were
normal values for the case of individuals with “good voice capabilities”[31]. In our
investigation, we obtained similar results, with maxima of 84 and 85 dB for the speech
and counting tasks respectively. Furthermore, Sulter and Awan consider the intensity
range of 60-80 dB to be important for normal communication [32, 33]. Subjects in our
study had a similar speech intensity range and maintained, on average, a level of 71 dB.
Indeed, subjects in this group were quite loud while speaking. This could be a result of
the damped acoustics of the recording studio. Pooling the data for soprano, alto and
age group data of Brown et al (1993) (corrected for their smaller microphone distance)
we obtained a mean of 64 dB [29]. This is a somewhat lower value considering that the
subjects were 14 professional singers. A mean level of 62 dB was reported for their
nonsinger group. These studies all seem to indicate that in terms of speech power, the
differences between speakers and singers are very small.
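A correction for microphone distance of the kind mentioned above can be sketched as follows. The sketch assumes free-field conditions and the inverse-distance law (6 dB per doubling of distance); the correction actually applied to the Brown et al. data is not specified here, and the 15 cm distance in the example is hypothetical.

```python
import math


def spl_at_distance(spl_db, measured_m, target_m):
    """Convert an SPL measured at measured_m to the level expected at target_m.

    Assumes free-field propagation from a point-like source.
    """
    return spl_db + 20.0 * math.log10(measured_m / target_m)


# Hypothetical example: 70 dB measured at 15 cm, referred to 30 cm.
corrected = spl_at_distance(70.0, 0.15, 0.30)
```

Doubling the distance lowers the level by about 6 dB, so levels recorded closer to the mouth must be reduced before they can be compared with the 30 cm data of the present study.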
in figure 9a), the SPL extent at middle frequencies for both voices, are quite similar.
Rather, this observation seems more relevant for the differences at the F0 extremes of
the physiological VRP. When the lower contours are compared for both voices, we note
that the two voice categories converge well with increasing frequency and the slow rise
in intensity that usually accompanies them. When the VRPperf are similarly compared
(see figure 9b), the voice category differences are manifest in the upper high end of the
VRP, where sopranos display a larger SPL extent, consistent with a greater vocal
flexibility and control at high pitch. The frequency range difference is again clear and
seems to follow voice category definitions.
(a) (b)
Figure 9 Average VRP contours for the physiological task (a) and the performance tasks (b).
Sopranos in black (N=8) and contraltos in light grey (N=6).
Effects of Task
Reich et al (1989, 1990) tested thoroughly the effect of different tasks in recording
the frequency ranges of children and adults [34]. In those studies it was concluded that
continuous tasks such as glissandi or small steps task led to better results in regard to
frequency range. For frequency minima, the slower glissando produced lower values
than the rapid glissando exercise. Although the authors focused only on frequency,
these outcomes can be interestingly related to our results.
According to the earlier stated hypotheses, the tasks for this experiment were
designed to test specifically if 1) singers would resort to a more representative use of
the voice in a performance task and if 2) in a performance task, a continuous expiratory
gesture would lead to higher vocal flexibility (both in frequency and intensity). The
inclusion of the aria excerpt served mainly to assess the possible difference between
realistic singing and task singing: an approach similar to that used with actors by
Emerich et al [35].
An overall main effect of tasking was found in the statistical analysis. As expected,
the aria excerpt task was significantly different in almost all of the investigated metrics.
Similarly to Emerich’s study of actor VRPs and Speech Range Profile (SRP), our data
confirm that the nature of the task and the performance setting suggested to the singers
will impact the results that one obtains [35]. Minimum SPL, for example, was
significantly higher in the case of the aria singing as opposed to the discrete pitch and
vocalise tasks. Conversely, the aria singing yielded a significantly smaller SPL range than
the discrete and vocalise tasks. The total area was also significantly smaller than in
other tasks.
Contrary to Emerich’s results, the singer data did not indicate an increase in
maximum intensity values when the context was changed from physiological to a
performance setting. In fact, this was the only metric which did not demonstrate any
effect of tasking. Maximum intensity levels for singers actually decreased a little when
compared to the physiological case. On the other hand, singers in all voice categories
increased their VRP area above 90 dB when given a performance context. Emerich
concludes that this ability to produce louder phonation in a performance context could
cast doubts on the proper voice function strategies of the actors. In the singer’s case, the
increase of Percent≥90dB does not evoke concern for the singing strategies of these
singers (all professionals with many years of experience) but rather attests to successful
training and vocal behavior required in performance. Significant differences for SPLext
were limited to the discrete pitch task and the physiological task. Table 7b) shows this
clearly. In fact, the two designs – the discrete pitch task and the vocalise – were not
significantly different in any of the nine VRP metrics.
It had been hypothesized that the vocalise task, being a continuous type of task and
part of the singer’s daily vocal reality, would lead to enlarged singer-specific VRPs.
The results obtained here lead us to reject this hypothesis. Differences between aria
singing and task singing were not observed for the singing-voice specific metrics. This
result speaks to the necessity for introducing two relatively new metrics, the SPLext and
the Percent≥90dB, as well as the importance of including a performance task design when
conducting singer VRP recordings. Such findings are clinically relevant. If a patient
puts forth a complaint particularly related to his/her singing voice, the clinician could
choose which VRP acquisition to prioritize. In this case, a VRPperf would most likely
help elucidate the problem.
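As an illustration of the two singer-specific metrics discussed above, the following minimal sketch computes them from a VRP stored as a set of voiced cells. It is not the analysis code used in this study. It assumes a grid of one semitone by one dB, takes SPLext as the largest minus the smallest SPL found anywhere in the VRP regardless of F0, and takes Percent≥90dB as the share of voiced cells at or above 90 dB. The function name `vrp_metrics` and its dictionary keys are hypothetical.

```python
def vrp_metrics(cells, threshold_db=90):
    """Summarize a VRP given as (semitone, spl_db) pairs, one per voiced cell.

    Assumption: each cell covers one semitone by one dB, so counting
    cells is equivalent to measuring area on that grid.
    """
    cells = set(cells)  # ignore duplicate visits to the same cell
    if not cells:
        raise ValueError("the VRP contains no voiced cells")

    levels = [spl for _, spl in cells]

    # Dynamic range independent of F0: overall maximum minus overall minimum.
    spl_ext = max(levels) - min(levels)

    # Share of the total voiced area lying at or above the level threshold.
    loud_cells = sum(1 for spl in levels if spl >= threshold_db)
    percent_loud = 100.0 * loud_cells / len(cells)

    return {
        "area_cells": len(cells),
        "spl_ext_db": spl_ext,
        "percent_at_or_above": percent_loud,
    }


if __name__ == "__main__":
    # Toy VRP with four voiced cells, two of them at or above 90 dB.
    toy_vrp = [(60, 70), (60, 80), (60, 90), (61, 95)]
    print(vrp_metrics(toy_vrp))
```

Computed this way, both metrics can be compared between a physiological VRP and a VRPperf of the same singer without reference to the contour shape.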
These task-related aspects will need consideration for the proper documentation and
understanding of the singing voice as it is used regularly by the singer. Performance
task design, according to our observations, appears to be less important than the clarity
and structure of the instructions. Providing the singer with a realistic voice-use context
is also important.
Figure 10. Physiological (in black) and performance contours (aria in light grey and vocalise in blue)
for merged soprano, mezzo and contralto groups (N=30). At low levels, the performance contours are
well contained by the physiological contour and even align at the low maximum curve rise. At high
levels, however, the performance contours exceed the physiological contour, in the uppermost region
of the maximum curve. Note however that the maximum SPL values for all VRPs are more or less the
same.
The reader may recall that for the performance tasks, the increase of voicing in the
higher SPLs was obtained in a studio context which limited the singer’s freedom of
expression, space and musicality. The context was remote from the realistic setting in
which a singer performs. On-stage recordings could well lead to an even greater
increase in the area equal to and above 90 dB. Emerich’s result of actors studied in both
on-stage and in-studio monologues seems to support this [35]. The performance VRP
might bring us a step closer to a more representative image of the singer’s voice, while
remaining distinct from the real on-stage vocal behavior.
Physiological VRPs were compared to pre-existing data sources. Figure 11 includes
four different normative contours for similar groups. Although all four studies
conducted physiological VRPs, there are clear differences in the phonation threshold
and/or the minimum curve of the VRP. The data collected in the current experiment
have the highest minimum values. When related to our counting speech data, it was
found that soft phonations produced in the physiological VRP yielded similar minimum
results (recall figure 4a). For our recording of the counting tasks, subjects were asked to
count very softly without whispering. This would indicate that in the physiological VRP,
singers stayed in a “respectable phonation” zone instead of dropping to the bare
minimum levels possible.
Figure 11. Contour averaging for singer groups. Data is representative of physiological VRPs. Lamarche data in
dark blue, N=16, professional classical singers. Sulter data in broken line, N=42, choir singers with ±2 years
experience [1]. Pabon data in blue, N=23, classical singing students (unpublished). Hacki data in light grey,
N=10, classical singers, level of skill undefined [32].
There could be two reasonable explanations for this: a procedural effect and/or a
control question. Firstly, the glissando procedure was selected for its speedy and
efficient nature; also, its non-sustained nature was believed to help the singer not to sing
(instinctively, some subjects reverted to singing quality phonations – especially vibrato
– and had to be encouraged by demonstration to abandon it). It could be that in using an
ascending continuous pitch gesture, the minimum threshold could not really be obtained
in a way representative of the threshold pressure. If a discrete pitch task had been
performed instead, a drop of 10-15 dB might be expected. In that event, this study’s
data would compare better with the other contours (Sulter and Pabon used free
phonation in discrete pitch task, except at the higher frequencies where usually
glissandi were more easily produced). Reich’s results on minimum frequency and
tasking could perhaps be generalized here to minimum intensity: a fast continuous
vocal gesture automatically raises sound pressure levels.
A second possible explanation for the higher thresholds in the present study might
be that singers wanted to keep a certain degree of control as they performed. The
minimum levels for the physiological VRP matched those obtained for SRP (1b) where
soft voice was required. Instructions were carefully formulated in regard to voice
quality and task approach, but perhaps more attention should have been given to vocal
control. It seems that singers might have felt uncomfortable visiting very low levels of
phonation because of the instability this could entail. A similar idea could explain certain
differences observed concerning the upper contour as well. Singers tended to be
cautious and needed some coaching to freely visit voice transitions. It is believed that
higher intensities could be obtained since they are demonstrably present in the
performance VRPs.
Group Criteria
The present study is concerned with one particular style of singing. Still, thanks
to the VRP’s known sensitivity to various aspects of voice and factors such as gender
and training [1, 4, 6, 15-17], it could also be of interest when grouping candidates to
collect VRP singer data by genres. In the present study, only female professional
classical soloists were included. A similar study of female professional musical theater
and commercial music could offer useful comparison material.
Technical Issues
Automatic phonetographs have spread quickly within the clinical community and
their practicality and effectiveness are established. However, in using these devices
with the professional operatic singing voice, one needs to attend to certain details that
were not necessarily relevant for manual phonetographs nor for the case of the
speaker’s voice. These include the dynamic range, the phonation occurrence threshold
setting (is one going to include vibrato or not in the tasking?), the period-time variance
threshold, the responsiveness of the F0 extraction algorithm, and the required duration
of phonation.
Here follows a brief summary of details that would need to be accounted for by the
clinician who works with the VRP. Recording the operatic voice at a 30 cm mouth to
microphone distance will result in a signal with high decibel values. This is in fact an
obstacle which was often met during this data collection and which has seldom been
reported. LeBorgne mentioned in passing some student singer phonations of 125 dB in
the context of a VRP study using CSL equipment [20]. (She does not report any
recording difficulties pertaining to the microphone or the phonetograph and furthermore
uses a microphone-to-mouth distance of 15 cm). Most current phonetographs do not
have the ability to register higher SPLs than 120 dB. Such high amplitude signals will
be clipped. Most commercial phonetographs abide by the conventional display built for
speech which ranges from 16 to 4000 Hz and from 40 to 120 dB. This might seem
elementary but it nevertheless points to the necessity of creating or adopting a "singing
voice interface or mode" in present day phonetographs. (For example, a separate
window or interface setting could help mark the differences for the user and have
pre-settings necessary for singing voice recording). For the purpose of this study, an
electrical -12 dB pad was used between the microphone and the computer’s digital
sound card; or alternatively, microphone to mouth distance was increased to 1 m. The
signal was thus reduced by 12 dB or 10.5 dB in order to make Phog recordings possible
and complete. These corrections were later accounted for in post-recording analysis.
However, in a clinical context where singer-patients are being evaluated, a VRP
program would definitely need to provide immediate proper visual feedback. The SPL
limit of the instrument aside, measurement microphones used in VRP recordings of
singers may need to tolerate 130 dB, for a 30 cm placement. In a clinical context a
headset microphone might be preferred to a fixed microphone. It would then be
imperative to select a headset with the proper voice level tolerance for singers (looking
not only at saturation but also at distortion thresholds) and calibrate it adequately [38].
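The level corrections mentioned above can be illustrated with a short calculation. The sketch below assumes free-field spherical spreading (the inverse distance law, about 6 dB per doubling of distance), which is an idealization for a studio or sound booth. Under that assumption, moving the microphone from 30 cm to 1 m lowers the level by 20·log10(1/0.3), or about 10.5 dB, in agreement with the value quoted in the text. The function names and the 120 dB default ceiling are illustrative only.

```python
import math


def distance_correction_db(d_actual_m, d_ref_m=0.3):
    """Level to add back when the microphone sat at d_actual_m instead of d_ref_m.

    Assumes free-field spherical spreading (inverse distance law).
    """
    if d_actual_m <= 0 or d_ref_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(d_actual_m / d_ref_m)


def spl_at_reference(recorded_db, pad_db=0.0, d_actual_m=0.3, d_ref_m=0.3):
    """Undo an electrical pad and/or an increased microphone distance."""
    return recorded_db + pad_db + distance_correction_db(d_actual_m, d_ref_m)


def would_clip(spl_db, ceiling_db=120.0):
    """True if the level exceeds the display ceiling of the phonetograph."""
    return spl_db > ceiling_db


if __name__ == "__main__":
    # Correction for a 1 m placement relative to the 30 cm reference.
    print(round(distance_correction_db(1.0), 1))

    # A phonation registered at 110 dB through a 12 dB pad at 30 cm.
    true_level = spl_at_reference(110.0, pad_db=12.0)
    print(true_level, would_clip(true_level))
```

Applying such a correction in real time, rather than in post-recording analysis, would be one way for a "singing voice mode" to keep the visual feedback accurate.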
Despite the increasing popularity of computerized phonetographs and their
capabilities to display additional voice quality information, VRP analysis remains
largely focused on contours. Some work [13,14,18,39] has attended more specifically to
the interior of the VRP. The VRP might offer much more information than is
commonly exploited.
Conclusions
This study investigated the possible importance of recording two types of VRP when
addressing the singing voice. Furthermore, the impact of task design was considered
and the possible necessity of subdividing subjects into groups according to voice
category was explored.
The physiological VRP was found to be different from the performance VRP. It
appears important to include both types of VRPs in a singer’s voice status analysis as
they contribute different kinds of information. While there was no significant difference
concerning SPLmax, it was observed that the percentage of the voice in the VRP area
equal to and above 90 dB increased in a performance context. Indeed the Percent≥90dB
could be a sensitive metric to performance capabilities and would perhaps be more
sensitive than the total area metric in the assessment of singer’s voices. It is clear that if
one records uniquely physiological VRPs of singers, important aspects of voice use
might not be represented. The performance context or mindset seems to be key in
obtaining a more representative image of the true vocal use of the singer and this seems
to apply to other types of professional voice users; actors are a previously reported
example.
Different task elicitation methods for the physiological VRP might greatly influence
the minimum VRP thresholds. Conversely, no effect of a particular task design could
be observed when investigating the performance VRP. Discrete pitch task and a more
continuous gesture (vocalise) task led to similar results.
The instructions and the context suggested to the singer are perhaps more important
than the particular task design in determining VRP outcomes. The hypothesis that a
vocalise task would yield more representative singing voice VRPs than that obtained
with a discrete pitch task is rejected.
Finally, results did not point out any particular need to subdivide a female singer
group according to voice category. This suggests that in the case of the singing voice it
would be important to also consider other VRP metrics that are based not only on the
contour.
All in all, it is expected that this collection of VRP data for a homogeneous group of
female Western opera singers could be useful and referential in understanding and
analysing the female classical singing voice.
Acknowledgments
The authors recognize the generosity of all the singers who participated in these
recordings. The authors are also indebted to the Baxter and Ricard Foundation which
partially funded this research. Thank you to Caroline Traube who made sound booths at
the University of Montreal accessible. The authors also wish to thank Erwin
Schoonderwaldt and Clara Maitre, who helped with Matlab scripts.
References
1. Sulter A M, Schutte H K, Miller D G. Differences in phonetogram features between
male and female subjects with and without vocal training. J Voice. 1995;9:363–377.
2. Heylen L, Wuyts F L, Mertens F, de Bodt M, Van de Heyning P H. Normative voice
range profiles of male and female professional voice users. J Voice. 2002;16:1–7.
3. Ma E, Robertson J, Radford C, Vagne S, El-Halabi R, Yiu E. Reliability of Speaking
and Maximum Voice Range Measures in Screening for Dysphonia. J Voice.
2007;21:397–406.
4. Wuyts F L, Heylen L, Mertens F, du Caju M, Rooman R, Van de Heyning P H, de
Bodt M. Effects of age, sex, and disorder on voice range profile characteristics of
230 children. Ann Otol Rhinol Laryngol. 2003;112:540–548.
5. Siupsinskiene N. Quantitative analysis of professionally trained versus untrained
voices. Medicina (Kaunas). 2003;39:36–46.
6. Hacki T. [Vocal capabilities of nonprofessional singers evaluated by measurement
and superimposition of their speaking, shouting and singing voice range profiles].
HNO. 1999;47:809–815.
7. Chen S H. Voice range profile of Taiwanese normal young adults: a preliminary
study. Zhonghua Yi Xue Za Zhi (Taipei). 1996;58:414–420.
8. Heylen L, Wuyts F L, Mertens F, de Bodt M, Pattyn J, Croux C, Van de Heyning P
H. Evaluation of the vocal performance of children using a voice range profile
index. J Speech Lang Hear Res. 1998;41:232–238.