Aalborg Universitet: Møller, Henrik
Published in:
Applied Acoustics
Publication date:
1992
Document Version
Accepted author manuscript, peer reviewed version
Henrik Møller
ABSTRACT
1 INTRODUCTION
The idea behind the binaural recording technique is as follows: The input to
the hearing consists of two signals: sound pressures at each of the eardrums.
If these are recorded in the ears of a listener and reproduced exactly as they
were, then the complete auditive experience is assumed to be reproduced,
Applied Acoustics 0003-682X/92/$05.00 © 1992 Elsevier Science Publishers Ltd, England.
Printed in Great Britain
including timbre and spatial aspects. The term binaural recording refers to
the fact that the two inputs to the hearing are reproduced correctly.
The recording may be made with small microphones placed in the ear
canals of a human listener, but normally a copy of a human head is used. The
copy has the shape of an average human head, including nose, orbits, pinnae
and ear canals, and sometimes the head is even attached to a torso copy. Also
the acoustical impedance of the ear drum is sometimes simulated. By
accurately copying a human head it is ensured that sound waves reaching the
head undergo the same transmission on their way to the ear canals as if they
were reaching a real listener. A copy of the human head is called an artificial
head, a dummy head, a head simulator or, in German, a Kunstkopf. These
terms have inspired alternative terms for the binaural recording technique:
artificial head recording technique, dummy head technique and
Kunstkopftechnik. Also the expression head-related technique is used.
The playback is normally done with headphones, since this method
ensures that sound picked up in one ear is only reproduced in that ear.
Reproduction through loudspeakers would introduce an unwanted
crosstalk, since sound from each of the loudspeakers would be heard with
both ears.
The basic idea of the binaural recording technique is not new.
Descriptions of the idea, of its applications and of details in the sound
transmission from the recording head to the listener's eardrums have
been found in the literature for more than 60 years. Examples are Refs
1-37.
Several artificial heads are commercially available from manufacturers
such as Neumann (Berlin, Germany), Head Acoustics GmbH (Aachen,
Germany), Brüel & Kjær (Nærum, Denmark) and Knowles Electronics
(Itasca, IL). Nevertheless, the binaural recording technique has not yet gained the
widespread use that might be expected. Among the reasons are the fact that
binaural signals are intended for headphone reproduction, and the
recording and broadcasting industries have not yet been prepared to make
special recordings for this purpose, except for experimental issues. Another
obstacle to use in broadcasting is lack of mono compatibility.
Investigations have shown, though, that proper equalization of the
microphone output may guarantee preservation of timbre, even when
signals are reproduced through an ordinary loudspeaker stereo set-up.
There is some disagreement about the exact way of doing this, the
main concepts being free-field equalization and diffuse-field
equalization.18,23,25,27,28,31,33,38-45,83 Of course, the spatial reproduction of the
binaural technique is not obtained, but it is claimed that the quality is
comparable to that of traditional intensity stereo recordings with respect to
timbre and spatial characteristics.42,43
Fundamentals of binaural technology 173
this means that the sound transmission from that other point to the eardrum
must be independent of the direction and the distance to the sound source.
Any set of left and right channel signals, recorded at points that fulfil this
requirement, are called binaural signals. The sound transmission from a
source in a free field, through the external ear to the eardrum, is described in
a model given in Section 2.
The model involves a somewhat untraditional application of
transmission-line theory, that makes the calculations simpler. An example
illustrating this is given in Appendix A.
In the binaural technique, the importance of a properly equalized
headphone is often overlooked. Much effort is spent on design of artificial
recording heads, and then any headphone is used for listening. This is
remarkable, since it is quite evident that the headphone, as well as the head
used for the recording, contributes to the total sound transmission. The
correct reproduction of the recorded sound pressure can only be guaranteed
when certain characteristics of the headphone are known. Therefore, a
model describing the sound transmission from this device to the eardrum is
also developed. This model is given in Section 3.
Section 4 gives a description of the binaural technique based upon the
models from Sections 2 and 3. Three possible recording points are selected:
(a) at the eardrum, (b) at the entrance to the ear canal and (c) at the entrance
to the ear canal, but with the ear canal physically blocked. For each of these
points it is shown how the correct reproduction can be obtained by the
introduction of an electrical equalizing circuit. Section 4 concludes by
mentioning recording at other points in the ear canal, using miniature
microphones. Such microphones are often used, since they are produced
small enough to be inserted in the ear canal.
In the description it is assumed that the artificial head used during
recording, and for calibration of the headphone, is a perfect copy of the
listener present during the reproduction. Alternatively, the listener himself
may be used instead of the artificial head.
Techniques that make binaural signals suitable for loudspeaker
reproduction are covered in Sections 5 and 6. Section 5 covers equalizing
methods that give the same tonal balance in loudspeakers, as if the recording
were made with traditional microphones. However, the precise reproduction
of the eardrum signals is not preserved, and the spatial reproduction is not
superior to that of traditional stereo. Section 6 covers a more sophisticated
method for loudspeaker reproduction that gives complete restoration of the
two eardrum signals. Thus the outstanding spatial reproduction of the
binaural technique is preserved. The method works only in anechoic
surroundings, and the position of the head must be rather precise.
Until this point, the binaural signals are assumed to originate in a
recording of an acoustical event that has taken place in real life. It is also
possible to synthesize the signals on a computer. The possibility of
computational creation of binaural signals is described in Section 7. Section
8 is a brief mention of some possible applications of binaural technology.
2 LISTENING IN A FREE FIELD
The listener in Fig. 1 is in a concert hall. He prefers the live concert to his
stereo set at home, not only because of the atmosphere associated with live
concerts, but also because the live concert gives him a true three-dimensional
auditive experience. He can hear the direct sound from each of the
instruments and the reflections coming from the sides and above, and thus
create an 'image' of the orchestra, the concert hall and its acoustics. If it is a
well-designed hall, the reflections contribute positively to the musical
experience, and they help in localizing the instruments.
A precondition for the creation of an 'auditive image' of the orchestra and
the room is the ability of the human hearing to determine direction and
distance to single sound sources. In the case of the concert hall, each instrument is
a sound source that sends a direct sound to the listener. The instruments also
send sound waves in other directions, waves that give rise to reflections. The
reflections can be regarded as sound coming from additional and imaginary
sound sources of which the direction and distance can be determined, for
example from a mirror image model. If these sound sources were playing one
at a time, the listener would--at least to some extent--be able to determine
direction and distance to the 'source'. When the direct sound and the
reflections are present all together, the listener gets the total spatial
experience.
It is traditionally said that the hearing uses a number of cues in the
determination of direction and distance to a sound source. Among the cues
are
(1) coloration
(2) interaural time differences
(3) interaural phase differences
(4) interaural level differences
These cues are claimed to be responsible for the directional hearing, each
in its own 'domain'. For instance, in the horizontal plane low frequencies are
said to be assessed by interaural phase differences, medium frequencies by
interaural time differences and high frequencies by interaural level
differences. Coloration is claimed to be responsible where no interaural
differences exist, that is in the median plane. A thorough discussion of cues to
directional hearing is given by Blauert.13,32
This way of splitting up the cues for the directional hearing is only justified
on the basis of experiments with presentation of sophisticated artificial
signals. A natural sound coming from a given direction will--on its way to
the two ears--be exposed to two unique filterings, of which the spectral and
time attributes cannot be separated. The fundamental idea in the present
description will therefore be more general, namely:
A sound wave coming from a given direction and distance, results in two
sound pressures, one at each eardrum. The transmissions are described in
terms of two transfer functions that include any linear distortion, such as
coloration and interaural time and spectral differences. The task of a
binaural recording and playback system is to present the correct inputs to
the hearing, that is to reproduce the eardrum signals correctly. In this
connection, it is not important how the hearing extracts information from
the eardrum signals about distance and direction.
Knowledge about the way in which the hearing extracts the distance and
directional information may prove useful when the needed
accuracy of the transmission is to be assessed. Or expressed in another way:
if the eardrum signals cannot be reproduced 100% correctly, this knowledge
may tell which aspects of the eardrum signals are most important to
reproduce correctly.
Fig. 2. Sound source and listener in a free field. Conventions for the variables indicating
distance and direction are shown.
2>>8x10-am (1)
Problems with possible zeros in the reference transfer function are also
overcome, since they hardly exist in the average.
The description above applies to the transmission into each ear. If
symmetry is assumed, then
$[P_{i,\text{left}}/P_1](r, \phi, \theta) = [P_{i,\text{right}}/P_1](r, -\phi, \theta)$    (10)
It is important to understand that the transfer functions defined in this
section give a complete description of the sound transmission, including
diffraction around the head, reflections from shoulders, reflections in the ear
canal, etc. It may be a little tricky to see, for instance, that the reflections from
the eardrum are included in a simple pressure ratio as in eqn (3). The
explanation is, of course, that the impedances $Z_{\text{earcanal}}$ and $Z_{\text{radiation}}$ are
frequency-dependent variables given in amplitude and phase for a wide
frequency range. This is in contrast to traditional transmission-line
calculations, where normally steady-state conditions are assumed at a
specific frequency. If the pressure ratio in eqn (3) is inverse
Fourier-transformed, the reflections will be identifiable in the time domain. An
example of this is given in Appendix A.
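
As a toy illustration of this last point (it is not the ear-canal model of eqn (3)), consider a hypothetical pressure ratio made of a direct path plus a single reflection. The reflection coefficient `r`, the delay in samples and the number of frequency bins are assumptions chosen only for the example. When the ratio is known in amplitude and phase over the whole frequency range, an inverse discrete Fourier transform separates the two arrivals in time:

```python
import cmath

def toy_ratio(n_bins, r, delay):
    """Frequency samples of 1 + r*exp(-j*2*pi*k*delay/n_bins):
    a direct path plus one reflection delayed by `delay` samples."""
    return [1 + r * cmath.exp(-2j * cmath.pi * k * delay / n_bins)
            for k in range(n_bins)]

def inverse_dft(spectrum):
    """Plain O(N^2) inverse discrete Fourier transform, standard library only."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

# Hypothetical values: 32 bins, reflection of strength 0.5 arriving 5 samples late.
impulse_response = inverse_dft(toy_ratio(n_bins=32, r=0.5, delay=5))

# Keep only the samples that are not numerically zero.
peaks = [(t, round(v.real, 6)) for t, v in enumerate(impulse_response) if abs(v) > 1e-9]
# peaks -> [(0, 1.0), (5, 0.5)]: the direct sound and the reflection
```

At a single frequency the same ratio is just one complex number, which is why the reflection is not visible in a traditional steady-state calculation.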
At this point it is also important to note that the head-related transfer
functions include all the cues to the directional hearing mentioned in the
introduction to Section 2, namely direction-dependent coloration and
interaural differences. Note also that some of the head-related transfer
functions are non-causal, since the sound arrives earlier at the ear than at the
middle of the head, if the ear is closer to the sound source.
$Z_{\text{radiation}}$ is the radiation impedance seen looking outward from the entrance
to the ear canal. This impedance is independent of source location. Then the
propagation from P2 through P3 to P4 must be independent of source
location. This means that all of the sound pressures P2, P3 and P4 include the
full spatial information given to the ear. Thus, any of these sound pressures
will be suitable for binaural recording, if only the reproduction is adapted, so
that the 'final product', the sound pressure at the eardrum, is correct. The
same applies to any sound pressure along the ear canal. How the adaptation
should be made is described in Section 4.
Let the transmission from a sound source to sound pressure at the
eardrum be split up in the following way:
$[P_4/P_1] = [P_4/P_3] \cdot [P_3/P_2] \cdot [P_2/P_1]$    (11)
In addition to a possible dependence upon distance and direction to the
sound source, these transfer functions are assumed dependent upon the
When the expression 'entrance to the ear canal' is used, it is assumed that
from this point the propagation into the ear canal is one-dimensional and
thus independent of direction and distance to the sound source. Whether the
one-dimensional propagation starts immediately at the entrance, or it is
necessary to go a few millimeters into the ear canal, is not clear. From a
theoretical point of view, the axial modes will attenuate with distance from
the entrance, and most authors mention a point 0-4 mm from the ear-canal
entrance. However, only sparse experimental verification is available, when
specific deviations from a directionally independent transmission are
considered.
Mehrgardt and Mellert v° used a point 2 mm inside the ear canal, and
briefly reported a dependence upon direction if a point outside this
was chosen. A different result was obtained in a recent study carried out at
our laboratory. 6a
In this study, transfer functions were measured from various points in the
ear canal to the eardrum. The measurements were made with probe
microphones in the ear canals of four humans. The probes had an outer
diameter of 1.5 mm, and did not disturb the sound field in the ear canal. The
sound was transmitted from three different directions in a free sound field
(left ear, $\phi$ = 0°, 90° and 180°, $\theta$ = 0°).
The results from a typical subject are shown in Fig. 4. It is seen that the
transmission in the two lower lines of the figure is highly dependent upon the
direction of sound incidence. The differences exist for frequencies above
approximately 4 kHz. The four upper graphs in Fig. 4 show no or very little
dependence on direction. This means that the one-dimensional transmission
has already started at a point located a few millimeters outside the ear canal.
This conclusion has been based on measurements covering only three
directions. Unpublished measurements at our laboratory of the pressure
ratio [P3/P2] at the entrance to the ear canal with five directions of sound
incidence ($\theta$ = 0°, $\phi$ = 0°, 90°, 180°, 270° and $\theta$ = 90°) show the same absence
of directional dependence.
[Fig. 4, graphic not recoverable. Legible labels: transmission (dB) from a point to the eardrum; (a) 12 mm inside the ear canal; (d) 6 mm outside (in line of) the ear canal; (e) at 'caudal cavum conchae'.]
[Fig. 5, graphic not recoverable. Panels: [P2/P1], [P3/P2] and [P4/P3], each for 1 person and 3 directions; horizontal axis: frequency (Hz), 200 to 10 000.]
Fig. 5. Magnitude of transfer functions for one subject, one ear. Each line shows
measurements from three different directions.
3 LISTENING WITH HEADPHONES
[Figure, graphic not recoverable. Legible labels: headphone, Zheadphone, transmission line, Zeardrum.]
Just like in the free field, a set of transfer functions exists for either side. If
symmetry is assumed, then
4 BINAURAL TECHNIQUE
As argued in Section 2, the sound pressure at the entrance to the open ear
canal includes full spatial information. It is therefore appropriate also to
look at the situation where this sound pressure is recorded.
The electrical gain is called $G_B$, and the total transfer function from the
sound field to the eardrum is
$[P_3/P_1] \cdot M_1 \cdot G_B \cdot [P_7/E_{\text{headphone}}]$    (31)
while it should still be
$[P_4/P_1]$    (32)
This goal is reached, if expressions (31) and (32) are made equal, which gives
$G_B = \frac{[P_4/P_1]}{[P_3/P_1] \cdot M_1 \cdot [P_7/E_{\text{headphone}}]} = \frac{[P_4/P_3]}{[P_7/P_6]} \cdot \frac{1}{M_1 \cdot [P_6/E_{\text{headphone}}]}$    (33)
The first term here is unity, since the transmission from pressure at the
entrance to the open ear canal to the eardrum is the same, whatever the
source (see eqn (22)). Then
$G_B = \frac{1}{M_1 \cdot [P_6/E_{\text{headphone}}]}$    (34)
Conclusion. The correct transfer function is obtained when the electronic
circuit compensates for (1) the microphone sensitivity and (2) the headphone
transfer function from the terminals to the sound pressure at the entrance to
the open ear canal.
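
A quick numerical check of eqn (34) at a single frequency may make the conclusion more tangible. The sketch below uses made-up complex values; all function names, variable names and numbers are illustrative assumptions, not measurements. It relies only on the statement above that the transmission from the entrance of the open ear canal to the eardrum is the same whatever the source, that is [P4/P3] = [P7/P6].

```python
def method_b_gain(m1, p6_over_e):
    """Eqn (34): G_B = 1 / (M1 * [P6/E_headphone]) at one frequency."""
    return 1 / (m1 * p6_over_e)

def total_transmission_b(p3_over_p1, canal, m1, p6_over_e):
    """Expression (31) evaluated with G_B from eqn (34).

    canal is the common ear-canal transmission [P4/P3] = [P7/P6], so
    [P7/E_headphone] = canal * [P6/E_headphone].
    """
    p7_over_e = canal * p6_over_e
    return p3_over_p1 * m1 * method_b_gain(m1, p6_over_e) * p7_over_e

# Hypothetical single-frequency values (complex: amplitude and phase).
p3_over_p1 = 1.2 + 0.4j   # free field to open ear-canal entrance
canal = 2.0 - 0.5j        # open ear-canal entrance to eardrum
m1 = 0.01                 # microphone sensitivity in V/Pa
p6_over_e = 3.0 + 1.0j    # headphone terminals to open ear-canal entrance, Pa/V

wanted = p3_over_p1 * canal                                    # [P4/P1], expression (32)
obtained = total_transmission_b(p3_over_p1, canal, m1, p6_over_e)
```

With these numbers `obtained` agrees with `wanted` to rounding error, which is expression (31) made equal to expression (32).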
The sound pressure at the entrance to the blocked ear canal also contains the
complete spatial information.
The electrical gain is here called $G_C$ and the total transfer from the free
sound field to the sound pressure at the eardrum is:
$[P_2/P_1] \cdot M_1 \cdot G_C \cdot [P_7/E_{\text{headphone}}]$    (35)
Again it should be
$[P_4/P_1]$    (36)
This is fulfilled if expressions (35) and (36) are equal, which gives
$G_C = \frac{[P_4/P_1]}{[P_2/P_1] \cdot M_1 \cdot [P_7/E_{\text{headphone}}]} = \frac{[P_4/P_3]}{[P_7/P_6]} \cdot \frac{[P_3/P_2]}{[P_6/P_5]} \cdot \frac{1}{M_1 \cdot [P_5/E_{\text{headphone}}]}$    (37)
As above, the first term is unity, and the second can be exchanged with
expression (24), so
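$G_C = \frac{[P_3/P_2]}{[P_6/P_5]} \cdot \frac{1}{M_1 \cdot [P_5/E_{\text{headphone}}]} = \frac{Z_{\text{earcanal}} + Z_{\text{headphone}}}{Z_{\text{earcanal}} + Z_{\text{radiation}}} \cdot \frac{1}{M_1 \cdot [P_5/E_{\text{headphone}}]}$    (38)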
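
The impedance factor in eqn (38) can be explored with a few lines of code. This is a minimal sketch that assumes each pressure ratio is a simple pressure division between the ear-canal impedance and the impedance seen outward from the entrance, which is consistent with the second form of eqn (38). The function names and the impedance values are hypothetical.

```python
def pressure_division(z_earcanal, z_source):
    """Open-entrance pressure divided by blocked-entrance pressure,
    assuming a pressure division between z_earcanal and z_source."""
    return z_earcanal / (z_earcanal + z_source)

def method_c_correction(z_earcanal, z_radiation, z_headphone):
    """The factor [P3/P2] / [P6/P5] in eqn (38) at one frequency."""
    free_field = pressure_division(z_earcanal, z_radiation)   # [P3/P2]
    headphone = pressure_division(z_earcanal, z_headphone)    # [P6/P5]
    return free_field / headphone

# Hypothetical complex impedances at one frequency (arbitrary units).
z_e, z_r, z_h = 3 + 4j, 1 + 2j, 2 - 1j

correction = method_c_correction(z_e, z_r, z_h)
closed_form = (z_e + z_h) / (z_e + z_r)      # right-hand impedance ratio of eqn (38)

# A headphone that loads the ear like free air (z_headphone equal to z_radiation)
# makes the correction unity.
open_headphone = method_c_correction(z_e, z_r, z_r)
```

The last value shows why no extra correction is needed when the headphone presents the same impedance to the ear canal as the free air does.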
In the previous sections, it has been shown that all three methods can be
adapted to give the correct transmission from the recorded sound field to the
listener's eardrum, simply by choosing the proper equalization. This is not
surprising, since the only difference between the methods is the one-
dimensional transmission from open circuit pressure at the input of the ear
canal to pressure at the eardrum.
All methods require recording at a point in the ear canal and
determination of the headphone transfer function to the same point in the
ear canal. The point is sometimes called the reference point. Determination
of the headphone transfer function is called calibration of the headphone.
The third method requires an extra correction to the equalization. This
correction, though, disappears when an ideally open headphone is used.
If a human subject is used for recording and headphone calibration, the
first method with recording at the eardrum might be considered
inconvenient because of the practical problems involved in recording the
sound here. In Method C the ear canal is blocked, and this method has the
disadvantage that the listener is not able to hear the sound during recording.
For recording with an artificial head, Method C is attractive, since the
head needs no ear canals. The method is particularly simple, if an ideally
open headphone is used for reproduction.
It is not clear which method will give the best results, when differences
occur between the recording head, the head used for headphone calibration
and the head of the listener. Method C may seem immediately attractive,
since the recorded signals contain as little individual information as possible.
Only the transfer to P2 is included, and less will not do.
$G = \frac{1}{M_1 \cdot [P_i/E_{\text{headphone}}]} = \frac{1}{M_1 \cdot [(E_{\text{microphone}}/M_2)/E_{\text{headphone}}]} = \frac{M_2}{M_1} \cdot \frac{E_{\text{headphone}}}{E_{\text{microphone}}}$    (40)
It is seen that the electrical gain, G, must equalize the transfer function
measured from headphone voltage to output voltage of the microphone, as
well as the ratio between the two microphones used for calibration and for
recording.
Now, the smart reader will probably suggest that the same microphone be
used for recording and calibration, so M1 = M2. In that case eqn (40)
becomes
$G = \frac{E_{\text{headphone}}}{E_{\text{microphone}}}$    (41)
and the need for calibration of the microphones is avoided.
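
Eqns (40) and (41) translate directly into a per-frequency division of two measured voltage spectra. The sketch below is only an illustration under stated assumptions: the spectra are given as lists of complex values, one per frequency bin, and the microphone sensitivities are treated as frequency-independent scalars. The function name and all numbers are hypothetical.

```python
def equalizer_gain(e_headphone, e_microphone, m1=1.0, m2=1.0):
    """Eqn (40) per frequency bin: G = (M2/M1) * E_headphone / E_microphone.

    With m1 == m2 (same microphone for recording and calibration)
    this reduces to eqn (41).
    """
    return [(m2 / m1) * eh / em for eh, em in zip(e_headphone, e_microphone)]

# Hypothetical calibration data for two frequency bins.
headphone_tf = [2 + 1j, 0.5 - 0.25j]   # [P_i/E_headphone] in Pa/V (unknown in practice)
e_hp = [1.0, 1.0]                      # voltage applied to the headphone
m1, m2 = 0.02, 0.05                    # recording and calibration microphones, V/Pa

# What the calibration microphone would deliver: E_microphone = M2 * P_i.
e_mic = [m2 * t * eh for t, eh in zip(headphone_tf, e_hp)]

g = equalizer_gain(e_hp, e_mic, m1=m1, m2=m2)

# Check: M1 * [P_i/E_headphone] * G should be unity in every bin.
chain = [m1 * t * gi for t, gi in zip(headphone_tf, g)]
```

Only the two voltages enter the computation; the headphone transfer function itself appears here just to generate test data.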
Until now, it has been assumed that the recording could be made without
disturbing the sound field with the microphone. It may be possible with an
artificial head, where a microphone can be built into the head, but this
possibility does not exist with a real head. For measurement purposes, probe
microphones with very thin probe diameter will do the job, but their internal
noise is too high for making recordings.
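$G = \frac{1}{M_1 \cdot [P_x/E_{\text{headphone}}]^*}$    (43)
The total transfer function to the eardrum will be
$[P_x/P_1]^* \cdot M_1 \cdot \frac{1}{M_1 \cdot [P_x/E_{\text{headphone}}]^*} \cdot [P_7/E_{\text{headphone}}] = \frac{[P_x/P_1]^* \cdot [P_7/E_{\text{headphone}}]}{[P_x/E_{\text{headphone}}]^*}$    (44)
This should have been
$[P_4/P_1]$    (45)
The error is found by dividing eqn (44) by (45).
$\text{error} = \frac{[P_x/P_1]^* \cdot [P_7/E_{\text{headphone}}]}{[P_4/P_1] \cdot [P_x/E_{\text{headphone}}]^*} = \frac{[P_x/P_3]^* \cdot [P_3/P_2]^* \cdot [P_2/P_1]^* \cdot [P_7/P_6] \cdot [P_6/P_5] \cdot [P_5/E_{\text{headphone}}]}{[P_4/P_3] \cdot [P_3/P_2] \cdot [P_2/P_1] \cdot [P_x/P_6]^* \cdot [P_6/P_5]^* \cdot [P_5/E_{\text{headphone}}]^*}$    (46)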
[P2/P1] and [P5/Eheadphone] are the same with and without the microphone.
The same arguments as in Section 3.4 can be used, so
[ Px/P6]* = [ Px/P3]* (47)
Now, by also using eqns (22), (3) and (23), eqn (46) can be reduced to
$\text{error} = \frac{[P_3/P_2]^* \cdot [P_6/P_5]}{[P_3/P_2] \cdot [P_6/P_5]^*}$
In Section 4 it was shown how the correct transmission of sound from the
artificial head to the listener's eardrums was achieved by the introduction of
an electrical equalizing filter G. If the signals either before or after this filter
are reproduced through a traditional stereo set-up with loudspeakers, a
coloration will in general occur because of the non-flat frequency response
from the original sound field to the voltage at these points.
However, it is possible to divide the G filter into two, as illustrated in Fig. 8.
The filters are called G' and G", and
$G' \cdot G'' = G$    (49)
The two filters should be chosen so that the frequency response would be flat
from the free sound field to the dividing point between them. In this way
signals from the dividing point can be played through loudspeakers without
coloration.
As mentioned in Section 1, the way of dividing G has been the matter of
some discussion. The two main concepts are free-field equalization and
diffuse-field equalization. These terms will be described in the following.
As the recording head has different transfer functions for different angles
of sound incidence--that is the idea of binaural recording--a flat frequency
response cannot be obtained for all angles. As most sound sources are in
front of the recording head, it has been argued that a flat frequency response
should be obtained for sound arriving from a sound source in the front. This
way of choosing G' and G" is called free-field equalization.
It has also been argued that the recording head is normally further away
from the sound sources than the hall radius of the recording room. Then it is
located in the diffuse part of the sound field, and a flat frequency response
should be preferred for sound coming from random directions. This choice
is called diffuse-field equalization.
In this case there should be a flat frequency response for a sound arriving
from the front to voltage after filter G'. For method A this means
where the constant determines the gain. It has the dimension voltage per
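pressure, giving the units V/Pa. $G'_{A,\text{free field}}$ and $G''_{A,\text{free field}}$ are easily found from
eqn (50), by using eqn (30)
$G'_{A,\text{free field}} = \frac{\text{constant}}{M_1 \cdot [P_4/P_1](\phi = 0^\circ, \theta = 0^\circ)}$    (51)
$G''_{A,\text{free field}} = \frac{G_A}{G'_{A,\text{free field}}} = \frac{[P_4/P_1](\phi = 0^\circ, \theta = 0^\circ)}{[P_7/E_{\text{headphone}}] \cdot \text{constant}}$    (52)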
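
The split into two filters can be checked numerically at a single frequency. The sketch below follows eqns (51) and (52); it assumes that eqn (30) has the form G_A = 1/(M1 · [P7/Eheadphone]), by analogy with eqn (34), and all names and values are hypothetical.

```python
def split_free_field_a(m1, hrtf_front, headphone_tf, constant=1.0):
    """Return (G'_A, G''_A) for free-field equalization, one frequency.

    hrtf_front   : [P4/P1] for phi = 0 deg, theta = 0 deg
    headphone_tf : [P7/E_headphone]
    constant     : overall gain in V/Pa
    """
    g_prime = constant / (m1 * hrtf_front)                    # eqn (51)
    g_double_prime = hrtf_front / (headphone_tf * constant)   # eqn (52)
    return g_prime, g_double_prime

# Hypothetical single-frequency values.
m1 = 0.01                  # microphone sensitivity, V/Pa
hrtf_front = 1.5 + 0.5j    # frontal head-related transfer function
headphone_tf = 4 - 2j      # headphone terminals to eardrum, Pa/V

g1, g2 = split_free_field_a(m1, hrtf_front, headphone_tf, constant=2.0)

flat_point = hrtf_front * m1 * g1    # response at the dividing point, frontal sound
total_gain = g1 * g2                 # should equal the assumed G_A
```

Here `flat_point` equals the chosen constant, so a frontal sound gives a flat response at the dividing point, while `total_gain` stays equal to the undivided filter.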
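$G'_{B,\text{free field}} = \frac{\text{constant}}{M_1 \cdot [P_3/P_1](\phi = 0^\circ, \theta = 0^\circ)}$    (53)
$G'_{C,\text{free field}} = \frac{\text{constant}}{M_1 \cdot [P_2/P_1](\phi = 0^\circ, \theta = 0^\circ)}$    (54)
$G''_{B,\text{free field}} = \frac{[P_3/P_1](\phi = 0^\circ, \theta = 0^\circ)}{[P_6/E_{\text{headphone}}] \cdot \text{constant}}$    (55)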
A head followed by the filter $G'_{\text{free field}}$ (or a head that is constructed so that
$G'_{\text{free field}}$ becomes a constant) is called free-field-equalized. Usually this refers
to an artificial head, but the expression can be used in connection with a real
head as well. The transfer function for the free-field-equalized head with
recording at the eardrum becomes
(57)
$= \text{constant} \cdot \frac{[P_2/P_1](r, \phi, \theta)}{[P_2/P_1](\phi = 0^\circ, \theta = 0^\circ)}$
Here a flat frequency response is wanted, when the head is placed in a diffuse
sound field. The filter $G'_{A,\text{diffuse field}}$ can be determined from
$[P_4/P_1](\text{average of all angles}) \cdot M_1 \cdot G'_{A,\text{diffuse field}} = \text{constant}$    (61)
Then
$G'_{A,\text{diffuse field}} = \frac{\text{constant}}{M_1 \cdot [P_4/P_1](\text{average of all angles})}$    (62)
Note that only the magnitude of [Pi/P1] (average of all angles) has a
meaning (see the arguments in connection with eqn (9)). Then it is only
possible to determine the magnitude of $G'_{\text{diffuse field}}$ and $G''_{\text{diffuse field}}$, and
minimum phase realizations of the filters are normally used.
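
Since only a magnitude is defined, the diffuse-field filter can be sketched as a purely real gain per frequency bin. The example below assumes that the 'average of all angles' is an unweighted power (root-mean-square) average of the transfer-function magnitudes over the measured directions; that choice, the function name and the numbers are assumptions made for illustration, and the definition in eqn (9) should be consulted for the actual averaging.

```python
import math

def diffuse_field_gain_magnitude(hrtfs, m1=1.0, constant=1.0):
    """|G'| = constant / (|M1| * rms of |P/P1| over directions), one frequency bin.

    hrtfs holds one complex (or real) transfer-function value per direction.
    The rms average over directions is an assumption of this sketch.
    """
    mean_power = sum(abs(h) ** 2 for h in hrtfs) / len(hrtfs)
    return constant / (abs(m1) * math.sqrt(mean_power))

# Hypothetical values for one frequency bin and two directions.
gain = diffuse_field_gain_magnitude([3, 4])   # rms = sqrt(12.5)
```

The phase of the filter is left open by this computation; as stated above, a minimum phase realization would normally be derived from the magnitude.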
Use of definition (9) in expression (68) shows that the output of a
diffuse-field-equalized head is
output = constant. [monaural transfer function referenced to diffuse field]
(71)
The same result is found for Methods B and C (eqns (66) and (67) are
inserted, and eqns (22) and (24) are used). So, as for free-field equalization, it
does not matter at which recording point the calibration is made.
Free-field- and diffuse-field-equalized headphones are commercially
available, often in a form where the headphone itself is claimed to have the
correct characteristic, so the filter G" is not needed. If a free-field- or a
diffuse-field-equalized headphone is used in connection with a similarly
equalized artificial head, both filters G' and G" should be omitted. Correct
total equalization is obtained, when the output from the artificial head is
connected directly to the headphone (through a power amplifier).
There may be some uncertainty about the quality of the free-field or
diffuse-field equalization of the headphones. It is the author's experience that
headphones that are claimed to have the same type of equalization, but
produced by different manufacturers, may be very different in timbre as well
as measured frequency response.36
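$P_{\text{playback,left}} = H_{\text{left-left}} \cdot E_{\text{loudspeaker,left}} + H_{\text{right-left}} \cdot E_{\text{loudspeaker,right}}$    (76)
$P_{\text{playback,right}} = H_{\text{left-right}} \cdot E_{\text{loudspeaker,left}} + H_{\text{right-right}} \cdot E_{\text{loudspeaker,right}}$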
The wanted transmission is
[Fig. 9, graphic not recoverable. Legible labels: signal processing and power amplifier; transmission paths labelled H.]
Fig. 9. Principal diagram showing the transmissions from recorded voltages to sound
pressures in the ear canal.
is easily recognised as the interaural transfer function (see eqn (8)). In the
following, it is denoted ITF.
$[P_{\text{playback}}/P_1]_{\text{same side}} \cdot [P_1/E_{\text{loudspeaker}}] = [P_{\text{playback}}/E_{\text{loudspeaker}}]_{\text{same side}}$    (82)
is the transfer function from a loudspeaker to the reference point in the ear
canal at the side facing the loudspeaker. Now, eqn (80) can be written as
$E_{\text{loudspeaker}} = \frac{P_{\text{record, same side}} - P_{\text{record, opposite side}} \cdot \text{ITF}}{[P_{\text{playback}}/E_{\text{loudspeaker}}]_{\text{same side}} \cdot (1 - \text{ITF}^2)}$    (83)
This signal processing is shown in block-diagram form in Fig. 10. The blocks
to the left perform an equalization of the recording microphone and the
loudspeakers. The next blocks also perform an equalization, necessary
because of coloration due to the crosstalk. At frequencies where the
crosstalk is low, this block is almost unity. The actual suppression of crosstalk
is carried out by the two cross-coupled blocks.
The transfer function from voltage at loudspeaker terminals to sound
pressure at the reference point in the listener's ear canal includes a delay.
Naturally, this delay should also be accepted for the whole system, otherwise
a non-causal filter would be required. A number of other practical
considerations must be taken into account in a realization. Among these are
problems at low frequencies, where the interaural transfer function
approaches unity, and special care must be taken not to require a division by
zero in the block containing $1/(1 - \text{ITF}^2)$.
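
The processing of eqn (83) can be sketched for one frequency bin as follows. The sketch assumes a symmetrical set-up in which both same-side paths equal one transfer function `h_same` and both cross paths equal ITF times `h_same`. The guard value `min_denominator` is a hypothetical threshold standing in for the low-frequency precautions mentioned above, and all names and numbers are illustrative.

```python
def loudspeaker_voltages(p_left, p_right, h_same, itf, min_denominator=1e-6):
    """Eqn (83) for both channels at one frequency (symmetrical set-up assumed).

    p_left, p_right : recorded signals for the two ears
    h_same          : [P_playback/E_loudspeaker] for the same side
    itf             : interaural transfer function
    """
    denominator = 1 - itf * itf
    if abs(denominator) < min_denominator:
        # ITF close to unity: the canceller would require a division by zero.
        raise ValueError("1 - ITF^2 is too close to zero at this frequency")
    e_left = (p_left - p_right * itf) / (h_same * denominator)
    e_right = (p_right - p_left * itf) / (h_same * denominator)
    return e_left, e_right

def playback_pressures(e_left, e_right, h_same, itf):
    """Eqn (76) with H_left-left = H_right-right = h_same and
    H_right-left = H_left-right = itf * h_same (symmetry assumption)."""
    h_cross = itf * h_same
    return (h_same * e_left + h_cross * e_right,
            h_same * e_right + h_cross * e_left)

# Hypothetical single-frequency values.
recorded = (1 + 2j, 0.5 - 1j)
h_same, itf = 0.8 + 0.1j, 0.3 - 0.2j

voltages = loudspeaker_voltages(recorded[0], recorded[1], h_same, itf)
reproduced = playback_pressures(voltages[0], voltages[1], h_same, itf)
```

With these values `reproduced` matches `recorded` to rounding error, that is the crosstalk terms cancel and each ear receives only its own channel.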
Examples of transfer functions, measured on a Neumann KU80 artificial
head, are given in Fig. 11, and the corresponding interaural impulse response
Fig. 12. Interaural impulse response. Neumann KU80, $\phi$ = 45°, $\theta$ = 0°. [Graphic not recoverable; horizontal axis: time (ms).]
in Fig. 12.63 It is seen that the impulse response is only a few milliseconds
long, and a finite impulse response filter is a possible means of
implementation.
A crosstalk cancellation system has been in use at our laboratory for some
years, and commercial systems are also available. Basically, they only work
in a free field, which means that an anechoic room is needed. The position of
the listener's head is also somewhat critical. Systems that account properly
for reflections are still at the experimental stage.
7 COMPUTER SYNTHESIS OF BINAURAL SIGNALS
In the previous sections it has been assumed that the binaural signals
originate in a recording of an acoustical event that takes place in real life. The
binaural technique is used to transmit the event to listeners who are not
present at the location, or to store it for later reproduction.
It is also possible to synthesize binaural signals on a digital computer and
thus simulate that a sound has been played in a room and has been recorded
by an artificial head. The room need not even exist in real life. Inputs to this
kind of simulation are:
(1) Information on the sound transmission from source to listening
point in the room.
(2) Information on the head-related transfer functions of the head being
simulated.
(3) A recording of a 'dry' sound source, for example made in an anechoic
room.
The output is the two channels of a binaural signal. How this is created will
be described in principle in the following.
$b_{\text{left},r,\phi,\theta}(t)$    (85)
$b_{\text{right},r,\phi,\theta}(t)$
The contribution of each of the transmission paths to the transmission from
the simulated sound source to the simulated recording points is found by
convolution
convolution of the recorded dry sound signal s(t) (voltage) with the binaural
room impulse responses.
$e_{\text{left}}(t) = h_{\text{left}}(t) * s(t)$    (88)
$e_{\text{right}}(t) = h_{\text{right}}(t) * s(t)$
If the room exists in real life, the binaural room impulse response in eqn (87)
can be found by measurement rather than by calculation.
For implementation in a digital system, impulse responses a(t), b(t) and
h(t) should be replaced by sequences a(n), b(n), h(n) that are samples from the
impulse responses, and the signal s(t) should be replaced by s(n) that are
samples from the signal.
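The sampled form of this procedure can be sketched in a few lines of Python. The sketch below is a minimal illustration, not the implementation used by the author. It assumes that the binaural room impulse response for one ear is the sum, over all transmission paths, of the path response a(n) convolved with the head-related impulse response b(n) for that path's direction. That summation is an assumption of this example, and the function names and toy sequences are hypothetical. The final step is eqn (88) applied to sequences.

```python
def convolve(x, y):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def binaural_room_response(paths):
    """Sum the per-path contributions a(n) * b(n) for one ear.

    `paths` is a list of (a, b) pairs: a is the sampled room transmission
    of one path, b the sampled head-related impulse response for that
    path's direction. The summation over paths is ASSUMED here.
    """
    length = max(len(a) + len(b) - 1 for a, b in paths)
    h = [0.0] * length
    for a, b in paths:
        for n, v in enumerate(convolve(a, b)):
            h[n] += v
    return h


def synthesize(s, h_left, h_right):
    """Eqn (88) on sampled data: e(n) = h(n) * s(n) for each ear."""
    return convolve(h_left, s), convolve(h_right, s)


# Toy data (illustrative only): a direct path and one delayed reflection.
paths_left = [([1.0], [1.0, 0.5]), ([0.0, 0.0, 0.5], [1.0])]
paths_right = [([1.0], [0.5, 0.25]), ([0.0, 0.0, 0.5], [0.5])]
s = [1.0, -1.0]  # stand-in for the 'dry' recording

h_left = binaural_room_response(paths_left)
h_right = binaural_room_response(paths_right)
e_left, e_right = synthesize(s, h_left, h_right)
print(e_left, e_right)
```

For realistic lengths (room responses of seconds, signals of minutes) the nested loops would be replaced by an FFT-based convolution. The direct form is kept here only because it mirrors the equations.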
8 APPLICATIONS OF BINAURAL TECHNOLOGY
Let us assume that the equalizing, transmission, etc., have been done
properly, and that the artificial recording head is a true copy of the listener's
own head. Then it is characteristic of the binaural recording technique that
the sound presented to the listener's eardrums is exactly the same as would
have been found, if the listener had been present during the recording. This
makes the binaural recording technique superior to other recording
techniques. It gives the most valid representation of the original sound, not
only with respect to timbre, but also in relation to spatial aspects.
The applications are divided into (1) applications that involve recording
with an artificial head, (2) applications that use computer-simulated binaural
signals and (3) others.
Noise evaluation
For noise evaluation it is now well known that A-weighted sound levels or
other 'objective' measures are often insufficient to characterize a noise.
Subjective assessments have become more common, although they are
difficult to carry out. Normally a number of subjects must listen to many
put the microphone outside the plug. The position of the microphone should
be so close to the entrance of the ear canal that the sound propagation from
that point is one-dimensional. The electronics and the earphone can be
included in the plug. Such a device could be denoted a binaural hearing aid
(also if one ear is normal and only one hearing aid is used).
It is not clear whether a binaural hearing aid will improve directional
hearing in the case of a non-linear, neural hearing loss with recruitment.
9 CONCLUDING REMARKS
The present article has described the basic theory of binaural recording and
playback as well as the theory of computer generation of binaural signals.
Also a number of current and prospective applications have been described.
It has been shown that any point in the ear canal (and possibly even a
point a few millimeters outside) can be used for recording, since sound
pressure here includes the full spatial information given to the ear. It is also
acceptable to record outside a blocked ear canal.
It is shown that correct overall transmission in a binaural system can be
guaranteed if an electronic equalizing filter is introduced between the
recording head and the headphone. The filter should equalize the
characteristic of the recording microphone and the headphone transfer
function measured at the point in the ear canal, where the recording is made.
The equalizing filter should include extra terms, if recording is made
outside a blocked ear canal, or if the ear canal of the head used for recording
and calibration is different from that of the listener. The extra terms are not
required, when an open headphone is used for reproduction.
The description has concentrated on 'how it should work'. It should not be
forgotten, though, that at present the binaural technique still suffers from
some problems. Most obvious is the poor frontal localization, but
problems concerning standardization of recording points, equalization,
compatibility, interface to room simulations, significance of individual
variations and other things also still deserve some attention.
However, binaural technology has greatly improved since it was invented.
ACKNOWLEDGEMENTS
The author wants to thank his colleagues for their advice and encouragement.
Of special value were a number of discussions with Dorte
Hammershøi, her help in the verification of the models and her fruitful
criticism during the preparation of this article.
REFERENCES
1. Firestone, F. A., The phase difference and amplitude ratio at the ears due to a
source of pure tones. J. Acoust. Soc. Amer., 2 (1930) 260-70.
2. Eichhorst, O., Zur Frühgeschichte der stereophonischen Übertragung.
Frequenz, 13 (1959) 273-7.
3. Nordlund, B. & Liden, G., An artificial head. Acta Oto-laryngol., 56 (1963)
493-9.
4. Schirmer, W., Die Veränderung der Wahrnehmbarkeitsschwelle eines künstlichen
Rückwurfes bei kopfbezüglicher stereophoner Übertragung. Hochfrequenztech.
u. Elektroakustik, 75 (1966) 115-23.
5. Schirmer, W., Die Unterscheidbarkeit von Hörerplätzen mittels kopfbezüglicher
stereophoner und monophoner Übertragung. Hochfrequenztech. u.
Elektroakustik, 75 (1966) 181-4.
6. Torick, E. L., di Mattia, A., Rosenheck, A. J., Abbagnaro, L. A. & Bauer, B. B.,
An electric dummy for acoustical testing. J. Audio Engng Soc., 16(4) (1968)
397-403.
7. Damaske, P. & Wagener, B., Richtungshörversuche über einen nachgebildeten
Kopf. Acustica, 21 (1969) 30-5.
8. Kürer, R., Plenge, G. & Wilkens, H., Correct spatial sound perception rendered
by a special 2-channel recording method. Paper presented at Audio Engng Soc.,
New York, NY, 1969.
9. Wilkens, H., Subjektive Ermittlung der Richtcharakteristik des Kopfes und
einer kopfbezogenen Aufnahme- und Wiedergabeanordnung. In
Gemeinschaftstagung für Akustik und Schwingungstechnik, Berlin 1970, ed. T. Tarnoczy.
VDI-Verlag, Düsseldorf, Germany, 1971, pp. 407-10.
10. Wilkens, H., Beurteilung von Raumeindrücken verschiedener Hörerplätze
mittels kopfbezogener Stereophonie. In Proc., 7th Int. Congr. on Acoustics,
Budapest. Akademiai Kiado, Budapest, Hungary, 1971, 24 S 5.
11. Wilkens, H., Kopfbezügliche Stereophonie, ein Hilfsmittel für Vergleich und
Beurteilung verschiedener Raumeindrücke. Acustica, 26 (1972) 213-21.
12. Mellert, V., Construction of a dummy head after new measurements of the
threshold of hearing. J. Acoust. Soc. Amer., 51 (1972) 1359-61.
13. Blauert, J., Räumliches Hören. S. Hirzel Verlag, Stuttgart, Germany, 1974.
35. Hammershøi, D., Møller, H., Sørensen, M. F. & Larsen, K. A., Head-related
transfer functions: Measurements on 40 human subjects. Presented at the
92nd Convention of the Audio Engineering Society, Vienna, Austria, 1992.
36. Møller, H., Hammershøi, D., Hundebøll, J. V. & Jensen, C. B., Transfer
characteristics of headphones: Measurements on 40 human subjects.
Presented at the 92nd Convention of the Audio Engineering Society, Vienna,
Austria, 1992.
37. Hammershøi, D. & Møller, H., Artificial heads for free field recording; how well
do they simulate real heads? To be presented at the 14th International Congress
on Acoustics, Beijing, China, 1992.
38. Weber, R. & Mellert, V., Ein Kunstkopf mit ebenem Frequenzgang. In
Fortschritte der Akustik, DAGA '78, ed. J. Blauert & H. Baule. VDE-Verlag,
Berlin, Germany, 1978, pp. 645-8.
39. Killion, M. C., Equalization filter for eardrum-pressure recording using a
KEMAR manikin. J. Audio Eng. Soc., 27(1-2) (1979) 13-16.
40. Tsujimoto, K., Equalization in artificial-head recording for loudspeaker
reproduction. In Proc. 10th Int. Congr. on Acoustics, Sydney. The Publications
Unit, Sydney, Australia, 1980.
41. Gotoh, T., Kimura, Y. & Sakamoto, N., A proposal of normalization for
binaural recording. 70th Convention, Audio Engng, New York, 1981, preprint
1811 (B-2).
42. (a) Theile, G., Zur Kompatibilität von Kunstkopfsignalen mit
intensitätsstereophonen Signalen bei Lautsprecherwiedergabe: Die Richtungsabbildung.
Rundfunktech. Mitt., 25 (1981) 67-73. (b) Theile, G., Zur Kompatibilität von
Kunstkopfsignalen mit intensitätsstereophonen Signalen bei
Lautsprecherwiedergabe: Die Klangfarbe. Rundfunktech. Mitt., 25 (1981) 146-54.
43. Theile, G., Zur Theorie der optimalen Wiedergabe von stereophonen Signalen
über Lautsprecher und Kopfhörer. Rundfunktech. Mitt., 25 (1981) 155-69.
44. Russotti, J. S., Santoro, T. P. & Haskell, G. B., Proposed technique for earphone
calibration. J. Audio Eng. Soc., 36(9) (1985) 643-50.
45. Gierlich, H. W. & Genuit, K., Processing artificial-head recordings. Features,
AES, 37(1/2) (1989) 34-9.
46. Butler, R. A. & Belendiuk, K., Spectral cues in the localization of sound in the
median sagittal plane. J. Acoust. Soc. Amer., 61 (1977) 1264-9.
47. Morimoto, M. & Ando, Y., Localization in the median plane of sound sources
simulated by a digital computer. In Proc. 8th Int. Congr. on Acoustics, ed. A.
Lara-Saenz, Vol. 1. Sociedad Española de Acústica, Madrid, Spain, 1977, p. 371.
48. Morimoto, M. & Ando, Y., Simulation of sound localization. Localization of
sound: Theory and application, ed. R. W. Gatehouse. The Amphora Press,
Groton, CT, 1982, 85-9.
49. Boerger, G., Laws, P. & Blauert, J., Stereophone Kopfhörerwiedergabe mit
Steuerung bestimmter Übertragungsfaktoren durch Kopfdrehbewegungen.
Acustica, 39 (1977) 22-6.
50. Blauert, J., Untersuchungen zum Richtungshören in der Medianebene bei
fixiertem Kopf. Doctoral Dissertation, Technische Hochschule, Aachen,
Germany, 1969.
51. Nielsen, S. H., Distance perception in hearing. Aalborg University Press, ISBN:
87 7307 447-0, 1991.
52. Bauer, B. B., Stereophonic earphones and binaural loudspeakers. J. Audio
Engng Soc., 9(2) (1961) 148-51.
APPENDIX
where
\[
\rho_L = \frac{Z_{\mathrm{eardrum}} - Z_0}{Z_{\mathrm{eardrum}} + Z_0} \tag{A2}
\]
The pressure division at the entrance to the ear canal can be found (eqn (3))
as
\[
\frac{P_3}{P_2} = \frac{Z_{\mathrm{ear\ canal}}}{Z_{\mathrm{ear\ canal}} + Z_{\mathrm{radiation}}} \tag{A3}
\]
The following are inserted (the values serve as an example and not as an
attempt to suggest real values):
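\[
\begin{aligned}
Z_{\mathrm{radiation}} &= Z_0 \\
Z_{\mathrm{eardrum}} &= 5 \cdot Z_0 \\
\gamma &= j \cdot \frac{2\pi f}{c} \\
c &= 340\ \mathrm{m/s} \\
l &= 25 \cdot 10^{-3}\ \mathrm{m}
\end{aligned}
\tag{A4}
\]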
The result of a calculation of eqn (A4) for a wide frequency range is shown
in Fig. A1.
In this case, a 2/3 reflection is expected at the eardrum, while the reflected
wave, as a consequence of the impedance match with the radiation
impedance, is not reflected again at the entrance. This is exactly what is
seen if the transfer function of Fig. A1 is inverse Fourier transformed (see
Fig. A2).
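The expected size of the reflection and its timing can be checked by hand from the example values. The short calculation below is a sketch under two assumptions that are not stated explicitly in the text at this point: that Z_0 is the characteristic impedance of the ear canal, and that the quantity in eqn (A2) is the pressure reflection coefficient at the eardrum. The symbol ρ_entrance and the round-trip time t_round trip are introduced here for illustration only.

```latex
% Reflection at the eardrum, eqn (A2) with Z_eardrum = 5 Z_0:
\rho_L = \frac{5 Z_0 - Z_0}{5 Z_0 + Z_0} = \frac{4}{6} = \frac{2}{3} \approx 0.67

% Reflection seen by the returning wave at the entrance, where the canal
% is terminated by Z_radiation = Z_0 (assumed matched):
\rho_{\mathrm{entrance}} = \frac{Z_0 - Z_0}{Z_0 + Z_0} = 0

% Round-trip travel time along the canal, with l = 25 mm and c = 340 m/s:
t_{\mathrm{round\ trip}} = \frac{2 l}{c}
  = \frac{2 \cdot 25 \cdot 10^{-3}\ \mathrm{m}}{340\ \mathrm{m/s}}
  \approx 0.147\ \mathrm{ms}
```

Under these assumptions a single echo of about two thirds of the incident amplitude would be expected roughly 0.15 ms after the direct sound at the canal entrance, with no further echoes.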