ARTICULATORY SPACE CALIBRATION IN 3D ELECTRO-MAGNETIC
ARTICULOGRAPHY
An Ji¹, Michael T. Johnson¹, Jeffrey Berry²
¹ Electrical and Computer Engineering, Marquette University, Milwaukee, WI, USA
² Speech Pathology and Audiology, Marquette University, Milwaukee, WI, USA
ABSTRACT
This paper introduces a new method to calibrate data
collected using Electro-Magnetic Articulography (EMA)
into an appropriate articulatory space. A bite plate record
for a target subject is used to define the maxillary occlusal
and midsagittal planes, and then a single quaternion
rotation is derived to transform the dataset into the new
anatomically referenced space. The choice of specific
rotation solution is discussed relative to the corresponding
anatomical assumptions regarding the original sensor
placement and coordinate system. Data were collected
using the NDI Wave Speech Research System for one pilot
subject, and the calibration results and their consistency
throughout the calibration record were reviewed. The results
show that the rotation solution can accurately and
consistently transform all sensors' positions into an
articulatory space in which sensor movements and
orientations can be easily analyzed. This preliminary study
enables the investigation of articulatory kinematics and
their relationship to acoustics.
Index Terms: Electro-Magnetic Articulography,
articulatory space, quaternion representation
978-1-4799-1043-4/13/$31.00 ©2013 IEEE
ChinaSIP 2013
1. INTRODUCTION
A number of technologies have been used historically for
recording articulator movements. X-ray cinematography
[1-2] is effective, but the radiation to the subject's head is a
concern. Cine MRI can provide dynamic 3D measurements
of the vocal tract, but it is somewhat cumbersome and
expensive [3-4]. The ultrasound technique is able to capture
the surface of the tongue [5-6], but noise, echo artifacts and
refractions may affect the results. Electro-Magnetic
Articulography (EMA) sensing systems have seen growing
use in recent years for the investigation of articulatory
kinematics [7-9]. These systems are able to track both the
position and the orientation of the sensors, which are
available in 5 DOF (degree of freedom) versions,
corresponding to a planar orientation, and 6 DOF versions.
This enables detailed study of the relationship between
acoustic data and articulator movements. It is necessary to
calibrate all the raw data from the system's coordinate
space into a meaningful anatomical articulatory space [10].
Most datasets available now include the bite plate
calibration in their preprocessing stage [11-14]. However,
none of them provide a detailed description of this
processing sufficient for 3D EMA, or of the underlying
assumptions on which the calibration is based. In this
paper, a bite plate calibration algorithm based on a
quaternion rotation method is developed. This approach
allows for a single transformation of both the position and
orientation information into an articulatory reference
space.
The remainder of this paper is organized as follows:
Section 2 discusses the desired articulatory space. Sections
3 and 4 describe the method in detail, and Section 5 gives
results and evaluates the proposed algorithm, with
conclusions in Section 6.
2. ARTICULATORY SPACE
In the data set, a 6DOF sensor attached to the center of the
forehead defines the local coordinate system. All other
sensors are 5DOF, which are less bulky and allow the
articulators greater freedom of movement. The 6DOF sensor
is used as a reference sensor to track the head movement,
and the system implements head movement correction
internally so that data for all 5DOF sensors are referenced
to the 6DOF origin and orientation. The goal of calibration
is to define a robust articulatory space and find the optimal
translation and rotation to accomplish the coordinate
transformation from local to articulatory space. This target
articulatory space is based on each subject's anatomy, as
shown in Figure 1. The origin of the new coordinate system
is located at the central maxillary incisors [10]. When
maxillary and mandibular natural teeth are brought
together, a plane of contact can be defined between the
occlusal surfaces of the upper and lower teeth. This plane
is called the occlusal plane and it represents the x-z plane
in our coordinate system.
Figure 1: Target articulatory referenced coordinate
system. The origin is located at the central maxillary
incisors with the x and z axes lying in the occlusal plane
(highlighted orange). The y axis is perpendicular to that
plane along the midsagittal axis.
The negative x axis follows the midsagittal line of the
occlusal plane toward the back of the throat. The positive z
axis runs perpendicularly to the x axis on the occlusal plane
in the direction of the subject's left. The positive y axis is
perpendicular to the occlusal plane in the upward direction.
Thus the x-y plane defines the midsagittal plane and the x-z
plane defines the maxillary occlusal plane in the new
articulatory coordinate system.
A bite-plate calibration record is used for identifying
the new coordinate system. One 6DOF sensor is attached at
the center of the forehead (denoted REF) and two 5DOF
sensors are placed on the bite-plate, one at the maxillary
central incisors (OS) and one along the midsagittal plane at
the bisection between the back molars (MS), as shown in
Figure 2. To create the bite-plate, two pieces of
dental impression wax are softened in warm water and
molded onto a tongue depressor. This softened wax is then
placed into the subject's mouth and an impression of the
bite is taken. Immediately afterward (while the wax is still
soft), the experimenters measure the midpoint between the
back molars and create an indentation for the placement of
the MS sensor. The OS sensor is placed directly in front of
the central maxillary incisors. These sensors are pressed
into the wax until they are secured and the bite-plate is
replaced in the subject's mouth for the bite-plate recording.
Figure 2: Bite-plate with sensor positions
In normal recording of articulatory movements, the
bite-plate wax is removed and 5DOF sensors are placed
at the desired recording points. For this experiment a
simple configuration was used with two 5DOF sensors
placed on the mandibular incisor and second molar. The
6DOF sensor remained attached in the same forehead
position throughout. The translation and rotation
determined from the bite plate were applied to each data
record to map the original coordinate system to the
articulatory space system, creating a virtual origin point at
the central maxillary incisors and correctly referenced
occlusal and midsagittal planes.
The determination of the translation needed to move
the origin of the system is straightforward, since both the
REF sensor and the OS sensor are simultaneously present
in the bite plate record. However, the rotation needed to
create the desired articulatory space is not as
straightforward. In addition to the OS and MS sensors, one
additional physical assumption is needed to define the
midsagittal plane. There are a number of possibilities for
this, including for example: 1) assuming that the REF
sensor (along with the OS and MS sensors) is placed
precisely in the midsagittal plane, 2) assuming that the
REF sensor vertical orientation (y axis) is precisely along
the midsagittal plane, or 3) adding an additional sensor to
the bite-plate to define the occlusal plane, with the
midsagittal plane perpendicular to this.
Each of these assumptions could be a reasonable way
to define the midsagittal plane, if care is taken to minimize
physical error relative to the necessary sensor placement
and orientation. Each assumption also represents a
different potential source of calibration error. In this
experiment we assumed the first of these configurations,
and placed the REF sensor physically in the center of the
forehead to define the midsagittal plane.
3. PROPOSED METHOD
3.1. Quaternion rotation representation
Since the orientation data for the system are given in
quaternion format, the desired rotation will be
implemented via a quaternion rotation approach. In
computer visualization and animation, the quaternion
format is a commonly used method to represent rotation
and orientation [15]. A quaternion is a 4-D unit vector
$q = [q_0, q_x, q_y, q_z]$ satisfying the following equation:
\[ q_0^2 + q_x^2 + q_y^2 + q_z^2 = 1 \tag{1} \]
A quaternion rotation thus lies on the 4-D unit hypersphere.
The key application of quaternions to tracking jaw
movement in this paper lies in their use to represent
rotations. A unit-normalized quaternion can be used to
represent a rotation by an angle $\theta$ around a unit axis $v$:
\[ q = \left[ \cos\tfrac{\theta}{2},\; \sin\tfrac{\theta}{2}\, v_x,\; \sin\tfrac{\theta}{2}\, v_y,\; \sin\tfrac{\theta}{2}\, v_z \right] \tag{2} \]
where the vector part $\sin(\theta/2)\, v = [q_x, q_y, q_z]$ defines the
rotation axis direction, and the scalar part $\cos(\theta/2) = q_0$
defines the degree of rotation. To rotate a point whose
position is represented by the vector $p$ by an angle $\theta$ about
the axis $v$ to a new position $p_{final}$, we apply the quaternion
multiplication operation
\[ p_{final} = Q P Q^{*} \tag{3} \]
where $Q = [\cos(\theta/2), \sin(\theta/2)\, v]$, $P = [0, p]$, and $Q^{*}$ is the
conjugate of $Q$.
In the NDI EMA system used here, sensor orientations
are represented by a quaternion, which indicates the
amount of rotation a sensor has undergone relative to its
established base orientation in the coordinate space.
In the following section, we derive the unique
quaternion rotation needed to transform all position and
orientation data from local coordinates into
articulatory space coordinates.
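The rotation of equation (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function names are hypothetical, and quaternions are assumed to be stored as (q0, qx, qy, qz) tuples.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions stored as (q0, qx, qy, qz)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conjugate(q):
    """Conjugate Q* of a quaternion (the inverse rotation for a unit quaternion)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def axis_angle_to_quat(axis, theta):
    """Equation (2): quaternion for a rotation of theta radians about axis."""
    length = math.sqrt(sum(c * c for c in axis))
    s = math.sin(theta / 2.0) / length  # dividing by length normalizes the axis
    return (math.cos(theta / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate_point(q, p):
    """Equation (3): p_final = Q P Q*, with P = [0, p]."""
    pure = (0.0, p[0], p[1], p[2])
    rotated = quat_multiply(quat_multiply(q, pure), quat_conjugate(q))
    return rotated[1:]


# Rotating the x unit vector by 90 degrees about the z axis
# should give (approximately) the y unit vector.
q_demo = axis_angle_to_quat((0.0, 0.0, 1.0), math.pi / 2.0)
p_demo = rotate_point(q_demo, (1.0, 0.0, 0.0))
print(p_demo)
```

The same `rotate_point` call applies unchanged to every sensor position in a record once the calibration quaternion is known.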
3.2. Rotation axis and angle
In the target articulatory space, following translation, the
OS represents the origin, with the MS directly on the x axis
and the REF sensor at a point in the x-y plane (not
necessarily on the y axis, because the MS-OS-REF triangle
may not form a right angle). Since the distance from OS to
MS, the distance from OS to REF, and the MS-OS-REF
angle can all be directly computed, the exact new
coordinate locations for the MS and REF sensors can be
easily determined. The resulting necessary calibration
rotation is thus the rotation which will rotate the
MS-OS-REF triangle onto the new target coordinates. There is a
single unique rotation to accomplish this.
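The target coordinates described above follow directly from two distances and one angle. The sketch below is one possible way to compute them, under the assumptions that MS lies on the negative x axis (toward the back of the throat) and that REF has a positive y coordinate (above the occlusal plane). The function name and the sample positions are hypothetical.

```python
import math


def target_coordinates(os_pos, ms_pos, ref_pos):
    """Return (MS', REF') in articulatory space, with OS as the origin."""
    # Translate so that the OS sensor becomes the origin.
    ms = [m - o for m, o in zip(ms_pos, os_pos)]
    ref = [r - o for r, o in zip(ref_pos, os_pos)]
    d_ms = math.sqrt(sum(c * c for c in ms))
    d_ref = math.sqrt(sum(c * c for c in ref))
    # MS-OS-REF angle, measured at OS (clamped against rounding error).
    cos_angle = sum(a * b for a, b in zip(ms, ref)) / (d_ms * d_ref)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    # Assumption: MS sits on the negative x axis, REF in the upper x-y plane.
    ms_target = (-d_ms, 0.0, 0.0)
    ref_target = (-d_ref * math.cos(angle), d_ref * math.sin(angle), 0.0)
    return ms_target, ref_target


# Hypothetical mean sensor positions in local coordinates (mm).
os_mean = (5.0, 5.0, 5.0)
ms_mean = (5.0, 5.0, -25.0)
ref_mean = (5.0, 85.0, -5.0)
ms_target, ref_target = target_coordinates(os_mean, ms_mean, ref_mean)
print(ms_target, ref_target)
```

By construction the distances OS-MS and OS-REF and the angle between them are unchanged, which is what guarantees that a pure rotation onto the targets exists.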
The notation is defined as follows: $MS$ and $MS'$ represent
the MS location in local and articulatory
coordinates respectively, while $REF$ and $REF'$ are the
corresponding head reference sensor coordinates.
One approach to determining the necessary rotation is
to consider the rotation of the MS and REF points
separately. There are an infinite number of rotations that
will rotate the original $MS$ onto $MS'$, and also an infinite
number of rotations that will rotate the original $REF$
onto $REF'$, and the final solution is the unique intersection
of these two sets. Figure 3 illustrates how to describe the
set of rotation axes for the MS point. The rotation can be
visualized as rotation along the surface of a cone, with the
center of the cone as the rotation axis. All possible rotation
axes lie in the plane formed by the bisecting vector
$BS_{MS}$ and the cross-product vector $V_{MS}$.
Figure 3: 3D visualization for the rotation axis
By following the same steps we can find another plane
that includes all axes which rotate $REF$ to $REF'$ in the new
coordinates. The intersection of these two planes is a
unique line which is the single rotation axis we are looking
for. Equations (4), (5), (6), (7), (8) give the calculation method:
\[ V_{MS} = MS \times MS' \tag{4} \]
\[ V_{REF} = REF \times REF' \tag{5} \]
\[ BS_{MS} = \frac{MS + MS'}{2} \tag{6} \]
\[ BS_{REF} = \frac{REF + REF'}{2} \tag{7} \]
\[ Axis = (V_{MS} \times BS_{MS}) \times (V_{REF} \times BS_{REF}) \tag{8} \]
where $\times$ denotes the cross product operator.
After finding the rotation axis, it is necessary to
determine the correct rotation angle. Figure 4 illustrates
this computation, based on the visualization of the rotation
cone.
Figure 4: 3D visualization for the rotation angle
Equations (9), (10), (11) calculate the rotation angle $\theta$ and its
quaternion form:
\[ r = \lVert MS \rVert \sin\varphi \tag{9} \]
\[ \theta = 2 \sin^{-1}\!\left( \frac{\tfrac{1}{2} \lVert MS - MS' \rVert}{r} \right) \tag{10} \]
\[ q = \left[ \cos\tfrac{\theta}{2},\; \sin\tfrac{\theta}{2}\, Axis_x,\; \sin\tfrac{\theta}{2}\, Axis_y,\; \sin\tfrac{\theta}{2}\, Axis_z \right] \tag{11} \]
where $\varphi$ is the angle between $MS$ and the rotation axis, so that
$r$ is the radius of the circle on which $MS$ travels around the axis.
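Equations (4) to (11) can be traced step by step in code. The following Python sketch is illustrative only and makes three assumptions that are not stated in the text: the axis from equation (8) is normalized to unit length before use, the sign of the rotation is resolved by checking which direction actually maps the local points onto their targets (the arcsine in equation (10) returns only a magnitude), and degenerate configurations (for example a sensor lying on the rotation axis) are not handled. All function names are hypothetical.

```python
import math


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def quat_multiply(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def rotate(q, p):
    """Equation (3): Q P Q* for a unit quaternion q and a 3-D point p."""
    conj = (q[0], -q[1], -q[2], -q[3])
    return quat_multiply(quat_multiply(q, (0.0,) + tuple(p)), conj)[1:]


def axis_angle_quat(axis, theta):
    """Equation (11), for a unit-length axis."""
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * axis[0], s * axis[1], s * axis[2])


def calibration_quaternion(ms, ms_t, ref, ref_t):
    """Rotation taking local (ms, ref) onto the targets (ms_t, ref_t)."""
    v_ms = cross(ms, ms_t)                                   # equation (4)
    v_ref = cross(ref, ref_t)                                # equation (5)
    bs_ms = tuple(0.5 * c for c in add(ms, ms_t))            # equation (6)
    bs_ref = tuple(0.5 * c for c in add(ref, ref_t))         # equation (7)
    axis = cross(cross(v_ms, bs_ms), cross(v_ref, bs_ref))   # equation (8)
    length = norm(axis)
    axis = tuple(c / length for c in axis)  # assumption: unit-length axis
    cos_phi = dot(ms, axis) / norm(ms)
    phi = math.acos(max(-1.0, min(1.0, cos_phi)))  # angle between MS and axis
    r = norm(ms) * math.sin(phi)                             # equation (9)
    half_chord = 0.5 * norm(sub(ms, ms_t))
    theta = 2.0 * math.asin(min(1.0, half_chord / r))        # equation (10)
    q_pos = axis_angle_quat(axis, theta)                     # equation (11)
    q_neg = axis_angle_quat(axis, -theta)

    def error(q):
        return (norm(sub(rotate(q, ms), ms_t))
                + norm(sub(rotate(q, ref), ref_t)))

    # Assumption: keep whichever rotation direction reaches the targets.
    return q_pos if error(q_pos) <= error(q_neg) else q_neg


# Synthetic check: rotate the targets by the inverse of a known rotation to
# obtain "local" points, then recover a rotation that maps them back.
ms_t, ref_t = (-30.0, 0.0, 0.0), (-10.0, 80.0, 0.0)
n = math.sqrt(14.0)
q_true = axis_angle_quat((1.0 / n, 2.0 / n, 3.0 / n), 0.7)
q_inv = (q_true[0], -q_true[1], -q_true[2], -q_true[3])
ms, ref = rotate(q_inv, ms_t), rotate(q_inv, ref_t)
q_cal = calibration_quaternion(ms, ms_t, ref, ref_t)
print(rotate(q_cal, ms), rotate(q_cal, ref))
```

Testing both rotation directions is a convenience for this sketch; a fixed sign convention for the axis would serve equally well.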
4. EXPERIMENTAL METHOD
The NDI Wave Speech Research system [16] was used
with a pilot subject to create a bite-plate record, implement
the calibration method described above, and then track the
positions and orientations of jaw sensors. The EMA system
provided head-movement-corrected data in the local
coordinate space relative to the reference sensor on the
forehead. The system samples kinematic data at 400 Hz and
acoustic data at 44.1 kHz.
The rotation calibration was determined from the
mean positions of the OS, MS, and REF sensors in the bite
plate record, and the equations above were used to perform
the resulting rotation on the bite plate record data, as well
as on data from a simple jaw movement experiment.
The quaternion q from equation (11) can be applied using
equation (3) to any data record to transform it into the
appropriate articulatory space coordinate system.
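Applying the calibration to a record amounts to one translation and one quaternion rotation per frame. The sketch below shows one way this could be done and how per-axis means and standard deviations of the calibrated trace might be summarized. The frame values and the calibration quaternion are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def quat_multiply(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def rotate(q, p):
    """Equation (3): Q P Q* for a unit quaternion q and a 3-D point p."""
    conj = (q[0], -q[1], -q[2], -q[3])
    return quat_multiply(quat_multiply(q, (0.0,) + tuple(p)), conj)[1:]


def calibrate_record(frames, os_mean, q_cal):
    """Translate each frame so OS is the origin, then rotate it by q_cal."""
    calibrated = []
    for frame in frames:
        shifted = tuple(c - o for c, o in zip(frame, os_mean))
        calibrated.append(rotate(q_cal, shifted))
    return calibrated


def column_stats(frames):
    """Per-axis mean and sample standard deviation of a position trace."""
    columns = list(zip(*frames))
    means = tuple(statistics.mean(c) for c in columns)
    stds = tuple(statistics.stdev(c) for c in columns)
    return means, stds


# Hypothetical values, for illustration only (positions in mm).
os_mean = (1.0, 2.0, 3.0)
half = math.sqrt(0.5)
q_cal = (half, 0.0, 0.0, half)  # 90 degrees about the z axis
frames = [(2.0, 2.0, 3.0), (2.1, 2.0, 3.0), (1.9, 2.0, 3.0)]
calibrated = calibrate_record(frames, os_mean, q_cal)
means, stds = column_stats(calibrated)
print(means, stds)
```

The translation uses the mean OS position from the bite plate record, so the same `os_mean` and `q_cal` are reused for every later recording made with the REF sensor in place.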
5. RESULTS AND DISCUSSION
Figure 5 and Figure 6 show a trace of the bite plate record
data in both the original and transformed coordinate
spaces. All units are in millimeters. It can be seen from
these figures that the original MS-OS-REF plane is
correctly rotated into the x-y plane in the new articulatory
space. Table 1 shows the mean values and standard
deviations of MS and REF for this data set after
calibration, compared to the values determined
computationally. The mean value after calibration is the
same as the computed value, indicating that the derived
solution correctly transforms the space into the desired
coordinate system. The standard deviation gives an
indication of physical sensor movement during the
calibration process itself. Sources of this error include
changes in REF sensor position and/or orientation, changes
in OS or MS sensor position, tracking and computation
error in the internal head-correction algorithm, and error
margins of the EMA data collection system itself. The
standard deviation across the bite plate record is in the
range of 0.05-0.11 millimeters, suggesting relatively good
stability of the transformation to the articulatory-referenced
coordinate system. (Note, however, that there was no
attempt in this experiment to push the boundaries of these
errors, and that in fact the subject was relatively still during
the bite plate record.)
Table 1. MS and REF sensor positions in articulatory space (mm)
                          X           Y          Z
Derived MS value          -34.7538    0          0
MS mean                   -34.7538    0          0
MS standard deviation     0.0556      0.0491     0.0657
Derived REF value         -8.6217     77.9419    0
REF mean                  -8.6217     77.9419    0
REF standard deviation    0.1140      0.0656     0.0608
Figure 5: MS, OS and REF position in local coordinates
Figure 6: MS, OS and REF position after bite plate
calibration
To illustrate the use of the new physiologically referenced
coordinate space with both position and orientation data,
we present an example of jaw movement and angle
computation from a single sensor. The two-dimensional
rigid-body model [17-18] decomposes speech-related jaw
movements into three components: rotation about
the transverse axis located approximately through the
condyles, and horizontal and vertical translation of this
axis in the midsagittal plane. These components are easy to
calculate in the articulatory referenced coordinate
space. Figure 7 shows the motion path of the mandibular
sensors and the corresponding jaw rotation angle in the
midsagittal plane after calibration. One sensor is attached
to the mandibular incisor and the other to the buccal
surface of the second molar, with the subject performing a
simple repeating range-of-motion movement of the jaw.
Jaw angle can be determined directly from the incisor
sensor orientation, as shown in Figure 7.
Figure 7: Jaw movements in midsagittal plane after
calibration
6. CONCLUSION
The work presented here has introduced a calibration
approach for transformation of kinematic data into an
appropriate and stable articulatory coordinate space, and
demonstrated implementation of this approach using a
quaternion rotation method. The resulting algorithm is able
to transform both position and orientation data into the
desired coordinate system. This is a first step enabling
investigation of the relationship between articulator
kinematics and acoustics in the EMA framework.
7. ACKNOWLEDGEMENT
This research is supported by grant IIS-1141826 from the
National Science Foundation.
8. REFERENCES
[1] Munhall, K. G., Vatikiotis-Bateson, E., and Tohkura, Y.,
“X-ray film database for speech research,” J. Acoust. Soc. Am.,
pp. 1222-1224, 1998.
[2] Perkell, J. S., “Physiology of Speech Production: Results
and Implications of a Quantitative Cineradiographic Study,”
Research Monograph No. 53, The M.I.T. Press, Cambridge, MA,
1969.
[3] Masaki, S., Tiede, M. K., Honda, K., Shimade, Y., Fujimoto,
I., Nakamura, Y., and Ninomiya, N., “MRI-based speech
production study using a synchronized sampling method,”
Journal of the Acoustical Society of America, vol. 20, no. 5, pp.
375-397, 1999.
[4] Narayanan, S., Nayak, K., Lee, S., Sethy, A., and Byrd, D.,
“An approach to real-time magnetic resonance imaging for
speech production,” J. Acoust. Soc. Am., vol. 115, no. 4, pp.
1771-1776, 2004.
[5] Stone, M. L., Sonies, B. C., Shawker, T. H., Weiss, G., and
Nadel, L., “Analysis of real-time ultrasound images of tongue
configuration using a grid-digitizing system,” Journal of
Phonetics, vol. 11, no. 3, pp. 207-218, 1983.
[6] Kaburagi, T. and Honda, M., “An ultrasonic method for
monitoring tongue shape and the position of a fixed-point on the
tongue surface,” J. Acoust. Soc. Am., vol. 95, no. 4, pp.
2268-2270, 1994.
[7] Schwestka-Polly, R., Engelke, W., and Engelke, D., “The
importance of electromagnetic articulography in studying tongue
motor function in the framework of an orthodontic diagnosis,”
Fortschritte der Kieferorthopädie, 53: 3-10, 1992.
[8] Horn, H., Göz, G., Bacher, M., Müllauer, M., Kretschmer, I.,
and Axmann-Krcmar, D., “Reliability of electromagnetic
articulography recording during speaking sequences,” European
Journal of Orthodontics, 19: 647-655, 1997.
[9] Fuchs, S., Perrier, P., and Pompino-Marschall, B., “Speech
production and perception: Experimental analyses and models,”
ZAS Papers in Linguistics, 40, 2005.
[10] Westbury, J. R., “The significance and measurement of head
position during speech production experiments using the x-ray
microbeam system,” J. Acoust. Soc. Am., vol. 89, no. 4, pp.
1782-1791, 1991.
[11] Byrd, D., Browman, C. P., Goldstein, L., and Honorof, D.,
“Magnetometer and x-ray microbeam comparison,” in
Proceedings of the 14th International Congress of Phonetic
Sciences, edited by J. J. Ohala, Y. Hasegawa, M. Ohala, D.
Granville, and A. C. Bailey (American Institute of Physics, New
York), pp. 627-630, 1999.
[12] Krista, R., “The Effect of Palate Morphology on Consonant
Articulation in Healthy Speakers,” Master's thesis, Department of
Speech-Language Pathology, University of Toronto, 2011.
[13] Westbury, J. R., “X-Ray Microbeam Speech Production
Database User's Handbook Version 1.0,” Jun. 1994.
[14] Gracco, V. L. and Nye, P. W., “Magnetometry in speech
articulation research: some misadventures on the road to
enlightenment,” Forschungber. Institute Phonet. Sprachl. Kommun.
University München, 31: 91-104, 1993.
[15] Hart, J. C., Francis, G. K., and Kauffman, L. H.,
“Visualizing quaternion rotation,” ACM Transactions on
Graphics, vol. 13, no. 3, pp. 256-276, 1994.
[16] Berry, J., “Accuracy of the NDI Wave Speech Research
System,” Journal of Speech, Language, and Hearing Research,
vol. 54, pp. 1295-1301, 2011.
[17] Westbury, J. R., “Mandible and hyoid bone movements
during speech,” Journal of Speech and Hearing Research, 31, pp.
405-416, 1988.
[18] Edwards, J., “Mandibular rotation and translation during
speech,” Doctoral dissertation, City University of New York,
1985.