Papers by Benjamin Schultz
Background: Digital biomarkers continue to make headway in the clinic and clinical trials for neurological conditions. Speech is a domain with great promise. Objectives: Using Friedreich ataxia (FRDA) as an exemplar population, we aimed to align objective measures of speech with markers of disease severity, speech-related quality of life, and subjective judgements of speech using supervised machine learning techniques. Methods: 132 participants with a genetically confirmed diagnosis of FRDA were assessed using digital speech tests, disease severity scores (Friedreich Ataxia Rating Scale, FARS), and speech-related quality of life ratings over a 10-year period. Speech was analyzed perceptually by expert listeners for intelligibility (ability to be understood) and naturalness (deviance from the healthy norm), and acoustically across 344 features. Features were selected and presented to a random forest and a support vector machine classifier in a standard supervised learning setup designed to replicat...
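The classification setup described (feature selection feeding a random forest and a support vector machine) can be sketched with scikit-learn. Everything below except the dimensions (132 speakers, 344 acoustic features) is an illustrative assumption: the data are synthetic, and the selector and hyperparameters are not necessarily the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in: 132 speakers x 344 acoustic features, binary severity label
# driven by a handful of informative features.
X = rng.normal(size=(132, 344))
y = (X[:, :5].mean(axis=1) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)

for name, clf in [("random forest", RandomForestClassifier(n_estimators=200,
                                                           random_state=0)),
                  ("SVM", SVC(kernel="rbf"))]:
    model = make_pipeline(StandardScaler(),
                          SelectKBest(f_classif, k=20),  # keep 20 most discriminative
                          clf)
    model.fit(X_tr, y_tr)
    print(f"{name}: held-out accuracy = {model.score(X_te, y_te):.2f}")
```

A held-out split (or cross-validation) is essential here: with 344 features and 132 participants, selecting features on the full data before splitting would leak information and inflate accuracy.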
IEEE Transactions on Neural Systems and Rehabilitation Engineering
Neurodegenerative disease often affects speech. Speech acoustics can be used as objective clinical markers of pathology. Previous investigations of pathological speech have primarily compared controls with one specific condition and excluded comorbidities. We broaden the utility of speech markers by examining how multiple acoustic features can delineate diseases. We used supervised machine learning with gradient boosting
Cortex
Journal of the Acoustical Society of America, May 1, 2013
Behavior Research Methods, Nov 5, 2015
Timing abilities are often measured by having participants tap their finger along with a metronome and presenting tap-triggered auditory feedback. These experiments predominantly use electronic percussion pads combined with software (e.g., FTAP or Max/MSP) that records responses and delivers auditory feedback. However, these setups involve unknown latencies between tap onset and auditory feedback and can sometimes miss responses or record multiple, superfluous responses for a single tap. These issues may distort measurements of tapping performance or affect the performance of the individual. We present an alternative setup using an Arduino microcontroller that addresses these issues and delivers low-latency auditory feedback. We validated our setup by having participants (N = 6) tap on a force-sensitive resistor pad connected to the Arduino and on an electronic percussion pad at various levels of force and tempi. The Arduino delivered auditory feedback through a pulse-width modulation (PWM) pin connected to a headphone jack or a wave shield component. The Arduino's PWM (M = 0.6 ms, SD = 0.3) and wave shield (M = 2.6 ms, SD = 0.3) demonstrated significantly lower auditory feedback latencies than the percussion pad (M = 9.1 ms, SD = 2.0), FTAP (M = 14.6 ms, SD = 2.8), and Max/MSP (M = 15.8 ms, SD = 3.4). The PWM and wave shield latencies were also significantly less variable than those from FTAP and Max/MSP. The Arduino missed significantly fewer taps, and recorded fewer superfluous responses, than the percussion pad. The Arduino captured all responses, whereas at lower tapping forces, the percussion pad missed more taps. Regardless of tapping force, the Arduino outperformed the percussion pad. Overall, the Arduino is a high-precision, low-latency, portable, and affordable tool for auditory experiments.
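The latency figures above reduce to simple descriptive statistics over per-tap differences between tap onset and feedback onset. A minimal sketch with made-up timestamps (not the study's measurements):

```python
import statistics

def latency_stats(tap_onsets_ms, feedback_onsets_ms):
    """Mean and SD of tap-to-feedback latency across trials."""
    latencies = [fb - tap for tap, fb in zip(tap_onsets_ms, feedback_onsets_ms)]
    return statistics.mean(latencies), statistics.stdev(latencies)

# Hypothetical onset times (ms): each tap and the auditory feedback it triggered.
taps = [100.0, 600.0, 1100.0, 1600.0]
pwm_feedback = [100.5, 600.9, 1100.4, 1600.6]  # a low-latency feedback path

mean_ms, sd_ms = latency_stats(taps, pwm_feedback)
print(f"mean latency = {mean_ms:.2f} ms, SD = {sd_ms:.2f} ms")
```

In practice the hard part is obtaining trustworthy onset timestamps in the first place, which is exactly the problem the Arduino setup addresses.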
Timing & Time Perception, May 25, 2015
Interpersonal coordination during musical joint action (e.g., ensemble performance) requires individuals to anticipate and adapt to each other's action timing. Individuals differ in their ability to both anticipate and adapt; however, little is known about the relationship between these skills. The present study used paced finger tapping tasks to examine the relationship between anticipatory skill and adaptive (error correction) processes. Based on a computational model, it was hypothesized that temporal anticipation and adaptation would act together to facilitate synchronization accuracy and precision. Adaptive ability was measured as the degree of temporal error correction that participants (N = 52) engaged in when synchronizing with a 'virtual partner', that is, an auditory pacing signal that modulated its timing based on the participant's performance. Anticipation was measured through a prediction index that reflected the degree to which participants' inter-tap intervals led or lagged behind inter-onset intervals in tempo-changing sequences. A correlational analysis revealed a significant positive relationship between the prediction index and temporal error correction estimates, suggesting that anticipation and adaptation interact to facilitate synchronization performance. Hierarchical regression analyses revealed that adaptation was the best predictor of synchronization accuracy, whereas both adaptation and anticipation predicted synchronization precision. Together these results demonstrate a relationship between anticipatory and adaptive mechanisms, and indicate that individual differences in these two abilities are predictive of synchronization performance.
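A prediction index of this kind can be illustrated as a lagged-correlation contrast between inter-tap intervals (ITIs) and inter-onset intervals (IOIs). The exact formula below is an assumption for illustration, not necessarily the paper's computation:

```python
import numpy as np

# Sinusoidally tempo-modulated pacing sequence (inter-onset intervals, ms).
iois = 500 + 50 * np.sin(np.linspace(0, 4 * np.pi, 40))

def prediction_index(itis, iois):
    """Positive when ITIs track *upcoming* IOIs (anticipation) more strongly
    than *preceding* IOIs (reactive tracking)."""
    r_next = np.corrcoef(itis[:-1], iois[1:])[0, 1]   # ITI vs upcoming IOI
    r_prev = np.corrcoef(itis[1:], iois[:-1])[0, 1]   # ITI vs preceding IOI
    return r_next - r_prev

# An idealized anticipator: each inter-tap interval matches the *next* IOI.
anticipator = np.append(iois[1:], iois[-1])
# A purely reactive tracker: each inter-tap interval copies the *previous* IOI.
tracker = np.append(iois[0], iois[:-1])

print(prediction_index(anticipator, iois) > 0)  # leads the tempo changes
print(prediction_index(tracker, iois) < 0)      # lags behind them
```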
Quarterly Journal of Experimental Psychology, Feb 1, 2013
Implicit learning (IL) occurs unintentionally. IL of temporal patterns has received minimal attention, and results are mixed regarding whether IL of temporal patterns occurs in the absence of a concurrent ordinal pattern. Two experiments examined the IL of temporal patterns and the conditions under which IL is exhibited. Experiment 1 examined whether uncertainty of the upcoming stimulus identity obscures learning. Based on probabilistic uncertainty, it was hypothesized that stimulus-detection tasks are more sensitive to temporal learning than multiple-alternative forced-choice tasks because of response uncertainty in the latter. Results demonstrated IL of metrical patterns in the stimulus-detection but not the multiple-alternative task. Experiment 2 investigated whether properties of rhythm (i.e., meter) benefit IL using the stimulus-detection task. The metric binding hypothesis states that metrical frameworks guide attention to periodic points in time. Based on the metric binding hypothesis, it was hypothesized that metrical patterns are learned faster than nonmetrical patterns. Results demonstrated learning of metrical and nonmetrical patterns, but metrical patterns were not learned more readily than nonmetrical patterns. However, abstraction of a metrical framework was still evident in the metrical condition. The present study shows IL of auditory temporal patterns in the absence of an ordinal pattern.
Applied Psycholinguistics, Nov 25, 2015
When speakers engage in conversation, acoustic features of their utterances sometimes converge. We examined how the speech rate of participants changed when a confederate spoke at fast or slow rates during readings of scripted dialogues. A beat-tracking algorithm extracted the periodic relations between stressed syllables (beats) from acoustic recordings. The mean interbeat interval (IBI) between successive stressed syllables was compared across speech rates. Participants' IBIs were smaller in the fast condition than in the slow condition; the difference between participants' and the confederate's IBIs decreased across utterances. Cross-correlational analyses demonstrated mutual influences between speakers, with a greater impact of the confederate on participants' beat rates than vice versa. Beat rates converged in scripted conversations, suggesting speakers mutually entrain to one another's beat.
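The IBI measure and the convergence analysis can be sketched as follows; the beat-onset series are invented stand-ins, not data from the study:

```python
import numpy as np

def mean_ibi(beat_times_s):
    """Mean inter-beat interval between successive stressed-syllable onsets."""
    return float(np.mean(np.diff(beat_times_s)))

# Invented beat onsets (s): the confederate holds a fast, steady rate while the
# participant starts slower and converges toward it across the dialogue.
confederate = np.cumsum(np.full(20, 0.40))
participant = np.cumsum(np.linspace(0.55, 0.42, 20))

print(f"confederate mean IBI: {mean_ibi(confederate):.3f} s")
print(f"participant mean IBI: {mean_ibi(participant):.3f} s")

# Convergence: the participant-confederate IBI gap shrinks over the dialogue.
gap = np.abs(np.diff(participant) - np.diff(confederate))
print(f"early gap {gap[:5].mean():.3f} s -> late gap {gap[-5:].mean():.3f} s")
```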
Consciousness and Cognition, Nov 1, 2016
Philosophers have proposed that when people coordinate their actions with others they may experience a sense of joint agency, or shared control over actions and their effects. However, little empirical work has investigated the sense of joint agency. In the current study, pairs coordinated their actions to produce tone sequences and then rated their sense of joint agency on a scale ranging from shared to independent control. People felt more shared than independent control overall, confirming that people experience joint agency during joint action. Furthermore, people felt stronger joint agency when they a) produced sequences that required mutual coordination compared to sequences in which only one partner had to coordinate with the other, b) held the role of follower compared to leader, and c) were better coordinated with their partner. Thus, the strength of joint agency is influenced by the degree to which people mutually coordinate with each other's actions.
Music & Science, 2021
The main goal of this study was to test the hypothesis that disorders in entrainment to the beat of music originate from motor deficits. To this aim, we adapted the Beat Alignment Test and tested a large pool of control subjects, as well as nine individuals who had previously shown deficits in synchronization to the beat of music. The tasks consisted of tapping (Experiment 1) and bouncing (Experiment 2) in synchrony with the beat of non-classical music that varied in genre, tempo, and groove, and then judging whether a superimposed metronome was perceived as on or off the beat of the same selection of music. Results indicate concomitant deficits in both beat synchronization and the detection of misalignment with the beat, supporting the hypothesis that the motor system is implicated in beat perception.
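Synchronization in such tapping and bouncing tasks is commonly summarized with circular statistics. A sketch using the standard mean-resultant-vector length (the paper's specific measure is not stated here, so treat this as illustrative):

```python
import numpy as np

def synchronization_consistency(tap_times, beat_times):
    """Length of the mean resultant vector of tap phases relative to the beat:
    1 = perfectly consistent timing, 0 = taps unrelated to the beat."""
    ibi = np.mean(np.diff(beat_times))
    # Offset of each tap from its nearest beat, expressed as a phase angle.
    nearest = beat_times[np.argmin(np.abs(tap_times[:, None] - beat_times), axis=1)]
    phases = 2 * np.pi * (tap_times - nearest) / ibi
    return float(np.abs(np.mean(np.exp(1j * phases))))

beats = np.arange(0, 20, 0.5)  # 120 BPM beat track
rng = np.random.default_rng(0)
good = beats + rng.normal(0.0, 0.01, beats.size)       # tight synchronizer
poor = beats + rng.uniform(-0.25, 0.25, beats.size)    # taps scattered over the cycle

print(round(synchronization_consistency(good, beats), 2))
print(round(synchronization_consistency(poor, beats), 2))
```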
During multimodal speech perception, slow delta oscillations (~1–3 Hz) in the listener's brain synchronize with the speech signal, likely reflecting signal decomposition in the service of comprehension. In particular, fluctuations imposed onto the speech amplitude envelope by a speaker's prosody seem to temporally align with articulatory and body gestures, thus providing two complementary sensations of the speech signal's temporal structure. Further, endogenous delta oscillations in the left motor cortex align with speech and music beat, suggesting a role in the temporal integration of (quasi-)rhythmic stimulation. We propose that delta activity facilitates the temporal alignment of a listener's oscillatory activity with the prosodic fluctuations in a speaker's speech during multimodal speech perception. We recorded EEG responses in an audiovisual synchrony detection task while participants watched videos of a speaker. To test the temporal alignment of visual and auditory pro...
Wearing face masks (alongside physical distancing) provides some protection against infection from COVID-19. Face masks can also change how we communicate and subsequently affect speech signal quality. Here we investigated how three face mask types (N95, surgical, and cloth) affect acoustic analysis of speech and perceived intelligibility in healthy subjects. We compared speech produced with and without the different masks on acoustic measures of timing, frequency, perturbation, and power spectral density. Speech clarity was also examined using a standardized intelligibility tool by blinded raters. Mask type impacted the power distribution in frequencies above 3 kHz for both the N95 and surgical masks. Measures of timing and spectral tilt also differed across mask conditions. Cepstral and harmonics-to-noise ratios remained flat across mask types. No differences were observed across conditions for word or sentence intelligibility measures. Our data show that face masks change the...
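The power-spectral-density comparison can be sketched with SciPy's Welch estimator. The "mask" below is crudely modeled as a low-pass filter at 3 kHz applied to white noise, purely for illustration; it is not the study's acoustic model:

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

fs = 16_000
rng = np.random.default_rng(0)
speech = rng.normal(size=2 * fs)  # white-noise stand-in for an unmasked recording

# Crude mask model: attenuate energy above ~3 kHz (the region the abstract
# reports as affected for N95 and surgical masks).
b, a = butter(4, 3000, btype="low", fs=fs)
masked = filtfilt(b, a, speech)

def high_band_power_db(x, fs, cutoff_hz=3000):
    """Mean Welch PSD above cutoff_hz, in dB."""
    freqs, psd = welch(x, fs=fs, nperseg=1024)
    band = freqs >= cutoff_hz
    return 10 * np.log10(psd[band].mean())

drop = high_band_power_db(speech, fs) - high_band_power_db(masked, fs)
print(f"simulated mask reduces power above 3 kHz by {drop:.1f} dB")
```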
Proceedings of the Royal Society B: Biological Sciences, 2019
Most human communication is carried by modulations of the voice. However, a wide range of cultures has developed alternative forms of communication that make use of a whistled sound source. For example, whistling is used as a highly salient signal for capturing attention, and can have iconic cultural meanings such as the catcall, enact a formal code as in boatswain's calls, or stand as a proxy for speech in whistled languages. We used real-time magnetic resonance imaging to examine the muscular control of whistling, describing a strong association between the shape of the tongue and the whistled frequency. This bioacoustic profile parallels the use of the tongue in vowel production, which is consistent with the role of whistled languages as proxies for spoken languages, in which one of the acoustical features of speech sounds is substituted with a frequency-modulated whistle. Furthermore, previous evidence that non-human apes may be capable of learning to whistle from humans sugge...
The Journal of Neuroscience, 2022
During multisensory speech perception, slow δ oscillations (∼1–3 Hz) in the listener's brain synchronize with the speech signal, likely engaging in speech signal decomposition. Notable fluctuations in the speech amplitude envelope, resounding speaker prosody, temporally align with articulatory and body gestures, and both provide complementary sensations that temporally structure speech. Further, δ oscillations in the left motor cortex seem to align with speech and musical beats, suggesting their possible role in the temporal structuring of (quasi-)rhythmic stimulation. We extended the role of δ oscillations to audiovisual asynchrony detection as a test case of the temporal analysis of multisensory prosody fluctuations in speech. We recorded electroencephalography (EEG) responses in an audiovisual asynchrony detection task while participants watched videos of a speaker. We filtered the speech signal to remove verbal content and examined how visual and auditory prosodic features tem...
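Delta-band phase alignment of this kind is typically quantified by band-pass filtering to ~1–3 Hz and computing a phase-locking value. A self-contained sketch on simulated data (not the study's EEG; the 2 Hz "prosody" component is an assumption for illustration):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def delta_phase(signal, fs, lo=1.0, hi=3.0):
    """Instantaneous phase of the delta-band (1-3 Hz) component of a signal."""
    b, a = butter(2, [lo, hi], btype="band", fs=fs)
    return np.angle(hilbert(filtfilt(b, a, signal)))

fs = 250
t = np.arange(10 * fs) / fs
rng = np.random.default_rng(0)
# Simulated channel: a 2 Hz "prosody-locked" component buried in broadband noise.
eeg = np.sin(2 * np.pi * 2 * t) + rng.normal(0.0, 1.0, t.size)
envelope_phase = np.angle(np.exp(1j * 2 * np.pi * 2 * t))  # phase of a 2 Hz envelope

# Phase-locking value: 1 = perfect alignment, 0 = no consistent phase relation.
plv = np.abs(np.mean(np.exp(1j * (delta_phase(eeg, fs) - envelope_phase))))
print(f"phase-locking value: {plv:.2f}")
```

Zero-phase filtering (`filtfilt`) matters here: a causal filter would introduce a frequency-dependent phase lag that biases alignment estimates.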
The main goal of brain-computer interface (BCI) research is to provide communication capabilities for people with severe motor impairments who are unable to communicate conventionally. However, a major drawback for most BCIs is the fact that they make use of non-intuitive mental tasks such as motor imagery, mental arithmetic, or mental reaction to external stimuli to make a selection. These control schemes usually have no correlation with normal communication methods, making them difficult to perform by the target population. The goal of the work presented is to investigate the reliability of electroencephalography (EEG) signals in detecting inner speech (also known as covert speech or silent vocalization) against an unconstrained "no-control" state. Previous EEG-based inner speech studies have been limited to the silent vocalization of vowels or syllables rather than complete words [1]. To our knowledge, this study is the first report of using EEG measurements to detect covert articulation of a complete meaningful English word. Also, this is the first EEG study of inner speech performed over multiple sessions for each participant.
This data set was used to test the interoperability solutions explored in the paper "Enhancing the interoperability of time series data through FAIR principles". It contains time series data of audio recordings of musical instruments.
This code was used to test the interoperability solutions explored in the paper "Enhancing the interoperability of time series data through FAIR principles" (currently under review). It contains code to manipulate time series data of audio recordings of musical instruments.
Background: The medial geniculate body (MGB) of the thalamus plays a central role in tinnitus pathophysiology. Breakdown of sensory gating in this part of the auditory thalamus is a potential mechanism underlying tinnitus. The alleviation of tinnitus-like behavior by high-frequency stimulation (HFS) of the MGB might mitigate dysfunctional sensory gating. Objective: The study aims to explore the role of the MGB in sensory gating as a mandatory relay area in auditory processing in noise-exposed and control subjects, and to assess the effect of MGB HFS on this function. Methods: Noise-exposed rats and controls were tested. Continuous auditory sequences were presented to allow assessment of sensory gating effects associated with pitch, binary grouping, and temporal regularity. Evoked potentials (EP) were recorded from the MGB and acquired before and after HFS (100 Hz). Results: Noise-exposed rats showed differential modulation of MGB EP amplitudes, confirmed by significant main effects of stimulus type, pair position, and temporal regularity. Noise exposure selectively abolished the effect of temporal regularity on EP amplitudes. A significant three-way interaction between HFS phase, temporal regularity, and rat condition (noise-exposed, control) revealed that only noise-exposed rats showed significantly reduced EP amplitudes following MGB HFS. Conclusion: This is the first report showing thalamic filtering of incoming auditory signals based on different sound features. Noise-exposed rats further showed higher EP amplitudes in most conditions and did not differentiate the temporal regularity. Critically, MGB HFS was effective in reducing amplitudes of the EP responses in noise-exposed animals.
Highlights:
▪ EP findings indicate sensory gating in the MGB in rats.
▪ Noise exposure alters EP amplitudes in the MGB.
▪ HFS selectively suppresses EP responses in noise-exposed animals.
The "take the best" model of decision making proposes that people make decisions by sequentially searching amongst cues for one that best discriminates between the options being assessed. The search process starts with the best cue and proceeds in descending order of cue validity until one is found that differentiates between the options. It follows, therefore, that the more cues a person is required to use, the longer it will take to make a decision. This study explored the relationship between response time and the number of cues needed to answer a binary choice question correctly. Participants were asked a series of questions about mammals and their response times were recorded. Results support the hypothesis that response time increases as the number of cues required increases. This gives further evidence that a sequential search is occurring during binary-choice decision-making.
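The sequential search described above can be sketched directly; the mammal cues and their validity ordering below are hypothetical, chosen only to show how the number of inspected cues grows when early cues fail to discriminate:

```python
def take_the_best(option_a, option_b, cues):
    """Search cues in descending validity; decide on the first that discriminates.
    Returns the chosen option and how many cues were inspected."""
    for inspected, cue in enumerate(cues, start=1):
        a, b = cue(option_a), cue(option_b)
        if a != b:
            return (option_a if a > b else option_b), inspected
    return None, len(cues)  # no cue discriminates: guess

# Hypothetical binary cues, ordered by (assumed) validity.
mammals = {
    "elephant": {"larger": 1, "carnivore": 0, "domesticated": 0},
    "lion":     {"larger": 0, "carnivore": 1, "domesticated": 0},
    "cow":      {"larger": 0, "carnivore": 0, "domesticated": 1},
}
cues = [lambda m, k=k: mammals[m][k]
        for k in ("larger", "carnivore", "domesticated")]

print(take_the_best("elephant", "lion", cues))  # ('elephant', 1): first cue decides
print(take_the_best("lion", "cow", cues))       # ('lion', 2): needs two cues
```

The returned inspection count is exactly the quantity the study relates to response time: more cues inspected predicts a slower decision.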