Papers by Jonathan Driedger
Zenodo (CERN European Organization for Nuclear Research), Jun 7, 2022
Zenodo (CERN European Organization for Nuclear Research), Apr 17, 2022
The analysis of recorded audio sources has become increasingly important in ethnomusicological research. Such audio material may contain important cues on performance practice, information that is often lost in manually generated symbolic music transcriptions. As an application scenario, we consider in this paper a musically relevant audio collection consisting of three-voice polyphonic Georgian chants. As one main contribution, we introduce an interactive graphical user interface that provides various visual and acoustic control mechanisms for estimating fundamental frequency (F0) trajectories from complex sound mixtures. We then apply this interface to determine F0 trajectories of sung pitches in the Georgian chant recordings and indicate how such F0 annotations can serve as a basis for addressing important questions in Georgian music research.
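The core of such F0 estimation can be illustrated, in a heavily simplified monophonic setting, by frame-wise peak picking in the magnitude spectrum. This is only a toy sketch, not the interactive multi-voice tool the paper describes; the frame length, hop size, and the synthetic 220 Hz test tone are all illustrative assumptions.

```python
import numpy as np

def f0_trajectory(x, sr, frame_len=2048, hop=512):
    """Rough F0 trajectory via FFT peak picking per frame (monophonic toy).

    The estimate is quantized to the FFT bin grid (sr / frame_len Hz),
    so it is only accurate to within about half a bin.
    """
    freqs = np.fft.rfftfreq(frame_len, 1.0 / sr)
    traj = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * np.hanning(frame_len)
        mag = np.abs(np.fft.rfft(frame))
        traj.append(freqs[np.argmax(mag)])
    return np.array(traj)

# Synthetic one-second "voice" at 220 Hz (illustrative input).
sr = 22050
t = np.arange(sr) / sr
traj = f0_trajectory(np.sin(2 * np.pi * 220.0 * t), sr)
```

For real polyphonic chant recordings, one would instead compute a salience representation and track several trajectories jointly, which is exactly where interactive correction becomes necessary.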
Formalizing and verifying proofs in cryptography has become an important task. To this end, Backes et al. developed a framework [1] that uses the proof assistant Isabelle/HOL [9] to verify game-based proofs. In this framework, a powerful probabilistic language allows one to formalize games that describe security properties. To show that these properties hold, one can modify the games in ways that do not alter their outcome, until the games take the form of already known security properties. Such a modification of a game is called a transformation. Transformations are based on relations between games, which have to be verified. Verifying such relations, however, is often a challenging task. To construct game-based proofs more naturally, it is useful to have a collection of relations, and therefore transformations, verified up front. This thesis presents several game transformations, formalizes them, and proves their correctness.
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016
A common method to create beat annotations for music recordings is to let a human annotator tap along with them. However, this method is problematic due to the limited human ability to temporally align taps with audio cues for beats accurately. In order to create accurate beat annotations, it is therefore typically necessary to manually correct the recorded taps in a subsequent step, which is a cumbersome task. In this work we aim to automate this correction step by "snapping" the taps to close-by audio cues, a strategy that is often used by beat tracking algorithms to refine their beat estimates. The main contributions of this paper can be summarized as follows. First, we formalize the automated correction procedure mathematically. Second, we introduce a novel visualization method that serves as a tool to analyze the results of the correction procedure for potential errors. Third, we present a new dataset consisting of beat annotations for 101 music recordings. Fourth, we use this dataset to perform a listening experiment as well as a quantitative study to show the effectiveness of our snapping procedure.
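The snapping idea can be sketched as a nearest-neighbor assignment: each tap moves to the closest detected onset if one lies within a tolerance window, and stays put otherwise. This is a minimal stand-in for the paper's formalized procedure; the tolerance value and the onset candidates are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def snap_taps(taps, onsets, tol=0.1):
    """Snap each tap (in seconds) to the closest onset within tol seconds.

    Taps with no onset candidate inside the tolerance window are left
    unchanged, so obviously wrong snaps are avoided.
    """
    onsets = np.asarray(onsets, dtype=float)
    snapped = []
    for t in taps:
        dist = np.abs(onsets - t)
        i = int(np.argmin(dist))
        snapped.append(float(onsets[i]) if dist[i] <= tol else t)
    return snapped

# Two taps snap to nearby onsets; the third has no close cue and is kept.
result = snap_taps([0.48, 1.03, 2.5], [0.5, 1.0, 2.0])
```

In practice the onset candidates would come from an onset detection function computed on the recording, and the tolerance would reflect typical human tapping jitter.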
The task of novelty detection with the objective of detecting changes regarding musical properties such as harmony, dynamics, timbre, or tempo is of fundamental importance when analyzing structural properties of music recordings. But for a specific audio version of a given piece of music, the novelty detection result may also crucially depend on the individual performance style of the musician. This particularly holds true for tempo-related properties, which may vary significantly across different performances of the same piece of music. In this paper, we show that tempo-based novelty detection can be stabilized and improved by simultaneously analyzing a set of different performances. We first warp the version-dependent novelty curves onto a common musical time axis, and then combine the individual curves to produce a single fusion curve. Our hypothesis is that musically relevant points of novelty tend to be consistent across different performances. This hypothesis is supported by our experiments in the context of music structure analysis, where the cross-version fusion curves yield, on average, better results than the novelty curves obtained from individual recordings.
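The warp-and-combine step can be sketched as follows. Here the warping onto a common time axis is simplified to linear resampling (the paper uses music synchronization to align versions, which this sketch does not implement), and the fusion is a plain average of the aligned curves.

```python
import numpy as np

def fuse_novelty(curves, common_len=101):
    """Resample version-dependent novelty curves to a common time axis
    and average them into a single cross-version fusion curve.

    Linear resampling stands in for proper score-to-audio alignment.
    """
    grid = np.linspace(0.0, 1.0, common_len)
    warped = [np.interp(grid, np.linspace(0.0, 1.0, len(c)), c) for c in curves]
    return np.mean(warped, axis=0)

# Two performances of different lengths, each with a novelty peak at the
# musical midpoint; the fused curve keeps the peak at the shared position.
c1 = np.zeros(51);  c1[25] = 1.0
c2 = np.zeros(101); c2[50] = 1.0
fused = fuse_novelty([c1, c2], common_len=101)
```

Peaks that are consistent across versions reinforce each other in the average, while version-specific spurious peaks are attenuated, which is the intuition behind the fusion.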
Background music is often used to generate a specific atmosphere or to draw our attention to specific events. For example, in movies or computer games it is often the accompanying music that conveys the emotional state of a scene and plays an important role in immersing the viewer or player into the virtual environment. In view of home-made videos, slide shows, and other consumer-generated visual media streams, there is a need for computer-assisted tools that allow users to generate aesthetically appealing music tracks in an easy and intuitive way. In this contribution, we consider a data-driven scenario where the musical raw material is given in the form of a database containing a variety of audio recordings. Then, for a given visual media stream, the task consists in identifying, manipulating, overlaying, concatenating, and blending suitable music clips to generate a music stream that satisfies certain constraints imposed by the visual data stream and by user specifications. Our main goal is to give an overview of various content-based music processing and retrieval techniques that become important in data-driven soundtrack generation. In particular, we sketch a general pipeline that highlights how the various techniques act together and come into play when generating musically plausible transitions between subsequent music clips.
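One of the simplest blending operations in such a pipeline is a crossfade between subsequent clips. The sketch below shows a linear crossfade on raw sample arrays; it is an illustrative fragment, not the paper's pipeline, which additionally considers musically informed transition points.

```python
import numpy as np

def crossfade(a, b, fade_len):
    """Concatenate clips a and b with a linear crossfade of fade_len samples.

    The fade-out and fade-in gains sum to 1 at every overlap sample, so a
    constant signal passes through the transition unchanged.
    """
    fade_out = np.linspace(1.0, 0.0, fade_len)
    fade_in = 1.0 - fade_out
    overlap = a[-fade_len:] * fade_out + b[:fade_len] * fade_in
    return np.concatenate([a[:-fade_len], overlap, b[fade_len:]])

# Crossfading two constant clips keeps the level constant and shortens the
# total length by the overlap.
out = crossfade(np.ones(10), np.ones(10), 4)
```

Equal-power (square-root) fades are often preferred for uncorrelated audio; the linear fade here is chosen only for clarity.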
Electronic Music (EM) is a popular family of genres that has increasingly received attention as a research subject in the field of MIR. A fundamental structural unit in EM is the loop: an audio fragment whose length can span several seconds. The devices commonly used to produce EM, such as sequencers and digital audio workstations, impose a musical structure in which loops are repeatedly triggered and overlaid. This particular structure allows new perspectives on well-known MIR tasks. In this paper we first review a prototypical production technique for EM from which we derive a simplified model. We then use our model to illustrate approaches for the following task: given a set of loops that were used to produce a track, decompose the track by finding the points in time at which each loop was activated. To this end, we repurpose established MIR techniques such as fingerprinting and non-negative matrix factor deconvolution.
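The activation-finding task can be illustrated with a deliberately simple stand-in for the fingerprinting step: sliding a known loop over the track and reporting offsets where the normalized correlation is near 1. Real EM tracks overlay several loops at once, which is why the paper resorts to non-negative matrix factor deconvolution; this sketch only handles the non-overlapping case.

```python
import numpy as np

def loop_activations(track, loop, threshold=0.95):
    """Return sample offsets where `loop` appears in `track`, detected by
    normalized cross-correlation (cosine similarity) at every offset."""
    n, m = len(track), len(loop)
    loop_unit = loop / (np.linalg.norm(loop) + 1e-12)
    hits = []
    for k in range(n - m + 1):
        seg = track[k:k + m]
        score = np.dot(seg, loop_unit) / (np.linalg.norm(seg) + 1e-12)
        if score >= threshold:
            hits.append(k)
    return hits

# A toy track triggering the same loop twice (illustrative signals).
loop = np.array([1.0, -1.0, 1.0, -1.0])
track = np.array([0, 0, 1, -1, 1, -1, 0, 0, 1, -1, 1, -1], dtype=float)
hits = loop_activations(track, loop)
```

In a realistic setting one would correlate spectral fingerprints rather than raw samples, which makes the matching robust to gain changes and overlapping material.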
Music source separation aims at decomposing music recordings into their constituent component signals. Many existing techniques are based on separating a time-frequency representation of the mixture signal by applying suitable modeling techniques in conjunction with generalized Wiener filtering. Recently, the term α-Wiener filtering was coined, together with a theoretical foundation for the long-practiced use of magnitude spectrogram estimates in Wiener filtering. So far, optimal values for the magnitude exponent α have been found empirically in oracle experiments regarding the additivity of spectral magnitudes. In the first part of this paper, we extend these previous studies by examining further factors that affect the choice of α. In the second part, we investigate the role of α in Kernel Additive Modeling applied to Harmonic-Percussive Separation. Our results indicate that the parameter α may be understood as a kind of selectivity parameter, which should be chosen in a signal-adaptive fashion.
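The α-Wiener mask for source i is |S_i|^α / Σ_j |S_j|^α, applied to the mixture spectrogram; α = 2 recovers classic power-spectrogram Wiener filtering, while α = 1 uses raw magnitudes. A minimal sketch of the mask computation (the magnitude estimates here are synthetic placeholders, not model outputs):

```python
import numpy as np

def alpha_wiener_masks(mag_estimates, alpha=1.0, eps=1e-12):
    """Generalized (alpha-)Wiener masks from per-source magnitude estimates.

    Each returned mask is |S_i|^alpha / sum_j |S_j|^alpha; the masks are
    non-negative and sum to (almost exactly) one per time-frequency bin.
    """
    powered = [np.power(m, alpha) for m in mag_estimates]
    denom = np.sum(powered, axis=0) + eps
    return [p / denom for p in powered]

# Two synthetic magnitude estimates; with alpha=2 the weaker source gets
# a proportionally smaller share of each bin.
m1 = np.full((2, 3), 2.0)
m2 = np.full((2, 3), 6.0)
masks = alpha_wiener_masks([m1, m2], alpha=2.0)
```

Raising α sharpens the masks toward a binary decision (higher "selectivity"), which is the intuition behind reading α as a selectivity parameter.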
Harmonic-percussive-residual (HPR) sound separation is a useful preprocessing tool for applications such as pitched instrument transcription or rhythm extraction. Recent methods rely on the observation that in a spectrogram representation, harmonic sounds lead to horizontal structures and percussive sounds lead to vertical structures. Furthermore, these methods associate structures that are neither horizontal nor vertical (i.e., non-harmonic, non-percussive sounds) with a residual category. However, this assumption does not hold for signals like frequency-modulated tones that show fluctuating spectral structures while nevertheless carrying tonal information. Therefore, a strict classification into horizontal and vertical is inappropriate for these signals and might lead to leakage of tonal information into the residual component. In this work, we propose a novel method that instead uses the structure tensor (a mathematical tool known from image processing) to calculate predominant orientation angles in the magnitude spectrogram. We show how this orientation information can be used to distinguish between harmonic, percussive, and residual signal components, even in the case of frequency-modulated signals. Finally, we verify the effectiveness of our method by means of both objective evaluation measures and audio examples.
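The orientation computation can be sketched by treating the spectrogram as an image: build the 2x2 structure tensor from its gradients and take the angle of the dominant eigenvector. This sketch omits the local smoothing of the tensor entries that a full structure-tensor analysis applies, and its angle convention (pi/2 for horizontal/harmonic structures, 0 for vertical/percussive ones, with axis 0 as frequency and axis 1 as time) is an assumption of this toy version, not necessarily the paper's.

```python
import numpy as np

def orientation_angles(S):
    """Per-bin predominant orientation of spectrogram S via the structure
    tensor [[J11, J12], [J12, J22]] built from finite-difference gradients.

    Without neighborhood smoothing the tensor is rank-1, so the angle
    reduces to the local gradient direction; a real implementation would
    smooth J11, J12, J22 before computing the angle.
    """
    g_freq, g_time = np.gradient(S)          # axis 0: frequency, axis 1: time
    J11 = g_time * g_time
    J22 = g_freq * g_freq
    J12 = g_time * g_freq
    return 0.5 * np.arctan2(2.0 * J12, J11 - J22)

# Harmonic-like pattern: constant over time, varying over frequency.
harmonic = np.tile(np.sin(np.linspace(0, 3, 32))[:, None], (1, 32))
ang_h = orientation_angles(harmonic)      # ~pi/2 where the gradient is nonzero
ang_p = orientation_angles(harmonic.T)    # percussive-like: ~0
```

Thresholding these angles around 0 and pi/2 then yields harmonic and percussive masks, with intermediate orientations (e.g. frequency-modulated tones) assigned separately instead of leaking into the residual.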