In this paper, we introduce the foundations of the statistical wave field theory. This theory establishes the statistical laws of waves propagating in a bounded volume, that are mathematically implied by the boundary-value problem of the wave equation. These laws are derived from the Sturm-Liouville theory and the mathematical theory of dynamical billiards. They hold after many reflections on the boundary surface, and at high frequency. This is the first statistical theory of reverberation which provides the closed-form expression of the power distribution and the correlations of the wave field jointly over time, frequency and space inside the bounded volume, in terms of the geometry and the specific admittance of its boundary surface. The statistical wave field theory may find applications in various science fields, including room acoustics, electromagnetic theory, and nuclear physics.
In this paper, we focus on modeling multichannel audio signals in the short-time Fourier transform domain for the purpose of source separation. We propose a probabilistic model based on a class of heavy-tailed distributions, in which the observed mixtures and the latent sources are jointly modeled by using a certain class of multivariate alpha-stable distributions. As opposed to the conventional Gaussian models, where the observations are constrained to lie just within a few standard deviations from the mean, the proposed heavy-tailed model allows us to account for spurious data or important uncertainties in the model. We develop a Monte Carlo Expectation-Maximization algorithm for inferring the sources from the proposed model. We show that our approach leads to significant performance improvements in audio source separation under corrupted mixtures and in spatial audio object coding.
In this work, we propose a system for automatic music transcription which adapts dictionary templates so that they closely match the spectral shape of the instrument sources present in each recording. Current dictionary-based automatic transcription systems keep the input dictionary fixed, thus the spectral shape of the dictionary components might not match the shape of the test instrument sources. By performing a conservative transcription pre-processing step, the spectral shape of detected notes can be extracted and utilized in order to adapt the template dictionary. We propose two variants for adaptive transcription, namely for single-instrument transcription and for multiple-instrument transcription. Experiments are carried out using the MAPS and Bach10 databases. Results in terms of multi-pitch detection and instrument assignment show that there is a clear and consistent improvement when adapting the dictionary in contrast with keeping the dictionary fixed.
2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684), 2003
The Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) algorithm is a subspace-based analysis method used in source localization or frequency estimation, originally designed in a block signal processing context. Separately, the Projection Approximation Subspace Tracker (PAST) is a fast and robust subspace tracking method. This paper introduces a new frequency estimation and tracking algorithm, which relies on the PAST subspace tracker and a fast adaptive implementation of the ESPRIT algorithm.
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
In this paper we present a new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score. This information is used to initialize an algorithm which computes a parametric decomposition of the spectrogram based on non-negative matrix factorization (NMF). This algorithm provides time-frequency masks which are used to separate the sources with Wiener filtering.
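The masking step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each source already has a non-negative factorization of its power spectrogram (the matrices `W_list[k]` and `H_list[k]` are hypothetical inputs, and `wiener_separate` is an illustrative name, not the authors' code). The score-informed initialization and the parametric NMF itself are not reproduced here; only the generic Wiener-style soft mask is shown.

```python
import numpy as np

def wiener_separate(X, W_list, H_list, eps=1e-12):
    """Split a mixture STFT X (F x T, complex) into one estimate per source.

    W_list[k] (F x K_k) and H_list[k] (K_k x T) are assumed non-negative
    factors modeling the power spectrogram of source k.
    """
    # Model power spectrogram of each source
    V = [W @ H for W, H in zip(W_list, H_list)]
    # Total modeled power; eps avoids division by zero in silent bins
    total = sum(V) + eps
    # Soft time-frequency mask of source k is V[k] / total, applied to the mixture
    return [(Vk / total) * X for Vk in V]
```

Because the masks sum to one in every time-frequency bin (up to `eps`), the separated estimates add back up to the mixture.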
2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
This paper introduces a new algorithm for tracking the minor subspace of the correlation matrix associated with time series. This algorithm is shown to have a better convergence rate than existing methods. Moreover, it guarantees the orthonormality of the subspace weighting matrix at each iteration, and reaches a linear complexity.
2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
HRHATRAC combines the latest improvements in fast subspace tracking algorithms with a gradient update for adapting the signal pole estimates. It leads to a line spectral tracker which is able to robustly estimate the frequencies, even in a noisy context, when the lines are close to each other and when a modulation occurs. HRHATRAC is also successfully applied in this paper to a piano note recording.
Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
This paper introduces a new algorithm for tracking the major subspace of the correlation matrix associated with time series. This algorithm greatly outperforms many well-known subspace trackers in terms of subspace estimation. Moreover, it guarantees the orthonormality of the subspace weighting matrix at each iteration, and reaches the lowest complexity found in the literature.
2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011
In this paper, we present a new method for decomposing musical spectrograms. This method aims to transpose the shift-invariant decompositions used to decompose constant-Q spectrograms (with a logarithmic frequency resolution) to standard spectrograms obtained from short-time Fourier transforms (with a linear frequency resolution). This technique has the advantage of allowing a straightforward reconstruction of the latent signals by Wiener filtering, which can be used, for example, in source separation applications.
Seventh International Symposium on Signal Processing and Its Applications, 2003. Proceedings., 2003
This paper introduces a fast implementation of the power iterations method for subspace tracking, based on an approximation less restrictive than the well known projection approximation. This algorithm guarantees the orthonormality of the estimated subspace weighting matrix at each iteration, and satisfies a global and exponential convergence property. Moreover, it outperforms many subspace trackers related to the power method, such as PAST, NIC, NP3 and OPAST, while keeping the same computational complexity.
We address the issue of source separation in a particular informed configuration where both the sources and the mixtures are assumed to be known during a so-called encoding stage. This knowledge enables the computation of side information which ought to be small enough to be watermarked in the mixtures. At the decoding stage, the sources are no longer assumed to be known; only the mixtures and the side information are processed to perform source separation. The proposed method models the sources jointly using latent variables in a framework close to multichannel nonnegative matrix factorization and models the mixing process as linear filtering. Separation at the decoding stage is done using generalized Wiener filtering of the mixtures. An experimental setup shows that the method gives very satisfying results with mixtures composed of many sources. A study of its performance with respect to the number of latent variables is presented.
2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
We introduce a conjugate gradient method for estimating and tracking the minor eigenvector of a data correlation matrix. This new algorithm is less computationally demanding and converges faster than other methods derived from the conjugate gradient approach. It can also be applied in the context of minor subspace tracking, as a pre-processing step for the YAST algorithm, in order to enhance its performance. Simulations show that the resulting algorithm converges much faster than existing minor subspace trackers.
2011 IEEE Statistical Signal Processing Workshop (SSP), 2011
Gaussian process (GP) models are widely used in machine learning to account for spatial or temporal relationships between multivariate random variables. In this paper, we propose a formulation of underdetermined source separation in multidimensional spaces as a problem involving GP regression. The advantage of the proposed approach is firstly to provide a flexible means to include a variety of prior information concerning the sources and secondly to lead to minimum mean squared error estimates. We show that if the additive GPs are supposed to be locally-stationary, computations can be done very efficiently in the frequency domain. These findings establish a deep connection between GP and nonnegative tensor factorizations with the Itakura-Saito distance and we show that when the signals are monodimensional, the resulting framework coincides with many popular methods that are based on nonnegative matrix factorization and time-frequency masking.
IEEE/SP 13th Workshop on Statistical Signal Processing, 2005, 2005
The ESPRIT algorithm is a subspace-based high resolution method used in source localization and spectral analysis. It relies on the rotational invariance property of the signal subspace, and provides accurate estimates of the signal parameters. However, its main drawback is a high computational cost. In an adaptive context, some very fast algorithms were proposed to robustly track the variations of the signal subspace. Based on these subspace trackers, we propose a new adaptive implementation of the ESPRIT algorithm, faster than the existing methods.
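The rotational invariance property referred to in this abstract can be sketched with the standard block (non-adaptive) form of ESPRIT for frequency estimation. This is a textbook-style illustration, not the fast adaptive implementation the paper proposes; the function name `esprit_frequencies` and the window length `n` are assumptions made for the example.

```python
import numpy as np

def esprit_frequencies(x, r, n=None):
    """Estimate r normalized frequencies (cycles/sample) from a 1-D signal x."""
    x = np.asarray(x, dtype=complex)
    N = x.size
    n = n or N // 2  # window length (assumed default: half the signal)
    # Hankel data matrix: column t is the length-n window x[t:t+n]
    X = np.stack([x[t:t + n] for t in range(N - n + 1)], axis=1)
    # Signal subspace: the r dominant left singular vectors
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    W = U[:, :r]
    # Rotational invariance: W without its first row equals W without its last row times Phi
    Phi = np.linalg.lstsq(W[:-1], W[1:], rcond=None)[0]
    # The eigenvalues of Phi are the signal poles z_k = exp(2j*pi*f_k)
    poles = np.linalg.eigvals(Phi)
    return np.sort(np.angle(poles) / (2 * np.pi))
```

The SVD in this sketch is the costly step; an adaptive variant replaces it with a subspace tracker that updates `W` sample by sample.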
This paper introduces a sliding-window version of the API algorithm, which tracks the dominant subspace of a sequence of vectors. The algorithm is derived from the power iteration method and relies on an approximation less restrictive than the one known as the projection approximation. It guarantees the orthonormality of the matrix generated at each iteration and satisfies a global and exponential convergence property. Moreover, it outperforms most dominant subspace trackers related to the power iteration method, such as PAST, NIC, NP3 and OPAST, while having the same computational complexity. Our numerical simulations showed the benefit of using a sliding window: the algorithm reacts much faster to abrupt variations of the signal.
At recent AES conventions, authors have had the option of submitting complete 4- to 10-page manuscripts for peer-review by subject-matter experts. The following two papers have been recognized as co-winners of the AES 132nd Convention Peer-Reviewed Paper Award.
The Journal of the Acoustical Society of America, 2007
When trying to find a solution to the critical and sometimes ill-posed problem of multipitch estimation, it is common to have to choose between several approaches: using processing that resembles human auditory perception; using a decomposition adapted to the spectral content of the targeted sound category, either taking into account a priori knowledge of the spectral properties of musical notes or trying to learn some of their characteristics; or even running an algorithm that blindly tends to separate the sound into multiple elementary entities. This work involves some recently published techniques, such as non-negative matrix factorization with sparsity constraints, a likelihood approach based on a smooth spectral envelope model for both the background noise and the partials, and a deterministic method combining spectral and temporal criteria. Their performance is comparatively assessed on a common multipitch database restricted to piano music, drawn both f...
In this paper, we provide analytical expressions of the Cramér-Rao bounds for the frequencies, damping factors, amplitudes and phases of complex exponentials in colored noise. These expressions show the explicit dependence of the bounds of each distinct parameter with respect to the amplitudes and phases, leading to readily interpretable formulae, which are then simplified in an asymptotic context. The results are presented in the general framework of the Polynomial Amplitude Complex Exponentials (PACE) model, also referred to as the quasipolynomial model in the literature, which accounts for systems involving multiple poles, and represents a signal as a mixture of complex exponentials modulated by polynomials. This work looks further and generalizes the studies previously undertaken on the exponential and the quasipolynomial models.
This paper introduces a fast implementation of the power iteration method for subspace tracking, based on an approximation less restrictive than the well known projection approximation. This algorithm, referred to as the fast API method, guarantees the orthonormality of the subspace weighting matrix at each iteration. Moreover, it outperforms many subspace trackers related to the power iteration method, such as PAST, NIC, NP3 and OPAST, while having the same computational complexity. The API method is designed for both exponential windows and sliding windows. Our numerical simulations show that sliding windows offer a faster tracking response to abrupt signal variations.
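As a point of reference for the abstract above, the plain power iteration (orthogonal iteration) that such trackers start from can be sketched as follows, here with an exponential window. This is the naive baseline and not the fast API method: it stores the full correlation matrix and costs on the order of n²r operations per sample. The function name `track_subspace` and the forgetting factor `beta` are assumptions made for the example.

```python
import numpy as np

def track_subspace(samples, r, beta=0.99):
    """Track the r-dimensional dominant subspace of a sequence of vectors.

    samples: array of shape (T, n), one observation per row.
    Returns an n x r matrix W with orthonormal columns.
    """
    samples = np.asarray(samples)
    n = samples.shape[1]
    C = np.zeros((n, n), dtype=samples.dtype)   # correlation matrix estimate
    W = np.eye(n, r, dtype=samples.dtype)       # initial orthonormal basis
    for x in samples:
        # Exponential window: forget old data with factor beta
        C = beta * C + np.outer(x, x.conj())
        # One power iteration step, followed by QR orthonormalization
        W, _ = np.linalg.qr(C @ W)
    return W
```

The QR step is what keeps `W` orthonormal at every iteration; fast trackers obtain the same guarantee without forming `C` explicitly.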
Papers by Roland Badeau