Papers by Gabriel Fricout
In a large number of applications, engineers have to estimate a function linked to the state of a dynamic system. To do so, a sequence of samples drawn from this unknown function is observed while the system transits from state to state, and the problem is to generalize these observations to unvisited states. Several solutions can be envisioned, among which is regressing a family of parameterized functions so that it best fits the observed samples. This is the first problem addressed with the proposed kernel-based Bayesian filtering approach, which also allows quantifying the uncertainty reduction that occurs when more samples are acquired. Classical methods cannot handle the case where the actual samples are not directly observable and only a nonlinear mapping of them is available, which happens when a special sensor has to be used or when solving the Bellman equation in order to control the system. However, the approach proposed in this paper can be extended to this difficult case. Moreover, an application of this indirect function approximation scheme to reinforcement learning is presented. A set of experiments is also proposed to demonstrate the efficiency of this kernel-based Bayesian approach.
Kalman Temporal Differences (KTD) are a statistical framework that addresses value function and Q-function approximation in reinforcement learning. Its principle is to adopt a parametric representation of the value function, to model the associated parameters as random variables, and to minimize the expected mean-squared error of the parameters conditioned on the set of observed rewards. This paradigm has proven to be sample-efficient (i.e. fast convergence), able to handle non-stationarity, and able to provide uncertainty information. However, this framework was restricted to Markov decision processes with deterministic transitions. In this contribution we propose to extend the model to stochastic transitions by means of a colored noise, which leads to eXtended Kalman Temporal Differences (XKTD). The proposed approach is illustrated on standard reinforcement learning problems. ... $\mathbb{E}\left[\sum_{i=0}^{\infty} \gamma^i r_i \mid s_0 = s, \pi\right]$, where $r_i$ is the immediate reward observed at time $i$, and the expectation depends on the probabilities of the trajectories starting from $s$, given the system dynamics and the policy followed. ...
Lecture Notes in Computer Science, 2009
... LSPI [14], which allows searching for an optimal control more efficiently; however, it is a batch algorithm which does ... For example, incremental natural actor-critic algorithms are presented in [17 ... TD is used as the actor part instead of LSTD, mostly because of the inability of the latter ...
Lecture Notes in Computer Science, 2008
A wide variety of function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have been shown to be efficient in other fields such as neural network training, have been little studied. We propose a general Bayesian filtering framework for reinforcement learning, as well as a specific implementation based on sigma point Kalman filtering and kernel machines. This allows us to derive an efficient off-policy, model-free approximate temporal differences algorithm, which is demonstrated on two simple benchmarks.
2008 The Second International Conference on Advanced Engineering Computing and Applications in Sciences, 2008
In a large number of applications, engineers have to estimate values of an unknown function given some observed samples. This task is referred to as function approximation or as generalization. One way to solve the problem is to regress a family of parameterized functions so that it best fits the observed samples. Yet batch methods are usually used, and the parameterization is typically linear. Moreover, very few approaches try to quantify the uncertainty reduction that occurs when more samples (and thus more information) are acquired, which can be quite useful depending on the application. In this paper we propose a sparse nonlinear Bayesian online kernel regression. Sparsity is achieved in a preprocessing step by using a dictionary method. The nonlinear Bayesian kernel regression can then be carried out online by a sigma point Kalman filter. First experiments on a cardinal sine regression show that our approach is promising.
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009
This paper deals with value function and Q-function approximation in deterministic Markov decision processes. A general statistical framework based on the Kalman filtering paradigm is introduced. Its principle is to adopt a parametric representation of the value function, to model the associated parameter vector as a random variable, and to minimize the mean-squared error of the parameters conditioned on past observed transitions. From this general framework, called Kalman Temporal Differences (KTD), and using an approximation scheme called the unscented transform, a family of algorithms is derived, namely KTD-V, KTD-SARSA and KTD-Q, which aim respectively at estimating the value function of a given policy, the Q-function of a given policy, and the optimal Q-function. The proposed approach holds for both linear and nonlinear parameterizations. This framework is discussed, and its potential advantages and shortcomings are highlighted.
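The principle stated in this abstract can be illustrated with a minimal sketch, assuming a linear parameterization V(s) = φ(s)·θ, for which a plain Kalman filter should suffice and the unscented transform is not needed. The function names, the noise values, and the three-state cyclic example are illustrative assumptions and are not taken from the paper. The parameters θ follow a random walk, and each transition yields the scalar observation r = (φ(s) − γφ(s'))·θ + noise.

```python
def ktd_v_linear_update(theta, P, phi_s, phi_next, reward, gamma=0.9,
                        process_noise=0.0, observation_noise=0.01):
    """One Kalman update of the parameters from a single transition (s, r, s').

    Hypothetical helper: a linear sketch of the idea, not the paper's algorithm.
    """
    n = len(theta)
    # Prediction: random-walk model, so the mean is unchanged and the
    # covariance grows by the process noise on the diagonal.
    P_pred = [[P[i][j] + (process_noise if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    # Observation vector: the reward is modeled as h . theta + noise.
    h = [phi_s[i] - gamma * phi_next[i] for i in range(n)]
    Ph = [sum(P_pred[i][j] * h[j] for j in range(n)) for i in range(n)]
    innovation_var = sum(h[i] * Ph[i] for i in range(n)) + observation_noise
    gain = [Ph[i] / innovation_var for i in range(n)]
    # The innovation is the temporal-difference error.
    td_error = reward - sum(h[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * td_error for i in range(n)]
    new_P = [[P_pred[i][j] - gain[i] * Ph[j] for j in range(n)] for i in range(n)]
    return new_theta, new_P


def run_cycle_demo(num_cycles=200, gamma=0.9):
    """Deterministic cycle 0 -> 1 -> 2 -> 0 with reward 1 when leaving state 2."""
    n = 3
    theta = [0.0] * n
    P = [[10.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_cycles):
        for s in range(n):
            s_next = (s + 1) % n
            phi_s = [1.0 if i == s else 0.0 for i in range(n)]
            phi_next = [1.0 if i == s_next else 0.0 for i in range(n)]
            reward = 1.0 if s == 2 else 0.0
            theta, P = ktd_v_linear_update(theta, P, phi_s, phi_next, reward, gamma)
    return theta
```

With one-hot features the parameters are the state values themselves, so `run_cycle_demo()` can be compared against the closed-form solution of the Bellman equations for this cycle, V(2) = 1 / (1 − γ³), V(1) = γV(2), V(0) = γV(1). The diagonal of `P` gives a rough per-state uncertainty, which is the kind of information the abstract refers to.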
Wear, 2008
The visual aspect of rough surfaces such as steel surfaces is of great importance for the quality of the final product for which they are intended. In a previous work, we solved the theoretically complex problem of automatically classifying surfaces according to the quality of their aspect. In that case, the measurements were based on topographical maps obtained through interferometric microscopy. The resulting data were analyzed by an algorithm based on the extraction of morphological and statistical features from the surface topography, factorial analysis, bootstrap over-sampling and Bayesian classification. It was then important to apply this methodology as efficiently as possible to perform an automatic, on-line and continuous inspection of the product. In this paper, we focus on all the steps leading to such an on-line application. Two of them are particularly decisive: choosing an optical sensor and the corresponding optical configuration adapted to an industrial environment, and overcoming the difficulties of going from first laboratory tests to on-line measurements on a fast-moving product. Finally, results of on-line acquisitions are presented; they are in good agreement with the expected aspect characterization.
A wide variety of value function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have nevertheless proven effective in other fields such as parameter learning for neural networks, have been little studied so far. This contribution introduces a general framework for reinforcement learning based on Bayesian filtering, as well as a specific implementation based on a sigma-point Kalman filter and a kernel parameterization. This allows us to propose a temporal differences algorithm for continuous state and/or action spaces that is model-free and off-policy. It is illustrated on two simple problems. Keywords: reinforcement learning, Bayesian filtering, kernel methods.
Reproduction, Fertility and Development, 2003
57. TEMPORO-SPATIAL ALTERATIONS IN PROSTATE BRANCHING MORPHOGENESIS IN ESTROGEN-DEFICIENT AROMATASE KNOCKOUT (ArKO) MICE Ghanim Almahbobi1, Shelley Hedwards1, Vikrant Reddy1, Stephen McPherson1, Gabriel Fricout2 and Gail P. ...
The Journal of Pathology, 2005
Early changes to branching morphogenesis of the prostate are believed to lead to enlargement of the gland in adult life. However, it has not been possible to demonstrate directly that alterations to branching during the developmental period have a permanent effect on adult prostate size. In order to examine branching morphogenesis in a quantitative manner in neonatal mice, a combination of imaging and computational technology was used to detect and quantify branching using bone morphogenetic protein 4 haplo-insufficient mice that develop enlarged prostate glands in adulthood. Accurate estimates were made of six parameters of branching, including prostate ductal length and volume and number of main ducts, branches, branch points, and tips. The results show that the prostate is significantly larger on day 3, well before the emergence of the phenotype in older animals. The ventral prostate is enlarged because the number of main epithelial ducts is increased; enlargement of the anterior prostate in mutant animals occurs because there are more branches. These lobe-specific mechanisms underlying prostate enlargement indicate the complex nature of gland pathology in mice, rather than a simple increase in weight or volume. This method provides a powerful means to investigate the aetiology of prostate disease in animal models prior to emergence of a phenotype in later life.
This contribution deals with the approximation of the value function and of the Q-function in deterministic Markov decision processes. A general statistical framework inspired by Kalman filtering is introduced. Its principle is to adopt a parametric representation of the value function (or of the Q-function), to model the associated parameter vector as a random variable, and to minimize the squared error on the parameters conditioned on the rewards observed since the origin of time. From this general paradigm, which we call Kalman Temporal Differences (KTD), and using an approximation scheme called the unscented transform, a family of second-order algorithms is derived, namely KTD-V, KTD-SARSA and KTD-Q, which aim respectively at evaluating the value function of a given policy, evaluating the Q-function of a given policy, and evaluating the optimal Q-function. This approach offers a number of advantages, such as the ability to handle a nonlinear parameterization, sample-efficient learning, the handling of non-stationary environments, and the possibility of obtaining uncertainty information, which we use to propose a form of active learning. These different aspects are discussed and illustrated through several experiments.
Experiments in Fluids, 2012
A major challenge regarding thin films is the characterization of their rheology and the measurement of the physical parameters of the fluid. For complex fluids, performing direct rheology measurements is extremely difficult given the geometric characteristics of thin films. In this paper, we present a method for characterizing the film rheology based on measurements of the film surface topography taken at regular time intervals. By solving an inverse problem, these measurements allow us to validate a rheology model for the thin fluid film and to determine the physical parameters specific to this model.
In a large number of applications, engineers have to estimate a function linked to the state of a dynamic system. To do so, a sequence of samples drawn from this unknown function is observed while the system transits from state to state, and the problem is to generalize these observations to unvisited states. Several solutions can be envisioned, among which is regressing a family of parameterized functions so that it best fits the observed samples. However, classical methods cannot handle the case where the actual samples are not directly observable and only a nonlinear mapping of them is available, which happens when a special sensor has to be used or when solving the Bellman equation in order to control the system. This paper introduces a method based on Bayesian filtering and kernel machines designed to solve this difficult problem. First experimental results are promising.
The kernel trick is a well-known approach that implicitly casts a linear method into a nonlinear one by replacing every dot product with a kernel function. However, few vector quantization algorithms have been kernelized. Indeed, they usually involve computing linear transformations (e.g. moving prototypes), which are not easily kernelizable. This paper introduces the Kernel-based Vector Quantization (KVQ) method, which allows working in an approximation of the feature space and thus kernelizing any Vector Quantization (VQ) algorithm.
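As a small illustration of the kernel trick mentioned in this abstract (and not of the KVQ method itself, whose details are not given here), the sketch below computes a squared distance in feature space using only kernel evaluations, through the identity ‖φ(x) − φ(y)‖² = k(x, x) − 2k(x, y) + k(y, y). The Gaussian kernel, its `gamma` width parameter, and the function names are assumptions chosen for the example.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def feature_space_sq_distance(x, y, kernel):
    """Squared distance between phi(x) and phi(y), never computing phi explicitly."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def nearest_prototype(x, prototypes, kernel):
    """Index of the prototype closest to x in feature space.

    This only covers the assignment step of a quantizer, where prototypes are
    input-space points. Moving a prototype in feature space is the part the
    abstract describes as hard to kernelize, and it is not attempted here.
    """
    return min(range(len(prototypes)),
               key=lambda i: feature_space_sq_distance(x, prototypes[i], kernel))
```

For instance, `nearest_prototype([0.1, 0.0], [[0.0, 0.0], [3.0, 3.0]], rbf_kernel)` selects the first prototype, since the kernel value between the query and `[0.0, 0.0]` is much larger than the one with `[3.0, 3.0]`.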
Journal of Mathematics in Industry, 2012
Purpose: Being able to predict the visual appearance of a painted steel sheet, given its topography before paint application, is of crucial importance for car makers. Accurate modeling of the industrial painting process is required. Results: The equations describing the leveling of the paint film are complex, and their numerical simulation requires advanced mathematical tools, which are described in detail in this paper. Simulations are validated using a large experimental database obtained with a wavefront sensor developed by Phasics™.