
ENH Enhanced correlation models and noise estimation for Gaussian Process #2930


Closed
wants to merge 10 commits into from

Conversation

jmetzen
Member

@jmetzen jmetzen commented Mar 3, 2014

This PR adds support in Gaussian process regression for enhanced correlation models and for learning the noise magnitude (the nugget) from training data.

Correlation models have been extended as follows:

  • Matern correlation models for nu=1.5 and nu=2.5 have been added (see https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function). An example script showing the potential benefit of the Matern correlation model compared to the squared-exponential and absolute-exponential models was added under examples/gaussian_process/plot_matern_kernel.py (see attached image).
  • The squared_exponential, absolute_exponential, and Matern correlation models support a factor analysis distance. This can be seen as an extension of learning dimension-specific length scales in which correlations between feature dimensions are also taken into account; see Rasmussen and Williams 2006, p. 107 for details. An example script showing the potential benefit of this extension was added under examples/gaussian_process/plot_gp_learning_curve.py (see attached image). This feature required that correlation models be passed the componentwise differences rather than the componentwise distances (their absolute values).
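To illustrate the first bullet, here is a minimal numpy sketch (not the PR's implementation) of the closed-form Matern correlation for nu=1.5 and nu=2.5, assuming a scaled Euclidean distance with per-dimension length-scale parameters theta:

```python
import numpy as np

def matern_correlation(theta, dx, nu=1.5):
    """Matern correlation for nu in {1.5, 2.5}.

    theta : positive per-dimension length-scale parameters
    dx    : componentwise differences, shape (n_pairs, n_features)
    """
    theta = np.asarray(theta, dtype=float)
    dx = np.atleast_2d(np.asarray(dx, dtype=float))
    # Scaled Euclidean distance for each pair of samples.
    r = np.sqrt(np.sum(theta * dx ** 2, axis=1))
    if nu == 1.5:
        t = np.sqrt(3.0) * r
        return (1.0 + t) * np.exp(-t)
    elif nu == 2.5:
        t = np.sqrt(5.0) * r
        return (1.0 + t + t ** 2 / 3.0) * np.exp(-t)
    raise ValueError("nu must be 1.5 or 2.5")
```

At zero distance the correlation is 1, and it decays monotonically with distance; unlike the squared-exponential, the resulting sample paths are only finitely differentiable, which is the property the plot_matern_kernel.py example highlights.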

Learning the noise (the "nugget effect") in GaussianProcess is now supported by setting the parameter learn_nugget to True. This allows learning a homoscedastic noise model, i.e., it assumes the noise has the same magnitude everywhere. The script examples/gaussian_process/plot_gp_regression.py was modified accordingly: it now learns the noise magnitude instead of relying on an externally specified value.
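For illustration (this is a sketch, not the PR's code): a homoscedastic nugget enters the marginal likelihood as a constant added to the diagonal of the correlation matrix, so learning it amounts to including it in the objective the hyperparameter optimizer minimizes:

```python
import numpy as np

def neg_log_marginal_likelihood(K, y, nugget):
    """Negative log marginal likelihood of a zero-mean GP.

    K      : (n, n) noise-free covariance matrix of the training inputs
    y      : (n,) training targets
    nugget : homoscedastic noise variance added to the diagonal
    """
    n = K.shape[0]
    Kn = K + nugget * np.eye(n)          # nugget on the diagonal
    L = np.linalg.cholesky(Kn)           # stable factorization
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha
            + np.sum(np.log(np.diag(L)))
            + 0.5 * n * np.log(2.0 * np.pi))
```

An optimizer would then search jointly over the correlation hyperparameters and the nugget, which is what appending the nugget to theta (discussed below) achieves.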

Furthermore, a typo in gp_diabetes_dataset.py was fixed, and a not-yet-merged bugfix (#2867 and #2798) is included.

To be discussed:

  • The factor analysis distance has hyperparameters that can take on arbitrary real values (not just positive ones). Since sklearn enforces the hyperparameters theta to be positive, this is handled internally by taking the log of the corresponding components of theta. Are there better ideas?
  • Learning the noise (the nugget) is handled internally by appending it to the vector theta in _arg_max_reduced_likelihood_function(). This approach required the fewest changes to the current implementation but is not necessarily the best one. Are there any opinions on that?
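The first discussion point can be sketched as follows. This is a hypothetical illustration (the helper names encode_theta/decode_theta are not from the PR) of mapping unconstrained factor-analysis parameters into a positivity-constrained theta vector via exp/log:

```python
import numpy as np

def encode_theta(length_scales, fa_params):
    # length_scales are already positive; fa_params may be any real
    # number, so store exp(fa_params), which satisfies theta > 0.
    return np.concatenate([length_scales, np.exp(fa_params)])

def decode_theta(theta, n_length_scales):
    # Invert the encoding: the log recovers the unconstrained values.
    length_scales = theta[:n_length_scales]
    fa_params = np.log(theta[n_length_scales:])
    return length_scales, fa_params
```

The round trip is exact, but the exp/log reparameterization does change the geometry the optimizer sees, which is presumably why the author asks whether there are better ideas.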

(Attached images: plot_gp_learning_curve, plot_matern_kernel)

@coveralls

Coverage Status

Coverage remained the same when pulling 87475eb on jmetzen:gp_correlation_models into e721508 on scikit-learn:master.

@jmetzen
Member Author

jmetzen commented Jul 15, 2014

This PR is superseded by #3388

@jmetzen jmetzen closed this Jul 15, 2014
@jmetzen jmetzen deleted the gp_correlation_models branch September 15, 2014 14:11