Ensure common tests cover everything #4056

Closed
@amueller

Description

I'm slightly concerned that the common tests currently cover less than I'd like. For example, there are no sparse data tests for clustering (#4052).

I think for clustering, regression, classification and transformers we are in relatively good shape, but there are two cases of "odd" estimators that we need to watch out for:

  • estimators not returned by all_estimators by default
  • estimators not belonging to the four mixin classes.

For the second, these are the estimators that fall outside the following filter:

estimators = all_estimators(type_filter=['classifier', 'regressor', 'transformer', 'cluster'])
{('CheckingClassifier', sklearn.utils.mocking.CheckingClassifier),
 ('CountVectorizer', sklearn.feature_extraction.text.CountVectorizer),
 ('DPGMM', sklearn.mixture.dpgmm.DPGMM),
 ('EmpiricalCovariance',
  sklearn.covariance.empirical_covariance_.EmpiricalCovariance),
 ('GMM', sklearn.mixture.gmm.GMM),
 ('GMMHMM', sklearn.hmm.GMMHMM),
 ('GaussianHMM', sklearn.hmm.GaussianHMM),
 ('GraphLasso', sklearn.covariance.graph_lasso_.GraphLasso),
 ('GraphLassoCV', sklearn.covariance.graph_lasso_.GraphLassoCV),
 ('HashingVectorizer', sklearn.feature_extraction.text.HashingVectorizer),
 ('KernelDensity', sklearn.neighbors.kde.KernelDensity),
 ('LSHForest', sklearn.neighbors.approximate.LSHForest),
 ('LedoitWolf', sklearn.covariance.shrunk_covariance_.LedoitWolf),
 ('LogOddsEstimator', sklearn.ensemble.gradient_boosting.LogOddsEstimator),
 ('MDS', sklearn.manifold.mds.MDS),
 ('MeanEstimator', sklearn.ensemble.gradient_boosting.MeanEstimator),
 ('MinCovDet', sklearn.covariance.robust_covariance.MinCovDet),
 ('MultinomialHMM', sklearn.hmm.MultinomialHMM),
 ('NearestNeighbors', sklearn.neighbors.unsupervised.NearestNeighbors),
 ('OAS', sklearn.covariance.shrunk_covariance_.OAS),
 ('OneClassSVM', sklearn.svm.classes.OneClassSVM),
 ('PatchExtractor', sklearn.feature_extraction.image.PatchExtractor),
 ('PriorProbabilityEstimator',
  sklearn.ensemble.gradient_boosting.PriorProbabilityEstimator),
 ('QuantileEstimator', sklearn.ensemble.gradient_boosting.QuantileEstimator),
 ('ScaledLogOddsEstimator',
  sklearn.ensemble.gradient_boosting.ScaledLogOddsEstimator),
 ('ShrunkCovariance', sklearn.covariance.shrunk_covariance_.ShrunkCovariance),
 ('SpectralBiclustering', sklearn.cluster.bicluster.SpectralBiclustering),
 ('SpectralCoclustering', sklearn.cluster.bicluster.SpectralCoclustering),
 ('SpectralEmbedding', sklearn.manifold.spectral_embedding_.SpectralEmbedding),
 ('TSNE', sklearn.manifold.t_sne.TSNE),
 ('TfidfVectorizer', sklearn.feature_extraction.text.TfidfVectorizer),
 ('VBGMM', sklearn.mixture.dpgmm.VBGMM),
 ('ZeroEstimator', sklearn.ensemble.gradient_boosting.ZeroEstimator),
 ('_BaseHMM', sklearn.hmm._BaseHMM),
 ('_BaseRidgeCV', sklearn.linear_model.ridge._BaseRidgeCV),
 ('_ConstantPredictor', sklearn.multiclass._ConstantPredictor),
 ('_RidgeGCV', sklearn.linear_model.ridge._RidgeGCV)}
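
The listing above reads like the complement of the filtered result. A minimal sketch of that set difference follows. It does not use the real scikit-learn registry: the mixin classes, the `Toy*` estimators, `REGISTRY` and the `list_estimators` helper are all hypothetical stand-ins, with `list_estimators` assumed to behave roughly like `all_estimators` with a `type_filter`.

```python
# Toy stand-ins for the four standard mixin classes (NOT the real sklearn ones).
class ClassifierMixin: pass
class RegressorMixin: pass
class TransformerMixin: pass
class ClusterMixin: pass

MIXINS = {
    "classifier": ClassifierMixin,
    "regressor": RegressorMixin,
    "transformer": TransformerMixin,
    "cluster": ClusterMixin,
}

# Hypothetical estimators, used only for illustration.
class ToyClassifier(ClassifierMixin): pass
class ToyScaler(TransformerMixin): pass
class ToyCovariance: pass   # belongs to none of the four mixins
class ToyDensity: pass      # belongs to none of the four mixins

REGISTRY = [ToyClassifier, ToyScaler, ToyCovariance, ToyDensity]


def list_estimators(type_filter=None):
    """Return (name, class) pairs, optionally restricted to the given types."""
    pairs = [(cls.__name__, cls) for cls in REGISTRY]
    if type_filter is not None:
        wanted = tuple(MIXINS[t] for t in type_filter)
        pairs = [(name, cls) for name, cls in pairs if issubclass(cls, wanted)]
    return pairs


standard = list_estimators(
    type_filter=["classifier", "regressor", "transformer", "cluster"]
)
# Everything in the full listing that the four-type filter leaves out.
odd = set(list_estimators()) - set(standard)
odd_names = sorted(name for name, _ in odd)
print(odd_names)
```

With the real registry, the same difference would presumably be taken between the unfiltered and the filtered `all_estimators` results.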

These are mostly covariance, density and preprocessing models.
It would be great if we could figure out a good way to test them, too, or make more tests applicable to all estimators, without filtering for the four standard kinds.
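
One way to picture a test that applies to every estimator, regardless of its type, is a check that only relies on the shared `fit` convention. The sketch below is an illustration under assumptions, not the project's test suite: `ToyCovariance`, `ToyDensity`, `check_fit_returns_self` and `run_common_checks` are hypothetical names, and the only convention relied on is that `fit` returns `self`.

```python
class ToyCovariance:
    """Hypothetical estimator that follows the fit-returns-self convention."""

    def fit(self, X, y=None):
        self.n_samples_ = len(X)
        return self


class ToyDensity:
    """Hypothetical estimator with a deliberate bug: fit returns None."""

    def fit(self, X, y=None):
        self.n_samples_ = len(X)


def check_fit_returns_self(name, estimator_cls):
    # Needs no knowledge of the estimator's type: only that it has fit().
    est = estimator_cls()
    result = est.fit([[0.0, 1.0], [1.0, 0.0]])
    if result is not est:
        raise AssertionError(f"{name}.fit did not return self")


def run_common_checks(estimators, checks):
    """Run every check on every estimator and collect the failures."""
    failures = []
    for name, cls in estimators:
        for check in checks:
            try:
                check(name, cls)
            except AssertionError as exc:
                failures.append((name, check.__name__, str(exc)))
    return failures


failures = run_common_checks(
    [("ToyCovariance", ToyCovariance), ("ToyDensity", ToyDensity)],
    [check_fit_returns_self],
)
print(failures)
```

Because the loop never filters by mixin, an estimator outside the four standard kinds is exercised in the same way as a classifier or transformer would be.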
