Possible bug with Incremental PCA #9330

@satwikbh

Description

This may be a bug, or my assumption about IncrementalPCA's fit / partial_fit methods may be wrong. For parameters that appear to be valid, the following error is thrown: ValueError: Number of input features has changed from 10 to 100 between calls to partial_fit! Try setting n_components to a fixed value.

Steps/Code to Reproduce

Example:
I want to perform PCA on the matrix below, which has dimensions (n_samples, n_features) = (100, 1000).
I want to reduce it to a 100-dimensional feature space, so I set the number of components to 100.

import numpy as np
from sklearn.decomposition import IncrementalPCA

input_matrix = np.random.randn(100, 1000)

ipca = IncrementalPCA(n_components=100, batch_size=10)
ipca.fit(input_matrix)  # Error at this line
reduced_matrix = ipca.transform(input_matrix)

Expected Results

No error is thrown, and the reduced matrix has shape (n_samples, n_components) = (100, 100).

Actual Results

ValueError: Number of input features has changed from 10 to 100 between calls to partial_fit! Try setting n_components to a fixed value.

Commenting out the condition check below produces the expected result.

if (self.components_ is not None) and (self.components_.shape[0] != self.n_components_):
	raise ValueError("Number of input features has changed from %i "
		"to %i between calls to partial_fit! Try "
		"setting n_components to a fixed value." %
		(self.components_.shape[0], self.n_components_))
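
The 10 and 100 in the message match batch_size and n_components rather than any feature count, which suggests the check may be comparing component counts. A minimal NumPy-only sketch of why the stored components could end up with 10 rows, assuming (not verified against the 0.18.1 source) that partial_fit keeps the leading rows of a thin SVD of each mini-batch. The names first_batch and components are illustrative only, not scikit-learn attributes.

```python
import numpy as np

# Same shapes as in the report: 100 samples, 1000 features.
rng = np.random.RandomState(0)
input_matrix = rng.randn(100, 1000)

n_components = 100
batch_size = 10

# First mini-batch of 10 rows (assumed to be what fit() hands to partial_fit()).
first_batch = input_matrix[:batch_size]
centered = first_batch - first_batch.mean(axis=0)

# Thin SVD of a (10, 1000) matrix: Vt has min(10, 1000) = 10 rows.
_, _, Vt = np.linalg.svd(centered, full_matrices=False)

# Slicing to n_components cannot return more rows than Vt contains.
components = Vt[:n_components]
print(components.shape)  # (10, 1000)
```

If that assumption holds, a 10-sample batch can never supply 100 components, so the mismatch of 10 versus 100 would come from batch_size being smaller than n_components. Choosing batch_size of at least n_components might then avoid the error, though that is untested here.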

Versions

Linux-3.10.0-514.26.2.el7.x86_64-x86_64-with-centos-7.3.1611-Core
('Python', '2.7.5 (default, Nov 6 2016, 00:28:07) \n[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)]')
('NumPy', '1.13.0')
('SciPy', '0.18.1')
('Scikit-Learn', '0.18.1')
