Description
This may be a bug, or my assumption about the fit / partial_fit methods of IncrementalPCA may be wrong. With what I believe are valid parameters, the following error is raised: ValueError: Number of input features has changed from 10 to 100 between calls to partial_fit! Try setting n_components to a fixed value.
Steps/Code to Reproduce
Example:
I want to perform PCA on the matrix below, which has dimensions (n_samples, n_features) = (100, 1000).
I want to reduce it to a 100-dimensional feature space, so I am setting the number of components to 100.
import numpy as np
from sklearn.decomposition import IncrementalPCA
input_matrix = np.random.randn(100, 1000)
ipca = IncrementalPCA(n_components=100, batch_size=10)
ipca.fit(input_matrix)  # Error at this line
reduced_matrix = ipca.transform(input_matrix)
Expected Results
No error is thrown, and the reduced matrix has shape (n_samples, n_components) = (100, 100).
Actual Results
ValueError: Number of input features has changed from 10 to 100 between calls to partial_fit! Try setting n_components to a fixed value.
Commenting out the following condition check produces the expected result:
if (self.components_ is not None) and (self.components_.shape[0] != self.n_components_):
    raise ValueError("Number of input features has changed from %i "
                     "to %i between calls to partial_fit! Try "
                     "setting n_components to a fixed value." %
                     (self.components_.shape[0], self.n_components_))
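A possible explanation, which is my own reading and not something confirmed in this report, is that the check quoted above compares the number of stored components rather than the number of features. With batch_size=10, the first batch has only 10 samples, so it can yield at most 10 components, and the second call to partial_fit then sees 10 stored components against the requested 100. The sketch below illustrates that rank limit with a plain NumPy SVD and then tries a workaround that assumes batch_size should be at least n_components; the fixed random seed and the use of fit_transform are choices made only for this illustration.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Fixed seed only so the illustration is reproducible.
rng = np.random.RandomState(0)
input_matrix = rng.randn(100, 1000)

# A batch of 10 samples has rank at most 10, so its thin SVD returns
# only 10 right singular vectors, fewer than the 100 components requested.
_, _, vt = np.linalg.svd(input_matrix[:10], full_matrices=False)
print(vt.shape)

# Assumed workaround: make each batch at least as large as n_components.
ipca = IncrementalPCA(n_components=100, batch_size=100)
reduced_matrix = ipca.fit_transform(input_matrix)
print(reduced_matrix.shape)
```

If this reading is right, the error message is misleading rather than the parameters being handled incorrectly: the number of features never changes, but n_components exceeds the number of samples in a batch.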
Versions
Linux-3.10.0-514.26.2.el7.x86_64-x86_64-with-centos-7.3.1611-Core
('Python', '2.7.5 (default, Nov 6 2016, 00:28:07) \n[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)]')
('NumPy', '1.13.0')
('SciPy', '0.18.1')
('Scikit-Learn', '0.18.1')