SparsePCA incorrectly scales results in .transform()

#### Description
When using `SparsePCA`, the `transform()` method incorrectly scales the results based on the *number of rows* in the data matrix passed.

#### Proposed Fix
I am regrettably unable to do a pull request from where I sit. The issue is with this chunk of code, as of writing this at [line number 179 in sparse_pca.py](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/sparse_pca.py#L179):
```python
        U = ridge_regression(self.components_.T, X.T, ridge_alpha,
                             solver='cholesky')
        s = np.sqrt((U ** 2).sum(axis=0))
        s[s == 0] = 1
        U /= s
        return U
```
I honestly do not understand the details of the chosen implementation of SparsePCA. Depending on the objectives of the class, making use of the features as significant for unseen examples requires one of two modifications. Either (a) **learn** the scale factor `s` from the training data (i.e., make it an instance attribute like `.scale_factor_`), or (b) use `.mean(axis=0)` instead of `.sum(axis=0)` to remove the number-of-examples dependency.

#### Steps/Code to Reproduce
```python
from sklearn.decomposition import SparsePCA
import numpy as np


def get_data( count, seed ):
    np.random.seed(seed)
    col1 = np.random.random(count)
    col2 = np.random.random(count)

    data = np.hstack([ a[:,np.newaxis] for a in [
        col1 + .01*np.random.random(count),
        -col1 + .01*np.random.random(count),
        2*col1 + col2 + .01*np.random.random(count),
        col2 + .01*np.random.random(count),
        ]])
    return data


train = get_data(1000,1)
spca = SparsePCA(max_iter=20)
results_train = spca.fit_transform( train )

test = get_data(10,1)
results_test = spca.transform( test )

print( "Training statistics:" )
print( "  mean: %12.3f" % results_train.mean() )
print( "   max: %12.3f" % results_train.max() )
print( "   min: %12.3f" % results_train.min() )
print( "Testing statistics:" )
print( "  mean: %12.3f" % results_test.mean() )
print( "   max: %12.3f" % results_test.max() )
print( "   min: %12.3f" % results_test.min() )
```
Output:
```
Training statistics:
  mean:       -0.009
   max:        0.067
   min:       -0.080
Testing statistics:
  mean:       -0.107
   max:        0.260
   min:       -0.607
```

#### Expected Results
The test results min/max values are on the same scale as the training results.

#### Actual Results
The test results min/max values are much larger than the training results, because fewer examples were used. It is trivial to repeat this process with various sizes of training and testing data to see the relationship.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

SparsePCA incorrectly scales results in .transform() #9394

Description

Proposed Fix

Steps/Code to Reproduce

Expected Results

Actual Results

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

SparsePCA incorrectly scales results in .transform() #9394

Description

Description

Proposed Fix

Steps/Code to Reproduce

Expected Results

Actual Results

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions