
Commit 0dd7eb3

Pushing the docs to dev/ for branch: master, commit 75c76ae4881d8d46f73be38e21bad552f7f456b7
1 parent: 744e746

1,205 files changed: +3804 / -3767 lines


dev/_downloads/72cc9063542e8d12ba1a5d7ceb8c3473/plot_all_scaling.ipynb

Lines changed: 11 additions & 11 deletions
Large diffs are not rendered by default.

dev/_downloads/7699944fcd7908d5390bc6e94695487b/plot_all_scaling.py

Lines changed: 63 additions & 51 deletions

@@ -7,8 +7,7 @@
 =============================================================
 
 Feature 0 (median income in a block) and feature 5 (number of households) of
-the `California housing dataset
-<https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html>`_ have very
+the :ref:`california_housing_dataset` have very
 different scales and contain some very large outliers. These two
 characteristics lead to difficulties to visualize the data and, more
 importantly, they can degrade the predictive performance of many machine
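
For readers following along, the data described in this hunk can be loaded directly from scikit-learn; the snippet below is a minimal sketch (the variable names and the two selected columns mirror the text above and are illustrative, not code from this commit):

from sklearn.datasets import fetch_california_housing

# Load the California housing data and keep feature 0 (median income)
# and feature 5, the two columns discussed in the example text.
dataset = fetch_california_housing()
X = dataset.data[:, [0, 5]]
print(X.shape)
print(X.min(axis=0), X.max(axis=0))  # very different scales per column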

@@ -26,11 +25,13 @@
 data within a pre-defined range.
 
 Scalers are linear (or more precisely affine) transformers and differ from each
-other in the way to estimate the parameters used to shift and scale each
+other in the way they estimate the parameters used to shift and scale each
 feature.
 
-``QuantileTransformer`` provides non-linear transformations in which distances
-between marginal outliers and inliers are shrunk. ``PowerTransformer`` provides
+:class:`~sklearn.preprocessing.QuantileTransformer` provides non-linear
+transformations in which distances
+between marginal outliers and inliers are shrunk.
+:class:`~sklearn.preprocessing.PowerTransformer` provides
 non-linear transformations in which data is mapped to a normal distribution to
 stabilize variance and minimize skewness.
 
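As a quick illustration of the distinction drawn in this hunk (affine scalers versus the two non-linear transformers), the sketch below applies one of each to a skewed column with a couple of large outliers; the data and parameters are illustrative only, not part of the commit:

import numpy as np
from sklearn.preprocessing import StandardScaler, QuantileTransformer, PowerTransformer

rng = np.random.RandomState(0)
# Heavily skewed positive data with two large outliers appended.
x = np.concatenate([rng.lognormal(size=1000), [50.0, 80.0]]).reshape(-1, 1)

# Affine: the distribution keeps its shape, it is only shifted and rescaled,
# so the outliers remain extreme.
x_std = StandardScaler().fit_transform(x)

# Non-linear: the marginal distribution itself is reshaped.
x_quant = QuantileTransformer(output_distribution="normal", n_quantiles=500).fit_transform(x)
x_power = PowerTransformer(method="yeo-johnson").fit_transform(x)

print(x_std.max(), x_quant.max(), x_power.max())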

@@ -89,12 +90,12 @@
      PowerTransformer(method='yeo-johnson').fit_transform(X)),
     ('Data after power transformation (Box-Cox)',
      PowerTransformer(method='box-cox').fit_transform(X)),
-    ('Data after quantile transformation (gaussian pdf)',
-     QuantileTransformer(output_distribution='normal')
-     .fit_transform(X)),
     ('Data after quantile transformation (uniform pdf)',
      QuantileTransformer(output_distribution='uniform')
      .fit_transform(X)),
+    ('Data after quantile transformation (gaussian pdf)',
+     QuantileTransformer(output_distribution='normal')
+     .fit_transform(X)),
     ('Data after sample-wise L2 normalizing',
      Normalizer().fit_transform(X)),
 ]

@@ -184,7 +185,7 @@ def plot_distribution(axes, X, y, hist_nbins=50, title="",
 # figure will show a scatter plot of the full data set while the right figure
 # will exclude the extreme values considering only 99 % of the data set,
 # excluding marginal outliers. In addition, the marginal distributions for each
-# feature will be shown on the side of the scatter plot.
+# feature will be shown on the sides of the scatter plot.
 
 
 def make_plot(item_idx):
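
The figure layout this hunk describes (a central scatter plot with each feature's marginal distribution along its sides) can be roughed out with plain matplotlib; this is a simplified sketch under that assumption, not the example's plot_distribution helper:

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
x0, x1 = rng.lognormal(size=500), rng.normal(size=500)

fig, axes = plt.subplots(2, 2, figsize=(6, 6),
                         gridspec_kw={"width_ratios": [4, 1],
                                      "height_ratios": [1, 4]})
axes[1, 0].scatter(x0, x1, s=5, alpha=0.5)               # main scatter plot
axes[0, 0].hist(x0, bins=50)                             # marginal of feature 0 (top)
axes[1, 1].hist(x1, bins=50, orientation="horizontal")   # marginal of feature 1 (right)
axes[0, 1].axis("off")
plt.show()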

@@ -238,16 +239,18 @@ def make_plot(item_idx):
 # StandardScaler
 # --------------
 #
-# ``StandardScaler`` removes the mean and scales the data to unit variance.
+# :class:`~sklearn.preprocessing.StandardScaler` removes the mean and scales
+# the data to unit variance. The scaling shrinks the range of the feature
+# values as shown in the left figure below.
 # However, the outliers have an influence when computing the empirical mean and
-# standard deviation which shrink the range of the feature values as shown in
-# the left figure below. Note in particular that because the outliers on each
+# standard deviation. Note in particular that because the outliers on each
 # feature have different magnitudes, the spread of the transformed data on
 # each feature is very different: most of the data lie in the [-2, 4] range for
 # the transformed median income feature while the same data is squeezed in the
 # smaller [-0.2, 0.2] range for the transformed number of households.
 #
-# ``StandardScaler`` therefore cannot guarantee balanced feature scales in the
+# :class:`~sklearn.preprocessing.StandardScaler` therefore cannot guarantee
+# balanced feature scales in the
 # presence of outliers.
 
 make_plot(1)
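
The sensitivity to outliers described in this hunk is easy to reproduce on toy data; a minimal sketch with illustrative numbers (not taken from the example):

import numpy as np
from sklearn.preprocessing import StandardScaler

# One well-behaved column and one column containing a huge outlier.
X = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0],
              [4.0, 1000.0]])

Xt = StandardScaler().fit_transform(X)
# The outlier inflates the second column's standard deviation, so the
# inlier values get squeezed into a very narrow band after scaling.
print(Xt[:, 0])   # roughly [-1.34, -0.45, 0.45, 1.34]
print(Xt[:, 1])   # first three values nearly identical, around -0.58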

@@ -256,33 +259,38 @@ def make_plot(item_idx):
 # MinMaxScaler
 # ------------
 #
-# ``MinMaxScaler`` rescales the data set such that all feature values are in
+# :class:`~sklearn.preprocessing.MinMaxScaler` rescales the data set such that
+# all feature values are in
 # the range [0, 1] as shown in the right panel below. However, this scaling
-# compress all inliers in the narrow range [0, 0.005] for the transformed
+# compresses all inliers into the narrow range [0, 0.005] for the transformed
 # number of households.
 #
-# As ``StandardScaler``, ``MinMaxScaler`` is very sensitive to the presence of
-# outliers.
+# Both :class:`~sklearn.preprocessing.StandardScaler` and
+# :class:`~sklearn.preprocessing.MinMaxScaler` are very sensitive to the
+# presence of outliers.
 
 make_plot(2)
 
 #############################################################################
 # MaxAbsScaler
 # ------------
 #
-# ``MaxAbsScaler`` differs from the previous scaler such that the absolute
-# values are mapped in the range [0, 1]. On positive only data, this scaler
-# behaves similarly to ``MinMaxScaler`` and therefore also suffers from the
-# presence of large outliers.
+# :class:`~sklearn.preprocessing.MaxAbsScaler` is similar to
+# :class:`~sklearn.preprocessing.MinMaxScaler` except that the
+# values are mapped in the range [0, 1]. On positive only data, both scalers
+# behave similarly.
+# :class:`~sklearn.preprocessing.MaxAbsScaler` therefore also suffers from
+# the presence of large outliers.
 
 make_plot(3)
 
 ##############################################################################
 # RobustScaler
 # ------------
 #
-# Unlike the previous scalers, the centering and scaling statistics of this
-# scaler are based on percentiles and are therefore not influenced by a few
+# Unlike the previous scalers, the centering and scaling statistics of
+# :class:`~sklearn.preprocessing.RobustScaler`
+# is based on percentiles and are therefore not influenced by a few
 # number of very large marginal outliers. Consequently, the resulting range of
 # the transformed feature values is larger than for the previous scalers and,
 # more importantly, are approximately similar: for both features most of the
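
The contrast between the three scalers covered in this hunk can be checked on a single toy column with one large outlier; again a hedged sketch with illustrative values, not code from the commit:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler, RobustScaler

x = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

# MinMaxScaler / MaxAbsScaler: the outlier defines the upper bound,
# so the inliers end up compressed near 0.
print(MinMaxScaler().fit_transform(x).ravel())   # roughly [0. 0.001 0.002 0.003 1.]
print(MaxAbsScaler().fit_transform(x).ravel())   # roughly [0.001 0.002 0.003 0.004 1.]

# RobustScaler: centering and scaling use the median and the IQR, so the
# inliers keep a sensible spread and only the outlier is extreme.
print(RobustScaler().fit_transform(x).ravel())   # roughly [-1. -0.5 0. 0.5 498.5]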

@@ -297,53 +305,57 @@ def make_plot(item_idx):
 # PowerTransformer
 # ----------------
 #
-# ``PowerTransformer`` applies a power transformation to each feature to make
-# the data more Gaussian-like. Currently, ``PowerTransformer`` implements the
-# Yeo-Johnson and Box-Cox transforms. The power transform finds the optimal
-# scaling factor to stabilize variance and mimimize skewness through maximum
-# likelihood estimation. By default, ``PowerTransformer`` also applies
-# zero-mean, unit variance normalization to the transformed output. Note that
+# :class:`~sklearn.preprocessing.PowerTransformer` applies a power
+# transformation to each feature to make the data more Gaussian-like in order
+# to stabilize variance and minimize skewness. Currently the Yeo-Johnson
+# and Box-Cox transforms are supported and the optimal
+# scaling factor is determined via maximum likelihood estimation in both
+# methods. By default, :class:`~sklearn.preprocessing.PowerTransformer` applies
+# zero-mean, unit variance normalization. Note that
 # Box-Cox can only be applied to strictly positive data. Income and number of
 # households happen to be strictly positive, but if negative values are present
-# the Yeo-Johnson transformed is to be preferred.
+# the Yeo-Johnson transformed is preferred.
 
 make_plot(5)
 make_plot(6)
 
-##############################################################################
-# QuantileTransformer (Gaussian output)
-# -------------------------------------
-#
-# ``QuantileTransformer`` has an additional ``output_distribution`` parameter
-# allowing to match a Gaussian distribution instead of a uniform distribution.
-# Note that this non-parametetric transformer introduces saturation artifacts
-# for extreme values.
-
-make_plot(7)
-
 ###################################################################
 # QuantileTransformer (uniform output)
 # ------------------------------------
 #
-# ``QuantileTransformer`` applies a non-linear transformation such that the
+# :class:`~sklearn.preprocessing.QuantileTransformer` applies a non-linear
+# transformation such that the
 # probability density function of each feature will be mapped to a uniform
-# distribution. In this case, all the data will be mapped in the range [0, 1],
-# even the outliers which cannot be distinguished anymore from the inliers.
+# or Gaussian distribution. In this case, all the data, including outliers,
+# will be mapped to a uniform distribution with the range [0, 1], making
+# outliers indistinguishable from inliers.
+#
+# :class:`~sklearn.preprocessing.RobustScaler` and
+# :class:`~sklearn.preprocessing.QuantileTransformer` are robust to outliers in
+# the sense that adding or removing outliers in the training set will yield
+# approximately the same transformation. But contrary to
+# :class:`~sklearn.preprocessing.RobustScaler`,
+# :class:`~sklearn.preprocessing.QuantileTransformer` will also automatically
+# collapse any outlier by setting them to the a priori defined range boundaries
+# (0 and 1). This can result in saturation artifacts for extreme values.
+
+make_plot(7)
+
+##############################################################################
+# QuantileTransformer (Gaussian output)
+# -------------------------------------
 #
-# As ``RobustScaler``, ``QuantileTransformer`` is robust to outliers in the
-# sense that adding or removing outliers in the training set will yield
-# approximately the same transformation on held out data. But contrary to
-# ``RobustScaler``, ``QuantileTransformer`` will also automatically collapse
-# any outlier by setting them to the a priori defined range boundaries (0 and
-# 1).
+# To map to a Gaussian distribution, set the parameter
+# ``output_distribution='normal'``.
 
 make_plot(8)
 
 ##############################################################################
 # Normalizer
 # ----------
 #
-# The ``Normalizer`` rescales the vector for each sample to have unit norm,
+# The :class:`~sklearn.preprocessing.Normalizer` rescales the vector for each
+# sample to have unit norm,
 # independently of the distribution of the samples. It can be seen on both
 # figures below where all samples are mapped onto the unit circle. In our
 # example the two selected features have only positive values; therefore the
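
Finally, the transformers discussed in this last hunk (power transforms, quantile mapping to a uniform or Gaussian output, and sample-wise normalization) can be exercised directly; a short sketch on synthetic positive, skewed data, illustrative only:

import numpy as np
from sklearn.preprocessing import PowerTransformer, QuantileTransformer, Normalizer

rng = np.random.RandomState(0)
X = rng.lognormal(size=(1000, 2))   # strictly positive, skewed

# Power transforms: Box-Cox requires strictly positive input,
# Yeo-Johnson also accepts zero and negative values.
X_bc = PowerTransformer(method="box-cox").fit_transform(X)
X_yj = PowerTransformer(method="yeo-johnson").fit_transform(X)

# Quantile transforms: uniform output in [0, 1] or a Gaussian output.
X_uni = QuantileTransformer(output_distribution="uniform", n_quantiles=1000).fit_transform(X)
X_gau = QuantileTransformer(output_distribution="normal", n_quantiles=1000).fit_transform(X)

# Normalizer works per sample: every row is rescaled to unit L2 norm.
X_l2 = Normalizer(norm="l2").fit_transform(X)
print(np.linalg.norm(X_l2, axis=1)[:5])   # all (close to) 1.0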

dev/_downloads/scikit-learn-docs.pdf

5.59 KB
Binary file not shown.

dev/_images/iris.png

0 Bytes
