@@ -30,21 +30,21 @@ class TargetEncoder(OneToOneFeatureMixin, _BaseEncoder):
  .. note::
  `fit(X, y).transform(X)` does not equal `fit_transform(X, y)` because a
  cross-validation scheme is used in `fit_transform` for encoding. See the
- :ref:`User Guide <target_encoder>`. for details.
+ :ref:`User Guide <target_encoder>` for details.

  .. versionadded:: 1.3

  Parameters
  ----------
- categories : "auto" or a list of array-like, default="auto"
+ categories : "auto" or list of shape (n_features,) of array-like, default="auto"
  Categories (unique values) per feature:

  - `"auto"` : Determine categories automatically from the training data.
  - list : `categories[i]` holds the categories expected in the i-th column. The
  passed categories should not mix strings and numeric values within a single
  feature, and should be sorted in case of numeric values.

- The used categories is stored in the `categories_` fitted attribute.
+ The used categories are stored in the `categories_` fitted attribute.

  target_type : {"auto", "continuous", "binary"}, default="auto"
  Type of target.
@@ -56,16 +56,17 @@ class TargetEncoder(OneToOneFeatureMixin, _BaseEncoder):

  .. note::
  The type of target inferred with `"auto"` may not be the desired target
- type used for modeling. For example, if the target consistent of integers
+ type used for modeling. For example, if the target consisted of integers
  between 0 and 100, then :func:`~sklearn.utils.multiclass.type_of_target`
  will infer the target as `"multiclass"`. In this case, setting
- `target_type="continuous"` will understand the target as a regression
+ `target_type="continuous"` will specify the target as a regression
  problem. The `target_type_` attribute gives the target type used by the
  encoder.

  smooth : "auto" or float, default="auto"
- The amount of mixing of the categorical encoding with the global target mean. A
- larger `smooth` value will put more weight on the global target mean.
+ The amount of mixing of the target mean conditioned on the value of the
+ category with the global target mean. A larger `smooth` value will put
+ more weight on the global target mean.
  If `"auto"`, then `smooth` is set to an empirical Bayes estimate.

  cv : int, default=5
@@ -75,7 +76,7 @@ class TargetEncoder(OneToOneFeatureMixin, _BaseEncoder):

  shuffle : bool, default=True
  Whether to shuffle the data in :meth:`fit_transform` before splitting into
- batches. Note that the samples within each split will not be shuffled.
+ folds. Note that the samples within each split will not be shuffled.

  random_state : int, RandomState instance or None, default=None
  When `shuffle` is True, `random_state` affects the ordering of the
@@ -87,11 +88,13 @@ class TargetEncoder(OneToOneFeatureMixin, _BaseEncoder):
  Attributes
  ----------
  encodings_ : list of shape (n_features,) of ndarray
- For feature `i`, `encodings_[i]` is the encoding matching the
+ Encodings learnt on all of `X`.
+ For feature `i`, `encodings_[i]` are the encodings matching the
  categories listed in `categories_[i]`.

  categories_ : list of shape (n_features,) of ndarray
- The categories of each feature determined during fitting
+ The categories of each feature determined during fitting or specified
+ in `categories`
  (in order of the features in `X` and corresponding with the output
  of :meth:`transform`).
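
The docstring note in the first hunk says `fit(X, y).transform(X)` does not equal `fit_transform(X, y)`, and the `smooth` text says a larger value puts more weight on the global target mean. The pure-Python sketch below illustrates both points. It is not scikit-learn's implementation. The helper names `smoothed_encodings` and `cross_fitted_transform`, the fixed-`smooth` shrinkage formula, the modulo fold assignment without shuffling, and the fallback to the training mean for unseen categories are all assumptions made for illustration.

```python
from collections import defaultdict


def smoothed_encodings(categories, targets, smooth):
    """Return ({category: shrunken target mean}, global target mean).

    Hypothetical helper: each per-category mean is blended with the
    global mean, and a larger `smooth` pulls it closer to the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    encodings = {
        cat: (sums[cat] + smooth * global_mean) / (counts[cat] + smooth)
        for cat in counts
    }
    return encodings, global_mean


def cross_fitted_transform(categories, targets, smooth, n_folds):
    """Encode each sample with encodings learnt on the other folds only."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Assumed fold assignment: sample i belongs to fold i % n_folds.
        train = [i for i in range(n) if i % n_folds != fold]
        encodings, fallback = smoothed_encodings(
            [categories[i] for i in train],
            [targets[i] for i in train],
            smooth,
        )
        for i in range(fold, n, n_folds):
            # Assumed fallback: a category missing from the training folds
            # is encoded with the training folds' global mean.
            encoded[i] = encodings.get(categories[i], fallback)
    return encoded


cats = ["a", "a", "a", "b", "b", "b"]
y = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# Analogue of fit(X, y).transform(X): encodings learnt on all samples.
full_encodings, _ = smoothed_encodings(cats, y, smooth=0.0)
full = [full_encodings[c] for c in cats]

# Analogue of fit_transform(X, y): each fold is encoded from the other folds.
crossed = cross_fitted_transform(cats, y, smooth=0.0, n_folds=3)

print(full)     # [2.0, 2.0, 2.0, 11.0, 11.0, 11.0]
print(crossed)  # [2.5, 2.0, 1.5, 11.5, 11.0, 10.5]
```

With `smooth=0.0` the full-data encodings are the plain per-category means, while the cross-fitted values vary from sample to sample because each one excludes its own fold. Passing a very large `smooth` to `smoothed_encodings` moves every encoding towards the global mean of 6.5.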