
DOC Update docs guideline regarding docstring formatting #18243

Merged
merged 15 commits into from
Jan 8, 2021
16 changes: 13 additions & 3 deletions doc/developers/contributing.rst
@@ -204,7 +204,7 @@ Please make sure to still check our guidelines below, since they describe our
latest up-to-date workflow.

- Crash Course in Contributing to Scikit-Learn & Open Source Projects:
`Video <https://youtu.be/5OL8XoMMOfA>`__,
`Video <https://youtu.be/5OL8XoMMOfA>`__,
`Transcript
<https://github.com/data-umbrella/event-transcripts/blob/main/2020/05-andreas-mueller-contributing.md>`__

@@ -739,6 +739,8 @@ Finally, follow the formatting rules below to make it consistently good:

sample_weight : array-like of shape (n_samples,), default=None

multioutput_array : ndarray of shape (n_samples, n_classes) or list of such arrays

In general have the following in mind:

1. Use Python basic types. (``bool`` instead of ``boolean``)
@@ -752,10 +754,18 @@ Finally, follow the formatting rules below to make it consistently good:
5. Specify ``dataframe`` when "frame-like" features are being used, such
as the column names.
6. When specifying the data type of a list, use ``of`` as a delimiter:
``list of int``.
``list of int``. When the parameter accepts either an array, described by
its shape and/or data type, or a list of such arrays, you can write
``array-like of shape (n_samples,) or list of such arrays``.
7. When specifying the dtype of an ndarray, use e.g. ``dtype=np.int32``
after defining the shape:
``ndarray of shape (n_samples,), dtype=np.int32``.
``ndarray of shape (n_samples,), dtype=np.int32``. You can specify
multiple dtypes as a set:
``array-like of shape (n_samples,), dtype={np.float64, np.float32}``.
If one wants to mention arbitrary precision, use `integral` and
`floating` rather than the Python dtype `int` and `float`. When both
`int` and `floating` are supported, there is no need to specify the
dtype.
8. When the default is ``None``, ``None`` only needs to be specified at the
end with ``default=None``. Be sure to include in the docstring, what it
means for the parameter or attribute to be ``None``.
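
To make the formatting rules in this hunk concrete, here is a minimal sketch of a docstring written in that style. The function `weighted_mean` and its parameters are hypothetical and are not part of scikit-learn; only the type descriptions (``array-like of shape (n_samples,)``, ``default=None`` at the end, and an explanation of what ``None`` means) are meant to follow the guideline.

```python
def weighted_mean(values, sample_weight=None):
    """Compute the (optionally weighted) mean of a sequence.

    Hypothetical helper used only to illustrate the docstring style.

    Parameters
    ----------
    values : array-like of shape (n_samples,)
        Input values.

    sample_weight : array-like of shape (n_samples,), default=None
        Weight of each sample. If None, all samples are weighted equally.

    Returns
    -------
    mean : float
        Weighted mean of ``values``.
    """
    values = list(values)
    if sample_weight is None:
        # None means "uniform weights", as stated in the docstring.
        sample_weight = [1.0] * len(values)
    sample_weight = list(sample_weight)
    if len(sample_weight) != len(values):
        raise ValueError("values and sample_weight must have the same length")
    total_weight = sum(sample_weight)
    return sum(v * w for v, w in zip(values, sample_weight)) / total_weight
```

Note that the type line for `sample_weight` does not say "or None": per rule 8, ``default=None`` at the end is enough, and the description explains the behaviour in that case.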
7 changes: 7 additions & 0 deletions doc/glossary.rst
@@ -255,6 +255,13 @@ General Concepts
or vectorizing. Our estimators do not work with struct arrays, for
instance.

Our documentation can sometimes give information about the dtype
precision, e.g. `np.int32`, `np.int64`, etc. When the precision is
provided, it refers to the NumPy dtype. If an arbitrary precision is
used, the documentation will refer to dtype `integer` or `floating`.
Note that in this case, the precision can be platform dependent.
The `numeric` dtype refers to accepting both `integer` and `floating`.
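
The distinction drawn in this glossary paragraph can be checked with NumPy's abstract scalar types. The sketch below is only an illustration: the helpers `dtype_family` and `is_numeric` are hypothetical names, and treating everything outside `np.integer` and `np.floating` (booleans, complex numbers, strings) as "other" is an assumption made for this example, not something the glossary states.

```python
import numpy as np


def dtype_family(dtype):
    """Return "integer", "floating", or "other" for a NumPy dtype.

    Hypothetical helper: it ignores the precision and only looks at
    the abstract family the dtype belongs to.
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        return "integer"
    if np.issubdtype(dtype, np.floating):
        return "floating"
    return "other"


def is_numeric(dtype):
    """True when the dtype is "integer" or "floating" (assumed meaning of `numeric`)."""
    return dtype_family(dtype) in {"integer", "floating"}


# Fixed-precision dtypes fall into the same family regardless of their width.
print(dtype_family(np.int32), dtype_family(np.int64))
print(dtype_family(np.float32), dtype_family(np.float64))

# The precision behind the Python type `int` depends on the platform and
# NumPy version, so no expected output is given here.
print(np.dtype(int))
```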

TODO: Mention efficiency and precision issues; casting policy.

duck typing
2 changes: 1 addition & 1 deletion examples/model_selection/plot_learning_curve.py
@@ -77,7 +77,7 @@ def plot_learning_curve(estimator, title, X, y, axes=None, ylim=None, cv=None,
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
for more details.

train_sizes : array-like of shape (n_ticks,), dtype={int, float}
train_sizes : array-like of shape (n_ticks,)
Relative or absolute numbers of training examples that will be used to
generate the learning curve. If the ``dtype`` is float, it is regarded
as a fraction of the maximum size of the training set (that is
8 changes: 4 additions & 4 deletions sklearn/dummy.py
@@ -64,13 +64,13 @@ class DummyClassifier(MultiOutputMixin, ClassifierMixin, BaseEstimator):

Attributes
----------
classes_ : ndarray of shape (n_classes,) or list thereof
classes_ : ndarray of shape (n_classes,) or list of such arrays
Class labels for each output.

n_classes_ : int or list of int
Number of labels for each output.

class_prior_ : ndarray of shape (n_classes,) or list thereof
class_prior_ : ndarray of shape (n_classes,) or list of such arrays
Probability of each class for each output.

n_outputs_ : int
@@ -272,7 +272,7 @@ def predict_proba(self, X):

Returns
-------
P : ndarray of shape (n_samples, n_classes) or list thereof
P : ndarray of shape (n_samples, n_classes) or list of such arrays
Returns the probability of the sample for each class in
the model, where classes are ordered arithmetically, for each
output.
@@ -335,7 +335,7 @@ def predict_log_proba(self, X):

Returns
-------
P : ndarray of shape (n_samples, n_classes) or list thereof
P : ndarray of shape (n_samples, n_classes) or list of such arrays
Returns the log probability of the sample for each class in
the model, where classes are ordered arithmetically for each
output.
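
A return value documented as ``ndarray of shape (n_samples, n_classes) or list of such arrays`` means callers have to handle both forms. The following is a minimal sketch of one way to do that, assuming a hypothetical helper `as_list_of_outputs` and made-up probability values; it is not code from `sklearn/dummy.py`.

```python
import numpy as np


def as_list_of_outputs(P):
    """Return ``P`` as a list with one array per output.

    Hypothetical helper: a single array is wrapped in a one-element list,
    a list of arrays is returned as a list of ndarrays.
    """
    if isinstance(P, list):
        return [np.asarray(p) for p in P]
    return [np.asarray(P)]


# Made-up probabilities for 2 samples.
single = np.array([[0.5, 0.5], [0.5, 0.5]])   # one output with 2 classes
multi = [single, np.full((2, 3), 1.0 / 3.0)]  # two outputs with 2 and 3 classes

for P in (single, multi):
    for k, P_k in enumerate(as_list_of_outputs(P)):
        # Each P_k has shape (n_samples, n_classes) for output k.
        print(k, P_k.shape)
```

The list form is needed because the number of classes can differ between outputs, so the per-output arrays cannot always be stacked into one ndarray.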
38 changes: 21 additions & 17 deletions sklearn/linear_model/_least_angle.py
@@ -864,21 +864,22 @@ class Lars(MultiOutputMixin, RegressorMixin, LinearModel):

Attributes
----------
alphas_ : array-like of shape (n_alphas + 1,) or list of thereof of \
shape (n_targets,)
alphas_ : array-like of shape (n_alphas + 1,) or list of such arrays
Maximum of covariances (in absolute value) at each iteration.
``n_alphas`` is either ``max_iter``, ``n_features`` or the
number of nodes in the path with ``alpha >= alpha_min``, whichever
is smaller.
is smaller. If this is a list of array-like, the length of the outer
list is `n_targets`.

active_ : list of shape (n_alphas,) or list of thereof of shape \
(n_targets,)
active_ : list of shape (n_alphas,) or list of such lists
Indices of active variables at the end of the path.
If this is a list of lists, the length of the outer list is `n_targets`.

coef_path_ : array-like of shape (n_features, n_alphas + 1) or list of \
thereof of shape (n_targets,)
coef_path_ : array-like of shape (n_features, n_alphas + 1) or list \
of such arrays
The varying values of the coefficients along the path. It is not
present if the ``fit_path`` parameter is ``False``.
present if the ``fit_path`` parameter is ``False``. If this is a list
of array-like, the length of the outer list is `n_targets`.

coef_ : array-like of shape (n_features,) or (n_targets, n_features)
Parameter vector (w in the formulation formula).
@@ -1121,21 +1122,23 @@ class LassoLars(Lars):

Attributes
----------
alphas_ : array-like of shape (n_alphas + 1,) or list of thereof of shape \
(n_targets,)
alphas_ : array-like of shape (n_alphas + 1,) or list of such arrays
Maximum of covariances (in absolute value) at each iteration.
``n_alphas`` is either ``max_iter``, ``n_features`` or the
number of nodes in the path with ``alpha >= alpha_min``, whichever
is smaller.
is smaller. If this is a list of array-like, the length of the outer
list is `n_targets`.

active_ : list of length n_alphas or list of thereof of shape (n_targets,)
active_ : list of length n_alphas or list of such lists
Indices of active variables at the end of the path.
If this is a list of lists, the length of the outer list is `n_targets`.

coef_path_ : array-like of shape (n_features, n_alphas + 1) or list of \
thereof of shape (n_targets,)
coef_path_ : array-like of shape (n_features, n_alphas + 1) or list \
of such arrays
If a list is passed it's expected to be one of n_targets such arrays.
The varying values of the coefficients along the path. It is not
present if the ``fit_path`` parameter is ``False``.
present if the ``fit_path`` parameter is ``False``. If this is a list
of array-like, the length of the outer list is `n_targets`.

coef_ : array-like of shape (n_features,) or (n_targets, n_features)
Parameter vector (w in the formulation formula).
@@ -1382,8 +1385,9 @@ class LarsCV(Lars):

Attributes
----------
active_ : list of length n_alphas or list of thereof of shape (n_targets,)
active_ : list of length n_alphas or list of such lists
Indices of active variables at the end of the path.
If this is a list of lists, the outer list length is `n_targets`.

coef_ : array-like of shape (n_features,)
parameter vector (w in the formulation formula)
@@ -1775,7 +1779,7 @@ class LassoLarsIC(LassoLars):
alpha_ : float
the alpha parameter chosen by the information criterion

alphas_ : array-like of shape (n_alphas + 1,) or list thereof
alphas_ : array-like of shape (n_alphas + 1,) or list of such arrays
Maximum of covariances (in absolute value) at each iteration.
``n_alphas`` is either ``max_iter``, ``n_features`` or the
number of nodes in the path with ``alpha >= alpha_min``, whichever
6 changes: 3 additions & 3 deletions sklearn/preprocessing/_discretization.py
@@ -139,7 +139,7 @@ def fit(self, X, y=None):

Parameters
----------
X : array-like of shape (n_samples, n_features), dtype={int, float}
X : array-like of shape (n_samples, n_features)
Data to be discretized.

y : None
@@ -276,7 +276,7 @@ def transform(self, X):

Parameters
----------
X : array-like of shape (n_samples, n_features), dtype={int, float}
X : array-like of shape (n_samples, n_features)
Data to be discretized.

Returns
@@ -326,7 +326,7 @@ def inverse_transform(self, Xt):

Parameters
----------
Xt : array-like of shape (n_samples, n_features), dtype={int, float}
Xt : array-like of shape (n_samples, n_features)
Transformed data in the binned space.

Returns