[WIP/RFC] Test docstring parameters (with order) #9023

Closed
wants to merge 24 commits into from
7 changes: 4 additions & 3 deletions .travis.yml
@@ -23,7 +23,7 @@ matrix:
# This environment tests that scikit-learn can be built against
# versions of numpy, scipy with ATLAS that comes with Ubuntu Trusty 14.04
- env: DISTRIB="ubuntu" PYTHON_VERSION="2.7" CYTHON_VERSION="0.23.4"
-COVERAGE=true
+COVERAGE=true TEST_DOCSTRINGS="false"
Member

No need to set TEST_DOCSTRINGS if you don't want to test the docstrings (similar to what we do with COVERAGE)

addons:
apt:
packages:
@@ -34,12 +34,12 @@ matrix:
# This environment tests the oldest supported anaconda env
- env: DISTRIB="conda" PYTHON_VERSION="2.7" INSTALL_MKL="false"
NUMPY_VERSION="1.8.2" SCIPY_VERSION="0.13.3" CYTHON_VERSION="0.23.5"
-COVERAGE=true
+COVERAGE=true TEST_DOCSTRINGS="true"
# This environment tests the newest supported Anaconda release (4.4.0)
# It also runs tests requiring Pandas.
- env: DISTRIB="conda" PYTHON_VERSION="3.6.1" INSTALL_MKL="true"
NUMPY_VERSION="1.12.1" SCIPY_VERSION="0.19.0" PANDAS_VERSION="0.20.1"
-CYTHON_VERSION="0.25.2" COVERAGE=true
+CYTHON_VERSION="0.25.2" COVERAGE=true TEST_DOCSTRINGS="false"
# This environment use pytest to run the tests. It uses the newest
# supported Anaconda release (4.4.0). It also runs tests requiring Pandas.
- env: USE_PYTEST="true" DISTRIB="conda" PYTHON_VERSION="3.6.1"
@@ -49,6 +49,7 @@ matrix:
- env: RUN_FLAKE8="true" SKIP_TESTS="true"
DISTRIB="conda" PYTHON_VERSION="3.5" INSTALL_MKL="true"
NUMPY_VERSION="1.12.1" SCIPY_VERSION="0.19.0" CYTHON_VERSION="0.23.5"
+TEST_DOCSTRINGS="true"
# This environment tests scikit-learn against numpy and scipy master
# installed from their CI wheels in a virtualenv with the Python
# interpreter provided by travis.
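
The review comment above suggests leaving `TEST_DOCSTRINGS` unset in environments that should not test docstrings. A minimal Python sketch of why that works, assuming the flag is only ever compared against the literal string "true" (as the `install.sh` check in this PR does). The helper name `docstring_tests_enabled` is hypothetical and is not part of scikit-learn or Travis.

```python
import os


def docstring_tests_enabled(environ=None):
    """Return True only when TEST_DOCSTRINGS is exactly the string "true"."""
    env = os.environ if environ is None else environ
    # An unset variable falls back to "", which is not "true". This mirrors
    # a shell test such as [[ "$TEST_DOCSTRINGS" == "true" ]] when the
    # variable is undefined (assuming the script does not run with `set -u`).
    return env.get("TEST_DOCSTRINGS", "") == "true"


# Unset, explicitly "false", and explicitly "true".
print(docstring_tests_enabled({}))
print(docstring_tests_enabled({"TEST_DOCSTRINGS": "false"}))
print(docstring_tests_enabled({"TEST_DOCSTRINGS": "true"}))
```

Under this assumption, unset and "false" are indistinguishable, so only the environments that opt in need to mention the variable.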
4 changes: 4 additions & 0 deletions build_tools/travis/install.sh
@@ -85,6 +85,10 @@ if [[ "$COVERAGE" == "true" ]]; then
pip install coverage codecov
fi

+if [[ "$TEST_DOCSTRINGS" == "true" ]]; then
+pip install sphinx numpydoc # numpydoc requires sphinx
+fi
+
if [[ "$SKIP_TESTS" == "true" ]]; then
echo "No need to build scikit-learn when not running the tests"
else
52 changes: 49 additions & 3 deletions sklearn/base.py
@@ -428,6 +428,11 @@ def get_indices(self, i):

Only works if ``rows_`` and ``columns_`` attributes exist.

+Parameters
+----------
+i : int
+The index of the cluster.
+
Returns
-------
row_ind : np.array, dtype=np.intp
@@ -443,6 +448,11 @@ def get_indices(self, i):
def get_shape(self, i):
"""Shape of the i'th bicluster.

+Parameters
+----------
+i : int
+The index of the cluster.
+
Returns
-------
shape : (int, int)
@@ -454,9 +464,22 @@ def get_shape(self, i):
def get_submatrix(self, i, data):
"""Returns the submatrix corresponding to bicluster `i`.

+Parameters
+----------
+i : int
+The index of the cluster.
+data : array
+The data.
+
+Returns
+-------
+submatrix : array
+The submatrix corresponding to bicluster i.
+
+Notes
+-----
Works with sparse matrices. Only works if ``rows_`` and
``columns_`` attributes exist.

"""
from .utils.validation import check_array
data = check_array(data, accept_sparse='csr')
@@ -525,10 +548,33 @@ class MetaEstimatorMixin(object):
###############################################################################

def is_classifier(estimator):
-"""Returns True if the given estimator is (probably) a classifier."""
+"""Returns True if the given estimator is (probably) a classifier.
+
+Parameters
+----------
+estimator : object
+Estimator object to test.
+
+Returns
+-------
+out : bool
+True if estimator is a classifier and False otherwise.
+"""
return getattr(estimator, "_estimator_type", None) == "classifier"


def is_regressor(estimator):
-"""Returns True if the given estimator is (probably) a regressor."""
+"""Returns True if the given estimator is (probably) a regressor.
+
+
+Parameters
+----------
+estimator : object
+Estimator object to test.
+
+Returns
+-------
+out : bool
+True if estimator is a regressor and False otherwise.
+"""
return getattr(estimator, "_estimator_type", None) == "regressor"
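
For illustration, the behaviour described by the two docstrings above can be exercised with toy classes. This is only a sketch: the two functions are copied from the diff so the snippet runs without scikit-learn, and `ToyClassifier`, `ToyRegressor`, and `PlainObject` are hypothetical stand-ins, not scikit-learn classes.

```python
def is_classifier(estimator):
    # Copied from the diff: relies only on the `_estimator_type` attribute.
    return getattr(estimator, "_estimator_type", None) == "classifier"


def is_regressor(estimator):
    # Copied from the diff: same attribute, different expected value.
    return getattr(estimator, "_estimator_type", None) == "regressor"


class ToyClassifier:
    """Hypothetical estimator that declares itself a classifier."""
    _estimator_type = "classifier"


class ToyRegressor:
    """Hypothetical estimator that declares itself a regressor."""
    _estimator_type = "regressor"


class PlainObject:
    """Hypothetical object with no `_estimator_type` attribute at all."""


for obj in (ToyClassifier(), ToyRegressor(), PlainObject()):
    print(type(obj).__name__, is_classifier(obj), is_regressor(obj))
```

An object without the attribute is reported as neither, which is why the docstrings say "(probably)": the check trusts whatever the estimator declares.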
6 changes: 3 additions & 3 deletions sklearn/cluster/affinity_propagation_.py
@@ -199,13 +199,13 @@ class AffinityPropagation(BaseEstimator, ClusterMixin):
damping : float, optional, default: 0.5
Damping factor between 0.5 and 1.

+max_iter : int, optional, default: 200
+Maximum number of iterations.
+
convergence_iter : int, optional, default: 15
Number of iterations with no change in the number
of estimated clusters that stops the convergence.

-max_iter : int, optional, default: 200
-Maximum number of iterations.
-
copy : boolean, optional, default: True
Make a copy of input data.
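
The reordering above reflects the idea in the PR title: the documented parameters should appear in the same order as in the signature. A rough sketch of such a check follows, using only the standard library. It is not the test added by this PR (which installs numpydoc for parsing); `documented_parameters`, `order_mismatch`, and `toy_fit` are hypothetical names, and the parser assumes simple numpydoc-style sections with unindented `name : type` entries.

```python
import inspect
import re

# Matches an unindented numpydoc entry such as "max_iter : int, optional".
_ENTRY = re.compile(r"^(\w+)\s*:")


def documented_parameters(docstring):
    """Return the names listed under 'Parameters', in documented order."""
    lines = inspect.cleandoc(docstring).splitlines()
    names, in_params = [], False
    for idx, line in enumerate(lines):
        following = lines[idx + 1].strip() if idx + 1 < len(lines) else ""
        if following and set(following) == {"-"}:
            # A line underlined with dashes starts a new section.
            in_params = line.strip() == "Parameters"
            continue
        if in_params:
            match = _ENTRY.match(line)
            if match:
                names.append(match.group(1))
    return names


def order_mismatch(func):
    """Return (signature_order, docstring_order) if they differ, else None."""
    in_signature = list(inspect.signature(func).parameters)
    in_docstring = documented_parameters(func.__doc__ or "")
    return None if in_signature == in_docstring else (in_signature, in_docstring)


def toy_fit(damping=0.5, max_iter=200, convergence_iter=15, copy=True):
    """Hypothetical function whose docstring lists parameters out of order.

    Parameters
    ----------
    damping : float
        Damping factor.
    convergence_iter : int
        Iterations without change before stopping.
    max_iter : int
        Maximum number of iterations.
    copy : bool
        Make a copy of input data.
    """


print(order_mismatch(toy_fit))
```

Comparing full ordered lists catches missing, extra, and misordered entries in one step, at the cost of a less specific failure message.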

35 changes: 19 additions & 16 deletions sklearn/cluster/hierarchical.py
@@ -312,6 +312,9 @@ def linkage_tree(X, connectivity=None, n_components=None,
be symmetric and only the upper triangular half is used.
Default is None, i.e, the Ward algorithm is unstructured.

+n_components : int (optional)
+The number of connected components in the graph.
+
n_clusters : int (optional)
Stop early the construction of the tree at n_clusters. This is
useful to decrease computation time if the number of clusters is
@@ -596,14 +599,6 @@ class AgglomerativeClustering(BaseEstimator, ClusterMixin):
n_clusters : int, default=2
The number of clusters to find.

-connectivity : array-like or callable, optional
-Connectivity matrix. Defines for each sample the neighboring
-samples following a given structure of the data.
-This can be a connectivity matrix itself or a callable that transforms
-the data into a connectivity matrix, such as derived from
-kneighbors_graph. Default is None, i.e, the
-hierarchical clustering algorithm is unstructured.
-
affinity : string or callable, default: "euclidean"
Metric used to compute the linkage. Can be "euclidean", "l1", "l2",
"manhattan", "cosine", or 'precomputed'.
@@ -615,6 +610,14 @@ class AgglomerativeClustering(BaseEstimator, ClusterMixin):
By default, no caching is done. If a string is given, it is the
path to the caching directory.

+connectivity : array-like or callable, optional
+Connectivity matrix. Defines for each sample the neighboring
+samples following a given structure of the data.
+This can be a connectivity matrix itself or a callable that transforms
+the data into a connectivity matrix, such as derived from
+kneighbors_graph. Default is None, i.e, the
+hierarchical clustering algorithm is unstructured.
+
compute_full_tree : bool or 'auto' (optional)
Stop early the construction of the tree at n_clusters. This is
useful to decrease computation time if the number of clusters is
@@ -766,14 +769,6 @@ class FeatureAgglomeration(AgglomerativeClustering, AgglomerationTransform):
n_clusters : int, default 2
The number of clusters to find.

-connectivity : array-like or callable, optional
-Connectivity matrix. Defines for each feature the neighboring
-features following a given structure of the data.
-This can be a connectivity matrix itself or a callable that transforms
-the data into a connectivity matrix, such as derived from
-kneighbors_graph. Default is None, i.e, the
-hierarchical clustering algorithm is unstructured.
-
affinity : string or callable, default "euclidean"
Metric used to compute the linkage. Can be "euclidean", "l1", "l2",
"manhattan", "cosine", or 'precomputed'.
@@ -785,6 +780,14 @@ class FeatureAgglomeration(AgglomerativeClustering, AgglomerationTransform):
By default, no caching is done. If a string is given, it is the
path to the caching directory.

+connectivity : array-like or callable, optional
+Connectivity matrix. Defines for each feature the neighboring
+features following a given structure of the data.
+This can be a connectivity matrix itself or a callable that transforms
+the data into a connectivity matrix, such as derived from
+kneighbors_graph. Default is None, i.e, the
+hierarchical clustering algorithm is unstructured.
+
compute_full_tree : bool or 'auto', optional, default "auto"
Stop early the construction of the tree at n_clusters. This is
useful to decrease computation time if the number of clusters is