-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG+1] Doc: complete PR #11180 for the reorganization of the dataset loading utilities section #11328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
And again, I'm sorry for messing up the merge... |
It's really not a big deal, and thank YOU for reviewing all my PRs so fast ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some comments. Also please prefix the titles of your PR with [WIP] (Work In Progress) or [MRG] (Ready to merge) to let the reviewers know whether you are still working on improvements to an early PR or not.
doc/datasets/index.rst
Outdated
|
||
*desc* | ||
They can be loaded using the following functions : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No whitespace before ":" in English.
doc/datasets/index.rst
Outdated
@@ -262,6 +277,8 @@ Generators for decomposition | |||
Loading other datasets | |||
====================== | |||
|
|||
This section gathers different tools to load other kinds of datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence brings very little information I would rather remove it to avoid diluting the actual information with boilerplate English connection sentences.
doc/datasets/index.rst
Outdated
require to download any file from some external website. | ||
scikit-learn comes with a few small standard datasets that do not require to | ||
download any file from some external website. A description of each dataset is | ||
available below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"A description of each dataset is available below." could probably be trimmed to keep the documentation as informative as possible.
doc/datasets/index.rst
Outdated
some contain ``feature_names`` and ``target_names``. See the dataset | ||
descriptions below for details. | ||
|
||
**The dataset generation functions.** They can be used to generate controled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: controlled
doc/datasets/index.rst
Outdated
@@ -9,49 +9,61 @@ Dataset loading utilities | |||
The ``sklearn.datasets`` package embeds some small toy datasets | |||
as introduced in the :ref:`Getting Started <loading_example_dataset>` section. | |||
|
|||
This package also features helpers to fetch larger datasets commonly | |||
used by the machine learning community to benchmark algorithm on data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... to benchmark algorithms ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @jeremiedbb.
doc/datasets/index.rst
Outdated
interface, returning a tuple ``(X, y)`` consisting of a ``n_samples`` * | ||
Both loaders and fetchers functions return a dictionary-like object holding | ||
at least two items: an array of shape ``n_samples`` * ``n_features`` with | ||
key ``data``(except for 20newsgroups) and a numpy array of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why but there's formatting issue here. See https://25798-843222-gh.circle-artifacts.com/0/doc/datasets/index.html. I guess we should have a blank before the left bracket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't notice, thanks !
Thanks @jeremiedbb for the great work :) |
This PR is just the continued work started in #11180 for the reorganization of the datasets loading utilities
section in the doc discussed in #11083.
There were TODOs left out in #11180 that this PR aims to fill. It's essentially adding/reworking introductions of the new datasets loading utilities sections.