Skip to content

DOC Contributing guide update #13961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 20, 2019

Conversation

NicolasHug
Copy link
Member

This PR updates the contributing guide.

I tried to prioritize the instructions that, IMHO, are relevant for a potentially inexperienced new contributor (based on my experience from the sprints)

Some of the changes are personal taste, so feel free to disagree.

I also created a separate entry for the governance document, since I don't think it should be in the contributing guide.

@@ -678,54 +672,6 @@ To test code coverage, you need to install the `coverage

3. Loop.

Developers web site
-------------------
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wiki seems outdated, and with lots of TODO, so I don't think it's a particularly useful resource. This is a typical rabit-hole risk for new users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put a note on the wiki to that effect.

Contributing pull requests
--------------------------
Pull request checklist
----------------------
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is essentially the same but I tried to prioritize tests (with some tips that were IMO missing), documentation and pep8.


$ git checkout -b my-feature

and start making changes. Always use a ``feature`` branch. It's good practice to
never work on the ``master`` branch!

.. note::
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this note was redundant with the instructions below about solving conflicts, but feel free to disagree (esp @adrinjalali since I think you wrote it IIRC).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

@abhinavsagar
Copy link
Contributor

Looks good to me.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting started.

</div>
<div class="row-fluid">
<div class="span4 box">
<h2><a href="governance.html">Governance</a></h2>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I had thought it sufficient to link to this from the About Us page, but now I can't see where we link to About Us, except on index.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think anywhere except for "about us". The whole navigation is a bit of a mess tbh, I don't have strong feelings either way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we somehow came to the conclusion to not put the link here. But I personally prefer to have it here and more visible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably link to About Us here instead of governance. And we should possibly move the Governance section of about.rst up to before the author list???


pip install --editable .

every time the source code of a compiled extension is
changed (for instance when switching branches or pulling changes from upstream).
every time the source code of a compiled extension is changed (for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd replace that, or maybe just mention, by python setup.py build_ext -i which really is what is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add another command, even if it's more strictly correct. I think introducing a single command is less scary for beginners.

@NicolasHug
Copy link
Member Author

@scikit-learn/core-devs people, any thought against switching from [MRG]/[WIP] to using Github Draft PRs?

@TomDLT
Copy link
Member

TomDLT commented May 29, 2019

I did not find a way to change from PR to Draft, and we would rely completely on users to mark it as Draft.

@rth
Copy link
Member

rth commented May 29, 2019

Me too, I'm not sure that using Draft instead of WIP matters in practice.

@thomasjpfan
Copy link
Member

I am +1 on keeping [MRG]/[WIP] since there is not a way to change from PR to Draft.

@NicolasHug
Copy link
Member Author

Ok, leaving it as is then. Thanks!

The latest contributing guide is available in the repository at
`doc/developers/contributing.rst`, or online at:

https://scikit-learn.org/dev/developers/contributing.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about (and maybe we had some discussion about it?) having the link from the stable website to the dev for the contributing guide. But I'm more than happy with this note here.

Copy link
Member

@ogrisel ogrisel Jun 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this note is good enough.

In the vast majority of cases, building scikit-learn for development purposes
can be done with::

pip install numpy scipy cython pytest flake8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

joblib?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it also kinda necessary to have sphinx and sphinx-gallery for development.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point for joblib.
For the doc dependencies, I had them before, but reverted (https://github.com/scikit-learn/scikit-learn/pull/13961/files#r288462944)

I have no strong opinion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joblib will be installed automatically when installing scikit-learn.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the same argument we should also remove numpy and scipy. We have numpy, scipy, and joblib as requirements in setup.py.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed them. I'm not sure about removing numpy though.

4. Install library in editable mode::
4. Install the development dependencies::

$ pip install numpy scipy cython pytest flake8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

joblib? Also, I find it suboptimal and hard to maintain that we basically have this section twice in our documentations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a little bit of redundancy in our docs is worth it if it avoids users going through the rabbit holes of clicking into our multiple "see link for details".

specific to the file
- `pytest sklearn/linear_model` to test the whole `linear_model` module
- `pytest sklearn/doc/linear_model.rst` to make sure the user guide
examples are correct.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding the test_common and test_docstrings may also be useful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding test_common, but test_docstrings very rarely fails so probably not worth adding it

@NicolasHug
Copy link
Member Author

Before this is merged we need to know what are the actual dependencies needed to run pip install -e .

  • cython?
  • cython numpy?
  • cython numpy scipy?

Just cython seems to work on my machine but I'm confused because I thought that wasn't supposed to work.

@adrinjalali
Copy link
Member

AFAIK, it works because pip fetches dependencies, and we have included numpy, scipy, and joblib in setup.py. Cython needs to be installed manually cause it's a build dependency and not a runtime/install dependency, and I couldn't find a way to tell pip to install something only if its making the package and not just installing a prepackaged one.

@jeremiedbb
Copy link
Member

(just a remark: cython only is not enough if you build with the make in command)

@adrinjalali
Copy link
Member

Could we have this in before Friday? (it'd be really nice to have it for the sprint on Saturday)

@NicolasHug
Copy link
Member Author

Could we have this in before Friday?

Still waiting for a clear answer to #13961 (comment). maybe @ogrisel @amueller ?

@adrinjalali
Copy link
Member

(just a remark: cython only is not enough if you build with the make in command)

that's true, but at the point somebody does make in, they probably know how and which dependencies to install.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that the section on "Working Notes" under "Rolling your own estimator" could be removed or changed to say that we use GitHub's pull requests and issues to track work on a project more than its Wiki.

Otherwise LGTM. I'm wondering still whether there are good ways to split this into two pages, one focused on coding (?) with scikit-learn practice and the other on getting started with contributing.

</div>
<div class="row-fluid">
<div class="span4 box">
<h2><a href="governance.html">Governance</a></h2>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably link to About Us here instead of governance. And we should possibly move the Governance section of about.rst up to before the author list???

the user guide, with small code snipets. If relevant, please also add
references in the literature, with PDF links when possible.

11. The documentation should also include expected time and space complexity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which documentation? say user guide...

@@ -678,54 +672,6 @@ To test code coverage, you need to install the `coverage

3. Loop.

Developers web site
-------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put a note on the wiki to that effect.

@@ -69,11 +76,6 @@ The following people have been active contributors in the past, but are no longe
- Virgile Fritsch
- Wei Li
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, we should really kill this author list, because it is never up to date, and point to the list generated by github.

However, this could go in another PR.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm overall.


To build the documentation, you need to be in the ``doc`` folder::

cd doc

It also requires having the version of scikit-learn installed that corresponds
to the documentation, e.g.::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this? The command should be executed before cd ing into the doc folder though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we start the section with "First, make sure you have :ref:properly installed <install_bleeding_edge>
the development version."

In the past, the policy to resolve conflicts was to rebase your branch on
``master``. GitHub interface deals with merging ``master`` better than in
the past.
The `Git documentation <https://git-scm.com/documentation>`_ is an excellent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, I would put this in a "topic" directive.

Also, I would add a link to http://try.github.io/, which I find really good to learn git + github workflow.

Copy link
Member

@GaelVaroquaux GaelVaroquaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am 👍 for merging this. It is a net improvement.

I'll wait a bit to see if there are further small comments, but I'd like to merge this soon, because I feel that it improves things.

@NicolasHug
Copy link
Member Author

Before this is merged, what are the actual dependencies needed to run pip install -e .

  • cython?
  • cython numpy?
  • cython numpy scipy?

@adrinjalali
Copy link
Member

@NicolasHug I think it's easy to confirm (in a reproducible manner) that Cython is the only dependency there.

@rth
Copy link
Member

rth commented Jun 18, 2019

I think it's easy to confirm (in a reproducible manner) that Cython is the only dependency there.

It's just that it's not in install_requires in setup.py and need be manually installed. The actual build dependencies are numpy, scipy and cython I believe (we do use scipy BLAS in cython files...).

@adrinjalali
Copy link
Member

But this works:

[adrin@beauty scikit-learn]$ virtualenv -p python3 /tmp/.venv
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /tmp/.venv/bin/python3
Also creating executable in /tmp/.venv/bin/python
Installing setuptools, pip, wheel...
done.
[adrin@beauty scikit-learn]$ source /tmp/.venv/bin/activate
(.venv) [adrin@beauty scikit-learn]$ pip install Cython
Collecting Cython
  Using cached https://files.pythonhosted.org/packages/b8/c0/500afce93187b9d94d5c6e80a3032f90bf559bd4c8d12b45a10bc9a67b92/Cython-0.29.10-cp37-cp37m-manylinux1_x86_64.whl
Installing collected packages: Cython
Successfully installed Cython-0.29.10
(.venv) [adrin@beauty scikit-learn]$ pip install -e .
Obtaining file:///home/adrin/Projects/github.com/sklearn/scikit-learn
Collecting numpy>=1.11.0 (from scikit-learn==0.22.dev0)
  Using cached https://files.pythonhosted.org/packages/fc/d1/45be1144b03b6b1e24f9a924f23f66b4ad030d834ad31fb9e5581bd328af/numpy-1.16.4-cp37-cp37m-manylinux1_x86_64.whl
Collecting scipy>=0.17.0 (from scikit-learn==0.22.dev0)
  Using cached https://files.pythonhosted.org/packages/5d/bd/c0feba81fb60e231cf40fc8a322ed5873c90ef7711795508692b1481a4ae/scipy-1.3.0-cp37-cp37m-manylinux1_x86_64.whl
Collecting joblib>=0.11 (from scikit-learn==0.22.dev0)
  Using cached https://files.pythonhosted.org/packages/cd/c1/50a758e8247561e58cb87305b1e90b171b8c767b15b12a1734001f41d356/joblib-0.13.2-py2.py3-none-any.whl
Installing collected packages: numpy, scipy, joblib, scikit-learn
  Running setup.py develop for scikit-learn
Successfully installed joblib-0.13.2 numpy-1.16.4 scikit-learn scipy-1.3.0

@glemaitre
Copy link
Member

Yes because we state numpy, scipy and joblib in the install_requires so they will be collected by pip (while we don't do that for cython).

Before this is merged, what are the actual dependencies needed to run pip install -e .

So for the command pip install -e . to work you only need to manually install cython. The three others will be fetched for you.

@NicolasHug
Copy link
Member Author

Ok, so I guess this can be merged now :)

@glemaitre
Copy link
Member

Ok, so I guess this can be merged now :)

There is a lot of +1. Feel free to merged once this is green ;)
I'll merge it if I am quicker

@NicolasHug NicolasHug changed the title [MRG+1] Contributing guide update DOC Contributing guide update Jun 20, 2019
@NicolasHug NicolasHug merged commit 214def0 into scikit-learn:master Jun 20, 2019
@NicolasHug
Copy link
Member Author

Thanks everyone for the reviews!

@adrinjalali have fun at the sprint! You'll get to crash test it ^^

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Jul 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.