Skip to content

DOC Link to build from source in contributing guide #21369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 5, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 17 additions & 15 deletions doc/developers/contributing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -254,33 +254,35 @@ how to set up your git repository:

.. prompt:: bash $

git clone git@github.com:YourLogin/scikit-learn.git # add --depth 1 if your connection is slow
cd scikit-learn
git clone git@github.com:YourLogin/scikit-learn.git # add --depth 1 if your connection is slow
cd scikit-learn

4. Install the development dependencies:

.. prompt:: bash $

pip install cython pytest pytest-cov flake8 mypy black==21.6b0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those optional dev dependencies (besides cython) are not part of the build from source doc. We should either add insert an item on the latter and cross-link back to this doc in the last item.


5. Install scikit-learn in editable mode:
3. Follow steps 2-7 in :ref:`install_bleeding_edge` to build scikit-learn in
development mode and return to this document.

.. prompt:: bash $
4. Install the development dependencies:

pip install --no-build-isolation --editable .
.. prompt:: bash $

If you receive errors in building scikit-learn, see the
:ref:`install_bleeding_edge` section.
pip install pytest pytest-cov flake8 mypy black==21.6b0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not mistaken, you only need to install pre-commit that at the first commit will install flake8 mypy and black==21.6b0. It is indeed the step 9. below that is stated optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should remove the **optional** because it would be disregarded easily. I think that most of the time it is a good practice to install pre-commit because it will run black which is the main source of linting issues.

So in step 9. below, I would:

  • remove the optional part
  • emphasize that they are not required but will ensure that the commit changes will not fail the linting
  • mention that you can skip pre-commit using the flag --no-verify when commiting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not mistaken, you only need to install pre-commit that at the first commit will install flake8 mypy and black==21.6b0.

My understanding is that pre-commit installs black in it's own virtual environment. If we want contributors to have black in their development environment, then they would still need install black.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can merge this and follow up with a PR if we want required pre-commit.


.. _upstream:

6. Add the ``upstream`` remote. This saves a reference to the main
5. Add the ``upstream`` remote. This saves a reference to the main
scikit-learn repository, which you can use to keep your repository
synchronized with the latest changes:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to keep upstream remote configuration as a specific items because first time contributors are often confused by the remote configuration so I believe it's important to have a linkable bullet item for this. Then at the end if this item, I would add something like:

"""
You can check that the upstream and origin remote aliases are probably configured with the git remote -v command. It should display something like the following.

origin	https://github.com/mygithubname/scikit-learn.git (fetch)
origin	https://github.com/mygithubname/scikit-learn.git (push)
upstream	https://github.com/scikit-learn/scikit-learn.git (fetch)
upstream	https://github.com/scikit-learn/scikit-learn.git (push)

with those remote repository aliases in place, doing:

  .. prompt:: bash $
    git pull upstream main

will be equivalent to the explicit:

  .. prompt:: bash $
    git pull https://github.com/scikit-learn/scikit-learn.git main

"""

Note that I used the HTTPS URL not to introduce the complexity of setting up ssh. Maybe we should update the advanced installation doc to also use the https:// URL for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated everything to use SSH since most of our git clone are targeting ssh. I also adding the check for git remote -v for checking the remotes.

As for git pull upstream main, in step 7, we already describe a similar workflow:

git checkout main
git fetch upstream
git merge upstream/main

I like the fetch+merge a little better, since it is the same steps for syncing a local branch with upstream/main.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for fetch and merge because it does not depend on the git config for the rebase/merge behavior of the git pull command.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ogrisel @thomasjpfan

  • 1
    It is good to see the fork information in here.

The page is called "Contributing" and for the text "If you plan on submitting a pull-request, you should clone from your fork instead," it's helpful for the user to have the remotes prominently part of the workflow.

.. prompt:: bash $

git remote add upstream https://github.com/scikit-learn/scikit-learn.git
git remote add upstream git@github.com:scikit-learn/scikit-learn.git

6. Check that the `upstream` and `origin` remote aliases are configured correctly
by running `git remote -v` which should display::

origin git@github.com:YourLogin/scikit-learn.git (fetch)
origin git@github.com:YourLogin/scikit-learn.git (push)
upstream git@github.com:scikit-learn/scikit-learn.git (fetch)
upstream git@github.com:scikit-learn/scikit-learn.git (push)

You should now have a working installation of scikit-learn, and your git
repository properly configured. The next steps now describe the process of
Expand Down