DOC Adds HDBSCAN.dbscan_clustering section to plot_hdbscan.py #24879

Closed
wants to merge 277 commits into from

Conversation

Micky774
Contributor

Reference Issues/PRs

Towards #24686

What does this implement/fix? Explain your changes.

Adds HDBSCAN.dbscan_clustering section to plot_hdbscan.py. Fixes a small typo.

Any other comments?

rusdes and others added 30 commits October 14, 2022 16:05
…mple (scikit-learn#24374)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
…#24689)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
* DOC Improve docstring around set_output

* DOC Improve docs around set_output

* DOC Address comments

* DOC Better grammar

* DOC Improve wording

* DOC Improves docstring in set_config
…blic (scikit-learn#24688)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…-learn#24682)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
* FEA add NewtonSolver, CholeskyNewtonSolver and QRCholeskyNewtonSolver

* ENH better singular hessian special solve

* CLN fix some typos found by reviewer

* TST assert ConvergenceWarning is raised

* MNT add BaseCholeskyNewtonSolver

* WIP colinear design in GLMs

* FIX _solve_singular

* FIX false unpacking in

* TST add tests for unpenalized GLMs

* TST fix solutions of glm_dataset

* ENH add SVDFallbackSolver

* CLN remove SVDFallbackSolver

* ENH use gradient step for singular hessians

* ENH print iteration number in warnings

* TST improve test_linalg_warning_with_newton_solver

* CLN LinAlgWarning from scipy.linalg

* ENH more robust hessian

* ENH increase maxls for lbfgs to make it more robust

* ENH add hessian_warning for too many negative hessian values

* CLN some warning messages

* ENH add lbfgs_step

* ENH use lbfgs_step for hessian_warning

* TST make them pass

* TST tweak rtol for lbfgs

* TST add rigorous test for GLMs

* TST improve test_warm_start

* ENH improve lbfgs options for better convergence

* CLN fix test_warm_start

* TST fix assert singular values in datasets

* CLN address most review comments

* ENH enable more verbosity levels for lbfgs

* DOC add whatsnew

* CLN remove xfail and clean a bit

* CLN docstring about minimum norm

* More informative repr for the glm_dataset fixture cases

* Forgot to run black

* CLN remove unnecessary filterwarnings

* CLN address review comments

* Trigger [all random seeds] on the following tests:
test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* CLN add comment for lbfgs ftol=64 * machine precision

* CLN XXX code comment

* Trigger [all random seeds] on the following tests:

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* CLN link issue and remove code snippet in comment

* Trigger [all random seeds] on the following tests:

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* CLN add catch_warnings

* Trigger [all random seeds] on the following tests:

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* Trigger [all random seeds] on the following tests:

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* [all random seeds]

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* Trigger with -Werror [all random seeds]

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* ENH increase maxls to 50

* [all random seeds]

test_glm_regression
test_glm_regression_hstacked_X
test_glm_regression_vstacked_X
test_glm_regression_unpenalized
test_glm_regression_unpenalized_hstacked_X
test_glm_regression_unpenalized_vstacked_X
test_warm_start

* Revert "Trigger with -Werror [all random seeds]"

This reverts commit 99f4cf9.

* TST add catch_warnings to filterwarnings

* TST adapt tests for newton solvers

* CLN cleaner gradient step with gradient_times_newton

* DOC add whatsnew

* ENH always use lbfgs as fallback

* TST adapt rtol

* TST fix test_linalg_warning_with_newton_solver

* CLN address some review comments

* Improve tests related to convergence warning on collinear data

* overfit -> fit

* Typo in comment

* Apply suggestions from code review

* ENH fallback_lbfgs_solve
- Do not use lbfgs steps, fall back completely to lbfgs

* ENH adapt rtol

* Improve test_linalg_warning_with_newton_solver

* Better comments

* Fixed Hessian casing and improved warning messages

* [all random seeds]

test_linalg_warning_with_newton_solver

* Ignore ConvergenceWarnings for now if convergence is good

* CLN remove counting of warnings

* ENH fall back to lbfgs if line search did not converge

* DOC better comment on performance bottleneck

* Update GLM related examples to use the new solver

* CLN address reviewer comments

* EXA improve some wordings

* CLN do not pop "solver" in parameter constraints

* CLN fix typos

* DOC fix docstring

* CLN remove solver newton-qr-cholesky

* DOC update PR number in whatsnew

* CLN address review comments

* CLN remove unnecessary catch_warnings

* CLN address some review comments

* DOC more precise whatsnew

* CLN use init_zero_coef

* CLN use and test init_zero_coef

* CLN address some review comments

* CLN mark NewtonSolver as private by leading underscore

* CLN exact comments for inner_solve

* TST add test_newton_solver_verbosity

* TST extend test_newton_solver_verbosity

* TST logic in test_glm_regression_unpenalized

* TST use count_nonzero

* CLN remove super rare line search checks

* MNT move Newton solver to new file _newton_solver.py

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…in Windows (scikit-learn#24742)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
)


Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@thomasjpfan
Member

I think we need to sync the hdbscan branch with main. The CI in this PR is failing and the fix is in main (#25017). Syncing with main will give us the opportunity to move the setup.py configuration to the root directory:

extension_config = {

@Micky774 Can you open a PR to sync up the hdbscan branch with main?

ArturoAmorQ and others added 22 commits December 5, 2022 11:39
Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…ions (scikit-learn#25114)

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
closes scikit-learn#25113
…_score (scikit-learn#24683)

Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…n#25093)

Co-authored-by: Meekail Zain <34613774+Micky774@users.noreply.github.com>
…n#25083)

* Doc changed n_init to n_jobs in mean_shift.py
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Member

@thomasjpfan thomasjpfan left a comment

I think something went wrong here with the sync between hdbscan and upstream/main.

@Micky774
Contributor Author

Micky774 commented Feb 4, 2023

I think something went wrong here with the sync between hdbscan and upstream/main.

Opened #25538 to fix!

@Micky774 Micky774 deleted the hdbscan_dbscan_plotting branch February 4, 2023 00:40