API for getting DBSCAN-like clusterings out of OPTICS with fit_predict #12044

Closed
@jnothman

Currently OPTICS exposes DBSCAN-like clustering through a custom method, extract_dbscan. This is good for usability and visibility of the functionality, but it means that a generic parameter search tool (like GridSearchCV), which only calls fit / fit_predict, can't use OPTICS to perform DBSCAN at various eps.

Supporting this would involve adding an eps parameter which, when None, would use the default OPTICS clustering and, when not None, would use extract_dbscan. But we would also need to retain the fitted model across multiple fits, so that changing eps does not recompute the ordering...
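To make the cheap step concrete, here is a rough sketch of how DBSCAN-like labels can be read off an already computed ordering for a given eps. The helper name `labels_from_reachability` and the toy values are hypothetical and hand-picked for illustration; this mirrors the general idea behind extract_dbscan rather than reproducing scikit-learn's code.

```python
import math

NOISE = -1


def labels_from_reachability(reachability, core_distances, ordering, eps):
    """Hypothetical helper: DBSCAN-like labels from OPTICS-style outputs.

    `reachability` and `core_distances` are indexed by sample; `ordering`
    lists sample indices in the order they were visited.
    """
    labels = [NOISE] * len(ordering)
    current = -1
    for idx in ordering:
        far_reach = reachability[idx] > eps
        near_core = core_distances[idx] <= eps
        if far_reach and near_core:
            current += 1           # not reachable, but core: new cluster
            labels[idx] = current
        elif far_reach:
            labels[idx] = NOISE    # neither reachable nor core at this eps
        else:
            labels[idx] = current  # reachable from the current cluster
    return labels


# Made-up values for six samples (not produced by a real OPTICS run).
ordering = [0, 1, 2, 3, 4, 5]
reachability = [math.inf, 1, 1, 8, 1, 19]
core_distances = [1, 1, 1, 1, 1, 19]

for eps in (2, 9, 20):
    print(eps, labels_from_reachability(reachability, core_distances, ordering, eps))
```

The point is that only `eps` changes between the three calls; the three input sequences are computed once and reused, which is what a parameter search would want.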

Here are two alternative interfaces:

  • Add a warm_start parameter (as in many classifiers and regressors, but uncharted territory for clusterers). When it is True and fit or fit_predict is called, the current reachability_, ordering_ and core_distances_ would be kept, but a different final clustering step would be used to output / store labels_.
  • Add a memory parameter, like in hierarchical clustering. This would cache the mapping from parameters to reachability_, ordering_ and core_distances_ using a joblib.Memory.
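A minimal sketch of how the warm_start option might behave, assuming a toy 1-D clusterer in place of OPTICS. The class `WarmStartClusterer`, its `expensive_calls` counter and the gap-based "reachability" are all hypothetical stand-ins, not scikit-learn API; only the reuse-on-refit pattern is the point.

```python
import math


class WarmStartClusterer:
    """Toy 1-D clusterer (hypothetical; not scikit-learn's OPTICS)."""

    def __init__(self, eps, warm_start=False):
        self.eps = eps
        self.warm_start = warm_start
        self.expensive_calls = 0  # demo-only counter

    def _compute_ordering(self, X):
        # Stand-in for the costly OPTICS pass: sort the points and use the
        # gap to the previous point as a crude "reachability".
        self.expensive_calls += 1
        self.ordering_ = sorted(range(len(X)), key=lambda i: X[i])
        self.reachability_ = [math.inf] * len(X)
        for prev, cur in zip(self.ordering_, self.ordering_[1:]):
            self.reachability_[cur] = X[cur] - X[prev]

    def fit(self, X):
        # With warm_start, reuse the stored ordering if one exists.
        # Assumes X is unchanged between calls; nothing checks this.
        if not (self.warm_start and hasattr(self, "ordering_")):
            self._compute_ordering(X)
        # Cheap final step: start a new cluster at every gap larger than eps.
        self.labels_ = [0] * len(X)
        current = -1
        for idx in self.ordering_:
            if self.reachability_[idx] > self.eps:
                current += 1
            self.labels_[idx] = current
        return self

    def fit_predict(self, X):
        return self.fit(X).labels_


X = [0.0, 1.0, 2.0, 10.0, 11.0, 30.0]
est = WarmStartClusterer(eps=2, warm_start=True)
labels_small_eps = est.fit_predict(X)
est.eps = 9  # change only the final clustering step
labels_large_eps = est.fit_predict(X)
print(labels_small_eps, labels_large_eps, est.expensive_calls)
```

The second fit_predict reuses the stored ordering, so the expensive step runs once. The sketch also shows the awkward part of this option: the estimator has to trust that X and the ordering-related parameters are the same as in the previous fit. The memory option would instead key a cache on the parameters, which sidesteps that trust at the cost of a joblib.Memory dependency in the estimator.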

I think the first option sounds more appropriate.

Labels: Enhancement, Moderate (anything that requires some knowledge of conventions and best practices), help wanted