Description
Currently we have an interface for OPTICS with a custom method, `extract_dbscan`. This is good for usability and visibility of the functionality, but it means that a generic parameter search tool (like `GridSearchCV`) can't use OPTICS to perform DBSCAN at various `eps`.
Supporting that would involve adding an `eps` parameter which, when None, would use the default OPTICS clustering and, when not None, would use `extract_dbscan`. But we would also need to retain the model across multiple fits...
Here are two alternative interfaces:
- Add a `warm_start` parameter (like many classifiers and regressors, but uncharted territory for clusterers). When True, and `fit` or `fit_predict` is called, the current `reachability_`, `ordering_` and `core_distances_` would be kept, but a different final clustering step would be used to output / store `labels_`.
- Add a `memory` parameter, like in hierarchical clustering. This would cache the mapping from parameters to `reachability_`, `ordering_` and `core_distances_` using a `joblib.Memory`.
I think the first option sounds more appropriate.
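
To make the `warm_start` idea concrete, here is a rough sketch using a toy 1-D clusterer rather than the real OPTICS code. `ToyOrderingClusterer` and its `n_expensive_runs_` counter are hypothetical names made up for illustration; only `eps`, `warm_start`, `ordering_`, `reachability_` and `labels_` come from the proposal above. The "expensive" step just sorts the points and records the gap to the previous point, and the final step cuts that ordering wherever the gap exceeds `eps`.

```python
class ToyOrderingClusterer:
    """Hypothetical stand-in for OPTICS; this is not scikit-learn code."""

    def __init__(self, eps=1.0, warm_start=False):
        self.eps = eps
        self.warm_start = warm_start
        self.n_expensive_runs_ = 0  # illustration only: counts full recomputations

    def _compute_ordering(self, X):
        # Stand-in for the expensive pass: order the 1-D points and store
        # each point's gap to its predecessor in that ordering.
        self.ordering_ = sorted(range(len(X)), key=lambda i: X[i])
        self.reachability_ = [float("inf")] + [
            X[b] - X[a] for a, b in zip(self.ordering_, self.ordering_[1:])
        ]
        self.n_expensive_runs_ += 1

    def fit(self, X):
        # With warm_start=True, reuse the stored ordering if there is one.
        # Assumption: the caller passes the same X on every warm-started fit.
        if not (self.warm_start and hasattr(self, "ordering_")):
            self._compute_ordering(X)
        # Cheap final step: start a new cluster whenever the gap exceeds eps.
        labels = [0] * len(X)
        current = -1
        for idx, gap in zip(self.ordering_, self.reachability_):
            if gap > self.eps:
                current += 1
            labels[idx] = current
        self.labels_ = labels
        return self


X = [0.0, 0.2, 0.4, 5.0, 5.1, 9.0]
model = ToyOrderingClusterer(eps=1.0, warm_start=True).fit(X)
labels_eps_1 = list(model.labels_)

# Change only eps and refit: the ordering is reused, only labels_ changes.
model.eps = 5.0
model.fit(X)
labels_eps_5 = list(model.labels_)
```

A parameter search that varies only `eps` on a single estimator instance would then pay for the ordering once. One open question this sketch glosses over is what should happen when `fit` is called with different data while `warm_start` is True.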