Deprecate copy in Birch #29092

Closed
jeremiedbb opened this issue May 23, 2024 · 1 comment · Fixed by #29124

@jeremiedbb
Member

jeremiedbb commented May 23, 2024

Birch doesn't perform inplace operations (at least not on the input array), so the copy parameter is useless and should be deprecated. It's even detrimental because by default it makes a copy.
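For illustration, one common way to deprecate a no-op parameter is a sentinel default plus a `FutureWarning` raised when the caller passes the parameter explicitly. The sketch below is only an assumption about what such a deprecation could look like: `BirchLike`, its attributes, and the warning text are hypothetical, and this is not necessarily how the eventual fix is implemented.

```python
import warnings


class BirchLike:
    """Hypothetical estimator, used only to illustrate deprecating a parameter."""

    def __init__(self, threshold=0.5, copy="deprecated"):
        self.threshold = threshold
        # The sentinel default lets fit() tell "not passed" apart from True/False.
        self.copy = copy

    def fit(self, X):
        if self.copy != "deprecated":
            warnings.warn(
                "`copy` has no effect and is deprecated; "
                "it will be removed in a future release.",
                FutureWarning,
            )
        # X is neither copied nor modified in this sketch.
        self.n_samples_seen_ = len(X)
        return self


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    BirchLike().fit([[0.0], [1.0]])            # default: no warning expected
    n_default = len(caught)
    BirchLike(copy=False).fit([[0.0], [1.0]])  # explicit value: warning expected
    n_explicit = len(caught)
```

With this pattern, existing code that never passed `copy` keeps working silently, while code that sets it gets a warning before the parameter is removed.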

The only place where an inplace operation happens is in the update method of _CFSubcluster:

def update(self, subcluster):
    self.n_samples_ += subcluster.n_samples_
    self.linear_sum_ += subcluster.linear_sum_
    self.squared_sum_ += subcluster.squared_sum_
    self.centroid_ = self.linear_sum_ / self.n_samples_
    self.sq_norm_ = np.dot(self.centroid_, self.centroid_)

However, update is called in only two places. The first is in the _split_node function, where we first create 2 new _CFSubcluster objects, so update performs its inplace operations on newly created data and the input data is not modified. The second is in the insert_cf_subcluster method of _CFNode, where it is only triggered if the subcluster has a child. A child can only come from split subclusters (i.e. after _split_node), so again we're not modifying the input data.
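The aliasing argument above can be sketched with a simplified stand-in. `SubclusterSketch` is hypothetical and is not scikit-learn's `_CFSubcluster`; in particular, the assumption that a subcluster built from a sample keeps a reference to that row without copying it is made here only to show why it matters which object `update` is called on. The `update` body is the one quoted in the issue.

```python
import numpy as np


class SubclusterSketch:
    """Hypothetical, simplified stand-in for a CF subcluster."""

    def __init__(self, n_features, linear_sum=None):
        if linear_sum is None:
            # Empty subcluster: owns freshly allocated buffers.
            self.n_samples_ = 0
            self.linear_sum_ = np.zeros(n_features)
            self.squared_sum_ = 0.0
        else:
            # Assumption for this sketch: wrap the sample without copying it.
            self.n_samples_ = 1
            self.linear_sum_ = linear_sum
            self.squared_sum_ = float(np.dot(linear_sum, linear_sum))

    def update(self, subcluster):
        self.n_samples_ += subcluster.n_samples_
        self.linear_sum_ += subcluster.linear_sum_  # in-place on self's buffer
        self.squared_sum_ += subcluster.squared_sum_
        self.centroid_ = self.linear_sum_ / self.n_samples_  # new array
        self.sq_norm_ = np.dot(self.centroid_, self.centroid_)


X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_before = X.copy()

# Case 1 (split-like path): update() is called on a freshly created subcluster,
# so the in-place additions land in new memory and X stays intact.
fresh = SubclusterSketch(2)
fresh.update(SubclusterSketch(2, linear_sum=X[0]))
fresh.update(SubclusterSketch(2, linear_sum=X[1]))
input_untouched = np.array_equal(X, X_before)
shares_memory = np.shares_memory(fresh.linear_sum_, X)

# Case 2 (contrast only): calling update() on a subcluster that wraps an input
# row would write into X. Per the analysis above, Birch does not reach this
# situation with input data.
aliased = SubclusterSketch(2, linear_sum=X[0])
aliased.update(SubclusterSketch(2, linear_sum=X[1]))
input_modified = not np.array_equal(X, X_before)
```

In the first case `X` is unchanged and `fresh.linear_sum_` shares no memory with it; in the second, the first row of `X` is overwritten, which is the situation the reasoning above rules out.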

@jeremiedbb jeremiedbb changed the title from "Deprecate copy in" to "Deprecate copy in Birch" May 23, 2024
@github-actions github-actions bot added the "Needs Triage" label May 23, 2024
@jeremiedbb jeremiedbb reopened this May 24, 2024
@jeremiedbb jeremiedbb added the "API" and "help wanted" labels and removed the "Needs Triage" label May 24, 2024
@Charlie-XIAO
Contributor

/take
