[MRG + 1] Remove unused variables #12230


Merged
merged 7 commits into scikit-learn:master from mroeschke:clean_unused
Oct 22, 2018

Conversation

mroeschke
Contributor

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Removes unused variable assignments as flagged by PyCharm.

@sklearn-lgtm

This pull request introduces 2 alerts and fixes 22 when merging 7ad074d into e1c3c22 - view on LGTM.com

new alerts:

  • 2 for Unused import

fixed alerts:

  • 20 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@@ -1556,7 +1556,7 @@ def fit(self, X, y=None, sample_weight=None):
                                   init_size=init_size)

         # Compute the label assignment on the init dataset
-        batch_inertia, centers_squared_diff = _mini_batch_step(
+        _mini_batch_step(
Contributor

I mentioned this in the previous issue: is n_iter ever set to 0 for k-means, or is it strictly greater than 0? I ask because I could see how this assignment might be useful if the subsequent for loop never assigns them.

Contributor Author

It looks contingent on n_samples, self.max_iter and self.batch_size:

n_batches = int(np.ceil(float(n_samples) / self.batch_size))
n_iter = int(self.max_iter * n_batches)

So it appears very unlikely that n_iter would ever be set to 0.
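
A quick check with hypothetical numbers shows why: np.ceil of any positive ratio is at least 1, so n_iter is positive whenever max_iter >= 1 and the dataset is non-empty.

import numpy as np

# Hypothetical values; even with n_samples far smaller than batch_size,
# np.ceil rounds the ratio up to 1, so n_iter never hits 0.
n_samples, batch_size, max_iter = 5, 100, 100
n_batches = int(np.ceil(float(n_samples) / batch_size))  # ceil(0.05) -> 1
n_iter = int(max_iter * n_batches)                        # 100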

Member

Even if n_iter=0, the variables batch_inertia and centers_squared_diff don't seem to be used afterwards, so it should indeed be fine.

@@ -309,7 +309,6 @@ def ledoit_wolf(X, assume_centered=False, block_size=1000):
         X = np.reshape(X, (1, -1))
         warnings.warn("Only one sample available. "
                       "You may want to reshape your data array")
-        n_samples = 1
         n_features = X.size
     else:
         n_samples, n_features = X.shape
Member

If n_samples is not needed, then it's also not needed here.
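
A minimal sketch of that suggestion, assuming n_samples really is unused downstream (paraphrased; _shape_info is a hypothetical name, not the scikit-learn source):

import warnings
import numpy as np

def _shape_info(X):
    # Sketch of the branch with the dead n_samples = 1 assignment
    # dropped on both sides; only n_features survives.
    if X.ndim == 1:
        X = np.reshape(X, (1, -1))
        warnings.warn("Only one sample available. "
                      "You may want to reshape your data array")
        n_features = X.size
    else:
        _, n_features = X.shape
    return n_features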

@@ -435,7 +434,6 @@ def fast_mcd(X, support_fraction=None,
         # (and less optimal)
         all_best_covariances = np.zeros((n_best_tot, n_features,
                                          n_features))
-        n_best_tot = 10
Member

Doesn't this look odd? The comment says it's trying a smaller matrix, but n_best_tot is only set after the matrix allocation is retried. It seems to me that those lines got swapped somehow.

Contributor Author

Good point. I'll try swapping the lines and see if anything breaks.
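
A sketch of the swap, with hypothetical sizes (not the exact fast_mcd source): shrinking n_best_tot before retrying the allocation is what makes the fallback array actually smaller.

import numpy as np

n_features = 1000      # hypothetical
n_best_tot = 10 ** 7   # hypothetical value that is far too large to allocate
try:
    all_best_covariances = np.zeros((n_best_tot, n_features, n_features))
except MemoryError:
    # The above is too big; retry with a much smaller (and less
    # optimal) number of stored candidates.
    n_best_tot = 10
    all_best_covariances = np.zeros((n_best_tot, n_features, n_features))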

@@ -523,7 +523,7 @@ def check_estimator_sparse_data(name, estimator_orig):
               "sparse input is not supported if this is not"
               " the case." % name)
         raise
-    except Exception as e:
+    except Exception:
Member

Shouldn't it print the exception instead of ignoring it? It kinda feels odd to catch Exception and suppress it completely.

Contributor Author

There's a raise statement after the print statement below, so the exception will still be visible. It's still strange semantics, though, that the raised exception doesn't incorporate the printed message.

Member

There's a raise, sure. But the message says that the estimator is not failing properly on sparse data, i.e. it's very specific, whereas the handler catches Exception. So if the estimator fails because there's not enough memory, or because something is wrong with the disk while the estimator reads from it during fitting, we still show the same error message. It's just a bad idea to catch Exception and pretend it's much more specific than Exception.

Contributor Author

Agreed that it's bad that except Exception can catch anything (other than a TypeError or ValueError handled above) while a very specific print message is shown (it should probably be a warning).

I am not too familiar with this part of the codebase, but the scope of this PR (just removing unused variables) does not change the prior behavior. This block can be improved in a future PR.

In [5]: try:
   ...:     1 / 0
   ...: except Exception as e:
   ...:     print('Something went wrong')
   ...:     raise
   ...:
   ...:
Something went wrong
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-5-b91504a88660> in <module>()
      1 try:
----> 2     1 / 0
      3 except Exception as e:
      4     print('Something went wrong')
      5     raise

ZeroDivisionError: division by zero

In [6]: try:
   ...:     1 / 0
   ...: except Exception:
   ...:     print('Something went wrong')
   ...:     raise
   ...:
   ...:
Something went wrong
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-6-ee8d63f9937e> in <module>()
      1 try:
----> 2     1 / 0
      3 except Exception:
      4     print('Something went wrong')
      5     raise

ZeroDivisionError: division by zero
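
As an aside, one general Python alternative (a sketch, not what this PR changes) is to attach the message to a new exception via chaining, so the raised error itself carries the explanation:

try:
    1 / 0
except Exception as exc:
    # Re-raise with the message on the new exception; the original
    # error is preserved as __cause__ and shown in the traceback.
    raise AssertionError('Something went wrong') from exc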

Member

@mroeschke , sorry, hadn't noticed the raise after the print. It's all good. Thanks!

@@ -334,7 +334,7 @@ def fit(self, X, y=None):
         self : object
             Returns the transformer.
         """
-        X = check_array(X, accept_sparse='csr')
+        check_array(X, accept_sparse='csr')
Member

I was like "this definitely doesn't look right", but surprisingly it actually is right. With estimator tags we could remove this line entirely.

Contributor Author

I'm not familiar with estimator tags; are the tags already in place so this line can be removed, or should it be kept for now to still validate X?

Member

It should be kept for now; sorry for being cryptic.
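
For context on why the bare call still matters: check_array raises on invalid input even when its return value is discarded, so the validation side effect is kept. A minimal illustration with hypothetical data:

import numpy as np
from sklearn.utils import check_array

X = np.array([[1.0, np.nan]])
# Raises ValueError because of the NaN, even though the converted
# array returned by check_array is thrown away.
check_array(X, accept_sparse='csr')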

@sklearn-lgtm

This pull request introduces 2 alerts and fixes 22 when merging 60f5be5 into 59b15c5 - view on LGTM.com

new alerts:

  • 2 for Unused import

fixed alerts:

  • 20 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@sklearn-lgtm

This pull request introduces 2 alerts and fixes 22 when merging 9352f3f into bfab306 - view on LGTM.com

new alerts:

  • 2 for Unused import

fixed alerts:

  • 20 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@mroeschke
Contributor Author

Does the codecov check need to pass even though this is just cleaning up some variables?

@amueller
Member

amueller commented Oct 4, 2018

probably a false positive, I wouldn't worry about it.

@sklearn-lgtm

This pull request introduces 2 alerts and fixes 20 when merging 067247b into d4c9e84 - view on LGTM.com

new alerts:

  • 2 for Unused import

fixed alerts:

  • 18 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@jnothman
Member

Please remove the now-unused imports

@sklearn-lgtm

This pull request fixes 20 alerts when merging ab1dc98 into 5bcd84b - view on LGTM.com

fixed alerts:

  • 18 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@mroeschke
Contributor Author

Removed those unused imports, @jnothman. Thanks!

@@ -281,7 +281,6 @@ def lsqr(A, b, damp=0.0, atol=1e-8, btol=1e-8, conlim=1e8,

     itn = 0
     istop = 0
-    nstop = 0
Member

This is a backport from scipy; it's better to leave it unchanged, which makes comparing with the upstream version easier.

@mroeschke mroeschke changed the title [MRG] Remove unused variables [MRG + 1] Remove unused variables Oct 21, 2018
@sklearn-lgtm

This pull request fixes 19 alerts when merging d9de4c0 into 5bcd84b - view on LGTM.com

fixed alerts:

  • 17 for Unused local variable
  • 2 for Variable defined multiple times

Comment posted by LGTM.com

@jnothman
Member

Looks good, thanks @mroeschke!

@jnothman jnothman merged commit 3a3a78a into scikit-learn:master Oct 22, 2018
@mroeschke mroeschke deleted the clean_unused branch October 22, 2018 17:05
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

Successfully merging this pull request may close these issues.

Unused Variable Assignment in k-means
8 participants