Skip to content

Delete cache for a module if errors are found #4045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 4, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 26 additions & 3 deletions mypy/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
import sys
import time
from os.path import dirname, basename
import errno

from typing import (AbstractSet, Dict, Iterable, Iterator, List, cast, Any,
NamedTuple, Optional, Set, Tuple, Union, Callable)
Expand Down Expand Up @@ -469,7 +470,7 @@ class BuildManager:
all_types: Map {Expression: Type} collected from all modules
options: Build options
missing_modules: Set of modules that could not be imported encountered so far
stale_modules: Set of modules that needed to be rechecked
stale_modules: Set of modules that needed to be rechecked (only used by tests)
version_id: The current mypy version (based on commit id when possible)
plugin: Active mypy plugin(s)
errors: Used for reporting all errors
Expand Down Expand Up @@ -1165,6 +1166,25 @@ def write_cache(id: str, path: str, tree: MypyFile,
return interface_hash


def delete_cache(id: str, path: str, manager: BuildManager) -> None:
"""Delete cache files for a module.

The cache files for a module are deleted when mypy finds errors there.
This avoids inconsistent states with cache files from different mypy runs,
see #4043 for an example.
"""
path = os.path.abspath(path)
meta_json, data_json = get_cache_names(id, path, manager)
manager.log('Deleting {} {} {} {}'.format(id, path, meta_json, data_json))

for filename in [data_json, meta_json]:
try:
os.remove(filename)
except OSError as e:
if e.errno != errno.ENOENT:
manager.log("Error deleting cache file {}: {}".format(filename, e.strerror))


"""Dependency manager.

Design
Expand Down Expand Up @@ -1534,11 +1554,12 @@ def mark_as_rechecked(self) -> None:
"""Marks this module as having been fully re-analyzed by the type-checker."""
self.manager.rechecked_modules.add(self.id)

def mark_interface_stale(self) -> None:
def mark_interface_stale(self, *, on_errors: bool = False) -> None:
"""Marks this module as having a stale public interface, and discards the cache data."""
self.meta = None
self.externally_same = False
self.manager.stale_modules.add(self.id)
if not on_errors:
self.manager.stale_modules.add(self.id)

def check_blockers(self) -> None:
"""Raise CompileError if a blocking error is detected."""
Expand Down Expand Up @@ -1813,6 +1834,8 @@ def write_cache(self) -> None:
else:
is_errors = self.manager.errors.is_errors()
if is_errors:
delete_cache(self.id, self.path, self.manager)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this also mark the interface as stale (as below on line 1845-1847) and set self.interface_hash to something unmatchable? Otherwise I worry we could still repro the same issues by adding another chain to the reference link.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I even have found a longer crasher with four files that still happens unless interface is marked as stale (I added it to the tests). Do we also need to set the interface_hash to something? It looks like it is not necessary.

It looks like marking files with errors as stale breaks several tests. I will check if this can be fixed somehow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, there is an explicit comment in tests:

Note that a file that ends up producing an error does not create
a new cache file and so is not considered stale.

Should I just mark them as stale now and update this comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the comment just reflects the state of the world at the time (i.e. that no cache file was written).

I presume this only affects some of the tests and not all tests with a [stale] marker?

Alternatively you could have a flag to mark_interface_stale() to skip adding the module to stale_modules since that is used purely by the tests. (The other two things it sets are used for the graph algorithm too.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this only affects some of the tests and not all tests with a [stale] marker?

It affects 42 tests, i.e. around one third of all tests with [stale].

Alternatively you could have a flag to mark_interface_stale() to skip adding the module to stale_modules since that is used purely by the tests. (The other two things it sets are used for the graph algorithm too.)

OK, I did this. Now all tests should pass. By the way, what do you think about potential performance issues because of this change? Intuitively it should be minor (since it only affects situations where new errors appear after previous incremental runs), but maybe you could measure it somehow?

self.mark_interface_stale(on_errors=True)
return
dep_prios = [self.priorities.get(dep, PRI_HIGH) for dep in self.dependencies]
new_interface_hash = write_cache(
Expand Down
86 changes: 84 additions & 2 deletions test-data/unit/check-incremental.test
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,8 @@
-- Any files that we expect to be rechecked should be annotated in the [rechecked]
-- annotation, and any files expect to be stale (aka have a modified interface)
-- should be annotated in the [stale] annotation. Note that a file that ends up
-- producing an error does not create a new cache file and so is not considered stale.
-- producing an error has its caches deleted and is marked stale automatically.
-- Such files don't need to be included in [stale ...] list.
--
-- The test suite will automatically assume that __main__ is stale and rechecked in
-- all cases so we can avoid constantly having to annotate it. The list of
Expand Down Expand Up @@ -200,7 +201,7 @@ def foo() -> int:
return "foo"
return inner2()

[rechecked mod2]
[rechecked mod1, mod2]
[stale]
[out2]
tmp/mod2.py:4: error: Incompatible return value type (got "str", expected "int")
Expand Down Expand Up @@ -2800,6 +2801,87 @@ b.x.y
tmp/c.py:2: error: Revealed type is '<stale cache: consider running mypy without --quick>'
tmp/c.py:5: error: "<stale cache: consider running mypy without --quick>" has no attribute "y"

[case testCacheDeletedAfterErrorsFound]
import a
[file a.py]
from b import x
[file b.py]
from c import x
[file c.py]
x = 1
[file c.py.2]
1 + 1
[file a.py.3]
from b import x
1 + 1
[out]
[out2]
tmp/b.py:1: error: Module 'c' has no attribute 'x'
tmp/a.py:1: error: Module 'b' has no attribute 'x'
[out3]
tmp/b.py:1: error: Module 'c' has no attribute 'x'
tmp/a.py:1: error: Module 'b' has no attribute 'x'

[case testCacheDeletedAfterErrorsFound2]
import a
[file a.py]
from b import x
[file b.py]
from c import C
x: C
[file c.py]
class C: pass
[file c.py.2]
def C(): pass
[file a.py.3]
from b import x
1 + 1
[out]
[out2]
tmp/b.py:2: error: Invalid type "c.C"
[out3]
tmp/b.py:2: error: Invalid type "c.C"

[case testCacheDeletedAfterErrorsFound3]
import a
[file a.py]
import b
b.f()
[file b.py]
def f() -> None: pass
[file b.py.2]
def f(x) -> None: pass
[out]
[out2]
tmp/a.py:2: error: Too few arguments for "f"
[out3]
tmp/a.py:2: error: Too few arguments for "f"

[case testCacheDeletedAfterErrorsFound4]
import a
[file a.py]
from b import x
[file b.py]
from c import x
[file c.py]
from d import x
[file d.py]
x = 1
[file d.py.2]
1 + 1
[file a.py.3]
from b import x
1 + 1
[out]
[out2]
tmp/c.py:1: error: Module 'd' has no attribute 'x'
tmp/b.py:1: error: Module 'c' has no attribute 'x'
tmp/a.py:1: error: Module 'b' has no attribute 'x'
[out3]
tmp/c.py:1: error: Module 'd' has no attribute 'x'
tmp/b.py:1: error: Module 'c' has no attribute 'x'
tmp/a.py:1: error: Module 'b' has no attribute 'x'

[case testNoCrashOnDoubleImportAliasQuick]
# cmd: mypy -m e
# cmd2: mypy -m c
Expand Down