[WIP] Support non-type dependencies #9374

Avasam · 2022-12-17T01:13:56Z

Not complete, I'm using this PR to share code-change and look for feedback. I know I still have commented code here and there. I'll also split it up in smaller PRs once ready, for now having the stub changes are useful.
Basically let's get it working, then I'll get it pretty and ask for a proper review.

I'll need help with pytype, the error seems nonsensical to me as the dependencies are installed, and there's no error-code to ignore.
I'm having a similar issue with mypy where it cannot find the installed dependencies. AFAIK it's not run in a virtual environment like stubtest (I have not yet confirmed if that's only a local issue or also happens on the CI)
Speaking of stubtest, it and mypy may require adding ignore_missing_imports=true to a temp config file to ignore module is installed, but missing library stubs or py.typed marker on dependencies listed in external_requires
General feedback on config name and comments

Actual feature implementation PRs tracked here: #5768 (comment)

AlexWaygood · 2022-12-17T01:22:16Z

stubs/pycocotools/pycocotools/mask.pyi


 def iou(
    dt: _NDArrayUInt32 | list[float] | list[_EncodedRLE],
    gt: _NDArrayUInt32 | list[float] | list[_EncodedRLE],
    pyiscrowd: list[int] | _NDArrayUInt8,
 ) -> list[Any] | _NDArrayFloat64: ...
 def merge(rleObjs: list[_EncodedRLE], intersect: int = ...) -> _EncodedRLE: ...
-
-# ignore an "overlapping overloads" error due to _NDArrayInt32 being an alias for `Incomplete` for now
 @overload
 def frPyObjects(pyobj: _NDArrayUInt32 | list[list[int]] | list[_EncodedRLE], h: int, w: int) -> list[_EncodedRLE]: ...  # type: ignore[misc]


We need to actually remove the outdated type: ignore if we're deleting the comment about why the type ignore used to be needed 😄

Suggested change

def frPyObjects(pyobj: _NDArrayUInt32 | list[list[int]] | list[_EncodedRLE], h: int, w: int) -> list[_EncodedRLE]: ... # type: ignore[misc]

def frPyObjects(pyobj: _NDArrayUInt32 | list[list[int]] | list[_EncodedRLE], h: int, w: int) -> list[_EncodedRLE]: ...

It becomes a different overload error, stubs/pycocotools/pycocotools/mask.pyi:21: error: Overloaded function signatures 1 and 2 overlap with incompatible return types [misc]
See: https://github.com/Avasam/typeshed/actions/runs/3717387421/jobs/6304685925#step:6:114

Ah, in which case maybe we should keep the comment, but change it to "mypy complains the overloads overlap, but it still works" or something

Sidenote, since pyright 1.1.283 it's possible to split type: ignore from pyright: ignore (and have them both work on the same line). So we could watch for obsolete type: ignore comments. I have an upcoming PR for that.

(Though remember that mypy's --warn-unused-ignores option doesn't work properly inside typeshed because of the unique special casing mypy has for "typeshed directories" ☹️)

AlexWaygood · 2022-12-17T01:29:39Z

stubs/pyinstaller/METADATA.toml

@@ -1,5 +1,6 @@
 version = "5.7.*"
 requires = ["types-setuptools"]
+external_requires = ["modulegraph"]


For now, for security reasons, any external dependencies will fail the stub uploader tests unless they're explicitly included in this whitelist here: https://github.com/typeshed-internal/stub_uploader/blob/bf783eb638a8aed4e1d0ba1143e3f03f399794a0/stub_uploader/metadata.py#L166

It's not ideal and we can think about ways to improve this after we've got the problems on the typeshed side up and running (plus in the medium term, we can just make that list longer) — but for now, we can't add modulegraph as a dependency, as it's not on that list

Good to keep in mind! Eventually this PR will split-off stub changes (right now they're only included to test). So it shouldn't be an immediate issue.

Having said that, it's interesting that the stub-uploader tests typeshed runs in CI still seem to be passing. Not sure whether that means I'm mistaken or whether it means the tests are buggy

stub_uploader only looks at requires field. I think people disliked the suggestion to split the field up, so I kept it the same (there were also some subtleties around what and when to trust, see #8312 (comment) )

@hauntsaninja Ah yes, that's something I meant to ask as well and slipped my mind: should it be the same or different field. And if the same, what's the logic for what needs to be installed? Probably the same as stubtest (a link to relevant code there would be appreciated). I already abstracted the logic to get non-type dependencies anyway.

AlexWaygood · 2022-12-17T01:33:37Z

Thanks so much for taking this on! I'll take a proper look tomorrow.

AlexWaygood · 2022-12-17T16:16:58Z

tests/mypy_test.py

+        # Ignoring "module is installed, but missing library stubs or py.typed marker"
+        # because external dependencies may not be marked as typed.
+        # pyright already flags non-typed external dependencies as warnings.
+        # TODO: This mutes ALL import errors, can we narrow it down?
+        "--ignore-missing-imports",


I'm not sure there's much use for a stubs package having a dependency on a non-types library that doesn't have inline types. Stubs packages should keep dependencies to an absolute minimum, so that they pose minimal security risks and are easy to install/bootstrap. Since the most important purpose of a stubs package is to help with type checking, I'd rather we didn't allow adding dependencies that don't have inline types.

I also think adding this flag may be obscuring the fact that mypy_test.py is probably ignoring all packages installed in site-packages due to the --no-site-packages flag on line 251!

Suggested change

# Ignoring "module is installed, but missing library stubs or py.typed marker"

# because external dependencies may not be marked as typed.

# pyright already flags non-typed external dependencies as warnings.

# TODO: This mutes ALL import errors, can we narrow it down?

"--ignore-missing-imports",

The reason I was including non- py.typed packages, is because, if I'm not mistaken, pyright will still get the proper type (let's say a class). Although mypy does import everything as Any in those cases. I only found two cases of this:

comtypes : used by D3DShot, so arguably already on its way out. If another stub ever needs it, maybe consider also typing comtypes

modulegraph: used by PyInstaller. However PyInstaller currently vendors modulegraph, so we can just stub the necessary vendored files (which is already what I did).

What I was thinking on doing, is to add

[mypy-non_type_dependency] ignore-missing-imports=true

to a config file (like mypy_test does) so that only those specific non-type imports are muted.

That being said... this counter-argument is pretty weak and a niche/edge case probably not worth supporting and adding complexity. So I'm with you on this one.

That's a good point, actually, pyright has the useLibraryCodeForTypes option. I don't think mypy has anything similar, though, so I still think we should be pretty wary about adding "non-py.typed non-types dependencies". If we do decide to allow them (which I think would probably require a bit more discussion first, but I'm open to it if it'll help pyright users 🙂), I think it might actually still be better to have to use a type: ignore[import] each time the library is imported. That way, there's a very visual reminder that this library is "a bit different" and should be used with caution inside typeshed.

AlexWaygood · 2022-12-17T16:20:19Z

Problem

So, in an ideal world, each stubs package should be tested in a reasonably "isolated" environment. The approach you have currently could lead to false negatives -- for example, numpy is installed globally in CI, so a stubs package could have import numpy in a file without listing numpy as a dependency, and it wouldn't be flagged by CI. However, if a user then installed the stubs package without installing numpy, all the numpy imports inside the stubs package might silently be turned into Any by the type checker. That would be very bad.

Currently we solve this in mypy_test.py by using the --no-site-packages flag, which forces mypy to ignore all inline types in all packages installed into the site-packages directory. But if we're allowing non-types dependencies, it'll be crucial for mypy to be able to look into site-packages to find the type hints for those non-types dependencies. So we need to stop using that flag, and think of another solution.

Potential solution (1)

The "principled" way of fixing this issue would be to use an approach similar to stubtest_third_party.py in mypy_test.py: dynamically create a virtual environment for each stubs package, and install the necessary requirements into each virtual environment before running the tests for that stubs package. The disadvantage of that approach is that it might be pretty slow. mypy_test.py generally takes <2 minutes to test all stdlib and third-party stubs. Running stubtest on all our third-party stubs generally takes around 8 minutes, and we only get it down to that time by sharding the job between four workers.

Potential solution (1b)

If we have to, we could perhaps take a similar approach to the one we use in stubtest_third_party.yml, where we only run mypy_test.py on a stubs package in CI if a PR changes something in that stubs package. I really don't like that idea, however, since we often have changes to stdlib stubs that break type-checking in our third-party stubs. If we only run mypy_test.py on a stubs package when we change something in that stubs package, it'll be hard to pick up on those situations in CI.

Potential solution (2)

We should maybe try doing the principled thing here before assuming that it will be too slow. But if it is too slow, there's another approach we could try. We could install all requirements for all stubs globally, like you have now, but also run a separate script that detects if a stubs package tries to import something that isn't declared as a dependency in the package's METADATA.toml file, and emits an error if that's detected.

Potential solution (3)

We could continue using --no-site-packages for stubs that have no non-types dependencies, but start doing the "dynamically create a venv" thing for the packages that need external dependencies. This would be annoying in some ways, because it would mean that some of our stubs are tested very differently to our other stubs. But it might be an okay compromise that gets us both something that has acceptable performance and solves the isolation problem.

Note that I'm using mypy_test.py as an example here, but the points apply equally (I think) to regr_test.py.

AlexWaygood · 2022-12-17T16:24:10Z

This is the code stubtest_third_party.py uses to dynamically create a virtual environment, and then pip install the necessary dependencies into it:

typeshed/tests/stubtest_third_party.py

Lines 44 to 84 in 62accb3

    
           with tempfile.TemporaryDirectory() as tmp: 
        
               venv_dir = Path(tmp) 
        
               venv.create(venv_dir, with_pip=True, clear=True) 
        
               if sys.platform == "win32": 
        
                   pip = venv_dir / "Scripts" / "pip.exe" 
        
                   python = venv_dir / "Scripts" / "python.exe" 
        
               else: 
        
                   pip = venv_dir / "bin" / "pip" 
        
                   python = venv_dir / "bin" / "python" 
        
               pip_exe, python_exe = str(pip), str(python) 
        
               dist_version = metadata["version"] 
        
               extras = stubtest_meta.get("extras", []) 
        
               assert isinstance(dist_version, str) 
        
               assert isinstance(extras, list) 
        
               dist_extras = ", ".join(extras) 
        
               dist_req = f"{dist.name}[{dist_extras}]=={dist_version}" 
        
               # If @tests/requirements-stubtest.txt exists, run "pip install" on it. 
        
               req_path = dist / "@tests" / "requirements-stubtest.txt" 
        
               if req_path.exists(): 
        
                   pip_cmd = [pip_exe, "install", "-r", str(req_path)] 
        
                   try: 
        
                       subprocess.run(pip_cmd, check=True, capture_output=True) 
        
                   except subprocess.CalledProcessError as e: 
        
                       print_command_failure("Failed to install requirements", e) 
        
                       return False 
        
               # We need stubtest to be able to import the package, so install mypy into the venv 
        
               # Hopefully mypy continues to not need too many dependencies 
        
               # TODO: Maybe find a way to cache these in CI 
        
               dists_to_install = [dist_req, get_mypy_req()] 
        
               dists_to_install.extend(metadata.get("requires", [])) 
        
               pip_cmd = [pip_exe, "install"] + dists_to_install 
        
               try: 
        
                   subprocess.run(pip_cmd, check=True, capture_output=True) 
        
               except subprocess.CalledProcessError as e: 
        
                   print_command_failure("Failed to install", e) 
        
                   return False

AlexWaygood · 2022-12-18T20:41:58Z

I've just opened #9382 for tests/regr_test.py, and I think we'll need a similar approach for tests/mypy_test.py (the changes needed for tests/mypy_test.py will be much easier if #9382 is merged, due to the refactoring done there). However, I think the approach in this PR will work really well for pyright, and maybe also for pytype, so definitely don't close it!

(Theoretically, the same concerns about false negatives apply to pyright/pytype as they do to mypy_test.py and regr_test.py. But as long as we have one of the three type checkers testing each stubs package in isolation, I think we can probably risk pyright (and maybe pytype) doing something slightly less principled. I really like how this approach means we can continue using pyright-action to test pyright in CI, which keeps the pyright check really fast.

Avasam · 2022-12-20T18:42:49Z

I've just opened #9382 for tests/regr_test.py, and I think we'll need a similar approach for tests/mypy_test.py (the changes needed for tests/mypy_test.py will be much easier if #9382 is merged, due to the refactoring done there). However, I think the approach in this PR will work really well for pyright, and maybe also for pytype, so definitely don't close it!

(Theoretically, the same concerns about false negatives apply to pyright/pytype as they do to mypy_test.py and regr_test.py. But as long as we have one of the three type checkers testing each stubs package in isolation, I think we can probably risk pyright (and maybe pytype) doing something slightly less principled. I really like how this approach means we can continue using pyright-action to test pyright in CI, which keeps the pyright check really fast.

Totally on the same page as you here. Keeping pyright and pytype relatively simple. The requirements from the mypy test also causes the necessary isolated tests. I'll wait for #9382 (which looks much better than what I would've come up with) and re-use the logic to get external deps. And clean-up.

Side question. I know stub uploader uses an allow-list and checks dependencies so that a malicious dependency doesn't slip as included with the stub. But do the stubs actually need to forcibly install non-type dependencies in the first place?

Let's say you wanna use some untyped image manipulation library (pyscreeze for example). So you install types-pyscreeze. Which references (let's pretend opencv2 is py.typed for a moment, this is a hypothetical) opencv2 and pillow (works with either). But you only use the pillow methods, then you're still installing the entirety of cv2 and its dependencies.

Plus if you didn't have opencv2 installed (resulting in Unknowns/Anys), you couldn't have used the cv2-based pyscreeze methods anyway.

I could see an issue with complex union types that can't be represented with overloads. Where the Unknown/Any would propagate.

In other words, I'm saying there might be a difference between "non-type optional dependency referenced just for type if needed, in typeshed" and "non-type dependency that has to be installed, and validated, to avoid bad Unknown/Any propagation".

(hopefully that made sense, I'm not sure how to best put this thought to paper)

AlexWaygood · 2022-12-20T19:09:04Z

Let's say you wanna use some untyped image manipulation library (pyscreeze for example). So you install types-pyscreeze. Which references (let's pretend opencv2 is py.typed for a moment, this is a hypothetical) opencv2 and pillow (works with either). But you only use the pillow methods, then you're still installing the entirety of cv2 and its dependencies.

Hmm, that's quite an interesting scenario, and one I'm not sure we've fully considered yet!

I think we should definitely think hard and consider it carefully before adding non-types dependencies to stubs packages — and that's especially true if it's a huge package like opencv.

If there are occasions when we really don't want to pull in a large dependency into a stubs package, protocols may sometimes provide another solution. Instead of saying "this function requires a numpy class X", we could instead say "this function requires an object that conforms to this private-to-the-stub protocol that describes the shape of the numpy class X". It might not be pretty... but when has Python- typing ever been pretty? 😆

…l-non-type-dependencies

AlexWaygood · 2022-12-31T17:25:48Z

FYI, I'll happily merge this now if you revise the PR so that it's just the changes to how we run pyright in CI :)

(I'd prefer to leave pytype to a later PR, so that we can ask the pytype maintainers if they can think of a better way of running pytype in CI before we make any changes there.)

Avasam · 2022-12-31T17:52:51Z

FYI, I'll happily merge this now if you revise the PR so that it's just the changes to how we run pyright in CI :)

(I'd prefer to leave pytype to a later PR, so that we can ask the pytype maintainers if they can think of a better way of running pytype in CI before we make any changes there.)

I can do that in a different branch/PR. There's plenty of things here I'd like to keep for testing/PoC

github-actions · 2022-12-31T18:13:41Z

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

Avasam · 2023-01-02T20:17:44Z

.github/workflows/daily.yml

          fi

+          $PYTHON_EXECUTABLE tests/stubtest_third_party.py --specified-stubs-only --num-shards 4 --shard-index ${{ matrix.shard-index }}


@AlexWaygood is this code de-duplication worth making a PR for? (adds a line, but makes the command to run have a single source of truth)

Sure, seems like a nice improvement :)

I'd vote for this control flow, though:

if [ "${{ runner.os }}" = "Linux" ]; then PYTHON_EXECUTABLE="xvfb-run python" else PYTHON_EXECUTABLE="python" fi

rather than

PYTHON_EXECUTABLE="python" if [ "${{ runner.os }}" = "Linux" ]; then PYTHON_EXECUTABLE="xvfb-run python" fi

I find the former much more readable.

Avasam · 2023-01-04T00:24:11Z

I think most of what I wanted out of this PR have been split-off and done. So I'll close this now (further tests willl probably be done in my fork as to not ping or use the CI uselessly)

Avasam added 2 commits December 16, 2022 20:09

Support non-type dependencies

c70c02a

typo

86d65e7

AlexWaygood reviewed Dec 17, 2022

View reviewed changes

Revert --allow-subclassing-any (will tackle later)

8d7e141

AlexWaygood reviewed Dec 17, 2022

View reviewed changes

This comment has been minimized.

Sign in to view

AlexWaygood reviewed Dec 17, 2022

View reviewed changes

AlexWaygood mentioned this pull request Dec 18, 2022

Allow non-types dependencies #5768

Closed

Merge branch 'main' of https://github.com/python/typeshed into instal…

9875e76

…l-non-type-dependencies

This comment has been minimized.

Sign in to view

Merge external_requires in requires

6e05706

Revert to using vendored modulegraph in PyInstaller

f78f335

Avasam changed the title ~~[WIP] (looking for feedback) Support non-type dependencies~~ [WIP] Support non-type dependencies Dec 31, 2022

Avasam mentioned this pull request Dec 31, 2022

Support external-dependencies in pyright #9434

Merged

Avasam commented Jan 2, 2023

View reviewed changes

Avasam mentioned this pull request Jan 3, 2023

Support third-party stub external dependencies in pytype #9449

Merged

Avasam closed this Jan 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Support non-type dependencies #9374

[WIP] Support non-type dependencies #9374

Avasam commented Dec 17, 2022 •

edited

Loading

AlexWaygood Dec 17, 2022

Avasam Dec 17, 2022 •

edited

Loading

AlexWaygood Dec 17, 2022

Avasam Dec 17, 2022

AlexWaygood Dec 17, 2022 •

edited

Loading

AlexWaygood Dec 17, 2022

Avasam Dec 17, 2022

AlexWaygood Dec 17, 2022

hauntsaninja Dec 17, 2022 •

edited

Loading

Avasam Dec 17, 2022

AlexWaygood commented Dec 17, 2022

This comment has been minimized.

AlexWaygood Dec 17, 2022 •

edited

Loading

Avasam Dec 20, 2022 •

edited

Loading

AlexWaygood Dec 20, 2022

AlexWaygood commented Dec 17, 2022 •

edited

Loading

AlexWaygood commented Dec 17, 2022

AlexWaygood commented Dec 18, 2022

Avasam commented Dec 20, 2022 •

edited

Loading

AlexWaygood commented Dec 20, 2022 •

edited

Loading

This comment has been minimized.

AlexWaygood commented Dec 31, 2022

Avasam commented Dec 31, 2022

github-actions bot commented Dec 31, 2022

Avasam Jan 2, 2023

AlexWaygood Jan 2, 2023

Avasam commented Jan 4, 2023

	def frPyObjects(pyobj: _NDArrayUInt32 \| list[list[int]] \| list[_EncodedRLE], h: int, w: int) -> list[_EncodedRLE]: ... # type: ignore[misc]
	def frPyObjects(pyobj: _NDArrayUInt32 \| list[list[int]] \| list[_EncodedRLE], h: int, w: int) -> list[_EncodedRLE]: ...

		fi

		$PYTHON_EXECUTABLE tests/stubtest_third_party.py --specified-stubs-only --num-shards 4 --shard-index ${{ matrix.shard-index }}

[WIP] Support non-type dependencies #9374

[WIP] Support non-type dependencies #9374

Conversation

Avasam commented Dec 17, 2022 • edited Loading

Choose a reason for hiding this comment

Avasam Dec 17, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexWaygood Dec 17, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hauntsaninja Dec 17, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexWaygood commented Dec 17, 2022

This comment has been minimized.

AlexWaygood Dec 17, 2022 • edited Loading

Choose a reason for hiding this comment

Avasam Dec 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexWaygood commented Dec 17, 2022 • edited Loading

Problem

Potential solution (1)

Potential solution (1b)

Potential solution (2)

Potential solution (3)

AlexWaygood commented Dec 17, 2022

AlexWaygood commented Dec 18, 2022

Avasam commented Dec 20, 2022 • edited Loading

AlexWaygood commented Dec 20, 2022 • edited Loading

This comment has been minimized.

AlexWaygood commented Dec 31, 2022

Avasam commented Dec 31, 2022

github-actions bot commented Dec 31, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Avasam commented Jan 4, 2023

Avasam commented Dec 17, 2022 •

edited

Loading

Avasam Dec 17, 2022 •

edited

Loading

AlexWaygood Dec 17, 2022 •

edited

Loading

hauntsaninja Dec 17, 2022 •

edited

Loading

AlexWaygood Dec 17, 2022 •

edited

Loading

Avasam Dec 20, 2022 •

edited

Loading

AlexWaygood commented Dec 17, 2022 •

edited

Loading

Avasam commented Dec 20, 2022 •

edited

Loading

AlexWaygood commented Dec 20, 2022 •

edited

Loading