Skip to content

Transcribe: Enable MyPy #12588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Transcribe: Enable MyPy #12588

wants to merge 1 commit into from

Conversation

bblommers
Copy link
Contributor

@bblommers bblommers commented May 7, 2025

Motivation

Part of a broader push to enforce typing support everywhere (see #12532). One of the areas that I wanted to focus next on was adding correct types to the BaseStore/AccountRegionBundle/etc, because it is such a fundamental concept that all emulators would benefit from improved type support.

However, before improving types there, it makes sense to first ensure that the usage of BaseStore is typed and validated. Only then can we be improve the types in BaseStore and be sure they are correct.

Hence this PR: enforcing type checks in one of the smaller services, for the sole purpose of checking that BaseStore is used/typed correctly.

History

An earlier version of this PR tackled everything at once, but with the above PR's already in place, this PR has been simplified to only change the Transcribe-specific types.

@bblommers bblommers added this to the Playground milestone May 7, 2025
@bblommers bblommers requested a review from alexrashed May 7, 2025 11:52
@bblommers bblommers added aws:transcribe Amazon Transcribe semver: patch Non-breaking changes which can be included in patch releases labels May 7, 2025
Copy link

github-actions bot commented May 7, 2025

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 43m 5s ⏱️ + 1m 22s
4 459 tests ±0  4 074 ✅ ±0  385 💤 ±0  0 ❌ ±0 
4 461 runs  ±0  4 074 ✅ ±0  387 💤 ±0  0 ❌ ±0 

Results for commit 212965f. ± Comparison against base commit d3020f4.

♻️ This comment has been updated with latest results.

@bblommers bblommers force-pushed the transcribe-enable-mypy branch from 3e9da18 to 14069d2 Compare May 8, 2025 09:02
@bblommers bblommers marked this pull request as ready for review May 8, 2025 10:09
@alexrashed
Copy link
Member

@sannya-singal As a service owner for transcribe: What do you think? I personally are very much in favor or increasing the enforcement of type hints and checks, but it's really you as a service owner who has to decide if you want to introduce this for your service. 😃
@bblommers kudos for driving this effort, onboarding the first AWS service is a huge step towards starting a "mypy pandemic" in the good way / i.e. making typing enforcement infectious 😄

Copy link
Contributor

@sannya-singal sannya-singal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work 🚀 Enabling MyPy for the Transcribe module is a great step towards our broader goals of strengthening type checks across the codebase, and I really like the incremental approach you are taking here 🎉

Kudos as well for integrating this with the first AWS service 🙌 Having type checking enabled at the service level will also empower code owners to enforce typing in their codebase, also making it easier to catch issues early and improve maintainability.

Looks great from my side 🚀 thanks for pushing this forward! 👍

Copy link
Member

@alexrashed alexrashed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this push towards better and enforced typing! 💯
I just stumbled upon the exclusion of the updated provider from the unit test which worries me a bit. I think it would be great to quickly discuss how we want to move forward here before we merge this PR.

next_token: NextToken = None,
max_results: MaxResults = None,
**kwargs,
status: Optional[TranscriptionJobStatus] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: We could maybe directly use PEP 604 syntax here?

Suggested change
status: Optional[TranscriptionJobStatus] = None,
status: TranscriptionJobStatus | None = None,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly used Optional here because it's not supported in Python 3.9, so if the CLI ever explicitly loads this module in Python 3.9 it would fall over.

I don't know how likely that is though - I'm happy to do either way 🤷

Comment on lines 16 to 19
if "TranscribeApi" in str(base_class):
raise SkipTest(
"Transcribe Signatures are different on purpose, as optional args are marked as such"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Unfortunately, this one is problematic. This test makes sure that the provider implementations don't drift from the upstream API / their superclasses.
While we did reduce the coupling between the generated API and the providers implementation with adding the kwargs everywhere, we were still monitoring and updating the signatures based on this tests' results.
An example for an ASF Update PR with changed providers would be #12490.

The root of this problem is actually that we do generate the code with wrong typing. In this case I think we should just update the code generation and all its subclasses. It might even make sense to implement a script (with something like libcst) that does this automatically to help us keep our implementations up-to-date automatically (as part of the ASF Update action directly).

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the generated code makes a lot of sense IMO. Considering the scope of changing signatures over the entire codebase, we could split it up in three parts:

  1. Change this test to view X=None and X|None = None as equal, as a stop gap measure while we have both versions in place
  2. Change the ASF update action to update all the generated code to X | None
  3. Change the providers themselves using libcst

We could do all of this in one PR, but that would be massive, and very difficult to review.

Any objections to the stepwise approach @alexrashed?

(I'll mark this PR as draft for now, until we have a solution.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perfect. let's move forwards with (1) here and then we can tackle the other two in follow ups! 💯

@bblommers bblommers force-pushed the transcribe-enable-mypy branch from 3332882 to 212965f Compare May 20, 2025 11:29
@bblommers bblommers marked this pull request as ready for review May 20, 2025 19:18
@bblommers bblommers requested a review from sannya-singal May 20, 2025 19:18
Copy link
Contributor

@sannya-singal sannya-singal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀 Thanks for incorporating the feedback! I will defer to @alexrashed for a re-review since he requested the changes.

@bblommers bblommers requested a review from alexrashed May 22, 2025 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws:transcribe Amazon Transcribe semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants