⚡️ Performance: Overhead of populateGlobalsFromLib in scope-manager #9575
@jakebailey and I looked at this briefly in an existing performance pairing session. Jake suggested we might be able to cache/memoize the root global scope based on the backing library. Note that neither of us is deeply familiar with this area - but my hunch is that that'd be a good direction to look. cc @bradzacher
The implicit globals are defined automatically if you are using type-aware linting. In v8 we could probably STOP defining them automatically and force anyone who chooses to use those rules to pass the globals themselves.

As for speeding it up - sadly there's no memcpy in JS - so we can't just tell JS "hey copy this block of memory please and thanks"; instead we're stuck iterating and assigning.
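To make the "iterating and assigning" cost concrete, here's a minimal sketch of the kind of per-file work being described - the type shapes and function name are simplified stand-ins for illustration, not the real scope-manager internals:

```ts
interface ImplicitVariableDef {
  isTypeVariable: boolean;
  isValueVariable: boolean;
}

interface Variable extends ImplicitVariableDef {
  name: string;
}

// Hypothetical simplification: each lib contributes a record of definitions,
// and every parsed file rebuilds a fresh Variable object for each entry.
function populateGlobalsSketch(
  globalScope: Map<string, Variable>,
  libs: Record<string, ImplicitVariableDef>[],
): void {
  for (const lib of libs) {
    for (const [name, def] of Object.entries(lib)) {
      // No bulk copy available - one allocation and assignment per global,
      // repeated for every file in the lint run.
      globalScope.set(name, { name, ...def });
    }
  }
}
```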
That garbage collection likely isn't something to do with this call specifically.

typescript-eslint/packages/scope-manager/src/referencer/Referencer.ts Lines 85 to 103 in 90bacee
typescript-eslint/packages/scope-manager/src/scope/GlobalScope.ts Lines 42 to 53 in 90bacee
typescript-eslint/packages/scope-manager/src/scope/ScopeBase.ts Lines 402 to 429 in 90bacee
typescript-eslint/packages/scope-manager/src/variable/ImplicitLibVariable.ts Lines 26 to 43 in 90bacee
Grr 😬 I'd really like to avoid that if possible. We have enough configuration woes for users as it is... but that'd be good to keep in mind as a backup option.
The "cache/memoize the root global scope based on the backing library" idea: I added a performance comparison at https://github.com/typescript-eslint/performance/blob/13c6c79f0d5a765471ad6d88d079268c94ccba45/README.md#comparison-globals-in-scopes. Skipping this function improves the rough baseline lint time when linting ~1k files with typed linting by 20% or so.
The thing I was actually trying to say is to have a cache of global scopes, not to stop respecting lib. The signature of populateGlobalsFromLib makes me think that it's already written in such a way as to be an "overlay" of scopes. If you only ever do this once per list of lib stuff, then each file can just re-nest from that and work. But, maybe I'm just plain underestimating the complexity of that.
This is correct, but we cannot add a new scope to the hierarchy. Doing so would break many invariants and assumptions that exist in lint rules. We also have to be careful because scopes are mutable - eslint will write the set of configured globals to them for each file. So we need to be careful not to memoise unsafely - we need to do so in such a way that we are not opening the door to contamination of the memoised value.
Hm, if everything inherits from the same scope anyway, is it a bad idea to add the elements from the lib to that shared scope up front?
Everything doesn't inherit from the same scope - each file that is parsed creates a new instance of the global scope. That's how isolation is achieved right now. The name "global scope" is misleading here - it's not "global and shared between files", it's just "global variables that a given file can see".
Oh, that's unfortunate, I was hoping for some ability for copy-on-write here :(
Nah, sadly what happens is that you can have a different global environment for each file - eg a test file has the test framework's globals that a regular source file doesn't.

So there isn't a "canonical global" environment. In reality there's a few small sets of global scopes across a lint run - however eslint doesn't tell us that. It doesn't tell us anything - it just asks for a scope manager and then it populates the global scope with variables afterward.
Sadly yeah, the usage of a cache there gets complicated. But if we can save time then it might be worth it at scale.
The main issue is that every global variable should expose mutable references:

typescript-eslint/packages/scope-manager/src/variable/VariableBase.ts Lines 37 to 46 in e239e0c
These references are unique for each global variable for each file. Imagine:

```ts
// file1.ts
Promise.resolve()

// file2.ts
Promise.resolve()
Promise.reject()
```

In file1.ts the Promise variable has a single reference; in file2.ts it has two. I reduced mean lint time by roughly 9-10% this way. It's not as good as it could be, but at least it's something. If we're okay with these proxy tricks, I can send a PR.
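To spell out why those per-file references make naive sharing dangerous, here's a tiny illustration - the `Reference`/`SharedVariable` shapes below are hypothetical simplifications, not the real scope-manager types:

```ts
interface Reference {
  identifier: string;
}

interface SharedVariable {
  name: string;
  references: Reference[]; // mutable, and meant to be per-file
}

// Hypothetical naive memoisation: one Variable instance reused across files.
const promiseVar: SharedVariable = { name: 'Promise', references: [] };

// Linting file1.ts records its one reference...
promiseVar.references.push({ identifier: 'Promise' });

// ...then linting file2.ts records two more into the same array.
promiseVar.references.push({ identifier: 'Promise' });
promiseVar.references.push({ identifier: 'Promise' });

// Rules analysing file2.ts should only see the 2 references that actually
// occur in file2.ts, but the shared array now holds 3.
console.log(promiseVar.references.length); // 3
```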
Yeah for sure, auvred - but as jake mentioned - in an ideal world we'd implement what they call a COW pattern -- copy on write. While things only read the variables they all see the same, immutable data; it's not until they attempt to write to the data that we copy the reference into the scope properly and then mutate it.

There's several hundred variables defined via the TS lib types - most of which aren't ever read, let alone written to. So a COW pattern would let us build the set once and only go through the heavy effort of duplicating the variable when someone actually references it in the linted code - meaning most of the globals would never get cloned (or even read!).

Sadly ESLint's architecture prevents us from building this pattern. That being said, even if ESLint's design was more conducive to the idea, COW is pretty hard to implement in JS given anyone can hold a reference to and mutate any data at any time. We'd need some intense Proxy-based shenanigans to do COW transparently 🤮
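For illustration, a minimal sketch of what those Proxy-based copy-on-write shenanigans could look like - the `Variable` shape and the `copyOnWrite` helper are assumptions made for this example, not anything that exists in scope-manager:

```ts
interface Variable {
  name: string;
  eslintUsed: boolean;
}

// Serve reads from the shared template until the first write, then clone.
function copyOnWrite(template: Variable): Variable {
  let copy: Variable | undefined;
  return new Proxy(template, {
    get(target, prop) {
      return Reflect.get(copy ?? target, prop);
    },
    set(_target, prop, value) {
      copy ??= structuredClone(template); // copy only on the first write
      return Reflect.set(copy, prop, value);
    },
  });
}

const shared: Variable = { name: 'Promise', eslintUsed: false };
const perFileView = copyOnWrite(shared);

perFileView.eslintUsed = true; // triggers the clone
console.log(shared.eslintUsed); // false - the shared template is untouched
console.log(perFileView.eslintUsed); // true - reads now come from the copy

// Caveat (part of why this gets hairy): mutating *nested* objects, e.g.
// pushing into a shared `references` array, would bypass the `set` trap
// unless nested values are wrapped too.
```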
I didn't realize reference counts were there too. Oof. I wasn't meaning a very magical CoW, more like at least the functionality to build up a chain of scopes like an overlay, such that you (at least in TS) are prevented from writing upwards accidentally. Counts could be stored somewhere else too. But, that all seems like it doesn't fit within eslint's design... Without the Proxy shenanigans anyway
I think that the best idea here would probably be to do a copy-on-read style. eg something like this quick-and-dirty design:

```ts
class CopyOnReadMap extends Map<string, Variable> {
  constructor(
    private libVariables: Map<string, Variable>,
  ) {
    super();
  }

  override get(key: string): Variable | undefined {
    const selfValue = super.get(key);
    if (selfValue) {
      return selfValue;
    }

    const libValue = this.libVariables.get(key);
    if (libValue) {
      const cloned = CLONE_VARIABLE(libValue);
      this.set(key, cloned);
      return cloned;
    }

    return undefined;
  }
}
```

we'd ofc have to override the iterator methods as well to ensure we populate the variables if someone's dumb enough to iterate all the variables in the global scope.
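For completeness, a rough idea of what overriding those iterator methods might look like, extending the quick-and-dirty sketch above - still assuming the hypothetical `CLONE_VARIABLE` helper and a simplified `Variable` type:

```ts
interface Variable {
  name: string;
}

// Hypothetical helper: however a lib Variable would get cloned per file.
declare function CLONE_VARIABLE(variable: Variable): Variable;

class CopyOnReadMap extends Map<string, Variable> {
  constructor(private libVariables: Map<string, Variable>) {
    super();
  }

  // ...the `get` override from the sketch above...

  // Clone every lib variable that hasn't been read yet, so that iteration
  // sees the full set of globals.
  private materializeAll(): void {
    for (const [key, libValue] of this.libVariables) {
      if (!super.has(key)) {
        this.set(key, CLONE_VARIABLE(libValue));
      }
    }
  }

  override keys() {
    this.materializeAll();
    return super.keys();
  }

  override values() {
    this.materializeAll();
    return super.values();
  }

  override entries() {
    this.materializeAll();
    return super.entries();
  }

  override [Symbol.iterator]() {
    return this.entries();
  }

  override forEach(
    callbackfn: (value: Variable, key: string, map: Map<string, Variable>) => void,
    thisArg?: unknown,
  ): void {
    this.materializeAll();
    super.forEach(callbackfn, thisArg);
  }
}
```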
FWIW I'm happy with any significant performance boost we can get quickly. The 9-10% is a fantastic result (nice job @auvred! 👏) that would be more than reasonable to ship with a big TODO to switch to a COR/COW/etc. later IMO.
Yea, rules do iterate scope variables - check out this one: https://github.com/eslint/eslint/blob/13d0bd371eb8eb4aa1601c8727212a62ab923d0e/lib/rules/camelcase.js#L253

Tomorrow I will try to dig into all this more deeply 😄
By the way, another direction we could take is to have ESLint lazily create+analyze with scope managers. Why bother making them at all for files that don't need scope analysis?

Skipping ScopeManager analysis wouldn't remove overhead for files that actually are linted with scope. But it would help greatly with the case of projects that don't use it in some files. I'll file an issue over on ESLint core.
A lot of recommended rules use scope analysis - so you wouldn't save much by doing that change. Eg it's rare that a project doesn't use at least one of them.
So there are changes with proxied implicit global variables: auvred@20400ab. This looks very ugly, and it looks like these changes only show a real speed increase in the artificial test case from the performance comparison repo. I also tried to come up with some lazily computed versions of the implicit lib variables.
I tend to think that in real projects, this overhead is not that noticeable. I just ran the profiler like this:

```bash
cd packages/eslint-plugin
node --cpu-prof --cpu-prof-interval=20 ../../node_modules/eslint/bin/eslint.js
```

And the entire populateGlobalsFromLib call accounted for only a small fraction of the overall run. So even if we somehow cleverly optimize it with proxying, most users (including typescript-eslint) won't notice any difference :(
@auvred worth noting that if you have type-info rules on then that will dwarf the runtime - but without such rules then it'll take a larger %age of the time.
I'd independently noticed the same performance bottleneck with populateGlobalsFromLib. First thing I noticed is that this:
is regenerating the Object.entries array for each used Lib on every file, so you can save a bit by having the Lib records export the result of Object.entries instead of the object form to begin with (in the large project I was testing with this saved about 4 seconds).
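Roughly what that suggestion looks like - the record contents and names below are illustrative stand-ins, not the actual generated lib files:

```ts
// Hypothetical shape of one generated lib record today.
const libPromise = {
  Promise: { isTypeVariable: true, isValueVariable: true },
  PromiseConstructor: { isTypeVariable: true, isValueVariable: false },
};

// Today (conceptually): the entries array is rebuilt for every linted file.
function populateToday(define: (name: string, def: object) => void): void {
  for (const [name, def] of Object.entries(libPromise)) {
    define(name, def);
  }
}

// Suggested: compute the entries once at module load, so the per-file loop
// only iterates a prebuilt array and skips the repeated allocation.
const libPromiseEntries = Object.entries(libPromise);

function populateSuggested(define: (name: string, def: object) => void): void {
  for (const [name, def] of libPromiseEntries) {
    define(name, def);
  }
}
```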
However, it occurred to me that the linter (or Prettier, or whatever) doesn't care if there are implicit global variables unless the corresponding identifier appears in the file somewhere; if anything, they're actually an obstacle because the rules have to waste time skipping them.

Based on this observation, I put together a prototype where I delayed injection of the implicit global variables until just before finalizing the global scope. See #10561. In my local testing this version shaved around 25 seconds off of a 112 second eslint invocation.

Edit: PR was incomplete and has since been updated to pass all the unit tests, though I did have to update the logic in the tests that evaluated inclusion of correct Lib variables.
Nice!
Overview
I'm investigating performance traces of typescript-eslint work in preparation for v8 (#9571). You can find some preliminary traces in https://github.com/typescript-eslint/performance/blob/c5e8a725ad5338a3dca5cebcd69f5a802df8c8d6/README.md#tracing-measurements -> https://github.com/typescript-eslint/performance/blob/main/traces/Project%201%20-%20Service%202.cpuprofile.
If you look at the call stack in an example project with a large number of simple files, quite a few of them are spending anywhere from ~0.3ms to ~2ms in populateGlobalsFromLib, sometimes including garbage collection.

Most rules don't actually use the scope manager, and some only use it conditionally. So this function is unnecessary for many projects!
I tried commenting it out, and lint times for ~1024 .ts files with parserOptions.project (even with only layout rules) went from ~3.8s to ~2.8s. That's a great savings!

Even in projects where it is necessary, it seems exceptionally long to spend over a millisecond on it. I wonder if switching the data structures around to a more performant loop style (e.g. from an object to an array) would be helpful.
Both parserOptions.project and parserOptions.projectService are showing this behavior.