Skip to content

⚡️ Performance: Overhead of populateGlobalsFromLib in scope-manager #9575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
JoshuaKGoldberg opened this issue Jul 17, 2024 · 22 comments · Fixed by #10647 · May be fixed by #10561
Closed

⚡️ Performance: Overhead of populateGlobalsFromLib in scope-manager #9575

JoshuaKGoldberg opened this issue Jul 17, 2024 · 22 comments · Fixed by #10647 · May be fixed by #10561
Labels
accepting prs Go ahead, send a pull request that resolves this issue locked due to age Please open a new issue if you'd like to say more. See https://typescript-eslint.io/contributing. performance Issues regarding performance

Comments

@JoshuaKGoldberg
Copy link
Member

Overview

I'm investigating performance traces of typescript-eslint work in preparation for v8 (#9571). You can find some preliminary traces in https://github.com/typescript-eslint/performance/blob/c5e8a725ad5338a3dca5cebcd69f5a802df8c8d6/README.md#tracing-measurements -> https://github.com/typescript-eslint/performance/blob/main/traces/Project%201%20-%20Service%202.cpuprofile.

If you look at the call stack in an example project with a large number of simple files, quite a few of them are spending anywhere from ~0.3ms to ~2ms in populateGlobalsFromLib, sometimes including garbage collection:

Screenshot of a linked stack trace zoomed into several files parseForESLint. Each has a populateGlobalsFromLib taking up most of its call tree.

Most rules don't actually use the scope manager, and sometimes only conditionally. So this function is unnecessary for many projects!

I tried commenting it out and lint times for ~1024 .ts files, even layout, parserOptions.project went from ~3.8s to ~2.8s. That's a great savings!

Even in projects where it is necessary, it seems exceptionally long to spend over a ms on it. I wonder if switching the data structures around to a more performant loop style (e.g. from an object to an array) would be helpful.

Both parserOptions.project and parserOptions.projectService are showing this behavior.

@JoshuaKGoldberg JoshuaKGoldberg added triage Waiting for team members to take a look performance Issues regarding performance labels Jul 17, 2024
@JoshuaKGoldberg
Copy link
Member Author

@jakebailey and I I looked at this briefly in an existing performance pairing session. Jake ideated we might be able to cache/memoize the root global scope based on the backing library (this.#lib).

Note that neither of us are deeply familiar with this area - but my hunch is that that'd be a good direction to look.

cc @bradzacher

@bradzacher
Copy link
Member

bradzacher commented Jul 18, 2024

The implicit globals are defined automatically if you are using type-aware linting.
The intent is that they make rules like no-undef and no-redeclare "just work".

In v8 we could probably STOP defining them automatically and force anyone who chooses to use those rules to pass parserOptions.lib. It's just a complex setup problem really - do we want it to work "for free" or not?


As for speeding it up - sadly there's no memcpy in JS - so we can't just tell JS "hey copy this block of memory please and thanks" and instead we're stuck iterating and assigning.
The backing store for the variables is a Map (which is neccesitated by eslint's design) - which limits our choices...

sometimes including garbage collection

That garbage collection isn't likely something to do with the call, specifically.
If you look at the chain of calls there's very little local assignments - just the fn arguments really.
So it's unlikely that that GC time is specifically due to the method in question.

private populateGlobalsFromLib(globalScope: GlobalScope): void {
for (const lib of this.#lib) {
const variables = TSLibraries[lib];
// eslint-disable-next-line @typescript-eslint/no-unnecessary-condition
/* istanbul ignore if */ if (!variables) {
throw new Error(`Invalid value for lib provided: ${lib}`);
}
for (const [name, variable] of Object.entries(variables)) {
globalScope.defineImplicitVariable(name, variable);
}
}
// for const assertions (`{} as const` / `<const>{}`)
globalScope.defineImplicitVariable('const', {
eslintImplicitGlobalSetting: 'readonly',
isTypeVariable: true,
isValueVariable: false,
});
}

public defineImplicitVariable(
name: string,
options: ImplicitLibVariableOptions,
): void {
this.defineVariable(
new ImplicitLibVariable(this, name, options),
this.set,
this.variables,
null,
null,
);
}

protected defineVariable(
nameOrVariable: Variable | string,
set: Map<string, Variable>,
variables: Variable[],
node: TSESTree.Identifier | null,
def: Definition | null,
): void {
const name =
typeof nameOrVariable === 'string' ? nameOrVariable : nameOrVariable.name;
let variable = set.get(name);
if (!variable) {
variable =
typeof nameOrVariable === 'string'
? new Variable(name, this as Scope)
: nameOrVariable;
set.set(name, variable);
variables.push(variable);
}
if (def) {
variable.defs.push(def);
this.addDeclaredVariablesOfNode(variable, def.node);
this.addDeclaredVariablesOfNode(variable, def.parent);
}
if (node) {
variable.identifiers.push(node);
}
}

public constructor(
scope: Scope,
name: string,
{
isTypeVariable,
isValueVariable,
writeable,
eslintImplicitGlobalSetting,
}: ImplicitLibVariableOptions,
) {
super(name, scope);
this.isTypeVariable = isTypeVariable ?? false;
this.isValueVariable = isValueVariable ?? false;
this.writeable = writeable ?? false;
this.eslintImplicitGlobalSetting =
eslintImplicitGlobalSetting ?? 'readonly';
}
}

@JoshuaKGoldberg
Copy link
Member Author

JoshuaKGoldberg commented Jul 18, 2024

STOP defining them automatically

Grr 😬 I'd really like to avoid that if possible. We have enough configuration woes for users as it is... but that'd good to keep in mind as a backup option.

As for speeding it up

The "cache/memoize the root global scope based on the backing library (this.#lib) idea from #9575 (comment) is very appealing. And/or, perhaps we could use a "dirty" flag/tracker to delay making a new global scope until needed?

I added a performance comparison at https://github.com/typescript-eslint/performance/blob/13c6c79f0d5a765471ad6d88d079268c94ccba45/README.md#comparison-globals-in-scopes. Skipping this function improves the rough baseline lint time in linting ~1k files with typed linting by 20% or so:

Variant Measurement User Time
Baseline 3.137 s ± 0.024 s 4.417 s
Skipping 2.477 s ± 0.014 s 3.501 s

@jakebailey
Copy link
Collaborator

The thing I was actually trying to say is to have a cache of global scopes, not to stop respecting lib. The signature of nestGlobalScope is:

(method) ScopeManager.nestGlobalScope(node: GlobalScope["block"]): GlobalScope

This makes me think that it's already written in such a way to be an "overlay" of scopes. If you only ever do this once per list of lib stuff, then each file can just re-nest from that and work.

But, maybe I'm just plain underestimating the complexity of that.

@bradzacher
Copy link
Member

This makes me think that it's already written in such a way to be an "overlay" of scopes. If you only ever do this once per list of lib stuff, then each file can just re-nest from that and work.

This is correct but we cannot add a new scope to the hierarchy. Doing so would break many invariants and assumptions that exist in lint rules.

We also have to be careful because scopes are mutable - eslint will write the set of configured globals to them for each file. So we need to be careful about memoising unsafely - we need to do so in such a way that we are not opening the door to contamination of the memoised value.

@jakebailey
Copy link
Collaborator

Hm, if everything inherits from the same scope anyway, is it a bad idea to add the elements from lib once to the initial scope manager (the one that gets passed in directly)? Or is that also dangerous?

@bradzacher
Copy link
Member

if everything inherits from the same scope anyway

Everything doesn't inherit from the same scope - each file that is parsed creates a new instance of the global scope. That's how isolation is achieved right now.

The name "global scope" is misleading here - it's not "global and shared between files", it's just "global variables that a given file can see".

@jakebailey
Copy link
Collaborator

Oh, that's unfortunate, I was hoping for some abilitiy for copy-on-write here :(

@bradzacher
Copy link
Member

Nah sadly what happens is that you can have a different global environment for each file - eg a test file has the jest globals, a react file has the web globals, a server/ file has the node globals. You can even technically configure different ES versions per file in eslint.

So there isn't a "canonical global" environment.

In reality there's a few small sets of global scopes across a lint run - however eslint doesn't tell us that. It doesn't tell us anything - it just asks for a scope manager and then it populates the global scope with variables afterward.

I was hoping for some abilitiy for copy-on-write here

Sadly yeah the usage of a Map fucks us here too - we can't even use a Proxy easily to have a single shared but locally overwriteable setup. We'd have to proxy or subclass the Map to do that. It's doable but it would be hard and icky.

But if we can save time then it might be worth it at scale.

@auvred
Copy link
Member

auvred commented Jul 24, 2024

The main issue is that every global variable should expose mutable references field:

/**
* List of {@link Reference} of this variable (excluding parameter entries) in its defining scope and all nested scopes.
* For defining occurrences only see {@link Variable#defs}.
* @public
*/
public readonly references: Reference[] = [];
/**
* Reference to the enclosing Scope.
*/
public readonly scope: Scope;

These references are unique for each global variable for each file.

Imagine

// file1.ts
Promise.resolve()

// file2.ts
Promise.resolve()
Promise.reject()

In file1.ts the global variable "Promise" has one reference, but in file2.ts it has two references. Therefore, we cannot reuse the same object for a global variable in multiple linted files :(


I reduced mean lint time for files-1024-layout-even-singlerun-false-types-service locally from 4.694 s ± 0.090 s to 4.261 s ± 0.072 s (9-10 % faster) by creating ImplicitLibVariable once for each global variable and wrapping it in a Proxy that proxies scope and references to unique ones.

It's not as good as it could be, but at least it's something. If we're okay with these proxy tricks, I can send a PR.

@bradzacher
Copy link
Member

bradzacher commented Jul 24, 2024

Yeah for sure, auvred - but as jake mentioned - in an ideal world we'd implement what they call a COW pattern -- copy on write. So whilst things read the variables they all see the same, immutable data - but it's not until they attempt to write to the data that we copy the reference into the scope properly and then mutate it.

There's several hundred variables defined via the TS lib types - most of which aren't ever read, let alone written to. So a COW pattern would let us build the set once and only go through the heavy effort of duplicating the variable when someone actually references it in the linted code. Meaning most of the globals would never get cloned (or even read!)

Sadly ESLint's architecture prevents us from building this pattern. That being said even if ESLint's design was more conducive to the idea - COW is pretty hard to implement in JS given anyone can hold a reference to and mutate any data at any time. We'd need some intense Proxy-based shenanigans to do COW transparently 🤮

@jakebailey
Copy link
Collaborator

I didn't realize reference counts were there too. Oof.

I wasn't meaning a very magical CoW, more like at least the functionality to build up a chain of scopes like an overlay, such that you (at least in TS) are prevented from writing upwards accidentally. Counts could be stored somewhere else too.

But, that all seems like it doesn't fit within eslint's design... Without the Proxy shenanigans anyway

@bradzacher
Copy link
Member

bradzacher commented Jul 24, 2024

I think that the best idea here would probably to do a copy-on-read style.
By that I mean we subclass Map so that we can avoid instantiating a lib variable until someone attempts to read it out of the scope.

eg something like this quick-and-dirty design:

class CopyOnReadMap extends Map<string, Variable> {
  constructor(
    private libVariables: Map<string, Variable>,
  ) {
    super();
  }

  override get(key: string): Variable | undefined {
    const selfValue = super.get(key);
    if (selfValue) {
      return selfValue;
    }

    const libValue = this.libVariables.get(key);
    if (libValue) {
      const cloned = CLONE_VARIABLE(libValue);
      this.set(key, cloned);
      return cloned;
    }

    return undefined;
  }
}

we'd ofc have to override the iterator methods as well to ensure we populate the variables if someone's dumb enough to iterate all the variables in the global scope.

@bradzacher bradzacher added accepting prs Go ahead, send a pull request that resolves this issue and removed triage Waiting for team members to take a look labels Jul 24, 2024
@JoshuaKGoldberg
Copy link
Member Author

FWIW I'm happy with any significant performance boost we can get quickly. The 9-10% is a fantastic result (nice job @auvred! 👏) that would be more than reasonable to ship with a big TODO to switch to a COR/COW/etc. later IMO.

@auvred
Copy link
Member

auvred commented Jul 24, 2024

we'd ofc have to override the iterator methods

Yea, the variables array is also used in several places, so I didn't even try to do deeper proxied things.

Check out this one https://github.com/eslint/eslint/blob/13d0bd371eb8eb4aa1601c8727212a62ab923d0e/lib/rules/camelcase.js#L253

Tomorrow I will try to dig into all this more deeply 😄

@JoshuaKGoldberg
Copy link
Member Author

JoshuaKGoldberg commented Jul 24, 2024

By the way, another direction we could take is to have ESLint lazily create+analyze with scope managers. Why both making them at all for files that don't need scope analysis?

Skipping ScopeManager analysis wouldn't remove overhead for files that actually are linted with scope. But it would help greatly with the case of projects that don't use it in some files. I'll file an issue over on ESLint core.

@bradzacher
Copy link
Member

Skipping ScopeManager analysis wouldn't remove overhead for files that actually are linted with scope. But it would help greatly with the case of projects that don't use it in some files.

A lot of recommend rules use scope analysis - so you wouldn't save much by doing that change.

Eg it's rare that a project doesn't use no-unused-vars - which is entirely driven by scope analysis.

@auvred
Copy link
Member

auvred commented Jul 26, 2024

So there are changes with proxied implicit global variables: auvred@20400ab. This looks very ugly, and it looks like these changes only show a real speed increase in the artificial test case from the performance repository, because I didn't notice any significant speed boost in the packages/eslint-plugin tests.

I tried to come up with some lazily computed versions of variables, but until no-unused-vars and no-global-assign (and maybe some others) stop iterating over all variables of the global scope, end users won't notice any difference, since these rules are included in the tseslint.configs.recommended and eslint.configs.recommended configurations respectively.

for (const variable of scope.variables) {

https://github.com/eslint/eslint/blob/8e1a627a6784380ca7e7670e336bbe9630da2da1/lib/rules/no-global-assign.js#L91

@auvred
Copy link
Member

auvred commented Jul 27, 2024

I tend to think that in real projects, this overhead is not that noticeable.

I just ran the

cd packages/eslint-plugin
node --cpu-prof --cpu-prof-interval=20 ../../node_modules/eslint/bin/eslint.js

And the entire analyze function was on the stack for only 0.6% of the total lint time.
populateGlobalsFromLib takes ~10% of analyze, so roughly speaking 0.06% of the total execution time.

So even if we somehow cleverly optimize it with proxying, most users (including typescript-eslint) won't notice any difference :(

@bradzacher
Copy link
Member

bradzacher commented Jul 27, 2024

@auvred worth noting that if you have type-info rules on then that will dwarf the runtime - but without such rules then it'll take a larger %age of the time.

@dmichon-msft
Copy link
Contributor

dmichon-msft commented Dec 27, 2024

I'd independently noticed the same performance bottleneck with populateGlobalsFromLib and did some local experimentation to see if I could improve performance without altering signatures.

First thing I noticed is that this:

for (const [name, variable] of Object.entries(variables)) {

is regenerating the Object.entries array for each used Lib on every file, so you can save a bit by having the Lib records export the result of Object.entries instead of the object form to begin with (in the large project I was testing with this saved about 4 seconds).

However, it occurred to me that the linter (or Prettier, or whatever) doesn't care if there are implicit global variables unless the corresponding identifier appears in the file somewhere; if anything, they're actually an obstacle because the rules have to waste time skipping them. Based on this observation, I put together a prototype where I delayed injection of the implicit global variables until just before finalizing the global scope.

See #10561.

In my local testing this version shaved around 25 seconds off of a 112 second eslint invocation.

Edit: PR was incomplete and has since been updated to pass all the unit tests, though I did have to update the logic in the tests that evaluated inclusion of correct Lib variables.

@octogonz
Copy link
Contributor

In my local testing this version shaved around 25 seconds off of a 112 second eslint invocation.

Nice!

@github-actions github-actions bot added the locked due to age Please open a new issue if you'd like to say more. See https://typescript-eslint.io/contributing. label Mar 11, 2025
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepting prs Go ahead, send a pull request that resolves this issue locked due to age Please open a new issue if you'd like to say more. See https://typescript-eslint.io/contributing. performance Issues regarding performance
Projects
None yet
6 participants