Skip to content

feat(typescript-estree): lazily compute loc #6542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

bradzacher
Copy link
Member

@bradzacher bradzacher commented Feb 28, 2023

PR Checklist

Overview

This PR changes our AST conversion to lazily compute the .loc for all nodes, comments, and tokens.

TS does not compute node location during the parsing sequence. This means that in order to calculate the location, we need to ask TS to convert the range to a location - which involves doing a binary search of the line:range mapping.

Ideally TS would calculate and store these as it goes - but understandably they don't because TS doesn't need the information for the most part!

Perf Result

This change reduces time spent in getLineAndCharacterFor in our codebase from 231ms to 68ms, however the time spent in astConverter was mostly a neutral change - suggesting that just declaring the getters and setters cost the same as the time saved.

I'm really surprised that accessors are so expensive to define. Really sad that this is a dead end. I'm putting this up and closing it immediately just so there's a clear log of things for people to see.

Here are the cpu profiles for the change:
Archive.zip

@nx-cloud
Copy link

nx-cloud bot commented Feb 28, 2023

☁️ Nx Cloud Report

We didn't find any information for the current pull request with the commit a79b7c6.
You might need to set the 'NX_BRANCH' environment variable in your CI pipeline.

Check the Nx Cloud Github Integration documentation for more information.


Sent with 💌 from NxCloud.

@typescript-eslint
Copy link
Contributor

Thanks for the PR, @bradzacher!

typescript-eslint is a 100% community driven project, and we are incredibly grateful that you are contributing to that community.

The core maintainers work on this in their personal time, so please understand that it may not be possible for them to review your work immediately.

Thanks again!


🙏 Please, if you or your company is finding typescript-eslint valuable, help us sustain the project by sponsoring it transparently on https://opencollective.com/typescript-eslint.

@bradzacher bradzacher closed this Feb 28, 2023
@netlify
Copy link

netlify bot commented Feb 28, 2023

Deploy Preview for typescript-eslint ready!

Name Link
🔨 Latest commit a79b7c6
🔍 Latest deploy log https://app.netlify.com/sites/typescript-eslint/deploys/63fd446f5cde6100089b84e3
😎 Deploy Preview https://deploy-preview-6542--typescript-eslint.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@armano2
Copy link
Collaborator

armano2 commented Feb 28, 2023

wdyt about changing that entire loc is a getter? that should save some time on declaring so many getters, we could make it, I don't think we have to worry about setter

you also forgot to remove other calls to getLineAndCharacterFor eg in fixParentLocation

export function defineLocProperty<T extends { range?: [number, number] }>(
  ast: ts.SourceFile,
  node: T,
): T & { loc: TSESTree.SourceLocation } {
  Object.defineProperty(node, 'loc', {
    get(this: { range: [number, number] }) {
      return {
        start: getLineAndCharacterFor(this.range[0], ast),
        end: getLineAndCharacterFor(this.range[1], ast),
      };
    },
  });
  return node as T & { loc: TSESTree.SourceLocation };
}

@bradzacher
Copy link
Member Author

@armano2 I actually went down that route first before moving down one layer.
It was unfortunately about the same performance (it actually trended slightly negative, adding a dozen or so MS in some runs).

I think that Object.defineProperty is even slower than the accessor syntax because of the additional checks it has to do over just defining an object with accessors.

I don't think we have to worry about setter

We do need a setter, sadly, as there's a few places during the conversion that we change the default location for a node (for example when separating out export to its own node).

@armano2
Copy link
Collaborator

armano2 commented Feb 28, 2023

we actually don't need a setter for our uses,

we can rely on range[0] and range[1] from current object and with that loc is going to react to all changes

example
Index: packages/typescript-estree/src/convert-comments.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/packages/typescript-estree/src/convert-comments.ts b/packages/typescript-estree/src/convert-comments.ts
--- a/packages/typescript-estree/src/convert-comments.ts	(revision a79b7c687d7e6d21ae88aaf06cd526ad81254695)
+++ b/packages/typescript-estree/src/convert-comments.ts	(date 1677547296096)
@@ -1,7 +1,7 @@
 import { forEachComment } from 'tsutils/util/util';
 import * as ts from 'typescript';
 
-import { getLocFor } from './node-utils';
+import { defineLocProperty } from './node-utils';
 import type { TSESTree } from './ts-estree';
 import { AST_TOKEN_TYPES } from './ts-estree';
 
@@ -35,12 +35,13 @@
             range[1] - textStart
           : // multiline comments end 2 characters early
             range[1] - textStart - 2;
-      comments.push({
-        type,
-        value: code.slice(textStart, textStart + textEnd),
-        range,
-        loc: getLocFor(range[0], range[1], ast),
-      });
+      comments.push(
+        defineLocProperty(ast, {
+          type,
+          value: code.slice(textStart, textStart + textEnd),
+          range,
+        }),
+      );
     },
     ast,
   );
Index: packages/typescript-estree/src/convert.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/packages/typescript-estree/src/convert.ts b/packages/typescript-estree/src/convert.ts
--- a/packages/typescript-estree/src/convert.ts	(revision a79b7c687d7e6d21ae88aaf06cd526ad81254695)
+++ b/packages/typescript-estree/src/convert.ts	(date 1677547017090)
@@ -7,12 +7,11 @@
 import {
   canContainDirective,
   createError,
+  defineLocProperty,
   findNextToken,
   getBinaryExpressionType,
   getDeclarationKind,
   getLastModifier,
-  getLineAndCharacterFor,
-  getLocFor,
   getRange,
   getTextForTokenKind,
   getTSNodeAccessibility,
@@ -175,7 +174,7 @@
         : findNextToken(exportKeyword, this.ast, this.ast);
 
       result.range[0] = varToken!.getStart(this.ast);
-      result.loc = getLocFor(result.range[0], result.range[1], this.ast);
+      defineLocProperty(this.ast, result);
 
       if (declarationIsDefault) {
         return this.createNode<TSESTree.ExportDefaultDeclaration>(node, {
@@ -262,7 +261,7 @@
       );
     }
     if (!result.loc) {
-      result.loc = getLocFor(result.range[0], result.range[1], this.ast);
+      defineLocProperty(this.ast, result);
     }
 
     if (result && this.options.shouldPreserveNodeMaps) {
@@ -305,13 +304,11 @@
         : 1;
     const annotationStartCol = child.getFullStart() - offset;
 
-    const loc = getLocFor(annotationStartCol, child.end, this.ast);
-    return {
+    return defineLocProperty(this.ast, {
       type: AST_NODE_TYPES.TSTypeAnnotation,
-      loc,
       range: [annotationStartCol, child.end],
       typeAnnotation: this.convertType(child),
-    };
+    });
   }
 
   /**
@@ -383,14 +380,13 @@
   ): TSESTree.TSTypeParameterDeclaration {
     const greaterThanToken = findNextToken(typeParameters, this.ast, this.ast)!;
 
-    return {
+    return defineLocProperty(this.ast, {
       type: AST_NODE_TYPES.TSTypeParameterDeclaration,
       range: [typeParameters.pos - 1, greaterThanToken.end],
-      loc: getLocFor(typeParameters.pos - 1, greaterThanToken.end, this.ast),
       params: typeParameters.map(typeParameter =>
         this.convertType(typeParameter),
       ),
-    };
+    });
   }
 
   /**
@@ -763,11 +759,9 @@
   ): void {
     if (childRange[0] < result.range[0]) {
       result.range[0] = childRange[0];
-      result.loc.start = getLineAndCharacterFor(result.range[0], this.ast);
     }
     if (childRange[1] > result.range[1]) {
       result.range[1] = childRange[1];
-      result.loc.end = getLineAndCharacterFor(result.range[1], this.ast);
     }
   }
 
@@ -1659,7 +1653,6 @@
           if (modifiers) {
             // AssignmentPattern should not contain modifiers in range
             result.range[0] = parameter.range[0];
-            result.loc = getLocFor(result.range[0], result.range[1], this.ast);
           }
         } else {
           parameter = result = this.convertChild(node.name, parent);
@@ -1676,10 +1669,6 @@
         if (node.questionToken) {
           if (node.questionToken.end > parameter.range[1]) {
             parameter.range[1] = node.questionToken.end;
-            parameter.loc.end = getLineAndCharacterFor(
-              parameter.range[1],
-              this.ast,
-            );
           }
           parameter.optional = true;
         }
Index: packages/typescript-estree/src/node-utils.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/packages/typescript-estree/src/node-utils.ts b/packages/typescript-estree/src/node-utils.ts
--- a/packages/typescript-estree/src/node-utils.ts	(revision a79b7c687d7e6d21ae88aaf06cd526ad81254695)
+++ b/packages/typescript-estree/src/node-utils.ts	(date 1677547121454)
@@ -171,38 +171,23 @@
 /**
  * Returns line and column data for the given start and end positions,
  * for the given AST
- * @param start start data
- * @param end   end data
+ * @param node the AST node
  * @param ast   the AST object
  * @returns the loc data
  */
-export function getLocFor(
-  start: number,
-  end: number,
+export function defineLocProperty<T extends { range?: [number, number] }>(
   ast: ts.SourceFile,
-): TSESTree.SourceLocation {
-  let locStart: TSESTree.Position | null = null;
-  let locEnd: TSESTree.Position | null = null;
-  return {
-    get start(): TSESTree.Position {
-      if (locStart == null) {
-        locStart = getLineAndCharacterFor(start, ast);
-      }
-      return locStart;
-    },
-    set start(val: TSESTree.Position) {
-      locStart = val;
+  node: T,
+): T & { loc: TSESTree.SourceLocation } {
+  Object.defineProperty(node, 'loc', {
+    get(this: { range: [number, number] }) {
+      return {
+        start: getLineAndCharacterFor(this.range[0], ast),
+        end: getLineAndCharacterFor(this.range[1], ast),
+      };
     },
-    get end(): TSESTree.Position {
-      if (locEnd == null) {
-        locEnd = getLineAndCharacterFor(end, ast);
-      }
-      return locEnd;
-    },
-    set end(val: TSESTree.Position) {
-      locEnd = val;
-    },
-  };
+  });
+  return node as T & { loc: TSESTree.SourceLocation };
 }
 
 /**
@@ -570,7 +555,7 @@
   const tokenType = getTokenType(token);
 
   if (tokenType === AST_TOKEN_TYPES.RegularExpression) {
-    const newToken: TSESTree.RegularExpressionToken = {
+    const newToken: TSESTree.RegularExpressionToken = defineLocProperty(ast, {
       type: AST_TOKEN_TYPES.RegularExpression,
       value,
       range: [start, end],
@@ -578,18 +563,17 @@
         pattern: value.slice(1, value.lastIndexOf('/')),
         flags: value.slice(value.lastIndexOf('/') + 1),
       },
-      loc: getLocFor(start, end, ast),
-    };
+    });
     return newToken;
   }
 
   // @ts-expect-error TS is complaining about `value` not being the correct type but it is
-  const newToken: Exclude<TSESTree.Token, TSESTree.RegularExpressionToken> = {
-    type: tokenType,
-    value,
-    range: [start, end],
-    loc: getLocFor(start, end, ast),
-  };
+  const newToken: Exclude<TSESTree.Token, TSESTree.RegularExpressionToken> =
+    defineLocProperty(ast, {
+      type: tokenType,
+      value,
+      range: [start, end],
+    });
   return newToken;
 }

@bradzacher
Copy link
Member Author

Sorry, yes, what I meant is that we need a setter for the lower-level approach this PR currently represents.

For a higher-level approach we can remove the setter. My first attempt (not captured in a commit) was using Object.defineProperty in exactly this way - delete the .loc assignments and rely solely on the .range.
However that doesn't change the performance appreciable way. There's not any additional cost to defining a property with Object.defineProperty to be both a getter and a setter - either way you're defining one property.

Object.defineProperty is just slower than defining accessors directly on an object.

I thought about maybe spreading objects to instead use the syntax, but this ofc incurs the cost of spreading - which won't be great at scale.

I thought about declaring a few base classes for all our nodes like TS does so that we can have a getter declared on the prototype (i.e. declared exactly once) - and that may be a viable option! I just did not explore it yet as it's quite a large change.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 8, 2023
@bradzacher bradzacher deleted the lazy-loc branch March 13, 2023 06:16
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants