-
Notifications
You must be signed in to change notification settings - Fork 1.7k
JS: Overhaul import resolution #19391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
javascript/ql/lib/semmle/javascript/internal/paths/PackageJsonEx.qll
Fixed
Show resolved
Hide resolved
82b8d58
to
56deae6
Compare
56deae6
to
7a81e48
Compare
7a81e48
to
c008ba4
Compare
The new class 'FilePath' has bindingset[this] so one just has to cast a string to that type and you can use its methods.
c008ba4
to
9ae6d8b
Compare
This moves the test for the babel `root-import` plugin into the new unit test for import resolution, so we only have one set of tests to maintain. The actual implementation is added in the next commit.
To avoid negative recursion in some upcoming changes, we want to make sure the modeling of createRequire does not depend on getImportedPath().
9ae6d8b
to
620ce66
Compare
path.resolve() and template expressions are now working. Previously they could not be resolved because Import.getImportedPath() returned a PathExpr, and these were not instances of PathExpr.
… name clash 'getTargetFile' was originally named to avoid the clash with 'getImportedFile' from a subclass. But we now just merge the two predicates.
We don't extract node_modules folders by default so these tests aren't that relevant anymore, and we no longer follow node_modules resolution rules directly. Instead, these imports are resolved based on the monorepo support which simply requires a package.json file to exist. There is not a good enough reason to support node_modules directly, so we're accepting some minor regression in these tests.
620ce66
to
b0f73f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work as always, the new structure is definitely better.
And impressive that we can get that good import-resolution relatively easily (I'm guessing you did quite a few iterations of improvements).
Looks good, just minor comments (but also not the most thorough review).
I still want Napalys to have a look, and some of the comments are just for him.
javascript/ql/lib/semmle/javascript/internal/paths/PackageJsonEx.qll
Fixed
Show resolved
Hide resolved
.getFile(mainFile.getStem().regexpReplaceAll("\\.d$", "") + ".d.ts") | ||
) | ||
} | ||
File getTypingsFile() { none() } // implemented in PackageJsonEx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we delete this predicate now? It feel wrong to leave a none()
predicate like this.
I know that it's never actually none()
, because it's always overriden in PackageJsonEx
.
But, in general I don't like the class *Ex
pattern, it feels like it's hiding the real implementation.
Could you instead have the internal path resolution in e.g. a private module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not look pretty but honestly I just don't think it's worth the hassle to start moving and renaming things in order to make it possible to move the implementation in here. The none()
with override is a fairly well-understood way to keep the implementation in a different file, where we have access to the parts we need.
|
||
private class JoinCallConcatenation extends PathConcatenation, CallExpr { | ||
JoinCallConcatenation() { | ||
this.getReceiver().(VarAccess).getName() = "path" and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a comment that we can't use module resolution here because negative recursion, and that this approximation is probably good enough.
(But maybe a reference to which full implementation we would use, if that was possible).
} | ||
|
||
/** Gets the constant-value of `node` */ | ||
language[monotonicAggregates] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is related to the strictconcat
, somehow.
I remember being confused about monotonicAggregates
before, so could you add a comment (down at the strictconcat
) what difference monotonicAggregates
makes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment.
or | ||
// Handle imports referring to a package by name, where we have a package.json | ||
// file for that package in the codebase. This is treated separately from PathMapping for performance | ||
// reasons, as there can be a large number of packages which affect all files in the project. | ||
// | ||
// This part only handles the "exports" property of package.json. "main" and "modules" are | ||
// handled further down because their semantics are easier to handle there. | ||
exists(PackageJsonEx pkg, string packageName, string remainder | | ||
packageName = expr.getPackagePrefix() and | ||
pkg.getDeclaredPackageName() = packageName and | ||
remainder = expr.getValue().suffix(packageName.length()).regexpReplaceAll("^[/\\\\]", "") | ||
| | ||
// "exports": { ".": "./foo.js" } | ||
// "exports": { "./foo.js": "./foo/impl.js" } | ||
pkg.hasExactPathMappingTo(remainder, base) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Napalys (for when you review this), this is one of the surprising features of NodeJS.
The path you import, and the corresponding path in the file-system can be very different.
/** | ||
* Holds if `expr` should be resolved as `path` relative to `base`. | ||
*/ | ||
pragma[nomagic] | ||
private predicate shouldResolve(RelevantExpr expr, Container base, FilePath path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this is not the most thorough review.
From your tests and evaluations I trust in general that the implementation cover all relevant cases, because I don't have an overview of whether a predicate like this covers everything relevant.
import "@/both" // $ importTarget=Fallback/lib1/both.ts SPURIOUS: importTarget=Fallback/lib2/both.ts | ||
import "@/only1" // $ importTarget=Fallback/lib1/only1.ts | ||
import "@/only2" // $ importTarget=Fallback/lib2/only2.ts | ||
import "@/differentExtension" // $ importTarget=Fallback/lib2/differentExtension.ts SPURIOUS: importTarget=Fallback/lib1/differentExtension.js |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose spurious edges like this is a big reason for the new import edges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I believe it's extremely rare. It only happens when a path mapping has a prioritised list of targets, e.g {"paths": { "@/*": ["src/*", "lib/*"] }
and both folders contain a match for a given import.
One of the big changes is that monorepo support only worked for imports targeting a package exactly, but not a path within the package. E.g. @foo/bar
could be resolved, but not @foo/bar/baz
.
@@ -2852,7 +2852,7 @@ class DynamicImportExpr extends @dynamic_import, Expr, Import { | |||
} | |||
|
|||
/** A literal path expression appearing in a dynamic import. */ | |||
private class LiteralDynamicImportPath extends PathExpr, ConstantString { | |||
deprecated private class LiteralDynamicImportPath extends PathExpr, ConstantString { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Napalys.
Every 3 months I usually run a script to delete all deprecations that are past the deprecation period (a year).
I think that's something you can take over, I'll introduce you to it after the offsite.
Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
Updates JS import resolution to avoid depending on the TypeScript compiler, which ultimately led to a completely rewrite of import resolution.
Deprecating PathExpr
This PR deprecates
PathExpr
,PathString
, andPath
, in favor of the parameterised modulesFolder::Resolve
andResolveExpr
and extended models of tsconfig.json and package.json files.PathExpr
tried to represent multiple things simultaneously:The new implementation has a clear separation between these and can do more powerful constant folding as a result.
The old abstraction also could not properly support path mapping. It had a concept of "search root" but assumed that the value to resolve did not depend on the search root, which is not true for mappings like
@/* -> src/*
, where the@/
prefix must be chopped off.It also forced all path-resolution logic to happen within a single layer, which complicated library models that need import resolution (Angular). Attempts to refactor things often led to negative recursion due to the Angular model. Parameterised modules don't have this problem as any model can resolve whatever paths they need in the layer they need it.
Evaluation
Evaluation shows neutral performance and about 5k new import edges, and about 30k new call edges, and a few lost imports and call as well.
The alert changes aren't so interesting. Some results were gained in a copy of jQuery, and two alerts were lost as we lost a library input, but not one that seemed likely to be user-controlled.
Review notes
Commit-by-commit view is strongly recommended.
A bunch of unit tests are added in a single commit. The annotations initially represent what we resolve using the baseline implementation (relying on the TypeScript compiler) and you may not want to worry too much about whether those initial annotation are correct; they just represent the baseline.