Skip to content

Conversation

slydiman
Copy link
Contributor

@slydiman slydiman commented Aug 19, 2025

Do not check that Source is a valid regex in case of Target (explicit) transformation. Source may contain special symbols that may cause an incorrect invalid regex error.

Note that source and exactly one of [Target, Transform] must be provided.

Target (explicit transformation): In this kind of rule Source is treated as a symbol name and is matched in its entirety. Target field will denote the symbol name to transform to.

Transform (pattern transformation): This rule treats Source as a regex that should match the complete symbol name. Transform is a regex specifying the name to transform to.

@llvmbot
Copy link
Member

llvmbot commented Aug 19, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Dmitry Vasilyev (slydiman)

Changes

ExplicitRewriteDescriptor expects a non-regex Source. But unconditional verification that Source is a valid regex breaks this logic if the Source contains $ in the middle or some other special characters.


Full diff: https://github.com/llvm/llvm-project/pull/154319.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SymbolRewriter.cpp (+42-30)
diff --git a/llvm/lib/Transforms/Utils/SymbolRewriter.cpp b/llvm/lib/Transforms/Utils/SymbolRewriter.cpp
index d52d52a9b7d3e..6319fd524ff0f 100644
--- a/llvm/lib/Transforms/Utils/SymbolRewriter.cpp
+++ b/llvm/lib/Transforms/Utils/SymbolRewriter.cpp
@@ -349,13 +349,7 @@ parseRewriteFunctionDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
 
     KeyValue = Key->getValue(KeyStorage);
     if (KeyValue == "source") {
-      std::string Error;
-
       Source = std::string(Value->getValue(ValueStorage));
-      if (!Regex(Source).isValid(Error)) {
-        YS.printError(Field.getKey(), "invalid regex: " + Error);
-        return false;
-      }
     } else if (KeyValue == "target") {
       Target = std::string(Value->getValue(ValueStorage));
     } else if (KeyValue == "transform") {
@@ -379,12 +373,22 @@ parseRewriteFunctionDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
 
   // TODO see if there is a more elegant solution to selecting the rewrite
   // descriptor type
-  if (!Target.empty())
+  if (!Target.empty()) {
     DL->push_back(std::make_unique<ExplicitRewriteFunctionDescriptor>(
         Source, Target, Naked));
-  else
-    DL->push_back(
-        std::make_unique<PatternRewriteFunctionDescriptor>(Source, Transform));
+    return true;
+  }
+
+  {
+    std::string Error;
+    if (!Regex(Source).isValid(Error)) {
+      YS.printError(Descriptor, "invalid Source regex: " + Error);
+      return false;
+    }
+  }
+
+  DL->push_back(
+      std::make_unique<PatternRewriteFunctionDescriptor>(Source, Transform));
 
   return true;
 }
@@ -418,13 +422,7 @@ parseRewriteGlobalVariableDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
 
     KeyValue = Key->getValue(KeyStorage);
     if (KeyValue == "source") {
-      std::string Error;
-
       Source = std::string(Value->getValue(ValueStorage));
-      if (!Regex(Source).isValid(Error)) {
-        YS.printError(Field.getKey(), "invalid regex: " + Error);
-        return false;
-      }
     } else if (KeyValue == "target") {
       Target = std::string(Value->getValue(ValueStorage));
     } else if (KeyValue == "transform") {
@@ -441,13 +439,23 @@ parseRewriteGlobalVariableDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
     return false;
   }
 
-  if (!Target.empty())
+  if (!Target.empty()) {
     DL->push_back(std::make_unique<ExplicitRewriteGlobalVariableDescriptor>(
         Source, Target,
         /*Naked*/ false));
-  else
-    DL->push_back(std::make_unique<PatternRewriteGlobalVariableDescriptor>(
-        Source, Transform));
+    return true;
+  }
+
+  {
+    std::string Error;
+    if (!Regex(Source).isValid(Error)) {
+      YS.printError(Descriptor, "invalid Source regex: " + Error);
+      return false;
+    }
+  }
+
+  DL->push_back(std::make_unique<PatternRewriteGlobalVariableDescriptor>(
+      Source, Transform));
 
   return true;
 }
@@ -481,13 +489,7 @@ parseRewriteGlobalAliasDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
 
     KeyValue = Key->getValue(KeyStorage);
     if (KeyValue == "source") {
-      std::string Error;
-
       Source = std::string(Value->getValue(ValueStorage));
-      if (!Regex(Source).isValid(Error)) {
-        YS.printError(Field.getKey(), "invalid regex: " + Error);
-        return false;
-      }
     } else if (KeyValue == "target") {
       Target = std::string(Value->getValue(ValueStorage));
     } else if (KeyValue == "transform") {
@@ -504,13 +506,23 @@ parseRewriteGlobalAliasDescriptor(yaml::Stream &YS, yaml::ScalarNode *K,
     return false;
   }
 
-  if (!Target.empty())
+  if (!Target.empty()) {
     DL->push_back(std::make_unique<ExplicitRewriteNamedAliasDescriptor>(
         Source, Target,
         /*Naked*/ false));
-  else
-    DL->push_back(std::make_unique<PatternRewriteNamedAliasDescriptor>(
-        Source, Transform));
+    return true;
+  }
+
+  {
+    std::string Error;
+    if (!Regex(Source).isValid(Error)) {
+      YS.printError(Descriptor, "invalid Source regex: " + Error);
+      return false;
+    }
+  }
+
+  DL->push_back(
+      std::make_unique<PatternRewriteNamedAliasDescriptor>(Source, Transform));
 
   return true;
 }

@slydiman slydiman requested a review from arsenm August 20, 2025 13:31
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs tests

@slydiman
Copy link
Contributor Author

Needs tests

I would be happy to add tests.
But I don't see any usage of SymbolRewriter in public LLVM tools.
Currently SymbolRewriter has no tests at all.
So first it is necessary to create some environment to test SymbolRewriter with some source data.
This patch is safe and very simple (the regex checking has been moved after target usage).
I hope this patch can be applied without tests.

@arsenm
Copy link
Contributor

arsenm commented Aug 20, 2025

Currently SymbolRewriter has no tests at all.

There's one here https://github.com/llvm/llvm-project/tree/main/llvm/test/SymbolRewriter

@slydiman slydiman requested a review from arsenm August 21, 2025 08:07
@slydiman
Copy link
Contributor Author

There's one here https://github.com/llvm/llvm-project/tree/main/llvm/test/SymbolRewriter

Thanks. I have added the test. Please take a look.

@slydiman slydiman requested review from andjo403 and nikic August 25, 2025 10:08
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me, but could you please add test coverage for all three cases being changed?

(Or alternatively, we should do a preliminary refactoring to deduplicate this case -- at least at a cursory look, all three implementations are the same?)

…g ExplicitRewriteDescriptor

ExplicitRewriteDescriptor expects a non-regex Source.
But unconditional verification that Source is a valid regex breaks this logic if the Source contains $ in the middle or some other special characters.
@slydiman slydiman force-pushed the non-regex-source-SymbolRewriter branch from 1d8381e to 794bf48 Compare August 25, 2025 18:00
@slydiman
Copy link
Contributor Author

This looks reasonable to me, but could you please add test coverage for all three cases being changed?

Done.

(Or alternatively, we should do a preliminary refactoring to deduplicate this case -- at least at a cursory look, all three implementations are the same?)

I wouldn't refactor this code. This code assumes 3rd party extension. Even the description at the beginning contains recommendations New rewrite descriptors can be created. Addding a new rewrite descriptor involves: ... So refactoring can complicate private implementations.

@slydiman slydiman requested a review from nikic August 25, 2025 18:22
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@slydiman slydiman merged commit e64c9d1 into llvm:main Aug 26, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants