Analyzer for string concatenation in raw SQL methods. #36698

cincuranet · 2025-09-02T17:49:38Z

roji

Just a few quick comments

roji · 2025-09-02T17:54:36Z

src/EFCore.Analyzers/AnalyzerReleases.Shipped.md

+Rule ID | Category | Severity | Notes
+--------|----------|----------|-------
+EF1003  | Security | Warning  | StringConcatenationUsageInRawQueriesDiagnosticAnalyzer, [Documentation](https://learn.microsoft.com/ef/core/querying/sql-queries#passing-parameters)


It feels like these should be the same diagnostic, after all they indicate the same thing on the same methods (risk of SQL injection etc.).

(regardless, I'd also lean towards implementing both in the same actual analyzer - they again analyze the same method invocations, FromSqlRaw and so on. But that's an internal implementation detail that the user doesn't care about, unlike the diagnostic ID).

If it would be same diagnostic ID then it would have to be same analyzer. Although I don't have strong feeling about it (do I ever?) I have a preference into having two separate. Although both work on same methods, the analysis has different focus. And more importantly the code fix is only for the existing one - if we put it under one diagnostic ID, we'd need repeat the logic to know whether we're reporting for string interpolation problem (and allow code fix) or string concatenation (no code fix) and that feels like waste of CPU cycles.

If it would be same diagnostic ID then it would have to be same analyzer.

Why, is there some restriction about two analyzers emitting the same diagnostic ID?

The code fix is possibly a good argument... From a user perspective, the two warnings basically feel like variants of exactly the same thing (potential SQL injection due to piecing together strings with FromSqlRaw), and so the diagnostic should be the same. In other words, regardless of internal/implementation concerns, if I suppress the warning for FromSqlRaw (for interpolation) and then I "get it again" for another FromSqlRaw (from concatenation), that seems a bit bad.

I don't really see a CPU cycle/performance problem in repeating the check for the code fix - as far as I know, this would only happen once a diagnostic is actually emitted, which is an extremely cold event and not something we need to optimize for. Am I wrong?

Why, is there some restriction about two analyzers emitting the same diagnostic ID?

Yes, RS1019 Diagnostic Id 'EF1002' is already used by analyzer ....

The code fix is possibly a good argument... From a user perspective, the two warnings basically feel like variants of exactly the same thing (potential SQL injection due to piecing together strings with FromSqlRaw), and so the diagnostic should be the same. In other words, regardless of internal/implementation concerns, if I suppress the warning for FromSqlRaw (for interpolation) and then I "get it again" for another FromSqlRaw (from concatenation), that seems a bit bad.

I don't know. For example, would you say IDE0018 Inline Variable Declaration and IDE0019 Inline as Type Check should be the same, because both are about inlining? (Maybe not a great example, but work with me.)

What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake? More flexibility? Better?

Maybe @AndriySvyryd has some good point of view.

I don't really see a CPU cycle/performance problem in repeating the check for the code fix - as far as I know, this would only happen once a diagnostic is actually emitted, which is an extremely cold event and not something we need to optimize for. Am I wrong?

For code fix, this happens when the list of available code fixes is being populated - i.e. when user clicks on a 💡 (or does similar action).

Why, is there some restriction about two analyzers emitting the same diagnostic ID?

Yes, RS1019 Diagnostic Id 'EF1002' is already used by analyzer ....

Thanks, was not aware...

I don't know. For example, would you say IDE0018 Inline Variable Declaration and IDE0019 Inline as Type Check should be the same, because both are about inlining? (Maybe not a great example, but work with me.)

I understand the analogy you're trying to draw, but the above two examples are purely stylistic, and people/organizations have arbitrary choices on style (i.e. someone may want to style their code with inlined variable declarations but not with inlined type checks). The SQL injections warnings we're discussing here are about the actual (potentially dangerous) outcome of the code, i.e. the fact that SQL injection may result - regardless of the specific syntactic way of piecing together the string (concatenation vs. interpolation).

What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake?

Isn't it weird for someone to globally disable string concat warnings because they know what they're doing, but then not also not what they're doing with interpolation?

String concatenation and interpolation are two alternative C# syntaxes to do the same thing - piece together a string from multiple components. The piecing together is the thing we're warning about here (because of the danger of SQL injection) - the specific syntactic mechanism doesn't seem relevant... In other words, I'm just not seeing a case where it would make sense to disable one but not the other.

For code fix, this happens when the list of available code fixes is being populated - i.e. when user clicks on a 💡 (or does similar action).

Right; it doesn't seem that an extra identification on the method at that point (when the user clicks) should be a perf concern. Do you disagree?

Maybe @AndriySvyryd has some good point of view.

Yep, let us know @AndriySvyryd.

Maybe as a sort of mental exercise, imagine if we were building the entire thing from scratch today (i.e. a "SQL injection" but for both interpolation and concatenation), rather than just adding the missing concatenation warning; would you still think we'd do two separate warnings? Am asking to avoid us adding a new diagnostic only because we're adding interpolation rather than implementing it form the get-go.

And one last point for thought...

What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake? More flexibility? Better?

If you think we should keep separate diagnostics IDs to give the user more flexibility, then it's worth thinking why not also split FromSqlRaw, ExecuteSqlRaw, SqlQueryRaw etc. to different diagnostics IDs as well; in theory someone may say "I know what I'm doing" for one, but want warnings for the other.

But of course I don't think we should; just like we have a single diagnostic regardless of which method causes the possible SQL injection (FromSqlRaw, SqlQueryRaw...), I believe we should have the same diagnostic regardless of which C# syntax causes it (concatenation, interpolation).

String concatenation and interpolation are two alternative C# syntaxes to do the same thing - piece together a string from multiple components. The piecing together is the thing we're warning about here (because of the danger of SQL injection) - the specific syntactic mechanism doesn't seem relevant... In other words, I'm just not seeing a case where it would make sense to disable one but not the other.

Now I see what's the difference in our views. When I do FromSqlRaw($"...") I see that as a mistake in method selection (instead of FromSql[Interpolated]), not that I want to build string SQL and it's a problem. When I do FromSqlRaw("..." + "...") that's a possible mistake.

That's interesting... I can see what you mean...

Though it's possible that when a user does FromSqlRaw($"..."), they're not mistaken about the method selection, but really do want string interpolation because they need to e.g. parameterize a table name (something which FromSql[Interpolated] doesn't work for).

I agree it's a bit of a tricky question and not totally conclusive either way... I'm still focusing on what the warning is
"about", and in both cases it's possible SQL injection that may be happening; and the solution in both cases is to rewrite using FromSql[Interpolated] with parameters (if possible - otherwise if you really need to parameterize the table name you need to just suppress the warning). So if the problem and solution are generally the same, it feels like the same diagnostic.

Curious to hear what @AndriySvyryd thinks!

If possible, I'd prefer to have a single analyzer, to avoid overhead during analysis. I don't have a strong opinion on whether it should use different diagnostics, as I don't think it's common to use both interpolation and concatenation.

roji · 2025-09-02T17:55:04Z

src/EFCore.Analyzers/Properties/AnalyzerStrings.resx

+            </xsd:complexType>
+          </xsd:element>
+        </xsd:choice>
+      </xsd:complexType>


Revert? I systematically revert these changes every time when I submit a PR...

Yeah. I'll do the cleanup before I move this out of draft.

But how will ResXResourceWriter know what schema is being used?

roji · 2025-09-02T18:07:35Z

src/EFCore.Analyzers/SqlRawMethodsCompilationHelper.cs

+
+internal static class SqlRawMethodsCompilationHelper
+{
+    public static IMethodSymbol? FromSqlRawMethod(Compilation compilation)


Consider just merging the two analyzers into a single type?

cincuranet force-pushed the concat-analyzer branch from 19e7040 to e296172 Compare September 2, 2025 17:51

cincuranet changed the title ~~Concat analyzer~~ Analyzer for string concatenation in raw SQL methods. Sep 2, 2025

roji reviewed Sep 2, 2025

View reviewed changes

Analyzer for string concatenation in raw SQL methods.

6eca9d2

cincuranet force-pushed the concat-analyzer branch from e296172 to 6eca9d2 Compare September 2, 2025 18:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Analyzer for string concatenation in raw SQL methods. #36698

Analyzer for string concatenation in raw SQL methods. #36698

Uh oh!

cincuranet commented Sep 2, 2025

Uh oh!

roji left a comment

Uh oh!

roji Sep 2, 2025

Uh oh!

cincuranet Sep 2, 2025

Uh oh!

roji Sep 3, 2025

Uh oh!

cincuranet Sep 3, 2025 •

edited

Loading

Uh oh!

roji Sep 4, 2025

Uh oh!

roji Sep 4, 2025

Uh oh!

roji Sep 4, 2025 •

edited

Loading

Uh oh!

cincuranet Sep 4, 2025

Uh oh!

roji Sep 4, 2025

Uh oh!

AndriySvyryd Sep 5, 2025

Uh oh!

roji Sep 2, 2025

Uh oh!

cincuranet Sep 2, 2025

Uh oh!

AndriySvyryd Sep 5, 2025

Uh oh!

roji Sep 2, 2025

Uh oh!

Uh oh!

Analyzer for string concatenation in raw SQL methods. #36698

Are you sure you want to change the base?

Analyzer for string concatenation in raw SQL methods. #36698

Uh oh!

Conversation

cincuranet commented Sep 2, 2025

Uh oh!

roji left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cincuranet Sep 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

roji Sep 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cincuranet Sep 3, 2025 •

edited

Loading

roji Sep 4, 2025 •

edited

Loading