-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Analyzer for string concatenation in raw SQL methods. #36698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release/10.0
Are you sure you want to change the base?
Conversation
19e7040
to
e296172
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few quick comments
Rule ID | Category | Severity | Notes | ||
--------|----------|----------|------- | ||
EF1003 | Security | Warning | StringConcatenationUsageInRawQueriesDiagnosticAnalyzer, [Documentation](https://learn.microsoft.com/ef/core/querying/sql-queries#passing-parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like these should be the same diagnostic, after all they indicate the same thing on the same methods (risk of SQL injection etc.).
(regardless, I'd also lean towards implementing both in the same actual analyzer - they again analyze the same method invocations, FromSqlRaw and so on. But that's an internal implementation detail that the user doesn't care about, unlike the diagnostic ID).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it would be same diagnostic ID then it would have to be same analyzer. Although I don't have strong feeling about it (do I ever?) I have a preference into having two separate. Although both work on same methods, the analysis has different focus. And more importantly the code fix is only for the existing one - if we put it under one diagnostic ID, we'd need repeat the logic to know whether we're reporting for string interpolation problem (and allow code fix) or string concatenation (no code fix) and that feels like waste of CPU cycles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it would be same diagnostic ID then it would have to be same analyzer.
Why, is there some restriction about two analyzers emitting the same diagnostic ID?
The code fix is possibly a good argument... From a user perspective, the two warnings basically feel like variants of exactly the same thing (potential SQL injection due to piecing together strings with FromSqlRaw), and so the diagnostic should be the same. In other words, regardless of internal/implementation concerns, if I suppress the warning for FromSqlRaw (for interpolation) and then I "get it again" for another FromSqlRaw (from concatenation), that seems a bit bad.
I don't really see a CPU cycle/performance problem in repeating the check for the code fix - as far as I know, this would only happen once a diagnostic is actually emitted, which is an extremely cold event and not something we need to optimize for. Am I wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why, is there some restriction about two analyzers emitting the same diagnostic ID?
Yes, RS1019
Diagnostic Id 'EF1002' is already used by analyzer ....
The code fix is possibly a good argument... From a user perspective, the two warnings basically feel like variants of exactly the same thing (potential SQL injection due to piecing together strings with FromSqlRaw), and so the diagnostic should be the same. In other words, regardless of internal/implementation concerns, if I suppress the warning for FromSqlRaw (for interpolation) and then I "get it again" for another FromSqlRaw (from concatenation), that seems a bit bad.
I don't know. For example, would you say IDE0018
Inline Variable Declaration and IDE0019
Inline as Type Check should be the same, because both are about inlining? (Maybe not a great example, but work with me.)
What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake? More flexibility? Better?
Maybe @AndriySvyryd has some good point of view.
I don't really see a CPU cycle/performance problem in repeating the check for the code fix - as far as I know, this would only happen once a diagnostic is actually emitted, which is an extremely cold event and not something we need to optimize for. Am I wrong?
For code fix, this happens when the list of available code fixes is being populated - i.e. when user clicks on a 💡 (or does similar action).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why, is there some restriction about two analyzers emitting the same diagnostic ID?
Yes, RS1019 Diagnostic Id 'EF1002' is already used by analyzer ....
Thanks, was not aware...
I don't know. For example, would you say IDE0018 Inline Variable Declaration and IDE0019 Inline as Type Check should be the same, because both are about inlining? (Maybe not a great example, but work with me.)
I understand the analogy you're trying to draw, but the above two examples are purely stylistic, and people/organizations have arbitrary choices on style (i.e. someone may want to style their code with inlined variable declarations but not with inlined type checks). The SQL injections warnings we're discussing here are about the actual (potentially dangerous) outcome of the code, i.e. the fact that SQL injection may result - regardless of the specific syntactic way of piecing together the string (concatenation vs. interpolation).
What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake?
Isn't it weird for someone to globally disable string concat warnings because they know what they're doing, but then not also not what they're doing with interpolation?
String concatenation and interpolation are two alternative C# syntaxes to do the same thing - piece together a string from multiple components. The piecing together is the thing we're warning about here (because of the danger of SQL injection) - the specific syntactic mechanism doesn't seem relevant... In other words, I'm just not seeing a case where it would make sense to disable one but not the other.
For code fix, this happens when the list of available code fixes is being populated - i.e. when user clicks on a 💡 (or does similar action).
Right; it doesn't seem that an extra identification on the method at that point (when the user clicks) should be a perf concern. Do you disagree?
Maybe @AndriySvyryd has some good point of view.
Yep, let us know @AndriySvyryd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe as a sort of mental exercise, imagine if we were building the entire thing from scratch today (i.e. a "SQL injection" but for both interpolation and concatenation), rather than just adding the missing concatenation warning; would you still think we'd do two separate warnings? Am asking to avoid us adding a new diagnostic only because we're adding interpolation rather than implementing it form the get-go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And one last point for thought...
What if you want to disable the string concatenation warnings (because you "know what you're doing"), but you want to keep the interpolation warnings, just in case you make a mistake? More flexibility? Better?
If you think we should keep separate diagnostics IDs to give the user more flexibility, then it's worth thinking why not also split FromSqlRaw, ExecuteSqlRaw, SqlQueryRaw etc. to different diagnostics IDs as well; in theory someone may say "I know what I'm doing" for one, but want warnings for the other.
But of course I don't think we should; just like we have a single diagnostic regardless of which method causes the possible SQL injection (FromSqlRaw, SqlQueryRaw...), I believe we should have the same diagnostic regardless of which C# syntax causes it (concatenation, interpolation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String concatenation and interpolation are two alternative C# syntaxes to do the same thing - piece together a string from multiple components. The piecing together is the thing we're warning about here (because of the danger of SQL injection) - the specific syntactic mechanism doesn't seem relevant... In other words, I'm just not seeing a case where it would make sense to disable one but not the other.
Now I see what's the difference in our views. When I do FromSqlRaw($"...")
I see that as a mistake in method selection (instead of FromSql[Interpolated]
), not that I want to build string SQL and it's a problem. When I do FromSqlRaw("..." + "...")
that's a possible mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting... I can see what you mean...
Though it's possible that when a user does FromSqlRaw($"...")
, they're not mistaken about the method selection, but really do want string interpolation because they need to e.g. parameterize a table name (something which FromSql[Interpolated] doesn't work for).
I agree it's a bit of a tricky question and not totally conclusive either way... I'm still focusing on what the warning is
"about", and in both cases it's possible SQL injection that may be happening; and the solution in both cases is to rewrite using FromSql[Interpolated] with parameters (if possible - otherwise if you really need to parameterize the table name you need to just suppress the warning). So if the problem and solution are generally the same, it feels like the same diagnostic.
Curious to hear what @AndriySvyryd thinks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, I'd prefer to have a single analyzer, to avoid overhead during analysis. I don't have a strong opinion on whether it should use different diagnostics, as I don't think it's common to use both interpolation and concatenation.
</xsd:complexType> | ||
</xsd:element> | ||
</xsd:choice> | ||
</xsd:complexType> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert? I systematically revert these changes every time when I submit a PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'll do the cleanup before I move this out of draft.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how will ResXResourceWriter know what schema is being used?
|
||
internal static class SqlRawMethodsCompilationHelper | ||
{ | ||
public static IMethodSymbol? FromSqlRawMethod(Compilation compilation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider just merging the two analyzers into a single type?
e296172
to
6eca9d2
Compare
Fixes #35735.