-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C#: Improve some existing manual models. #19940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
822992d
to
8256741
Compare
Click to show differences in coveragecsharpGenerated file changes for csharp
- System,"``System.*``, ``System``",47,12139,54,5
+ System,"``System.*``, ``System``",47,12165,54,5
- Totals,,107,14403,407,9
+ Totals,,107,14429,407,9
- System,54,47,12139,,6,5,5,,,4,1,,33,2,,6,15,17,4,3,,5903,6236
+ System,54,47,12165,,6,5,5,,,4,1,,33,2,,6,15,17,4,3,,5929,6236 |
8256741
to
c60a040
Compare
c60a040
to
822486e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances security test queries by converting deserialization tests to inline alert annotations and improves the library’s manual summary models to reduce false negatives.
- Converted multiple unsafe/unsafe-untrusted-deserialization tests to use inline
// $ Alert[...]
comments. - Renamed and expanded binary‐formatter test classes for broader coverage.
- Added and refined summary models for
XmlDictionaryReader.CreateBinaryReader
,Encoding.GetBytes
/GetChars
,MemoryStream
constructors, and serialization APIs.
Reviewed Changes
Copilot reviewed 22 out of 29 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInputNewtonsoftJson/Test.cs | Moved JSON unsafe-deserialization alert to inline comment. |
csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInput/XmlSerializerUntrustedInputBad.cs | Added inline alert annotation after the return call. |
csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInput/BinaryFormatterUntrustedInputBad.cs | Renamed BadBinaryFormatter class and added a second test class. |
csharp/ql/src/change-notes/2025-07-01-improve-summary-models.md | Documented the new and updated summary models. |
csharp/ql/lib/ext/System.Xml.model.yml | Added summary models for all XmlDictionaryReader.CreateBinaryReader overloads. |
csharp/ql/lib/ext/System.Text.model.yml | Updated taint mappings for Encoding.GetBytes /GetChars . |
csharp/ql/lib/ext/System.Runtime.Serialization.model.yml | Added models for SerializationInfo and SerializationInfoEnumerator . |
csharp/ql/lib/ext/System.IO.model.yml | Adjusted MemoryStream taint model to use .Element for byte arrays. |
Files not reviewed (7)
- csharp/ql/test/library-tests/dataflow/library/FlowSummaries.expected: Language not supported
- csharp/ql/test/library-tests/dataflow/library/FlowSummariesFiltered.expected: Language not supported
- csharp/ql/test/query-tests/Security Features/CWE-502/DeserializedDelegate/DeserializedDelegate.qlref: Language not supported
- csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserialization/UnsafeDeserialization.qlref: Language not supported
- csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInput/UnsafeDeserializationUntrustedInput.expected: Language not supported
- csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInput/UnsafeDeserializationUntrustedInput.qlref: Language not supported
- csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInputNewtonsoftJson/UnsafeDeserializationUntrustedInput.qlref: Language not supported
Comments suppressed due to low confidence (2)
csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInputNewtonsoftJson/Test.cs:17
- The
$ Alert
annotation is placed on theJsonSerializerSettings
line rather than on theDeserializeObject
invocation itself. Move the inline comment to the line starting withreturn JsonConvert.DeserializeObject(...)
to ensure the test framework picks up the alert.
return JsonConvert.DeserializeObject(data.Text, new JsonSerializerSettings // $ Alert[cs/unsafe-deserialization-untrusted-input]
csharp/ql/test/query-tests/Security Features/CWE-502/UnsafeDeserializationUntrustedInput/BinaryFormatterUntrustedInputBad.cs:7
- The test harness expects a class named
BadBinaryFormatter
, but this has been renamed toBadBinaryFormatter1
. Rename it back toBadBinaryFormatter
to maintain compatibility with the test matcher.
class BadBinaryFormatter1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few questions about how the models all work together.
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetEnumerator", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetString", "(System.String)", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfo", False, "GetValue", "(System.String,System.Type)", "", "Argument[this]", "ReturnValue", "taint", "manual"] | ||
- ["System.Runtime.Serialization", "SerializationInfoEnumerator", False, "get_Current", "()", "", "Argument[this]", "ReturnValue", "taint", "manual"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This returns a SerializationEntry
struct. I'm not familiar with CodeQL for C#, so I just want to check that it's sufficient to taint that struct, and then accessing the fields will propagate the taint, without requiring any extra models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we would need extra models for that as well. The model itself doesn't bring any value unless models for SerializationEntry
is also provided. However, I decided to include it for consistency (such that all properties of SerializationInfoEnumerator
are modelled - or at least those where it makes sense to have a model). But I have stop the modelling somehere 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Perhaps a comment on this line saying that it won't work until models for SerializationEntry
are added, would save the next person to run into this some time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -47,7 +47,7 @@ extensions: | |||
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32)", "", "Argument[0]", "Argument[this]", "taint", "manual"] | |||
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32,System.Boolean)", "", "Argument[0]", "Argument[this]", "taint", "manual"] | |||
- ["System.IO", "FileStream", False, "FileStream", "(System.String,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare,System.Int32,System.IO.FileOptions)", "", "Argument[0]", "Argument[this]", "taint", "manual"] | |||
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[])", "", "Argument[0]", "Argument[this]", "taint", "manual"] | |||
- ["System.IO", "MemoryStream", False, "MemoryStream", "(System.Byte[])", "", "Argument[0].Element", "Argument[this]", "taint", "manual"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. We do the opposite in CodeQL for Go - byte arrays are often used synonymously with strings, and we always taint the whole array rather than individual elements, just as we would if it was a string. I guess either way works as long as your consistent, but I seem to recall that for Go it was better to taint the whole array because strings can be considered as byte arrays in some circumstances. Just thought I'd mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Java, we would also taint a whole byte[]
rather than its members - in fact single byte
values are typically considered too small to carry taint and used as type-based sanitizers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both for looking into this!
So the change is probably a bit more controversial than advertised in slack. 😄
To address the questions/comments: Yes, I think it makes sense to consider both byte[]
and char[]
as "bulk" like datatypes like it is done for Java. The change in this PR solves the reported issue (as it is now) and doesn't break the pattern that is already being used for C# in terms of modelling - but we should probably change this.
Will it be acceptable to merge the models as they are and then as a follow up try and convert all the models (at least all the manual ones) to check what the consequences will be? This might also tie into model generator as well.
@@ -3,18 +3,18 @@ extensions: | |||
pack: codeql/csharp-all | |||
extensible: summaryModel | |||
data: | |||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0]", "ReturnValue", "taint", "manual"] | |||
- ["System.Text", "Encoding", True, "GetBytes", "(System.Char*,System.Int32,System.Byte*,System.Int32)", "", "Argument[0].Element", "Argument[2]", "taint", "manual"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For many of the models in this file, I don't understand why the input has .Element
but the output doesn't. Please can you explain how this works? (Obviously the old model was just wrong, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an excellent question - I didn't want to investigate that as a part of adding the models - but now I have checked: The cause is that for C#, support for implicit reads at sinks hasn't been implemented. That is, if the elements of a collection are tainted but the sink is a collection type, then taint isn't propagated. Since we have a few byte[]
sinks, it appears that the models are made like this.
This should probably be addressed (but not as a part of this PR) - will open an issue for it (I think it is just a matter of making a simple implementation for defaultImplicitTaintRead
- but it will need to be tested separately).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like all the skeletons are falling out of the closet 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems fine to me, as long as there's a follow-up issue and we don't totally forget about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have opened issues for both ((1) Implicit reads at sinks and (2) Convert byte[]
and char[]
models)).
…the other models).
…Reader models to manual models.
…yReader.CreateBinaryReader.
5d1f2a1
to
e9fdca7
Compare
In this PR we improve the models as data modelling across multiple classes.
System.Xml.XmlDictionaryReader.CreateBinaryReader
.System.Runtime.Serialization.SerializationInfo
andSystem.Runtime.Serialization.SerializationInfoEnumerator
.System.Text.Encoding.GetBytes
,System.Text.Encoding.GetChars
and the constructor forSystem.IO.MemoryStream
.Also convert the deserialization tests to inline expectations.
The new findings in
mono/mono
forcs/cleartext-storage-of-sensitive-information
andcs/exposure-of-sensitive-information
are both related to the improved models forGetBytes
(where taint is propagated to another parameter instead of the return).