Improving Code Quality Using The Roslyn Compiler API
Innsbruck, Austria
Bachelor Thesis
Robin Knoll
robin.knoll@student.uibk.ac.at
I hereby declare under oath, by my own signature, that I have written this thesis
independently and have used no sources or aids other than those stated. All passages
taken verbatim or in substance from the stated sources are marked as such.
This thesis has not previously been submitted in the same or a similar form as a
Magister, Master's or Diploma thesis or as a dissertation.
Date Signature
Abstract
Code quality is one of the main concerns in software development, as it keeps large
projects maintainable. Code needs to comply with standards that each team defines for
itself. To enforce these rules, static code analysis is often used. It is usually done
either by a third-party tool or through code review by a team member.
The .NET Compiler Platform, also called Roslyn, makes it possible to write powerful
code analyzers that are integrated into the C# compiler. Using this technique, rules are
applied while writing code and suitable improvements can be suggested and implemented
using code generation the moment mistakes happen. The code analyzer can also define
rule transgressions as compiler errors, so that these rules have to be followed for the
project to build. The code is only reviewed when no more rules are violated and team
members can focus on other aspects in their code review.
In this paper, the usefulness of the Roslyn compiler API for improving code quality is
evaluated by developing several analyzers for C# projects. The analyzers cover different
aspects, such as checking the naming schemes of classes or improving exception handling.
Contents
1 Introduction
1.1 Motivation
1.2 Solution approach and contribution
2 Compilers
2.1 Lexical, Syntax and Semantic analyzer
2.2 Intermediate code generator
2.3 Code generator and architecture specific optimizer
4 Authorize Analyzer
4.1 Functionality
4.2 Implementation
4.3 Code fix
4.4 Testing
5 Naming Analyzer
5.1 Functionality
5.2 Implementation
5.3 Testing
6 Exception Analyzer
6.1 Functionality
6.2 Implementation
6.3 Code Fix
6.4 Testing
1 Introduction
1.1 Motivation
Maintainability is one of the most important aspects of software development. Large
code bases need to be sensibly structured and code needs to be written in a certain way
so that developers can quickly understand and become acquainted with projects. Code
quality can be interpreted as the measure of how readable and maintainable code is [5].
To guarantee a high level of code quality, developers need to be held accountable as it
is often easier and quicker to ignore best practices and documentation while writing code.
Static code analysis using tools or code reviews is the main way of assessing the quality
of code after it has been written. Third-party tools can be expensive and licensed on a
per-developer basis, which often motivates teams to rely on manual code reviews. These
can take a lot of time, and requested changes force the author to refamiliarize
themselves with their code and rethink it, taking up a lot of resources.
Roslyn provides developers with the capability to write their own static code analysis
tools. An API is exposed so that information from all stages of the compiler pipeline
can be accessed [7]. The analysis is executed while writing code. Developers see their
mistakes highlighted the moment a sequence of code is written, making code fixes
much faster and easier. Additionally, automatic code fixes can be implemented, which
makes the process of fixing faulty code even simpler. Analyzers are not specific to the
developer, but can be added to a project simply by adding a NuGet package to its
dependencies. From then on, the diagnostics are shown to every developer opening the
project.
Roslyn also allows developers to quickly test out a theory they might have about a
potential improvement to the quality of their code. As simple analyzers are developed
without too much effort, these theories can anecdotally be tried out on existing projects
and it quickly becomes visible which parts of a code base could profit from refactoring.
1.2 Solution approach and contribution
development documented.
The ideas for these analyzers were developed together with developers at World-Direct 1.
Many ideas for useful analyzers came up, but we settled on three immediately useful
diagnostic analyzers that are implemented and examined in more detail. All code is
publicly available at https://github.com/knollsen/csharp_analyzers. Each analyzer's
project also contains NuGet packages of the analyzer.
1 https://world-direct.at
2 Compilers
Compilers are software with the purpose of translating programs from one language into
another. In most cases, the target language is a machine language so that the program
can be executed on a specific hardware architecture. However, a compiler might also
translate from one higher level language to another according to Aho et al. [6].
The main advantage of compiled languages over interpreted ones is their higher speed of
execution. Interpreted languages, on the other hand, can provide a higher level of error
traceability and diagnostics. Programs written in them are also often shorter than
programs with the same purpose written in a compiled language.
Just-in-time compilation was created as a hybrid between these two approaches. Programs
are first compiled to bytecode, which is then translated to machine code at run time by
the just-in-time compiler. Languages using this approach combine advantages of both of
the previously mentioned types of languages.
As this paper’s purpose is to explore the ways in which a compiler can help with code
analysis and code quality, we need to establish some terms and explain the main stages
or phases of compilers as described in Aho et al. [6]. A representation can be found in
figure 2.1.
Another purpose of the lexical analyzer is to associate compile time errors with their
corresponding line number by counting the number of new-line characters that appear.
It may also take over the role of a macro-processor, if the source language supports such
constructs.
2.1 Lexical, Syntax and Semantic analyzer
The syntax analyzer uses the stream of tokens to create a syntax tree corresponding
to the defined syntax of the source language. A language is made up of complex rules
describing the structure of valid programs. These rules can be defined as context-free
grammars or Backus-Naur-Form notation. The syntax analyzer needs to be able to
report syntax errors in a meaningful way. Commonly occurring errors might also be
recovered from automatically.
The semantic analyzer takes the syntax tree created by the syntax analyzer and checks
it for semantic consistency. An example of this would be to check for correct and valid
typing. It also adds coercions to allow for certain operations. For example, an
arithmetic operation on a floating point number and an integer first requires the integer
to be coerced to a float.
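The coercion described above can be observed directly in C#. The following is a minimal sketch; the class and method names are hypothetical and only serve as illustration. The compiler accepts the mixed-type addition because its semantic analysis inserts an implicit conversion of the int operand to double.

```csharp
using System;

public static class CoercionDemo
{
    // 'whole' is an int and 'fraction' is a double. The compiler inserts an
    // implicit int-to-double conversion for 'whole' before the addition.
    public static double Add(int whole, double fraction) => whole + fraction;
}
```

Calling CoercionDemo.Add(2, 0.5) therefore performs a floating point addition of 2.0 and 0.5.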
All described stages up to this point are also called the front-end of the compiler.
These stages are specific to the source language and produce intermediate code. Now
begins the back-end of the compiler, which takes the intermediate code and translates it
into the target language. Either the front-end or the back-end can be swapped out for
another one that uses the same intermediate language, which can be very useful for
portability.
After this, optimizations on an architecture specific level can be executed. This might
include vectorization or other forms of parallelism.
3 Roslyn: The .NET Compiler Platform
3.1 Structure
There are four layers to Roslyn's APIs, as visible in fig. 3.1. Each layer provides public
and mostly documented APIs to interact with it. The base layer is the CodeAnalysis
or Compiler layer. With it, users can gain access to data and compilation units of the
different phases of the compilation process like syntax trees, symbol tables, semantic
information and intermediate language byte codes. An abstracted compiler pipeline with
the corresponding APIs can be seen in fig. 3.2.
The Workspaces layer unites related source files to projects and solutions. With it,
operations on multiple documents can be completed. The Features layer provides an
API for integrating IDE features such as finding possible references, code fixes, code
completion or access to IntelliSense. Finally, the so-called Visual Studio layer is
exclusive to the IDE Visual Studio and can be used to interact with its interface.
The APIs giving access to these layers can be used to expand and round off the
development experience of teams. Developers can write diagnostic analyzers and code fixes
specifically useful to their team, making Roslyn a powerful tool in software development.
The action gets the context of the code unit of interest as an argument. The information
this context includes depends on the chosen unit of code and will be discussed
in more detail later.
3.3 Development with Roslyn
Analyzers can either be stateless or stateful, depending on whether they need to retain
information about the code across multiple calls and actions. Stateless analyzers
do not maintain any state across their calls. This keeps them simpler to write, but rules
out more complex analysis. Stateful analyzers define and access a mutable state
that is initialized at a registered point in the compilation process. They are more
difficult to write and their memory needs to be managed correctly, but they make more
powerful analysis possible.
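To make the distinction more concrete, the following is a minimal sketch of a stateful analyzer. It is not one of the analyzers developed in this paper; the class name, the rule ID DEMO001 and the message are hypothetical. The state, a counter of named types, is created in a compilation start action, updated by a symbol action and read in a compilation end action.

```csharp
using System.Collections.Immutable;
using System.Threading;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class TypeCountAnalyzer : DiagnosticAnalyzer
{
    // Hypothetical rule; ID, title and message are placeholders.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "DEMO001", "Type count", "The compilation declares {0} named types",
        "Design", DiagnosticSeverity.Info, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.EnableConcurrentExecution();
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        context.RegisterCompilationStartAction(startContext =>
        {
            // Mutable state that lives for exactly one compilation.
            var typeCount = 0;

            // Symbol actions may run concurrently, so the counter is updated atomically.
            startContext.RegisterSymbolAction(
                _ => Interlocked.Increment(ref typeCount),
                SymbolKind.NamedType);

            // The accumulated state is only read once all symbols have been visited.
            startContext.RegisterCompilationEndAction(endContext =>
                endContext.ReportDiagnostic(
                    Diagnostic.Create(Rule, Location.None, Volatile.Read(ref typeCount))));
        });
    }
}
```

Keeping the state in a local variable of the start action, rather than in a field of the analyzer, ties its lifetime to a single compilation and avoids leaking information between compilations.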
In case developers are sure that they do not need a certain diagnostic or the analyzer
is wrong about a certain code sequence, diagnostics can be manually disabled on a
per-section, per-file and per-project scope. However, the developer needs to explicitly
state a reason for the deactivation so that other team members can review the action.
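As an illustration, two common ways of disabling a diagnostic in C# source code are sketched below. The diagnostic ID DEMO002, the category and the class are hypothetical. The SuppressMessage attribute carries the reason in its Justification property, while the pragma directive is usually accompanied by a comment.

```csharp
using System.Diagnostics.CodeAnalysis;

public static class LegacyMath
{
    // Suppression on a single member; the reason is recorded in Justification.
    [SuppressMessage("Naming", "DEMO002", Justification = "Name is part of a published API.")]
    public static int add(int a, int b) => a + b;

    // Suppression for a section of code; the comment documents the reason.
#pragma warning disable DEMO002 // Kept for backwards compatibility.
    public static int sub(int a, int b) => a - b;
#pragma warning restore DEMO002
}
```

For a whole project, the severity of a rule can instead be set in an .editorconfig file, for example with the entry dotnet_diagnostic.DEMO002.severity = none.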
In the Initialize method, a context of type AnalysisContext is given. Using this
argument, some configuration can be done, like enabling concurrent execution of analyzers
and analysis for generated code. More importantly, all analysis actions need to be
registered here. The context provides multiple methods to register for different layers of
the compilation process. Some examples include RegisterSyntaxTreeAction,
RegisterSyntaxNodeAction and RegisterSymbolAction.
Each of these methods takes an action as a parameter that is called when a section
of code corresponding to the layer described in the method name has been compiled [7].
All of the mentioned registration methods also have counterparts for stateful analyzers.
Additionally, for syntax node and symbol actions, a SyntaxKind or SymbolKind can be
passed to only register for these kinds of nodes or symbols. These include syntax nodes
such as invocation expressions or symbols such as methods. Examples of how to correctly
use these methods can be found in the following chapters, where some analyzers
are described in more detail.
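The registration described above can be sketched as follows. This is an illustrative skeleton, not one of the analyzers from the later chapters; the class name, both rules and their IDs are hypothetical. It registers one syntax node action restricted to invocation expressions and one symbol action restricted to methods.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class RegistrationSketchAnalyzer : DiagnosticAnalyzer
{
    // Hypothetical rules; IDs, titles and messages are placeholders.
    private static readonly DiagnosticDescriptor ConsoleRule = new DiagnosticDescriptor(
        "DEMO010", "Console output", "Avoid calling Console.WriteLine directly",
        "Usage", DiagnosticSeverity.Warning, isEnabledByDefault: true);

    private static readonly DiagnosticDescriptor MethodNameRule = new DiagnosticDescriptor(
        "DEMO011", "Method naming", "Method '{0}' should start with an upper-case letter",
        "Naming", DiagnosticSeverity.Warning, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(ConsoleRule, MethodNameRule);

    public override void Initialize(AnalysisContext context)
    {
        // Configuration: allow concurrent execution and skip generated code.
        context.EnableConcurrentExecution();
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        // Only invocation expressions and method symbols trigger the actions.
        context.RegisterSyntaxNodeAction(AnalyzeInvocation, SyntaxKind.InvocationExpression);
        context.RegisterSymbolAction(AnalyzeMethod, SymbolKind.Method);
    }

    private static void AnalyzeInvocation(SyntaxNodeAnalysisContext context)
    {
        var invocation = (InvocationExpressionSyntax)context.Node;

        // Purely syntactic check on the text of the invoked expression.
        if (invocation.Expression.ToString() == "Console.WriteLine")
        {
            context.ReportDiagnostic(Diagnostic.Create(ConsoleRule, invocation.GetLocation()));
        }
    }

    private static void AnalyzeMethod(SymbolAnalysisContext context)
    {
        var method = (IMethodSymbol)context.Symbol;

        // Constructors, accessors and operators are skipped.
        if (method.MethodKind == MethodKind.Ordinary && char.IsLower(method.Name[0]))
        {
            context.ReportDiagnostic(
                Diagnostic.Create(MethodNameRule, method.Locations[0], method.Name));
        }
    }
}
```

The syntax node action only sees the text of the call and would miss an aliased or fully qualified call; a check on the semantic model would be needed for that.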
A challenging part of development can be the usage of more specific parts and features
of the Roslyn APIs as the documentation can at times be sparse. One such challenge may
be the management of dependencies, in cases where the analyzer needs to use another
package to work. Another may be the use of a configuration file in the target project. A
more elaborate explanation and the solutions to both of these challenges can be found
in chapter 5. A useful technique for transmitting custom information about a diagnostic
from the diagnostic analyzer to a potential code fix provider is used in chapter 6.
The source code is represented in the immutable class Document. If the user calls the
code action, a new Document needs to be generated where the problem is fixed. There
are a few classes provided to help with this, such as the SyntaxFactory and
SyntaxGenerator. Using these classes, a new syntax tree can be generated, with nodes in
the tree added, changed or removed. The new Document is then returned and the changes will
be applied in the user's solution.
The previously mentioned tools SyntaxViewer and RoslynQuoter are of great use for
code fix development as well. The RoslynQuoter 2 takes a code snippet from its user and
shows a collection of function calls using the Roslyn API’s SyntaxFactory utility class
with which the given snippet can be generated.
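As a small illustration of the SyntaxFactory calls that such a tool produces, the sketch below builds the syntax for an attribute list like [Authorize]. The helper class and method names are hypothetical; the SyntaxFactory methods are part of the Roslyn API.

```csharp
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class AttributeBuilder
{
    // Builds the syntax node for "[<attributeName>]" without any surrounding trivia.
    public static AttributeListSyntax BuildAttributeList(string attributeName)
    {
        var attribute = SyntaxFactory.Attribute(SyntaxFactory.IdentifierName(attributeName));
        return SyntaxFactory.AttributeList(SyntaxFactory.SingletonSeparatedList(attribute));
    }
}
```

Calling AttributeBuilder.BuildAttributeList("Authorize").ToFullString() yields the text [Authorize]. Since nodes created this way carry no whitespace, a code fix typically has to add trivia or run the formatter before the node is inserted into a document.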
One simple technique to get suitable and valid code snippets for unit testing is to open
a second project and write code that the analyzer would be applied to. This way, there
is confidence that the code is valid and can be used in the unit test. Sadly, in these unit
test snippets imports from .NET Core libraries are not supported. Because of this, any
interfaces, attributes or abstract classes from external libraries need to be simulated in
the unit test.
Test-driven development techniques prove very useful when developing diagnostic
analyzers and code fixes. Being able to use a debugger and set breakpoints while executing
different test scenarios is very important, as there is no other easy way to quickly
debug analyzer code.
2 https://roslynquoter.azurewebsites.net/
4 Authorize Analyzer
4.1 Functionality
This analyzer was the first project completed for this paper and served as an
introduction to the process of writing code analyzers for Roslyn. It has a very simple
purpose: finding [Authorize] attributes in C# code and reporting a warning if the
attribute is commented out. A visual representation of the analyzer in action can be seen
in fig. 4.1.
In C#, attributes can add metadata to classes, methods or entire projects [1]. They
work similarly to annotations in Java. In .NET projects, web controllers provide
HTTP endpoints [4]. A controller can be secured with authentication by adding the
C# attribute [Authorize]. The attribute can either be placed on the class to secure all
endpoints defined in the controller or on a specific endpoint. On endpoints and
controllers with the attribute, the authentication scheme that is configured as default
is used to authenticate requests. During development, this attribute is often commented
out so that the developer does not need to authenticate each time they want to use an
endpoint. However, the attribute is sometimes left commented out in release builds, which
poses a security risk.
To prevent this from happening, this analyzer will report a warning to show the
developer that a commented out [Authorize] attribute is most likely a mistake.
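The situation can be illustrated with the following sketch. To keep it self-contained, a stand-in AuthorizeAttribute is declared here in place of the one from the Microsoft.AspNetCore.Authorization namespace, and the controller classes are hypothetical plain classes rather than real web controllers.

```csharp
using System;

// Stand-in for Microsoft.AspNetCore.Authorization.AuthorizeAttribute.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class AuthorizeAttribute : Attribute { }

// Placed on the class, the attribute secures every endpoint of the controller.
[Authorize]
public class OrdersController
{
    public string GetAll() => "all orders";
}

public class StatusController
{
    // Deliberately left unsecured.
    public string Ping() => "pong";

    // Commented out during development and then forgotten.
    // This is the kind of line the analyzer reports a warning for.
    // [Authorize]
    public string Restart() => "restarting";
}
```

In this sketch, Restart is reachable without authentication although the developer clearly intended to secure it.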
4.2 Implementation
The complete implementation of this analyzer can be found on
https://github.com/knollsen/csharp_analyzers. The core of the implementation is shown in
figure 4.2.
4.3 Code fix
In the overridden Initialize method, we register an action for whenever the parsing of
a code document is completed. When this happens, the HandleSyntaxTree method (line
9) is called.
If there are any single line comments, we remove slash characters and whitespace.
Then, each of them is scanned for the [Authorize] attribute. If the attribute is present,
we report a diagnostic (line 29). The configuration for the diagnostic, including id, name
and severity, is stored in the property Rule. To report the diagnostic, we also have to
pass the location where the diagnostic is supposed to show up. The location consists of
line and column number and can be easily retrieved from the trivia node.
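The scanning step described above can be sketched in isolation as follows. This is not the implementation from the repository but a simplified stand-alone version; the class and method names are hypothetical. It parses a piece of source code and returns the one-based line numbers of single line comments that contain a commented out [Authorize] attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class CommentedAuthorizeFinder
{
    public static IReadOnlyList<int> FindLines(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantTrivia()
            // Comments are trivia, not syntax nodes.
            .Where(trivia => trivia.IsKind(SyntaxKind.SingleLineCommentTrivia))
            // Strip the leading slashes and whitespace, then look for the attribute.
            .Where(trivia => trivia.ToString()
                .TrimStart('/', ' ', '\t')
                .StartsWith("[Authorize", StringComparison.Ordinal))
            // Line positions are zero-based, editors usually count from one.
            .Select(trivia => trivia.GetLocation().GetLineSpan().StartLinePosition.Line + 1)
            .ToList();
    }
}
```

The prefix check also matches variants with arguments such as [Authorize(Roles = "Admin")], but as a simplification it would equally match any other attribute whose name merely starts with Authorize.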
In the code fix provider's RegisterCodeFixesAsync method (fig. 4.3), we try to fetch
the SyntaxNode that the comment with [Authorize] in it is assigned to. We do this because
trivia is assigned to the following syntax node on parsing. If the node is not found,
we return as a safety measure. Safeguards such as this have to be implemented as the
compiler will call this method for every diagnostic in our code fix provider's
FixableDiagnosticIds property and a third-party analyzer may be using the same
identification string as our analyzer. Should this happen, our code fix provider will be
called for code sections that it cannot fix and might behave unexpectedly.
The GetCommentAsync method can be seen in fig. 4.4. We obtain the diagnostic we
are trying to fix as a parameter, which simplifies the process of looking for the correct
syntax node. First, we try to find the trivia node, our comment, at the location that the
diagnostic was reported at. Then, we try to find the syntax node the compiler assigned
our comment to as trivia. We only want to find nodes that are class declarations or
method declarations (line 19), as these are the only types of syntax nodes that can be
annotated with [Authorize]. The trivia might also have been assigned to a child token of
our syntax node, so we also have to check these for the comment (line 27). If either the
comment or the assigned syntax node cannot be found, we return null so that no code
fix will mistakenly be shown. Otherwise, we return the syntax and trivia node as a tuple.
Figure 4.4: Fetching of the correct node in the Authorize code fix
Finally, the method UncommentAsync seen in fig. 4.5 makes the necessary changes
to the code. In case the comment is assigned directly to our syntax node (line 6), we
create a new SyntaxNode containing the [Authorize] attribute and remove the comment
from the leading trivia of our syntax node. If the comment is assigned to a child token
of the syntax node, we retrieve the correct token (line 14), create a new token from it
without the comment inside its leading trivia and replace the old token with the newly
generated one. We then create a new attribute with the SyntaxFactory utility class and,
using the SyntaxGenerator, insert it before our new syntax node. Having generated a new
syntax node that we want to swap with our old one, we use the SyntaxGenerator to generate
a new syntax tree with our desired node replaced. Lastly, we return a new document with
the new syntax tree.
4.4 Testing
The unit tests for this analyzer are there to test edge cases and make sure the analyzer
shows diagnostics in the correct locations. Additionally, the code fix needs to be tested
so that the correct comment is removed and the attribute is inserted correctly. For this
purpose, code snippets can be defined as seen in fig. 4.6. In the unit test described here,
we have a class definition with a commented out [Authorize] attribute that we expect
the diagnostic analyzer to generate a warning for. Our second code snippet shows the
same class declaration after we apply the code fix. For our test assertion, we first define
the diagnostic expected to show including its location (line 31). Then, we use our test
utility classes to assert that the diagnostic is shown, and afterwards, if the code fix is
applied, that our initial code snippet now looks like our fixed class declaration. If there
are any differences whatsoever, the unit test will fail.
The other unit tests for this analyzer try out different cases of interest. They can be
found in the analyzer repository as well 1. Additional attributes are placed around the
comment containing the [Authorize] attribute to make sure the comment is still correctly
replaced. The code fix is used on a commented out attribute placed on a method to
ensure that it works for methods as well. Finally, a combination of the above-mentioned
use cases is tested to try out the code fix on multiple instances of the diagnostic, placing
the commented out attribute on both the web controller itself as well as on a method
inside it.
1 https://github.com/knollsen/csharp_analyzers/blob/main/AuthorizeAnalyzer/AuthorizeAnalyzer/AuthorizeAnalyzer.Test/AuthorizeAnalyzerUnitTests.cs
Figure 4.5: Swapping the comment for an attribute in the Authorize code fix
 1 [TestMethod]
 2 public async Task CodeFixUncommentsAuthorizeAttribute()
 3 {
 4     var test = @"
 5 using System;
 6
 7 namespace ConsoleApplication1
 8 {
 9     // this is a comment
10     // [Authorize]
11     class Test
12     {
13     }
14
15     class AuthorizeAttribute : Attribute {}
16 }";
17     var fix = @"
18 using System;
19
20 namespace ConsoleApplication1
21 {
22     // this is a comment
23     [Authorize]
24     class Test
25     {
26     }
27
28     class AuthorizeAttribute : Attribute {}
29 }";
30
31     var expected = VerifyCS.Diagnostic(AuthorizeAnalyzer.DiagnosticId)
32         .WithSpan(7, 5, 7, 19);
33     await VerifyCS.VerifyCodeFixAsync(test, expected, fix);
34 }
Figure 4.6: Example of a unit test for the Authorize code fix
5 Naming Analyzer
5.1 Functionality
This analyzer's purpose is to make sure that design patterns are used correctly.
Developers who are not comfortable with certain patterns stand to benefit from instant
feedback when they use these patterns incorrectly. This is made possible by putting
constraints on the fields and properties of classes that are part of a specific pattern.
For example, when using the repository and factory patterns, it is not allowed to have a
repository as a member of a factory. An illustration of what the diagnostic analyzer
would show in this case can be seen in fig. 5.1. The analyzer does not use the name of
the field or property for this check, but the name of the type. This way, the issue is not
simply resolved by changing the name of the variable.
Additionally, it has the functionality to ban certain terms in class names. This may
be useful for some teams that do not want to use class names with overloaded meaning
like Manager or Helper. In these cases, the analyzer will even show the diagnostic as a
compiler error, like in fig. 5.2.
For this analyzer, the user has to be able to do some amount of configuration. Not
every team using it will have the same naming conventions and repacking the analyzer
with a new configuration is tedious and impractical. For this reason, the analyzer will
look for a file called namingConfig.json in the solution it is used in.
The file contains an array of terms that are not allowed to be used in class names.
It can be found with the key DisallowedTermsInClassNames. The key
DisallowedSuffixesInMembersType contains an array of objects, where each one stands for a
design pattern. The object contains a ClassSuffix, the suffix of the classes this rule
applies to, and an array of suffixes that are not allowed to appear in its members' types.
In the example file shown, this means that classes ending with Factory cannot contain any
fields or properties of types ending with Repository.
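To make the structure of the file concrete, the sketch below shows a configuration with the keys described above together with classes it could be mapped to. The key names are taken from the description and the listings in this chapter; the class name MemberSuffixRule and the concrete values are assumptions. System.Text.Json is used here only to keep the sketch free of external packages, whereas the analyzer itself uses Newtonsoft.Json.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// One rule per design pattern: classes with this suffix must not
// contain members whose type ends with one of the listed suffixes.
public sealed class MemberSuffixRule
{
    public string ClassSuffix { get; set; } = "";
    public List<string> DisallowedMemberSuffixes { get; set; } = new List<string>();
}

public sealed class NamingConfig
{
    public List<string> DisallowedTermsInClassNames { get; set; } = new List<string>();
    public List<MemberSuffixRule> DisallowedSuffixesInMembersType { get; set; }
        = new List<MemberSuffixRule>();
}

public static class NamingConfigExample
{
    // Possible content of namingConfig.json (values are illustrative).
    public const string Json = @"{
  ""DisallowedTermsInClassNames"": [ ""Manager"", ""Helper"" ],
  ""DisallowedSuffixesInMembersType"": [
    { ""ClassSuffix"": ""Factory"", ""DisallowedMemberSuffixes"": [ ""Repository"" ] }
  ]
}";

    // Maps the JSON text to the configuration classes above.
    public static NamingConfig Parse(string json) =>
        JsonSerializer.Deserialize<NamingConfig>(json) ?? new NamingConfig();
}
```

With this configuration, class names containing Manager or Helper are rejected and a class ending in Factory may not have a field or property of a type ending in Repository.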
Because a fix for one of these diagnostics requires the developer to rethink the decisions
they made when naming the affected classes or implementing the design pattern, there
is no code fix provided.
5.2 Implementation
The analyzer is registered as a symbol action specifically for named types, meaning that
it is only called when a named type like a class or an interface is compiled.
The first challenge of implementing this diagnostic analyzer is how to read and manage
the configuration file. The solution is shown in fig. 5.4. There are two ways in which
we can get the content of the file. Roslyn provides an API to get any files designated as
C# Analyzer Additional files in Visual Studio. The access is simple, as shown in line 8,
and with one line of code we can get the file’s content.
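A minimal sketch of this kind of access is given below. It is not the code from fig. 5.4; the helper class and method names are hypothetical. The AnalyzerOptions instance is available on the analysis context through its Options property, and files appear in AdditionalFiles once they are marked as additional files, for example with an AdditionalFiles item in the project file.

```csharp
using System.IO;
using System.Linq;
using System.Threading;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

public static class ConfigReader
{
    // Returns the content of namingConfig.json, or null if the file has not
    // been designated as an additional file of the analyzed project.
    public static string ReadConfig(AnalyzerOptions options, CancellationToken cancellationToken)
    {
        var file = options.AdditionalFiles
            .FirstOrDefault(f => Path.GetFileName(f.Path) == "namingConfig.json");

        // GetText can itself return null, for example if the file cannot be read.
        return file?.GetText(cancellationToken)?.ToString();
    }
}
```

Inside a symbol action, this helper would be called as ConfigReader.ReadConfig(context.Options, context.CancellationToken).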
However, the user of the analyzer has to specifically designate the file as such, which
might lead to some confusion. Instructions on how to correctly configure the file can
be found on the analyzer's GitHub page 1. Additionally, in unit tests there is no access
to the Additional Files API.
Figure 5.4: Reading the configuration file for the Naming analyzer
Because of these problems, a second way of reading the content of the configuration
file is implemented. In the second implementation from line 17 to line 28 in fig. 5.4,
the configuration is taken directly from the file system. If we cannot find the file, a
diagnostic is reported to the user to suggest adding the configuration. To make this
work with unit tests, we simply have to include a namingConfig.json file in the unit test
project.
 1 ...
 2
 3     NamingConfig config = null;
 4     try
 5     {
 6         config = JsonConvert.DeserializeObject<NamingConfig>(configContent);
 7     }
 8     catch (JsonReaderException)
 9     {
10         // ignore this exception, as it is handled below because config == null
11     }
12
13
14     // if the config file does not contain the correct structure, return as well
15     if (config?.DisallowedSuffixesInMembersType == null || config?.DisallowedTermsInClassNames == null)
16     {
17         context.ReportDiagnostic(Diagnostic.Create(MissingFileRule, Location.None));
18         return;
19     }
20
21     this.Config = config;
22 }
23 ...
Figure 5.5: Parsing the configuration file for the Naming analyzer
Independent of which method is used to read the file, once its content is retrieved, it
is parsed using the JsonConvert class of the Newtonsoft.Json 2 library, as seen in line 6
of fig. 5.5. This is a well-supported and extremely popular library for .NET projects that
can perform actions related to the JSON format in a performant manner. We try to parse
the JSON to a class containing structurally matching properties. In case the parsing
does not work, we report a diagnostic as well to show that the format of the JSON needs
to be changed. If we are able to parse a valid configuration, we save it in a property
of the analyzer so that the configuration does not have to be read each time the method
is called. However, because this is a stateless analyzer, it will still be read more than
once.
1 https://github.com/knollsen/csharp_analyzers/blob/main/NamingAnalyzer/README.md
2 https://www.newtonsoft.com/json
 1 ...
 2
 3 var namedType = context.Symbol;
 4
 5 // check for disallowed terms in class names
 6 foreach (var disallowedTerm in this.Config.DisallowedTermsInClassNames)
 7 {
 8     if (namedType.Name.Contains(disallowedTerm))
 9     {
10         context.ReportDiagnostic(Diagnostic.Create(DisallowedTermsRule, namedType.Locations.First(), namedType.Name, disallowedTerm));
11
12         break;
13     }
14 }
15
16 ...
Figure 5.6: Checking type names for disallowed terms in the Naming analyzer
This provides us with a second challenge on how to include the Newtonsoft.Json library
in a package with our analyzer. Microsoft sadly does not provide any documentation
on this problem. However, the Roslyn community is able to provide a well explained
solution 3. It works by reconfiguring the way the analyzer is packed into a NuGet
package. First, all NuGet dependencies of the project are scanned, then all .NET Standard
packages are removed as they have to be present in the user's project anyway. Finally,
the .dll files of all remaining dependencies are copied inside the NuGet package of the
analyzer. Using this method, the analyzer will only use the included binary files and
is not reliant on the user installing additional NuGet packages, which might lead to
dependency problems if they want to use a different version of the same library.
After successfully parsing the configuration, we can now perform the required checks
for the type the analyzer has been called for. First, we check its name for disallowed
terms. This can be done simply by taking the name of the type and iterating over all
terms specified in our configuration. If the name contains a disallowed term, we report
a diagnostic informing the developer of this transgression, as seen in fig. 5.6.
3 https://www.meziantou.net/packaging-a-roslyn-analyzer-with-nuget-dependencies.htm,
Accessed: 04.12.2021
Fig. 5.7 shows the implementation of checking the members of a class for disallowed
types. For this purpose, we need to retrieve the corresponding syntax node as currently
we only have the symbol of the class. Roslyn allows us to do this in a simple way by
giving us its declaration (line 3). We make sure that the syntax node is of the correct type
ClassDeclarationSyntax and check if the suffix of our class is included in the analyzer
configuration (line 11). If that is the case, we retrieve the disallowed member suffixes and
the members of the class. Then, we iterate over each member. We are only interested in
fields and properties, but methods, constructors and other definitions are also part of a
class's members. Because of this, we perform a check on the type of the member (lines
20 and 33). In case we are dealing with a field or property, we search for its type name
in the configuration to see if there are any matching suffixes. If we find any, we report
a diagnostic (lines 27 and 39).
5.3 Testing
Unit tests for this diagnostic analyzer like the one included in fig. 5.8 are not as
extensive as those for other analyzers because there is no code fix. Nevertheless, the
analyzer is still tested for correctly reading and parsing the configuration file and for
working on abstract classes as well. The test in fig. 5.8 makes sure that the diagnostic
is shown for both fields and properties. The configuration file for this test disallows
Factories to contain a Repository. For the test to succeed, a diagnostic has to be shown
on both members.
Of course, the diagnostic shown for invalid terms in class names is tested as well. For
example, we test if the disallowed term can be at any position in the class name.
To test the reading and parsing of configuration files, there are multiple test projects
for this analyzer. The configuration file has to be unique per project, so additionally
there is one unit test project where the configuration is missing and one where the
configuration is in an invalid format.
 1 ...
 2
 3 var syntaxNode = namedType.DeclaringSyntaxReferences.First().GetSyntax();
 4
 5 // for class declarations, check for members with disallowed suffixes
 6 if (this.Config.DisallowedSuffixesInMembersType != null && syntaxNode is ClassDeclarationSyntax classDeclaration)
 7 {
 8     var name = classDeclaration.Identifier.Text;
 9
10     // check if there is a configuration for this class
11     if (this.Config.DisallowedSuffixesInMembersType.Any(x => name.EndsWith(x.ClassSuffix)))
12     {
13         var config = this.Config.DisallowedSuffixesInMembersType.First(x => name.EndsWith(x.ClassSuffix));
14
15         var members = classDeclaration.Members;
16
17         // check if a member is of a type with a disallowed suffix
18         foreach (var member in members)
19         {
20             if (member is PropertyDeclarationSyntax propertyDeclaration)
21             {
22                 var disallowedSuffix =
23                     config.DisallowedMemberSuffixes.FirstOrDefault(x =>
24                         propertyDeclaration.Type.ToString().EndsWith(x));
25                 if (disallowedSuffix != null)
26                 {
27                     context.ReportDiagnostic(
28                         Diagnostic.Create(DisallowedSuffixRule,
29                             member.GetLocation(), namedType.Name, disallowedSuffix));
30                 }
31             }
32
33             if (member is FieldDeclarationSyntax fieldDeclaration)
34             {
35                 var disallowedSuffix = config.DisallowedMemberSuffixes.FirstOrDefault(x =>
36                     fieldDeclaration.Declaration.Type.ToString().EndsWith(x));
37                 if (disallowedSuffix != null)
38                 {
39                     context.ReportDiagnostic(
40                         Diagnostic.Create(DisallowedSuffixRule,
41                             member.GetLocation(), namedType.Name, disallowedSuffix));
42                 }
43             }
44         }
45     }
46 }
[TestMethod]
public async Task DisallowedMemberSuffixesShowDiagnostic()
{
    var test = @"
using System;

namespace ConsoleApplication1
{
    class TestFactory
    {
        private IRepository repository2 { get; }
        private readonly IRepository repository;
    }

    interface IRepository {}
}";
    var propertyDiagnostic = VerifyCS.Diagnostic(NamingAnalyzer.DisallowedSuffixDiagnosticId)
        .WithSpan(8, 9, 8, 49).WithArguments("TestFactory", "IRepository");

    var fieldDiagnostic = VerifyCS.Diagnostic(NamingAnalyzer.DisallowedSuffixDiagnosticId)
        .WithSpan(9, 9, 9, 49).WithArguments("TestFactory", "IRepository");

    await VerifyCS.VerifyAnalyzerAsync(test, propertyDiagnostic, fieldDiagnostic);
}
6 Exception Analyzer
6.1 Functionality
C# does not support compile-time checking of exceptions. In Java, this is possible
using the throws keyword in a method definition. If a method’s signature contains this
keyword, every caller has to either catch the checked exception types listed after throws
or declare them in its own throws clause. This diagnostic analyzer tries to emulate this
feature using C#’s XML documentation comments.
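To illustrate the kind of code such an analyzer looks at, here is a small hedged example of a method that documents an exception in its XML comment, together with a caller that handles it. Inventory, Remove and TryRemove are hypothetical names chosen for illustration only. The C# compiler itself accepts the call with or without the try-catch, which is the gap an analyzer of this kind would close.

```csharp
using System;

public static class Inventory
{
    /// <summary>Removes <paramref name="count"/> items from the stock.</summary>
    /// <exception cref="System.ArgumentOutOfRangeException">
    /// Thrown when <paramref name="count"/> is negative.
    /// </exception>
    public static int Remove(int stock, int count)
    {
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        return stock - count;
    }
}

public static class InventoryDemo
{
    // The caller handles the documented exception explicitly.
    public static string TryRemove(int stock, int count)
    {
        try
        {
            return Inventory.Remove(stock, count).ToString();
        }
        catch (ArgumentOutOfRangeException)
        {
            return "rejected";
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryRemove(5, 2));
        Console.WriteLine(TryRemove(5, -1));
    }
}
```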
A code fix is implemented as well, which wraps the method call in a try-catch block for
the documented exception.
6.2 Implementation
Fig. 6.3 shows how the analysis continues. In line 3, we determine whether our method
call is inside a try-catch block by searching the syntax node’s ancestors for a try
statement. If we find one, we access its catch clauses and collect the types of the caught
exceptions. We then iterate over all documented exceptions and retrieve the type symbol
for each one, since so far we only have its name. Next, we check whether the exception is
caught: we iterate over all caught exceptions and examine whether any of them is equal
to the documented exception or is a base type of it.
For this purpose, we again use a helper function InheritsFrom that takes two type
symbols and goes through the first argument’s base types to see if any of them is equal
to the second argument. If the documented exception is not caught or there is no
try-catch block at all, we report a diagnostic at the invocation’s location.
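The helper itself is not reproduced in this section. A minimal sketch of how such a check might look is given below. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced and, unlike the description above, it also treats a type as inheriting from itself, so that the equality check and the base type check collapse into one loop. The class name InheritanceSketch is hypothetical.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class InheritanceSketch
{
    // Walks from 'type' up its chain of base types and reports whether
    // 'candidateBase' is the type itself or one of its base types.
    public static bool InheritsFrom(ITypeSymbol type, ITypeSymbol candidateBase)
    {
        for (ITypeSymbol current = type; current != null; current = current.BaseType)
        {
            if (SymbolEqualityComparer.Default.Equals(current, candidateBase))
            {
                return true;
            }
        }

        return false;
    }

    public static void Main()
    {
        // A compilation that only references the core library is enough to look up exception types.
        var compilation = CSharpCompilation.Create("Sketch")
            .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location));

        var argumentException = compilation.GetTypeByMetadataName("System.ArgumentException");
        var exception = compilation.GetTypeByMetadataName("System.Exception");

        Console.WriteLine(InheritsFrom(argumentException, exception));
        Console.WriteLine(InheritsFrom(exception, argumentException));
    }
}
```

Symbols are compared with SymbolEqualityComparer rather than with ==, because reference equality on symbols is not guaranteed to be meaningful.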
A special feature of this analyzer is that it uses properties on a diagnostic. This is
custom data that can be attached to the diagnostic and later read by a code fix provider.
The data must have the shape of an ImmutableDictionary<string,string>, but it is still
very useful because the code fix provider does not have to redo the computations done in
the diagnostic analyzer. In this case, we pass the name of the uncaught exception as a
property. The next section describes how this data is used in the code fix provider.
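The round trip of such a property can be sketched in isolation. The example below assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. The descriptor, its id DEMO001 and the key value "ExceptionType" are made up for illustration and are not the values used by the Exception Analyzer.

```csharp
using System;
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;

public static class DiagnosticPropertiesSketch
{
    // Hypothetical key; the analyzer uses its own constant for this purpose.
    public const string ExceptionTypeKey = "ExceptionType";

    // Hypothetical rule, defined only so that a diagnostic can be created.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "DEMO001",
        title: "Documented exception is not caught",
        messageFormat: "Call '{0}' may throw '{1}'",
        category: "Usage",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    // Analyzer side: attach the exception name to the diagnostic.
    public static Diagnostic Report(string invocationText, string exceptionName)
    {
        var properties = ImmutableDictionary<string, string>.Empty
            .Add(ExceptionTypeKey, exceptionName);

        return Diagnostic.Create(Rule, Location.None, properties, invocationText, exceptionName);
    }

    // Code fix side: read the exception name back without redoing the analysis.
    public static string ReadExceptionName(Diagnostic diagnostic)
    {
        return diagnostic.Properties.TryGetValue(ExceptionTypeKey, out var name) ? name : null;
    }

    public static void Main()
    {
        var diagnostic = Report("ThrowsException()", "System.ArgumentException");
        Console.WriteLine(ReadExceptionName(diagnostic));
    }
}
```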
 1  ...
 2
 3  var tryCatch = (TryStatementSyntax)invocation
 4      .FirstAncestorOrSelf<SyntaxNode>(n => n.IsKind(SyntaxKind.TryStatement));
 5  var caughtExceptions = tryCatch?.Catches.Select
 6      (x => x.Declaration.Type).ToList();
 7
 8  foreach (var documentedException in documentedExceptions)
 9  {
10      var documentedExceptionSymbol = context.Compilation.GetTypeByMetadataName(documentedException);
11
12      if (documentedExceptionSymbol == null)
13      {
14          continue;
15      }
16
17      var caught = false;
18      // check if documentedException inherits from any caughtException
19      foreach (var caughtException in caughtExceptions ?? new List<TypeSyntax>())
20      {
21          var caughtSymbolTypeInfo = semanticModel.GetTypeInfo(caughtException);
22          var caughtSymbol = context.Compilation.GetTypeByMetadataName(caughtSymbolTypeInfo
23              .ConvertedType.ToString());
24          if (caughtSymbol != null && InheritsFrom(documentedExceptionSymbol, caughtSymbol))
25          {
26              caught = true;
27              break;
28          }
29      }
30
31      if (!caught)
32      {
33          // give documented exception type as property
34          var properties = new Dictionary<string, string> { { PropertiesExceptionTypeKey, documentedException } }.ToImmutableDictionary();
35
36          context.ReportDiagnostic(Diagnostic.Create(ReferenceRule, invocation.GetLocation(), properties, invocation.ToString(), documentedException));
37      }
38  }
For the code fix, in addition to the invocation expression that we operated on in the
diagnostic analyzer, we need the statement that it is a part of. The reason is that the
invocation may, for example, be part of a variable declaration statement like
var x = MethodCall(). Only MethodCall() is the invocation, but to add a try-catch,
the whole statement needs to be put inside the block. This retrieval is not shown here
as it simply involves iterating over the invocation’s ancestors and taking the first node
that is of type StatementSyntax.
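As an illustration of this retrieval, the following sketch finds the statement that encloses an invocation. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. FindEnclosingStatement is a hypothetical name, and the thesis code may perform the ancestor walk differently.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class EnclosingStatementSketch
{
    // Returns the closest ancestor that is a statement,
    // or null if the invocation is not inside any statement.
    public static StatementSyntax FindEnclosingStatement(InvocationExpressionSyntax invocation)
    {
        return invocation.FirstAncestorOrSelf<StatementSyntax>();
    }

    public static void Main()
    {
        var tree = CSharpSyntaxTree.ParseText(
            "class C { int MethodCall() => 1; void M() { var x = MethodCall(); } }");

        var invocation = tree.GetRoot()
            .DescendantNodes()
            .OfType<InvocationExpressionSyntax>()
            .First();

        Console.WriteLine(invocation);                         // the invocation only
        Console.WriteLine(FindEnclosingStatement(invocation)); // the whole declaration statement
    }
}
```

An invocation inside an expression-bodied member has no enclosing statement, so a caller of such a helper should be prepared for a null result.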
Using the invocation, its surrounding statement and the name of the exception that
is not caught, we can call the method AddTryCatchAsync, illustrated in fig. 6.4. First,
we want to find out if our invocation is inside a try-catch block, as that changes the way
we handle our code fix. For this, we again iterate over our invocation’s ancestors to see if
any of them is a TryStatement (line 3). We will use this information later in the fix. The
second preparation step for the code fix is constructing our exception type. As we only
have the name of the exception as an argument, we need to create the TypeSyntax using
 1  if (tryStatement == null)
 2  {
 3      var block = (BlockSyntax)SyntaxFactory.ParseStatement("{\n" + statement.GetText() + "}");
 4
 5      // generate try catch around node
 6      var tryCatch = SyntaxFactory.TryStatement(
 7          SyntaxFactory.SingletonList(SyntaxFactory.CatchClause()
 8              .WithDeclaration(
 9                  SyntaxFactory.CatchDeclaration(exceptionIdentifier))))
10          .WithBlock(block);
11
12      var oldRoot = await document.GetSyntaxRootAsync(ct).ConfigureAwait(false);
13      var newRoot = oldRoot.ReplaceNode(statement, tryCatch).NormalizeWhitespace();
14
15      return document.WithSyntaxRoot(newRoot);
16  }
17  else
18  {
19      var newTryStatement = tryStatement.AddCatches(SyntaxFactory.CatchClause()
20          .WithDeclaration(SyntaxFactory.CatchDeclaration(exceptionIdentifier)));
21
22      var oldRoot = await document.GetSyntaxRootAsync(ct).ConfigureAwait(false);
23
24      var newRoot = oldRoot.ReplaceNode(tryStatement, newTryStatement).NormalizeWhitespace();
25
26      return document.WithSyntaxRoot(newRoot);
27  }
Continuing in fig. 6.5, we now check if our invocation is inside a try-catch block. The
easier case, shown from line 19 onwards, is the one where we are already inside a try
block. Here, we simply have to add another catch clause using the convenient AddCatches
method. Then, we replace the old try-catch with our newly constructed one (line 24)
and return an updated document.
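The effect of AddCatches can be tried out on a parsed statement without a full document. The sketch below assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. AddCatchFor is a hypothetical helper, and building the type with SyntaxFactory.ParseTypeName is an assumption made for this example rather than the approach taken in the thesis code.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class AddCatchSketch
{
    // Returns a copy of the try statement with one more catch clause appended.
    public static TryStatementSyntax AddCatchFor(TryStatementSyntax tryStatement, string exceptionTypeName)
    {
        var catchClause = SyntaxFactory.CatchClause()
            .WithDeclaration(
                SyntaxFactory.CatchDeclaration(SyntaxFactory.ParseTypeName(exceptionTypeName)));

        return tryStatement.AddCatches(catchClause);
    }

    public static void Main()
    {
        var existing = (TryStatementSyntax)SyntaxFactory.ParseStatement(
            "try { MethodCall(); } catch (System.IO.IOException) { }");

        var extended = AddCatchFor(existing, "System.ArgumentException").NormalizeWhitespace();

        Console.WriteLine(extended.ToFullString());
    }
}
```

Syntax nodes are immutable, so AddCatches leaves the original try statement untouched and the caller has to swap the new node into the tree, as done with ReplaceNode in the listing.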
[TestMethod]
public async Task UnknownExceptionsAreIgnored()
{
    var test = @"
using System;
public class SomeClass
{
    public void Execute()
    {
        ThrowsException();
    }
    /// <summary>
    /// Will throw an exception.
    /// </summary>
    /// <exception cref=""SomeNoneExistingException"">Will throw this exception.</exception>
    private static void ThrowsException()
    {
        throw new ArgumentException();
    }
}";
    await VerifyCS.VerifyAnalyzerAsync(test);
}
The alternative involves creating a new try-catch block and putting our statement
inside it. For this purpose, we first create a new block containing our statement
(line 3). Then we use the SyntaxFactory again to create the TryStatement with a single
catch clause whose CatchDeclaration names our exception type. For these calls to the
SyntaxFactory, the RoslynQuoter³ was very useful. It shows exactly which methods should
be used to get the desired code sequence. After replacing the original statement with the
new try-catch, we return the new document.
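The construction of a new try-catch around a statement can also be sketched in isolation. The example assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. WrapInTryCatch is a hypothetical helper, and it builds the block with SyntaxFactory.Block instead of parsing it from text, which is a variation on the listing and not the thesis code itself.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class WrapInTrySketch
{
    // Builds "try { <statement> } catch (<exceptionTypeName>) { }".
    public static TryStatementSyntax WrapInTryCatch(StatementSyntax statement, string exceptionTypeName)
    {
        var catchClause = SyntaxFactory.CatchClause()
            .WithDeclaration(
                SyntaxFactory.CatchDeclaration(SyntaxFactory.ParseTypeName(exceptionTypeName)));

        return SyntaxFactory.TryStatement(SyntaxFactory.SingletonList(catchClause))
            .WithBlock(SyntaxFactory.Block(statement));
    }

    public static void Main()
    {
        var statement = SyntaxFactory.ParseStatement("var x = MethodCall();");

        var wrapped = WrapInTryCatch(statement, "System.ArgumentException").NormalizeWhitespace();

        Console.WriteLine(wrapped.ToFullString());
    }
}
```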
6.4 Testing
There are 12 unit tests in total for this analyzer and its code fix, one of which is
illustrated in fig. 6.6. The test shown here covers a case where the exception type
cannot be found in the compilation. The diagnostic analyzer should be able to handle this
problem and simply ignore the exception. The assertion used here makes sure that no
diagnostic is shown and the analyzer does not error out while executing. Tests like this
are important as developers tend to make mistakes and a stable diagnostic analyzer
³ https://roslynquoter.azurewebsites.net/
Some other unit tests of interest cover cases where multiple exceptions are documented
and none or only one of them is caught. We also confirm that catching the
base type of an exception is sufficient for the analyzer and try out the code fix in several
situations, both where the try-catch block already exists and just needs to be extended
and where it has to be newly created.
Analyzer and code fix are tested for static and non-static as well as public and private
methods to ensure that neither the accessibility level nor whether a method is static
affects the functionality. The correct location and message arguments are also
part of all assertions.
7 Conclusion and Future Work
The goal of this project was to evaluate whether Roslyn provides a suitable API for C#
developers to write their own code analysis tools. For this purpose, three diagnostic
analyzers with different objectives were implemented. The ideas for them were worked out
with developers at World-Direct¹.
After the initial familiarization, the three diagnostic analyzers were realized, and
their implementation was documented and explained. Additionally, for two of them a
code fix was implemented and documented.
The first analyzer, the Authorize Analyzer, has a very simple purpose and was used as
an introduction to programming with Roslyn. It scans comments in code for a specific
attribute and reports a warning if it finds that attribute. It provides a code fix to
uncomment that attribute.
The second analyzer, the Naming Analyzer, takes a look at class names and structures.
It can be used to enforce correct usage of design patterns, and it can ban certain terms
from occurring in type names.
Lastly, the Exception Analyzer emulates Java’s compile-time exception checking. If a
method explicitly states in its XML documentation that an exception can be thrown, a
call to that method has to handle the exception.
The analyzers are all open source and available for free use². In the future, they will
also be available in the official NuGet package gallery³.
In terms of potential improvements, the analyzers could all be extended with more
features if needed. For example, the Exception Analyzer could support rethrowing
exceptions if they are documented in the calling method’s XML documentation. The
Authorize Analyzer could support a configuration file and look for attributes other
than [Authorize]. The Naming Analyzer might review class structures of design
patterns more intricately so that even fewer mistakes are made. It could also be
implemented as a stateful analyzer, as that might improve performance if the
configuration can be cached.
¹ https://world-direct.at
² https://github.com/knollsen/csharp_analyzers
³ https://www.nuget.org/profiles/Knollsen
Additionally, there are some ideas for new analyzers, such as one that enforces a specific
way of initializing fields and properties, either in the constructor or in the member
declaration. One analyzer idea that is already partly developed takes care of project
structure by checking import paths. More analyzers will be developed by World-Direct in
the future.
To conclude, the familiarization with the Roslyn compiler’s API was a success. Working
and useful analyzers were implemented; they are now in active use and openly available.
Techniques for dealing with more difficult features were discovered and utilized.
38
Bibliography
[1] Bill Wagner et al. Attributes (C#). https://docs.microsoft.com/en-us/
dotnet/csharp/programming-guide/concepts/attributes/, 15.09.2021. Ac-
cessed: 04.12.2021.
[2] Bill Wagner et al. The .NET Compiler Platform SDK. https://docs.microsoft.
com/en-us/dotnet/csharp/roslyn-sdk/, 15.09.2021. Accessed: 04.12.2021.
[3] Bill Wagner et al. Understand the .NET Compiler Platform SDK model. https://
docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model,
15.09.2021. Accessed: 04.12.2021.
[4] Stephen Walther et al. ASP.NET MVC Controller Overview (C#). https:
//docs.microsoft.com/en-us/aspnet/mvc/overview/older-versions-1/
controllers-and-routing/aspnet-mvc-controllers-overview-cs, 19.02.2020.
Accessed: 04.12.2021.
[5] Diomidis Spinellis. Code Quality: The Open Source Perspective (Effective Software
Development Series). Addison-Wesley Professional, 2006.
[6] Alfred V. Aho; Monica S. Lam; Ravi Sethi; Jeffrey D. Ullman. Compilers: Principles,
Techniques, and Tools. Pearson Education, 2007.
[7] Manish Vasani. Roslyn Cookbook: Compiler as a Service, Code Analysis, Code Qual-
ity and more. Packt Publishing, 2017.
39