Lecture Slides For The Clang Libraries-0.1.0-1
Edition 0.1.0
Michael D. Adams
Linux is a registered trademark of Linus Torvalds. UNIX and X Window System are registered trademarks of The Open Group. Windows is a
registered trademark of Microsoft Corporation. macOS is a registered trademark of Apple Inc. Chrome OS is a registered trademark of Google
LLC. Fedora is a registered trademark of Red Hat, Inc. Ubuntu is a registered trademark of Canonical Ltd. The YouTube logo is a registered
trademark of Google, Inc. The GitHub logo is a registered trademark of GitHub, Inc. The Twitter logo is a registered trademark of Twitter, Inc.
License who has not previously violated the terms of this License with
respect to the Work, or who has received express permission from the
Licensor to exercise rights under this License despite a previous
violation.
h. "Publicly Perform" means to perform public recitations of the Work and
to communicate to the public those public recitations, by any means or
process, including by wire or wireless means or public digital
performances; to make available to the public Works in such a way that
members of the public may access these Works from a place and at a
place individually chosen by them; to perform the Work to the public
by any means or process and the communication to the public of the
performances of the Work, including by public digital performance; to
broadcast and rebroadcast the Work by any means including signs,
sounds or images.
i. "Reproduce" means to make copies of the Work by any means including
without limitation by sound or visual recordings and the right of
fixation and reproducing fixations of the Work, including storage of a
protected performance or phonogram in digital form or other electronic
medium.
2. Fair Dealing Rights. Nothing in this License is intended to reduce,
limit, or restrict any uses free from copyright or rights arising from
limitations or exceptions that are provided for in connection with the
copyright protection under copyright law or other applicable laws.
3. License Grant. Subject to the terms and conditions of this License,
Licensor hereby grants You a worldwide, royalty-free, non-exclusive,
perpetual (for the duration of the applicable copyright) license to
exercise the rights in the Work as stated below:
a. to Reproduce the Work, to incorporate the Work into one or more
Collections, and to Reproduce the Work as incorporated in the
Collections; and,
b. to Distribute and Publicly Perform the Work including as incorporated
in Collections.
The above rights may be exercised in all media and formats whether now
known or hereafter devised. The above rights include the right to make
such modifications as are technically necessary to exercise the rights in
other media and formats, but otherwise you have no rights to make
Adaptations. Subject to 8(f), all rights not expressly granted by Licensor
are hereby reserved, including but not limited to the rights set forth in
Section 4(d).
4. Restrictions. The license granted in Section 3 above is expressly made
subject to and limited by the following restrictions:
a. You may Distribute or Publicly Perform the Work only under the terms
of this License. You must include a copy of, or the Uniform Resource
Identifier (URI) for, this License with every copy of the Work You
Distribute or Publicly Perform. You may not offer or impose any terms
on the Work that restrict the terms of this License or the ability of
the recipient of the Work to exercise the rights granted to that
recipient under the terms of the License. You may not sublicense the
Work. You must keep intact all notices that refer to this License and
to the disclaimer of warranties with every copy of the Work You
Distribute or Publicly Perform. When You Distribute or Publicly
Perform the Work, You may not impose any effective technological
measures on the Work that restrict the ability of a recipient of the
Work from You to exercise the rights granted to that recipient under
the terms of the License. This Section 4(a) applies to the Work as
incorporated in a Collection, but this does not require the Collection
apart from the Work itself to be made subject to the terms of this
License. If You create a Collection, upon notice from any Licensor You
must, to the extent practicable, remove from the Collection any credit
as required by Section 4(c), as requested.
b. You may not exercise any of the rights granted to You in Section 3
above in any manner that is primarily intended for or directed toward
commercial advantage or private monetary compensation. The exchange of
the Work for other copyrighted works by means of digital file-sharing
or otherwise shall not be considered to be intended for or directed
toward commercial advantage or private monetary compensation, provided
there is no payment of any monetary compensation in connection with
the exchange of copyrighted works.
c. If You Distribute, or Publicly Perform the Work or Collections, You
must, unless a request has been made pursuant to Section 4(a), keep
intact all copyright notices for the Work and provide, reasonable to
the medium or means You are utilizing: (i) the name of the Original
Author (or pseudonym, if applicable) if supplied, and/or if the
Original Author and/or Licensor designate another party or parties
(e.g., a sponsor institute, publishing entity, journal) for
attribution ("Attribution Parties") in Licensor’s copyright notice,
Preface
■ In a definition, the term being defined is often typeset in a font like this.
■ To emphasize particular words, the words are typeset in a font like this.
■ To show that particular text is associated with a hyperlink to an internal
target, the text is typeset like this.
■ To show that particular text is associated with a hyperlink to an external
document, the text is typeset like this.
■ URLs are typeset like https://www.ece.uvic.ca/~mdadams.
Compilers
[Figure: toolchain stages: Compiler, Assembler, Linker]
Structure of Compiler
[Figure: compiler structure; recoverable labels: Middle-End: Optimizer, Optimized IR]
Tokens
simple_1.cpp
int add(int x, int y) {
    return x + y;
}
Command
clang -std=c++20 -Xclang -dump-tokens -fsyntax-only simple_1.cpp
[Figure: AST for add. Recoverable nodes: FunctionDecl int add(int, int);
BinaryOperator +; two ImplicitCastExpr nodes, each wrapping DeclRefExpr
(for x and y)]
Command
clang-check -ast-dump -ast-dump-filter=add simple_1.cpp -- \
-fno-color-diagnostics -std=c++20
Command
clang -std=c++20 -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics \
simple_1.cpp
Command
clang++ -O3 -S -emit-llvm -o simple_1-opt.ll simple_1.cpp
LLVM Optimized IR (simple_1-opt.ll)
; ModuleID = 'simple_1.cpp'
source_filename = "simple_1.cpp"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
define dso_local noundef i32 @_Z3addii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
  %3 = add nsw i32 %1, %0
  ret i32 %3
}

attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 14.0.0"}
Command
clang++ -O3 -S -o simple_1-opt.s simple_1.cpp
Optimized x86-64 Assembly Code (simple_1-opt.s)
    .text
    .file "simple_1.cpp"
    .globl _Z3addii # -- Begin function _Z3addii
    .p2align 4, 0x90
    .type _Z3addii,@function
_Z3addii: # @_Z3addii
    .cfi_startproc
# %bb.0:
    # kill: def $esi killed $esi def $rsi
    # kill: def $edi killed $edi def $rdi
    leal (%rdi,%rsi), %eax
    retq
.Lfunc_end0:
    .size _Z3addii, .Lfunc_end0-_Z3addii
    .cfi_endproc
    # -- End function
    .ident "clang version 14.0.0"
    .section ".note.GNU-stack","",@progbits
    .addrsig
Name Mangling
Output
add(int, int)
Output
add(int, int) add(int, int) add(int, int)
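■ mangled names such as _Z3addii can also be decoded programmatically
■ following is minimal sketch, assuming Itanium C++ ABI runtime (as used with GCC and Clang on Linux), which provides abi::__cxa_demangle in <cxxabi.h>
■ helper name demangle is hypothetical (not part of slides) and this is not necessarily how output above was produced

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI runtimes)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: return the demangled form of a mangled name,
// or the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer and length arguments, the result is allocated with
    // malloc, so it must be released with free.
    char* result =
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || result == nullptr) {
        return mangled;
    }
    std::string text(result);
    std::free(result);
    return text;
}
```

■ for example, demangle("_Z3addii") yields add(int, int), matching output shown above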
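declarations:
int n; double x; double y;
 1 switch (n) {
 2 case 0:
 3     y = 0.0;
 4     break;
 5 case 1:
 6     y = 2.0 * x;
 7     break;
 8 case 2:
 9     y = 0.5 * x * x;
10     break;
11 }
12 // ...
[Figure: CFG for code above; nodes 1, 3, 6, 9, and 12 labeled by source line number]
declarations:
int n;
1 while (n > 0) {
2     --n;
3 }
4 // ...
[Figure: CFG for code above; nodes 1, 2, and 4 labeled by source line number]
declarations:
int n;
1 do {
2     --n;
3 } while (n > 0);
4 // ...
[Figure: CFG for code above; nodes 2, 3, and 4 labeled by source line number]
declarations:
int a[1024];
1 for (int i = 0; i < 1024; ++i) {
2     a[i] = 0;
3 }
4 // ...
[Figure: CFG for code above; recoverable node labels: 1a, 1b, 1c, and 2]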
■ variable is said to be live at some point in code if it holds value that may
be needed in future (or equivalently, if its value may be read before next
time variable is written)
■ at each point in code execution, each variable is either live or dead
■ live-variable analysis (also called liveness analysis) is classic data-flow
analysis to calculate variables that are live at each point in code
■ some uses for liveness analysis include:
  □ detecting dead stores (i.e., value written to variable that is never read)
  □ detecting use of uninitialized variables
  □ register allocation (i.e., deciding which variables should be allocated to
    registers)
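■ liveness can be computed by iterating backward data-flow equations to fixed point: in[B] = use[B] ∪ (out[B] − def[B]) and out[B] = union of in[S] over successors S of B
■ following is minimal sketch over toy CFG representation; types Block and Liveness and functions computeLiveness and factorialCfg are hypothetical names introduced for illustration (not part of LLVM/Clang)
■ toy CFG models loop "result = 1; while (n >= 2) {result *= n--;} return result;"

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using VarSet = std::set<std::string>;

// Toy basic block (hypothetical; not an LLVM/Clang type).
struct Block {
    VarSet use;                     // variables read before any write in block
    VarSet def;                     // variables written in block
    std::vector<std::size_t> succ;  // indices of successor blocks
};

struct Liveness {
    std::vector<VarSet> in;   // variables live on entry to each block
    std::vector<VarSet> out;  // variables live on exit from each block
};

// Iterate the backward data-flow equations until nothing changes.
Liveness computeLiveness(const std::vector<Block>& cfg) {
    Liveness live{std::vector<VarSet>(cfg.size()),
                  std::vector<VarSet>(cfg.size())};
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = cfg.size(); i-- > 0;) {
            // out[B] = union of in[S] over all successors S
            VarSet out;
            for (std::size_t s : cfg[i].succ) {
                out.insert(live.in[s].begin(), live.in[s].end());
            }
            // in[B] = use[B] ∪ (out[B] − def[B])
            VarSet in = cfg[i].use;
            for (const auto& v : out) {
                if (cfg[i].def.count(v) == 0) {
                    in.insert(v);
                }
            }
            if (in != live.in[i] || out != live.out[i]) {
                live.in[i] = in;
                live.out[i] = out;
                changed = true;
            }
        }
    }
    return live;
}

// Toy CFG: block 0 initializes result, block 1 tests n, block 2 is the
// loop body, and block 3 returns result.
std::vector<Block> factorialCfg() {
    return {
        Block{{}, {"result"}, {1}},
        Block{{"n"}, {}, {2, 3}},
        Block{{"n", "result"}, {"n", "result"}, {1}},
        Block{{"result"}, {}, {}},
    };
}
```

■ in this toy CFG, only n is live on entry to block 0, since result is written there before being read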
[Figure: flow graph with entry and exit nodes]
llvmorg-15.0.6/llvm-project-15.0.6.src.tar.xz
■ in most cases, probably preferable to obtain LLVM/Clang from Git
repository
■ can query how LLVM was configured at time it was built using
llvm-config program
■ can query such things as:
  □ LLVM build mode (e.g., Debug or Release)
  □ whether LLVM was built with RTTI enabled
  □ whether LLVM was built with assertions enabled
  □ libraries needed to link against various LLVM components
  □ installation directory for LLVM headers
■ for example, to print build mode, RTTI setting, and assertion mode, use:
llvm-config --build-mode --has-rtti --assertion-mode
■ for more information, see:
  □ https://llvm.org/docs/CommandGuide/llvm-config.html
Command
clang++ -ccc-print-phases -c hello.cpp
Output (Standard Error)
+- 0: input, "hello.cpp", c++
+- 1: preprocessor, {0}, c++-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
4: assembler, {3}, object
Command
clang++ -ccc-print-phases hello.cpp
Output (Standard Error)
+- 0: input, "hello.cpp", c++
+- 1: preprocessor, {0}, c++-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
+- 4: assembler, {3}, object
5: linker, {4}, image
simple_2.cpp
int factorial(int n) {
    int result = 1;
    while (n >= 2) {result *= n--;}
    return result;
}
Command
clang -std=c++20 -Xclang -dump-tokens -fsyntax-only simple_2.cpp
[Figure: partial AST for factorial. Recoverable nodes: FunctionDecl
int factorial(int); ParmVarDecl int n; CompoundStmt]
Command
clang-check -ast-dump -ast-dump-filter=factorial simple_2.cpp -- -fno-color-diagnostics -std=c++20
Command
clang -std=c++20 -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics simple_2.cpp
LLVM IR (simple_2.ll)
; ModuleID = 'simple_2.cpp'
source_filename = "simple_2.cpp"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: mustprogress noinline nounwind uwtable
define dso_local noundef i32 @_Z9factoriali(i32 noundef %0) #0 {
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 %0, i32* %2, align 4
  store i32 1, i32* %3, align 4
  br label %4
4: ; preds = %7, %1
  %5 = load i32, i32* %2, align 4
  %6 = icmp sge i32 %5, 2
  br i1 %6, label %7, label %12
7: ; preds = %4
  %8 = load i32, i32* %2, align 4
  %9 = add nsw i32 %8, -1
  store i32 %9, i32* %2, align 4
  %10 = load i32, i32* %3, align 4
  %11 = mul nsw i32 %10, %8
  store i32 %11, i32* %3, align 4
  br label %4, !llvm.loop !4
12: ; preds = %4
  %13 = load i32, i32* %3, align 4
  ret i32 %13
}

; [text deleted]
Output
factorial(int)
Output
factorial(int) factorial(int)
%1:
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 %0, i32* %2, align 4
  store i32 1, i32* %3, align 4
  br label %4
4:
  %5 = load i32, i32* %2, align 4
  %6 = icmp sge i32 %5, 2
  br i1 %6, label %7, label %12 (T: 7, F: 12)
7:
  %8 = load i32, i32* %2, align 4
  %9 = add nsw i32 %8, -1
  store i32 %9, i32* %2, align 4
  %10 = load i32, i32* %3, align 4
  %11 = mul nsw i32 %10, %8
  store i32 %11, i32* %3, align 4
  br label %4, !llvm.loop !4
12:
  %13 = load i32, i32* %3, align 4
  ret i32 %13
CFG for '_Z9factoriali' function
%1:
  %2 = icmp sgt i32 %0, 1
  br i1 %2, label %3, label %9 (T: 3, F: 9)
3:
  %4 = phi i32 [ %7, %3 ], [ 1, %1 ]
  %5 = phi i32 [ %6, %3 ], [ %0, %1 ]
  %6 = add nsw i32 %5, -1
  %7 = mul nsw i32 %4, %5
  %8 = icmp sgt i32 %5, 2
  br i1 %8, label %3, label %9, !llvm.loop !3 (T: 3, F: 9)
9:
  %10 = phi i32 [ 1, %1 ], [ %7, %3 ]
  ret i32 %10
CFG for '_Z9factoriali' function
the-isa-cast-and-dyn-cast-templates
■ C library
■ functionality provided includes:
  □ parsing source code into AST
  □ loading already-parsed ASTs
  □ traversing ASTs
  □ annotating source locations with elements within AST
■ does not provide access to all information in Clang AST
■ API intended to be relatively stable from one release to next
■ intended to provide only basic functionality needed to support
development tools
■ data types prefixed with “CX” and functions prefixed with “clang_”
■ access to AST through high-level abstractions
■ for details on API, see:
  □ https://clang.llvm.org/doxygen/group__CINDEX.html
■ C++ library
■ provides much richer set of functionality relative to LibClang
■ Clang library that provides functionality for utilizing parts of Clang in
standalone tools or Clang compiler plugins
■ provides convenient way to invoke compiler frontend on source code
■ provides support for compilation databases
■ easily integrates with code using CommandLine Library for processing of
command-line arguments
■ code in clang::tooling namespace
■ does not have stable API
Command-Line Processing
Compilation Databases
containing specified source file and each of its successive parents and
loads first one found
  □ autoDetectFromDirectory: looks for compilation database in specified
    directory and each of its successive parents and loads first one found
■ for more information, see:
  □ https://clang.llvm.org/doxygen/classclang_1_1tooling_1_1CompilationDatabase.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 78
Fixed Compilation Database
compile_flags.txt
-DGREETING="Hello, World!"
-DANSWER=42
-g
-O2
-I
/usr/local/libfoo/include
1FixedCompilationDatabase.html
#include <format>
#include <memory>
#include <string>
#include <utility>
#include <vector>
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/CompilationDatabase.h"
#include "llvm/Config/llvm-config.h"
#include "llvm/Support/raw_ostream.h"
#include "utility.hpp"

namespace ct = clang::tooling;

int main(int argc, char** argv) {
    if (argc < 2) {
        llvm::errs() << "no fixed database specified\n";
        return 1;
    }
    // Load fixed compilation database from file named by first argument.
    std::string pathname(argv[1]);
    std::string errString;
    std::unique_ptr<ct::CompilationDatabase> compDatabase;
    compDatabase = ct::FixedCompilationDatabase::loadFromFile(pathname,
      errString);
    if (!compDatabase) {
        llvm::errs() << std::format("ERROR: {}\n", errString);
        return 1;
    }
    // Print all source files known to database.
    std::vector<std::string> sourcePaths = compDatabase->getAllFiles();
    for (const auto& sourcePath : sourcePaths) {
        llvm::outs() << std::format("{}\n", sourcePath);
    }
    // Print all compile commands known to database.
    std::vector<ct::CompileCommand> compCommands =
      compDatabase->getAllCompileCommands();
    printCompCommands(llvm::outs(), compCommands);
    // Print compile commands for each file named by remaining arguments.
    for (int i = 2; i < argc; ++i) {
        std::vector<ct::CompileCommand> fileCommands =
          compDatabase->getCompileCommands(argv[i]);
        printCompCommands(llvm::outs(), fileCommands);
    }
    return 0;
}
[
    {
        "arguments" : [
            "/usr/bin/clang++",
            "-Irelative",
            "-DGREET=Hello, World!\\n",
            "-c",
            "-o",
            "file.o",
            "file.cpp"
        ],
        "directory" : "/home/user/llvm/build",
        "file" : "file.cpp"
    },
    {
        "command" : "/usr/bin/clang++ -Irelative -DGREET=\"Hello, World!\\\\n\" -c -o file.o file.cpp",
        "directory" : "/home/user/llvm/build",
        "file" : "file2.cpp"
    }
]
1JSONCompilationDatabase.html
    std::vector<ct::CompileCommand> compCommands =
      compDatabase->getAllCompileCommands();
    printCompCommands(llvm::outs(), compCommands);
    for (int i = 2; i < argc; ++i) {
        std::vector<ct::CompileCommand> compCommands =
          compDatabase->getCompileCommands(argv[i]);
        printCompCommands(llvm::outs(), compCommands);
    }
    return 0;
}
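#include <vector>
#include "clang/Tooling/CompilationDatabase.h" // clang::tooling::CompileCommand
#include "llvm/Support/raw_ostream.h"

// Print given compile commands to given stream.
bool printCompCommands(llvm::raw_fd_ostream& out,
  const std::vector<clang::tooling::CompileCommand>& compCommands);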
html#a8dcb3e0419f4f8de952b46ad1c627f68
1ArgumentsAdjustingCompilations.html
// Wrap compilation database so that arguments of its compile commands
// are adjusted according to value of adjust.
std::unique_ptr<ct::CompilationDatabase> wrapCompDatabase(
  std::unique_ptr<ct::CompilationDatabase> compDatabase, int adjust) {
    compDatabase = std::make_unique<ct::ArgumentsAdjustingCompilations>(
      std::move(compDatabase));
    auto aac = static_cast<ct::ArgumentsAdjustingCompilations*>(
      compDatabase.get());
    switch (adjust) {
    case 1:
        // Adjust commands to perform syntax checking only.
        aac->appendArgumentsAdjuster(ct::getClangSyntaxOnlyAdjuster());
        break;
    case 2:
        // Insert -DFOO at beginning of argument list.
        aac->appendArgumentsAdjuster(ct::getInsertArgumentAdjuster("-DFOO",
          ct::ArgumentInsertPosition::BEGIN));
        break;
    }
    return compDatabase;
}
ASTs
  □ PointerType
  □ ArrayType
  □ RecordType
  □ FunctionType
■ TypeLoc: represents source information regarding type
■ NestedNameSpecifier: represents C++ nested-name specifier (e.g.,
std::vector<int>)
■ NestedNameSpecifierLoc: represents C++ nested-name specifier
augmented with source location information
■ CXXBaseSpecifier: represents base class of C++ class
■ CXXCtorInitializer: represents C++ base or member initializer
■ TemplateArgument: represents template argument
■ TemplateArgumentLoc: represents source information regarding
template argument
■ LambdaCapture: describes capture for lambda expression (e.g., capture
of variable or of this)
■ Attr: represents attribute information (e.g., C++ attribute on type,
function, or statement)
■ clang::DynTypedNode class used to represent generic AST node (i.e.,
AST node of any type)
■ get<T> template method returns pointer to underlying node as pointer to
T or null if underlying node is not of that type
■ getUnchecked<T> template method similar to get but underlying node
required to be of type T
■ clang::DynTypedNodeList class used to represent list of generic AST
nodes
Frontend Actions
1ClangTool.html
source file
  □ EndSourceFileAction: invoked just after finishing processing of source
    file
  □ ExecuteAction: performs main task for frontend action
  □ CreateASTConsumer: factory function used to create instance of
    ASTConsumer, which provides callbacks to be invoked at particular points in
    processing of AST
■ for more information, see:
  □ https://clang.llvm.org/doxygen/classclang_1_1FrontendAction.html
Frontend Actions
1FrontendActionFactory.html
1CompilerInstance.html
Preprocessor-Related Processing
https://clang.llvm.org/doxygen/classclang_1_1Preprocessor.html
https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html
#include <format>
#include <iostream>
#include <string>
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Lex/PPCallbacks.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace ct = clang::tooling;
using namespace std::literals;

static llvm::cl::OptionCategory toolCategory("Tool Options");

// Format source location as file:line:column (using spelling location).
std::string locationToString(const clang::SourceManager& sourceManager,
  clang::SourceLocation sourceLoc) {
    return std::format("{}:{}:{}",
      std::string(sourceManager.getFilename(sourceLoc)),
      sourceManager.getSpellingLineNumber(sourceLoc),
      sourceManager.getSpellingColumnNumber(sourceLoc));
}
1ASTFrontendAction.html
parsed
  □ HandleCXXImplicitFunctionInstantiation: called when function
    implicitly instantiated
■ for more information, see:
  □ https://clang.llvm.org/doxygen/classclang_1_1ASTConsumer.html
std::unique_ptr<clang::ASTConsumer> MyFrontendAction::CreateASTConsumer(
  clang::CompilerInstance& compInstance, llvm::StringRef inFile) {
    const clang::LangOptions& langOpts = compInstance.getLangOpts();
    // Convert StringRef to std::string, since std::format has no formatter
    // for llvm::StringRef.
    llvm::outs() << std::format("{}\nlanguage: {}\nstandard: {}\nname: {}\n\n",
      inFile.str(), langKindToLangString(langOpts.LangStd),
      langKindToStdString(langOpts.LangStd),
      langKindToNameString(langOpts.LangStd));
    return clang::SyntaxOnlyAction::CreateASTConsumer(compInstance, inFile);
}

static llvm::cl::OptionCategory toolOptions("Tool Options");

int main(int argc, char** argv) {
    // Parse common tool options (e.g., compilation database and source paths).
    auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
      const_cast<const char**>(argv), toolOptions);
    if (!expectedOptionsParser) {
        llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
        return 1;
    }
    ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
    ct::ClangTool tool(optionsParser.getCompilations(),
      optionsParser.getSourcePathList());
    // Run frontend action over each source file.
    int status = tool.run(
      ct::newFrontendActionFactory<MyFrontendAction>().get());
    if (status) {llvm::errs() << "error detected\n";}
    return !status ? 0 : 1;
}
https://clang.llvm.org/doxygen/classclang_1_1ASTContext.html
1RecursiveASTVisitor.html
hierarchy (from node’s dynamic type to top-most class) for single node and
then call visit method for that node (e.g.,
WalkUpFromCXXConstructorDecl)
3 visit method (VisitType): handles visitation of single node based on its
type by calling user-specified function (e.g., VisitFunctionDecl,
VisitVarDecl)
■ traverse, walk-up, and visit methods have bool return type which
indicates if traversal should continue
2 WalkUpFromNamespaceDecl
3 WalkUpFromNamedDecl
4 WalkUpFromDecl
5 VisitDecl
6 VisitNamedDecl
7 VisitNamespaceDecl
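■ call sequence above follows from each walk-up method first calling walk-up method of parent class and then calling visit method for its own class
■ following is toy sketch of this mechanism using CRTP; classes Decl, NamedDecl, and NamespaceDecl here are simplified stand-ins (not real Clang classes), and ToyVisitor, Recorder, and visitOrder are hypothetical names
■ real RecursiveASTVisitor additionally recurses into child nodes in traverse methods

```cpp
#include <string>
#include <vector>

// Toy stand-ins for three levels of declaration hierarchy (not Clang's).
struct Decl { virtual ~Decl() = default; };
struct NamedDecl : Decl {};
struct NamespaceDecl : NamedDecl {};

template <class Derived>
class ToyVisitor {
public:
    // Traverse: entry point for a node (child recursion omitted in this toy).
    bool TraverseNamespaceDecl(NamespaceDecl* d) {
        return derived().WalkUpFromNamespaceDecl(d);
    }
    // Walk-up: call the parent's walk-up first, then this class's visit.
    bool WalkUpFromDecl(Decl* d) { return derived().VisitDecl(d); }
    bool WalkUpFromNamedDecl(NamedDecl* d) {
        return derived().WalkUpFromDecl(d) && derived().VisitNamedDecl(d);
    }
    bool WalkUpFromNamespaceDecl(NamespaceDecl* d) {
        return derived().WalkUpFromNamedDecl(d) &&
            derived().VisitNamespaceDecl(d);
    }
    // Visit: default implementations do nothing and continue traversal.
    bool VisitDecl(Decl*) { return true; }
    bool VisitNamedDecl(NamedDecl*) { return true; }
    bool VisitNamespaceDecl(NamespaceDecl*) { return true; }
private:
    Derived& derived() { return *static_cast<Derived*>(this); }
};

// User-defined visitor that records the order of visit calls.
class Recorder : public ToyVisitor<Recorder> {
public:
    bool VisitDecl(Decl*) { calls.push_back("VisitDecl"); return true; }
    bool VisitNamedDecl(NamedDecl*) {
        calls.push_back("VisitNamedDecl");
        return true;
    }
    bool VisitNamespaceDecl(NamespaceDecl*) {
        calls.push_back("VisitNamespaceDecl");
        return true;
    }
    std::vector<std::string> calls;
};

// Traverse a single toy namespace node and return the recorded call order.
std::vector<std::string> visitOrder() {
    NamespaceDecl node;
    Recorder recorder;
    recorder.TraverseNamespaceDecl(&node);
    return recorder.calls;
}
```

■ visit methods run from top-most class down to node's dynamic type; returning false from any of them stops remaining calls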
#include <format>
#include <vector>
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendAction.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace ct = clang::tooling;

class MyAstVisitor : public clang::RecursiveASTVisitor<MyAstVisitor> {
public:
    MyAstVisitor(clang::ASTContext& astContext) : astContext_(&astContext),
      stack_() {}
    // Override traverse method for C++ class/struct/union declarations.
    bool TraverseCXXRecordDecl(clang::CXXRecordDecl* recDecl);
private:
    using Base = clang::RecursiveASTVisitor<MyAstVisitor>;
    void printStack() const;
    clang::ASTContext* astContext_;
    // Record declarations currently being traversed.
    std::vector<const clang::CXXRecordDecl*> stack_;
};
https://clang.llvm.org/doxygen/classclang_1_1SourceManager.html
https://clang.llvm.org/doxygen/classclang_1_1SourceLocation.html
clang::SourceRange Type
■ clang::SourceRange represents contiguous part of source code
■ SourceRange represents range of tokens
■ essentially pair of SourceLocation objects (i.e., one SourceLocation
object for each of begin and end locations)
■ range is inclusive at both ends (i.e., both begin and end refer to elements in range)
■ begin location specifies location of first character of first token in range
(obtained via getBegin)
■ end location specifies location of first character of last token in range
(obtained via getEnd)
■ can check for invalid value with isValid member function
■ some AST node types have getSourceRange member function to obtain
range of tokens related to AST node (e.g., FunctionDecl and VarDecl)
■ for more information, see:
  □ https://clang.llvm.org/doxygen/classclang_1_1SourceRange.html
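■ since end location refers to first character of last token, extracting text of token range requires extending end by length of last token
■ following is toy sketch in which locations are plain character offsets; ToyTokenRange, tokenLengthAt, and textOf are hypothetical names and tokenizer is deliberately crude (not Clang's lexer)
■ in Clang itself, clang::Lexer::getLocForEndOfToken and clang::CharSourceRange serve similar purpose

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy token range (not Clang's API): both members are character offsets.
struct ToyTokenRange {
    std::size_t begin;  // offset of first character of first token
    std::size_t end;    // offset of first character of LAST token
};

// Crude token length: a run of identifier/digit characters, or otherwise
// a single punctuation character.
std::size_t tokenLengthAt(const std::string& text, std::size_t offset) {
    auto isWord = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    if (offset >= text.size()) {
        return 0;
    }
    if (!isWord(static_cast<unsigned char>(text[offset]))) {
        return 1;
    }
    std::size_t length = 0;
    while (offset + length < text.size() &&
      isWord(static_cast<unsigned char>(text[offset + length]))) {
        ++length;
    }
    return length;
}

// Extract the text covered by a token range by extending the end offset
// past the last token.
std::string textOf(const std::string& text, ToyTokenRange range) {
    std::size_t endChar = range.end + tokenLengthAt(text, range.end);
    return text.substr(range.begin, endChar - range.begin);
}
```

■ for text "int sum = lhs + rhs;", token range {0, 16} covers "int sum = lhs + rhs", whereas treating end as one-past-the-end character offset would drop last token rhs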
FunctionDecl and Source Locations (Declaration Only)
[Figures: series of annotated function declarations marking locations
returned by getLocation(), getBeginLoc(), and getEndLoc()]
■ source range:
functionDecl->getReturnTypeSourceRange().getBegin(),
functionDecl->getReturnTypeSourceRange().getEnd()
[Figures: call expressions func(a, b, c, d); and (*func_ptr)(a, b, c, d);
annotated with locations returned by getBeginLoc() and getEndLoc()]
https://clang.llvm.org/doxygen/classclang_1_1CharSourceRange.html
  □ getExpansionRange
Preprocessor Output
           11111111112222222
  12345678901234567890123456
2 int foo (int x) {return x ;}
3 int main () {
4 return foo (42);
5 }
Preprocessor Output
           11111111112222222
  12345678901234567890123456
4 int x = 42;
#include <format>
#include <string>
#include "clang/Basic/SourceManager.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Lex/Lexer.h"

// Format source location as file:line(column) (using spelling location).
std::string locationToString(const clang::SourceManager& sourceManager,
  clang::SourceLocation sourceLoc) {
    return std::format("{}:{}({})",
      std::string(sourceManager.getFilename(sourceLoc)),
      sourceManager.getSpellingLineNumber(sourceLoc),
      sourceManager.getSpellingColumnNumber(sourceLoc));
}

// Format source range as begin-end, omitting filename of end location
// when it matches filename of begin location.
std::string rangeToString(const clang::SourceManager& sourceManager,
  clang::SourceRange sourceRange) {
    std::string beginFilename(sourceManager.getFilename(
      sourceRange.getBegin()));
    std::string endFilename(sourceManager.getFilename(sourceRange.getEnd()));
    return std::format("{}:{}({})-{}{}({})", beginFilename,
      sourceManager.getSpellingLineNumber(sourceRange.getBegin()),
      sourceManager.getSpellingColumnNumber(sourceRange.getBegin()),
      endFilename != beginFilename ? endFilename + ":" : "",
      sourceManager.getSpellingLineNumber(sourceRange.getEnd()),
      sourceManager.getSpellingColumnNumber(sourceRange.getEnd()));
}
Diagnostics
1DiagnosticConsumer.html
■ Clang libraries provide mechanism for finding AST nodes that match
specific criteria as determined by some matching predicate
■ predicate embodied by AST matcher type
■ can match nodes that correspond to declarations, statements,
expressions, and types (amongst other things)
■ AST matcher API designed such that expressions involving matchers
have very natural syntax
■ this syntax can be thought of as domain-specific language for AST node
matching
■ as will be seen later, clang-query tool supports similar syntax
AST Matchers
■ AST matcher is class that holds predicate used to test for match
■ three categories of AST matchers:
1 node matchers: match specific type of AST node
2 narrowing matchers: match attributes on AST nodes
3 traversal matchers: allow traversal between AST nodes
■ library provides very rich set of predefined AST matchers
■ predefined AST matchers in clang::ast_matchers namespace
■ library also allows custom AST matchers to be defined by user
■ for list of AST matchers provided by library, see:
  □ https://clang.llvm.org/docs/LibASTMatchersReference.html
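■ three categories of matchers can be illustrated with toy model in which matcher is simply predicate on node
■ following sketch is not Clang's implementation; Node, Matcher, nodeOfKind, hasDescendant, countMatches, and sampleTree are hypothetical names, and hasName here only mimics spirit of Clang matcher of same name
■ nodeOfKind plays role of node matcher, hasName of narrowing matcher, and hasDescendant of traversal matcher

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy AST node (hypothetical; far simpler than Clang's node classes).
struct Node {
    std::string kind;  // e.g., "FunctionDecl" or "CallExpr"
    std::string name;
    std::vector<Node> children;
};

using Matcher = std::function<bool(const Node&)>;

// Node matcher: matches one kind of node and then applies inner matchers.
Matcher nodeOfKind(std::string kind, std::vector<Matcher> inner = {}) {
    return [kind = std::move(kind), inner = std::move(inner)](const Node& n) {
        if (n.kind != kind) {
            return false;
        }
        for (const auto& m : inner) {
            if (!m(n)) {
                return false;
            }
        }
        return true;
    };
}

// Narrowing matcher: tests an attribute of the current node.
Matcher hasName(std::string name) {
    return [name = std::move(name)](const Node& n) { return n.name == name; };
}

// Traversal matcher: moves from the current node to its descendants.
Matcher hasDescendant(Matcher inner) {
    return [inner = std::move(inner)](const Node& n) {
        std::function<bool(const Node&)> search = [&](const Node& parent) {
            for (const auto& child : parent.children) {
                if (inner(child) || search(child)) {
                    return true;
                }
            }
            return false;
        };
        return search(n);
    };
}

// Count the nodes in the tree rooted at root that satisfy the matcher.
int countMatches(const Node& root, const Matcher& matcher) {
    int count = matcher(root) ? 1 : 0;
    for (const auto& child : root.children) {
        count += countMatches(child, matcher);
    }
    return count;
}

// Toy tree: function square, plus function main containing two calls.
Node sampleTree() {
    Node call1{"CallExpr", "square", {}};
    Node call2{"CallExpr", "square", {}};
    Node square{"FunctionDecl", "square", {}};
    Node mainFn{"FunctionDecl", "main", {call1, call2}};
    return Node{"TranslationUnitDecl", "", {square, mainFn}};
}
```

■ composing matchers by nesting calls, e.g., nodeOfKind("FunctionDecl", {hasDescendant(nodeOfKind("CallExpr"))}), gives expression syntax loosely resembling real matcher expressions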
Name           Description
functionDecl   matches FunctionDecl node (function declaration)
cxxMethodDecl  matches CXXMethodDecl node (class/union/struct method declaration)
cxxRecordDecl  matches CXXRecordDecl node (C++ class/union/struct declaration)
varDecl        matches VarDecl node (variable declaration)
callExpr       matches CallExpr node (call expression)
declRefExpr    matches DeclRefExpr node (expression referring to declared entity)
■ ASTs for source files only generated at program startup (so any
subsequent changes to source files not considered)
■ source code can be found in clang-tools-extra/clang-query
directory of LLVM Git repository
■ completion functionality requires LLVM/Clang built to use Editline library
(a.k.a. libedit)?
■ for more information, see:
  □ https://firefox-source-docs.mozilla.org/code-quality/static-analysis/writing-new/clang-query.html
Match #1:
/home/jdoe/example_1.cpp:5:15: note: "call" binds here
 std::cout << square(2) + square(3) << '\n';
              ^~~~~~~~~
Match #2:
/home/jdoe/example_1.cpp:5:27: note: "call" binds here
 std::cout << square(2) + square(3) << '\n';
                          ^~~~~~~~~
2 matches.
■ Clang generates many AST nodes for constructs not spelled explicitly in
source
■ in clang-query, all implicit AST nodes can be ignored when matching by
using:
set traversal IgnoreUnlessSpelledInSource
2 unsigned __int128
■ unlike simple AST matcher, polymorphic AST matcher can match against
multiple types of nodes, where types not related by inheritance
■ polymorphic AST matcher is object of template type
clang::ast_matchers::internal::PolymorphicMatcher, which
includes template parameter to capture set of possible node types that
can be matched
■ provides same interface as simple matcher (e.g., provides matches
method)
■ AST matcher function is function that returns AST matcher (i.e., factory
function for matcher)
■ matcher function can be defined to take parameters
■ in practice, usually number of parameters between zero and two
■ matcher function can take another matcher object as parameter, which
allows matching predicate result to depend on criterion from another
matcher
■ can overload matcher functions
■ clang::ast_matchers::MatchFinder class:
  □ provides mechanism for traversing AST in order to find matching nodes
■ matcher class, normally chosen from one of numerous matcher classes in
clang::ast_matchers:
  □ holds predicate used to determine if node is match
■ match-callback class, derived from
clang::ast_matchers::MatchFinder::MatchCallback:
  □ specifies actions to be taken when match found
■ clang::ast_matchers::MatchFinder::MatchResult class:
  □ holds match result
■ clang::ast_matchers::MatchFinder class provides mechanism for
finding matches over AST
■ after creation, can add one or more matchers via calls to addMatcher
■ addMatcher has several overloads with signature of form:
void addMatcher(const MatcherType&, MatchCallback*)
■ newASTConsumer method returns AST consumer that will trigger specified
callbacks at appropriate points in matching process
■ can run matcher over entire AST (for given ASTContext) with matchAST
method
■ can run matcher against single AST node with match method
■ can generate frontend-action factory for MatchFinder using overload of
clang::tooling::newFrontendActionFactory function, which will
automatically invoke matchAST method
■ clang::ast_matchers::MatchFinder::MatchResult class holds all
information for match found
■ public data members of class include:
  □ Nodes: collection of nodes bound on current match, represented by
    BoundNodes class
  □ Context: ASTContext instance associated with match
  □ SourceManager: SourceManager instance associated with match
■ clang::ast_matchers::MatchFinder::MatchCallback class
provides abstract interface for specifying callbacks that are invoked at
particular stages of matching process
■ each callback is virtual function
■ library user inherits from MatchCallback class and provides desired
behavior by overriding appropriate virtual functions
■ some callbacks include:
  □ run: called for each match
  □ onStartOfTranslationUnit: called at start of each translation unit
  □ onEndOfTranslationUnit: called at end of each translation unit
#include <format>
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"
#include "utilities.hpp"

namespace ct = clang::tooling;
namespace cam = clang::ast_matchers;

// Get location of first column of line containing given location.
clang::SourceLocation getLineStart(const clang::SourceManager& sourceManager,
  clang::SourceLocation loc) {
    return sourceManager.translateLineCol(sourceManager.getFileID(loc),
      sourceManager.getSpellingLineNumber(loc), 1);
}

// Get location at end of line containing given location (requested by
// passing largest possible column number).
clang::SourceLocation getLineEnd(const clang::SourceManager& sourceManager,
  clang::SourceLocation loc) {
    return sourceManager.translateLineCol(sourceManager.getFileID(loc),
      sourceManager.getSpellingLineNumber(loc), ~0U);
}
■ VariantMatcher class (in clang::ast_matchers::dynamic
namespace) can be used to store any type of matcher (e.g.,
DeclarationMatcher, StatementMatcher, and so on)
■ allows simple and polymorphic matchers to be represented by single
object type
■ can be very convenient in code that must deal with multiple types of
matchers at once
■ ast_matchers::MatchFinder class provides addDynamicMatcher
method to add dynamic matcher to MatchFinder instance
■ internal::DynTypedMatcher class used by VariantMatcher to
represent dynamically-typed matcher
ambiguous)
AST_MATCHER_P2, AST_MATCHER_P2_OVERLOAD
2 AST_POLYMORPHIC_MATCHER, AST_POLYMORPHIC_MATCHER_P,
AST_POLYMORPHIC_MATCHER_P_OVERLOAD,
AST_POLYMORPHIC_MATCHER_P2,
AST_POLYMORPHIC_MATCHER_P2_OVERLOAD
■ family of macros for defining AST matcher by specifying matcher factory
function, including:
2 AST_MATCHER_FUNCTION, AST_MATCHER_FUNCTION_P,
AST_MATCHER_FUNCTION_P_OVERLOAD
■ numerous other macros for defining AST matchers (e.g., for handling type
traverse matchers and regex parameters)
■ for more information, see:
2 https://clang.llvm.org/doxygen/ASTMatchersMacros_8h.html
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ ASTMatchFinder class (i.e., type of Finder variable) provides
getASTContext method to access ASTContext instance
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_MATCHER_P_OVERLOAD macro similar to AST_MATCHER_P macro,
except adds extra ID parameter used to disambiguate overloads of
overloaded predicate
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 230
AST_MATCHER_P2 Macro
■ AST_MATCHER_P2 macro defines simple matcher from predicate
■ has syntax:
AST_MATCHER_P2(Type, DefineMatcher, ParamType1, Param1,
ParamType2, Param2)
■ defines two-parameter predicate on nodes of type Type and
corresponding matcher function named DefineMatcher
■ parameter named Param1 of type const ParamType1& and parameter
named Param2 of type const ParamType2&
■ predicate returns bool indicating if node matches
■ provides variables:
2 Node: AST node being matched (of type const Type&)
2 Finder: AST match finder (of type
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_MATCHER_P2_OVERLOAD macro similar to AST_MATCHER_P2, except
adds extra ID parameter used for disambiguation in case of overloaded
matcher function
AST_MATCHER_FUNCTION Macro
ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_POLYMORPHIC_MATCHER_P_OVERLOAD analogous to
AST_MATCHER_P_OVERLOAD, except for polymorphic matcher
AST_POLYMORPHIC_MATCHER_P2 Macro
■ AST_POLYMORPHIC_MATCHER_P2 macro defines polymorphic matcher
from predicate (i.e., polymorphic analog of AST_MATCHER_P2)
■ has syntax:
AST_POLYMORPHIC_MATCHER_P2(DefineMatcher, ReturnTypes,
ParamType1, Param1, ParamType2, Param2)
■ similar to AST_POLYMORPHIC_MATCHER macro but defines matcher with
two-parameter predicate
■ provides variables/types (similar to AST_POLYMORPHIC_MATCHER):
2 NodeType: type of node being matched (which is one of types specified by
ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type
clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type
clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_POLYMORPHIC_MATCHER_P2_OVERLOAD analogous to
AST_MATCHER_P2_OVERLOAD, except for polymorphic matcher
AST Matcher Example: Summary
// Node matcher for attributed statements.
const cam::internal::VariadicDynCastAllOfMatcher<clang::Stmt,
  clang::AttributedStmt> attributedStmt;

// Polymorphic matcher that matches (or not) based solely on condition.
AST_POLYMORPHIC_MATCHER_P(boolean, AST_POLYMORPHIC_SUPPORTED_TYPES(
  clang::Decl, clang::Stmt, clang::Type, clang::TypeLoc), bool, condition)
  {return condition;}

// Matches attributed statement with at least threshold attributes.
AST_MATCHER_P(clang::AttributedStmt, attrCountAtLeast, unsigned, threshold)
  {return Node.getAttrs().size() >= threshold;}

// Matches constant-size array type whose size is in range [low, high].
AST_MATCHER_P2(clang::ConstantArrayType, isSizeBetween, unsigned long,
  low, unsigned long, high) {
    return Node.getSize().getZExtValue() >= low &&
      Node.getSize().getZExtValue() <= high;
}

// Matcher factory function with no parameters.
AST_MATCHER_FUNCTION(cam::internal::Matcher<clang::FunctionDecl>,
  isFunctionOfInterest) {
    using namespace cam;
    return functionDecl(hasAnyName("::abs", "::max", "::min"));
}

// Matcher factory function with one parameter.
AST_MATCHER_FUNCTION_P(cam::internal::Matcher<clang::CallExpr>,
  callToFuncWithName, std::string, name) {
    using namespace cam;
    return callExpr(callee(functionDecl(hasName(name))));
}
■ clang::ast_matchers::internal::BoundNodesTreeBuilder class
used during AST-node matching process to record matches and names
bound to nodes
■ addMatch method adds another BoundNodesTreeBuilder tree as
branch in tree
■ setBinding method adds name binding for node
■ BoundNodesTreeBuilder class and addMatch method useful for custom
matchers that have “for-each” kind of behavior (i.e., that need to iterate
over every match obtained from some inner matcher)
■ addMatch method often used to add match information obtained from
inner matcher to match results for outer matcher
■ use of AST matchers can often result in more concise code (relative to
AST visitors) by eliminating boilerplate
■ for example, if pattern involves descendants/ancestors, approach based
on AST visitor would need additional boilerplate to locate those
descendants/ancestors, whereas in AST matcher case, library itself
provides boilerplate to locate and bind to relevant nodes
■ AST visitors can often be better suited to searching when patterns involve
variable number of nodes and/or complex relationships between nodes
■ use whichever approach best suited for task at hand
#include <format>
#include <stack>
#include <type_traits>
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendAction.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace ct = clang::tooling;

static llvm::cl::OptionCategory toolOptions("Tool Options");
#include <cassert>
#include <format>
#include <map>
#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/AST/ParentMapContext.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace ct = clang::tooling;
namespace cam = clang::ast_matchers;

static llvm::cl::OptionCategory optionCategory("Tool options");

// Get parent of stmt that has type NodeType (or null pointer if no such
// parent); at most one such parent is expected.
template<class NodeType>
const NodeType* getParentOfStmt(clang::ASTContext& astContext,
  const clang::Stmt* stmt) {
    auto parents = astContext.getParents(*stmt);
    const NodeType* parent = nullptr;
    for (auto&& node : parents) {
        if (auto p = node.get<NodeType>()) {
            assert(!parent);
            parent = p;
        }
    }
    return parent;
}

// Get nesting depth of for loop (counting loop itself) by walking up chain
// of statement ancestors.
unsigned getForDepth(clang::ASTContext& astContext,
  const clang::Stmt* forStmt) {
    assert(llvm::isa<clang::ForStmt>(forStmt) ||
      llvm::isa<clang::CXXForRangeStmt>(forStmt));
    unsigned count = 1;
    const clang::Stmt* curStmt = forStmt;
    while ((curStmt = getParentOfStmt<clang::Stmt>(astContext, curStmt))) {
        if (llvm::isa<clang::ForStmt>(curStmt) ||
          llvm::isa<clang::CXXForRangeStmt>(curStmt)) {++count;}
    }
    return count;
}
// NOTE: MyMatchCallback (passed to addMatcher below) is defined in part of
// listing not shown here.

// Match innermost for loops (i.e., for loops that do not contain another
// for loop) inside functions in main file.
cam::StatementMatcher getMatcher() {
    using namespace cam;
    auto f = anyOf(forStmt(), cxxForRangeStmt());
    return stmt(f, hasAncestor(functionDecl(isExpansionInMainFile()).bind(
      "func")), unless(hasDescendant(stmt(f)))).bind("for");
}

struct MyAstConsumer : public clang::ASTConsumer {
    void HandleTranslationUnit(clang::ASTContext& astContext) final {
        MyMatchCallback matchCallback;
        cam::StatementMatcher matcher = getMatcher();
        cam::MatchFinder matchFinder;
        matchFinder.addMatcher(matcher, &matchCallback);
        matchFinder.matchAST(astContext);
    }
};

struct MyFrontendAction : public clang::ASTFrontendAction {
    std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
      clang::CompilerInstance&, clang::StringRef fileName) final {
        llvm::outs() << std::format("PROCESSING SOURCE FILE {}\n",
          std::string(fileName));
        return std::make_unique<MyAstConsumer>();
    }
};
References
■ many good examples of AST matchers can be found in source code for
clang-tidy (in clang-tools-extra/clang-tidy directory of LLVM
source tree)
■ various talks and articles by Stephen Kelly (as well as talks/articles by Eli
Bendersky and others) listed in References section
■ search for “[clang-ast-matchers]” on StackOverflow
■ yet more examples of AST matchers can be found in
https://github.com/lanl/CoARCT
https://clang.llvm.org/docs/InternalsManual.html#the-cfg-class
CFG Pretty Printer Example: Summary
Program Output
FUNCTION: abs
[B4 (ENTRY)]
  Succs (1): B3
[B1]
  1: x (ImplicitCastExpr, LValueToRValue, int)
  2: return [B1.1];
  Preds (1): B3
  Succs (1): B0
[B2]
  1: -x
  2: return [B2.1];
  Preds (1): B3
  Succs (1): B0
[B3]
  1: x < 0
  T: if [B3.1]
  Preds (1): B4
  Succs (2): B2 B1
[B0 (EXIT)]
  Preds (2): B1 B2
#include <format>
#include <string>
#include "clang/Analysis/CFG.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Basic/LangOptions.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace cam = clang::ast_matchers;
namespace ct = clang::tooling;
namespace lc = llvm::cl;

// Command-line options: -f (function-name pattern) and -c (use color).
static lc::OptionCategory toolCategory("Tool Options");
static lc::opt<std::string> clFuncNamePattern("f", lc::cat(toolCategory),
  lc::init(".*"));
static lc::opt<bool> clUseColor("c", lc::cat(toolCategory), lc::init(false));
statements)
2 size: get number of elements in block
2 succ_begin and succ_end: return range corresponding to blocks that are
successors to block
2 succ_size: get number of successor blocks
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1CFGBlock.html
human-readable format
2 getKind: get kind of element (e.g., statement, constructor)
2 getAs: get as specified type or return empty optional if does not have
specified type
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1CFGElement.html
#include <format>
#include "clang/Analysis/CFG.h"
#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/raw_ostream.h"

namespace ct = clang::tooling;
namespace cam = clang::ast_matchers;

// Command-line option: -t (complexity threshold).
static llvm::cl::OptionCategory toolCategory("Tool Options");
static llvm::cl::opt<unsigned int> thresholdOption("t",
  llvm::cl::init(0), llvm::cl::desc("Set complexity threshold."),
  llvm::cl::cat(toolCategory));
Code Analysis
1AnalysisDeclContextManager.html
1AnalysisDeclContext.html#details
https://clang.llvm.org/doxygen/classclang_1_1LiveVariables.html
Program Output
FUNCTION: foo
[ B0 (live variables at block exit) ]
[ B1 (live variables at block exit) ]
[ B2 (live variables at block exit) ]
 t </home/jdoe/example_3_b.cpp:2:6>
[ B3 (live variables at block exit) ]
 t </home/jdoe/example_3_b.cpp:2:6>
[ B4 (live variables at block exit) ]
[ B5 (live variables at block exit) ]
 x </home/jdoe/example_3_b.cpp:1:13>
 y </home/jdoe/example_3_b.cpp:1:20>
#include "clang/AST/ASTContext.h"
void analyzeFunc(clang::ASTContext& astContext, const clang::FunctionDecl*
  funcDecl, bool printCfg);
#include <cassert>
#include <memory>
#include "clang/AST/ASTContext.h"
#include "clang/Analysis/CFG.h"
#include "clang/Analysis/AnalysisDeclContext.h"
#include "clang/Analysis/Analyses/LiveVariables.h"

void analyzeFunc(clang::ASTContext& astContext, const clang::FunctionDecl*
  funcDecl, bool printCfg) {
    // Create analysis context (which manages CFG) for function.
    clang::AnalysisDeclContextManager adcm(astContext);
    clang::AnalysisDeclContext *adc = adcm.getContext(
      llvm::cast<clang::Decl>(funcDecl));
    assert(adc);
    adc->getCFGBuildOptions().setAllAlwaysAdd();
    // CFG construction can fail, in which case null pointer is returned.
    const clang::CFG* cfg = adc->getCFG();
    if (!cfg) {return;}
    if (printCfg)
      {cfg->print(llvm::outs(), astContext.getLangOpts(), false);}
    // Run live-variables analysis and print variables live at exit of
    // each block.
    clang::LiveVariables *lv = adc->getAnalysis<clang::LiveVariables>();
    if (!lv) {return;}
    auto observer = std::make_unique<clang::LiveVariables::Observer>();
    lv->runOnAllBlocks(*observer);
    lv->dumpBlockLiveness((funcDecl->getASTContext()).getSourceManager());
}
Declarations
■ DeclarationName class represents name of declaration
■ declaration names can name entities such as:
2 simple identifiers
2 class constructors and destructors
2 overloaded operators
2 conversion functions
■ getNameKind member of DeclarationName returns value indicating kind
of name stored
■ DeclarationName instances can be compared
■ see also:
https://clang.llvm.org/docs/InternalsManual.html#declaration-names
Types
■ syntactic sugar: syntax intended to make code more readable (e.g., type
alias introduced by typedef or using)
■ unqualified type: type without any (top-level) qualifiers (e.g., const,
volatile, and restrict)
■ qualified type: type that can potentially have (top-level) qualifiers (e.g.,
const, volatile, and restrict)
■ due to type aliases (e.g., resulting from typedef and using alias
declarations), type often does not have unique representation
■ canonical type: unique representation of type that results from removing
all syntactic sugar and type aliases from type (see also:
https://clang.llvm.org/docs/InternalsManual.html#canonical-types)
■ desugared type: type with syntactic sugar removed
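The terminology above can be illustrated at the language level with standard type traits, without using the Clang libraries. The following is a minimal sketch under that assumption: the alias names (Real, Scalar, PtrToConst, ConstPtr) are chosen here for illustration, and std::is_same_v and std::remove_cv_t only approximate what Clang expresses via canonical and unqualified types.

```cpp
#include <type_traits>

// Two sugared spellings of the same underlying type.
typedef float Real;
using Scalar = Real;

// Aliases do not introduce new types: every spelling denotes float,
// which is why a single canonical representation is useful.
static_assert(std::is_same_v<Real, float>);
static_assert(std::is_same_v<Scalar, float>);

// Top-level qualifiers apply to the outermost type only.
using PtrToConst = const int*;  // unqualified pointer to const int
using ConstPtr = int* const;    // const-qualified pointer to int

static_assert(!std::is_const_v<PtrToConst>);
static_assert(std::is_const_v<ConstPtr>);

// Removing top-level qualifiers yields the corresponding unqualified type.
static_assert(std::is_same_v<std::remove_cv_t<ConstPtr>, int*>);
static_assert(std::is_same_v<std::remove_cv_t<PtrToConst>, const int*>);
```

All checks are performed at compile time, so successful compilation is the result.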
qualifiers)
■ QualType is very small type that contains set of zero or more qualifiers
and pointer to Type instance
■ clang::Type class used to represent unqualified type
■ all (unqualified) types represented using types that derive from Type
■ some examples of types derived from Type include:
2 ArrayType, AttributedType, BuiltinType, DeducedType,
ElaboratedType, FunctionType, FunctionProtoType,
MemberPointerType, PointerType, ReferenceType, TagType,
EnumType, RecordType, and TypedefType
■ methods provided to do things such as:
2 check if type has particular property, such as: pointer type
(isPointerType), literal type (isLiteralType), and trivially-copyable type
(isTriviallyCopyableType)
2 get corresponding canonical type
2 get pointee or array-element type (if any)
2 get access to related type information (e.g., getAsCXXRecordDecl and
getTypeClassName)
2 dump information about Type instance (e.g., dump)
getTypePtrOrNull)
2 check properties of types that depend on qualifiers (e.g.,
isConstQualified)
2 add/remove qualifiers for QualType (e.g., addConst and
removeLocalConst)
2 create QualType instance with qualifiers added/removed (e.g., withConst)
2 create QualType instance from Type instance with specified qualifiers (via
QualType constructor)
2 get QualType instance with various levels of desugaring (e.g.,
getDesugaredType and getSingleStepDesugaredType)
2 convert type to string (e.g., getAsString)
■ most nodes used in AST representation of types derive from Type class
but some exceptions to this
■ for example, in case of templates, some expressions appearing in
template arguments cannot be evaluated until template instantiated
■ for this reason, some of AST nodes that derive from Type sometimes
need to refer to expression (i.e., Expr) nodes that represent expression to
be evaluated (when template instantiated)
■ after template instantiated, expressions no longer needed for compilation
(but might perhaps be useful for tooling)
■ for example, in case of following template, size of array not known until
template instantiated:
template<int I> using RealArray = float[I];
■ clang::BuiltinType class used to represent built-in types (such as
char, int, and float)
■ getKind method to query particular built-in type
■ getName method to obtain name of type as string
■ various methods corresponding to predicates to test type for certain
properties, such as:
2 isInteger
2 isFloatingPoint
2 isSignedInteger
2 isUnsignedInteger
Source Code
float x[5][4][3];
■ clang::PointerType class used to represent pointer type
■ provides getPointeeType method to get QualType for type of object to
which pointer points
Source Code
volatile char** const p = nullptr; // consider type of p
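The structure of the type of p can be explored at the language level by repeatedly stripping one pointer level, which is loosely analogous to repeatedly calling getPointeeType on a PointerType. The following is a minimal sketch using only standard type traits (not the Clang API); the alias names P0, P1, and P2 are introduced here for illustration.

```cpp
#include <type_traits>

volatile char** const p = nullptr;  // declaration under consideration

using P0 = decltype(p);                // volatile char** const
using P1 = std::remove_pointer_t<P0>;  // volatile char*
using P2 = std::remove_pointer_t<P1>;  // volatile char

// Outermost level: const-qualified pointer.
static_assert(std::is_pointer_v<P0> && std::is_const_v<P0>);
// Middle level: pointer with no qualifiers of its own.
static_assert(std::is_pointer_v<P1>);
static_assert(!std::is_const_v<P1> && !std::is_volatile_v<P1>);
// Innermost level: volatile-qualified char (not a pointer).
static_assert(!std::is_pointer_v<P2> && std::is_volatile_v<P2>);
static_assert(std::is_same_v<std::remove_cv_t<P2>, char>);
```

Each level carries its own (possibly empty) set of qualifiers, which mirrors why the pointee type is reported as a qualified type rather than as a bare type.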
■ clang::ReferenceType class used to represent reference type
■ provides getPointeeType method to obtain QualType for pointee type
(i.e., type of value referred to by reference)
■ two types derive from ReferenceType:
2 LValueReferenceType: represents lvalue reference type
2 RValueReferenceType: represents rvalue reference type
■ clang::FunctionType class used to represent function types
■ FunctionType has two derived types:
2 FunctionProtoType: represents function type with parameter types
specified (as in C++ function declaration)
2 FunctionNoProtoType: represents function type without parameter types
specified (as in old style C)
■ FunctionProtoType members include:
2 getNumParams: get number of function parameters
2 getParamType: get type of specified function parameter as QualType
2 getReturnType: get function return type as QualType
2 getExceptionSpecType: get exception specification type for function
2 getNumExceptions: get number of types of exceptions that can be thrown
by function
2 getExceptionType: get specific exception type for function as QualType
Source Code
using Size = unsigned long;
void f(char*, Size); // consider type of f
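The kind of information exposed by FunctionProtoType (number of parameters, parameter types, return type) can be mimicked at the language level with a small trait. The following is a minimal sketch, assuming a hypothetical helper named FunctionTraits that is not part of Clang or the standard library. Note that the alias Size is pure sugar: the second parameter type is unsigned long.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical helper (not a Clang class): decomposes a function type.
template<class F> struct FunctionTraits;

template<class R, class... Params>
struct FunctionTraits<R(Params...)> {
    using ReturnType = R;                                        // cf. getReturnType
    static constexpr std::size_t numParams = sizeof...(Params);  // cf. getNumParams
    template<std::size_t I>
    using ParamType = std::tuple_element_t<I, std::tuple<Params...>>;  // cf. getParamType
};

using Size = unsigned long;
void f(char*, Size);  // declaration under consideration

using FTraits = FunctionTraits<decltype(f)>;
static_assert(FTraits::numParams == 2);
static_assert(std::is_same_v<FTraits::ReturnType, void>);
static_assert(std::is_same_v<FTraits::ParamType<0>, char*>);
static_assert(std::is_same_v<FTraits::ParamType<1>, unsigned long>);
```

The function f is only declared (never defined), which suffices because decltype does not evaluate its operand.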
Source Code
struct widget {int x;};
const int widget::*p = &widget::x; // consider type of p
Source Code
typedef float Real;
using RealArray32 = Real[3][2];
RealArray32 x; // consider type of x
■ clang::TypeWithKeyword class captures information about enum,
struct, union, and class keywords
■ has derived types:
2 ElaboratedType: represents elaborated type keyword (e.g., struct S)
or qualified name (e.g., N::M::type) or both
2 DependentNameType: represents qualified type name for which type name
is dependent
2 DependentTemplateSpecializationType: represents template
specialization type whose template cannot be resolved
Source Code
namespace foo {struct widget {};}
struct foo::widget w;
■ clang::ParenType class captures information about parentheses used
when specifying types
■ has getInnerType method for obtaining QualType for type enclosed by
parentheses
Source Code
int (((x)));
■ SplitQualType is pair-like structure for storing qualified type split into
local qualifiers and locally-unqualified type
■ CanQualType is alias for clang::CanQual<clang::Type>
■ DEF_TRAVERSE_TYPE macro invocations in header
clang/AST/RecursiveASTVisitor.h can be used to determine
relatively easily how type nodes traversed
■ types can be constructed via members of ASTContext
Type Locations
■ TypeLoc class used as base class for representing spelling of type
information in source code
■ TypeLoc instance for many places where type spelled in source code
(e.g., spelling of type in declarator)
■ TypeLoc has two direct derived classes:
2 QualifiedTypeLoc: associated with type having non-trivial direct qualifiers
2 UnqualTypeLoc: associated with type having no direct qualifiers
■ UnqualTypeLoc has many derived types, including one for each type
derived from Type (e.g., ArrayTypeLoc, EnumTypeLoc, RecordTypeLoc,
and so on)
■ TypeLoc instances form shadow hierarchy of Type hierarchy
■ clang::TypeSourceInfo class used as container for storing type
information as written in source code
■ getType: get QualType associated with type source information
■ getTypeLoc: get TypeLoc associated with type source information
■ many classes have methods that return pointers to TypeSourceInfo
■ for example, every DeclaratorDecl instance has corresponding
TypeLoc instance which can be obtained via getTypeSourceInfo
method (which yields TypeSourceInfo with getTypeLoc method)
■ in case of DeclaratorDecl for type, no TypeLoc instance corresponding
to declared name of type
Templates
Kind        Explicit Specialization    Partial Specialization
Class       Yes                        Yes
Variable    Yes                        Yes
Function    Yes                        No
Alias       No                         No
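The table can be made concrete with a small language-level example. The following is a minimal sketch in which all names (Widget, var, func, SimpleArray, and the member kind) are chosen for illustration; each entity reports which definition was selected (0 for primary, 1 for partial specialization, 2 for explicit specialization).

```cpp
// Class template: explicit and partial specialization both permitted.
template<class T, int I> struct Widget {static constexpr int kind = 0;};
template<class T> struct Widget<T, 42> {static constexpr int kind = 1;};
template<> struct Widget<bool, 0> {static constexpr int kind = 2;};

// Variable template: explicit and partial specialization both permitted.
template<class T, int I> constexpr int var = 0;
template<class T> constexpr int var<T, 42> = 1;
template<> constexpr int var<bool, 0> = 2;

// Function template: explicit specialization only (overloading is used
// where partial specialization would otherwise be wanted).
template<class T> constexpr int func(T) {return 0;}
template<> constexpr int func<bool>(bool) {return 2;}

// Alias template: cannot be specialized at all.
template<class T, int I> using SimpleArray = T[I];

static_assert(Widget<int, 7>::kind == 0 && Widget<int, 42>::kind == 1 &&
  Widget<bool, 0>::kind == 2);
static_assert(var<int, 7> == 0 && var<int, 42> == 1 && var<bool, 0> == 2);
static_assert(func(1) == 0 && func(true) == 2);
static_assert(sizeof(SimpleArray<char, 3>) == 3);
```

All checks are performed at compile time, so successful compilation confirms which definition is selected in each case.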
■ TypeAliasTemplateDecl used to represent alias template declaration
■ has child node of type TypeAliasDecl that can be obtained via
getTemplatedDecl member function
Source Code
template<class T, int I> using SimpleArray = T[I];
TypeAliasTemplateDecl
TemplateTypeParmDecl T
NonTypeTemplateParmDecl I
TypeAliasDecl SimpleArray T[I]
DependentSizedArrayType T[I]
TemplateTypeParmType T
TemplateTypeParm T
DeclRefExpr I
■ ClassTemplateDecl node used to represent primary class template
■ subclass of TemplateDecl
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl and NonTypeTemplateParmDecl
■ has child node for class/struct/union with type CXXRecordDecl
■ has children for specializations of class template with type
ClassTemplateSpecialization
■ ClassTemplatePartialSpecializationDecl represents partial
specialization of class template
■ has children for template parameters and template arguments (e.g., types
TemplateTypeParmDecl, NonTypeTemplateParmDecl, and
TemplateArgument)
■ derives from CXXRecordDecl
■ has children for various class/union-related information, such as access
specifiers and field declarations (by virtue of deriving from
CXXRecordDecl)
■ ClassTemplateSpecializationDecl represents (full) specialization of
class template
■ has children for template arguments (i.e., type TemplateArgument)
■ derives from CXXRecordDecl
■ has children for various class/union-related information, such as access
specifiers and field declarations (by virtue of deriving from
CXXRecordDecl)
ClassTemplateDecl Widget
TemplateTypeParmDecl T
NonTypeTemplateParmDecl int I
CXXRecordDecl Widget
ClassTemplateSpecializationDecl
... DefinitionData
ClassTemplateSpecialization Widget ...
ClassTemplatePartialSpecializationDecl Widget
TemplateArgument int
DefinitionData
BuiltinType int
TemplateArgument type-parameter-0-0
TemplateArgument 42
TemplateTypeParmType type-parameter-0-0
CXXRecordDecl Widget
TemplateArgument 42
TemplateTypeParmDecl T
CXXRecordDecl Widget
cont.
■ FunctionTemplateDecl represents function template declaration
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl and NonTypeTemplateParmDecl
■ no “FunctionTemplateSpecializationDecl” type, as Clang
represents explicit specialization of function as FunctionDecl AST node
(and FunctionDecl node can be queried to know if it is associated with
template)
Source Code
template<class T, int I> void func(T(&x)[I]) {} // primary template
template<> void func(bool(&x)[42]) {} // explicit specialization
AST (Clang 15)
FunctionTemplateDecl func
TemplateTypeParmDecl T FunctionDecl func [explicit specialization]
ParmVarDecl x TemplateTypeParmDecl 42
CompoundStmt ParmVarDecl x
■ VarTemplateDecl used to represent variable template declaration
■ subclass of TemplateDecl
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl and NonTypeTemplateParmDecl
■ has child node for variable declaration with type VarDecl
■ has children for specializations of variable template with type
VarTemplateSpecialization
■ VarTemplatePartialSpecializationDecl represents partial
specialization of variable template
■ has children for template parameters and template arguments (e.g., types
TemplateTypeParmDecl, NonTypeTemplateParmDecl, and
TemplateArgument)
■ derived from VarDecl
■ has various information for variable declaration by virtue of deriving from
VarDecl
■ VarTemplateSpecializationDecl represents (full) specialization of
variable template
■ has children for template arguments (i.e., type TemplateArgument)
■ derived from VarDecl
■ has various information for variable declaration by virtue of deriving from
VarDecl
Source Code
template<class T, int I> T var; // primary template
template<class T> T var<T, 42>; // partial specialization
template<> bool var<int, 0>; // explicit specialization
AST (Clang 15)
VarTemplateDecl var
TemplateTypeParmDecl T
NonTypeTemplateParmDecl int I
VarDecl var
VarTemplateSpecializationDecl var
VarTemplateSpecialization var [explicit specialization]
TemplateArgument int
VarTemplatePartialSpecializationDecl var
BuiltinType int
TemplateTypeParmDecl T
TemplateArgument 0
TemplateArgument type-parameter-0-0
TemplateTypeParmType type-parameter-0-0
TemplateArgument 42
cont.
Matcher Expression
functionDecl(isInstantiated()).bind("f")
Matcher Expression
functionDecl(isExplicitTemplateSpecialization()).bind("f")
Matcher Expression
varDecl(isTemplateInstantiation()).bind("v")
Matcher Expression
varDecl(isExplicitTemplateSpecialization()).bind("v")
#include <cassert>
#include <format>
#include <string>
#include <vector>
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

namespace ct = clang::tooling;
namespace cam = clang::ast_matchers;

// Get printed form of each element of template-argument pack.
std::vector<std::string> getPackTypeNames(const clang::TemplateArgument& arg,
  clang::PrintingPolicy pp) {
    std::vector<std::string> names;
    for (auto packIter = arg.pack_begin(); packIter != arg.pack_end();
      ++packIter) {
        // Print directly into newly added string.
        names.push_back({});
        llvm::raw_string_ostream outStream(names.back());
        packIter->print(pp, outStream, false);
    }
    return names;
}
Attributes
clang::assume_aligned, clang::no_sanitize,
clang::no_sanitize_memory, clang::no_sanitize_address, and
clang::no_sanitize_thread
■ some supported GCC attributes include:
2 gnu::always_inline and gnu::noinline
how-to-add-an-attribute
Source Code
[[nodiscard, deprecated]] int get_answer() {return 42;}
FunctionDecl get_answer
CompoundStmt
ReturnStmt
IntegerLiteral 42
WarnUnusedResultAttr nodiscard
DeprecatedAttr
FunctionDecl func
ParmVarDecl int x
CompoundStmt
IfStmt
BinaryOperator <
···
AttributedStmt
UnlikelyAttr unlikely
CompoundStmt
···
Casts
FunctionDecl forty_two
CompoundStmt
DeclStmt
VarDecl const int x
IntegerLiteral 42
ReturnStmt
CXXReinterpretCastExpr reinterpret_cast<void *>
ImplicitCastExpr LValueToRValue
DeclRefExpr x
Source-Code Comments
2 VerbatimBlockLineComment
1Comment.html
Section 3.19
Miscellany
■ clang::MangleContext class provides context for tracking state which
persists across multiple calls to C++ name mangler
■ MangleContext::ManglerKind used to specify name mangling
convention to be used (e.g., MK_Itanium or MK_Microsoft)
■ ASTContext provides createMangleContext method for creating
MangleContext instance
■ MangleContext class provides mangleTypeName and mangleName
methods for mangling names
■ clang::GlobalDecl represents global declaration; used to wrap
declaration and other input needed to perform name mangling
■ for example, name mangling for constructor/destructor needs kind of
constructor/destructor (e.g., complete, base)
■ llvm::ItaniumPartialDemangler class can be used to demangle
name and extract some limited information about demangled entity (e.g.,
determine if name corresponds to function or variable, get base function
name, get function return/parameter types, etc.)
■ llvm::itaniumDemangle function demangles mangled name
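Demangling can be tried out without the LLVM libraries. The following is a minimal sketch that uses abi::__cxa_demangle from <cxxabi.h> (available with GCC and Clang toolchains that follow the Itanium C++ ABI) as a stand-in for the LLVM facilities named above; the wrapper function demangle is a hypothetical helper introduced here for illustration.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle Itanium-ABI mangled name, returning the
// input unchanged if demangling fails.
inline std::string demangle(const std::string& mangled) {
    int status = 0;
    char* result = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr,
      &status);
    if (status != 0 || !result) {return mangled;}
    std::string demangled(result);
    std::free(result);  // buffer is allocated with malloc, so caller frees it
    return demangled;
}
```

For example, demangle("_Z3absi") yields "abs(int)", and demangle("_ZN3foo6widgetC1Ev") yields "foo::widget::widget()" (the C1 in the latter denotes the complete-object constructor, which is the kind of extra information that GlobalDecl carries for mangling).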
■ clang::CFGReverseBlockReachabilityAnalysis
2 check if one block reachable from another block in CFG
2 https://clang.llvm.org/doxygen/classclang_1_1CFGReverseBlockReachabilityAnalysis.html#a73cec1b9cbbc6e2461470906e6a0720a
■ clang::CallGraph
2 used for constructing AST-based call graph
2 https://clang.llvm.org/doxygen/classclang_1_1CallGraph.html
References
1 Min-Yih Hsu. LLVM Techniques, Tips, and Best Practices Clang and
Middle-End Libraries. Packt Publishing, Dec. 2021,
https://isbnsearch.org/isbn/9781838824952. [Source code available from
https://github.com/PacktPublishing/LLVM-Techniques-Tips-and-Best-Practices-Clang-and-Middle-End-Libraries.]
[Chapters 5–8 discuss various aspects of the Clang frontend in some detail.]
2 Suyog Sarda and Mayur Pandey. LLVM Essentials. Packt Publishing, Dec.
2015, https://isbnsearch.org/isbn/9781785280801. [This book
focuses on LLVM as opposed to Clang.]
3 Kai Nacke. Learn LLVM 12. Packt Publishing, May 2021,
https://www.packtpub.com/product/cloud_and_networking/9781839213502.
4 Kai Nacke. Learn LLVM 11: A beginner’s guide to learning LLVM compiler
tools and core libraries with C++. Packt Publishing, Dec. 2021,
https://isbnsearch.org/isbn/9781839213502.
5 Bruno Cardoso Lopes and Rafael Auler. Getting Started with LLVM Core
Libraries. Packt Publishing, Aug. 2014,
https://isbnsearch.org/isbn/9781782166924.
6 Mayur Pandey and Suyog Sarda. LLVM Cookbook. Packt Publishing, May
2015, https://isbnsearch.org/isbn/9781785285981. [This book
does not take a systematic approach to teaching LLVM/Clang. Rather, it teaches
by presenting recipes/examples.]
9 Peter Smith. YVR18-223: How to Build a C++ Processing Tool Using the
Clang Libraries. Linaro Connect 2018 — YVR18, Vancouver, BC, Canada,
Sept. 17–21, 2018. Available online at
https://youtu.be/8QvLVEaxzC8. Slides and video available at
https://resources.linaro.org/en/resource/Bi4FpRDmERUuU5ei7nry9h.
[A fast-paced talk that covers an example of using the Clang libraries to
apply a simple source-code transformation.]
10 Stephan Bergman. Plug Yourself In: Learn How to Write a Clang Compiler
Plugin. LibreOffice Conference, Aarhus, Denmark, Sept. 24, 2015.
Available online at https://youtu.be/pdxlmM477KY. [A simple Clang
plugin is developed in a step-by-step fashion.]