Lecture Slides For The Clang Libraries-0.1.0-1

Download as pdf or txt
Download as pdf or txt
You are on page 1of 419

Lecture Slides for the Clang Libraries (LLVM/Clang 15)

Edition 0.1.0

Michael D. Adams

Department of Electrical and Computer Engineering


University of Victoria
Victoria, British Columbia, Canada

To obtain the :::::::::::::


most recent version of these lecture slides (with functional hyperlinks) or for additional
information and resources related to these slides (including errata), please visit:
https://www.ece.uvic.ca/~mdadams/cppbook
If you would like to show your support for the author’s work on these lecture slides, please consider
starring the corresponding GitHub project at:
https://github.com/mdadams/clang_libraries_companion

youtube.com/iamcanadian1867 github.com/mdadams @mdadams16


The author has taken care in the preparation of this document, but makes no expressed or implied warranty of any kind and assumes no
responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use
of the information or programs contained herein.

Copyright © 2022, 2023 Michael D. Adams


This document is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License. A copy
of this license can be found on page iii of this document. For a simple explanation of the rights granted by this license, see:
https://creativecommons.org/licenses/by-nc-nd/3.0/

Linux is a registered trademark of Linus Torvalds. UNIX and X Window System are registered trademarks of The Open Group. Windows is a
registered trademark of Microsoft Corporation. macOS is a registered trademark of Apple Inc. Chrome OS is a registered trademark of Google
LLC. Fedora is a registered trademark of Red Hat, Inc. Ubuntu is a registered trademark of Canonical Ltd. The YouTube logo is a registered
trademark of Google, Inc. The GitHub logo is a registered trademark of GitHub, Inc. The Twitter logo is a registered trademark of Twitter, Inc.

This document was typeset with LATEX.

ISBN 978-1-990707-05-6 (PDF)


License I

Creative Commons Legal Code


Attribution-NonCommercial-NoDerivs 3.0 Unported
CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
LEGAL SERVICES. DISTRIBUTION OF THIS LICENSE DOES NOT CREATE AN
ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
REGARDING THE INFORMATION PROVIDED, AND DISCLAIMS LIABILITY FOR
DAMAGES RESULTING FROM ITS USE.
License
THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE
COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY
COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS
AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.
BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE
TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY
BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS
CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND
CONDITIONS.
1. Definitions
a. "Adaptation" means a work based upon the Work, or upon the Work and
other pre-existing works, such as a translation, adaptation,
derivative work, arrangement of music or other alterations of a
literary or artistic work, or phonogram or performance and includes
cinematographic adaptations or any other form in which the Work may be
recast, transformed, or adapted including in any form recognizably
derived from the original, except that a work that constitutes a
Collection will not be considered an Adaptation for the purpose of
this License. For the avoidance of doubt, where the Work is a musical
work, performance or phonogram, the synchronization of the Work in
timed-relation with a moving image ("synching") will be considered an
Adaptation for the purpose of this License.
b. "Collection" means a collection of literary or artistic works, such as
encyclopedias and anthologies, or performances, phonograms or
broadcasts, or other works or subject matter other than works listed

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 iii


License II

in Section 1(f) below, which, by reason of the selection and


arrangement of their contents, constitute intellectual creations, in
which the Work is included in its entirety in unmodified form along
with one or more other contributions, each constituting separate and
independent works in themselves, which together are assembled into a
collective whole. A work that constitutes a Collection will not be
considered an Adaptation (as defined above) for the purposes of this
License.
c. "Distribute" means to make available to the public the original and
copies of the Work through sale or other transfer of ownership.
d. "Licensor" means the individual, individuals, entity or entities that
offer(s) the Work under the terms of this License.
e. "Original Author" means, in the case of a literary or artistic work,
the individual, individuals, entity or entities who created the Work
or if no individual or entity can be identified, the publisher; and in
addition (i) in the case of a performance the actors, singers,
musicians, dancers, and other persons who act, sing, deliver, declaim,
play in, interpret or otherwise perform literary or artistic works or
expressions of folklore; (ii) in the case of a phonogram the producer
being the person or legal entity who first fixes the sounds of a
performance or other sounds; and, (iii) in the case of broadcasts, the
organization that transmits the broadcast.
f. "Work" means the literary and/or artistic work offered under the terms
of this License including without limitation any production in the
literary, scientific and artistic domain, whatever may be the mode or
form of its expression including digital form, such as a book,
pamphlet and other writing; a lecture, address, sermon or other work
of the same nature; a dramatic or dramatico-musical work; a
choreographic work or entertainment in dumb show; a musical
composition with or without words; a cinematographic work to which are
assimilated works expressed by a process analogous to cinematography;
a work of drawing, painting, architecture, sculpture, engraving or
lithography; a photographic work to which are assimilated works
expressed by a process analogous to photography; a work of applied
art; an illustration, map, plan, sketch or three-dimensional work
relative to geography, topography, architecture or science; a
performance; a broadcast; a phonogram; a compilation of data to the
extent it is protected as a copyrightable work; or a work performed by
a variety or circus performer to the extent it is not otherwise
considered a literary or artistic work.
g. "You" means an individual or entity exercising rights under this

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 iv


License III

License who has not previously violated the terms of this License with
respect to the Work, or who has received express permission from the
Licensor to exercise rights under this License despite a previous
violation.
h. "Publicly Perform" means to perform public recitations of the Work and
to communicate to the public those public recitations, by any means or
process, including by wire or wireless means or public digital
performances; to make available to the public Works in such a way that
members of the public may access these Works from a place and at a
place individually chosen by them; to perform the Work to the public
by any means or process and the communication to the public of the
performances of the Work, including by public digital performance; to
broadcast and rebroadcast the Work by any means including signs,
sounds or images.
i. "Reproduce" means to make copies of the Work by any means including
without limitation by sound or visual recordings and the right of
fixation and reproducing fixations of the Work, including storage of a
protected performance or phonogram in digital form or other electronic
medium.
2. Fair Dealing Rights. Nothing in this License is intended to reduce,
limit, or restrict any uses free from copyright or rights arising from
limitations or exceptions that are provided for in connection with the
copyright protection under copyright law or other applicable laws.
3. License Grant. Subject to the terms and conditions of this License,
Licensor hereby grants You a worldwide, royalty-free, non-exclusive,
perpetual (for the duration of the applicable copyright) license to
exercise the rights in the Work as stated below:
a. to Reproduce the Work, to incorporate the Work into one or more
Collections, and to Reproduce the Work as incorporated in the
Collections; and,
b. to Distribute and Publicly Perform the Work including as incorporated
in Collections.
The above rights may be exercised in all media and formats whether now
known or hereafter devised. The above rights include the right to make
such modifications as are technically necessary to exercise the rights in
other media and formats, but otherwise you have no rights to make
Adaptations. Subject to 8(f), all rights not expressly granted by Licensor

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 v


License IV

are hereby reserved, including but not limited to the rights set forth in
Section 4(d).
4. Restrictions. The license granted in Section 3 above is expressly made
subject to and limited by the following restrictions:
a. You may Distribute or Publicly Perform the Work only under the terms
of this License. You must include a copy of, or the Uniform Resource
Identifier (URI) for, this License with every copy of the Work You
Distribute or Publicly Perform. You may not offer or impose any terms
on the Work that restrict the terms of this License or the ability of
the recipient of the Work to exercise the rights granted to that
recipient under the terms of the License. You may not sublicense the
Work. You must keep intact all notices that refer to this License and
to the disclaimer of warranties with every copy of the Work You
Distribute or Publicly Perform. When You Distribute or Publicly
Perform the Work, You may not impose any effective technological
measures on the Work that restrict the ability of a recipient of the
Work from You to exercise the rights granted to that recipient under
the terms of the License. This Section 4(a) applies to the Work as
incorporated in a Collection, but this does not require the Collection
apart from the Work itself to be made subject to the terms of this
License. If You create a Collection, upon notice from any Licensor You
must, to the extent practicable, remove from the Collection any credit
as required by Section 4(c), as requested.
b. You may not exercise any of the rights granted to You in Section 3
above in any manner that is primarily intended for or directed toward
commercial advantage or private monetary compensation. The exchange of
the Work for other copyrighted works by means of digital file-sharing
or otherwise shall not be considered to be intended for or directed
toward commercial advantage or private monetary compensation, provided
there is no payment of any monetary compensation in connection with
the exchange of copyrighted works.
c. If You Distribute, or Publicly Perform the Work or Collections, You
must, unless a request has been made pursuant to Section 4(a), keep
intact all copyright notices for the Work and provide, reasonable to
the medium or means You are utilizing: (i) the name of the Original
Author (or pseudonym, if applicable) if supplied, and/or if the
Original Author and/or Licensor designate another party or parties
(e.g., a sponsor institute, publishing entity, journal) for
attribution ("Attribution Parties") in Licensor’s copyright notice,

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 vi


License V

terms of service or by other reasonable means, the name of such party


or parties; (ii) the title of the Work if supplied; (iii) to the
extent reasonably practicable, the URI, if any, that Licensor
specifies to be associated with the Work, unless such URI does not
refer to the copyright notice or licensing information for the Work.
The credit required by this Section 4(c) may be implemented in any
reasonable manner; provided, however, that in the case of a
Collection, at a minimum such credit will appear, if a credit for all
contributing authors of Collection appears, then as part of these
credits and in a manner at least as prominent as the credits for the
other contributing authors. For the avoidance of doubt, You may only
use the credit required by this Section for the purpose of attribution
in the manner set out above and, by exercising Your rights under this
License, You may not implicitly or explicitly assert or imply any
connection with, sponsorship or endorsement by the Original Author,
Licensor and/or Attribution Parties, as appropriate, of You or Your
use of the Work, without the separate, express prior written
permission of the Original Author, Licensor and/or Attribution
Parties.
d. For the avoidance of doubt:
i. Non-waivable Compulsory License Schemes. In those jurisdictions in
which the right to collect royalties through any statutory or
compulsory licensing scheme cannot be waived, the Licensor
reserves the exclusive right to collect such royalties for any
exercise by You of the rights granted under this License;
ii. Waivable Compulsory License Schemes. In those jurisdictions in
which the right to collect royalties through any statutory or
compulsory licensing scheme can be waived, the Licensor reserves
the exclusive right to collect such royalties for any exercise by
You of the rights granted under this License if Your exercise of
such rights is for a purpose or use which is otherwise than
noncommercial as permitted under Section 4(b) and otherwise waives
the right to collect royalties through any statutory or compulsory
licensing scheme; and,
iii. Voluntary License Schemes. The Licensor reserves the right to
collect royalties, whether individually or, in the event that the
Licensor is a member of a collecting society that administers
voluntary licensing schemes, via that society, from any exercise
by You of the rights granted under this License that is for a
purpose or use which is otherwise than noncommercial as permitted

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 vii


License VI

under Section 4(b).


e. Except as otherwise agreed in writing by the Licensor or as may be
otherwise permitted by applicable law, if You Reproduce, Distribute or
Publicly Perform the Work either by itself or as part of any
Collections, You must not distort, mutilate, modify or take other
derogatory action in relation to the Work which would be prejudicial
to the Original Author’s honor or reputation.
5. Representations, Warranties and Disclaimer
UNLESS OTHERWISE MUTUALLY AGREED BY THE PARTIES IN WRITING, LICENSOR
OFFERS THE WORK AS-IS AND MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY
KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE,
INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTIBILITY,
FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF
LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS,
WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION
OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY TO YOU.
6. Limitation on Liability. EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE
LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR
ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES
ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS
BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
7. Termination
a. This License and the rights granted hereunder will terminate
automatically upon any breach by You of the terms of this License.
Individuals or entities who have received Collections from You under
this License, however, will not have their licenses terminated
provided such individuals or entities remain in full compliance with
those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any
termination of this License.
b. Subject to the above terms and conditions, the license granted here is
perpetual (for the duration of the applicable copyright in the Work).
Notwithstanding the above, Licensor reserves the right to release the
Work under different license terms or to stop distributing the Work at
any time; provided, however that any such election will not serve to
withdraw this License (or any other license that has been, or is
required to be, granted under the terms of this License), and this

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 viii


License VII

License will continue in full force and effect unless terminated as


stated above.
8. Miscellaneous
a. Each time You Distribute or Publicly Perform the Work or a Collection,
the Licensor offers to the recipient a license to the Work on the same
terms and conditions as the license granted to You under this License.
b. If any provision of this License is invalid or unenforceable under
applicable law, it shall not affect the validity or enforceability of
the remainder of the terms of this License, and without further action
by the parties to this agreement, such provision shall be reformed to
the minimum extent necessary to make such provision valid and
enforceable.
c. No term or provision of this License shall be deemed waived and no
breach consented to unless such waiver or consent shall be in writing
and signed by the party to be charged with such waiver or consent.
d. This License constitutes the entire agreement between the parties with
respect to the Work licensed here. There are no understandings,
agreements or representations with respect to the Work not specified
here. Licensor shall not be bound by any additional provisions that
may appear in any communication from You. This License may not be
modified without the mutual written agreement of the Licensor and You.
e. The rights granted under, and the subject matter referenced, in this
License were drafted utilizing the terminology of the Berne Convention
for the Protection of Literary and Artistic Works (as amended on
September 28, 1979), the Rome Convention of 1961, the WIPO Copyright
Treaty of 1996, the WIPO Performances and Phonograms Treaty of 1996
and the Universal Copyright Convention (as revised on July 24, 1971).
These rights and subject matter take effect in the relevant
jurisdiction in which the License terms are sought to be enforced
according to the corresponding provisions of the implementation of
those treaty provisions in the applicable national law. If the
standard suite of rights granted under applicable copyright law
includes additional rights not granted under this License, such
additional rights are deemed to be included in the License; this
License is not intended to restrict the license of any rights under
applicable law.

Creative Commons Notice

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 ix


License VIII

Creative Commons is not a party to this License, and makes no warranty


whatsoever in connection with the Work. Creative Commons will not be
liable to You or any party on any legal theory for any damages
whatsoever, including without limitation any general, special,
incidental or consequential damages arising in connection to this
license. Notwithstanding the foregoing two (2) sentences, if Creative
Commons has expressly identified itself as the Licensor hereunder, it
shall have all rights and obligations of Licensor.
Except for the limited purpose of indicating to the public that the
Work is licensed under the CCPL, Creative Commons does not authorize
the use by either party of the trademark "Creative Commons" or any
related trademark or logo of Creative Commons without the prior
written consent of Creative Commons. Any permitted use will be in
compliance with Creative Commons’ then-current trademark usage
guidelines, as may be published on its website or otherwise made
available upon request from time to time. For the avoidance of doubt,
this trademark restriction does not form part of this License.
Creative Commons may be contacted at https://creativecommons.org/.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 x


Other Textbooks and Lecture Slides by the Author I

1 M. D. Adams, Exercises for Programming in C++ (Version 2021-04-01),


Apr. 2021, ISBN 978-0-9879197-5-5 (PDF). Available from Google Books,
Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/cppbook.
2 M. D. Adams, Lecture Slides for Programming in C++ (Version
2021-04-01), Apr. 2021, ISBN 978-0-9879197-4-8 (PDF). Available from
Google Books, Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/cppbook.
3 M. D. Adams, Multiresolution Signal and Geometry Processing: Filter
Banks, Wavelets, and Subdivision (Version 2013-09-26), University of
Victoria, Victoria, BC, Canada, Sept. 2013, ISBN 978-1-55058-507-0
(print), ISBN 978-1-55058-508-7 (PDF). Available from Google Books,
Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/waveletbook.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xi


Other Textbooks and Lecture Slides by the Author II

4 M. D. Adams, Lecture Slides for Multiresolution Signal and Geometry


Processing (Version 2015-02-03), University of Victoria, Victoria, BC,
Canada, Feb. 2015, ISBN 978-1-55058-535-3 (print), ISBN
978-1-55058-536-0 (PDF). Available from Google Books, Google Play
Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/waveletbook.
5 M. D. Adams, Signals and Systems, Edition 5.0, Dec. 2022, ISBN
978-1-990707-00-1 (PDF). Available from Google Books, Google Play
Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/sigsysbook.
6 M. D. Adams, Lecture Slides for Signals and Systems, Edition 5.0,
Dec. 2022, ISBN 978-1-990707-02-5 (PDF). Available from Google
Books, Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/sigsysbook.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xii


Other Textbooks and Lecture Slides by the Author III

7 M. D. Adams, Lecture Slides for Linux System Programming, Edition 0.0,


Dec. 2022, ISBN 978-1-990707-03-2 (PDF). Available from Google
Books, Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/cppbook.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xiii


Part 0

Preface

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xiv


About These Lecture Slides

■ This document constitutes a set of lecture slides that are intended to be


used to provide a detailed introduction to the Clang libraries in version 15
of LLVM/Clang.
■ Although this document specifically targets version 15 of LLVM/Clang, the
information presented herein is still likely to be at least partially relevant to
other versions of LLVM/Clang, especially versions that are close to 15.
■ This document represents a work in progress and should be considered
an alpha release.
■ In spite of this, it is believed that this document will be of benefit to some
people. So, it is being made available in its current form.
■ This document is intended to supplement the following slide deck:
2 M. D. Adams, Lecture Slides for Programming in C++ (Version 2021-04-01),
Apr. 2021, ISBN 978-0-9879197-4-8 (PDF). Available from Google Books,
Google Play Books, and author’s web site
https://www.ece.uvic.ca/~mdadams/cppbook.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xv


Typesetting Conventions

■ In a definition, the term being defined is often typeset in a font like this.
■ To emphasize particular words, the words are typeset in a font like this.
■ To show that particular text is associated with a hyperlink to an internal
target, the text is typeset .........
like this.
■ To show that particular text is associated with a hyperlink to an external
document, the text is typeset like this.
::::::
■ URLs are typeset like https://www.ece.uvic.ca/~mdadams.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xvi


Companion Git Repository

■ These lecture slides have a companion Git repository.


■ Numerous code examples are available from this repository.
■ This repository is hosted by GitHub.
■ The URL of the main repository page on GitHub is:
2 https://github.com/mdadams/clang_libraries_companion

■ The URL of the actual repository itself is:


2 https://github.com/mdadams/clang_libraries_companion.git

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 xvii


Part 1

Compilers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 1


Building Code: Compiling, Assembling, and Linking

Source Code (e.g., C++)

Compiler

Assembly Code (e.g., Intel/AMD x86-64, ARM, RISC-V)

Assembler

Object Code (e.g., ELF relocatable)

Linker

Executable Program (e.g., ELF executable)

■ in context of this discussion, interested only in compiler

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 2


Section 1.1

Structure of Compiler

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 3


Structure of Compiler

Source Code (e.g., C++)

Frontend (a.k.a. Analysis Phase): Lexing/Parsing

Intermediate Code Representation (IR)

Middle-End: Optimizer

Optimized IR

Backend (a.k.a. Synthesis Phase): Code Generator

Assembly Code for Target (e.g., Intel/AMD x86-64, ARM, RISC-V)

■ same IR used by all frontends and backends (avoids M × N problem)


■ in context of this discussion, primary focus is compiler frontend

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 4


Structure of Compiler Frontend (Analysis Phase)
Source Code

Lexical Analyzer (Lexer)

Tokens

Syntax Analyzer (Parser) ■ frontend is language


dependent but machine
Abstract Syntax Tree (AST) independent
■ in context of this discussion,
Semantic Analyzer interest lies with tokens and
AST, not IR
Semantically-Verified AST

Intermediate Code Generator

Intermediate Code Representation (IR)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 5


Lexical Analyzer (Lexer)

■ lexical analyzer (also known as lexer) reads source code as sequence of


characters and groups them into tokens
■ tokens often defined using regular expressions
■ token might correspond to entity such as: identifier, keyword, separator,
operator, literal, comment
■ example:
Token Stream
Token Type Token Value
identifier foo
Code Fragment operator assignment
foo = bar + 42; identifier bar
operator addition
literal 42
semicolon —

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 6


Syntax Analyzer (Parser)
■ syntax analyzer (also known as parser) reads sequence of tokens and
uses them to construct abstract syntax tree (AST)
■ AST is tree-based data structure used to represent semantics of source
code
■ much of work of compiler frontend performed using AST
■ unlike concrete syntax tree (also known as parse tree), AST typically (but
not always) omits information that is not necessary for characterizing code
structure (e.g., braces and parentheses)
■ example:
AST for Code Fragment
=
Code Fragment y *
y = (x + 1) * (x + 1)
+ +
x 1 x 1
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 7
Semantic Analyzer

■ checks if AST is valid (i.e., corresponds to well-formed code)


■ semantic analysis would typically include checks for such things as:
2 invalid types
2 invalid operands
2 invalid arguments in function calls
2 undeclared identifiers/variables
■ example:
AST for Code Fragment
Code Fragment
(Not Semantically Valid)
y = *x =
where x and y of type int
(code invalid, cannot y *
dereference x)
x

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 8


Intermediate Code Generator

■ intermediate code generator produces intermediate code representation


(IR) from AST
■ particular IR used highly dependent on compiler
■ IR provides way to describe code in manner than is independent of
language and target architecture
■ can think of IR as very generic assembly-like language for some
idealized/fictitious processor architecture
■ in case of LLVM, for example, IR can be thought of as assembly-code for
processor with infinite number of registers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 9


Optimizer

■ IR optimizer performs transformations to IR in attempt to produce more


efficient code
■ since working with IR, optimizations are machine independent
■ some common types of optimizations might include:
2 eliminate unreachable code
2 eliminate dead stores
2 eliminate unused variables
■ IR optimizer not to be confused with machine-dependent optimizer in
compiler backend

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 10


Structure of Compiler Backend (Synthesis Phase)

Optimized Intermediate Code Representation

Machine-Dependent Code Generator


■ backend is machine
Assembly Code for Target
dependent but
language independent
Machine-Dependent Code Optimizer

Optimized Assembly Code for Target

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 11


Machine-Dependent Code Generator and Optimizer

■ machine-dependent code generator produces assembly code for target


architecture from IR
■ must map operations in IR onto instructions of target architecture
■ must map registers and storage used by operations in IR onto memory
and register set of target architecture
■ assembly code produced may then be further optimized to yield final
assembly-code output

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 12


Example: Source Code

simple_1.cpp
1 int add(int x, int y) {
2 return x + y;
3 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 13


Example: Tokens

simple_1.cpp
1 int add(int x, int y) {
2 return x + y;
3 }

Command
clang -std=c++20 -Xclang -dump-tokens -fsyntax-only simple_1.cpp

Output (Standard Error)


int ’int’ [StartOfLine] Loc=<simple_1.cpp:1:1>
identifier ’add’ [LeadingSpace] Loc=<simple_1.cpp:1:5>
l_paren ’(’ Loc=<simple_1.cpp:1:8>
int ’int’ Loc=<simple_1.cpp:1:9>
identifier ’x’ [LeadingSpace] Loc=<simple_1.cpp:1:13>
comma ’,’ Loc=<simple_1.cpp:1:14>
int ’int’ [LeadingSpace] Loc=<simple_1.cpp:1:16>
identifier ’y’ [LeadingSpace] Loc=<simple_1.cpp:1:20>
r_paren ’)’ Loc=<simple_1.cpp:1:21>
l_brace ’{’ [LeadingSpace] Loc=<simple_1.cpp:1:23>
return ’return’ [StartOfLine] [LeadingSpace] Loc=<simple_1.cpp:2:2>
identifier ’x’ [LeadingSpace] Loc=<simple_1.cpp:2:9>
plus ’+’ [LeadingSpace] Loc=<simple_1.cpp:2:11>
identifier ’y’ [LeadingSpace] Loc=<simple_1.cpp:2:13>
semi ’;’ Loc=<simple_1.cpp:2:14>
r_brace ’}’ [StartOfLine] Loc=<simple_1.cpp:3:1>
eof ’’ Loc=<simple_1.cpp:3:2>

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 14


Example: AST [Graphical]

simple_1.cpp
1 int add(int x, int y) {
2 return x + y;
3 }

AST (Clang 14; Slightly Abridged)


TranslationUnitDecl

FunctionDecl
int add(int, int)

ParmVarDecl ParmVarDecl CompoundStmt


int x int y
ReturnStmt

BinaryOperator
+
ImplicitCastExpr ImplicitCastExpr

DeclRefExpr DeclRefExpr
x y

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 15


Example: AST [clang-check]

simple_1.cpp
1 int add(int x, int y) {
2 return x + y;
3 }

Command
clang-check -ast-dump -ast-dump-filter=add simple_1.cpp -- \
-fno-color-diagnostics -std=c++20

Output (Clang 14; Standard Output)


Dumping add:
FunctionDecl 0x74c4e98 </home/jdoe/simple/simple_1.cpp:1:1, line:3:1> line:1:5 add ’int (int, int)’
|-ParmVarDecl 0x74c4cf8 <col:9, col:13> col:13 used x ’int’
|-ParmVarDecl 0x74c4d80 <col:16, col:20> col:20 used y ’int’
‘-CompoundStmt 0x74c5050 <col:23, line:3:1>
‘-ReturnStmt 0x74c5040 <line:2:2, col:13>
‘-BinaryOperator 0x74c5020 <col:9, col:13> ’int’ ’+’
|-ImplicitCastExpr 0x74c4ff0 <col:9> ’int’ <LValueToRValue>
| ‘-DeclRefExpr 0x74c4fb0 <col:9> ’int’ lvalue ParmVar 0x74c4cf8 ’x’ ’int’
‘-ImplicitCastExpr 0x74c5008 <col:13> ’int’ <LValueToRValue>
‘-DeclRefExpr 0x74c4fd0 <col:13> ’int’ lvalue ParmVar 0x74c4d80 ’y’ ’int’

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 16


Example: AST [clang]
simple_1.cpp
1 int add(int x, int y) {
2 return x + y;
3 }

Command
clang -std=c++20 -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics \
simple_1.cpp

Output (Clang 14; Standard Output; Abridged)


TranslationUnitDecl 0x93dfe78 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x93e0700 <<invalid sloc>> <invalid sloc> implicit __int128_t ’__int128’
| ‘-BuiltinType 0x93e0440 ’__int128’
|-TypedefDecl 0x93e0778 <<invalid sloc>> <invalid sloc> implicit __uint128_t ’unsigned __int128’
| ‘-BuiltinType 0x93e0460 ’unsigned __int128’
|-TypedefDecl 0x93e0b30 <<invalid sloc>> <invalid sloc> implicit __NSConstantString ’__NSConstantString_tag’
| ‘-RecordType 0x93e0880 ’__NSConstantString_tag’
| ‘-CXXRecord 0x93e07d8 ’__NSConstantString_tag’
|-TypedefDecl 0x93e0bd8 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list ’char *’
| ‘-PointerType 0x93e0b90 ’char *’
| ‘-BuiltinType 0x93dff20 ’char’
|-TypedefDecl 0x942a930 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list ’__va_list_tag[1]’
| ‘-ConstantArrayType 0x942a8d0 ’__va_list_tag[1]’ 1
| ‘-RecordType 0x93e0ce0 ’__va_list_tag’
| ‘-CXXRecord 0x93e0c38 ’__va_list_tag’
‘-FunctionDecl 0x942ab48 <simple_1.cpp:1:1, line:3:1> line:1:5 add ’int (int, int)’
|-ParmVarDecl 0x942a9a8 <col:9, col:13> col:13 used x ’int’
|-ParmVarDecl 0x942aa30 <col:16, col:20> col:20 used y ’int’
‘-CompoundStmt 0x942ad00 <col:23, line:3:1>
‘-ReturnStmt 0x942acf0 <line:2:2, col:13>
[deleted text]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 17


Example: IR
Command
clang++ -S -emit-llvm simple_1.cpp
LLVM IR (simple_1.ll)
1 ; ModuleID = ’simple_1.cpp’
2 source_filename = "simple_1.cpp"
3 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
4 target triple = "x86_64-unknown-linux-gnu"
5
6 ; Function Attrs: mustprogress noinline nounwind optnone uwtable
7 define dso_local noundef i32 @_Z3addii(i32 noundef %0, i32 noundef %1) #0 {
8 %3 = alloca i32, align 4
9 %4 = alloca i32, align 4
10 store i32 %0, i32* %3, align 4
11 store i32 %1, i32* %4, align 4
12 %5 = load i32, i32* %3, align 4
13 %6 = load i32, i32* %4, align 4
14 %7 = add nsw i32 %5, %6
15 ret i32 %7
16 }
17
18 attributes #0 = { mustprogress noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-
,→ width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "
,→ target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
19
20 !llvm.module.flags = !{!0, !1, !2}
21 !llvm.ident = !{!3}
22
23 !0 = !{i32 1, !"wchar_size", i32 4}
24 !1 = !{i32 7, !"uwtable", i32 1}
25 !2 = !{i32 7, !"frame-pointer", i32 2}
26 !3 = !{!"clang version 14.0.0"}

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 18


Example: Optimized IR

Command
clang++ -O3 -S -emit-llvm -o simple_1-opt.ll simple_1.cpp
LLVM Optimized IR (simple_1-opt.ll)
1 ; ModuleID = ’simple_1.cpp’
2 source_filename = "simple_1.cpp"
3 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
4 target triple = "x86_64-unknown-linux-gnu"
5
6 ; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
7 define dso_local noundef i32 @_Z3addii(i32 noundef %0, i32 noundef %1) local_unnamed_addr #0 {
8 %3 = add nsw i32 %1, %0
9 ret i32 %3
10 }
11
12 attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-
,→ pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-
,→ size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="
,→ generic" }
13
14 !llvm.module.flags = !{!0, !1}
15 !llvm.ident = !{!2}
16
17 !0 = !{i32 1, !"wchar_size", i32 4}
18 !1 = !{i32 7, !"uwtable", i32 1}
19 !2 = !{!"clang version 14.0.0"}

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 19


Example: x86-64 Assembly Code
Command
clang++ -S simple_1.cpp
x86-64 Assembly Code (simple_1.s)
1 .text
2 .file "simple_1.cpp"
3 .globl _Z3addii # -- Begin function _Z3addii
4 .p2align 4, 0x90
5 .type _Z3addii,@function
6 _Z3addii: # @_Z3addii
7 .cfi_startproc
8 # %bb.0:
9 pushq %rbp
10 .cfi_def_cfa_offset 16
11 .cfi_offset %rbp, -16
12 movq %rsp, %rbp
13 .cfi_def_cfa_register %rbp
14 movl %edi, -4(%rbp)
15 movl %esi, -8(%rbp)
16 movl -4(%rbp), %eax
17 addl -8(%rbp), %eax
18 popq %rbp
19 .cfi_def_cfa %rsp, 8
20 retq
21 .Lfunc_end0:
22 .size _Z3addii, .Lfunc_end0-_Z3addii
23 .cfi_endproc
24 # -- End function
25 .ident "clang version 14.0.0"
26 .section ".note.GNU-stack","",@progbits
27 .addrsig

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 20


Example: Optimized x86-64 Assembly Code

Command
clang++ -O3 -S -o simple_1-opt.s simple_1.cpp
Optimized x86-64 Assembly Code (simple_1-opt.s)
1 .text
2 .file "simple_1.cpp"
3 .globl _Z3addii # -- Begin function _Z3addii
4 .p2align 4, 0x90
5 .type _Z3addii,@function
6 _Z3addii: # @_Z3addii
7 .cfi_startproc
8 # %bb.0:
9 # kill: def $esi killed $esi def $rsi
10 # kill: def $edi killed $edi def $rdi
11 leal (%rdi,%rsi), %eax
12 retq
13 .Lfunc_end0:
14 .size _Z3addii, .Lfunc_end0-_Z3addii
15 .cfi_endproc
16 # -- End function
17 .ident "clang version 14.0.0"
18 .section ".note.GNU-stack","",@progbits
19 .addrsig

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 21


Section 1.2

Name Mangling

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 22


Example: Name Demangling

Command (With Mangled Names as Arguments to llvm-cxxfilt)


llvm-cxxfilt _Z3addii

Output
add(int, int)

Command (Run llvm-cxxfilt as Filter)


echo "_Z3addii _Z3addii _Z3addii" | llvm-cxxfilt

Output
add(int, int) add(int, int) add(int, int)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 23


Section 1.3

Control-Flow Graphs (CFGs)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 24


Control-Flow Graphs

■ control-flow graph (CFG) is directed graph that shows paths of


execution in code
■ nodes correspond to statement fragments or statements
■ given two nodes i and j in CFG, edge from i to j exists if and only if
statement fragment corresponding to node j can be executed immediately
after statement fragment corresponding to node i
■ code can be viewed in terms of CFG
■ CFGs are particularly useful for performing certain types of code analysis

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 25


CFG Examples: If-Else and Switch
declarations:
1 bool c; double x; double y;
1 if (c) {
2 4 2 y = x * x;
3 } else {
4 y = x;
6 5 }
6 // ...

declarations:
int n; double x; double y;
1 1 switch (n) {
2 case 0:
3 y = 0.0;
4 break;
3 6 9 5 case 1:
6 y = 2.0 * x;
7 break;
8 case 2:
12 9 y = 0.5 * x * x;
10 break;
11 }
12 // ...

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 26


CFG Examples: While and Do-While Loops

1
declarations:
int n;
2 1 while (n > 0) {
2 --n;
3 }
4 // ...
4

2
declarations:
int n;
3 1 do {
2 --n;
3 while (n > 0);
4 // ...
4

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 27


CFG Examples: For Loop

1a

1b
declarations:
int a[1024];
2 1 for (int i = 0; i < 1024; ++i) {
2 a[i] = 0;
3 }
4 // ...
1c

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 28


Live-Variable Analysis

■ variable is said to be live at some point in code if it holds value that may
be needed in future (or equivalently, if its value may be read before next
time variable is written)
■ at each point in code execution, each variable is either live or dead
■ live-variable analysis (also called liveness analysis) is classic data-flow
analysis to calculate variables that are live at each point in code
■ some uses for liveness analysis include:
2 detecting dead stores (i.e., value written to variable that is never read)
2 detecting use of uninitialized variables
2 register allocation (i.e., deciding which variables should be allocated to
registers)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 29


Liveness Example

entry

1 int foo(int x, int y) {


2,3 2 int t = x * y; // t is dead
3 if ((x + 1) * (x - 1) == y) {
4 t = 1;
4 6 5 } else {
6 t = 2;
7 }
8 return t;
8 9 }

exit

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 30


Performing Live-Variable Analysis

■ CFG used to perform liveness analysis


■ liveness analysis flows backwards through code considering each
statement/expression
■ variable becomes live when statement/expression reads value of variable
■ variable becomes dead when statement/expression assigns value to
variable

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 31


Part 2

LLVM and Clang

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 32


LLVM

■ LLVM is collection of modular compiler and toolchain components


■ much of LLVM code is organized as collection of libraries
■ frontends for numerous languages, including:
2 Ada, C, C++, D, Delphi, Fortran, Haskell, Julia, Objective-C, Rust, and Swift
■ modern source- and target-independent optimizer
■ backends for numerous targets, including:
2 IA-32 (i.e., 32-bit x86), x86-64, ARM, Qualcomm Hexagon, MIPS,
PowerPC, RISC-V, SPARC, z/Architecture
■ numerous tools and libraries, including:
2 debugger (LLDB), linker (LLD)
2 C++ standard library implementation (libc++)
■ open-source (licensed under Apache 2.0 Licence with LLVM exceptions)
■ originally written by Vikram Adve and Chris Lattner
■ initial release in 2003

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 33


Clang

■ Clang is compiler frontend for:


2 C, C++, Objective-C, and Objective-C++ programming languages
■ has support for:
2 OpenMP, OpenCL, CUDA, and HIP
■ Clang very modular and organized as collection of libraries along with
some programs that use these libraries
■ library for each of lexer, parser, semantic analyzer, and so on
■ Clang originally developed by Apple and then later released under
open-source license in 2007
■ Clang compiler frontend not to be confused with compiler-driver programs
clang and clang++, which can do more than just compile (e.g., optimize,
assemble, and link)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 34


Section 2.1

Obtaining and Building LLVM/Clang

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 35


Obtaining LLVM/Clang

■ LLVM/Clang source code can be obtained from LLVM monorepo at:


2 https://github.com/llvm/llvm-project.git

■ repository is quite large


■ can also obtain individual release tarballs from GitHub at:
2 https://github.com/llvm/llvm-project/releases

■ for example, tarball for LLVM/Clang 15.0.6 available from:


2 https://github.com/llvm/llvm-project/releases/download/

llvmorg-15.0.6/llvm-project-15.0.6.src.tar.xz
■ in most cases, probably preferable to obtain LLVM/Clang from Git
repository

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 36


LLVM/Clang, RTTI, and EH

■ by default, LLVM/Clang is built with run-time type-identification (RTTI) and


exception handling (EH) disabled
■ if LLVM/Clang libraries and code using these libraries built with different
RTTI/EH settings, care must be taken to avoid accidentally relying on
RTTI/EH in code where these features have been disabled
■ some problematic things would include:
2 throwing exceptions that must propagate through code for which exception
handling disabled (e.g., Clang libraries)
2 using typeid or dynamic_cast with types for which RTTI disabled
■ probably advisable to build LLVM/Clang with RTTI/EH enabled if any other
code is to be built with RTTI/EH enabled

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 37


Building Clang With CMake

■ LLVM/Clang employs CMake-based build


■ basic build and install of Clang can be performed using command
sequence similar to:
cmake -H$src_dir/llvm -B$build_dir \
-DCMAKE_BUILD_TYPE=$build_type \
-DCMAKE_INSTALL_PREFIX=$install_dir \
-DLLVM_ENABLE_PROJECTS="clang;openmp"
cmake --build $build_dir
cmake --build $build_dir --target install
■ usually some additional CMake options are needed or desirable
■ for more information, see:
https://llvm.org/docs/GettingStarted.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 38


CMake Settings for Building Clang (1)
■ CMAKE_BUILD_TYPE: specify type of build (e.g., Debug,
RelWithDebInfo, or Release)
■ LLVM_ENABLE_PROJECTS: semicolon-separated list of projects to be built
(must at least include clang); some projects include:
2 clang: frontend for C/C++ family of languages
2 clang-tools-extras: various utilities such as clang-format,
clang-tidy, and clangd
2 lld: LLVM linker
2 lldb: LLVM debugger
■ LLVM_ENABLE_RUNTIMES: semicolon-separated list of runtimes to be
built; run-times built using just-built compiler; some runtimes include:
2 libcxx: libc++ C++ standard library

2 libcxxabi: low-level support for libc++ C++ standard library


2 libunwind: stack-unwinding library
2 compiler-rt: compiler run-time support libraries (e.g., run-time libraries

for code sanitizers such as ASan, MSan, and TSan)


■ some components can be built as project or runtime (e.g., openmp)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 39


CMake Settings for Building Clang (2)

■ LLVM_BUILD_LLVM_DYLIB: boolean flag indicating if LLVM should be built


as single shared library
■ LLVM_ENABLE_ASSERTIONS: boolean flag indicating if assertions should
be enabled
■ LLVM_ENABLE_RTTI: boolean flag indicating if RTTI should be enabled
■ LLVM_ENABLE_EH: boolean flag indicating if exception handling (EH)
should be enabled
■ LVM_PARALLEL_COMPILE_JOBS: specifies maximum number of parallel
jobs for compiling
■ LVM_PARALLEL_LINK_JOBS: specifies maximum number of parallel jobs
for linking
■ for more details, see:
2 https://llvm.org/docs/CMake.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 40


Building Clang Distribution

■ use install-distribution target instead of install


■ LLVM_DISTRIBUTION_COMPONENTS: specifies which software
components to be included in distribution (e.g., cmake-exports, LLVM,
llvm-headers)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 41


sde_install_clang Script

■ can use sde_install_clang script to facilitate easier installation of


LLVM/Clang
■ script will obtain specified release of LLVM/Clang from either Git
repository or tarball and then build and install software
■ sde_install_clang script can be obtained from following Git repository:
2 https://github.com/mdadams/sde.git
■ invoking sde_install_clang with no command-line arguments will
cause script to print help information and exit

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 42


llvm-config Program

■ can query how LLVM was configured at time it was built using
llvm-config program
■ can query such things as:
2 LLVM build mode (e.g., Debug or Release)
2 whether LLVM was built with RTTI enabled
2 whether LLVM was built with assertions enabled
2 libraries needed to link against various LLVM components
2 installation directory for LLVM headers
■ for example, to print build mode, RTTI setting, and assertion mode, use:
llvm-config --build-mode --has-rtti --assertion-mode
■ for more information, see:
2 https://llvm.org/docs/CommandGuide/llvm-config.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 43


Section 2.2

Compiler Driver and Stages of Compilation

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 44


Clang Compiler Driver

■ compiler-driver programs, clang and clang++, responsible for performing


various tasks associated with compilation (e.g., compiling, assembling,
and linking)
■ may perform task directly or by running another program
■ determines what tasks (i.e., phases of processing) required
■ runs programs needed to perform those tasks
■ compiler driver can perform many tasks directly itself, which helps to avoid
need to write data to file in order to pass it to next stage in compilation
process

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 45


Example: Printing Compiler Phases

Command
clang++ -ccc-print-phases -c hello.cpp
Output (Standard Error)
+- 0: input, "hello.cpp", c++
+- 1: preprocessor, {0}, c++-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
4: assembler, {3}, object

Command
clang++ -ccc-print-phases hello.cpp
Output (Standard Error)
+- 0: input, "hello.cpp", c++
+- 1: preprocessor, {0}, c++-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
+- 4: assembler, {3}, object
5: linker, {4}, image

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 46


Example: Printing Tool Executions
Command
clang++ -### hello.cpp
Output (Standard Error)
clang version 13.0.0 (Fedora 13.0.0-3.fc35)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
"/usr/bin/clang-13" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-obj" "-mrelax-all" "--mrelax-
,→ relocations" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "
,→ hello.cpp" "-mrelocation-model" "static" "-mframe-pointer=all" "-fmath-errno" "-fno-rounding-math"
,→ "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-debugger-
,→ tuning=gdb" "-fcoverage-compilation-dir=/home/mdadams/work/git/clang_slides/software/
,→ llvm_clang_usage" "-resource-dir" "/usr/lib64/clang/13.0.0" "-internal-isystem" "/usr/bin/../lib/
,→ gcc/x86_64-redhat-linux/11/../../../../include/c++/11" "-internal-isystem" "/usr/bin/../lib/gcc/
,→ x86_64-redhat-linux/11/../../../../include/c++/11/x86_64-redhat-linux" "-internal-isystem" "/usr/
,→ bin/../lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/backward" "-internal-isystem" "/
,→ usr/lib64/clang/13.0.0/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/
,→ bin/../lib/gcc/x86_64-redhat-linux/11/../../../../x86_64-redhat-linux/include" "-internal-externc-
,→ isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdeprecated-macro" "-fdebug-
,→ compilation-dir=/home/mdadams/work/git/clang_slides/software/llvm_clang_usage" "-ferror-limit" "19"
,→ "-fgnuc-version=4.2.1" "-fcxx-exceptions" "-fexceptions" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM
,→ =1" "-o" "/tmp/hello-96987e.o" "-x" "c++" "hello.cpp"
"/usr/bin/ld" "--hash-style=gnu" "--build-id" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/
,→ ld-linux-x86-64.so.2" "-o" "a.out" "/usr/bin/../lib/gcc/x86_64-redhat-linux/11/../../../../lib64/
,→ crt1.o" "/usr/bin/../lib/gcc/x86_64-redhat-linux/11/../../../../lib64/crti.o" "/usr/bin/../lib/gcc/
,→ x86_64-redhat-linux/11/crtbegin.o" "-L/usr/bin/../lib/gcc/x86_64-redhat-linux/11" "-L/usr/bin/../
,→ lib/gcc/x86_64-redhat-linux/11/../../../../lib64" "-L/lib/../lib64" "-L/usr/lib/../lib64" "-L/usr/
,→ bin/../lib" "-L/lib" "-L/usr/lib" "/tmp/hello-96987e.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lc" "-
,→ lgcc_s" "-lgcc" "/usr/bin/../lib/gcc/x86_64-redhat-linux/11/crtend.o" "/usr/bin/../lib/gcc/x86_64-
,→ redhat-linux/11/../../../../lib64/crtn.o"

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 47


Example: Source Code

simple_2.cpp
1 int factorial(int n) {
2 int result = 1;
3 while (n >= 2) {result *= n--;}
4 return result;
5 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 48


Example: Tokens
simple_2.cpp
1 int factorial(int n) {
2 int result = 1;
3 while (n >= 2) {result *= n--;}
4 return result;
5 }

Command
clang -std=c++20 -Xclang -dump-tokens -fsyntax-only simple_2.cpp

Output (Standard Error; Abridged)


int ’int’ [StartOfLine] Loc=<simple_2.cpp:1:1>
identifier ’factorial’ [LeadingSpace] Loc=<simple_2.cpp:1:5>
l_paren ’(’ Loc=<simple_2.cpp:1:14>
int ’int’ Loc=<simple_2.cpp:1:15>
identifier ’n’ [LeadingSpace] Loc=<simple_2.cpp:1:19>
r_paren ’)’ Loc=<simple_2.cpp:1:20>
l_brace ’{’ [LeadingSpace] Loc=<simple_2.cpp:1:22>
int ’int’ [StartOfLine] [LeadingSpace] Loc=<simple_2.cpp:2:2>
identifier ’result’ [LeadingSpace] Loc=<simple_2.cpp:2:6>
equal ’=’ [LeadingSpace] Loc=<simple_2.cpp:2:13>
numeric_constant ’1’ [LeadingSpace] Loc=<simple_2.cpp:2:15>
semi ’;’ Loc=<simple_2.cpp:2:16>
while ’while’ [StartOfLine] [LeadingSpace] Loc=<simple_2.cpp:3:2>
l_paren ’(’ [LeadingSpace] Loc=<simple_2.cpp:3:8>
identifier ’n’ Loc=<simple_2.cpp:3:9>
greaterequal ’>=’ [LeadingSpace] Loc=<simple_2.cpp:3:11>
numeric_constant ’2’ [LeadingSpace] Loc=<simple_2.cpp:3:14>
[text deleted]
eof ’’ Loc=<simple_2.cpp:5:2>
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 49
Example: AST [Graphical]
simple_2.cpp
1 int factorial(int n) {
2 int result = 1;
3 while (n >= 2) {result *= n--;}
4 return result;
5 }

AST (Clang 14; Slightly Abridged)


TranslationUnitDecl

FunctionDecl
int factorial(int)

ParmVarDecl
int n CompoundStmt

DeclStmt WhileStmt ReturnStmt

VarDecl BinaryOperator CompoundStmt ImplicitCastExpr


int result >=
CompoundAssignOperator DeclRefExpr
IntegerLiteral ImplicitCastExpr IntegerLiteral result
1 2 DeclRefExpr UnaryOperator
DeclRefExpr result --
n
DeclRefExpr
n

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 50


Example: AST [clang-check]
simple_2.cpp
1 int factorial(int n) {
2 int result = 1;
3 while (n >= 2) {result *= n--;}
4 return result;
5 }

Command
clang-check -ast-dump -ast-dump-filter=factorial simple_2.cpp -- -fno-color-diagnostics -std=c++20

Output (Clang 14; Standard Output)


Dumping factorial:
FunctionDecl 0x7825ea8 </home/mdadams/work/git/clang_slides/software/llvm_clang_usage/simple_2.cpp:1:1, line
,→ :5:1> line:1:5 factorial ’int (int)’
|-ParmVarDecl 0x7825db8 <col:15, col:19> col:19 used n ’int’
‘-CompoundStmt 0x7826210 <col:22, line:5:1>
|-DeclStmt 0x7826078 <line:2:2, col:16>
| ‘-VarDecl 0x7825fd8 <col:2, col:15> col:6 used result ’int’ cinit
| ‘-IntegerLiteral 0x7826040 <col:15> ’int’ 1
|-WhileStmt 0x78261a8 <line:3:2, col:32>
| |-BinaryOperator 0x78260e8 <col:9, col:14> ’bool’ ’>=’
| | |-ImplicitCastExpr 0x78260d0 <col:9> ’int’ <LValueToRValue>
| | | ‘-DeclRefExpr 0x7826090 <col:9> ’int’ lvalue ParmVar 0x7825db8 ’n’ ’int’
| | ‘-IntegerLiteral 0x78260b0 <col:14> ’int’ 2
| ‘-CompoundStmt 0x7826190 <col:17, col:32>
| ‘-CompoundAssignOperator 0x7826160 <col:18, col:29> ’int’ lvalue ’*=’ ComputeLHSTy=’int’ ComputeResultTy=
,→ ’int’
| |-DeclRefExpr 0x7826108 <col:18> ’int’ lvalue Var 0x7825fd8 ’result’ ’int’
| ‘-UnaryOperator 0x7826148 <col:28, col:29> ’int’ postfix ’--’
| ‘-DeclRefExpr 0x7826128 <col:28> ’int’ lvalue ParmVar 0x7825db8 ’n’ ’int’
‘-ReturnStmt 0x7826200 <line:4:2, col:9>
‘-ImplicitCastExpr 0x78261e8 <col:9> ’int’ <LValueToRValue>
‘-DeclRefExpr 0x78261c8 <col:9> ’int’ lvalue Var 0x7825fd8 ’result’ ’int’

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 51


Example: AST [clang]
simple_2.cpp
1 int factorial(int n) {
2 int result = 1;
3 while (n >= 2) {result *= n--;}
4 return result;
5 }

Command
clang -std=c++20 -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics simple_2.cpp

Output (Clang 14; Standard Output; Abridged)


TranslationUnitDecl 0x9d53738 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x9d53fc0 <<invalid sloc>> <invalid sloc> implicit __int128_t ’__int128’
| ‘-BuiltinType 0x9d53d00 ’__int128’
|-TypedefDecl 0x9d54038 <<invalid sloc>> <invalid sloc> implicit __uint128_t ’unsigned __int128’
| ‘-BuiltinType 0x9d53d20 ’unsigned __int128’
|-TypedefDecl 0x9d543f0 <<invalid sloc>> <invalid sloc> implicit __NSConstantString ’__NSConstantString_tag’
| ‘-RecordType 0x9d54140 ’__NSConstantString_tag’
| ‘-CXXRecord 0x9d54098 ’__NSConstantString_tag’
|-TypedefDecl 0x9d54498 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list ’char *’
| ‘-PointerType 0x9d54450 ’char *’
| ‘-BuiltinType 0x9d537e0 ’char’
|-TypedefDecl 0x9d9e7b0 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list ’__va_list_tag[1]’
| ‘-ConstantArrayType 0x9d9e750 ’__va_list_tag[1]’ 1
| ‘-RecordType 0x9d545a0 ’__va_list_tag’
| ‘-CXXRecord 0x9d544f8 ’__va_list_tag’
‘-FunctionDecl 0x9d9e918 <simple_2.cpp:1:1, line:5:1> line:1:5 factorial ’int (int)’
|-ParmVarDecl 0x9d9e828 <col:15, col:19> col:19 used n ’int’
‘-CompoundStmt 0x9d9ec80 <col:22, line:5:1>
|-DeclStmt 0x9d9eae8 <line:2:2, col:16>
| ‘-VarDecl 0x9d9ea48 <col:2, col:15> col:6 used result ’int’ cinit
[text deleted]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 52


Example: IR
Command
clang++ -O0 -S -emit-llvm -o simple_2.ll -Xclang -disable-O0-optnone simple_2.cpp

LLVM IR (simple_2.ll)
1 ; ModuleID = ’simple_2.cpp’
2 source_filename = "simple_2.cpp"
3 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
4 target triple = "x86_64-unknown-linux-gnu"
5
6 ; Function Attrs: mustprogress noinline nounwind uwtable
7 define dso_local noundef i32 @_Z9factoriali(i32 noundef %0) #0 {
8 %2 = alloca i32, align 4
9 %3 = alloca i32, align 4
10 store i32 %0, i32* %2, align 4
11 store i32 1, i32* %3, align 4
12 br label %4
13 4: ; preds = %7, %1
14 %5 = load i32, i32* %2, align 4
15 %6 = icmp sge i32 %5, 2
16 br i1 %6, label %7, label %12
17 7: ; preds = %4
18 %8 = load i32, i32* %2, align 4
19 %9 = add nsw i32 %8, -1
20 store i32 %9, i32* %2, align 4
21 %10 = load i32, i32* %3, align 4
22 %11 = mul nsw i32 %10, %8
23 store i32 %11, i32* %3, align 4
24 br label %4, !llvm.loop !4
25 12: ; preds = %4
26 %13 = load i32, i32* %3, align 4
27 ret i32 %13
28 }
29
30 ; [text deleted]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 53


Example: Optimized IR
Command
clang++ -O1 -S -emit-llvm -o simple_2-opt.ll simple_2.cpp

LLVM Optimized IR (simple_2-opt.ll)


1 ; ModuleID = ’simple_2.cpp’
2 source_filename = "simple_2.cpp"
3 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
4 target triple = "x86_64-unknown-linux-gnu"
5
6 ; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn
7 define dso_local noundef i32 @_Z9factoriali(i32 noundef %0) local_unnamed_addr #0 {
8 %2 = icmp sgt i32 %0, 1
9 br i1 %2, label %3, label %9
10
11 3: ; preds = %1, %3
12 %4 = phi i32 [ %7, %3 ], [ 1, %1 ]
13 %5 = phi i32 [ %6, %3 ], [ %0, %1 ]
14 %6 = add nsw i32 %5, -1
15 %7 = mul nsw i32 %4, %5
16 %8 = icmp sgt i32 %5, 2
17 br i1 %8, label %3, label %9, !llvm.loop !3
18
19 9: ; preds = %3, %1
20 %10 = phi i32 [ 1, %1 ], [ %7, %3 ]
21 ret i32 %10
22 }
23
24 attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"=
,→ "none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "
,→ target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
25
26 !llvm.module.flags = !{!0, !1}
27 !llvm.ident = !{!2}
28
29 !0 = !{i32 1, !"wchar_size", i32 4}
30 !1 = !{i32 7, !"uwtable", i32 1}
31 !2 = !{!"clang version 14.0.1"}
32 !3 = distinct !{!3, !4, !5}
33 !4 = !{!"llvm.loop.mustprogress"}
34 !5 = !{!"llvm.loop.unroll.disable"}

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 54


Example: x86-64 Assembly Code
Command
clang++ -S simple_2.cpp

x86-64 Assembly Code (Abridged; simple_2.s)


1 .text
2 .file "simple_2.cpp"
3 .globl _Z9factoriali # -- Begin function _Z9factoriali
4 .p2align 4, 0x90
5 .type _Z9factoriali,@function
6 _Z9factoriali: # @_Z9factoriali
7 .cfi_startproc
8 # %bb.0:
9 pushq %rbp
10 .cfi_def_cfa_offset 16
11 .cfi_offset %rbp, -16
12 movq %rsp, %rbp
13 .cfi_def_cfa_register %rbp
14 movl %edi, -4(%rbp)
15 movl $1, -8(%rbp)
16 .LBB0_1: # =>This Inner Loop Header: Depth=1
17 cmpl $2, -4(%rbp)
18 jl .LBB0_3
19 # %bb.2: # in Loop: Header=BB0_1 Depth=1
20 movl -4(%rbp), %eax
21 movl %eax, %ecx
22 addl $-1, %ecx
23 movl %ecx, -4(%rbp)
24 imull -8(%rbp), %eax
25 movl %eax, -8(%rbp)
26 jmp .LBB0_1
27 .LBB0_3:
28 movl -8(%rbp), %eax
29 popq %rbp
30 .cfi_def_cfa %rsp, 8
31 retq
32 # [text deleted]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 55


Example: Optimized x86-64 Assembly Code
Command
clang++ -O1 -S -o simple_2-opt.s simple_2.cpp

Optimized x86-64 Assembly Code (simple_2-opt.s)


1 .text
2 .file "simple_2.cpp"
3 .globl _Z9factoriali # -- Begin function _Z9factoriali
4 .p2align 4, 0x90
5 .type _Z9factoriali,@function
6 _Z9factoriali: # @_Z9factoriali
7 .cfi_startproc
8 # %bb.0:
9 # kill: def $edi killed $edi def $rdi
10 movl $1, %eax
11 cmpl $2, %edi
12 jl .LBB0_3
13 # %bb.1:
14 movl $1, %eax
15 .p2align 4, 0x90
16 .LBB0_2: # =>This Inner Loop Header: Depth=1
17 imull %edi, %eax
18 leal -1(%rdi), %ecx
19 cmpl $2, %edi
20 movl %ecx, %edi
21 jg .LBB0_2
22 .LBB0_3:
23 retq
24 .Lfunc_end0:
25 .size _Z9factoriali, .Lfunc_end0-_Z9factoriali
26 .cfi_endproc
27 # -- End function
28 .ident "clang version 14.0.1"
29 .section ".note.GNU-stack","",@progbits
30 .addrsig

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 56


Example: Name Demangling

Command (With Mangled Names as Arguments to llvm-cxxfilt)


llvm-cxxfilt _Z9factoriali

Output
factorial(int)

Command (Run llvm-cxxfilt as Filter)


echo "_Z9factoriali _Z9factoriali" | llvm-cxxfilt

Output
factorial(int) factorial(int)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 57


Example: CFG for Unoptimized Code
Command
clang++ -O0 -S -emit-llvm -o simple_2.ll -Xclang -disable-O0-optnone simple_2.cpp
opt -dot-cfg -cfg-func-name=_Z9factoriali -cfg-dot-filename-prefix \
dot-simple_2- simple_2.ll > /dev/null
CFG

%1:
%2 = alloca i32, align 4
%3 = alloca i32, align 4
store i32 %0, i32* %2, align 4
store i32 1, i32* %3, align 4
br label %4

%4:
4:
%5 = load i32, i32* %2, align 4
%6 = icmp sge i32 %5, 2
br i1 %6, label %7, label %12
T F

%7:
7:
%8 = load i32, i32* %2, align 4
%12:
%9 = add nsw i32 %8, -1
12:
store i32 %9, i32* %2, align 4
%13 = load i32, i32* %3, align 4
%10 = load i32, i32* %3, align 4
ret i32 %13
%11 = mul nsw i32 %10, %8
store i32 %11, i32* %3, align 4
br label %4, !llvm.loop !4
CFG for '_Z9factoriali' function

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 58


Example: CFG for Optimized Code
Command
clang++ -O1 -S -emit-llvm -o simple_2-opt.ll simple_2.cpp
opt -dot-cfg -cfg-func-name=_Z9factoriali -cfg-dot-filename-prefix \
dot-simple_2-opt- simple_2-opt.ll > /dev/null
CFG

%1:
%2 = icmp sgt i32 %0, 1
br i1 %2, label %3, label %9
T F

%3:
3:
%4 = phi i32 [ %7, %3 ], [ 1, %1 ]
%5 = phi i32 [ %6, %3 ], [ %0, %1 ]
%6 = add nsw i32 %5, -1
%7 = mul nsw i32 %4, %5
%8 = icmp sgt i32 %5, 2
br i1 %8, label %3, label %9, !llvm.loop !3
T F

%9:
9:
%10 = phi i32 [ 1, %1 ], [ %7, %3 ]
ret i32 %10
CFG for '_Z9factoriali' function

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 59


Part 3

LLVM and Clang Libraries

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 60


Core LLVM Libraries

■ core LLVM libraries provide base-level functionality (i.e., framework) with


which frontends, middle-ends, backends, and other tools can be
constructed
■ written in C++
■ code in llvm namespace and child namespaces

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 61


Comments on LLVM Base Library Functionality

■ llvm::raw_ostream: fully-buffered output stream (which does not


support seeking)
■ llvm::outs: instance of llvm::raw_ostream for normal output from
program (similar to std::cout)
■ llvm::errs: instance of llvm::raw_ostream for error output from
program (similar to std::cerr)
■ llvm::StringRef: immutable non-owning reference to string (similar in
spirit to std::string_view)
■ llvm::Expected: tagged union holding either some expected type or
error in form of llvm::Error
■ llvm::ErrorOr: holds either error in form of std::error_code or some
other type

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 62


RTTI

■ llvm::isa<> returns bool indicating if reference/pointer refers to instance


of specified class
■ llvm::cast<> operator converts pointer/reference from base to derived
class causing assertion if object not instance of derived type
■ llvm::dyn_cast<> operator checks if operand of specified type and if so
returns pointer to it (with that type); otherwise returns null pointer
■ for more details, see:
2 https://llvm.org/docs/ProgrammersManual.html#

the-isa-cast-and-dyn-cast-templates

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 63


Clang Libraries

■ Clang is very modular and mostly structured as collection of libraries


■ Clang libraries (mostly) written in C++
■ for most practical purposes, can simply think of Clang as one large C++
library
■ when linking, however, reality is Clang code split across number of
libraries
■ LibClang provides C library API to access limited subset of Clang
functionality

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 64


Clang C API: LibClang Library

■ C library
■ functionality provided includes:
2 parsing source code into AST
2 loading already-parsed ASTs
2 traversing ASTs
2 annotating source locations with elements within AST
■ does not provide access to all information in Clang AST
■ API intended to be relatively stable from one release to next
■ intended to provide only basic functionality needed to support
development tools
■ data types prefixed with “CX” and functions prefixed with “clang_”
■ access to AST through high-level abstractions
■ for details on API, see:
2 https://clang.llvm.org/doxygen/group__CINDEX.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 65


LibTooling

■ C++ library
■ provides much richer set of functionality relative to LibClang
■ Clang library that provides functionality for utilizing parts of Clang in
standalone tools or Clang compiler plugins
■ provides convenient way to invoke compiler frontend on source code
■ provides support for compilation databases
■ easily integrates with code using CommandLine Library for processing of
command-line arguments
■ code in clang::tooling namespace
■ does not have stable API

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 66


Clang Include Directory (1)

■ Clang include directory contains numerous low-level headers such as


stddef.h, stdint.h, and limits.h
■ pathname of Clang include directory depends on how Clang installed, but
typically looks something like
$INSTALL_PREFIX/lib/clang/$VERSION/include, where
$INSTALL_PREFIX install directory for Clang and $VERSION is version of
Clang (e.g., 15.0.0)
■ compiler driver programs (i.e., clang and clang++) configured with
header search path that includes Clang include directory
■ Clang libraries themselves not preconfigured with Clang include directory
in header search path
■ if Clang include directory not included in header search path used by
Clang libraries, often many compiler errors will arise due to missing
headers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 67


Clang Include Directory (2)

■ Clang libraries assume Clang include directory has pathname


../$lib/clang/$VERSION/include relative to pathname of running
program where $lib is typically either lib or lib64
■ this will only result in correct path if program installed in bin directory of
Clang installation
■ typically, user will not place program in this directory so extra directory
must be added to header search path when compiling using Clang library
(via compiler option)
■ programs that employ standard Clang tool command-line interface (via
clang::tooling::CommonOptionsParser) have -extra-arg option
that can be used to pass extra options to compiler
■ for example, use option -extra-arg=-I$DIR to add $DIR to header
search path

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 68


Section 3.1

Command-Line Processing

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 69


Command-Line Processing

■ LLVM provides CommandLine Library for processing command-line


options and arguments
■ uses namespace llvm::cl
■ handles options as well as positional arguments
■ option may have arguments
■ parses and extracts arguments into variables
■ can specify help information for each option as well as general help
information
■ LLVM CommandLine Library may be more convenient to use than other
alternatives such as getopt or Boost Program Options Library
■ for more details, see:
2 https://llvm.org/docs/CommandLine.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 70


Command-Line Processing Example: Summary
■ in slides/examples/command_line_0 directory in companion
repository
■ processes command-line arguments and prints result obtained
■ uses LLVM CommandLine Library (without
clang::tooling::CommonOptionsParser from Clang Tooling Library)
■ several options are supported:
2 -log-file
2 -verbose
2 -debug-level

2 -value with alias -v

■ zero or more positional parameters allowed


■ example:
2 command: app -log-file logfile.txt -v 1 -v 2 main.cpp
2 output produced:
log file: logfile.txt
debug level: -1
verbose: false
value: 1
value: 2
file: main.cpp

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 71


Command-Line Processing Example: main.cpp
1 #include <format>
2 #include <string>
3 #include "llvm/Support/CommandLine.h"
4
5 namespace lc = llvm::cl;
6
7 static lc::OptionCategory toolOpts("Tool options");
8 static lc::opt<std::string> logFile("log-file", lc::ZeroOrMore);
9 static lc::opt<bool> verbose("verbose", lc::Optional, lc::cat(toolOpts));
10 static lc::opt<int> debugLevel("debug-level", lc::Optional,
11 lc::init(-1), lc::cat(toolOpts));
12 static lc::list<int> values("value", lc::ZeroOrMore, lc::cat(toolOpts));
13 static lc::alias valuesAlias("v", lc::aliasopt(values));
14 static lc::list<std::string> files(lc::Positional, lc::ZeroOrMore,
15 lc::cat(toolOpts));
16
17 static void printVersion(llvm::raw_ostream& out) {out << "version 0.0.0\n";}
18
19 int main(int argc, char **argv) {
20 lc::SetVersionPrinter(printVersion);
21 lc::ParseCommandLineOptions(argc, argv,
22 "This program illustrates the use of the LLVM CommandLine API.");
23 llvm::outs()
24 << std::format("log file: {}\n",
25 !logFile.empty() ? std::string(logFile) : "---")
26 << std::format("debug level: {}\n", debugLevel)
27 << std::format("verbose: {}\n", verbose);
28 for (auto i : values) {llvm::outs() << std::format("value: {}\n", i);}
29 for (auto i : files) {llvm::outs() << std::format("file: {}\n", i);}
30 }
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 72
Command-Line Processing Example: Summary
■ in slides/examples/command_line directory in companion repository
■ processes command-line arguments and prints result obtained
■ uses LLVM CommandLine Library in conjunction with
clang::tooling::CommonOptionsParser from Clang Tooling library
■ several options are supported:
2 -verbose with alias -v
2 -o
2 -foobar

■ one positional parameter is required


■ zero or more additional positional parameters are permitted
■ example:
2 command: app -verbose test hello.cpp
2 program output produced:
verbose : true
foobar : false
operation : test
output file : ( null )
number of compilation database entries : 0
source paths :
hello . cpp

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 73


Command-Line Processing Example: main.cpp (1)
1 #include <format>
2 #include <string>
3 #include "clang/Tooling/CommonOptionsParser.h"
4 #include "clang/Tooling/Tooling.h"
5 #include "llvm/Support/CommandLine.h"
6 #include "llvm/Support/raw_ostream.h"
7
8 using namespace std::literals;
9
10 static llvm::cl::OptionCategory toolOptionCat("Tool Options");
11 static llvm::cl::extrahelp
12 CommonHelp(clang::tooling::CommonOptionsParser::HelpMessage);
13 static llvm::cl::extrahelp MoreHelp(
14 "This tool does not actually do anything useful.\n"
15 "Life is full of disappointments. Get over it.\n"
16 );
17 llvm::cl::opt<std::string> outFile(
18 "o", llvm::cl::desc("Output file"), llvm::cl::value_desc("output_file"),
19 llvm::cl::cat(toolOptionCat));
20 llvm::cl::opt<bool> verbose("verbose",
21 llvm::cl::desc("Enable verbose output."), llvm::cl::cat(toolOptionCat));
22 llvm::cl::alias verbose2("v", llvm::cl::desc("Alias for -verbose"),
23 llvm::cl::aliasopt(verbose));
24 llvm::cl::opt<bool> foobar("foobar",
25 llvm::cl::desc("Enable experimental features."), llvm::cl::Hidden);
26 llvm::cl::opt<std::string> opName(llvm::cl::Positional, llvm::cl::Required,
27 llvm::cl::desc("Operation to perform."),
28 llvm::cl::value_desc("op_name"), llvm::cl::cat(toolOptionCat));

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 74


Command-Line Processing Example: main.cpp (2)

30 int main(int argc, const char **argv) {


31 llvm::Expected<clang::tooling::CommonOptionsParser> expectedOptionsParser(
32 clang::tooling::CommonOptionsParser::create(argc, argv, toolOptionCat));
33 if (!expectedOptionsParser) {
34 llvm::errs() << std::format("Unable to create option parser ({}).\n",
35 llvm::toString(std::move(expectedOptionsParser.takeError())));
36 return 1;
37 }
38 clang::tooling::CommonOptionsParser& optionsParser = *expectedOptionsParser;
39 llvm::outs()
40 << std::format("verbose: {}\n", static_cast<bool>(verbose))
41 << std::format("foobar: {}\n", static_cast<bool>(foobar))
42 << std::format("operation: {}\n",
43 !opName.empty() ? opName : "(null)"s)
44 << std::format("output file: {}\n",
45 !outFile.empty() ? outFile : "(null)"s);
46 llvm::outs() << std::format("number of compilation database entries: {}\n",
47 optionsParser.getCompilations().getAllCompileCommands().size());
48 llvm::outs() << "source paths:\n";
49 for (auto path : optionsParser.getSourcePathList()) {
50 llvm::outs() << std::format(" {}\n", path);
51 }
52 return 0;
53 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 75


Section 3.2

Compilation Databases

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 76


Compilation Databases

■ when running compiler frontend, must specify appropriate options to use


for compiler invocations
■ compilation database specifies which options are used for each source
file being built
■ Clang provides API for compilation database through base class
clang::tooling::CompilationDatabase
■ several classes derive from this base class in order to provide various
kinds of compilation databases
■ also provides adapter class that can be used to transform information
stored in compilation databases in various ways (e.g., by applying
transformations to compiler flags)
■ compilation database can have more than one entry for particular source
file, since source file could be compiled more than once with different
options each time (e.g., non-production and production builds)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 77


CompilationDatabase Class
■ clang::tooling::CompilationDatabase base class provides basic
API for compilation databases
■ some key methods:
2 getCompileCommands: returns all compile commands in which specified

source file is compiled


2 getAllFiles: return list of all source files in compilation database
2 getAllCompileCommands: returns all compile commands for all source
files in compilation database
■ several factory functions for CompilationDatabase objects:
2 loadFromDirectory: load compilation database from build directory

(currently, only looks for compile_commands.json)


2 autoDetectFromSource: looks for compilation database in directory

containing specified source file and each of its successive parents and
loads first one found
2 autoDetectFromDirectory: looks for compilation database in specified
directory and each of its successive parents and loads first one found
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1CompilationDatabase.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 78
Fixed Compilation Database

■ Clang supports notion of fixed compilation database


■ every source file compiled with identical flags
■ command-line arguments specified one per line
■ intended for use with very simple projects
■ does not identify source files to be built
■ only specifies compiler options
■ not possible to query source files
■ not possible to query all compile commands for all source files

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 79


Fixed Compilation Database Example

compile_flags.txt
-DGREETING="Hello, World!"
-DANSWER=42
-g
-O2
-I
/usr/local/libfoo/include

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 80


FixedCompilationDatabase Class
■ clang::tooling::FixedCompilationDatabase class inherits from
CompilationDatabase base class
■ provides flavor of compilation database that is associated with fixed
compilation database
■ overrides virtual functions from CompilationDatabase base class
■ some functions have trivial behavior since fixed database cannot provide
full functionality of CompilationDatabase API (e.g., getAllFiles and
getAllCompileCommands)
■ factory functions:
2 loadFromCommandLine: parses command-line arguments and generates

fixed database based on those arguments


2 loadFromFile: reads flags one per line from file
2 loadFromBuffer: reads flags one per line from buffer
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1FixedCompilationDatabase.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 81


Fixed Compilation Database Example: Summary

■ in slides/examples/compilation_database directory in companion


repository
■ loads fixed compilation database from specified file
■ queries compilation database for each source file specified
■ example program output:
command :
filename : hello . cpp
directory : ./ data
command line : / home / jdoe / build / clang - tool - DGREETING =" Hello , World !" - DANSWER =42 -g
,→ -O2 -I / usr / local / libfoo / include hello . cpp
no output
no heuristic
command :
filename : goodbye . cpp
directory : ./ data
command line : / home / jdoe / build / clang - tool - DGREETING =" Hello , World !" - DANSWER =42 -g
,→ -O2 -I / usr / local / libfoo / include goodbye . cpp
no output
no heuristic

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 82


Fixed Compilation Database Example: fixed.cpp (1)

1 #include <format>
2 #include <utility>
3 #include "clang/Tooling/CommonOptionsParser.h"
4 #include "clang/Tooling/CompilationDatabase.h"
5 #include "llvm/Config/llvm-config.h"
6 #include "utility.hpp"
7
8 namespace ct = clang::tooling;
9
10 int main(int argc, char** argv) {
11 if (argc < 2) {
12 llvm::errs() << "no fixed database specified\n";
13 return 1;
14 }
15 std::string pathname(argv[1]);
16 std::string errString;
17 std::unique_ptr<ct::CompilationDatabase> compDatabase;
18 compDatabase = ct::FixedCompilationDatabase::loadFromFile(pathname,
19 errString);
20 if (!compDatabase) {
21 llvm::errs() << std::format("ERROR: {}\n", errString);
22 return 1;
23 }
24 std::vector<std::string> sourcePaths = compDatabase->getAllFiles();
25 for (const auto& sourcePath : sourcePaths) {
26 llvm::outs() << std::format("{}\n", sourcePath);
27 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 83


Fixed Compilation Database Example: fixed.cpp (2)

28 std::vector<ct::CompileCommand> compCommands =
29 compDatabase->getAllCompileCommands();
30 printCompCommands(llvm::outs(), compCommands);
31 for (int i = 2; i < argc; ++i) {
32 std::vector<ct::CompileCommand> compCommands =
33 compDatabase->getCompileCommands(argv[i]);
34 printCompCommands(llvm::outs(), compCommands);
35 }
36 return 0;
37 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 84


JSON Compilation Database

■ compilation databases often represented using JavaScript Object Notation


(JSON) format
■ root JSON object is array
■ each array element can have following fields:
2 directory: working directory for running compilation command
2 file: source file processed
2 arguments: compile command argument vector
2 command: compile command string (single shell-escaped string)
2 output: name of output created

■ either arguments or command is required


■ some build tools can generate JSON compilation databases (e.g., CMake)
■ for more information, see:
2 https://clang.llvm.org/docs/JSONCompilationDatabase.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 85


Generating JSON Compilation Database With CMake

■ CMake can generate JSON compilation database


■ CMake variable CMAKE_EXPORT_COMPILE_COMMANDS has boolean value
that specifies if compilation database should be generated
■ compilation-database file named compile_commands.json
■ placed in (top-level) binary directory for CMake build
■ for example, use command like:
cmake -H$src_dir -B$bin_dir \
-DCMAKE_EXPORT_COMPILE_COMMANDS=1

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 86


JSON Compilation Database Example

[
{
"arguments" : [
"/usr/bin/clang++",
"-Irelative",
"-DGREET=Hello, World!\\n",
"-c",
"-o",
"file.o",
"file.cpp"
],
"directory" : "/home/user/llvm/build",
"file" : "file.cpp"
},
{
"command" : "/usr/bin/clang++ -Irelative -DGREET=\"Hello, World!\\\\n\" -c -o file.o file.cpp",
"directory" : "/home/user/llvm/build",
"file" : "file2.cpp"
}
]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 87


JSONCompilationDatabase Class

■ clang::tooling::JSONCompilationDatabase inherits from


CompilationDatabase base class
■ provides flavor of compilation database that is associated with JSON
compilation database
■ provides overrides for virtual functions associated with
CompilationDatabase appropriate for JSON compilation database
■ factory functions:
2 loadFromFile: loads JSON compilation database from specified file
2 loadFromBuffer: loads JSON compilation database from buffer

■ for more information, see:


2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1JSONCompilationDatabase.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 88


JSON Compilation Database Example: Summary

■ in slides/examples/compilation_database directory in companion


repository
■ loads JSON compilation database from specified file
■ queries compilation database for each source file specified
■ example program output:
/ home / user / llvm / build / file2 . cpp
/ home / user / llvm / build / file . cpp
command :
filename : file . cpp
directory : / home / user / llvm / build
command line : / usr / bin / clang ++ - Irelative - DGREET = Hello , World !\ n -c -o file .o file .
,→ cpp
no output
no heuristic
command :
filename : file2 . cpp
directory : / home / user / llvm / build
command line : / usr / bin / clang ++ - Irelative - DGREET = Hello , World !\ n -c -o file .o file .
,→ cpp
no output
no heuristic

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 89


JSON Compilation Database Example: json.cpp (1)
1 #include <format>
2 #include <utility>
3 #include "clang/Tooling/CompilationDatabase.h"
4 #include "clang/Tooling/JSONCompilationDatabase.h"
5 #include "utility.hpp"
6
7 namespace ct = clang::tooling;
8
9 int main(int argc, char** argv) {
10 std::string errString;
11 if (argc < 2) {
12 llvm::errs() << "no JSON compilation database specified\n";
13 return 1;
14 }
15 std::string pathname = argv[1];
16 std::unique_ptr<ct::CompilationDatabase> compDatabase;
17 compDatabase = ct::JSONCompilationDatabase::loadFromFile(pathname,
18 errString, ct::JSONCommandLineSyntax::AutoDetect);
19 if (!compDatabase) {
20 llvm::errs() << std::format("ERROR: {}\n", errString);
21 return 1;
22 }
23 std::vector<std::string> sourcePaths = compDatabase->getAllFiles();
24 for (const auto& sourcePath : sourcePaths) {
25 llvm::outs() << std::format("{}\n", sourcePath);
26 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 90


JSON Compilation Database Example: json.cpp (2)

27 std::vector<ct::CompileCommand> compCommands =
28 compDatabase->getAllCompileCommands();
29 printCompCommands(llvm::outs(), compCommands);
30 for (int i = 2; i < argc; ++i) {
31 std::vector<ct::CompileCommand> compCommands =
32 compDatabase->getCompileCommands(argv[i]);
33 printCompCommands(llvm::outs(), compCommands);
34 }
35 return 0;
36 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 91


JSON Compilation Database Example: utility.hpp

1 #include <vector>
2 #include "llvm/Support/raw_ostream.h"
3
4 bool printCompCommands(llvm::raw_fd_ostream& out,
5 const std::vector<clang::tooling::CompileCommand>& compCommands);

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 92


JSON Compilation Database Example: utility.cpp
1 #include <format>
2 #include <vector>
3 #include "clang/Tooling/CompilationDatabase.h"
4 #include "llvm/Support/raw_ostream.h"
5 #include "utility.hpp"
6
7 namespace ct = clang::tooling;
8
9 bool printCompCommands(llvm::raw_fd_ostream& out,
10 const std::vector<ct::CompileCommand>& compCommands) {
11 for (auto compCommand = compCommands.begin();
12 compCommand != compCommands.end(); ++compCommand) {
13 out << "command:\n"
14 << std::format(" filename: {}\n", compCommand->Filename)
15 << std::format(" directory: {}\n", compCommand->Directory);
16 out << " command line:";
17 for (auto word : compCommand->CommandLine) {out << " " << word;}
18 out << ’\n’;
19 if (!compCommand->Output.empty()) {
20 out << " output: " << compCommand->Output << ’\n’;
21 } else {
22 out << " no output\n";
23 }
24 if (!compCommand->Heuristic.empty()) {
25 out << " heuristic: " << compCommand->Heuristic << ’\n’;
26 } else {
27 out << " no heuristic\n";
28 }
29 }
30 return !out.has_error();
31 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 93


clang::tooling::ArgumentsAdjuster Callable

■ clang::tooling::ArgumentsAdjuster is type of callable (i.e.,


function/functor) that manipulates command-line arguments
■ intended to be used with compilation databases to apply transformations
to compiler flags
■ for more information, see:
2 https://clang.llvm.org/doxygen/namespaceclang_1_1tooling.

html#a8dcb3e0419f4f8de952b46ad1c627f68

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 94


ArgumentsAdjustingCompilations Class

■ clang::tooling::ArgumentsAdjustingCompilations class inherits


from CompilationDatabase base class
■ ArgumentsAdjustingCompilations provides wrapper around existing
compilation database that allow transformations to be applied to compiler
flags
■ provides method appendArgumentsAdjuster that allows
ArgumentsAdjuster to be applied to commands in compilation database
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1ArgumentsAdjustingCompilations.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 95


Argument Adjuster Example: Summary

■ in slides/examples/compilation_database directory in companion


repository
■ loads compilation database from specified source
■ optionally applies arguments adjuster to compilation database
■ queries compilation database for each source file specified
■ example program output:
command :
filename : hello . cpp
directory : ./ data
command line : / home / jdoe / build / clang - tool - DGREETING =" Hello , World !" - DANSWER =42 -g
,→ -O2 -I / usr / local / libfoo / include hello . cpp
no output
no heuristic

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 96


Argument Adjuster Example: adjuster.cpp (1)
1 #include <format>
2 #include <string>
3 #include <utility>
4 #include <vector>
5 #include <unistd.h>
6 #include "clang/Tooling/ArgumentsAdjusters.h"
7 #include "clang/Tooling/CommonOptionsParser.h"
8 #include "clang/Tooling/CompilationDatabase.h"
9 #include "clang/Tooling/JSONCompilationDatabase.h"
10 #include "clang/Tooling/Tooling.h"
11 #include "utility.hpp"
12
13 namespace ct = clang::tooling;
14
15 std::pair<std::unique_ptr<ct::CompilationDatabase>, std::string>
16 loadCompDatabase(const std::string& pathname, bool json) {
17 std::unique_ptr<ct::CompilationDatabase> compDatabase;
18 std::string errString;
19 if (json) {
20 compDatabase = ct::JSONCompilationDatabase::loadFromFile(pathname,
21 errString, ct::JSONCommandLineSyntax::AutoDetect);
22 } else {
23 compDatabase = ct::FixedCompilationDatabase::loadFromFile(pathname,
24 errString);
25 }
26 if (compDatabase) {errString.clear();}
27 return {std::move(compDatabase), errString};
28 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 97


Argument Adjuster Example: adjuster.cpp (2)

30 std::unique_ptr<ct::CompilationDatabase> wrapCompDatabase(
31 std::unique_ptr<ct::CompilationDatabase> compDatabase, int adjust) {
32 compDatabase = std::make_unique<ct::ArgumentsAdjustingCompilations>(
33 std::move(compDatabase));
34 auto aac = static_cast<ct::ArgumentsAdjustingCompilations*>(
35 compDatabase.get());
36 switch (adjust) {
37 case 1:
38 aac->appendArgumentsAdjuster(ct::getClangSyntaxOnlyAdjuster());
39 break;
40 case 2:
41 aac->appendArgumentsAdjuster(ct::getInsertArgumentAdjuster("-DFOO",
42 ct::ArgumentInsertPosition::BEGIN));
43 break;
44 }
45 return compDatabase;
46 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 98


Argument Adjuster Example: adjuster.cpp (3)

48 void processCommandLine(int argc, char** argv, std::string& pathname,


49 bool& json, int& adjust, std::vector<std::string>& sourcePathnames) {
50 json = true;
51 pathname.clear();
52 adjust = 0;
53 sourcePathnames.clear();
54 for (int c; (c = getopt(argc, argv, "a:j:f:")) >= 0;) {
55 switch (c) {
56 case ’a’:
57 adjust = std::atoi(optarg);
58 break;
59 case ’j’:
60 pathname = optarg;
61 json = true;
62 break;
63 case ’f’:
64 pathname = optarg;
65 json = false;
66 break;
67 }
68 }
69 if (pathname.empty()) {
70 llvm::errs() << "ERROR: no compilation database specified\n";
71 std::exit(1);
72 }
73 for (int i = optind; i < argc; ++i) {sourcePathnames.push_back(argv[i]);}
74 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 99


Argument Adjuster Example: adjuster.cpp (4)

76 int main(int argc, char** argv) {


77 std::string pathname;
78 bool json;
79 int adjust;
80 std::vector<std::string> sourcePathnames;
81 processCommandLine(argc, argv, pathname, json, adjust, sourcePathnames);
82 auto [compDatabase, errString] = loadCompDatabase(pathname, json);
83 if (!compDatabase) {
84 llvm::errs() << std::format("ERROR: {}\n", errString);
85 return 1;
86 }
87 compDatabase = wrapCompDatabase(std::move(compDatabase), adjust);
88 std::vector<std::string> sourcePaths = compDatabase->getAllFiles();
89 for (const auto& sourcePath : sourcePaths) {
90 llvm::outs() << std::format("{}\n", sourcePath);
91 }
92 std::vector<ct::CompileCommand> compCommands =
93 compDatabase->getAllCompileCommands();
94 printCompCommands(llvm::outs(), compCommands);
95 for (auto sourcePathname : sourcePathnames) {
96 std::vector<ct::CompileCommand> compCommands =
97 compDatabase->getCompileCommands(sourcePathname);
98 printCompCommands(llvm::outs(), compCommands);
99 }
100 return 0;
101 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 100


Section 3.3

ASTs

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 101


ASTs

■ nodes in AST do not all share common base type


■ number of root types from which AST node may be derived
■ three most common root types for AST nodes:
1 Type: used to represent types
::::
2 Decl: used to represent declarations
::::
3 Stmt: used to represent statements and expressions
::::
■ each of Type, Decl, and Stmt has many subclasses
■ Expr
::::
particularly important subclass of Stmt used to represent
expressions (since expression can be viewed as statement that yields
value)
■ AST node types in clang namespace (e.g., clang::Decl and
clang::Stmt)
■ tree traversal complicated by lack of common base type for all nodes
■ strictly speaking, AST is graph, not tree (as nodes can have multiple
parents)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 102


Type Nodes

■ nodes with type :::::


Type or types derived therefrom used to represent types
■ some examples of types derived from Type:
2 BuiltinType

2 PointerType
2 ArrayType
2 RecordType
2 FunctionType

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 103


Decl Nodes
■ Decl type nodes used to represent various kinds of declarations,
::::
including declarations of:
2 variables and data members
2 (member and non-member) functions
2 parameters of (member and non-member) functions and templates
2 classes and structs
■ different kinds of declarations represented by various subclasses of Decl
■ some examples of types derived from Decl include:
2 TranslationUnitDecl: translation unit
:::::::::::::::::::
2 NamedDecl: declaration that may have name
:::::::::
2 VarDecl: variable declaration
:::::::
2 ParmVarDecl: declaration of function parameter
:::::::::::
2 FunctionDecl: function declaration
::::::::::::
2 CxxMethodDecl: declaration of static or non-static method of
:::::::::::::
struct/union/class
2 CxxRecordDecl
:::::::::::::
: declaration of C++ struct/union/class
2 DeclaratorDecl
::::::::::::::
: declaration that uses declarator (e.g., type-and-qualifier
name)
2 ValueDecl
:::::::::
: declaration for which declared instance can be value
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 104
Stmt and Expr Nodes
■ most directives in program that correspond to actions can be classified as
statements
■ statements represented by Stmt
::::
class and subclasses thereof
■ some examples of subclasses of Stmt class include:
2 CompoundStmt used to represent compound statements
::::::::::::
2 IfStmt used to represent if statements
::::::
2 SwitchStmt used to represent switch statements
::::::::::
2 ForStmt used to represent for loops
:::::::
2 WhileStmt used to represent while statements
:::::::::
■ compound statement represents collection of multiple statements
■ expressions are special kinds of statements that generate values (e.g.,
result of operator, such as function call operator)
■ expressions represented by Expr
::::
class and subclasses thereof
■ some examples of subclasses of Expr class include:
2 DeclRefExpr used to represent reference to symbol
:::::::::::
2 CallExpr used to represent function call operator
::::::::

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 105


Other Root Types for Nodes

■ TypeLoc
::::::::
: represents source information regarding type
■ NestedNameSpecifier
:::::::::::::::::::::
: represents C++ nested name specifier (e.g.,
std::vector<int>)
■ NestedNameSpecifierLoc
:::::::::::::::::::::::::
: represents C++ nested-name specifier
augmented with source location information
■ CXXBaseSpecifier
::::::::::::::::::
: represents base class of C++ class
■ CXXCtorInitializer
::::::::::::::::::::
: represents C++ base or member initializer
■ TemplateArgument: represents template argument
::::::::::::::::::
■ TemplateArgumentLoc
:::::::::::::::::::::
: represents source information regarding
template argument
■ LambdaCapture
:::::::::::::::
: describes capture for lambda expression (e.g., capture
of variable or of this)
■ Attr
::::
: represents attribute information (e.g., C++ attribute on type,
function, or statement)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 106


clang::DynTypedNode Class

■ clang::DynTypedNode
:::::::::::::::::::::
class used to represent generic AST node (i.e.,
AST node of any type)
■ get<T> template method returns pointer to underlying node as pointer to
T or null if underyling node is not of that type
■ getUnchecked<T> template method similar to get but underlying node
required to be of T
■ clang::DynTypedNodeList class used to represent list of generic AST
::::::::::::::::::::::::::
nodes

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 107


Section 3.4

Frontend Actions

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 108


Frontend Actions

■ frontend action: task to be performed with help of compiler frontend


■ create frontend-action class that embodies particular task to be performed
by compiler frontend
■ frontend-action class needs to derive from some appropriate class in
Clang libraries (namely, subclass of clang::FrontendAction to be
introduced shortly)
■ create frontend-action factory class that can generate frontend action
instances
■ herein, “frontend-action class” refers to any frontend-action type, whereas
“FrontendAction” refers to specific type in Clang libraries

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 109


Using Frontend Actions

■ construct ClangTool object, specifying desired compilation database and


list of source files to process
■ invoke run member function of ClangTool object to specify
frontend-action factory associated with frontend action to be performed
■ for each source file to be processed, run member function will create
compiler instance and use frontend-action factory to create corresponding
frontend-action object
■ at appropriate points in time, member functions of frontend-action class
instance will be invoked to perform various types of processing associated
with frontend action

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 110


clang::tooling::ClangTool Class
■ clang::tooling::ClangTool class provides convenient way to run
frontend action over set of source files
■ constructed by specifying compilation database and list of source paths
■ run method executes frontend action on each source file associated with
ClangTool object
■ by default modifies command-line arguments (via
ClangSyntaxOnlyAdjuster) to syntax-check-only variant (i.e., no
generation of IR)
■ can supply additional command-line arguments adjusters by using
ClangTool::appendArgumentsAdjuster
■ can control how diagnostics handled via diagnostics-consumer type by
using ClangTool::setDiagnosticConsumer
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1ClangTool.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 111


clang::FrontendAction Class
■ clang::FrontendAction abstract base class provides means to
perform user-specified processing in compiler frontend via callbacks
■ FrontendAction instance can be thought of as single task running inside
compiler frontend
■ provides callbacks to be invoked at specific points in processing
performed by frontend
■ some callbacks provided include:
2 BeginSourceFileAction: invoked just before starting processing of

source file
2 EndSourceFileAction: invoked just after finishing processing of source
file
2 ExecuteAction: performs main task for frontend action
2 CreateASTConsumer: factory function used to create instance of
ASTConsumer, which provides callbacks to be invoked at particular points in
processing of AST
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1FrontendAction.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 112
Frontend Actions

■ in order to perform frontend action, library user defines frontend-action


class, which derives from appropriate frontend-action class in library
■ need to provide override of CreateASTConsumer
■ this function creates AST consumer for particular translation unit to be
processed
■ invoked for each translation unit encountered
■ implementation of CreateASTConsumer constructs AST-consumer object
owned by unique_ptr and then returns this unique_ptr

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 113


clang::FrontendActionFactory Class

■ clang::tooling::FrontendActionFactory class used to create


instances of frontend-action type
■ invokes compiler with frontend action
■ provides interface for generating frontend action instances
■ provides pure virtual function create for obtaining new frontend action
instance
■ can be used by ClangTool to generate frontend action instance for each
translation unit to be processed
■ if frontend-action class is default constructible, can use
clang::tooling::newFrontendActionFactory function as factory
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1tooling_1_

1FrontendActionFactory.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 114


ClangTool Example: Summary

■ in slides/examples/frontend_action directory in companion


repository
■ runs compiler frontend on each source file specified on command line
■ performs only syntax checking (i.e., generates AST and then semantically
verifies it)
■ output resembles something like:
/ home / jdoe / invalid_1 . cpp :3:9: error : use of undeclared identifier ’ forty_two ’
return forty_two ;
^
1 error generated .
Error while processing / home / jdoe / invalid_1 . cpp .
error detected

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 115


ClangTool Example
1 #include "clang/Frontend/FrontendAction.h"
2 #include "clang/Frontend/FrontendActions.h"
3 #include "clang/Tooling/CommonOptionsParser.h"
4 #include "clang/Tooling/Tooling.h"
5 #include "llvm/Support/CommandLine.h"
6
7 namespace ct = clang::tooling;
8
9 static llvm::cl::OptionCategory toolOptions("Tool Options");
10
11 int main(int argc, char** argv) {
12 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
13 const_cast<const char**>(argv), toolOptions);
14 if (!expectedOptionsParser) {
15 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
16 return 1;
17 }
18 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
19 ct::ClangTool tool(optionsParser.getCompilations(),
20 optionsParser.getSourcePathList());
21 int status = tool.run(
22 ct::newFrontendActionFactory<clang::SyntaxOnlyAction>().get());
23 if (status) {llvm::errs() << "error detected\n";}
24 return !status ? 0 : 1;
25 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 116


clang::CompilerInstance Class

■ clang::CompilerInstance class used to manage single instance of


Clang compiler
■ manages various objects which are necessary to run compiler, including
2 source manager (SourceManager), file manager (FileManager),
::::::::::::: :::::::::::
preprocessor (Preprocessor
::::::::::::
), AST context (::::::::::
ASTContext), AST consumer
(ASTConsumer
:::::::::::
), and diagnostics engine (DiagnosticsEngine
:::::::::::::::::
)
■ has various information about compiler options, including:
2 language options (LangOptions), preprocessor options
:::::::::::
(PreprocessorOptions
:::::::::::::::::::
), header-search options
(HeaderSearchOptions
:::::::::::::::::::
), and target options (:::::::::::::
TargetOptions)
■ provides utility routines for constructing and manipulating common Clang
objects
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1CompilerInstance.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 117


Section 3.5

Preprocessor-Related Processing

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 118


clang::Preprocessor Class

■ clang::Preprocessor class works in conjunction with lexer to efficiently


preprocess tokens
■ custom processing can be performed during preprocessing via callbacks
registered via addPPCallbacks method
■ can suppress errors due to missing headers via
SetSuppressIncludeNotFoundError method
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1Preprocessor.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 119


clang::PPCallbacks Class

■ clang::PPCallbacks class provides interface for supplying callbacks to


be invoked at various stages during preprocessing
■ some events for which callbacks can be registered include:
2 include directive encountered
2 pragma directive encountered
2 macro has been defined
2 macro has been undefined
2 if directive has been encountered
2 else directive has been encountered
2 endif directive has been encountered
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 120


Preprocessor Callbacks Example: Summary

■ in slides/examples/preprocessor directory in companion repository


■ runs preprocessor on each source file specified on command line
■ for each preprocessor include directive in main source file, program prints:
2 location of include directive
2 header specified in include directive (with angle brackets or double quotes)
2 full pathname of included file
■ output resembles something like:
include directive :
location : / home / jdoe / test_2 . cpp :1:1
header : <cstdint >
pathname : / usr / lib / gcc / x86_64 - redhat - linux /11/../../../../ include /c ++/11/ cstdint
include directive :
location : / home / jdoe / test_2 . cpp :2:1
header : " test_2 . hpp "
pathname : / home / jdoe / test_2 . hpp

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 121


Preprocessor Callbacks Example: main.cpp (1)

1 #include <format>
2 #include <iostream>
3 #include "clang/Frontend/CompilerInstance.h"
4 #include "clang/Frontend/FrontendActions.h"
5 #include "clang/Lex/PPCallbacks.h"
6 #include "clang/Lex/Preprocessor.h"
7 #include "clang/Tooling/CommonOptionsParser.h"
8 #include "clang/Tooling/Tooling.h"
9 #include "llvm/Support/CommandLine.h"
10
11 namespace ct = clang::tooling;
12 using namespace std::literals;
13
14 static llvm::cl::OptionCategory toolCategory("Tool Options");
15
16 std::string locationToString(const clang::SourceManager& sourceManager,
17 clang::SourceLocation sourceLoc) {
18 return std::format("{}:{}:{}",
19 std::string(sourceManager.getFilename(sourceLoc)),
20 sourceManager.getSpellingLineNumber(sourceLoc),
21 sourceManager.getSpellingColumnNumber(sourceLoc));
22 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 122


Preprocessor Callbacks Example: main.cpp (2)

24 class FindIncludes : public clang::PPCallbacks {


25 public:
26 FindIncludes(clang::SourceManager& sourceManager) :
27 sourceManager_(&sourceManager) {}
28 void InclusionDirective(clang::SourceLocation hashLoc,
29 const clang::Token&, llvm::StringRef fileName, bool isAngled,
30 clang::CharSourceRange, llvm::Optional<clang::FileEntryRef> file,
31 llvm::StringRef, llvm::StringRef, const clang::Module *,
32 clang::SrcMgr::CharacteristicKind) override {
33 std::string actualFileName;
34 if (!sourceManager_->isInMainFile(hashLoc)) {return;}
35 if (file) {actualFileName = file->getName();}
36 std::string headerName = isAngled ?
37 ("<"s + std::string(fileName) + ">"s) :
38 ("\""s + std::string(fileName) + "\""s);
39 llvm::outs() << std::format("include directive:\n location: {}\n"
40 " header: {}\n pathname: {}\n",
41 locationToString(*sourceManager_, hashLoc), headerName,
42 actualFileName);
43 }
44 private:
45 clang::SourceManager* sourceManager_;
46 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 123


Preprocessor Callbacks Example: main.cpp (3)

48 class IncludeFinderAction : public clang::PreprocessOnlyAction {


49 bool BeginSourceFileAction(clang::CompilerInstance& ci) override {
50 std::unique_ptr<FindIncludes> findIncludes(
51 new FindIncludes(ci.getSourceManager()));
52 clang::Preprocessor& pp = ci.getPreprocessor();
53 pp.addPPCallbacks(std::move(findIncludes));
54 return true;
55 }
56 };
57
58 int main(int argc, char **argv) {
59 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
60 const_cast<const char**>(argv), toolCategory);
61 if (!expectedOptionsParser) {
62 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
63 return 1;
64 }
65 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
66 ct::ClangTool tool(optionsParser.getCompilations(),
67 optionsParser.getSourcePathList());
68 return tool.run(
69 ct::newFrontendActionFactory<IncludeFinderAction>().get());
70 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 124


Section 3.6

AST Frontend Actions

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 125


clang::ASTFrontendAction Class

■ clang::ASTFrontendAction abstract base class provides interface for


frontend actions that consume AST
■ derives from clang::FrontendAction
■ provides override of ExecuteAction function that runs semantic analysis
and builds AST
■ in some cases, may be desirable to override BeginSourceFileAction
and EndSourceFileAction methods
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1ASTFrontendAction.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 126


clang::ASTConsumer Class

■ clang::ASTConsumer class provides interface for consuming AST


■ allows code consuming AST to be decoupled from code producing AST
■ callbacks for certain types of events can be provided by overriding virtual
methods
■ some virtual methods include:
2 Initialize: called to perform any initialization of consumer
2 HandleTranslationUnit: called after entire translation unit has been

parsed
2 HandleCXXImplicitFunctionInstantiation: called when function
implicitly instantiated
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1ASTConsumer.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 127


AST-Consumer Classes

■ AST-consumer classes should be derived from clang::ASTConsumer


■ provide overrides of functions appropriate for types of processing needed
■ for example, common to override HandleTranslationUnit

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 128


Frontend Action Example: Summary

■ in slides/examples/frontend_action_2 directory in companion


repository
■ runs compiler frontend on each source file specified on command line in
order to perform syntax check (i.e., build and semantically verify AST)
■ for each source file processed, prints language (e.g., C++ or C), version of
language standard (e.g., C++20 or C11), and language dialect (e.g.,
gnucxx20 or gnu11) used

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 129


Frontend Action Example: main.cpp (1)
1 #include <format>
2 #include <map>
3 #include <string>
4 #include "clang/Basic/LangStandard.h"
5 #include "clang/Frontend/CompilerInstance.h"
6 #include "clang/Frontend/FrontendAction.h"
7 #include "clang/Frontend/FrontendActions.h"
8 #include "clang/Tooling/CommonOptionsParser.h"
9 #include "clang/Tooling/Tooling.h"
10 #include "llvm/Support/CommandLine.h"
11
12 namespace ct = clang::tooling;
13
14 std::string langKindToNameString(clang::LangStandard::Kind kind) {
15 return clang::LangStandard::getLangStandardForKind(kind).getName();
16 }
17
18 std::string langKindToLangString(clang::LangStandard::Kind kind) {
19 clang::Language language =
20 clang::LangStandard::getLangStandardForKind(kind).getLanguage();
21 std::map<clang::Language, std::string> lut{
22 {clang::Language::CXX, "C++"},
23 {clang::Language::C, "C"},
24 };
25 auto i = lut.find(language);
26 return (i != lut.end()) ? (*i).second : "unknown";
27 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 130


Frontend Action Example: main.cpp (2)
29 std::string langKindToStdString(clang::LangStandard::Kind kind) {
30 clang::LangStandard langStd =
31 clang::LangStandard::getLangStandardForKind(kind);
32 clang::Language language = langStd.getLanguage();
33 if (language == clang::Language::CXX) {
34 if (langStd.isCPlusPlus() && !langStd.isCPlusPlus11())
35 {return "C++98";}
36 else if (langStd.isCPlusPlus11() && !langStd.isCPlusPlus14())
37 {return "C++11";}
38 else if (langStd.isCPlusPlus14() && !langStd.isCPlusPlus17())
39 {return "C++14";}
40 else if (langStd.isCPlusPlus17() && !langStd.isCPlusPlus20())
41 {return "C++17";}
42 else if (langStd.isCPlusPlus20()) {return "C++20-or-later";}
43 else {return "pre-C++98";}
44 } else if (language == clang::Language::C) {
45 if (langStd.isC99() && !langStd.isC11()) {return "C99";}
46 else if (langStd.isC11() && !langStd.isC17()) {return "C11";}
47 else if (langStd.isC17()) {return "C17-or-later";}
48 else {return "pre-C99";}
49 } else {
50 return "unknown";
51 }
52 }
53
54 class MyFrontendAction : public clang::SyntaxOnlyAction {
55 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
56 clang::CompilerInstance&, llvm::StringRef) override;
57 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 131


Frontend Action Example: main.cpp (3)

59 std::unique_ptr<clang::ASTConsumer> MyFrontendAction::CreateASTConsumer(
60 clang::CompilerInstance& compInstance, llvm::StringRef inFile) {
61 const clang::LangOptions& langOpts = compInstance.getLangOpts();
62 llvm::outs() << std::format("{}\nlanguage: {}\nstandard: {}\nname: {}\n\n",
63 inFile, langKindToLangString(langOpts.LangStd),
64 langKindToStdString(langOpts.LangStd),
65 langKindToNameString(langOpts.LangStd));
66 return clang::SyntaxOnlyAction::CreateASTConsumer(compInstance, inFile);
67 }
68
69 static llvm::cl::OptionCategory toolOptions("Tool Options");
70
71 int main(int argc, char** argv) {
72 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
73 const_cast<const char**>(argv), toolOptions);
74 if (!expectedOptionsParser) {
75 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
76 return 1;
77 }
78 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
79 ct::ClangTool tool(optionsParser.getCompilations(),
80 optionsParser.getSourcePathList());
81 int status = tool.run(
82 ct::newFrontendActionFactory<MyFrontendAction>().get());
83 if (status) {llvm::errs() << "error detected\n";}
84 return !status ? 0 : 1;
85 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 132


clang::ASTContext Class

■ clang::ASTContext class holds long-lived AST nodes (such as types


and decls) that can be referred to throughout semantic analysis of file
■ some information about AST not stored in AST nodes themselves but
rather in AST context (ASTContext) and associated source manager
(SourceManager)
■ such information includes source locations and global identifier
information
■ provides getTranslationUnitDecl method for obtaining AST node for
translation unit (which often serves as root node for AST traversal)
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1ASTContext.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 133


AST Consumer Example: Summary

■ in slides/examples/ast_consumer_1 directory in companion


repository
■ runs compiler frontend on each source file specified on command line in
order to produce AST
■ for each source file processed, prints amount of memory used for AST
■ output resembles something like:
input file : / home / jdoe / hello . cpp
AST size : 7667712
input file : / home / jdoe / simple_1 . cpp
AST size : 8847360

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 134


AST Consumer Example: main.cpp (1)
1 #include <format>
2 #include "clang/AST/ASTConsumer.h"
3 #include "clang/Frontend/CompilerInstance.h"
4 #include "clang/Frontend/FrontendAction.h"
5 #include "clang/Tooling/CommonOptionsParser.h"
6 #include "clang/Tooling/Tooling.h"
7 #include "llvm/Support/CommandLine.h"
8 #include "llvm/Support/raw_ostream.h"
9
10 namespace ct = clang::tooling;
11
12 class MyAstConsumer : public clang::ASTConsumer {
13 public:
14 MyAstConsumer(const std::string& fileName) : fileName_(fileName) {}
15 void HandleTranslationUnit(clang::ASTContext& astContext) override {
16 llvm::outs() << std::format("input file: {}\nAST size: {}\n",
17 fileName_, astContext.getASTAllocatedMemory());
18 }
19 private:
20 std::string fileName_;
21 };
22
23 struct MyAstFrontendAction : public clang::ASTFrontendAction {
24 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
25 clang::CompilerInstance&, clang::StringRef inFile) override {
26 return std::make_unique<MyAstConsumer>(std::string(inFile));
27 }
28 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 135


AST Consumer Example: main.cpp (2)

30 static llvm::cl::OptionCategory toolOptions("Tool Options");


31
32 int main(int argc, char** argv) {
33 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
34 const_cast<const char**>(argv), toolOptions);
35 if (!expectedOptionsParser) {
36 llvm::errs() << std::format("Unable to create option parser ({}).\n",
37 llvm::toString(expectedOptionsParser.takeError()));
38 return 1;
39 }
40 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
41 ct::ClangTool tool(optionsParser.getCompilations(),
42 optionsParser.getSourcePathList());
43 int status = tool.run(
44 ct::newFrontendActionFactory<MyAstFrontendAction>().get());
45 if (status) {llvm::errs() << "error occurred\n";}
46 return !status ? 0 : 1;
47 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 136


Section 3.7

Traversing the AST With AST Visitors

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 137


clang::RecursiveASTVisitor Class Template (1)

■ clang::RecursiveASTVisitor class template used for traversing AST


and performing specific actions at appropriate nodes
■ class template employs CRTP
:::::
■ templated on AST-visitor class type
■ hooks are not virtual (due to use of CRTP)
■ be careful to employ correct function signatures for hook methods;
otherwise, code will never be called
■ can perform preorder or postorder depth-first traversal
■ default implementations of hook methods inherited from
RecursiveASTVisitor
■ only need to implement hook methods for which custom behavior required
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1RecursiveASTVisitor.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 138


clang::RecursiveASTVisitor Class Template (2)
■ shouldTraversePostOrder method:
2 specifies if postorder traversal should be employed (as opposed to
preorder)
2 defaults to false (i.e., preorder)
■ shouldVisitTemplateInstantiations method:
2 specifies if nodes corresponding to template instantiations should be visited
2 defaults to false (i.e., not visited)
■ nodes corresponding to explicit and partial specializations are visited
■ shouldVisitImplicitCode method:
2 specifies if nodes corresponding to code implicitly generated by compiler
should be visited
2 defaults to false (i.e., not visited)
■ shouldWalkTypesOfTypeLocs method:
2 specifies if should recurse into TypeLoc-based nodes
2 defaults to true (i.e., recursed into)
■ shouldVisitLambdaBody method:
2 specifies if should recurse into lambda bodies
2 defaults to true (i.e., recursed into)
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 139
Traverse, Walk-Up, and Visit Methods

■ traversal functionality provided by three types of methods:


1 traverse method (named as TraverseType): initiates visitation of node and

its descendants (e.g., TraverseDecl)


2 walk-up method (WalkUpFromType): dispatches visitation across AST class

hierarchy (from node’s dynamic type to top-most class) for single node and
then call visit method for that node (e.g.,
WalkUpFromCXXConstructorDecl)
3 visit method (VisitType): handles visitation of single node based on its
type by calling user-specified function (e.g., VisitFunctionDecl,
VisitVarDecl)
■ traverse, walk-up, and visit methods have bool return type which
indicates if traversal should continue

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 140


Traverse Methods

■ traverse method invoked when node being traversed


■ for traverse methods, hooks of following form provided for most AST
nodes of type NodeType:
2 bool TraverseNodeType(NodeType *)

■ returning false terminates traversal early


■ by default, TraverseType invokes WalkUpFromType to visit node for
direct base class of Type (and ultimately each of its other base classes in
inheritance hierarchy)
■ for node of type T, TraverseType method invoked only if T is same as
Type
■ traverse methods can call traverse and walk-up methods, but not visit
methods

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 141


Walk-Up Methods

■ walk-up method invoked as part of process of climbing inheritance


hierarchy (which visits node at each level of inheritance hierarchy)
■ for walk-up methods, hooks of following form provided for most AST nodes
of type NodeType:
2 bool WalkUpFromNodeType(NodeType *)

■ returning false terminates traversal early


■ by default, WalkUpFromType invokes WalkUpFromParentType (where
ParentType is direct base class of Type) and then invokes VisitType
■ since WalkUpFromParentType called before VisitType, inheritance
hierarchy is visited in top-down order (i.e., from least to most derived)
■ for node of type T, WalkUpFromType method will be called if T same as
Type or T is (directly or indirectly) derived from Type
■ walk-up methods can call walk-up and visit methods, but not traverse
methods

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 142


Visit Methods

■ visit method invoked to visit node at particular level in inheritance


hierarchy
■ in terms of visit methods, hooks of following form provided for most AST
nodes of type NodeType (exception being TypeLoc nodes, which are
passed by value):
2 bool VisitNodeType(NodeType *)

■ returning false terminates traversal early


■ all calls to visit methods for same node grouped together (i.e., not
interleaved with calls to visit methods for other nodes)
■ by default, VisitType is no-op
■ visit methods can call visit methods, but not traverse or walk-up methods
■ for node of type T, VisitType method will be called if T same as Type or T
is (directly or indirectly) derived from Type

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 143


Node Handling Example

■ consider AST node of type NamespaceDecl


■ inheritance hierarchy involved:
2 Decl ←− NamedDecl ←− NamespaceDecl

■ when node of type NamespaceDecl encountered during AST traversal,


callbacks (i.e., traverse, walk-up, and visit methods) invoked in following
order:
1 TraverseNamespaceDecl

2 WalkUpFromNamespaceDecl

3 WalkUpFromNamedDecl

4 WalkUpFromDecl

5 VisitDecl

6 VisitNamedDecl

7 VisitNamespaceDecl

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 144


AST Visitor Example: Summary

■ in slides/examples/ast_visitor_1 directory in companion repository


■ runs compiler frontend on each source file specified on command line in
order to produce AST
■ traverses AST, printing fully-qualified name of each function declared
■ only considers function declarations in main source file (not those in
headers)
■ output resembles something like:
main
foo :: max
foo :: abs
get_values

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 145


AST Visitor Example: main.cpp (1)
1 #include <format>
2 #include "clang/AST/ASTConsumer.h"
3 #include "clang/AST/RecursiveASTVisitor.h"
4 #include "clang/Frontend/CompilerInstance.h"
5 #include "clang/Frontend/FrontendAction.h"
6 #include "clang/Tooling/CommonOptionsParser.h"
7 #include "clang/Tooling/Tooling.h"
8 #include "llvm/Config/llvm-config.h"
9 #include "llvm/Support/CommandLine.h"
10
11 namespace ct = clang::tooling;
12
13 class MyAstVisitor : public clang::RecursiveASTVisitor<MyAstVisitor> {
14 public:
15 MyAstVisitor(clang::ASTContext& astContext) : astContext_(&astContext) {}
16 bool VisitFunctionDecl(clang::FunctionDecl* funcDecl) {
17 const auto& fileId = astContext_->getSourceManager().getFileID(
18 funcDecl->getLocation());
19 if (fileId == astContext_->getSourceManager().getMainFileID()) {
20 llvm::outs() << std::format("{}\n",
21 funcDecl->getQualifiedNameAsString());
22 }
23 return true;
24 }
25 private:
26 clang::ASTContext* astContext_;
27 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 146


AST Visitor Example: main.cpp (2)

29 class MyAstConsumer : public clang::ASTConsumer {


30 public:
31 void HandleTranslationUnit(clang::ASTContext& astContext) final {
32 clang::TranslationUnitDecl* tuDecl =
33 astContext.getTranslationUnitDecl();
34 MyAstVisitor visitor(astContext);
35 visitor.TraverseDecl(tuDecl);
36 }
37 };
38
39 class MyFrontendAction : public clang::ASTFrontendAction {
40 public:
41 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
42 clang::CompilerInstance&, clang::StringRef) final {
43 return std::unique_ptr<clang::ASTConsumer>{new MyAstConsumer};
44 }
45 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 147


AST Visitor Example: main.cpp (3)

47 static llvm::cl::OptionCategory toolOptions("Tool Options");


48
49 int main(int argc, char** argv) {
50 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
51 const_cast<const char**>(argv), toolOptions);
52 if (!expectedOptionsParser) {
53 llvm::errs() << std::format("Unable to create option parser ({}).\n",
54 llvm::toString(expectedOptionsParser.takeError()));
55 return 1;
56 }
57 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
58 ct::ClangTool tool(optionsParser.getCompilations(),
59 optionsParser.getSourcePathList());
60 int status = tool.run(
61 ct::newFrontendActionFactory<MyFrontendAction>().get());
62 if (status) {llvm::errs() << "error detected\n";}
63 return !status ? 0 : 1;
64 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 148


Class Hierarchy Example: Summary

■ in slides/examples/ast_visitor_3 directory in companion repository


■ runs compiler frontend on each source file specified on command line in
order to produce AST
■ traverses AST for each translation unit
■ maintains stack to track nesting hierarchy of struct/union declarations
■ for each struct/union declaration, prints corresponding class hierarchy
■ only considers declarations in main source file (not those in headers)
■ output resembles something like:
A1 -> A2 -> A3
A1 -> A2
A1
Something -> Wazzit
Something -> ( anonymous )
Something

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 149


Class Hierarchy Example: main.cpp (1)

1 #include <format>
2 #include <vector>
3 #include "clang/AST/ASTConsumer.h"
4 #include "clang/AST/RecursiveASTVisitor.h"
5 #include "clang/Frontend/CompilerInstance.h"
6 #include "clang/Frontend/FrontendAction.h"
7 #include "clang/Tooling/CommonOptionsParser.h"
8 #include "clang/Tooling/Tooling.h"
9 #include "llvm/Support/CommandLine.h"
10
11 namespace ct = clang::tooling;
12
13 class MyAstVisitor : public clang::RecursiveASTVisitor<MyAstVisitor> {
14 public:
15 MyAstVisitor(clang::ASTContext& astContext) : astContext_(&astContext),
16 stack_() {}
17 bool TraverseCXXRecordDecl(clang::CXXRecordDecl* recDecl);
18 private:
19 using Base = clang::RecursiveASTVisitor<MyAstVisitor>;
20 void printStack() const;
21 clang::ASTContext* astContext_;
22 std::vector<const clang::CXXRecordDecl*> stack_;
23 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 150


Class Hierarchy Example: main.cpp (2)
25 bool MyAstVisitor::TraverseCXXRecordDecl(clang::CXXRecordDecl* recDecl) {
26 clang::SourceManager& sourceManager = astContext_->getSourceManager();
27 stack_.push_back(recDecl);
28 bool result = Base::TraverseCXXRecordDecl(recDecl);
29 if (sourceManager.getFileID(recDecl->getLocation()) ==
30 sourceManager.getMainFileID()) {printStack();}
31 stack_.pop_back();
32 return result;
33 }
34
35 void MyAstVisitor::printStack() const {
36 std::string s;
37 for (auto i = stack_.begin(); i != stack_.end(); ++i) {
38 std::string name((*i)->getName());
39 s += std::format("{}{}", i != stack_.begin() ? " -> " : "",
40 name.size() ? name : "(anonymous)");
41 }
42 llvm::outs() << s << ’\n’;
43 }
44
45 class MyAstConsumer : public clang::ASTConsumer {
46 public:
47 void HandleTranslationUnit(clang::ASTContext& astContext) final {
48 clang::TranslationUnitDecl* tuDecl =
49 astContext.getTranslationUnitDecl();
50 MyAstVisitor astVisitor(astContext);
51 astVisitor.TraverseDecl(tuDecl);
52 }
53 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 151


Class Hierarchy Example: main.cpp (3)

55 class MyFrontendAction : public clang::ASTFrontendAction {


56 public:
57 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
58 clang::CompilerInstance& compInstance, clang::StringRef) final {
59 return std::unique_ptr<clang::ASTConsumer>{new MyAstConsumer};
60 }
61 };
62
63 static llvm::cl::OptionCategory toolOptions("Tool Options");
64
65 int main(int argc, char** argv) {
66 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
67 const_cast<const char**>(argv), toolOptions);
68 if (!expectedOptionsParser) {
69 llvm::errs() << std::format("Unable to create option parser ({}).\n",
70 llvm::toString(expectedOptionsParser.takeError()));
71 return 1;
72 }
73 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
74 ct::ClangTool tool(optionsParser.getCompilations(),
75 optionsParser.getSourcePathList());
76 int status = tool.run(
77 ct::newFrontendActionFactory<MyFrontendAction>().get());
78 if (status) {llvm::errs() << "error detected\n";}
79 return !status ? 0 : 1;
80 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 152


Section 3.8

Source Manager and Source Locations

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 153


clang::SourceManager Class

■ clang::SourceManager manages all source files stored in memory and


provides interface to access them
■ provides APIs to deal with SourceLocation instances
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1SourceManager.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 154


clang::SourceLocation Type
■ clang::SourceLocation represents location of specific position in
source code
■ SourceLocation capable of representing position in source code with
character granularity but often used to refer to tokens (by referring to
location of first character of token)
■ SourceLocation made very lightweight for efficiency
■ just reference/handle to specific part of source code
■ underlying data being referenced stored in SourceManager instance
■ SourceLocation has special invalid value that can used to indicate no
corresponding location in source code exists
■ can check for invalid value with isValid member function
■ can use SourceManager to query file name, line number, column number
associated with SourceLocation
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1SourceLocation.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 155
clang::SourceRange Type
■ clang::SourceRange represents contiguous part of source code
■ SourceRange represents range of tokens
■ essentially pair of SourceLocation objects (i.e., one SourceLocation
object for each of begin and end locations)
■ range is symmetric (i.e., both begin and end refer to elements in range)
■ begin location specifies location of first character of first token in range
(obtained via getBegin)
■ end location specifies location of first character of last token in range
(obtained via getEnd)
■ can check for invalid value with isValid member function
■ some AST node types have getSourceRange member function to obtain
range of tokens related to AST node (e.g., FunctionDecl and VarDecl)
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1SourceRange.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 156
FunctionDecl and Source Locations (Declaration Only)

getLocation()

float func(float x, float y);

getBeginLoc() getEndLoc()

getLocation()

auto func(float x, float y) -> float;

getBeginLoc() getEndLoc()

■ source range: functionDecl->getBeginLoc(),


functionDecl->getEndLoc()
■ source location: functionDecl->getLocation()

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 157


FunctionDecl and Source Locations (Definition)

getLocation()

float func(float x, float y) {return x * y;}

getBeginLoc() getEndLoc()

getLocation()

auto func(float x, float y) -> float {return x * y;}

getBeginLoc() getEndLoc()

■ source range: functionDecl->getBeginLoc(),


functionDecl->getEndLoc()
■ source location: functionDecl->getLocation()

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 158


FunctionDecl::getReturnTypeSourceRange

beginLoc()

float func(float x, float y);

endLoc()

beginLoc()

auto func(float x, float y) -> float;

endLoc()

■ source range:
functionDecl->getReturnTypeSourceRange().beginLoc(),
functionDecl->getReturnTypeSourceRange().endLoc()

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 159


VarDecl and Source Locations

getLocation()

double value = 1.0 / 3.0;

getBeginLoc() getEndLoc()

getLocation()

extern int global;

getBeginLoc() getEndLoc()

■ source range: varDecl->getBeginLoc(), varDecl->getEndLoc()


■ source location: varDecl->getLocation()

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 160


CallExpr and Source Locations

func(a, b, c, d);

getBeginLoc() getEndLoc()

(*func_ptr)(a, b, c, d);

getBeginLoc() getEndLoc()

■ source range: callExpr->getBeginLoc(), callExpr->getEndLoc()

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 161


clang::CharSourceRange Type

■ clang::CharSourceRange represents contiguous part of source code


with character granularity
■ CharSourceRange can be used to specify:
2 range of characters
2 range of tokens
■ for range of characters:
2 begin and end specify location of first and last characters of range
■ for range of tokens:
2 begin specifies location of first character of first token in range
2 end specifies location of first character of last token in range
■ can determine type of range (i.e., character versus token) via
isCharRange and isTokenRange member functions
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1CharSourceRange.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 162


clang::FullSourceLocation Type

■ SourceLocation instance not useful in isolation, since references


information in SourceManager instance
■ FullSourceLocation is SourceLocation along with associated
SourceManager

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 163


Spelling Versus Expansion Locations
■ each source location associated with spelling location and expansion
location
■ spelling location specifies where characters corresponding to token
originated
■ expansion location specifies where characters for token appear to be from
user’s point of view
■ in case of macro expansion:
2 spelling location specifies where token in macro expansion originated (i.e.,
in macro definition)
2 expansion location specifies where macro expansion took place (i.e., point
where macro invoked)
■ clang::SourceManager class provides methods for querying
information about source locations, including:
2 getFilename
2 getSpellingLineNumber
2 getSpellingColumnNumber

2 getExpansionRange

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 164


Example: Spellings Versus Expansions
Source File example_16.cpp
11111111112222222
12345678901234567890123456
1 #define foo1 (x) foo (x)
2 int foo (int x) {return x ;}
3 int main () {
4 return foo1 (42);
5 }

Preprocessor Output
11111111112222222
12345678901234567890123456
2 int foo (int x) {return x ;}
3 int main () {
4 return foo (42);
5 }

■ consider CallExpr AST node associated with call to foo on line 4 of


source file
■ begin spelling location: example_16.cpp:1:17
■ end spelling location: example_16.cpp:1:22
■ expansion range: example_16.cpp:4:9 to example_16.cpp:4:16
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 165
Example: More Macro Strangeness
Source File example_17.cpp
11111111112222222
12345678901234567890123456
1 #define FORTY_TWO 42
2 #define X x
3 #define INT int
4 INT X = FORTY_TWO ;

Preprocessor Output
11111111112222222
12345678901234567890123456
4 int x = 42;

■ consider VarDecl AST node associated with declaration of variable x on


line 4 of source file
■ begin spelling location: example_17.cpp:3:13
■ end spelling location: example_17.cpp:1:19
■ note that end spelling location precedes begin spelling location in source
file
■ expansion range: example_17.cpp:4:1 to example_17.cpp:4:9 (i.e.,
first character of token FORTY_TWO)
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 166
Obtaining Source Code for Source Range

■ to obtain text of source corresponding to source range, use


getSourceText
:::::::::::::::
member of :::::::::::::
clang::Lexer class
■ source range specified as CharSourceRange instance
■ allows source range to be token or character range
■ if token range, gets source text for all tokens covered by range
■ if character range, gets source text for all characters covered by range
■ may fail in cases where source range contains macro expansions
■ failure indicator returned via bool* parameter

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 167


Source Location Example: Summary

■ in slides/examples/clang_utilities directory in companion


repository
■ provides several functions for printing information about source code from
SourceLocation and SourceRange objects:
2 associated source file, line number, and column number
2 corresponding character sequence in source code

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 168


Source Location Example: utilities.cpp (1)

1 #include <format>
2 #include "clang/Basic/SourceManager.h"
3 #include "clang/Basic/SourceLocation.h"
4 #include "clang/Lex/Lexer.h"
5
6 std::string locationToString(const clang::SourceManager& sourceManager,
7 clang::SourceLocation sourceLoc) {
8 return std::format("{}:{}({})",
9 std::string(sourceManager.getFilename(sourceLoc)),
10 sourceManager.getSpellingLineNumber(sourceLoc),
11 sourceManager.getSpellingColumnNumber(sourceLoc));
12 }
13
14 std::string rangeToString(const clang::SourceManager& sourceManager,
15 clang::SourceRange sourceRange) {
16 std::string beginFilename(sourceManager.getFilename(
17 sourceRange.getBegin()));
18 std::string endFilename(sourceManager.getFilename(sourceRange.getEnd()));
19 return std::format("{}:{}({})-{}{}({})", beginFilename,
20 sourceManager.getSpellingLineNumber(sourceRange.getBegin()),
21 sourceManager.getSpellingColumnNumber(sourceRange.getBegin()),
22 endFilename != beginFilename ? endFilename + ":" : "",
23 sourceManager.getSpellingLineNumber(sourceRange.getEnd()),
24 sourceManager.getSpellingColumnNumber(sourceRange.getEnd()));
25 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 169


Source Location Example: utilities.cpp (2)

27 std::string getSourceText(const clang::SourceManager& sourceManager,


28 clang::SourceRange range) {
29 return std::string(clang::Lexer::getSourceText(
30 clang::CharSourceRange::getTokenRange(range), sourceManager,
31 clang::LangOptions()));
32 }
33
34 std::string addLineNumbers(const std::string& source, unsigned int start) {
35 std::string result;
36 result += std::format("{:4d}: ", start);
37 for (auto c : source) {
38 if (c == ’\n’) {
39 ++start;
40 result += std::format("\n{:4d}: ", start);
41 } else {result += c;}
42 }
43 return result;
44 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 170


Source Printing Example: Summary
■ in slides/examples/ast_visitor_2 directory in companion repository
■ for each function definition in main source file, prints:
2 fully-qualified name of function
2 source file containing function definition
2 starting and ending of line number and column number of function definition
2 lines of source code comprising function definition (with line numbers)
■ output for function definition resembles something like:
h2g2 :: get_status
/ home / jdoe / example_1 . cpp :3(1) -5(1)
----------
3: int get_status () {
4: return 42;
5: }
----------
main
/ home / jdoe / example_1 . cpp :9(1) -11(1)
----------
9: int main () {
10: return h2g2 :: get_status ();
11: }
----------

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 171


Source Printing Example: main.cpp (1)
1 #include <format>
2 #include "clang/AST/ASTConsumer.h"
3 #include "clang/AST/RecursiveASTVisitor.h"
4 #include "clang/Frontend/CompilerInstance.h"
5 #include "clang/Frontend/FrontendAction.h"
6 #include "clang/Tooling/CommonOptionsParser.h"
7 #include "clang/Tooling/Tooling.h"
8 #include "llvm/Support/CommandLine.h"
9 #include "utilities.hpp" // header for utilities.cpp
10
11 namespace ct = clang::tooling;
12
13 class MyAstVisitor : public clang::RecursiveASTVisitor<MyAstVisitor> {
14 public:
15 MyAstVisitor(clang::ASTContext& astContext) : astContext_(&astContext) {}
16 bool VisitFunctionDecl(clang::FunctionDecl* funcDecl) {
17 clang::SourceManager& sm = astContext_->getSourceManager();
18 const auto& fileId = sm.getFileID(funcDecl->getLocation());
19 if (funcDecl->hasBody() && fileId == sm.getMainFileID()) {
20 clang::SourceRange sourceRange = funcDecl->getSourceRange();
21 std::string delim("----------\n");
22 llvm::outs() << std::format("{}\n{}\n{}{}\n{}\n",
23 funcDecl->getQualifiedNameAsString(), rangeToString(sm,
24 sourceRange), delim, addLineNumbers(getSourceText(sm,
25 sourceRange), sm.getSpellingLineNumber(sourceRange.getBegin())),
26 delim);
27 }
28 return true;
29 }
30 private:
31 clang::ASTContext* astContext_;
32 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 172


Source Printing Example: main.cpp (2)
34 class MyAstConsumer : public clang::ASTConsumer {
35 public:
36 void HandleTranslationUnit(clang::ASTContext& astContext) final {
37 clang::TranslationUnitDecl* tuDecl =
38 astContext.getTranslationUnitDecl();
39 MyAstVisitor astVisitor(astContext);
40 astVisitor.TraverseDecl(tuDecl);
41 }
42 };
43
44 class MyFrontendAction : public clang::ASTFrontendAction {
45 public:
46 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
47 clang::CompilerInstance& compInstance, clang::StringRef) final {
48 return std::unique_ptr<clang::ASTConsumer>{new MyAstConsumer};
49 }
50 };
51
52 static llvm::cl::OptionCategory toolOptions("Tool Options");
53
54 int main(int argc, char** argv) {
55 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
56 const_cast<const char**>(argv), toolOptions);
57 if (!expectedOptionsParser) {
58 llvm::errs() << std::format("Unable to create option parser ({}).\n",
59 llvm::toString(expectedOptionsParser.takeError()));
60 return 1;
61 }
62 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
63 ct::ClangTool tool(optionsParser.getCompilations(),
64 optionsParser.getSourcePathList());
65 int status = tool.run(
66 ct::newFrontendActionFactory<MyFrontendAction>().get());
67 if (status) {llvm::errs() << "error detected\n";}
68 return !status ? 0 : 1;
69 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 173


Section 3.9

Diagnostics

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 174


clang::DiagnosticConsumer Class

■ clang::DiagnosticConsumer class provides interface for receiving


diagnostic information (i.e., warnings, errors, and so on) from compiler
frontend
■ class provides numerous virtual functions that serve as callbacks
■ some classes derived from DiagnosticConsumer provided by library to
cover some common use cases (e.g., clang::IgnoringDiagConsumer)
■ can also create other diagnostic-consumer classes by deriving from
DiagnosticConsumer and providing desired overrides
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1DiagnosticConsumer.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 175


Diagnostic Consumer Example: Summary

■ in slides/examples/diagnostic_consumer directory in companion


repository
■ runs compiler frontend on specified source files
■ for each diagnostic with severity level of (regular) error or fatal error,
program prints filename and line/column number associated with
diagnostic
■ other diagnostic information discarded
■ upon completion, program prints count of number of regular/fatal errors
■ output resembles something like:
error at / home / mdadams / jdoe / invalid_1 . cpp :3:9
1 error (s) occurred

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 176


Diagnostic Consumer Example: main.cpp (1)
1 #include <format>
2 #include <map>
3 #include <string>
4 #include "clang/Basic/Diagnostic.h"
5 #include "clang/Basic/SourceLocation.h"
6 #include "clang/Basic/SourceManager.h"
7 #include "clang/Frontend/FrontendAction.h"
8 #include "clang/Frontend/FrontendActions.h"
9 #include "clang/Tooling/CommonOptionsParser.h"
10 #include "clang/Tooling/Tooling.h"
11 #include "llvm/Support/CommandLine.h"
12
13 namespace ct = clang::tooling;
14
15 std::string locationToString(const clang::SourceManager& sourceManager,
16 clang::SourceLocation sourceLoc) {
17 return std::format("{}:{}:{}",
18 std::string(sourceManager.getFilename(sourceLoc)),
19 sourceManager.getSpellingLineNumber(sourceLoc),
20 sourceManager.getSpellingColumnNumber(sourceLoc));
21 }
22
23 std::string levelToString(clang::DiagnosticsEngine::Level level) {
24 const std::map<clang::DiagnosticsEngine::Level, std::string> lut{
25 {clang::DiagnosticsEngine::Level::Error, "error"},
26 {clang::DiagnosticsEngine::Level::Fatal, "fatal error"},
27 };
28 auto i = lut.find(level);
29 return i != lut.end() ? i->second : "unknown";
30 }
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 177
Diagnostic Consumer Example: main.cpp (2)

32 class MyDiagnosticConsumer : public clang::DiagnosticConsumer {


33 public:
34 MyDiagnosticConsumer() : errCount_(0) {}
35 void HandleDiagnostic(clang::DiagnosticsEngine::Level diagLevel,
36 const clang::Diagnostic& info) override {
37 clang::SourceManager* sm = info.hasSourceManager() ?
38 &info.getSourceManager() : nullptr;
39 if (diagLevel == clang::DiagnosticsEngine::Level::Error ||
40 diagLevel == clang::DiagnosticsEngine::Level::Fatal) {
41 if (sm) {
42 llvm::errs() << std::format("{} at {}\n",
43 levelToString(diagLevel), locationToString(*sm,
44 info.getLocation()));
45 ++errCount_;
46 } else {
47 llvm::errs() << std::format("{}\n", levelToString(diagLevel));
48 }
49 }
50 }
51 unsigned long getErrCount() const {return errCount_;}
52 private:
53 unsigned long errCount_;
54 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 178


Diagnostic Consumer Example: main.cpp (3)

56 static llvm::cl::OptionCategory toolOptions("Tool Options");


57
58 int main(int argc, char** argv) {
59 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
60 const_cast<const char**>(argv), toolOptions);
61 if (!expectedOptionsParser) {
62 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
63 return 1;
64 }
65 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
66 ct::ClangTool tool(optionsParser.getCompilations(),
67 optionsParser.getSourcePathList());
68 MyDiagnosticConsumer diagnosticConsumer;
69 tool.setDiagnosticConsumer(&diagnosticConsumer);
70 int status = tool.run(
71 ct::newFrontendActionFactory<clang::SyntaxOnlyAction>().get());
72 unsigned long errCount = diagnosticConsumer.getErrCount();
73 if (errCount) {
74 llvm::errs() << std::format("{} error(s) occurred\n", errCount);
75 }
76 return (!status && !errCount) ? 0 : 1;
77 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 179


Section 3.10

Finding AST Nodes With AST Matchers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 180


AST Matchers

■ Clang libraries provide mechanism for finding AST nodes that match
specific criteria as determined by some matching predicate
■ predicate embodied by AST matcher type
■ can match nodes that correspond to declarations, statements,
expressions, and types (amongst other things)
■ AST matcher API designed such that expressions involving matchers
have very natural syntax
■ this syntax can be thought of as domain-specific language for AST node
matching
■ as will be seen later, clang-query tool supports similar syntax

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 181


Trivial Matcher Example
1 #include <string>
2 #include "clang/ASTMatchers/ASTMatchers.h"
3
4 namespace cam = clang::ast_matchers;
5
6 cam::DeclarationMatcher matchFuncDef() {
7 using namespace cam;
8 return functionDecl(isDefinition()).bind("func");
9 }
10
11 cam::DeclarationMatcher matchFuncDeclOf(const std::string& funcName) {
12 using namespace cam;
13 return functionDecl(hasName(funcName)).bind("func");
14 }
15
16 cam::StatementMatcher matchCallTo(const std::string& funcName) {
17 using namespace cam;
18 return callExpr(callee(
19 functionDecl(hasName(funcName)).bind("func"))).bind("call");
20 }
21
22 cam::TypeMatcher matchPointerType() {
23 using namespace cam;
24 return qualType(isAnyPointer());
25 }

■ above code contains some examples of matcher expressions simply for


illustrative purposes

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 182


Section 3.10.1

AST Matchers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 183


AST Matchers

■ AST matcher is class that holds predicate used to test for match
■ three categories of AST matchers:
1 node matchers: match specific type of AST node
2 narrowing matchers: match attributes on AST nodes
3 traversal matchers: allow traversal between AST nodes
■ library provides very rich set of predefined AST matchers
■ predefined AST matchers in clang::ast_matchers namespace
■ library also allows custom AST matchers to be defined by user
■ for list of AST matchers provided by library, see:
2 https://clang.llvm.org/docs/LibASTMatchersReference.html

[click on name of matcher for more detailed information on that matcher]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 184


Node Matchers
■ node matchers allow nodes to be matched on basis of their type
■ every matcher expression must start with node matcher (or traverse
matcher to be discussed shortly)
■ match expression can be further refined with narrowing or traversal
matchers
■ all node matchers take arbitrary number of matchers as arguments and
perform logical AND of all of these matchers
■ matcher expression that matches every FunctionDecl node in AST:
functionDecl()
■ node matchers support bind operation that allows matched node to be
associated with string, which can be later used by callback to gain access
to that node
■ matcher expression that matches every FunctionDecl node and binds
matched node to name "x":
functionDecl().bind("x")

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 185


Some Predefined Node Matchers

Name Description
functionDecl matches FunctionDecl node (function declaration)
cxxMethodDecl matches CXXMethodDecl node (class/union/struct
method declaration)
cxxRecordDecl matches CXXRecordDecl node (C++ class/u-
nion/struct declaration)
varDecl matches VarDecl node (variable declaration)
callExpr matches CallExpr node (call expression)
declRefExpr matches DeclRefExpr node (expression referring to
declared entity)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 186


Narrowing Matchers

■ narrowing matchers match certain attributes on current node


■ narrowing matchers allow number of matches to be reduced by only
keeping matches with specific attributes
■ special logical narrowing matchers provide AND, OR, and NOT logical
operations (i.e., allOf, anyOf, unless)
■ matcher expression that matches node corresponding to declaration of
function whose name is foo:
functionDecl(hasName("foo"))
■ matcher expression that matches node corresponding to declaration of
function whose name is either foo or bar:
functionDecl(anyOf(hasName("foo"), hasName("bar")))
■ matcher expression that matches node corresponding to declaration of
function whose name is neither foo nor bar:
functionDecl(unless(
anyOf(hasName("foo"), hasName("bar"))))

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 187


Some Predefined Narrowing Matchers

Logical Narrowing Matchers


Name Description
allOf logical AND
anyOf logical OR
unless logical NOT

Miscellaneous Narrowing Matchers


Name Description
hasName matches if has specified name
matchesName matches if name matches regular expres-
sion
isExpansionInMainFile matches if from main source file (i.e., not
included header)
isImplicit matches if implicitly generated by com-
piler (e.g., implicit default constructor)
equalsBoundNode matches if same as node bound to spec-
ified name

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 188


Traversal Matchers

■ traversal matchers specify relationship between current node and other


nodes reachable from current node
■ special traverse matcher allows control over how traversal performed
(e.g., all implicit nodes can be ignored)
■ matcher expression that matches declaration of function that contains at
least one if statement:
functionDecl(hasDescendant(ifStmt()))
■ matcher expression that matches call to function whose name is foo:
callExpr(callee(functionDecl(hasName("foo"))))

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 189


Some Predefined Traversal Matchers

Parent/Child Traversal Matchers


Name Description
hasParent parent matches specified matcher
has child matches specified matcher
hasAncestor at least one ancestor matches specified matcher
hasDescendant at least one descendant matches specified matcher

Expr to Decl Traversal


Expr Matcher Decl
CallExpr callee FunctionDecl
DeclRefExpr to VarDecl
MemberExpr member FieldDecl

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 190


Ignoring Implicit AST Nodes

■ compiler sometimes generates AST nodes that do not correspond to


constructs explicitly spelled in source code
■ such nodes are referred to as implicit nodes
■ often implicit nodes can complicate AST matching process (by creating
many additional special cases to handle during matching)
■ may wish to ignore all implicit nodes
■ can be accomplished by using special traverse matcher with
clang::TK_IgnoreUnlessSpelledInSource for first argument
■ that is, all implicit AST nodes can be ignored by wrapping matcher
expression e with traverse as follows:
clang::ast_matchers::traverse(
clang::TK_IgnoreUnlessSpelledInSource, e)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 191


Examples of AST Matcher Expressions (1)

■ match node that corresponds to function declaration in main source file:


functionDecl(isExpansionInMainFile())
■ match node that corresponds to inline function declaration:
functionDecl(isInline())
■ match node that corresponds to call to function whose name matches
regular expression “^::foo_”:
functionDecl(matchesName("^::foo_"))
■ match node that corresponds to function declaration whose 0th parameter
has name x:
functionDecl(
hasParameter(0, parmVarDecl(hasName("x"))))

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 192


Examples of AST Matcher Expressions (2)

■ match statement (i.e., Stmt) node that corresponds to either normal or


range-based for statement, and bind name “for” to matching Stmt
node:
stmt(anyOf(forStmt(), cxxForRangeStmt())).bind("for")
■ match if-statement (i.e., IfStmt) node that is not part of else if, and
bind name “if” to matched node:
ifStmt(stmt().bind("if"), unless(hasParent(ifStmt(
hasElse( ifStmt(equalsBoundNode("if")))))))
■ match node corresponding to if-statement (i.e., IfStmt node) whose
then or else clause is not compound statement:
ifStmt(unless(allOf(hasThen(compoundStmt()),
anyOf(unless(hasElse(anything())),
hasElse(compoundStmt())))))

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 193


Examples of AST Matcher Expressions (3)

■ match call to make_unique with template argument Widget:


callExpr(callee(functionDecl(hasName("make_unique"),
hasAnyTemplateArgument(refersToType(hasDeclaration(
namedDecl(hasName("Widget"))))))))
■ match call to function where function has any parameters of rvalue
reference type:
callExpr(callee(functionDecl(hasAnyParameter(
parmVarDecl(hasType(rValueReferenceType()))))))
■ match overriding virtual method that does not use override or final
keyword (where attr::Override and attr::Final from clang
namespace):
cxxMethodDecl(unless(cxxDestructorDecl()),
isOverride(), unless(anyOf(hasAttr(attr::Override),
hasAttr(attr::Final))))

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 194


Section 3.10.2

Using clang-query to Facilitate AST Matcher Development

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 195


clang-query Program (1)

■ clang-query is program that facilitates easier development of AST


matchers
■ reads sequence of commands as input that can be used to find nodes in
AST corresponding to given source code using AST matchers
■ source files to be considered specified as command-line arguments
■ commands for specifying AST matchers and controlling how results
generated and presented
■ program reads commands from standard input and query results and
diagnostics sent to standard output/error
■ syntax for specifying matchers mostly compatible with syntax that would
be used in C++ source code
■ only matchers provided by Clang library supported (i.e., no support for
custom matchers)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 196


clang-query Program (2)

■ ASTs for source files only generated at program startup (so any
subsequent changes to source files not considered)
■ source code can be found in clang-tools-extra/clang-query
directory of LLVM Git repository
■ completion functionality requires LLVM/Clang built to use Editline library
(a.k.a. libedit)?
■ for more information, see:
2 https://firefox-source-docs.mozilla.org/code-quality/

static-analysis/writing-new/clang-query.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 197


clang-query Example
example_1.cpp
1 #include <iostream>
2 int square(int x) {return x * x;}
3 int cube(int x) {return x * x * x;}
4 int main() {
5 std::cout << square(2) + square(3) << ’\n’;
6 std::cout << cube(42) << ’\n’;
7 }

Command Line Invocation of clang-query Program


clang-query example_1.cpp

Input to clang-query Program


set bind-root false
m callExpr(callee(functionDecl(hasName("square")))).bind("call")

Output from clang-query Program

Match #1:
/ home / jdoe / example_1 . cpp :5:15: note : " call " binds here
std :: cout << square (2) + square (3) << ’\n ’;
^~~~~~~~~
Match #2:
/ home / jdoe / example_1 . cpp :5:27: note : " call " binds here
std :: cout << square (2) + square (3) << ’\n ’;
^~~~~~~~~
2 matches .

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 198


Another clang-query Example
example_13.cpp
1 int g = 42;
2 int foo(int n) {
3 static int c = 0;
4 return n * ++g * ++c;
5 }

Input to clang-query Program


set bind-root false
m declRefExpr(to(varDecl(unless(isStaticStorageClass())).bind("d"))).bind("r")

Output from clang-query Program


Match #1:
/ home / jdoe / example_13 . cpp :2:9: note : "d" binds here
int foo ( int n) {
^~~~~
/ home / jdoe / example_13 . cpp :4:9: note : "r" binds here
return n * ++ g * ++ c;
^
Match #2:
/ home / jdoe / example_13 . cpp :1:1: note : "d" binds here
int g = 42;
^~~~~~~~~~
/ home / jdoe / example_13 . cpp :4:15: note : "r" binds here
return n * ++ g * ++ c;
^
2 matches .

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 199


Ignoring Implicit AST Nodes

■ Clang generates many AST nodes for constructs not spelled explicitly in
source
■ in clang-query, all implicit AST nodes can be ignored when matching by
using:
set traversal IgnoreUnlessSpelledInSource

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 200


Implicit AST Nodes Example

■ consider following AST matcher expression, which matches all type


declarations:
type()
■ consider running above matcher on empty (i.e., zero-length) C++ source
file
■ if all AST nodes considered, several matches occur due to nodes
corresponding to some special pre-defined types, such as:
2 __int128

2 unsigned __int128

■ if IgnoreUnlessSpelledInSource specified, no matches found (as


expected)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 201


Implicit AST Nodes Example: AsIs

clang-query program invoked to process empty C++ source file


Input commands for clang-query program
set output print
set traversal AsIs
match type()

Output from clang-query program


Match #1:
Binding for " root ":
__int128
Match #2:
Binding for " root ":
unsigned __int128
Match #3:
Binding for " root ":
__NSConstantString_tag
Match #4:
Binding for " root ":
char *
Match #5:
Binding for " root ":
char
Match #6:
Binding for " root ":
__va_list_tag [1]
Match #7:
Binding for " root ":
__va_list_tag
7 matches .

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 202


Implicit AST Nodes Example: IgnoreUnlessSpelledInSource

clang-query program invoked to process empty C++ source file


Input commands for clang-query program
set output print
set traversal IgnoreUnlessSpelledInSource
match type()

Output from clang-query program


0 matches .

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 203


Determining AST Matcher Expression

■ dump AST for code of interest to determine particular structure to be


matched
■ use clang-query program to assist in selection of appropriate matcher
expression
■ use expression in source code of Clang tool being developed

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 204


clang-query Demonstration

■ consider matching all for statements in C++ source code


■ use clang-query to help guide process

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 205


Section 3.10.3

Finding AST Nodes With AST Matchers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 206


Simple AST Matchers

■ simple (i.e., nonpolymorphic) AST matcher can match against nodes of


one specific type and its derived types
■ simple AST matcher is object of type
clang::ast_matchers::internal::Matcher<T>, where T is type of
AST node to which matching predicate can be applied
■ in many cases, not necessary to refer to above internal class template
directly, as some type aliases provided for common cases (as shown ......
here)
■ AST matcher instance used to test if single AST node matches particular
predicate (i.e., predicate associated with matcher instance)
■ Matcher class provides matches method that tests if node matches
predicate (associated with Matcher instance) and returns bool result
■ Matcher<Base> can be used anywhere Matcher<Derived> required
(i.e., reverse of inheritance relationship to AST node type hierarchy)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 207


Simple AST Matcher Classes

■ numerous simple matcher types provided (in clang::ast_matchers


namespace) to allow matching various types of AST nodes
Matcher Type Type of Node Matched
DeclarationMatcher Decl
::::
StatementMatcher Stmt (which includes Expr
:::: ::::
)
TypeMatcher QualType
::::::::
TypeLocMatcher TypeLoc
:::::::
NestedNameSpecifierMatcher NestedNameSpecifier
::::::::::::::::::::
NestedNameSpecifierLocMatcher NestedNameSpecifierLoc
:::::::::::::::::::::::
CXXBaseSpecifierMatcher CXXBaseSpecifier
:::::::::::::::::
CXXCtorInitializerMatcher CXXCtorInitializer
:::::::::::::::::::
TemplateArgumentMatcher TemplateArgument
:::::::::::::::::
TemplateArgumentLocMatcher TemplateArgumentLoc
::::::::::::::::::::
LambdaCaptureMatcher LambdaCapture
:::::::::::::
AttrMatcher Attr
::::

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 208


Polymorphic AST Matchers

■ unlike simple AST matcher, polymorphic AST matcher can match against
multiple types of nodes, where types not related by inheritance
■ polymorphic AST matcher is object of template type
clang::ast_matchers::internal::PolymorphicMatcher, which
includes template parameter to capture set of possible node types that
can be matched
■ provides same interface as simple matcher (e.g., provides matches
method)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 209


AST Matcher Functions

■ AST matcher function is function that returns AST matcher (i.e., factory
function for matcher)
■ matcher function can be defined to take parameters
■ in practice, usually number of parameters between zero and two
■ matching function can take another matcher object as parameter, which
allows matching predicate result to depend on criterion from another
matcher
■ can overload matcher functions

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 210


Key Classes Associated Matching Process

■ clang::ast_matchers::MatchFinder
::::::::::::::::::::::::::::::::::::
class:
2 provides mechanism for traversing AST in order to find matching nodes
■ matcher class, normally chosen from one of numerous matcher classes in
clang::ast_matchers:
2 holds predicate used to determine if node is match
■ match-callback class, derived from
clang::ast_matchers::MatchCallback
::::::::::::::::::::::::::::::::::::::
:
2 specifies actions to be taken when match found
■ clang::ast_matchers::MatchFinder::MatchResult
::::::::::::::::::::::::::::::::::::::::::::::::::
class:
2 holds match result

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 211


clang::ast_matchers::MatchFinder Class

■ clang::ast_matchers::MatchFinder
::::::::::::::::::::::::::::::::::::
class provides mechanism for
finding matches over AST
■ after creation, can add one or more matchers via calls to addMatcher
■ addMatcher has several overloads with signature of form:
void addMatcher(MatcherType&, MatchCallback*)
■ newASTConsumer method returns AST consumer that will trigger specified
callbacks at appropriate points in matching process
■ can run matcher over entire AST (for given ASTContext) with matchAST
method
■ can run matcher against single AST node with match method
■ can generate frontend-action factory for MatchFinder using overload of
clang::tooling::newFrontendActionFactory
:::::::::::::::::::::::::::::::::::::::::::::
function, which will
automatically invoke matchAST method

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 212


clang::...::MatchFinder::MatchResult Class

■ clang::ast_matchers::MatchFinder::MatchResult
::::::::::::::::::::::::::::::::::::::::::::::::::
class holds all
information for match found
■ public data members of class include:
2 Nodes: collection of nodes bound on current match, represented by

BoundNodes class
2 Context: ASTContext instance associated with match
2 SourceManager: SourceManager instance associated with match

■ BoundNodes type provides getNodeAs template member function which


can be used to obtain pointer to matched node bound to particular name

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 213


clang::...::MatchCallback Class

■ clang::ast_matchers::MatchFinder::MatchCallback
::::::::::::::::::::::::::::::::::::::::::::::::::::
class
provides abstract interface for specifying callbacks that are invoked at
particular stages of matching process
■ each callback is virtual function
■ library user inherits from MatchCallback class and provides desired
behavior by overriding appropriate virtual functions
■ some callbacks include:
2 run: called for each match
2 onStartOfTranslationUnit: called at start of each translation unit
2 onEndOfTranslationUnit: called at end of each translation unit

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 214


Writing a Simple AST Matcher Program

■ at code development time:


2 create match-callback type that derives from
clang::ast_matchers::MatchFinder::MatchCallback
2 override run method to handle each match result found
2 for each pattern of interest in AST, write matcher (of appropriate matcher
type) that matches pattern
■ at run time:
1 create instance of match-callback type
2 create clang::ast_matchers::MatchFinder instance
3 add each matcher to MatchFinder instance via addMatcher method
4 use MatchFinder instance to generate AST frontend-action factory (e.g.,
by using clang::tooling::newFrontendActionFactory)
5 create clang::tooling::ClangTool instance
6 invoke run method of ClangTool instance with (above) frontend-action
factory to generate AST for each source file and perform AST matching

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 215


AST Matcher Example: Summary

■ in slides/examples/ast_matcher_1 directory in companion repository


■ runs compiler frontend on specified source files
■ uses AST matcher to find each instance of call to function specified on
command line
■ for each function call found, program prints lines of source code
containing function call
■ when looking for calls to function foo, output might resemble something
like:
match at / home / jdoe / example_1 . cpp :14(10) -18(2):
14: int i = foo (
15: 1,
16: 2,
17: 3
18: );
match at / home / jdoe / example_1 . cpp :19(10) -19(21):
19: int j = foo (4 , 5, 6);

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 216


AST Matcher Example: main.cpp (1)

1 #include <format>
2 #include "clang/ASTMatchers/ASTMatchers.h"
3 #include "clang/ASTMatchers/ASTMatchFinder.h"
4 #include "clang/Frontend/FrontendActions.h"
5 #include "clang/Tooling/CommonOptionsParser.h"
6 #include "clang/Tooling/Tooling.h"
7 #include "llvm/Support/CommandLine.h"
8 #include "utilities.hpp"
9
10 namespace ct = clang::tooling;
11 namespace cam = clang::ast_matchers;
12
13 clang::SourceLocation getLineStart(const clang::SourceManager& sourceManager,
14 clang::SourceLocation loc) {
15 return sourceManager.translateLineCol(sourceManager.getFileID(loc),
16 sourceManager.getSpellingLineNumber(loc), 1);
17 }
18
19 clang::SourceLocation getLineEnd(const clang::SourceManager& sourceManager,
20 clang::SourceLocation loc) {
21 return sourceManager.translateLineCol(sourceManager.getFileID(loc),
22 sourceManager.getSpellingLineNumber(loc), ~0);
23 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 217


AST Matcher Example: main.cpp (2)

25 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {


26 void run(const cam::MatchFinder::MatchResult& result) override {
27 clang::SourceManager& sourceManager = *result.SourceManager;
28 if (auto p = result.Nodes.getNodeAs<clang::CallExpr>("call")) {
29 clang::SourceLocation startLoc = p->getBeginLoc();
30 clang::SourceLocation endLoc = p->getEndLoc();
31 llvm::outs() << std::format("match at {}:\n", rangeToString(
32 sourceManager, clang::SourceRange(startLoc, endLoc)));
33 clang::SourceLocation lineStartLoc = getLineStart(sourceManager,
34 startLoc);
35 clang::SourceLocation lineEndLoc = getLineEnd(sourceManager,
36 endLoc);
37 unsigned int startLineNo = sourceManager.getSpellingLineNumber(
38 lineStartLoc);
39 std::string text = getSourceText(sourceManager,
40 clang::SourceRange(lineStartLoc, lineEndLoc));
41 llvm::outs() << addLineNumbers(text, startLineNo) << "\n";
42 }
43 }
44 };
45
46 cam::StatementMatcher getMatcher(const std::string& funcName) {
47 using namespace cam;
48 return callExpr(callee(functionDecl(hasName(funcName)))).bind("call");
49 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 218


AST Matcher Example: main.cpp (3)

51 static llvm::cl::OptionCategory optionCategory("Tool options");


52 static llvm::cl::opt<std::string> clFuncName(
53 "f", llvm::cl::desc("Function name"), llvm::cl::value_desc("function_name"),
54 llvm::cl::cat(optionCategory), llvm::cl::Required);
55 ;
56
57 int main(int argc, const char **argv) {
58 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
59 optionCategory);
60 if (!expectedParser) {
61 llvm::errs() << llvm::toString(expectedParser.takeError());
62 return 1;
63 }
64 ct::CommonOptionsParser& optionsParser = expectedParser.get();
65 ct::ClangTool tool(optionsParser.getCompilations(),
66 optionsParser.getSourcePathList());
67 MyMatchCallback matchCallback;
68 cam::StatementMatcher matcher = getMatcher(clFuncName);
69 cam::MatchFinder matchFinder;
70 matchFinder.addMatcher(matcher, &matchCallback);
71 return tool.run(ct::newFrontendActionFactory(&matchFinder).get());
72 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 219


clang::...::dynamic::VariantMatcher Class (1)

■ VariantMatcher
::::::::::::::::
class (in clang::ast_matchers::dynamic
namespace) can be used store any type of matcher (e.g.,
DeclarationMatcher, StatementMatcher, and so on)
■ allows simple and polymorphic matchers to be represented by single
object type
■ can be very convenient in code that must deal with multiple types of
matchers at once
■ ast_matchers::MatchFinder class provides addDynamicMatcher
method to add dynamic matcher to MatchFinder instance
■ internal::DynTypedMatcher class used by VariantMatcher to
represent dynamically-typed matcher

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 220


clang::...::dynamic::VariantMatcher Class (2)

■ VariantMatcher class provides static methods for creating


VariantMatcher instance from matcher instance:
2 SingleMatcher: create VariantMatcher instance from Matcher instance
2 PolymorphicMatcher: create VariantMatcher instance from
PolymorphicMatcher instance
2 VariadicMatcher create VariantMatcher instance from
VariadicMatcher instance
■ provides methods that are useful for extracting Matcher instance stored in
VariantMatcher instance, such as:
2 hasTypedMatcher<T>: determines if contained matcher can be converted
to Matcher<T>
2 getTypedMatcher<T>: returns contained matcher as Matcher<T>
■ also provides method to extract contained matcher as dynamically-typed
matcher:
2 getSingleMatcher: get contained matcher as DynTypedMatcher (if not

ambiguous)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 221


VariantMatcher Example: Summary

■ in slides/examples/ast_matcher_5 directory in companion repository


■ program generates AST for one or more source files
■ then runs zero or more AST matchers on each AST obtained
■ can specify source files for processing and AST matchers to be used via
command line
■ code example illustrates use of VariantMatcher
■ AST matchers stored in VariantMatcher instances so they can be
handled generically (i.e., without concern for their type)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 222


VariantMatcher Example: main.cpp (1)
1 #include <format>
2 #include <string>
3 #include "clang/ASTMatchers/ASTMatchers.h"
4 #include "clang/ASTMatchers/ASTMatchFinder.h"
5 #include "clang/ASTMatchers/Dynamic/VariantValue.h"
6 #include "clang/AST/Type.h"
7 #include "clang/Frontend/FrontendActions.h"
8 #include "clang/Tooling/CommonOptionsParser.h"
9 #include "clang/Tooling/Tooling.h"
10 #include "llvm/Support/CommandLine.h"
11
12 namespace ct = clang::tooling;
13 namespace cam = clang::ast_matchers;
14 namespace lc = llvm::cl;
15
16 cam::dynamic::VariantMatcher getMatcher(int id) {
17 using namespace cam;
18 switch (id) {
19 default:
20 case 0:
21 return dynamic::VariantMatcher::SingleMatcher(qualType().bind("qt"));
22 case 1:
23 return dynamic::VariantMatcher::SingleMatcher(varDecl().bind("v"));
24 case 2:
25 return dynamic::VariantMatcher::SingleMatcher(
26 callExpr(callee(functionDecl().bind("f"))));
27 }
28 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 223


VariantMatcher Example: main.cpp (2)

30 cam::dynamic::VariantMatcher traverse(clang::TraversalKind kind,


31 cam::dynamic::VariantMatcher matcher) {
32 using namespace cam;
33 if (matcher.hasTypedMatcher<clang::Decl>()) {
34 return dynamic::VariantMatcher::SingleMatcher(traverse(kind,
35 matcher.getTypedMatcher<clang::Decl>()));
36 } else if (matcher.hasTypedMatcher<clang::Stmt>()) {
37 return dynamic::VariantMatcher::SingleMatcher(traverse(kind,
38 matcher.getTypedMatcher<clang::Stmt>()));
39 } else if (matcher.hasTypedMatcher<clang::QualType>()) {
40 return dynamic::VariantMatcher::SingleMatcher(traverse(kind,
41 matcher.getTypedMatcher<clang::QualType>()));
42 } else {std::abort();}
43 return matcher;
44 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 224


VariantMatcher Example: main.cpp (3)

46 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {


47 MyMatchCallback() : count(0) {}
48 void run(const cam::MatchFinder::MatchResult& result) override;
49 unsigned count;
50 };
51
52 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
53 llvm::outs() << std::format("MATCH {}:\n", count);
54 if (auto qualTypePtr = result.Nodes.getNodeAs<clang::QualType>("qt")) {
55 llvm::outs() << "dump for QualType:\n";
56 qualTypePtr->dump();
57 } else if (auto varDecl =
58 result.Nodes.getNodeAs<clang::VarDecl>("v")) {
59 std::string s(varDecl->getQualifiedNameAsString());
60 llvm::outs() << std::format("dump for VarDecl {}:\n",
61 !s.empty() ? s : "[unnamed]");
62 varDecl->dump();
63 } else if (auto funcDecl =
64 result.Nodes.getNodeAs<clang::FunctionDecl>("f")) {
65 llvm::outs() << std::format("dump for FunctionDecl {}:\n",
66 funcDecl->getQualifiedNameAsString());
67 funcDecl->dump();
68 }
69 ++count;
70 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 225


VariantMatcher Example: main.cpp (4)
72 static lc::OptionCategory optionCategory("Tool options");
73 static lc::list<int> clMatcherIds("m", lc::desc("Matcher ID"),
74 lc::cat(optionCategory), lc::ZeroOrMore);
75 static lc::opt<bool> clAsIs("i", lc::desc("Implicit nodes"),
76 lc::cat(optionCategory));
77
78 int main(int argc, const char **argv) {
79 auto optParser = ct::CommonOptionsParser::create(argc, argv,
80 optionCategory);
81 if (!optParser) {
82 llvm::errs() << llvm::toString(optParser.takeError());
83 return 1;
84 }
85 ct::ClangTool tool(optParser->getCompilations(),
86 optParser->getSourcePathList());
87 MyMatchCallback matchCallback;
88 cam::MatchFinder matchFinder;
89 for (auto id : clMatcherIds) {
90 matchFinder.addDynamicMatcher(*traverse(
91 clAsIs ? clang::TK_AsIs : clang::TK_IgnoreUnlessSpelledInSource,
92 getMatcher(id)).getSingleMatcher(), &matchCallback);
93 }
94 int status = tool.run(ct::newFrontendActionFactory(&matchFinder).get());
95 llvm::outs() << std::format("number of matches: {}\n",
96 matchCallback.count);
97 return !status ? 0 : 1;
98 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 226


Section 3.10.4

Custom AST Matchers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 227


Macros for Defining AST Matchers
■ numerous macros provided for defining new AST matchers
■ family of macros for defining AST matcher by specifying predicate,
including:
2 AST_MATCHER, AST_MATCHER_P, AST_MATCHER_P_OVERLOAD,

AST_MATCHER_P2, AST_MATCHER_P2_OVERLOAD
2 AST_POLYMORPHIC_MATCHER, AST_POLYMORPHIC_MATCHER_P,

AST_POLYMORPHIC_MATCHER_P_OVERLOAD,
AST_POLYMORPHIC_MATCHER_P2,
AST_POLYMORPHIC_MATCHER_P2_OVERLOAD
■ family of macros for defining AST matcher by specifying matcher factory
function, including:
2 AST_MATCHER_FUNCTION, AST_MATCHER_FUNCTION_P,

AST_MATCHER_FUNCTION_P_OVERLOAD
■ numerous other macros for defining AST matchers (e.g., for handling type
traverse matchers and regex parameters)
■ for more information, see:
2 https://clang.llvm.org/doxygen/ASTMatchersMacros_8h.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 228


AST_MATCHER Macro

■ AST_MATCHER matcher defines simple matcher from predicate


■ has syntax:
AST_MATCHER(Type, DefineMatcher)
■ defines zero-parameter predicate on nodes of type Type and
corresponding matcher function named DefineMatcher
■ predicate returns bool indicating if node matches
■ provides variables:
2 Node: AST node being matched (of type const Type&)
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ ASTMatchFinder class (i.e., type of Finder variable) provides
getASTContext method to access ASTContext instance

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 229


AST_MATCHER_P Macro
■ AST_MATCHER_P macro defines simple matcher from predicate
■ has syntax:
AST_MATCHER_P(Type, DefineMatcher, ParamType, Param)
■ defines single-parameter predicate on nodes of type Type and
corresponding matcher function named DefineMatcher
■ predicate returns bool indicating if node matches
■ parameter named Param of type const ParamType&
■ provides variables:
2 Node: AST node being matched (of type const Type&)
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_MATCHER_P_OVERLOAD macro similar to AST_MATCHER_P macro,
except adds extra ID parameter used to disambiguate overloads of
overloaded predicate
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 230
AST_MATCHER_P2 Macro
■ AST_MATCHER_P2 macro defines simple matcher from predicate
■ has syntax:
AST_MATCHER_P2(Type, DefineMatcher, ParamType1, Param1,
ParamType2, Param2)
■ defines two-parameter predicate on nodes of type Type and
corresponding matcher function named DefineMatcher
■ parameter named Param1 of type const ParamType1& and parameter
named Param2 of type const ParamType2&
■ predicate returns bool indicating if node matches
■ provides variables:
2 Node: AST node being matched (of type const Type&)
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_MATCHER_P2_OVERLOAD macro similar to AST_MATCHER_P2, except
adds extra ID parameter used for disambiguation in case of overloaded
matcher function
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 231
AST_MATCHER_FUNCTION Macro

■ AST_MATCHER_FUNCTION macro defines simple matcher from


matcher-instance factory function
■ has syntax:
AST_MATCHER_FUNCTION(ReturnType, DefineMatcher)
■ defines zero-parameter function named DefineMatcher that returns
matcher instance (of type ReturnType)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 232


AST_MATCHER_FUNCTION_P Macro

■ AST_MATCHER_FUNCTION_P macro defines simple matcher from


matcher-instance factory function
■ has syntax:
AST_MATCHER_FUNCTION_P(ReturnType, DefineMatcher,
ParamType, Param)
■ defines single-parameter function named DefineMatcher that returns
matcher instance (of type ReturnType)
■ variable Param used for parameter (of type const ParamType&) passed
to function
■ AST_MATCHER_FUNCTION_P_OVERLOAD macro similar to
AST_MATCHER_FUNCTION_P macro, except adds extra ID parameter used
for disambiguation in case of overloaded matcher function

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 233


AST_POLYMORPHIC_SUPPORTED_TYPES Macro

■ in macros used to define polymorphic matchers, need way to specify list of


arbitrary number of types
■ AST_POLYMORPHIC_SUPPORTED_TYPES is variadic macro used with
AST_POLYMORPHIC_MATCHER* family of macros to specify list of types for
polymorphic matcher
■ has syntax:
AST_POLYMORPHIC_SUPPORTED_TYPES(...)
■ for example:
AST_POLYMORPHIC_SUPPORTED_TYPES(clang::FunctionDecl,
clang::VarDecl, clang::CXXRecordDecl)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 234


AST_POLYMORPHIC_MATCHER Macro
■ AST_POLYMORPHIC_MATCHER macro defines polymorphic matcher from
predicate (i.e., polymorphic analog of AST_MATCHER)
■ has syntax:
AST_POLYMORPHIC_MATCHER(DefineMatcher, ReturnTypes)
■ defines matcher function named DefineMatcher corresponding to
zero-parameter predicate on nodes of types specified by ReturnTypes
(using AST_POLYMORPHIC_SUPPORTED_TYPES)
■ predicate returns bool indicating if node matches
■ provides variables/types:
2 NodeType: type of node being matched (which is one of types specified by

ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 235


AST_POLYMORPHIC_MATCHER_P Macro
■ AST_POLYMORPHIC_MATCHER_P macro defines polymorphic matcher from
predicate (i.e., polymorphic analog of AST_MATCHER_P)
■ has syntax:
AST_POLYMORPHIC_MATCHER_P(DefineMatcher, ReturnTypes,
ParamType, Param)
■ similar to AST_POLYMORPHIC_MATCHER macro but defines matcher with
one-parameter predicate
■ parameter named Param of type const ParamType&
■ provides variables/types (similar to AST_POLYMORPHIC_MATCHER):
NodeType: type of node being matched (which is one of types specified by
2

ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_POLYMORPHIC_MATCHER_P_OVERLOAD analogous to
AST_MATCHER_P_OVERLOAD, except for polymorphic matcher
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 236
AST_POLYMORPHIC_MATCHER_P2 Macro
■ AST_POLYMORPHIC_MATCHER_P2 macro defines polymorphic matcher
from predicate (i.e., polymorphic analog of AST_MATCHER_P2)
■ has syntax:
AST_POLYMORPHIC_MATCHER_P2(DefineMatcher, ReturnTypes,
ParamType1, Param1, ParamType2, Param2)
■ similar to AST_POLYMORPHIC_MATCHER macro but defines matcher with
two-parameter predicate
■ provides variables/types (similar to AST_POLYMORPHIC_MATCHER):
2 NodeType: type of node being matched (which is one of types specified by

ReturnTypes)
2 Node: AST node being matched of type const NodeType&
2 Finder: AST match finder (of type

clang::ast_matchers::internal::ASTMatchFinder*)
2 Builder: builder (of type

clang::ast_matchers::internal::BoundNodesTreeBuilder*)
■ AST_POLYMORPHIC_MATCHER_P2_OVERLOAD analogous to
AST_MATCHER_P_OVERLOAD, except for polymorphic matcher
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 237
AST Matcher Example: Summary

■ in slides/examples/ast_matcher_2 directory in companion repository


■ runs compiler frontend on specified source files
■ uses AST matcher to find each instance of node that matches criteria
selected on command line
■ for each match found, program prints source code associated with match
■ demonstrates use of macros like AST_MATCHER, AST_MATCHER_P, and
AST_MATCHER_P2
■ demonstrates use of clang::ast_matchers::traverse to control if
implicit nodes should be ignored

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 238


AST Matcher Example: main.cpp (1)
1 #include <format>
2 #include "clang/ASTMatchers/ASTMatchers.h"
3 #include "clang/ASTMatchers/ASTMatchFinder.h"
4 #include "clang/Frontend/FrontendActions.h"
5 #include "clang/Tooling/CommonOptionsParser.h"
6 #include "clang/Tooling/Tooling.h"
7 #include "llvm/Support/CommandLine.h"
8 #include "utilities2.hpp"
9
10 namespace ct = clang::tooling;
11 namespace cam = clang::ast_matchers;
12
13 static llvm::cl::OptionCategory optionCategory("Tool options");
14 static llvm::cl::opt<int> clMatcherId("m", llvm::cl::desc("Matcher ID"),
15 llvm::cl::value_desc("matcher_id"), llvm::cl::cat(optionCategory),
16 llvm::cl::init(0));
17 static llvm::cl::opt<bool> clAllNodes("a", llvm::cl::desc("all nodes"),
18 llvm::cl::cat(optionCategory), llvm::cl::init(false));
19
20 AST_MATCHER(clang::CXXMethodDecl, isSpecialMember) {
21 if (auto p = llvm::dyn_cast<clang::CXXConstructorDecl>(&Node)) {
22 return p->isDefaultConstructor() || p->isCopyConstructor() ||
23 p->isMoveConstructor();
24 } else if (auto p = llvm::dyn_cast<clang::CXXDestructorDecl>(&Node)) {
25 return true;
26 } else {
27 return Node.isCopyAssignmentOperator() ||
28 Node.isMoveAssignmentOperator();
29 }
30 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 239


AST Matcher Example: main.cpp (2)

32 AST_MATCHER_P(clang::CXXMethodDecl, paramCountAtLeast, unsigned, threshold) {


33 return Node.param_size() >= threshold;
34 }
35
36 AST_MATCHER_P2(clang::NamedDecl, nameLengthBetween, unsigned, low, unsigned,
37 high) {
38 return Node.getIdentifier() && Node.getName().size() >= low &&
39 Node.getName().size() <= high;
40 }
41
42 cam::DeclarationMatcher getMatcher(int id) {
43 using namespace cam;
44 switch (id) {
45 default:
46 case 0:
47 return cxxMethodDecl(isDefinition(), isSpecialMember()).bind("x");
48 case 1:
49 return cxxMethodDecl(paramCountAtLeast(4)).bind("x");
50 case 2:
51 return namedDecl(nameLengthBetween(3, 4)).bind("x");
52 }
53 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 240


AST Matcher Example: main.cpp (3)
55 class MyMatchCallback : public cam::MatchFinder::MatchCallback {
56 public:
57 MyMatchCallback() : count_(0) {}
58 void run(const cam::MatchFinder::MatchResult& result) override {
59 const clang::SourceManager& sourceManager = *result.SourceManager;
60 clang::SourceRange sourceRange;
61 std::string nodeType;
62 if (auto p = result.Nodes.getNodeAs<clang::CXXMethodDecl>("x")) {
63 nodeType = "CXXMethodDecl";
64 sourceRange = p->getSourceRange();
65 } else if (auto p = result.Nodes.getNodeAs<clang::FunctionDecl>("x")) {
66 nodeType = "FunctionDecl";
67 sourceRange = p->getSourceRange();
68 }
69 if (sourceRange.isValid()) {
70 llvm::outs() << std::format("found matching {} at {}\n", nodeType,
71 locationToString(sourceManager, sourceRange.getBegin(), true));
72 sourceRange.setBegin(sourceManager.getSpellingLoc(sourceRange.getBegin()));
73 sourceRange.setEnd(sourceManager.getSpellingLoc(sourceRange.getEnd()));
74 sourceRange.setEnd(getEndOfToken(sourceManager,
75 sourceRange.getEnd()));
76 if (sourceRange.isValid()) {
77 llvm::outs() << getSourceTextRaw(sourceManager, sourceRange) << ’\n’;
78 }
79 }
80 ++count_;
81 }
82 unsigned getCount() const {return count_;}
83 private:
84 unsigned count_;
85 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 241


AST Matcher Example: main.cpp (4)

87 int main(int argc, const char **argv) {


88 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
89 optionCategory);
90 if (!expectedParser) {
91 llvm::errs() << llvm::toString(expectedParser.takeError());
92 return 1;
93 }
94 ct::CommonOptionsParser& optionsParser = expectedParser.get();
95 ct::ClangTool tool(optionsParser.getCompilations(),
96 optionsParser.getSourcePathList());
97 cam::DeclarationMatcher matcher = getMatcher(clMatcherId);
98 if (!clAllNodes) {
99 matcher = cam::traverse(clang::TK_IgnoreUnlessSpelledInSource,
100 matcher);
101 }
102 MyMatchCallback matchCallback;
103 cam::MatchFinder matchFinder;
104 matchFinder.addMatcher(matcher, &matchCallback);
105 int status = tool.run(ct::newFrontendActionFactory(&matchFinder).get());
106 llvm::outs() << std::format("number of matches: {}\n",
107 matchCallback.getCount());
108 return !status ? 0 : 1;
109 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 242


AST Matcher Example: Summary

■ in slides/examples/ast_matcher_3 directory in companion repository


■ runs compiler frontend on specified source files
■ uses AST matcher to find each instance of node that matches criteria
selected on command line
■ for each match found, program prints source code associated with match
■ demonstrates:
2 use of macros for defining polymorphic matchers, as well as
AST_MATCHER_FUNCTION* family of macros
2 defining matcher for AttributedStmt type nodes

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 243


AST Matcher Example: main.cpp (1)
1 #include <format>
2 #include <string>
3 #include <optional>
4 #include "clang/ASTMatchers/ASTMatchers.h"
5 #include "clang/ASTMatchers/ASTMatchFinder.h"
6 #include "clang/ASTMatchers/Dynamic/VariantValue.h"
7 #include "clang/Frontend/FrontendActions.h"
8 #include "clang/Tooling/CommonOptionsParser.h"
9 #include "clang/Tooling/Tooling.h"
10 #include "llvm/Support/CommandLine.h"
11
12 namespace lc = llvm::cl;
13 namespace ct = clang::tooling;
14 namespace cam = clang::ast_matchers;
15
16 static lc::OptionCategory optCategory("Tool options");
17 static lc::list<int> clMatcherIds("m", lc::desc("Matcher ID"),
18 lc::value_desc("matcher_id"), llvm::cl::cat(optCategory));
19 static lc::opt<bool> clFlag("f", lc::desc("Some flag"),
20 lc::cat(optCategory));
21
22 std::optional<std::string> getSourceText(
23 const clang::ASTContext& astContext, clang::SourceRange sourceRange) {
24 if (!sourceRange.isValid()) {return {};}
25 bool invalid = true;
26 std::string text{clang::Lexer::getSourceText(
27 clang::CharSourceRange::getTokenRange(sourceRange),
28 astContext.getSourceManager(), astContext.getLangOpts(), &invalid)};
29 if (invalid) {return {};}
30 return text;
31 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 244


AST Matcher Example: main.cpp (2)

33 const cam::internal::VariadicDynCastAllOfMatcher<clang::Stmt,
34 clang::AttributedStmt> attributedStmt;
35
36 AST_POLYMORPHIC_MATCHER_P(boolean, AST_POLYMORPHIC_SUPPORTED_TYPES(
37 clang::Decl, clang::Stmt, clang::Type, clang::TypeLoc), bool, condition)
38 {return condition;}
39
40 AST_MATCHER_P(clang::AttributedStmt, attrCountAtLeast, unsigned, threshold)
41 {return Node.getAttrs().size() >= threshold;}
42
43 AST_MATCHER_P2(clang::ConstantArrayType, isSizeBetween, unsigned long,
44 low, unsigned long, high) {
45 return Node.getSize().getZExtValue() >= low &&
46 Node.getSize().getZExtValue() <= high;
47 }
48
49 AST_MATCHER_FUNCTION(cam::internal::Matcher<clang::FunctionDecl>,
50 isFunctionOfInterest) {
51 using namespace cam;
52 return functionDecl(hasAnyName("::abs", "::max", "::min"));
53 }
54
55 AST_MATCHER_FUNCTION_P(cam::internal::Matcher<clang::CallExpr>,
56 callToFuncWithName, std::string, name) {
57 using namespace cam;
58 return callExpr(callee(functionDecl(hasName(name))));
59 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 245


AST Matcher Example: main.cpp (3)

61 cam::dynamic::VariantMatcher getMatcher(int id) {


62 using namespace cam;
63 switch (id) {
64 default:
65 case 0:
66 return dynamic::VariantMatcher::SingleMatcher(
67 attributedStmt(attrCountAtLeast(1)).bind("stmt"));
68 case 1:
69 return dynamic::VariantMatcher::SingleMatcher(varDecl(
70 anyOf(unless(boolean(clFlag)), isExpansionInMainFile()),
71 hasType(constantArrayType(isSizeBetween(3, 7)))).bind("decl"));
72 case 2:
73 return dynamic::VariantMatcher::SingleMatcher(
74 callExpr(callee(functionDecl(isFunctionOfInterest()))).bind("stmt"));
75 case 3:
76 return dynamic::VariantMatcher::SingleMatcher(
77 callExpr(callToFuncWithName("size")).bind("stmt"));
78 }
79 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 246


AST Matcher Example: main.cpp (4)

81 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {


82 void run(const cam::MatchFinder::MatchResult& result) override;
83 unsigned count = 0;
84 };
85
86 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
87 llvm::outs() << std::format("MATCH {}\n", count);
88 clang::SourceRange sourceRange;
89 if (auto decl = result.Nodes.getNodeAs<clang::Decl>("decl"))
90 {sourceRange = decl->getSourceRange();}
91 else if (auto stmt = result.Nodes.getNodeAs<clang::Stmt>("stmt"))
92 {sourceRange = stmt->getSourceRange();}
93 auto sourceText = getSourceText(*result.Context, sourceRange);
94 if (sourceText)
95 {llvm::outs() << std::format("source text:\n{}\n", *sourceText);}
96 ++count;
97 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 247


AST Matcher Example: main.cpp (5)

99 int main(int argc, const char **argv) {


100 auto optParser = ct::CommonOptionsParser::create(argc, argv, optCategory);
101 if (!optParser) {
102 llvm::errs() << llvm::toString(optParser.takeError());
103 return 1;
104 }
105 ct::ClangTool tool(optParser->getCompilations(),
106 optParser->getSourcePathList());
107 MyMatchCallback matchCallback;
108 cam::MatchFinder matchFinder;
109 for (auto id : clMatcherIds) {
110 llvm::outs() << std::format("enabling matcher {}\n", id);
111 matchFinder.addDynamicMatcher(*getMatcher(id).getSingleMatcher(),
112 &matchCallback);
113 }
114 int status = tool.run(ct::newFrontendActionFactory(&matchFinder).get());
115 llvm::outs() << std::format("number of matches: {}\n",
116 matchCallback.count);
117 return !status ? 0 : 1;
118 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 248


BoundNodesTreeBuilder Class

■ clang::ast_matchers::internal::BoundNodesTreeBuilder class
used during AST-node matching process to record matches and names
bound to nodes
■ addMatch method adds another BoundNodesTreeBuilder tree as
branch in tree
■ setBinding method adds name binding for node
■ BoundNodesTreeBuilder class and addMatch method useful for custom
matchers that have “for-each” kind of behavior (i.e., that need to iterate
over every match obtained from some inner matcher)
■ addMatch method often used to add match information obtained from
inner matcher to match results for outer matcher

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 249


BoundNodesTreeBuilder Example: Summary

■ in slides/examples/ast_matcher_6 directory in companion repository


■ run compiler frontend on specified source files
■ for each class declaration, print any virtual bases of class
■ accomplished by using “for-each” style of matcher (i.e., for each virtual
base)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 250


BoundNodesTreeBuilder Example: main.cpp (1)
1 #include <format>
2 #include <clang/ASTMatchers/ASTMatchers.h>
3 #include <clang/ASTMatchers/ASTMatchFinder.h>
4 #include <clang/Tooling/CommonOptionsParser.h>
5 #include <clang/Tooling/Tooling.h>
6 #include <llvm/Support/CommandLine.h>
7
8 namespace ct = clang::tooling;
9 namespace cam = clang::ast_matchers;
10
11 AST_MATCHER_P(clang::CXXRecordDecl, forEachVirtualBase,
12 cam::internal::Matcher<clang::CXXBaseSpecifier>, innerMatcher) {
13 bool matched = false;
14 clang::ast_matchers::internal::BoundNodesTreeBuilder result;
15 const clang::CXXRecordDecl* def = Node.getDefinition();
16 if (!def) {return false;}
17 for (auto baseIter = def->vbases_begin(); baseIter != def->vbases_end();
18 ++baseIter) {
19 clang::ast_matchers::internal::BoundNodesTreeBuilder argBuilder(
20 *Builder);
21 if (innerMatcher.matches(*baseIter, Finder, &argBuilder)) {
22 matched = true;
23 result.addMatch(argBuilder);
24 }
25 }
26 *Builder = std::move(result);
27 return matched;
28 }
29
30 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
31 void run(const cam::MatchFinder::MatchResult& result) override;
32 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 251


BoundNodesTreeBuilder Example: main.cpp (2)
34 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
35 auto decl = result.Nodes.getNodeAs<clang::CXXRecordDecl>("decl");
36 auto baseDecl = result.Nodes.getNodeAs<clang::CXXRecordDecl>(
37 "baseDecl");
38 if (!decl || !baseDecl) {return;}
39 llvm::outs() << std::format("{} has virtual base {}\n", decl->getName(),
40 baseDecl->getName());
41 }
42
43 static llvm::cl::OptionCategory optionCategory("Tool options");
44
45 int main(int argc, const char **argv) {
46 auto optParser = ct::CommonOptionsParser::create(argc, argv,
47 optionCategory);
48 if (!optParser) {
49 llvm::errs() << llvm::toString(optParser.takeError());
50 return 1;
51 }
52 ct::ClangTool tool(optParser->getCompilations(),
53 optParser->getSourcePathList());
54 cam::DeclarationMatcher matcher = [](){
55 using namespace cam;
56 return cxxRecordDecl(forEachVirtualBase(cxxBaseSpecifier(hasType(
57 qualType(hasDeclaration(cxxRecordDecl().bind("baseDecl")))))))
58 .bind("decl");
59 }();
60 MyMatchCallback matchCallback;
61 cam::MatchFinder matchFinder;
62 matchFinder.addMatcher(matcher, &matchCallback);
63 return tool.run(ct::newFrontendActionFactory(&matchFinder).get());
64 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 252


Section 3.10.5

AST Visitors Versus AST Matchers

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 253


AST Visitors Versus AST Matchers

■ use of AST matchers can often result in more concise code (relative to
AST visitors) by eliminating boilerplate
■ for example, if pattern involves descendants/ancestors, approach based
on AST visitor would need additional boilerplate to locate those
descendants/ancestors, whereas in AST matcher case, library itself
provides boilerplate to locate and bind to relevant nodes
■ AST visitors can often be better suited to searching when patterns involve
variable number of nodes and/or complex relationships between nodes
■ use whichever approach best suited for task at hand

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 254


For-Statement Example: Summary

■ in slides/examples/ast_visitor_matcher_1 directory in companion


repository
■ run compiler frontend on specified source files
■ for each non-member and member function containing at least one for
statement (including range-based for statements), print maximum
number of levels of nested for statements in function
■ example output:
foo1c ... 1
foo2a ... 2
foo3a ... 3
(anonymous class)::operator() ... 1
main()::(anonymous class)::operator() ... 2

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 255


Visitor-Based For-Statement Example: visitor.cpp (1)

1 #include <format>
2 #include <stack>
3 #include <type_traits>
4 #include "clang/AST/ASTConsumer.h"
5 #include "clang/AST/RecursiveASTVisitor.h"
6 #include "clang/Frontend/CompilerInstance.h"
7 #include "clang/Frontend/FrontendAction.h"
8 #include "clang/Tooling/CommonOptionsParser.h"
9 #include "clang/Tooling/Tooling.h"
10 #include "llvm/Support/CommandLine.h"
11
12 namespace ct = clang::tooling;
13
14 static llvm::cl::OptionCategory toolOptions("Tool Options");

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 256


Visitor-Based For-Statement Example: visitor.cpp (2)

16 class MyAstVisitor : public clang::RecursiveASTVisitor<MyAstVisitor> {


17 public:
18 using Base = clang::RecursiveASTVisitor<MyAstVisitor>;
19 MyAstVisitor(clang::ASTContext& astContext) : astContext_(&astContext) {}
20 bool shouldVisitImplicitCode() const {return true;}
21 bool shouldVisitTemplateInstantiations() const {return true;}
22 bool TraverseFunctionDecl(clang::FunctionDecl* funcDecl)
23 {return handleFunc<clang::FunctionDecl>(funcDecl);}
24 bool TraverseCXXMethodDecl(clang::CXXMethodDecl* funcDecl)
25 {return handleFunc<clang::CXXMethodDecl>(funcDecl);}
26 bool TraverseForStmt(clang::ForStmt* forStmt)
27 {return handleFor<clang::ForStmt>(forStmt);}
28 bool TraverseCXXForRangeStmt(clang::CXXForRangeStmt* forStmt)
29 {return handleFor<clang::CXXForRangeStmt>(forStmt);}
30 private:
31 struct StackEntry {
32 const clang::FunctionDecl* funcDecl;
33 unsigned forDepth;
34 unsigned maxForDepth;
35 };
36 template<class NodeType> bool handleFunc(NodeType* funcDecl);
37 template<class NodeType> bool handleFor(NodeType* forStmt);
38 clang::ASTContext* astContext_;
39 std::stack<StackEntry> stack_;
40 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 257


Visitor-Based For-Statement Example: visitor.cpp (3)
42 template<class NodeType> bool MyAstVisitor::handleFunc(NodeType* funcDecl) {
43 const clang::SourceManager& sourceManager =
44 astContext_->getSourceManager();
45 if (sourceManager.getFileID(funcDecl->getLocation()) !=
46 sourceManager.getMainFileID()) {return true;}
47 stack_.push({funcDecl, 0, 0});
48 bool result;
49 if constexpr (std::is_same_v<NodeType, clang::CXXMethodDecl>)
50 {result = Base::TraverseCXXMethodDecl(funcDecl);}
51 else {result = Base::TraverseFunctionDecl(funcDecl);}
52 if (stack_.top().maxForDepth > 0) {
53 llvm::outs() << std::format("{} ... {}\n",
54 stack_.top().funcDecl->getQualifiedNameAsString(),
55 stack_.top().maxForDepth);
56 }
57 stack_.pop();
58 return result;
59 }
60
61 template<class NodeType> bool MyAstVisitor::handleFor(NodeType* forStmt) {
62 if (stack_.empty()) {return true;}
63 StackEntry& top = stack_.top();
64 ++top.forDepth;
65 top.maxForDepth = std::max(top.maxForDepth, top.forDepth);
66 bool result;
67 if constexpr(std::is_same_v<NodeType, clang::CXXForRangeStmt>)
68 {result = Base::TraverseCXXForRangeStmt(forStmt);}
69 else {result = Base::TraverseForStmt(forStmt);}
70 --top.forDepth;
71 return result;
72 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 258


Visitor-Based For-Statement Example: visitor.cpp (4)
74 struct MyAstConsumer : public clang::ASTConsumer {
75 void HandleTranslationUnit(clang::ASTContext& astContext) final {
76 MyAstVisitor visitor(astContext);
77 visitor.TraverseDecl(astContext.getTranslationUnitDecl());
78 }
79 };
80
81 struct MyFrontendAction : public clang::ASTFrontendAction {
82 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
83 clang::CompilerInstance&, clang::StringRef fileName) final {
84 llvm::outs() << std::format("PROCESSING SOURCE FILE {}\n",
85 std::string(fileName));
86 return std::unique_ptr<clang::ASTConsumer>{new MyAstConsumer};
87 }
88 };
89
90 int main(int argc, char** argv) {
91 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
92 const_cast<const char**>(argv), toolOptions);
93 if (!expectedOptionsParser) {
94 llvm::errs() << std::format("Unable to create option parser ({}).\n",
95 llvm::toString(expectedOptionsParser.takeError()));
96 return 1;
97 }
98 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
99 ct::ClangTool tool(optionsParser.getCompilations(),
100 optionsParser.getSourcePathList());
101 int status = tool.run(
102 ct::newFrontendActionFactory<MyFrontendAction>().get());
103 if (status) {llvm::errs() << "error detected\n";}
104 return !status ? 0 : 1;
105 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 259


Matcher-Based For-Statement Example: matcher.cpp (1)

1 #include <cassert>
2 #include <format>
3 #include <map>
4 #include "clang/AST/ASTContext.h"
5 #include "clang/ASTMatchers/ASTMatchers.h"
6 #include "clang/ASTMatchers/ASTMatchFinder.h"
7 #include "clang/AST/ParentMapContext.h"
8 #include "clang/AST/RecursiveASTVisitor.h"
9 #include "clang/Frontend/FrontendActions.h"
10 #include "clang/Tooling/CommonOptionsParser.h"
11 #include "clang/Tooling/Tooling.h"
12 #include "llvm/Support/CommandLine.h"
13
14 namespace ct = clang::tooling;
15 namespace cam = clang::ast_matchers;
16
17 static llvm::cl::OptionCategory optionCategory("Tool options");

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 260


Matcher-Based For-Statement Example: matcher.cpp (2)

19 template<class NodeType>
20 const NodeType* getParentOfStmt(clang::ASTContext& astContext,
21 const clang::Stmt* stmt) {
22 auto parents = astContext.getParents(*stmt);
23 const clang::Stmt* curStmt = nullptr;
24 const NodeType* parent = nullptr;
25 for (auto&& node : parents) {
26 if (auto p = node.get<NodeType>()) {
27 assert(!parent);
28 parent = p;
29 }
30 }
31 return parent;
32 }
33
34 unsigned getForDepth(clang::ASTContext& astContext,
35 const clang::Stmt* forStmt) {
36 assert(llvm::isa<clang::ForStmt>(forStmt) ||
37 llvm::isa<clang::CXXForRangeStmt>(forStmt));
38 unsigned count = 1;
39 const clang::Stmt* curStmt = forStmt;
40 while ((curStmt = getParentOfStmt<clang::Stmt>(astContext, curStmt))) {
41 if (llvm::isa<clang::ForStmt>(curStmt) ||
42 llvm::isa<clang::CXXForRangeStmt>(curStmt)) {++count;}
43 }
44 return count;
45 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 261


Matcher-Based For-Statement Example: matcher.cpp (3)
47 class MyMatchCallback : public cam::MatchFinder::MatchCallback {
48 public:
49 void run(const cam::MatchFinder::MatchResult& result) final;
50 void onStartOfTranslationUnit() final {funcTab_.clear();}
51 void onEndOfTranslationUnit() final;
52 private:
53 using FuncTab = std::map<const clang::FunctionDecl*, unsigned>;
54 FuncTab funcTab_;
55 };
56
57 void MyMatchCallback::onEndOfTranslationUnit() {
58 for (auto [funcDecl, maxForDepth] : funcTab_) {
59 llvm::outs() << std::format("{} ... {}\n",
60 funcDecl->getQualifiedNameAsString(), maxForDepth);
61 }
62 funcTab_.clear();
63 }
64
65 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
66 const clang::SourceManager& sourceManager = *result.SourceManager;
67 auto forStmt = result.Nodes.getNodeAs<clang::Stmt>("for");
68 auto funcDecl = result.Nodes.getNodeAs<clang::FunctionDecl>("func");
69 if (funcDecl && forStmt) {
70 auto iter = funcTab_.find(funcDecl);
71 if (iter == funcTab_.end()) {
72 iter = funcTab_.insert(std::make_pair(funcDecl, 0)).first;
73 }
74 unsigned depth = getForDepth(*result.Context, forStmt);
75 iter->second = std::max(iter->second, depth);
76 }
77 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 262


Matcher-Based For-Statement Example: matcher.cpp (4)

79 cam::StatementMatcher getMatcher() {
80 using namespace cam;
81 auto f = anyOf(forStmt(), cxxForRangeStmt());
82 return stmt(f, hasAncestor(functionDecl(isExpansionInMainFile()).bind(
83 "func")), unless(hasDescendant(stmt(f)))).bind("for");
84 }
85
86 struct MyAstConsumer : public clang::ASTConsumer {
87 void HandleTranslationUnit(clang::ASTContext& astContext) final {
88 MyMatchCallback matchCallback;
89 cam::StatementMatcher matcher = getMatcher();
90 cam::MatchFinder matchFinder;
91 matchFinder.addMatcher(matcher, &matchCallback);
92 matchFinder.matchAST(astContext);
93 }
94 };
95
96 struct MyFrontendAction : public clang::ASTFrontendAction {
97 std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(
98 clang::CompilerInstance&, clang::StringRef fileName) final {
99 llvm::outs() << std::format("PROCESSING SOURCE FILE {}\n",
100 std::string(fileName));
101 return std::unique_ptr<clang::ASTConsumer>{new MyAstConsumer};
102 }
103 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 263


Matcher-Based For-Statement Example: matcher.cpp (5)

105 int main(int argc, const char **argv) {


106 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
107 optionCategory);
108 if (!expectedParser) {
109 llvm::errs() << llvm::toString(expectedParser.takeError());
110 return 1;
111 }
112 ct::CommonOptionsParser& optionsParser = expectedParser.get();
113 ct::ClangTool tool(optionsParser.getCompilations(),
114 optionsParser.getSourcePathList());
115 int status =
116 tool.run(ct::newFrontendActionFactory<MyFrontendAction>().get());
117 if (status) {llvm::errs() << "error detected\n";}
118 return !status ? 0 : 1;
119 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 264


Section 3.10.6

References

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 265


Additional Sources of Information on AST Matchers

■ many good examples of AST matchers can be found in source code for
clang-tidy (in clang-tools-extra/clang-tidy
:::::::::::::::::::::::::::::::
directory of LLVM
source tree)
■ various talks and articles by Stephen Kelly (as well as talks/articles by Eli
Bendersky and others) listed in References section
■ search for “[clang-ast-matchers]” on StackOverflow
■ yet more examples of AST matchers can be found in
https://github.com/lanl/CoARCT

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 266


Section 3.11

Control-Flow Graphs (CFGs)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 267


clang::CFG Class
■ clang::CFG class represents control-flow graph (CFG) corresponding to
.............................
source-level intra-procedural control-flow of Stmt (i.e., statement)
■ CFG object is essentially collection of CFGBlock elements, which
represent basic blocks in CFG
■ CFG object always has two dummy blocks, designating entry and exit
points of CFG
■ some nonstatic methods provided by class include:
2 begin and end: return range corresponding to blocks in CFG

2 size: get number of blocks in CFG


2 isLinear: return true if CFG has no branches

■ clang::CFG::BuildCFG factory function provided for building CFG


corresponding to Stmt node in AST (such as compound statement
comprising function body)
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1CFG.html
2 https:

//clang.llvm.org/docs/InternalsManual.html#the-cfg-class
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 268
CFG Pretty Printer Example: Summary

■ in slides/examples/dump_cfg directory in companion repository


■ runs compiler frontend on specified source files to produce AST
■ uses AST matcher to find functions whose name matches regular
expression
■ for each function found, CFG is generated
■ information about CFG is output using pretty-print functionality of Clang
libraries

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 269


CFG Pretty Printer Example: Sample Program Output
Input Source File
int abs(int x) {
if (x < 0) {return -x;}
return x;
}

Program Output
FUNCTION : abs
[ B4 ( ENTRY )]
Succs (1): B3
[ B1 ]
1: x ( ImplicitCastExpr , LValueToRValue , int )
2: return [ B1 .1];
Preds (1): B3
Succs (1): B0
[ B2 ]
1: -x
2: return [ B2 .1];
Preds (1): B3
Succs (1): B0
[ B3 ]
1: x < 0
T: if [ B3 .1]
Preds (1): B4
Succs (2): B2 B1
[ B0 ( EXIT )]
Preds (2): B1 B2

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 270


CFG Pretty Printer Example: main.cpp (1)

1 #include <format>
2 #include <string>
3 #include "clang/Analysis/CFG.h"
4 #include "clang/ASTMatchers/ASTMatchers.h"
5 #include "clang/ASTMatchers/ASTMatchFinder.h"
6 #include "clang/Basic/LangOptions.h"
7 #include "clang/Frontend/FrontendActions.h"
8 #include "clang/Tooling/CommonOptionsParser.h"
9 #include "clang/Tooling/Tooling.h"
10 #include "llvm/Support/CommandLine.h"
11
12 namespace cam = clang::ast_matchers;
13 namespace ct = clang::tooling;
14 namespace lc = llvm::cl;
15
16 static lc::OptionCategory toolCategory("Tool Options");
17 static lc::opt<std::string> clFuncNamePattern("f", lc::cat(toolCategory),
18 lc::init(".*"));
19 static lc::opt<bool> clUseColor("c", lc::cat(toolCategory), lc::init(false));

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 271


CFG Pretty Printer Example: main.cpp (2)

21 cam::DeclarationMatcher getFuncMatcher(const std::string& namePattern)


22 {return cam::functionDecl(cam::matchesName(namePattern)).bind("func");}
23
24 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
25 virtual void run(const cam::MatchFinder::MatchResult& result) final {
26 if (const auto* funcDecl =
27 result.Nodes.getNodeAs<clang::FunctionDecl>("func")) {
28 clang::ASTContext *astContext = result.Context;
29 clang::Stmt *funcBody = funcDecl->getBody();
30 if (!funcBody) {return;}
31 llvm::outs() << std::format("FUNCTION: {}\n",
32 funcDecl->getQualifiedNameAsString());
33 std::unique_ptr<clang::CFG> cfg = clang::CFG::buildCFG(
34 funcDecl, funcBody, astContext, clang::CFG::BuildOptions());
35 if (!cfg) {
36 llvm::outs() << "unable to generate CFG\n";
37 return;
38 }
39 auto langOpts = astContext->getLangOpts();
40 cfg->print(llvm::outs(), langOpts, clUseColor);
41 }
42 }
43 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 272


CFG Pretty Printer Example: main.cpp (3)

45 int main(int argc, const char **argv) {


46 llvm::Expected<ct::CommonOptionsParser> expOptionsParser =
47 ct::CommonOptionsParser::create(argc, argv, toolCategory);
48 if (!expOptionsParser) {
49 llvm::errs() << llvm::toString(expOptionsParser.takeError());
50 return 1;
51 }
52 ct::CommonOptionsParser& optionsParser = *expOptionsParser;
53 ct::ClangTool tool(optionsParser.getCompilations(),
54 optionsParser.getSourcePathList());
55 cam::DeclarationMatcher funcMatcher = getFuncMatcher(clFuncNamePattern);
56 MyMatchCallback matchCallback;
57 cam::MatchFinder finder;
58 finder.addMatcher(funcMatcher, &matchCallback);
59 int status = tool.run(ct::newFrontendActionFactory(&finder).get());
60 if (status) {llvm::errs() << "error occurred\n";}
61 return !status ? 0 : 1;
62 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 273


clang::CFGBlock Class

■ clang::CFGBlock class represents single basic block in CFG


■ CFGBlock object consists of:
2 set of statements/expressions (which may contain subexpressions)
2 terminator statement (not in set of statements), which represents type of
control-flow that occurs at end of basic block
2 list of successors (where order is not arbitrary)
2 list of predecessors (where order is arbitrary)
■ some methods provided by class include:
2 begin and end: return range corresponding to elements in block (e.g.,

statements)
2 size: get number of elements in block
2 succ_begin and succ_end: return range corresponding to blocks that are
successors to block
2 succ_size: get number of successor blocks
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1CFGBlock.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 274


clang::CFGElement Class

■ clang::CFGElement class represents top-level expression in basic block


■ some methods provided by class include:
2 dumpToStream: outputs information about element to stream in

human-readable format
2 getKind: get kind of element (e.g., statement, constructor)
2 getAs: get as specified type or return empty optional if does not have
specified type
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1CFGElement.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 275


Cyclomatic-Complexity Example: Summary
■ in slides/examples/cyclomatic_complexity directory in companion
repository
■ cyclomatic (a.k.a., McCabe) complexity is measure of code complexity
::::::::::::::::::::::::::::::::::
■ cyclomatic complexity M = E − N + 2P, where E is number of edges, N is
number of nodes, P is number of connected components
■ runs compiler frontend on specified source files to produce AST
■ uses AST matcher to find function definitions
■ for each function definition, program does following:
2 constructs CFG for function
2 computes cyclomatic complexity of function
2 if complexity not less than specified threshold, prints complexity information
for function
■ complexity threshold can be specified as command-line option
■ output resembles something like:
identity 1
abs 2
foo 6

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 276


Cyclomatic-Complexity Example: matcher.cpp (1)

1 #include <format>
2 #include "clang/Analysis/CFG.h"
3 #include "clang/AST/ASTContext.h"
4 #include "clang/ASTMatchers/ASTMatchers.h"
5 #include "clang/ASTMatchers/ASTMatchFinder.h"
6 #include "clang/Tooling/CommonOptionsParser.h"
7 #include "clang/Tooling/Tooling.h"
8 #include "llvm/Support/CommandLine.h"
9 #include "llvm/Support/raw_ostream.h"
10
11 namespace ct = clang::tooling;
12 namespace cam = clang::ast_matchers;
13
14 static llvm::cl::OptionCategory toolCategory("Tool Options");
15 static llvm::cl::opt<unsigned int> thresholdOption("t",
16 llvm::cl::init(0), llvm::cl::desc("Set complexity threshold."),
17 llvm::cl::cat(toolCategory));

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 277


Cyclomatic-Complexity Example: matcher.cpp (2)
19 int cyclomaticComplexity(const clang::FunctionDecl& funcDecl,
20 clang::ASTContext& astContext) {
21 const auto cfg = clang::CFG::buildCFG(&funcDecl, funcDecl.getBody(),
22 &astContext, clang::CFG::BuildOptions());
23 if (!cfg) {return -1;}
24 const int numNodes = cfg->size() - 2;
25 int numEdges = 0;
26 for (const auto* block : *cfg) {numEdges += block->succ_size();}
27 numEdges -= 2;
28 return numEdges - numNodes + (2 * 1);
29 }
30
31 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
32 using MatchResult = cam::MatchFinder::MatchResult;
33 void run(const MatchResult& result) override {
34 const auto* function =
35 result.Nodes.getNodeAs<clang::FunctionDecl>("f");
36 std::string s = function->getQualifiedNameAsString();
37 int complexity = cyclomaticComplexity(*function,
38 *result.Context);
39 if (complexity >= 0 && complexity >= thresholdOption) {
40 llvm::outs() << std::format("{} {}\n", s, complexity);
41 }
42 }
43 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 278


Cyclomatic-Complexity Example: matcher.cpp (3)

45 int main(int argc, char** argv) {


46 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
47 const_cast<const char**>(argv), toolCategory);
48 if (!expectedOptionsParser) {
49 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
50 return 1;
51 }
52 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
53 auto matcher =
54 cam::functionDecl(cam::isExpansionInMainFile()).bind("f");
55 MyMatchCallback matchCallback;
56 cam::MatchFinder matchFinder;
57 matchFinder.addMatcher(matcher, &matchCallback);
58 ct::ClangTool tool(optionsParser.getCompilations(),
59 optionsParser.getSourcePathList());
60 auto status =
61 tool.run(ct::newFrontendActionFactory(&matchFinder).get());
62 if (status) {llvm::errs() << "error detected\n";}
63 return !status ? 0 : 1;
64 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 279


CFG Example: Summary

■ in slides/examples/cfg_1 directory in companion repository


■ runs compiler frontend on specified source files to produce AST
■ uses AST matcher to find functions (or functions with specified name)
■ for each function found, program generates CFG for function and prints
information about CFG blocks and elements
■ output resembles something like:
FUNCTION : main
block : 0 ( exit )
block : 1
successors : 0
statement : 0
statement : return 0;
block : 2
successors : 0
statement : 1
statement : return 1;
block : 3
successors : 2 1
statement : argc != 1
block : 4 ( entry )
successors : 3

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 280


CFG Example: cfg.cpp (1)
1 #include <format>
2 #include <map>
3 #include <string>
4 #include "clang/Analysis/CFG.h"
5 #include "clang/AST/ASTContext.h"
6 #include "clang/ASTMatchers/ASTMatchers.h"
7 #include "clang/ASTMatchers/ASTMatchFinder.h"
8 #include "clang/Frontend/FrontendAction.h"
9 #include "clang/Tooling/CommonOptionsParser.h"
10 #include "clang/Tooling/Tooling.h"
11 #include "llvm/Support/CommandLine.h"
12 #include "llvm/Support/raw_ostream.h"
13
14 namespace lc = llvm::cl;
15 namespace ct = clang::tooling;
16 namespace cam = clang::ast_matchers;
17
18 static lc::OptionCategory toolCategory("Tool Options");
19 static lc::opt<std::string> clFuncName("f", lc::cat(toolCategory));
20
21 std::string toString(clang::CFGElement::Kind kind) {
22 const std::map<clang::CFGElement::Kind, std::string> lut{
23 {clang::CFGElement::Kind::Statement, "statement"},
24 {clang::CFGElement::Kind::Constructor, "constructor"},
25 {clang::CFGElement::Kind::CXXRecordTypedCall, "recordTypedCall"},
26 };
27 auto i = lut.find(kind);
28 return std::format("{}", (i != lut.end() ? i->second : "unknown"));
29 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 281


CFG Example: cfg.cpp (2)
31 void printBlock(llvm::raw_ostream& out, const clang::CFG& cfg,
32 const clang::CFGBlock& block) {
33 out << std::format("block: {}", block.BlockID);
34 if (&block == &cfg.getEntry()) {out << " (entry)";}
35 if (&block == &cfg.getExit()) {out << " (exit)";}
36 if (block.hasNoReturnElement()) {out << " (noreturn)";}
37 out << ’\n’;
38 if (block.succ_size()) {
39 out << "successors:";
40 for (auto succBlockIter = block.succ_begin(); succBlockIter !=
41 block.succ_end(); ++succBlockIter) {
42 out << std::format(" {}", (*succBlockIter) ? std::format("{}",
43 (*succBlockIter)->BlockID) : "invalid");
44 }
45 out << ’\n’;
46 }
47 for (auto elemIter = block.begin(); elemIter != block.end(); ++elemIter) {
48 out << std::format("{}: ", toString(elemIter->getKind()));
49 elemIter->dumpToStream(out);
50 }
51 }
52
53 void processFunc(const clang::FunctionDecl& funcDecl, clang::ASTContext&
54 astContext) {
55 llvm::outs() << std::format("FUNCTION: {}\n",
56 funcDecl.getQualifiedNameAsString());
57 const auto cfg = clang::CFG::buildCFG(&funcDecl, funcDecl.getBody(),
58 &astContext, clang::CFG::BuildOptions());
59 if (!cfg) {return;}
60 for (auto blockIter = cfg->nodes_begin(); blockIter != cfg->nodes_end();
61 ++blockIter) {printBlock(llvm::outs(), *cfg, **blockIter);}
62 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 282


CFG Example: cfg.cpp (3)
64 cam::DeclarationMatcher getFuncMatcher(const std::string& name) {
65 return (name.size() ? cam::functionDecl(cam::hasName(name)) :
66 cam::functionDecl()).bind("func");
67 }
68
69 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
70 virtual void run(const cam::MatchFinder::MatchResult& result) final {
71 if (const auto* funcDecl =
72 result.Nodes.getNodeAs<clang::FunctionDecl>("func")) {
73 if (const clang::Stmt *funcBody = funcDecl->getBody())
74 {processFunc(*funcDecl, *result.Context);}
75 }
76 }
77 };
78
79 int main(int argc, char** argv) {
80 auto expectedOptionsParser = ct::CommonOptionsParser::create(argc,
81 const_cast<const char**>(argv), toolCategory);
82 if (!expectedOptionsParser) {
83 llvm::errs() << llvm::toString(expectedOptionsParser.takeError());
84 return 1;
85 }
86 ct::CommonOptionsParser& optionsParser = *expectedOptionsParser;
87 ct::ClangTool tool(optionsParser.getCompilations(),
88 optionsParser.getSourcePathList());
89 cam::DeclarationMatcher funcMatcher = getFuncMatcher(clFuncName);
90 MyMatchCallback matchCallback;
91 cam::MatchFinder finder;
92 finder.addMatcher(funcMatcher, &matchCallback);
93 int status = tool.run(ct::newFrontendActionFactory(&finder).get());
94 if (status) {llvm::errs() << "error occurred\n";}
95 return !status ? 0 : 1;
96 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 283


Section 3.11.1

Code Analysis

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 284


clang::AnalysisDeclContextManager Class

■ clang::AnalysisDeclContextManager class provides means to


create/manage clang::AnalysisDeclContext instances
■ AnalysisDeclContext class (to be discussed shortly) used to hold state
needed for some types of code analysis
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1AnalysisDeclContextManager.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 285


clang::AnalysisDeclContext Class

■ clang::AnalysisDeclContext class contains context data for function,


method, or block under analysis
■ holds CFG and CFG-related information
■ can be used to perform various kinds of code analysis
■ getAnalysis method:
2 factory function for generating code-analysis objects of various types
2 kind of analysis to perform specified via type template parameter
2 returns specified analysis object, lazily running analysis if necessary or
nullptr if analysis could not run
2 example of analysis type: clang::LiveVariables
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_

1AnalysisDeclContext.html#details

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 286


clang::LiveVariables Class

■ clang::LiveVariables class provides mechanism for performing


live-variable analysis
. . . . . . . . ................
■ provides methods for querying if variable is live at various points (e.g., at
end of specified block or at beginning of statement)
■ for more information, see:
2 https:

//clang.llvm.org/doxygen/classclang_1_1LiveVariables.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 287


Liveness-Analysis Example: Summary

■ in slides/examples/liveness_analysis directory in companion


repository
■ runs compiler frontend on specified source files to produce AST
■ uses AST matcher to find functions (or functions with specified name)
■ for each function found, CFG for function is generated and liveness
analysis is performed, and then results output
■ sample output shown on next slide

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 288


Liveness-Analysis Example: Sample Program Output
Input Source File
1 int foo(int x, int y) { // B5 (entry)
2 int t = x * y; // B4.1
3 if ((x + 1) * (x - 1) == y) { // B4.2
4 t = 1; // B3
5 } else {
6 t = 2; // B2
7 }
8 return t; // B1
9 } // B0 (exit)

Program Output
FUNCTION : foo
[ B0 ( live variables at block exit ) ]
[ B1 ( live variables at block exit ) ]
[ B2 ( live variables at block exit ) ]
t </ home / jdoe / example_3_b . cpp :2:6 >
[ B3 ( live variables at block exit ) ]
t </ home / jdoe / example_3_b . cpp :2:6 >
[ B4 ( live variables at block exit ) ]
[ B5 ( live variables at block exit ) ]
x </ home / jdoe / example_3_b . cpp :1:13 >
y </ home / jdoe / example_3_b . cpp :1:20 >

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 289


Liveness-Analysis Example: analyze.hpp

1 #include "clang/AST/ASTContext.h"
2 void analyzeFunc(clang::ASTContext& astContext, const clang::FunctionDecl*
3 funcDecl, bool printCfg);

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 290


Liveness-Analysis Example: analyze.cpp

1 #include "clang/AST/ASTContext.h"
2 #include "clang/Analysis/CFG.h"
3 #include "clang/Analysis/AnalysisDeclContext.h"
4 #include "clang/Analysis/Analyses/LiveVariables.h"
5
6 void analyzeFunc(clang::ASTContext& astContext, const clang::FunctionDecl*
7 funcDecl, bool printCfg) {
8 clang::AnalysisDeclContextManager adcm(astContext);
9 clang::AnalysisDeclContext *adc = adcm.getContext(
10 llvm::cast<clang::Decl>(funcDecl));
11 assert(adc);
12 adc->getCFGBuildOptions().setAllAlwaysAdd();
13 const clang::CFG& cfg = *adc->getCFG();
14 if (printCfg)
15 {cfg.print(llvm::outs(), astContext.getLangOpts(), false);}
16 clang::LiveVariables *lv = adc->getAnalysis<clang::LiveVariables>();
17 if (!lv) {return;}
18 auto observer = std::make_unique<clang::LiveVariables::Observer>();
19 assert(observer);
20 lv->runOnAllBlocks(*observer);
21 lv->dumpBlockLiveness((funcDecl->getASTContext()).getSourceManager());
22 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 291


Liveness-Analysis Example: main.cpp (1)
1 #include <format>
2 #include <string>
3 #include "clang/ASTMatchers/ASTMatchers.h"
4 #include "clang/ASTMatchers/ASTMatchFinder.h"
5 #include "clang/Frontend/FrontendActions.h"
6 #include "clang/Tooling/CommonOptionsParser.h"
7 #include "clang/Tooling/Tooling.h"
8 #include "llvm/Support/CommandLine.h"
9 #include "analyze.hpp"
10
11 namespace cam = clang::ast_matchers;
12 namespace ct = clang::tooling;
13 namespace lc = llvm::cl;
14
15 static lc::OptionCategory toolCategory("Tool Options");
16 static lc::opt<std::string> clFuncNamePattern("f", lc::cat(toolCategory),
17 lc::init(".*"));
18 static lc::opt<bool> clPrintCfg("c", lc::cat(toolCategory), lc::init(false));
19
20 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
21 virtual void run(const cam::MatchFinder::MatchResult& result) final {
22 if (auto funcDecl =
23 result.Nodes.getNodeAs<clang::FunctionDecl>("func")) {
24 clang::ASTContext *astContext = result.Context;
25 clang::Stmt *funcBody = funcDecl->getBody();
26 if (!funcBody) {return;}
27 llvm::outs() << std::format("FUNCTION: {}\n",
28 funcDecl->getQualifiedNameAsString());
29 analyzeFunc(*astContext, funcDecl, clPrintCfg);
30 }
31 }
32 };

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 292


Liveness-Analysis Example: main.cpp (2)

34 cam::DeclarationMatcher getFuncMatcher(const std::string& namePattern)


35 {return cam::functionDecl(cam::matchesName(namePattern)).bind("func");}
36
37 int main(int argc, const char **argv) {
38 llvm::Expected<ct::CommonOptionsParser> expOptionsParser =
39 ct::CommonOptionsParser::create(argc, argv, toolCategory);
40 if (!expOptionsParser) {
41 llvm::errs() << llvm::toString(expOptionsParser.takeError());
42 return 1;
43 }
44 ct::CommonOptionsParser& optionsParser = *expOptionsParser;
45 ct::ClangTool tool(optionsParser.getCompilations(),
46 optionsParser.getSourcePathList());
47 cam::DeclarationMatcher funcMatcher = getFuncMatcher(clFuncNamePattern);
48 MyMatchCallback matchCallback;
49 cam::MatchFinder finder;
50 finder.addMatcher(funcMatcher, &matchCallback);
51 int status = tool.run(ct::newFrontendActionFactory(&finder).get());
52 if (status) {llvm::errs() << "error occurred\n";}
53 return !status ? 0 : 1;
54 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 293


Section 3.12

Declarations

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 294


Declaration Names

■ DeclarationName
:::::::::::::::::
class represents name of declaration
■ declarations names can name entities such as:
2 simple identifiers
2 class constructors and destructors
2 overloaded operators
2 conversion functions
■ getNameKind member of DeclarationName returns value indicating kind
of name stored
■ DeclarationName instances can be compared
■ see also: https://clang.llvm.org/docs/InternalsManual.html#
declaration-names

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 295


Declaration Contexts
■ declaration context: collection of related declarations
■ declaration contexts represented by DeclContext
::::::::::::
class
■ every declaration exists within some declaration context
■ declaration contexts associated with types that inherit from DeclContext
and correspond to such things as:
2 translation units declarations (TranslationUnitDecl)
2 namespaces declarations (NamespaceDecl)

2 enum, class, and union declarations (TagDecl)


2 nonmember/member function declarations (FunctionDecl)

■ declaration contexts have hierarchical (i.e., parent-child) structure with


context for translation unit at top of hierarchy
■ numerous classes deriving from DeclContext used to distinguish
between different types of declaration contexts (some of these types
noted above)
■ see also: https://clang.llvm.org/docs/InternalsManual.html#
declaration-contexts
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 296
Lexical and Semantic Declaration Contexts

■ each declaration has two potentially different declaration contexts:


1 lexical (i.e., source centric)
2 semantic
■ Decl class provides methods getLexicalDeclContext and
getDeclContext, for accessing lexical and semantic contexts,
respectively
■ lexical context corresponds to declaration context in which actual
declaration occurred in source code
■ semantic context corresponds to declaration context to which declaration
belongs in terms of language semantics
■ in most cases, lexical and semantic contexts tend to be same; but not
always (e.g., when member function defined outside class body)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 297


Example: Lexical Versus Semantic Declaration Contexts

■ consider lexical and semantic declaration contexts of C::f in following


code:
class C {
public:
// for declarations of each of C::x and C::f, lexical
// and semantic contexts are same, namely, class
// (i.e., CXXRecordDecl) to which declared member
// of C belongs
static int x;
void f();
};
// for definitions of C::x and C::f outside class body,
// lexical context is translation unit (i.e.,
// TranslationUnitDecl) containing definition, and
// semantic context is class (i.e., CXXRecordDecl) to
// which defined member belongs
static int x = 0;
void C::f() {}

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 298


Semantic Declaration Context Hierarchy Example
1 // alpha: TranslationUnitDecl
2 namespace alpha {
3 // beta: NamespaceDecl alpha -> TranslationUnitDecl
4 namespace beta {
5 // Widget: NamespaceDecl beta -> NamespaceDecl alpha -> TranslationUnitDecl
6 struct Widget {
7 // g1: CXXRecord Widget -> NamespaceDecl beta -> NamespaceDecl alpha ->
8 // TranslationUnitDecl
9 int g1();
10 // i1: CXXRecord Widget -> NamespaceDecl beta -> NamespaceDecl alpha ->
11 // TranslationUnitDecl
12 static int i1;
13 };
14 // w1: NamespaceDecl beta -> NamespaceDecl alpha -> TranslationUnitDecl
15 Widget w1;
16 // f1: NamespaceDecl beta -> NamespaceDecl alpha -> TranslationUnitDecl
17 int f1();
18 } // namespace beta
19 } // namespace alpha
20
21 // i1: CXXRecord Widget -> NamespaceDecl beta -> NamespaceDecl alpha ->
22 // TranslationUnitDecl
23 int alpha::beta::Widget::i1;
24 // w2: TranslationUnitDecl
25 alpha::beta::Widget w2;
26 // f2: TranslationUnitDecl
27 int f2();
■ each declaration labelled with its chain of semantic declaration contexts

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 299


Section 3.13

Types

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 300


Key Concepts Related to Types

■ syntactic sugar: syntax intended to make code more readable (e.g., type
alias introduced by typedef or using); see also ::::::::::::
Wikipedia link
■ unqualified type: type without any (top-level) qualifiers (e.g., const,
volatile, and restrict)
■ qualified type: type that can potentially have (top-level) qualifiers (e.g.,
const, volatile, and restrict)
■ due to type aliases (e.g., resulting from typedef and using alias
declarations), type often does not have unique representation
■ canonical type: unique representation of type that results from removing
all syntactic sugar and type aliases from type (see also: https://clang.
llvm.org/docs/InternalsManual.html#canonical-types)
■ desugared type: type with syntactic sugar removed

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 301


Type Sugar

■ sugar in representation of types in AST used to capture information useful


for tooling but not strictly required for compilation
■ many kinds of sugar for types
■ type sugar used to capture information such as:
2 exact spelling of keywords in source code (e.g., struct versus class)
2 presence and exact spelling of name qualifiers in source code (e.g.,
std::find versus find)
2 type aliases (introduced by typedef and using)
2 namespace aliases
2 use of extra (i.e., redundant) parentheses
2 redundant use of keywords such as enum, struct, class, and union

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 302


Examples of Type Sugar

■ type aliases used to specify type:


using MyInt = int;
MyInt i; // "MyInt" is sugar
■ parentheses used in type specification in variable declaration:
int (x); // parentheses are sugar
■ unnecessary (i.e., redundant) use of class, struct, or union
keyword when specifying type:
class Widget {};
class Widget w; // "class" is sugar
auto bad = (class Widget*) 42;
// "class" is sugar

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 303


Types in Clang

■ in Clang, type qualifiers (e.g., const, volatile, restrict)


represented separately from basic (i.e., unqualified) type
■ since representation of basic type requires much more storage than
representation of type qualifiers, important to minimize storage for basic
type information
■ basic type and type qualfiers separated in order to avoid redundantly
storing same basic type information, once for every combination of type
qualifiers used in source (e.g., widget, const widget,
volatile widget, const volatile widget)
■ leads to two key types associated with type representations:
2 Type: basic (i.e., unqualified) type
2 QualType: potentially-qualified type (i.e., basic type with zero or more type

qualifiers)
■ QualType is very small type that contains set of zero or more qualifiers
and pointer to Type instance

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 304


clang::Type Class (1)

■ clang::Type
::::::::::::
class used to represent unqualified type
■ all (unqualified) types represented using types that derive from Type
■ some examples of types derived from Type include:
2 ArrayType, AttributedType, BuiltinType, DeducedType,
::::::::: :::::::::::::: :::::::::::: :::::::::::
ElaboratedType , FunctionType, FunctionProtoType,
:::::::::::::: ::::::::::::: :::::::::::::::::
MemberPointerType , PointerType, ReferenceType, TagType,
::::::::::::::::: :::::::::::: ::::::::::::: :::::::
EnumType, RecordType
:::::::: ::::::::::
, and TypedefType
:::::::::::
■ methods provided to do things such as:
2 check if type has particular property, such as: pointer type
(isPointerType), literal type (isLiteralType), and trivially-copyable type
(isTriviallyCopyableType)
2 get corresponding canonical type
2 get pointee or array-element type (if any)
2 get access to related type information (e.g., CXXRecordDecl and
getTypeClassName)
2 dump information about Type instance (e.g., dump)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 305


clang::Type Class (2)

■ important to use members provided by Type in order to check properties


of types as not doing so will likely result in incorrect handling of many
special cases
■ for example, cannot check for pointer type by simply checking if Type
object has PointerType as derived type (e.g., via llvm::isa), since
pointer type can also be associated with TypedefType
■ see also: https://clang.llvm.org/docs/InternalsManual.html#
the-type-class-and-its-subclasses

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 306


clang::QualType Class (1)
■ clang::QualType
:::::::::::::::::
class used to represent (potentially) qualified type
■ conceptually QualType is pair that stores Type* and qualifiers
■ type is small and passed by value
■ has methods to do things such as:
2 check if QualType refers to any Type (i.e., isNull)
2 get pointer to underlying Type instance (i.e., getTypePtr or

getTypePtrOrNull)
2 check properties of types that depend on qualifiers (e.g.,
isConstQualified)
2 add/remove qualifiers for QualType (e.g., addConst and
removeLocalConst)
2 create QualType instance with qualifiers added/removed (e.g., withConst)
2 create QualType instance from Type instance with specified qualifiers (via
QualType constructor)
2 get QualType instance with various levels of desugaring (e.g.,
getDesugaredType and getSingleStepDesugaredType)
2 convert type to string (e.g., getAsString)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 307


clang::QualType Class (2)

■ dereferencing QualType via indirection (i.e., operator* or


operator->) yields underlying Type
■ QualTypes instances compare equal if their types and qualifiers are
identical
■ QualType also ordered
■ QualType can be constructed from Type* and qualifiers
■ pointee types and array element types not sugar (i.e., are distinct
semantic entities); so desugar functions will not pass through them
■ see also:
https://clang.llvm.org/docs/InternalsManual.html#qualtype

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 308


Remarks on Dumps of QualType Instances

■ QualType class has dump method that prints type represented by


QualType instance in human-readable format
■ such dumps often useful for debugging and in tools like clang-query
■ some comments in order with respect to format of QualType dumps
■ QualType instance for which dump method invoked not shown unless one
or more qualifiers present
■ Type AST nodes typically have QualType member to link to child node
■ although QualType instances contained in AST nodes and are not AST
nodes themselves, QualType instance shown as descendant of its
containing node in dump
■ QualType only shown in dump, however, if one or more qualifiers present;
otherwise, omitted

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 309


QualType Dump Interpretation
Source Code
const char *** p; // consider type of p

Actual Representation (Clang 15)


QualType: unqualified
Dump Visualization (Clang 15)
PointerType
PointerType
QualType: unqualified
PointerType
PointerType
PointerType
QualType: unqualified
QualType: const
PointerType
BuiltinType: char
QualType: const
BuiltinType: char

■ to reduce verbosity of output, dump method of QualType class does not


display QualType instance unless qualifiers present
■ subsequent slides will adopt similar convention

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 310


General Comments About ASTs for Types

■ most nodes used in AST representation of types derive from Type class
but some exceptions to this
■ for example, in case of templates, some expressions appearing in
template arguments cannot be evaluated until template instantiatied
■ for this reason, some of AST nodes that derive from Type sometimes
need to refer to expression (i.e., Expr) nodes that represent expression to
be evaluated (when template instantiated)
■ after template instantiated, expressions no longer needed for compilation
(but might perhaps be useful for tooling)
■ for example, in case of following template, size of array not known until
template instantiated:
template<int I> using RealArray = float[I];

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 311


clang::BuiltinType Class

■ clang::BuiltinType
::::::::::::::::::::
class used to represent built-in types (such as
char, int, and float)
■ getKind method to query particular built-in type
■ getName method to obtain name of type as string
■ various methods corresponding to predicates to test type for certain
properties, such as:
2 isInteger
2 isFloatingPoint
2 isSignedInteger
2 isUnsignedInteger

■ terminal node in AST representation of type

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 312


clang::ArrayType Class
■ clang::ArrayType
::::::::::::::::::
class represents array type
■ getElementType method gets QualType for array element
■ getIndexTypeQualifiers method gets qualifiers for array index type
■ several types derive from ArrayType:
2 ConstantArrayType: represents C array with constant size
:::::::::::::::::
2 DependentSizedArrayType: represents array where size is
:::::::::::::::::::::::
value-dependent expression
2 IncompleteArrayType
:::::::::::::::::::
: represents C array with unspecified size
2 VariableArrayType
:::::::::::::::::
: represents C array with size that is not integer
constant expression (i.e., variable length array)
■ ConstantArrayType has getSize method to get number of elements in
array
■ example of value-dependent expression for array size:
template<typename T, int Size>
class array {
T data[Size]; // Size is value dependent
};

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 313


Array Example

Source Code
float x[5][4][3];

Fully-Sugared Type and Canonical Type (Clang 15)


ConstantArrayType: size 5
ConstantArrayType: size 4
ConstantArrayType: size 3
BuiltinType: float

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 314


clang::PointerType Class

■ clang::PointerType
::::::::::::::::::::
class used to represent pointer type
■ provides getPointeeType method to get QualType for type of object to
which pointer points

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 315


Pointer Example

Source Code
volatile char** const p = nullptr; // consider type of p

Fully-Sugared and Canonical Type (Clang 15)


QualType: const
PointerType
PointerType
QualType: volatile
BuiltinType: char

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 316


clang::ReferenceType Class

■ clang::ReferenceType
::::::::::::::::::::::
class used to represent reference type
■ provides getPointeeType method to obtain QualType for pointee type
(i.e., type of value referred to by reference)
■ two types derive from ReferenceType:
2 LValueReferenceType: represents lvalue reference type
:::::::::::::::::::
2 RValueReferenceType: represents rvalue reference type
:::::::::::::::::::

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 317


clang::FunctionType Class

■ clang::FunctionType
:::::::::::::::::::::
class used to represent function types
■ FunctionType has two derived types:
2 FunctionProtoType
:::::::::::::::::
: represents function type with parameter types
specified (as in C++ function declaration)
2 FunctionNoProtoType
:::::::::::::::::::
: represents function type without parameter types
specified (as in old style C)
■ FunctionProtoType members include:
2 getNumParams: get number of function parameters
2 getParamType: get type of specified function parameter as QualType
2 getReturnType: get function return type as QualType
2 getExceptionSpecType: get exception specification type for function
2 getNumExceptions get number of types of exceptions that can be thrown
by function
2 getExceptionType: get specific exception type for function as QualType

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 318


Function Example

Source Code
using Size = unsigned long;
void f(char*, Size); // consider type of f

Fully Sugared Type (Clang 15)


FunctionProtoType Canonical Type (Clang 15)
BuiltinType: void FunctionProtoType
PointerType BuiltinType: void
BuiltinType: char PointerType
TypedefType BuiltinType: char
TypeAliasDecl: Size BuiltinType: unsigned long
BuiltinType: unsigned long

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 319


clang::TagType Class

■ clang::TagType class used to represent types associated with tags (i.e.,


::::::::::::::::
enum, struct, class, and union)
■ has getDecl method to obtain TagDecl corresponding to TagType
■ has two derived classes:
2 EnumType: used to represent enumerations (i.e., enum)
::::::::
2 RecordType: used to represent structs/classes and unions
::::::::::
■ EnumType:
2 has method getDecl to obtain EnumDecl corresponding to EnumType
2 getIntegerType method gets QualType for integer type used to hold
enumeration value
■ RecordType:
2 getDecl method provides access to RecordDecl that declares type
2 in turn RecordDecl has getDeclType to obtain QualType associated with
RecordType

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 320


clang::MemberPointerType Class

■ clang::MemberPointerType class used to represent pointer to member


:::::::::::::::::::::::::::
■ getPointeeType: gets type of pointee as QualType
■ getClass: gets unqualified type for class to which member pointer
corresponds as Type*
■ isMemberFunctionPointer: tests if member pointer points to function
member (as opposed to data member)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 321


Member Pointer Example

Source Code
struct widget {int x;};
const int widget::*p = &widget::x; // consider type of p

Fully-Sugared Type and Canonical Type (Clang 15)


MemberPointerType
RecordType
CXXRecordDecl: widget
QualType: const
BuiltinType: int

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 322


clang::TypedefType Class

■ clang::TypedefType class represents alias declarations (i.e.,


::::::::::::::::::::
declarations employing typedef and using)
■ has desugar method to obtain QualType of type named by type alias
■ has getDecl method to obtain TypedefNameDecl corresponding to
TypedefType
■ from TypedefNameDecl can obtain name of type alias
■ can also obtain type associated with alias from TypedefNameDecl via
getUnderlyingType method

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 323


Type Alias Example

Source Code
typedef float Real;
using RealArray32 = Real[3][2];
RealArray32 x; // consider type of x

(One-Level) Desugared Type (Clang 15)

Fully Sugared Type (Clang 15) ConstantArrayType: size 3

TypedefType ConstantArrayType: size 2

TypeAliasDecl: RealArray32 TypedefType

ConstantArrayType: size 3 TypedefDecl: Real

ConstantArrayType: size 2 BuiltinType: float

TypedefType Canonical Type (Clang 15)


TypedefDecl: Real ConstantArrayType: size 3
BuiltinType: float ConstantArrayType: size 2
BuiltinType: float

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 324


clang::TypeWithKeyword Class

■ clang::TypeWithKeyword
:::::::::::::::::::::::::
class captures information about enum,
struct, union, and class keywords
■ has derived types:
2 ElaboratedType: represents elaborated type keyword (e.g., struct S)
::::::::::::::
or qualified name (e.g., N::M::type) or both
2 DependentNameType: represents qualified type name for which type name
:::::::::::::::::
is dependent
2 DependentTemplateSpecializationType: represents template
::::::::::::::::::::::::::::::::::::
specialization type whose template cannot be resolved

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 325


Elaborated Type Example

Source Code
namespace foo {struct widget {};}
struct foo::widget w;

Fully Sugared Type (Clang 15)


Canonical Type (Clang 15)
ElaboratedType: “struct” “foo::widget”
RecordType: foo::widget
RecordType: foo::widget
CXXRecord: widget
CXXRecord: widget

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 326


clang::ParenType Class

■ clang::ParenType
::::::::::::::::::
class captures information about parentheses used
when specifying types
■ has getInnerType method for obtaining QualType for type enclosed by
parentheses

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 327


Parenthesized Type Example

Source Code
int (((x)));

Fully Sugared Type (Clang 15)


ParenType
Canonical Type (Clang 15)
ParenType
BuiltinType: int
ParenType
BuiltinType: int

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 328


Other Remarks

■ SplitQualType
:::::::::::::::
is pair-like structure for storing qualified type split into
local qualifiers and locally-unqualified type
■ CanQualType
::::::::::::
is alias for clang::CanQual<clang::Type>
■ DEF_TRAVERSE_TYPE macro invocations in header
clang/AST/RecursiveASTVisitor.h can be used to determine
relatively easily how type nodes traversed
■ types can be constructed via members of ASTContext

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 329


Section 3.14

Type Locations

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 330


clang::TypeLoc Class

■ TypeLoc
::::::::
class used as base class for representing spelling of type
information in source code
■ TypeLoc instance for many places where type spelled in source code
(e.g., spelling of type in declarator)
■ TypeLoc has two direct derived classes:
2 QualifiedTypeLoc
::::::::::::::::
: associated with type having non-trivial direct qualifiers
2 UnqualTypeLoc
:::::::::::::
: associated with type having no direct qualifiers
■ UnqualTypeLoc has many derived types, including one for each type
derived from Type (e.g., ArrayTypeLoc, EnumTypeLoc, RecordTypeLoc,
and so on)
■ TypeLoc instances form shadow hierarchy of Type hierarchy

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 331


TypeLoc Uses

■ TypeLoc instances used to capture how type information spelled in


source code
■ used for things such as type in:
2 declarator (DeclaratorDecl)
::::::::::::::
2 template argument (TemplateArgumentLoc)
:::::::::::::::::::
2 base class specifier in class declaration (CXXBaseSpecifier)
::::::::::::::::
2 integral type for class enumeration (EnumDecl)
:::::::::
2 new expression (CXXNewExpr)
::::::::::
2 explicit cast (ExplicitCastExpr)
:::::::::::::::::
2 functional cast (CXXTemporaryObjectExpr)
::::::::::::::::::::::
2 typeid expression (CXXTypeidExpr)
:::::::::::::
2 offsetof expression (OffsetOfExpr)
::::::::::::
2 constructor base initializer (CXXCtorInitializer)
::::::::::::::::::
■ one of most frequently occurring uses of TypeLoc for declarators

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 332


clang::TypeSourceInfo Class

■ clang::TypeSourceInfo
:::::::::::::::::::::::
class used as container for storing type
information as written in source code
■ getType: get QualType associated with type source information
■ getTypeLoc: get TypeLoc associated with type source information
■ many classes have methods that return pointers to TypeSourceInfo
■ for example, every DeclaratorDecl
:::::::::::::::
instance has corresponding
TypeLoc instance which can be obtained via getTypeSourceInfo
method (which yields TypeSourceInfo
::::::::::::::::
with getTypeLoc method)
■ in case of DeclaratorDecl for type, no TypeLoc instance corresponding
to declared name of type

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 333


Section 3.15

Templates

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 334


Template Instantiation

■ template instantiation: act of creating new definition (e.g., of function,


class, variable, or member of class) from template declaration for some
specific choice of template parameters
■ template specialization: new definition produced by instantiation of
template
■ instantiation can be implicit or explicit
■ implicit instantiation: compiler automatically instantiates template when
its definition needed
■ explicit instantiation: source code explicitly requests compiler to
instantiate template (via special language construct)
■ example:
template<class T> class Widget {/* ... */};
Widget<bool> w; // implicit instantiation
template class Widget<int>; // explicit instantiation

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 335


Explicit and Partial Specialization
■ partial and explicit specialization provide mechanism for specifying
alternate definitions of templates for certain template arguments
■ explicit specialization (informally “full specialization”): alternate definition
of template for one fixed choice of template parameters
■ partial specialization: alternate definition of template for subset of
template parameters being fixed
■ primary template: unspecialized version of template
■ unless template specialization explicitly instantiated or explicitly
specialized, compiler will generate specialization for template only when it
needs definition
■ example:
// primary template
template<class T, int I> class Widget {/* ... */};
// partial specialization
template<int I> class Widget<int, I> {/* ... */};
// explicit specialization
template<> class Widget<int, 42> {/* ... */};

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 336


Constraints on Explicit and Partial Specialization

Explicit Partial
Kind Specialization Specialization
Class Yes Yes
Variable Yes Yes
Function Yes No
Alias No No

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 337


Some Key AST Node Types for Templates

■ generic template declaration type:


2 TemplateDecl
::::::::::::
■ types related to template parameters and arguments:
2 TemplateTypeParmDecl, NonTypeTemplateParmDecl, and
:::::::::::::::::::: ::::::::::::::::::::::::
TemplateArgument
::::::::::::::::
■ types related to alias-template declarations:
2 TypeAliasTemplateDecl and TypeAliasDecl
::::::::::::::::::::: ::::::::::::::
■ types related to class-template declarations:
2 ClassTemplateDecl, ClassTemplatePartialSpecializationDecl,
::::::::::::::::: :::::::::::::::::::::::::::::::::::::::
ClassTemplateSpecializationDecl, and :::::::::::::
::::::::::::::::::::::::::::::::
CXXRecordDecl
■ types related to function-template declarations:
2 FunctionTemplateDecl and FunctionDecl
:::::::::::::::::::: :::::::::::::
■ types related to variable-template declarations:
2 VarTemplateDecl, VarTemplatePartialSpecializationDecl,
::::::::::::::: :::::::::::::::::::::::::::::::::::::
VarTemplateSpecializationDecl
::::::::::::::::::::::::::::::
, and :::::::
VarDecl

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 338


TypeAliasTemplateDecl Type

■ TypeAliasTemplateDecl
:::::::::::::::::::::::
used to represent alias template declaration
■ has child node of type TypeAliasDecl that can be obtained via
getTemplatedDecl member function

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 339


Alias Template AST Example

Source Code
1 template<class T, int I> using SimpleArray = T[I];

AST (Clang 15)

TypeAliasTemplateDecl
TemplateTypeParmDecl T
NonTypeTemplateParmDecl I
TypeAliasDecl SimpleArray T[I]
DependentSizedArrayType T[I]
TemplateTypeParmType T
TemplateTypeParm T
DeclRefExpr I

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 340


ClassTemplateDecl Type

■ ClassTemplateDecl
:::::::::::::::::::
node used to represent primary class template
■ subclass of ::::::::::::::
TemplateDecl
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl and NonTypeTemplateParmDecl
:::::::::::::::::::::: :::::::::::::::::::::::::
■ has child node for class/struct/union with type :::::::::::::::
CXXRecordDecl
■ has children for specializations of class template with type
ClassTemplateSpecialization

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 341


ClassTemplatePartialSpecializationDecl Type

■ ClassTemplatePartialSpecializationDecl
::::::::::::::::::::::::::::::::::::::::::
represents partial
specialization of class template
■ has children for template parameters and template arguments (e.g., types
TemplateTypeParmDecl , NonTypeTemplateParmDecl, and
:::::::::::::::::::::: ::::::::::::::::::::::::::
TemplateArgument
::::::::::::::::::
)
■ derives from :::::::::::::::
CXXRecordDecl
■ has children for various class/union-related information, such as access
specifiers and field declarations (by virtue of deriving from
CXXRecordDecl)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 342


ClassTemplateSpecializationDecl Type

■ ClassTemplateSpecializationDecl
:::::::::::::::::::::::::::::::::::
represents (full) specialization of
class template
■ has children for template arguments (i.e., type TemplateArgument
:::::::::::::::::
)
■ derives from :::::::::::::::
CXXRecordDecl
■ has children for various class/union-related information, such as access
specifiers and field declarations (by virtue of deriving from
CXXRecordDecl)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 343


Class Template AST Example
Source Code
1 template<class T, int I> struct Widget {}; // primary template
2 template<class T> struct Widget<T, 42> {}; // partial specialization
3 template<> struct Widget<int, 42> {}; // explicit specialization
AST (Clang 15)

ClassTemplateDecl Widget
TemplateTypeParmDecl T
NonTypeTemplateParmDecl int I
CXXRecordDecl Widget
ClassTemplateSpecializationDecl
... DefinitionData
ClassTemplateSpecialization Widget ...
ClassTemplatePartialSpecializationDecl Widget
TemplateArgument int
DefinitionData
BuiltinType int
TemplateArgument type-parameter-0-0
TemplateArgument 42
TemplateTypeParmType type-parameter-0-0
CXXRecordDecl Widget
TemplateArgument 42
TemplateTypeParmDecl T
CXXRecordDecl Widget
cont.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 344


FunctionTemplateDecl Type

■ FunctionTemplateDecl
::::::::::::::::::::::
represents function template declaration
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl
::::::::::::::::::::::
and NonTypeTemplateParmDecl
:::::::::::::::::::::::::
■ no “FunctionTemplateSpecializationDecl” type, as Clang
represents explicit specialization of function as FunctionDecl AST node
(and FunctionDecl node can be queried to know if it is associated with
template)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 345


Function Template AST Example

Source Code
1 template<class T, int I> void func(T(&x)[I]) {} // primary template
2 template<> void func(bool(&x)[42]) {} // explicit specialization
AST (Clang 15)

FunctionTemplateDecl func
TemplateTypeParmDecl T FunctionDecl func [explicit specialization]

NonTypeTemplateParmDecl int I TemplateArgument bool

FunctionDecl func BuiltInType bool

ParmVarDecl x TemplateTypeParmDecl 42

CompoundStmt ParmVarDecl x

Function func [refers to explicit specialization] CompoundStmt


cont.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 346


VarTemplateDecl Type

■ VarTemplateDecl
:::::::::::::::::
used to represent variable template declaration
■ subclass of ::::::::::::::
TemplateDecl
■ has child nodes to represent template parameters via types including
TemplateTypeParmDecl and NonTypeTemplateParmDecl
:::::::::::::::::::::: :::::::::::::::::::::::::
■ has child node for variable declaration with type VarDecl
::::::::
■ has children for specializations of variable template with type
VarTemplateSpecialization

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 347


VarTemplatePartialSpecializationDecl Type

■ VarTemplatePartialSpecializationDecl
::::::::::::::::::::::::::::::::::::::::
represents partial
specialization of variable template
■ has children for template parameters and template arguments (e.g., types
TemplateTypeParmDecl , NonTypeTemplateParmDecl, and
:::::::::::::::::::::: ::::::::::::::::::::::::::
TemplateArgument
::::::::::::::::::
)
■ derived from ::::::::
VarDecl
■ has various information for variable declaration by virtue of deriving from
VarDecl

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 348


VarTemplateSpecializationDecl Type

■ VarTemplateSpecializationDecl
::::::::::::::::::::::::::::::::
represents (full) specialization of
variable template
■ has children for template arguments (i.e., type TemplateArgument
:::::::::::::::::
)
■ derived from ::::::::
VarDecl
■ has various information for variable declaration by virtue of deriving from
VarDecl

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 349


Variable Template AST Example

Source Code
1 template<class T, int I> T var; // primary template
2 template<class T> T var<T, 42>; // partial specialization
3 template<> bool var<int, 0>; // explicit specialization
AST (Clang 15)

VarTemplateDecl var
TemplateTypeParmDecl T
NonTypeTemplateParmDecl int I
VarDecl var
VarTemplateSpecializationDecl var
VarTemplateSpecialization var [explicit specialization]
TemplateArgument int
VarTemplatePartialSpecializationDecl var
BuiltInType int
TemplateTypeParmDecl T
TemplateArgument 0
TemplateArgument type-parameter-0-0
TemplateTypeParmType type-parameter-0-0
TemplateArgument 42
cont.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 350


AST Matcher Example: Function Templates (1)

Matcher Expression
functionDecl(isInstantiated()).bind("f")

Source Code for Processing


1 template<class T> struct Widget { // instantiated (T = int)
2 void w1(T x) {return -x;} // matched (T = int)
3 template<class U> U w2(U x) {return x;} // matched (T = int)
4 };
5 template<class T> struct Gadget { // not instantiated
6 void g1(T x) {return x;} // not matched
7 };
8 struct Doodad {
9 template<class T> T identity(T x) {return x;}
10 // not matched (not instantiated)
11 };
12 template<class T> T f1(T x) {return x;} // matched (T = int)
13 template<class T> T f2(T x) {return -x;} // not matched (not instantiated)
14 Widget<int> w;
15 auto x = f1(42);

■ match function declaration that is template instantiation or inside template


instantiation

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 351


AST Matcher Example: Function Templates (2)

Matcher Expression
functionDecl(isExplicitTemplateSpecialization()).bind("f")

Source Code for Processing


1 template<class T> T f1(T x) {return x;} // no match (primary template)
2 template<> int f1(int x) {return x + 42;} // match (explicit specialization)

■ match declaration of function that corresponds to explicit specialization of


template

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 352


AST Matcher Example: Variable Templates (1)

Matcher Expression
varDecl(isTemplateInstantiation()).bind("v")

Source Code for Processing


1 template<int I, int J> int a = I * J; // matched (I = 1, J = 0)
2 template<int J> int a<0, J> = 2 * J; // matched (J = 1)
3 template<> int a<0, 0> = 42; // not matched (explicit specialization)
4 template<int I, int J> int b = I * J; // not matched (not instantiated)
5 template<int J> int b<0, J> = 2 * J; // not matched (not instantiated)
6 template<> int b<0, 0> = 42; // not matched (explicit specialization)
7 auto x = a<1, 0> + a<0, 1> + a<0, 0>;

■ match variable declaration corresponding to instantiation of primary template or


partial (not explicit) specialization

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 353


AST Matcher Example: Variable Templates (2)

Matcher Expression
varDecl(isExplicitTemplateSpecialization()).bind("v")

Source Code for Processing


1 template<int I, int J> int a = I * J; // not matched (primary template)
2 template<int J> int a<0, J> = 2 * J; // matched (partial specialization)
3 template<> int a<0, 0> = 42; // matched (explicit specialization)

■ match variable declaration corresponding to partial or explicit specialization of


template

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 354


Tuple Example: Summary

■ in slides/examples/template_1 directory in companion repository


■ run compiler frontend on specified source files
■ use AST matcher to find each declaration of variable of std::tuple type
■ prints name of variable and template arguments for corresponding
std::tuple type
■ sample output from program:
variable g_0 of type std::tuple with 0 template arguments
variable g_1 of type std::tuple with 4 template arguments
char
short
int
long
variable g_2 of type std::tuple with 2 template arguments
std::tuple<int, float>
std::tuple<>

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 355


Tuple Example: main.cpp (1)

1 #include <cassert>
2 #include <format>
3 #include "clang/ASTMatchers/ASTMatchers.h"
4 #include "clang/ASTMatchers/ASTMatchFinder.h"
5 #include "clang/Frontend/FrontendActions.h"
6 #include "clang/Tooling/CommonOptionsParser.h"
7 #include "clang/Tooling/Tooling.h"
8 #include "llvm/Support/CommandLine.h"
9
10 namespace ct = clang::tooling;
11 namespace cam = clang::ast_matchers;
12
13 std::vector<std::string> getPackTypeNames(const clang::TemplateArgument& arg,
14 clang::PrintingPolicy pp) {
15 std::vector<std::string> names;
16 for (auto packIter = arg.pack_begin(); packIter != arg.pack_end();
17 ++packIter) {
18 names.push_back({});
19 llvm::raw_string_ostream outStream(names.back());
20 packIter->print(pp, outStream, false);
21 }
22 return names;
23 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 356


Tuple Example: main.cpp (2)
25 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
26 void run(const cam::MatchFinder::MatchResult& result) override;
27 };
28
29 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
30 const clang::SourceManager& sourceManager = *result.SourceManager;
31 auto tempDecl =
32 result.Nodes.getNodeAs<clang::ClassTemplateSpecializationDecl>("c");
33 auto varDecl = result.Nodes.getNodeAs<clang::VarDecl>("v");
34 assert(tempDecl && varDecl);
35 const clang::TemplateArgumentList& args = tempDecl->getTemplateArgs();
36 if (args.size() != 1) {
37 llvm::errs() << "tuple does not have one template parameter\n";
38 return;
39 }
40 const clang::TemplateArgument& arg = args.get(0);
41 if (arg.getKind() != clang::TemplateArgument::ArgKind::Pack) {
42 llvm::errs() << "tuple template parameter is not a pack\n";
43 return;
44 }
45 clang::PrintingPolicy pp(result.Context->getLangOpts());
46 std::vector<std::string> names = getPackTypeNames(arg, pp);
47 assert(tempDecl->getQualifiedNameAsString() == "std::tuple");
48 llvm::outs() << std::format(
49 "variable {} of type {} with {} template arguments\n",
50 varDecl->getName(), tempDecl->getQualifiedNameAsString(),
51 arg.pack_size());
52 for (auto i : names) {llvm::outs() << std::format(" {}\n", i);}
53 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 357


Tuple Example: main.cpp (3)
55 AST_MATCHER(clang::ClassTemplateSpecializationDecl, isPartialSpecialization)
56 {return llvm::isa<clang::ClassTemplatePartialSpecializationDecl>(&Node);}
57
58 AST_MATCHER(clang::VarDecl, isParmDecl)
59 {return llvm::isa<clang::ParmVarDecl>(&Node);}
60
61 cam::DeclarationMatcher getMatcher() {
62 using namespace cam;
63 return varDecl(unless(isParmDecl()),
64 hasType(classTemplateSpecializationDecl(hasName("std::tuple"),
65 unless(isPartialSpecialization())).bind("c"))).bind("v");
66 }
67
68 static llvm::cl::OptionCategory optionCategory("Tool options");
69
70 int main(int argc, const char **argv) {
71 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
72 optionCategory);
73 if (!expectedParser) {
74 llvm::errs() << llvm::toString(expectedParser.takeError());
75 return 1;
76 }
77 ct::CommonOptionsParser& optionsParser = expectedParser.get();
78 ct::ClangTool tool(optionsParser.getCompilations(),
79 optionsParser.getSourcePathList());
80 cam::DeclarationMatcher matcher = getMatcher();
81 MyMatchCallback matchCallback;
82 cam::MatchFinder matchFinder;
83 matchFinder.addMatcher(matcher, &matchCallback);
84 return !tool.run(ct::newFrontendActionFactory(&matchFinder).get()) ? 0 : 1;
85 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 358


Section 3.16

Attributes

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 359


Attributes

■ some key AST node types related to attributes include:


2 AttributeCommonInfo, Attr, TypeAttr, DeclOrStmtAttr, and
::::::::::::::::::: :::: ::::::::: ::::::::::::::
StmtAttr
::::::::
■ each language attribute has corresponding AST node type (e.g.,
LikelyAttr and UnusedAttr)
■ all AST node types derived from AttributeCommonInfo and Attr
(ignoring syntactic attributes)
■ attribute for declaration represented by attribute node (e.g.,
DeprecatedAttr) as child of declaration node (e.g., FunctionDecl)
■ statement with attribute represented by AttributedStmt node with child
nodes for statement (e.g., IfStmt) and each attribute (e.g., LikelyAttr)
■ unsupported attributes ignored and have no representation in AST

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 360


Standard Attributes

■ node types used to represent attributes from C++ standard:


Node Type Attribute
CXX11NoReturnAttr noreturn
CarriesDependencyAttr carries_dependency
DeprecatedAttr deprecated
FallThroughAttr fallthrough
WarnUnusedResultAttr nodiscard
UnusedAttr maybe_unused
LikelyAttr likely
UnlikelyAttr unlikely
NoUniqueAddressAttr no_unique_address
■ context-sensitive keywords final and override represented as
attributes FinalAttr and OverrideAttr

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 361


Other Supported Attributes

■ Clang supports many attributes in addition to those specified in C++


standard
■ some supported Clang attributes include:
2 clang::always_inline, clang::noinline, clang::assume,

clang::assume_aligned, clang::no_sanitize,
clang::no_sanitize_memory, clang::no_sanitize_address, and
clang::no_sanitize_thread
■ some supported GCC attributes include:
2 gnu::always_inline and gnu::noinline

■ for more information on attributes supported by Clang, see


https://clang.llvm.org/docs/AttributeReference.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 362


Additional Remarks on Attributes

■ getAttr<T> member of Decl class can be used to get particular attribute


for Decl node
■ Decl provides attribute iterators via attr_begin and attr_end
■ definitions of many attributes can be found in file clang/AST/Attrs.inc
installed in include directory
■ useful information related to attributes can also be found at:
2 https://clang.llvm.org/docs/InternalsManual.html#

how-to-add-an-attribute

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 363


Declaration Attribute AST Example

Source Code
1 [[nodiscard, deprecated]] int get_answer() {return 42;}

AST (Clang 15)

FunctionDecl get_answer
CompoundStmt
ReturnStmt
IntegerLiteral 42
WarnUnusedResultAttr nodiscard
DeprecatedAttr

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 364


Statement Attribute AST Example
Source Code
1 int func(int x) {
2 if (x < 0) [[unlikely]] {x = 42;}
3 return x;
4 }

AST (Clang 15)

FunctionDecl func
ParmVarDecl int x
CompoundStmt
IfStmt
BinaryOperator <

···

AttributedStmt
UnlikelyAttr unlikely
CompoundStmt

···

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 365


Attribute Example: Summary

■ in slides/examples/attribute_1 directory in companion repository


■ runs compiler frontend on each source file specified on command line to
produce AST
■ for each declaration with attribute, prints information about attribute,
including:
2 attribute name
2 attribute syntax
2 location of attribute in source code
■ sample output from program as follows:
function get_answer
attribute nodiscard AS_CXX11 /home/jdoe/example_1.cpp:1(3)
attribute deprecated AS_CXX11 /home/jdoe/example_1.cpp:1(14)
field x
attribute no_unique_address AS_CXX11 /home/jdoe/example_1.cpp:5(4)
variable foobar
attribute deprecated AS_GNU /home/jdoe/example_1.cpp:18(14)
function slow
attribute gnu::noinline AS_CXX11 /home/jdoe/example_1.cpp:33(3)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 366


Attribute Example: main.cpp (1)
1 #include <format>
2 #include <string>
3 #include <map>
4 #include "clang/ASTMatchers/ASTMatchers.h"
5 #include "clang/ASTMatchers/ASTMatchFinder.h"
6 #include "clang/Frontend/FrontendActions.h"
7 #include "clang/Tooling/CommonOptionsParser.h"
8 #include "clang/Tooling/Tooling.h"
9 #include "llvm/Support/CommandLine.h"
10
11 namespace ct = clang::tooling;
12 namespace cam = clang::ast_matchers;
13
14 std::string locationToString(const clang::SourceManager& sourceManager,
15 clang::SourceLocation sourceLoc) {
16 return std::format("{}:{}({})",
17 std::string(sourceManager.getFilename(sourceLoc)),
18 sourceManager.getSpellingLineNumber(sourceLoc),
19 sourceManager.getSpellingColumnNumber(sourceLoc));
20 }
21
22 std::string declTypeToString(const clang::Decl* decl) {
23 if (llvm::isa<clang::FunctionDecl>(decl)) {return "function";}
24 else if (llvm::isa<clang::VarDecl>(decl)) {return "variable";}
25 else if (llvm::isa<clang::FieldDecl>(decl)) {return "field";}
26 else if (llvm::isa<clang::CXXRecordDecl>(decl)) {return "class";}
27 else {return "other";}
28 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 367


Attribute Example: main.cpp (2)
30 std::string attrSyntaxToString(clang::Attr::Syntax syntax) {
31 const std::map<clang::Attr::Syntax, std::string> lut{
32 {clang::Attr::Syntax::AS_GNU, "AS_GNU"},
33 {clang::Attr::Syntax::AS_CXX11, "AS_CXX11"},
34 {clang::Attr::Syntax::AS_Keyword, "AS_Keyword"},
35 {clang::Attr::Syntax::AS_Pragma, "AS_Pragma"}
36 };
37 auto i = lut.find(syntax);
38 return (i != lut.end()) ? i->second : "unknown";
39 }
40
41 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {
42 void run(const cam::MatchFinder::MatchResult& result) override;
43 };
44
45 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
46 const clang::SourceManager& sourceManager = *result.SourceManager;
47 auto decl = result.Nodes.getNodeAs<clang::NamedDecl>("d");
48 if (!decl) {return;}
49 llvm::outs() << std::format("{} {}\n", declTypeToString(decl),
50 decl->getIdentifier() ? decl->getName() : "---");
51 for (auto attrIter = decl->attr_begin(); attrIter != decl->attr_end();
52 ++attrIter) {
53 const clang::Attr* attr = *attrIter;
54 clang::SourceLocation loc = attr->getLoc();
55 llvm::outs() << std::format(" attribute {} {} {}\n",
56 attr->getAttrName() ? attr->getNormalizedFullName() : "---",
57 attrSyntaxToString(attr->getSyntax()),
58 locationToString(sourceManager, loc));
59 }
60 }
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 368
Attribute Example: main.cpp (3)

62 int main(int argc, const char **argv) {


63 static llvm::cl::OptionCategory optionCategory("Tool options");
64 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
65 optionCategory);
66 if (!expectedParser) {
67 llvm::errs() << llvm::toString(expectedParser.takeError());
68 return 1;
69 }
70 ct::CommonOptionsParser& optionsParser = expectedParser.get();
71 ct::ClangTool tool(optionsParser.getCompilations(),
72 optionsParser.getSourcePathList());
73 MyMatchCallback matchCallback;
74 cam::MatchFinder matchFinder;
75 matchFinder.addMatcher(cam::namedDecl(cam::has(cam::attr())).bind("d"),
76 &matchCallback);
77 return !tool.run(ct::newFrontendActionFactory(&matchFinder).get()) ? 0 : 1;
78 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 369


Section 3.17

Casts

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 370


Casts

■ cast expressions represented by node types derived from:::::::::


CastExpr
■ cast expression can be either implicit or explicit
■ node types corresponding to implicit and explicit cast expressions derived
from::::::::::::::::::
ImplicitCastExpr and ExplicitCastExpr
::::::::::::::::::
, respectively
■ node types for explicit casts include:
Node Type Cast
CStyleCastExpr
::::::::::::::::
C-style cast
CXXConstCastExpr
::::::::::::::::::
const_cast
CXXReinterpretCastExpr
:::::::::::::::::::::::::
reinterpret_cast
CXXDynamicCastExpr
::::::::::::::::::::
dynamic_cast
CXXStaticCastExpr
:::::::::::::::::::
static_cast
CXXFunctionalCastExpr
::::::::::::::::::::::::
functional-notation cast
BuiltinBitCastExpr
::::::::::::::::::::
bit cast (i.e., __builtin_bit_cast for
std::bit_cast)
CXXAddrspaceCastExpr
:::::::::::::::::::::::
addrspace_cast (OpenCL only)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 371


Cast AST Example
Source Code
1 void* forty_two() {
2 const int x = 42;
3 return reinterpret_cast<void*>(x);
4 }

AST (Clang 15)

FunctionDecl forty_two
CompoundStmt
DeclStmt
VarDecl const int x
IntegerLiteral 42
ReturnStmt
CXXReinterpretCastExpr reinterpret_cast<void *>
ImplicitCastExpr LValueToRValue
DeclRefExpr x

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 372


Cast Example: Summary

■ in slides/examples/cast_1 directory in companion repository


■ runs compiler frontend on each source file specified on command line to
produce AST
■ for explicit cast, prints information about cast, including:
2 type of cast (e.g., C-style cast or static cast)
2 from and to type for cast
2 location of cast in source code
■ sample output from program as follows:
location: /home/jdoe/example_3.cpp:7(9)
cast: reinterpret_cast
from type: int
to type: void *
location: /home/jdoe/example_3.cpp:10(9)
cast: dynamic_cast
from type: Widget *
to type: BigWidget *

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 373


Cast Example: main.cpp (1)
1 #include <format>
2 #include "clang/ASTMatchers/ASTMatchers.h"
3 #include "clang/ASTMatchers/ASTMatchFinder.h"
4 #include "clang/Frontend/FrontendActions.h"
5 #include "clang/Tooling/CommonOptionsParser.h"
6 #include "clang/Tooling/Tooling.h"
7 #include "llvm/Support/CommandLine.h"
8
9 namespace ct = clang::tooling;
10 namespace cam = clang::ast_matchers;
11
12 std::string locationToString(const clang::SourceManager& sourceManager,
13 clang::SourceLocation sourceLoc) {
14 return std::format("{}:{}({})",
15 std::string(sourceManager.getFilename(sourceLoc)),
16 sourceManager.getSpellingLineNumber(sourceLoc),
17 sourceManager.getSpellingColumnNumber(sourceLoc));
18 }
19
20 std::string getCastName(const clang::ExplicitCastExpr* castExpr) {
21 if (llvm::isa<clang::CStyleCastExpr>(castExpr)) {
22 return "C-style-cast";
23 } else if (llvm::isa<clang::CXXFunctionalCastExpr>(castExpr)) {
24 return "functional-cast";
25 } else if (llvm::isa<clang::BuiltinBitCastExpr>(castExpr)) {
26 return "bit-cast";
27 } else if (auto namedCastExpr =
28 llvm::dyn_cast<clang::CXXNamedCastExpr>(castExpr)) {
29 return namedCastExpr->getCastName();
30 } else {return "unknown";}
31 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 374


Cast Example: main.cpp (2)

33 struct MyMatchCallback : public cam::MatchFinder::MatchCallback {


34 void run(const cam::MatchFinder::MatchResult& result) override;
35 };
36
37 void MyMatchCallback::run(const cam::MatchFinder::MatchResult& result) {
38 const clang::SourceManager& sourceManager = *result.SourceManager;
39 const clang::ASTContext& astContext = *result.Context;
40 auto castExpr = result.Nodes.getNodeAs<clang::ExplicitCastExpr>("c");
41 clang::QualType targetQualType = castExpr->getType();
42 clang::SourceLocation loc = castExpr->getExprLoc();
43 const clang::Expr* sourceExpr = castExpr->getSubExpr();
44 clang::QualType sourceQualType = sourceExpr->getType();
45 clang::PrintingPolicy pp(astContext.getLangOpts());
46 pp.PrintCanonicalTypes = 1;
47 llvm::outs() << std::format(
48 "location: {}\ncast: {}\nfrom type: {}\nto type: {}\n\n",
49 locationToString(sourceManager, loc), getCastName(castExpr),
50 sourceQualType.getAsString(pp), targetQualType.getAsString(pp)
51 );
52 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 375


Cast Example: main.cpp (3)

54 static llvm::cl::OptionCategory optionCategory("Tool options");


55
56 int main(int argc, const char **argv) {
57 auto expectedParser = ct::CommonOptionsParser::create(argc, argv,
58 optionCategory);
59 if (!expectedParser) {
60 llvm::errs() << llvm::toString(expectedParser.takeError());
61 return 1;
62 }
63 ct::CommonOptionsParser& optionsParser = expectedParser.get();
64 ct::ClangTool tool(optionsParser.getCompilations(),
65 optionsParser.getSourcePathList());
66 MyMatchCallback matchCallback;
67 cam::MatchFinder matchFinder;
68 matchFinder.addMatcher(cam::explicitCastExpr().bind("c"),
69 &matchCallback);
70 return !tool.run(ct::newFrontendActionFactory(&matchFinder).get()) ? 0 : 1;
71 }

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 376


Section 3.18

Source-Code Comments

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 377


Source-Code Comments and AST
■ normally, comments discarded before AST is built
■ including compiler option -fparse-all-comments will cause comments
to be captured in AST
■ some AST node types associated with comments (in clang::comments
namespace) include (amongst many):
2 BlockContentComment
2 FullComment
2 InlineContentComment

2 VerbatimBlockLineComment

■ HTML tags in comments represented explicitly in AST


■ comments can be accessed via Comments member of ASTContext
■ can get comments for Decl via getCommentForDecl member of
ASTContext
■ for more information, see:
2 https://clang.llvm.org/doxygen/classclang_1_1comments_1_

1Comment.html
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 378
Section 3.19

Miscellany

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 379


Name Mangling

■ clang::MangleContext
::::::::::::::::::::::
class provides context for tracking state which
persists across multiple calls to C++ name mangler
■ MangleContext::ManglerKind used to specify name mangling
convention to be used (e.g., MK_Itanium or MK_Microsoft)
■ ASTContext provides createMangleContext method for creating
MangleContext instance
■ MangleContext class provides mangleTypeName and mangleName
methods for mangling names
■ clang::GlobalDecl
:::::::::::::::::::
represents global declaration; used to wrap
declaration and other input needed to perform name mangling
■ for example, name mangling for constructor/destructor needs kind of
constructor/destructor (e.g., complete, base)

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 380


Name Demangling (Itanium ABI)

■ llvm::ItaniumPartialDemangler
::::::::::::::::::::::::::::::::
class can be used to demangle
name and extract some limited information about demangled entity (e.g.,
determine if name corresponds to function or variable, get base function
name, get function return/parameter types, etc.)
■ llvm::itaniumDemangle function demangles mangled name
:::::::::::::::::::::::

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 381


Mangling/Demangling Example

■ code example illustrating use of mangling/demangling functionality of


Clang/LLVM can be found in miscellany/examples/mangle_1 directory
in companion repository

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 382


A Few More Types

■ clang::CFGReverseBlockReachabilityAnalysis
2 check if one block reachable from another block in CFG
2 https://clang.llvm.org/doxygen/classclang_1_
1CFGReverseBlockReachabilityAnalysis.html#
a73cec1b9cbbc6e2461470906e6a0720a
■ clang::CallGraph
2 used for constructing AST-based call graph
2 https://clang.llvm.org/doxygen/classclang_1_1CallGraph.html

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 383


Part 4

References

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 384


LLVM Quick Links I
1 LLVM Project, https://llvm.org.
2 Building LLVM with CMake, https://llvm.org/docs/CMake.html.
3 LLVM Programmer’s Manual,
https://llvm.org/docs/ProgrammersManual.html.
4 LLVM IR Language Reference,
https://llvm.org/docs/LangRef.html.
5 LLVM Machine IR (MIR) Language Reference,
https://llvm.org/docs/MIRLangRef.html.
6 LLVM Coding Standards,
https://llvm.org/docs/CodingStandards.html.
7 LLVM Tutorial: Table of Contents,
https://llvm.org/docs/tutorial/index.html.
8 CommandLine 2.0 Library Manual,
https://llvm.org/docs/CommandLine.html.
9 Mapping High Level Constructs to LLVM IR, https://
mapping-high-level-constructs-to-llvm-ir.readthedocs.io.
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 385
Clang Quick Links I

1 Clang Documentation, https://clang.llvm.org/docs.


2 Clang Frontend (CFE) Internals Manual,
https://clang.llvm.org/docs/InternalsManual.html.
3 Clang Driver Design and Internals,
https://clang.llvm.org/docs/DriverInternals.html.
4 Introduction to the Clang AST, https:
//clang.llvm.org/docs/IntroductionToTheClangAST.html.
5 Using Clang as a Library
https://clang.llvm.org/docs/#using-clang-as-a-library.
6 Checker Developer Manual,
https://clang-analyzer.llvm.org/checker_dev_manual.html.
7 Getting Started: Building and Running Clang,
https://clang.llvm.org/get_started.html.
8 Hacking on Clang, https://clang.llvm.org/hacking.html.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 386


Clang Quick Links II

9 How to Write RecursiveASTVisitor Based ASTFrontendActions,


https://clang.llvm.org/docs/RAVFrontendAction.html.
10 Tutorial for Building Tools Using LibTooling and LibASTMatchers
https://clang.llvm.org/docs/LibASTMatchersTutorial.html.
11 Data Flow Analysis: An Informal Introduction,
https://clang.llvm.org/docs/DataFlowAnalysisIntro.html.
12 Clang Plugins, https://clang.llvm.org/docs/ClangPlugins.html.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 387


LLVM/Clang Community I

1 LLVM Developers’ Meetings. https://llvm.org/devmtg.


2 LLVM Office Hours.
https://llvm.org/docs/GettingInvolved.html#office-hours.
3 LLVM Community Calendar. https://calendar.google.com/
calendar/u/0/embed?src=calendar@llvm.org.
4 LLVM Discourse Site. https://discourse.llvm.org.
5 Clang Frontend Section of LLVM Discourse Site.
https://discourse.llvm.org/c/clang/6.
6 LLVM Weekly. https://llvmweekly.org.
7 r/LLVM: The LLVM Compiler Infrastructure.
https://www.reddit.com/r/LLVM.
8 r/Clang: Clang, for Discussion of the C, C++, and Objective-C
Programming Languages. https://www.reddit.com/r/Clang.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 388


Books I

1 Min-Yih Hsu. LLVM Techniques, Tips, and Best Practices Clang and
Middle-End Libraries. Packt Publishing, Dec. 2021,
https://isbnsearch.org/isbn/9781838824952. [Source code
available from
https://github.com/PacktPublishing/LLVM-Techniques-Tips-
and-Best-Practices-Clang-and-Middle-End-Libraries.]
[Chapters 5–8 discuss various aspects of the Clang frontend in some detail.]
2 Suyog Sarda and Mayur Pandey. LLVM Essentials. Packt Publishing, Dec.
2015, https://isbnsearch.org/isbn/9781785280801. [This book
focuses on LLVM as opposed to Clang.]
3 Kai Nacke. Learn LLVM 12. Packt Publishing, May 2021, https://www.
packtpub.com/product/cloud_and_networking/9781839213502.
4 Kai Nacke. Learn LLVM 11: A beginner’s guide to learning LLVM compiler
tools and core libraries with C++. Packt Publishing, Dec. 2021,
https://isbnsearch.org/isbn/9781839213502.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 389


Books II

5 Bruno Cardoso Lopes and Rafael Auler. Getting Started with LLVM Core
Libraries. Packt Publishing, Aug. 2014,
https://isbnsearch.org/isbn/9781782166924.
6 Mayur Pandey and Suyog Sarda. LLVM Cookbook. Packt Publishing, May
2015, https://isbnsearch.org/isbn/9781785285981. [This book
does not take a systematic approach to teaching LLVM/Clang. Rather, it teaches
by presenting recipes/examples.]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 390


Articles I
1 Xin Huang. Clang Tutorial: Finding Declarations. Oct. 19, 2014,
https://xinhuang.github.io/posts/
2014-10-19-clang-tutorial-finding-declarations.html.
2 Xin Huang. Clang Tutorial: The AST Matcher. Feb. 8, 2015,
https://xinhuang.github.io/posts/
2015-02-08-clang-tutorial-the-ast-matcher.html.
3 Eli Bendersky. AST matchers and Clang refactoring tools. July 29, 2014,
https://eli.thegreenplace.net/2014/07/29/
ast-matchers-and-clang-refactoring-tools.
4 Eli Bendersky. Compilation databases for Clang-based tools. May 21,
2014, https://eli.thegreenplace.net/2014/05/21/
compilation-databases-for-clang-based-tools.
5 Eli Bendersky. Modern source-to-source transformation with Clang and
libTooling. May 1, 2014,
https://eli.thegreenplace.net/2014/05/01/modern-source-t
o-source-transformation-with-clang-and-libtooling.
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 391
Articles II

6 Jonas Devlieghere. Understanding the Clang AST. Dec. 31, 2015,


https://jonasdevlieghere.com/understanding-the-clang-ast/.
7 Stephen Kelly. Exploring Clang Tooling, Part 0: Building Your Code with
Clang. Sept. 18, 2018,
https://devblogs.microsoft.com/cppblog/exploring-clang-t
ooling-part-0-building-your-code-with-clang/.
8 Stephen Kelly. Exploring Clang Tooling Part 1: Extending Clang-Tidy.
Oct. 19, 2018, https://devblogs.microsoft.com/cppblog/
exploring-clang-tooling-part-1-extending-clang-tidy/.
9 Stephen Kelly. Exploring Clang Tooling Part 2: Examining the Clang AST
with clang-query. Oct. 23, 2018,
https://devblogs.microsoft.com/cppblog/exploring-clang-t
ooling-part-2-examining-the-clang-ast-with-clang-query/.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 392


Articles III
10 Stephen Kelly. Exploring Clang Tooling Part 3: Rewriting Code with
clang-tidy. Nov. 6, 2018,
https://devblogs.microsoft.com/cppblog/exploring-clang-t
ooling-part-3-rewriting-code-with-clang-tidy/.
11 Stephen Kelly. Composing AST Matchers in clang-tidy. Nov. 11, 2018,
https://steveire.wordpress.com/2018/11/20/
composing-ast-matchers-in-clang-tidy.
12 Stephen Kelly. Debugging Clang AST Matchers. Apr. 16, 2019,
https://steveire.wordpress.com/2019/04/16/
debugging-clang-ast-matchers.
13 Stephen Kelly. AST Matchmaking Made Easy. Feb. 14, 2021,
https://steveire.wordpress.com/2021/02/14/
ast-matchmaking-made-easy.
14 Stephen Kelly. Location, Location, Location. Apr. 27, 2021,
https://steveire.wordpress.com/2021/04/27/
location-location-location.
Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 393
Articles IV

15 Ehsan Akhgari. C++ Static Analysis Using Clang. Dec. 7, 2015,


https://ehsanakhgari.org/blog/2015-12-07/
c-static-analysis-using-clang.
16 Peter Goldsborough. Emitting Diagnostics in Clang. Feb. 24, 2017.
http://www.goldsborough.me/c++/clang/llvm/tools/2017/02/
24/00-00-06-emitting_diagnostics_and_fixithints_in_clang_
tools/.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 394


Talks I

1 Peter Goldsborough. Clang-useful: Building Useful Tools with LLVM and


Clang for Fun and Profit. C++ Now, Aspen, CO, USA, May 15–20, 2017.
Available online at https://youtu.be/E6i8jmiy8MY. [Code examples
available at https://github.com/peter-can-talk/cppnow-2017.]
2 Sven van Haastregt and Anastasia Stulova. An Overview of Clang. LLVM
Developers’ Meeting, San Jose, CA, USA, Oct. 22–23, 2019. Available
online at https://youtu.be/5kkMpJpIGYU.
3 Manuel Klimek. The Clang AST — A Tutorial. LLVM Euro Conference,
Paris, France, Apr. 29–30, 2013. Available online at
https://youtu.be/VqCkCDFLSsc.
4 Vince Bridgers and Felipe de Azevedo Piovezan. LLVM IR Tutorial —
Phis, GEPs and Other Things, Oh My!. EuroLLVM Developers’ Meeting,
Brussels, Belgium, Apr. 8–9, 2019. Available online at
https://youtu.be/m8G_S5LwlTo. [An excellent talk that well explains
various aspects of LLVM IR, including phis and GEPs.]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 395


Talks II
5 Mike Shah. Introduction to LLVM — Building Simple Program Analysis
Tools and Instrumentation. FOSDEM, Brussels, Belgium, Feb. 4, 2018.
Available online at https://youtu.be/VKIv_Bkp4pk.
6 Sergei Sadovnikov. Automatic C++ Source Code Generation with Clang.
ACCU, Bristol, UK, Apr. 26–29, 2017. Available online at
https://youtu.be/aPTyatTI42k.
7 Stephen Kelly. Refactor Your Codebase with Clang Tooling. code::dive,
Wroclaw, Poland, Nov. 7–8, 2018. Available online at
https://youtu.be/_T-5pWQVxeE. [Discusses how to develop
refactoring tools based on clang-tidy and covers topics such as AST
matchers, and how to use clang-query to facilitate the development of
matchers.]
8 Meike Baumgartner and Dmitri Gribenko. My First Clang Warning. LLVM
Developers’ Meeting, San Jose, CA, USA, Oct. 22–23, 2019. Available
online at https://youtu.be/FNnKMSkaLkY. [An excellent talk that walks
through the various steps of adding a new compiler warning to Clang.]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 396


Talks III

9 Peter Smith. YVR18-223: How to Build a C++ Processing Tool Using the
Clang Libraries. Linaro Connect 2018 — YVR18, Vancouver, BC, Canada,
Sept. 17–21, 2018. Available online at
https://youtu.be/8QvLVEaxzC8. Slides and video available at https:
//resources.linaro.org/en/resource/Bi4FpRDmERUuU5ei7nry9h.
[A fast-paced talk that covers an example of using the Clang libraries to
apply a simple source-code transformation.]
10 Stephan Bergman. Plug Yourself In: Learn How to Write a Clang Compiler
Plugin. LibreOffice Conference, Aarhus, Denmark, Sept. 24, 2015.
Available online at https://youtu.be/pdxlmM477KY. [A simple Clang
plugin is developed in a step-by-step fashion.]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 397


Talks IV

11 Toby Ho. Live Code: LLVM Tutorial Walkthrough. Available online at


https://youtube.com/playlist?list=PLSq9OFrD2Q3ChEc_
ejnBcO5u9JeT0ufkg. [This series of videos performs a walkthrough of
the LLVM tutorial titled “My First Language Frontend with LLVM Tutorial”,
which can be found at
https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/.]

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 398


Miscellany I

1 Victor Ciura. Better Tools in Your Clang Toolbox: Extending clang-tidy


With Your Custom Checks. C++ on Sea, Folkestone, UK, Feb. 4, 2019.
Available online at https://youtu.be/7CnFrn0-2TQ.
2 Jeremy Demeule. Adding a New clang-tidy Check by the Practice. CPPP,
Paris, France, June 15, 2019. Available online at
https://youtu.be/K-WhaEUEZWc.
3 Vince Bridgers. Using Clang-tidy for Customized Checkers and Large
Scale Source Refactoring. LLVM Developers’ Meeting, Online, Oct. 7,
2020. Available online at https://youtu.be/UfLH7dORav8.
4 Vince Bridgers. Using the Clang Static Analyzer to Find Bugs. LLVM
Developers’ Meeting, Online, Oct. 8, 2020. Available online at
https://youtu.be/nTslG8HtKeA.
5 Stephen Kelly. Extending clang-tidy in the Present and in the Future.
ACCU, Bristol, UK, Apr. 13, 2019. Available online at
https://youtu.be/38tYYrnfNrs.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 399


Miscellany II

6 Artem Dergachev. Developing the Clang Static Analyzer. LLVM


Developers’ Meeting, San Jose, CA, USA, Oct. 23, 2019. Available online
at https://youtu.be/g0Mqx1niUi0.
7 Chris Bieneman and Kit Barton. How to Contribute to LLVM. LLVM
Developers’ Meeting, San Jose, CA, USA, Oct. 22, 2019. Available online
at https://youtu.be/C5Y977rLqpw.
8 Andrzej Warzynski. Writing an LLVM Pass: 101. LLVM Developers’
Meeting, San Jose, CA, USA, Oct. 22, 2019. Available online at
https://youtu.be/ar7cJl2aBuU
9 Chandler Carruth. Understanding Compiler Optimization. Meeting C++,
Berlin, Germany, Dec. 4–5, 2015. Available online at
https://youtu.be/FnGCDLhaxKU.
10 Chandler Carruth. Tuning C++: Benchmarks, and CPUs, and Compilers!
Oh My!. CppCon, Bellevue, WA, USA, Sept. 24, 2015. Available online at
https://youtu.be/nXaxk27zwlk.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 400


Miscellany III

11 Dmitri Gribenko. Parsing Documentation Comments in Clang. LLVM


Developers’ Meeting, San Jose, CA, USA, Nov. 7–8, 2012. Video available
at https://youtu.be/DzRq9Dy0b9c. Slides available at https:
//llvm.org/devmtg/2012-11/Gribenko_CommentParsing.pdf.
12 Alan Phipps. MC/DC: Enabling Easy-to-Use Safety-Critical Code
Coverage Analysis with LLVM. LLVM Developers’ Meeting, San Jose, CA,
USA, Nov. 8–9, 2022. Video available
at https://youtu.be/RmX_8GxxTbs. Slides available
at https://llvm.org/devmtg/2022-11/slides/
TechTalk4-MCDC-EnablingSafetyCriticalCodeCoverage.pdf.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 401


Compilers and Static Analysis I

1 John Regehr. How Clang Compiles a Function, Embedded in Academia.


June 26, 2018. https://blog.regehr.org/archives/1605.
2 John Regehr, How LLVM Optimizes a Function. Embedded in Academia.
Sept. 3, 2018. https://blog.regehr.org/archives/1603.
3 John Regehr, A Tourist’s Guide to the LLVM Source Code. Embedded in
Academia. Jan. 5, 2017. https://blog.regehr.org/archives/1453.
4 Eli Bendersky. Life of an Instruction in LLVM. Nov. 24, 2012.
https://eli.thegreenplace.net/2012/11/24/
life-of-an-instruction-in-llvm.
5 CS 6120: Advanced Compilers: The Self-Guided Online Course. Dept. of
Computer Science, Cornell University, Ithaca, NY, USA. https:
//www.cs.cornell.edu/courses/cs6120/2020fa/self-guided/.
6 DCC888: Static Program Analysis. Dept. of Computer Science,
Universidade Federal de Minas Gerais, Gerais, Brazil.
https://homepages.dcc.ufmg.br/~fernando/classes/dcc888/.

Copyright © 2022–2023 Michael D. Adams Clang Libraries Edition 0.1.0 402

You might also like