Add indirect build tracing docs #6667
Merged
6 commits
fb22931 add indirect build tracing content and example (ethanpalm)
47a543e Add reviewer feedback (ethanpalm)
930a36d Add example step for ending build tracing (ethanpalm)
c62a21e Apply suggestions from code review (ethanpalm)
080867a Add reviewer feedback (ethanpalm)
4d7aa5c Update example note (ethanpalm)
@@ -228,6 +228,123 @@ commands that you can specify for compiled languages.
This command runs a custom script that contains all of the commands required
to build the project.

Using indirect build tracing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action.

To create a CodeQL database with indirect build tracing, run the following command from the checkout root of your project:

::

   codeql database init ... --begin-tracing <database>

You must specify:

- ``<database>``: a path to the new database to be created. This directory will
  be created when you execute the command---you cannot specify an existing
  directory.
- ``--begin-tracing``: creates scripts that can be used to set up an environment in which build commands will be traced.

You may specify other options for the ``codeql database init`` command as normal.

.. pull-quote:: Note

   If the build runs on Windows, you must set either ``--trace-process-level <number>`` or ``--trace-process-name <parent process name>`` so that the option points to a parent CI process that will observe all build steps for the code being analyzed.


The ``codeql database init`` command will output a message::

   Created skeleton <database>. This in-progress database is ready to be populated by an extractor.
   In order to initialise tracing, some environment variables need to be set in the shell your build will run in.
   A number of scripts to do this have been created in <database>/temp/tracingEnvironment.
   Please run one of these scripts before invoking your build command.

   Based on your operating system, we recommend you run: ...

The ``codeql database init`` command creates ``<database>/temp/tracingEnvironment`` with files that contain environment variables and values that will enable CodeQL to trace a sequence of build steps. These files are named ``start-tracing.{json,sh,bat,ps1}``. Use one of these files with your CI system's mechanism for setting environment variables for future steps. You can:

Review comment: This is a good explanation.

* Read the JSON file, process it, and print out environment variables in the format expected by your CI system. For example, Azure DevOps expects ``echo "##vso[task.setvariable variable=NAME]VALUE"``.
* Or, if your CI system persists the environment, source the appropriate ``start-tracing`` script to set the CodeQL variables in the shell environment of the CI system.
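
The first option can be sketched in Python. This is a minimal illustration, not a documented CodeQL interface: it assumes that ``start-tracing.json`` is a flat JSON object mapping variable names to values, which is what the PowerShell step in the Azure DevOps example later in this section implies. The function names ``load_tracing_environment`` and ``azure_setvariable_lines`` are hypothetical.

```python
import json
from pathlib import Path


def load_tracing_environment(json_path):
    """Read a tracing environment file into a dict of name -> value.

    Assumes the file is a flat JSON object. The "utf-8-sig" codec
    tolerates a byte order mark if one is present.
    """
    return json.loads(Path(json_path).read_text(encoding="utf-8-sig"))


def azure_setvariable_lines(environment):
    """Format each variable as an Azure DevOps task.setvariable command."""
    lines = []
    for name, value in environment.items():
        # Treating a null value as an empty string is an assumption of this sketch.
        text = "" if value is None else str(value)
        lines.append(f"##vso[task.setvariable variable={name}]{text}")
    return lines
```

A pipeline step could call ``load_tracing_environment`` on ``<database>/temp/tracingEnvironment/start-tracing.json`` and print each returned line to standard output. Other CI systems expect a different output format, so only the formatting function would need to change.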

Build your code. Optionally, unset the environment variables using an ``end-tracing.{json,sh,bat,ps1}`` script from the directory where the ``start-tracing`` scripts are stored. Then run the command ``codeql database finalize <database>``.

Once you have created a CodeQL database using indirect build tracing, you can work with it like any other CodeQL database. For example, analyze the database, and upload the results to GitHub if you use code scanning.

Review comment: Optional: link to the docs on analyzing and uploading?

Example of creating a CodeQL database using indirect build tracing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following example shows how you could use indirect build tracing in an Azure DevOps pipeline to create a CodeQL database::

   steps:
   # Download the CodeQL CLI and query packs...

   # Check out the repository ...

   # Run any pre-build tasks, for example, restore NuGet dependencies...

   # Initialize the CodeQL database.
   # In this example, the CodeQL CLI has been downloaded and placed on the PATH.
   - task: CmdLine@1
     displayName: Initialize CodeQL database
     inputs:
       # Assumes the source code is checked out to the current working directory.
       # Creates a database at `<current working directory>/db`.
       # Running on Windows, so specifies the name of a parent process to trace.
       script: "codeql database init --language csharp --trace-process-name Agent.Worker.exe --source-root . --begin-tracing db"

   # Read the generated environment variables and values,
   # and set them so they are available for subsequent commands
   # in the build pipeline. This is done in PowerShell in this example.
   - task: PowerShell@1
     displayName: Set CodeQL environment variables
     inputs:
       targetType: inline
       # A literal block scalar (`|`) keeps each PowerShell statement on its own line.
       script: |
         $json = Get-Content $(System.DefaultWorkingDirectory)/db/temp/tracingEnvironment/start-tracing.json | ConvertFrom-Json
         $json.PSObject.Properties | ForEach-Object {
             $template = "##vso[task.setvariable variable="
             $template += $_.Name
             $template += "]"
             $template += $_.Value
             echo "$template"
         }

   # Execute the pre-defined build step. Note the `msbuildArgs` variable.
   - task: VSBuild@1
     inputs:
       solution: '**/*.sln'
       # Disable MSBuild shared compilation for C# builds.
       msbuildArgs: /p:OutDir=$(Build.ArtifactStagingDirectory) /p:UseSharedCompilation=false
       platform: Any CPU
       configuration: Release
       # Execute a clean build, in order to remove any existing build artifacts prior to the build.
       clean: True
     displayName: Visual Studio Build

   # Read and set the generated environment variables to end build tracing. This is done in PowerShell in this example.
   - task: PowerShell@1
     displayName: Clear CodeQL environment variables
     inputs:
       targetType: inline
       # A literal block scalar (`|`) keeps each PowerShell statement on its own line.
       script: |
         $json = Get-Content $(System.DefaultWorkingDirectory)/db/temp/tracingEnvironment/end-tracing.json | ConvertFrom-Json
         $json.PSObject.Properties | ForEach-Object {
             $template = "##vso[task.setvariable variable="
             $template += $_.Name
             $template += "]"
             $template += $_.Value
             echo "$template"
         }

   - task: CmdLine@2
     displayName: Finalize CodeQL database
     inputs:
       script: 'codeql database finalize db'

   # Other tasks go here, for example:
   # `codeql database analyze`
   # then `codeql github upload-results` ...

Obtaining databases from LGTM.com
---------------------------------

👋🏻 AFAICT we don't currently explain how to "wrap invocations of build commands with `codeql database trace-command`". The section above uses `codeql database create` with `--command='<build command>'`. Are these equivalent? (I notice that the `trace-command` help describes it as a plumbing command.)

I wonder whether it should be more like:

Alternatively, we may be missing a section on how to use `codeql database trace-command`.

Yes, `codeql database init` followed by `codeql database trace-command <command>` followed by `codeql database finalize` should be equivalent to `codeql database create --command=<command>`. The latter is the recommended way to create the DB if you have a single command. The former is the recommended way to create a DB if the build requires multiple commands (and you can't wrap them in a script to make them into a single command) and you can add `codeql database trace-command` in front of each one. The new indirect tracing option addresses the case where: ... `codeql database trace-command`.

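A minimal sketch of the two command sequences compared above, written as Python functions that only build argument lists and run nothing. The function names are hypothetical, and the exact argument order for `trace-command` (a `--` separator, then the database, then the build command) is an assumption that should be checked against the CLI help.

```python
import shlex


def single_command_sequence(database, language, build_command):
    """One-step form: `codeql database create` runs the build itself."""
    return [[
        "codeql", "database", "create", database,
        f"--language={language}", f"--command={build_command}",
    ]]


def wrapped_command_sequence(database, language, build_commands):
    """Multi-step form: init, trace each build command, then finalize."""
    steps = [[
        "codeql", "database", "init",
        f"--language={language}", "--source-root=.", database,
    ]]
    for command in build_commands:
        # `--` ends CodeQL's options; the database and build command follow
        # as positional arguments (argument order assumed, see lead-in).
        steps.append([
            "codeql", "database", "trace-command", "--", database,
            *shlex.split(command),
        ])
    steps.append(["codeql", "database", "finalize", database])
    return steps
```

For example, `wrapped_command_sequence("db", "cpp", ["make clean", "make"])` yields four argument lists: one `init`, two `trace-command` invocations, and one `finalize`. Indirect build tracing targets builds where prefixing each command in this way is not possible.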
Thanks for the extra information. It sounds as if we could do with a short overview giving the options for tracing the build for compiled languages. Possibly this would be better as part of a follow-up PR, but I'll leave @ethanpalm to make the call on this.

One of the considerations we're trying to balance is to provide indirect tracing as an option for people who need it without directing people toward it unintentionally. This came up in naming and avoiding calling indirect build tracing an advanced option. It feels to me like indirect build tracing would be better introduced as a troubleshooting option rather than one of several options for tracing the build of compiled languages in general.

Perhaps a line at the beginning of Creating databases for compiled languages, after

> For compiled languages, CodeQL needs to invoke the required build system to generate a database, therefore the build method must be available to the CLI.

explaining to see below about indirect build tracing if it is relevant to the specific use case. I think this could help direct people who need to use indirect build tracing to the procedure but won't cause people to think they should use indirect build tracing when they don't need to.

After a bit more thinking, I am going to open a separate issue for how we introduce this information because I think there are a few different approaches we can take.

I agree with @felicitymay's suggestion about providing a short guide on the options for compiled languages. There are many possibilities, but to me the high level options are:

- `codeql database create` (without a `--command` argument) to autobuild the code.
- `codeql database create ... --command "<build command>"`
- ... `trace-command` multiple times. I don't think we need to mention that just yet.

Notably, indirect tracing is not a viable troubleshooting option. Aside from autobuild failing, there's no way to try out a build without it if you don't know the build command.