-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Add indirect build tracing docs #6667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@adityasharad or @edoardopirovano, would one of you be able to provide a technical review of this new content? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ethanpalm! A few technical comments below - @adityasharad or @hmakholm might have more 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. It's great to have such a detailed example too 💖
I've added a few comments on the text, but nothing major.
Using indirect build tracing | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋🏻 AFAICT we don't currently explain how to "wrap invocations of build commands with codeql database trace-command
".
The section above uses codeql database create
with --command='<build command>
. Are these equivalent? (I notice that the trace-command
help describes it as a plumbing command).
I wonder whether it should be more like:
If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action. | |
If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot specify build commands with using the ``--command`` option, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action. |
Alternatively, we may be missing a section on how to use codeql database trace-command
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, codeql database init
followed by codeql database trace-command <command>
followed by codeql database finalize
should be equivalent to codeql database create --comand=<command>
. The latter is the recommended way to create the DB if you have a single command. The former is the recommended way to create a DB if the build requires multiple commands (and you can't wrap them in a script to make them into a single command) and you can add codeql database trace-command
in front of each one. The new indirect tracing option addresses the case where:
- You have multiple build commands.
- You cannot wrap them with
codeql database trace-command
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the extra information. It sounds as if we could do with a short overview giving the options for tracing the build for compiled languages. Possibly this would be better as part of a follow up PR, but I'll leave @ethanpalm to make the call on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the considerations we're trying to balance is to provide indirect tracing as an option for people who need it without directing people toward it unintentionally. This came up in naming and avoiding calling indirect build tracing an advanced option. It feels to me like indirect build tracing would be better introduced as a troubleshooting option rather than one of several options for tracing the build of compiled languages in general.
Perhaps a line at the beginning of Creating databases for compiled languages, after For compiled languages, CodeQL needs to invoke the required build system to generate a database, therefore the build method must be available to the CLI.
explaining to see below about indirect build tracing if it is relevant to the specific use case. I think this could help direct people who need to use indirect build tracing to the procedure but won't cause people to think they should use indirect build tracing when they don't need to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a bit more thinking, I am going to open a separate issue for how we introduce this information because I think there are a few different approaches we can take.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @felicitymay's suggestion about providing a short guide on the options for compiled languages. There are many possibilities, but to me the high level options are:
- Do you use a well-known build system recognised by the CodeQL autobuilders? Use
codeql database create
(without a--command
argument) to autobuild the code. - Do you know the build command line? Use
codeql database create ... --command "<build command>"
- A variation of this is if you have multiple build command lines, in which case you would use
trace-command
multiple times. I don't think we need to mention that just yet.
- A variation of this is if you have multiple build command lines, in which case you would use
- If neither of the above are suitable, for example if you are using preconfigured build steps from your CI system that do not expose the build command, then use indirect build tracing. Examples of such build steps are the VSBuild and MSBuild tasks in Azure DevOps.
Notably, indirect tracing is not a viable troubleshooting option. Aside from autobuild failing, there's no way to try out a build without it if you don't know the build command.
Co-authored-by: Felicity Chapman <felicitymay@github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work documenting a complex feature. Some suggestions for added clarity but I think this is generally on the right track, and I'd support shipping these changes and making incremental improvements over time.
Using indirect build tracing | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @felicitymay's suggestion about providing a short guide on the options for compiled languages. There are many possibilities, but to me the high level options are:
- Do you use a well-known build system recognised by the CodeQL autobuilders? Use
codeql database create
(without a--command
argument) to autobuild the code. - Do you know the build command line? Use
codeql database create ... --command "<build command>"
- A variation of this is if you have multiple build command lines, in which case you would use
trace-command
multiple times. I don't think we need to mention that just yet.
- A variation of this is if you have multiple build command lines, in which case you would use
- If neither of the above are suitable, for example if you are using preconfigured build steps from your CI system that do not expose the build command, then use indirect build tracing. Examples of such build steps are the VSBuild and MSBuild tasks in Azure DevOps.
Notably, indirect tracing is not a viable troubleshooting option. Aside from autobuild failing, there's no way to try out a build without it if you don't know the build command.
|
||
Based on your operating system, we recommend you run: ... | ||
|
||
The ``codeql database init`` command creates ``<database>/temp/tracingEnvironment`` with files that contain environment variables and values that will enable CodeQL to trace a sequence of build steps. These files are named ``start-tracing.{json,sh,bat,ps1}``. Use one of these files with your CI system's mechanism for setting environment variables for future steps. You can: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good explanation.
|
||
Build your code and then run the command ``codeql database finalize <database>``. Optionally, after building the code, unset the environment variables using an ``end-tracing.{json,sh,bat,ps1}`` script from the directory where the ``start-tracing`` scripts are stored. | ||
|
||
Once you have created a CodeQL database using indirect build tracing, you can work with it like any other CodeQL database. For example, analyze the database, and upload the results to GitHub if you use code scanning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: link to the docs on analyzing and uploading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As requested, I took another look at this after the technical updates. It looks nearly ready to merge. There was just one point that I missed in my original review. Once resolved, this looks ready to merge. ✨
# If no language is specified, a GitHub Apps or personal access token must be passed through stdin. | ||
# to autodetect the language. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is from the original and I overlooked line 288 in my earlier review. I think a copy/paste operation went slightly awry here. Based on --language
I suspect this ought to be:
# If no language is specified, a GitHub Apps or personal access token must be passed through stdin. | |
# to autodetect the language. | |
# If you omit `--language`, the CLI will call the GitHub API for language data. | |
# This will fail unless a GitHub Apps or personal access token is available in the | |
# environment variable GITHUB_TOKEN or passed through stdin using `--github-auth-stdin`. |
However, we don't mention omitting the --language
option anywhere else in this article, so I wonder if we really want to introduce it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Felicity here - I think we want to keep the example limited to talking about indirect tracing so I would propose that we do not need to mention --language
at all and can just remove the rest of this comment after "In this example, the CodeQL CLI has been downloaded and placed on the PATH.".
Documenting exactly how codeql database init
behaves belongs in the documentation for that command, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given this another read through after your changes, and from a technical stand-point I think this is now good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the ✔️ @edoardopirovano.
It looks ready to merge to me too. 🚀
This PR documents the indirect build tracing feature (referred to as "sandwiched tracing" here: https://github.com/github/codeql-cli-binaries/releases/tag/v2.6.0). This PR adds a section to "Creating CodeQL databases" explaining when and how to use indirect build tracing and an example of indirect build tracing. More info in the internally linked issue.