Add indirect build tracing docs #6667

ethanpalm · 2021-09-10T20:09:32Z

This PR documents the indirect build tracing feature (referred to as "sandwiched tracing" here: https://github.com/github/codeql-cli-binaries/releases/tag/v2.6.0). This PR adds a section to "Creating CodeQL databases" explaining when and how to use indirect build tracing and an example of indirect build tracing. More info in the internally linked issue.

ethanpalm · 2021-09-10T20:14:52Z

@adityasharad or @edoardopirovano, would one of you be able to provide a technical review of this new content? Thanks!

docs/codeql/codeql-cli/creating-codeql-databases.rst

edoardopirovano

Thanks @ethanpalm! A few technical comments below - @adityasharad or @hmakholm might have more 🙂

docs/codeql/codeql-cli/creating-codeql-databases.rst

felicitymay

This is looking good. It's great to have such a detailed example too 💖

I've added a few comments on the text, but nothing major.

felicitymay · 2021-09-14T07:33:05Z

docs/codeql/codeql-cli/creating-codeql-databases.rst

+Using indirect build tracing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action.


👋🏻 AFAICT we don't currently explain how to "wrap invocations of build commands with codeql database trace-command".

The section above uses codeql database create with --command='<build command>'. Are these equivalent? (I notice that the trace-command help describes it as a plumbing command).

I wonder whether it should be more like:

Suggested change

If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action.

If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot specify build commands using the ``--command`` option, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action.

Alternatively, we may be missing a section on how to use codeql database trace-command.

Yes, codeql database init followed by codeql database trace-command <command> followed by codeql database finalize should be equivalent to codeql database create --command=<command>. The latter is the recommended way to create the DB if you have a single command. The former is the recommended way to create a DB if the build requires multiple commands (and you can't wrap them in a script to make them into a single command) and you can add codeql database trace-command in front of each one. The new indirect tracing option addresses the case where:

You have multiple build commands.

You cannot wrap them with codeql database trace-command.
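To make the equivalence described above concrete, here is a minimal sketch of the two approaches as shell functions. The function names, the `CODEQL` dry-run variable, and the `--source-root=.` choice are assumptions made for illustration; they are not taken from the PR. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set CODEQL=codeql to run them against a real CLI installation.
CODEQL="${CODEQL:-echo codeql}"

# Single build command: one call initialises, traces, and finalizes.
# $1 = database directory, $2 = language, $3 = build command
create_single() {
  $CODEQL database create "$1" --language="$2" --command="$3"
}

# Multiple build commands: wrap each one with trace-command.
# $1 = database directory, $2 = language, remaining args = build commands
create_multi() {
  db="$1"; lang="$2"; shift 2
  $CODEQL database init --language="$lang" --source-root=. "$db"
  for cmd in "$@"; do
    # $cmd is deliberately unquoted so "make all" splits into two words.
    $CODEQL database trace-command -- "$db" $cmd
  done
  $CODEQL database finalize "$db"
}
```

In dry-run mode, `create_multi my-db cpp "make clean" "make all"` prints four lines: one `init`, two `trace-command`, and one `finalize`. Indirect build tracing covers the remaining case, where the build commands cannot be prefixed like this.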

Thanks for the extra information. It sounds as if we could do with a short overview giving the options for tracing the build for compiled languages. Possibly this would be better as part of a follow up PR, but I'll leave @ethanpalm to make the call on this.

One of the considerations we're trying to balance is to provide indirect tracing as an option for people who need it without directing people toward it unintentionally. This came up in naming and avoiding calling indirect build tracing an advanced option. It feels to me like indirect build tracing would be better introduced as a troubleshooting option rather than one of several options for tracing the build of compiled languages in general.

Perhaps we could add a line at the beginning of "Creating databases for compiled languages", after "For compiled languages, CodeQL needs to invoke the required build system to generate a database, therefore the build method must be available to the CLI.", pointing readers to the indirect build tracing section below if it is relevant to their use case. I think this could help direct people who need indirect build tracing to the procedure without causing others to think they should use it when they don't need to.

After a bit more thinking, I am going to open a separate issue for how we introduce this information because I think there are a few different approaches we can take.

I agree with @felicitymay's suggestion about providing a short guide on the options for compiled languages. There are many possibilities, but to me the high level options are:

Do you use a well-known build system recognised by the CodeQL autobuilders? Use codeql database create (without a --command argument) to autobuild the code.

Do you know the build command line? Use codeql database create ... --command "<build command>"

A variation of this is if you have multiple build command lines, in which case you would use trace-command multiple times. I don't think we need to mention that just yet.

If neither of the above are suitable, for example if you are using preconfigured build steps from your CI system that do not expose the build command, then use indirect build tracing. Examples of such build steps are the VSBuild and MSBuild tasks in Azure DevOps.

Notably, indirect tracing is not a viable troubleshooting option. Aside from autobuild failing, there's no way to try out a build without it if you don't know the build command.
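The high-level options listed in this comment could be sketched as a small decision helper. This is only an illustration of the reasoning: `choose_method` is a hypothetical function, not part of the CodeQL CLI, and the placeholders in the printed commands are left unfilled on purpose.

```shell
#!/bin/sh
# Hypothetical helper: map the two questions above to a database creation method.
# $1 = "yes" if a CodeQL autobuilder recognises the build system
# $2 = "yes" if the build command line is known
choose_method() {
  if [ "$1" = "yes" ]; then
    # Autobuild: no --command argument.
    echo "codeql database create <database> --language=<language>"
  elif [ "$2" = "yes" ]; then
    # Known build command: pass it with --command.
    echo "codeql database create <database> --language=<language> --command='<build command>'"
  else
    # Neither applies, for example preconfigured CI build steps.
    echo "indirect build tracing"
  fi
}
```

For example, `choose_method no no` prints `indirect build tracing`, which matches the case of CI build steps that do not expose the build command.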

docs/codeql/codeql-cli/creating-codeql-databases.rst

Co-authored-by: Felicity Chapman <felicitymay@github.com>

adityasharad

Great work documenting a complex feature. Some suggestions for added clarity but I think this is generally on the right track, and I'd support shipping these changes and making incremental improvements over time.

docs/codeql/codeql-cli/creating-codeql-databases.rst

adityasharad · 2021-09-14T19:11:39Z

docs/codeql/codeql-cli/creating-codeql-databases.rst

+Using indirect build tracing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the CodeQL CLI autobuilders for compiled languages do not work with your CI workflow and you cannot wrap invocations of build commands with ``codeql database trace-command``, you can use indirect build tracing to create a CodeQL database. To use indirect build tracing, your CI system must be able to set custom environment variables for each build action.


adityasharad · 2021-09-14T19:13:27Z

docs/codeql/codeql-cli/creating-codeql-databases.rst

+
+  Based on your operating system, we recommend you run: ...
+
+The ``codeql database init`` command creates ``<database>/temp/tracingEnvironment`` with files that contain environment variables and values that will enable CodeQL to trace a sequence of build steps. These files are named ``start-tracing.{json,sh,bat,ps1}``. Use one of these files with your CI system's mechanism for setting environment variables for future steps. You can:


This is a good explanation.
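For a build that runs in a single POSIX shell session, using the `start-tracing.sh` file described in the hunk above might look like the following sketch. `start_tracing` is a hypothetical helper name, and the sketch assumes the database directory has already been initialised so that `<database>/temp/tracingEnvironment` exists.

```shell
#!/bin/sh
# Hypothetical helper for a POSIX shell step; not part of the CodeQL CLI.
# $1 = database directory already prepared by `codeql database init`
start_tracing() {
  script="$1/temp/tracingEnvironment/start-tracing.sh"
  if [ ! -f "$script" ]; then
    echo "no start-tracing.sh under $1; was the database initialised for tracing?" >&2
    return 1
  fi
  # Source (rather than execute) the file so the variables it sets
  # remain in the current shell for the build commands that follow.
  . "$script"
}
```

Sourcing only affects the current shell. If the CI system runs each step in a fresh process, the variables would need to go through the CI system's own mechanism for setting environment variables for future steps, which is presumably why the files are also generated in `json`, `bat`, and `ps1` forms.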

docs/codeql/codeql-cli/creating-codeql-databases.rst

adityasharad · 2021-09-14T19:15:53Z

docs/codeql/codeql-cli/creating-codeql-databases.rst

+
+Build your code and then run the command ``codeql database finalize <database>``. Optionally, after building the code, unset the environment variables using an ``end-tracing.{json,sh,bat,ps1}`` script from the directory where the ``start-tracing`` scripts are stored.
+
+Once you have created a CodeQL database using indirect build tracing, you can work with it like any other CodeQL database. For example, analyze the database, and upload the results to GitHub if you use code scanning.


Optional: link to the docs on analyzing and uploading?
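The closing step described in this hunk could be sketched as follows for a POSIX shell. `finish_tracing` and the `CODEQL` dry-run variable are hypothetical names, and running the optional `end-tracing.sh` step before `finalize` is an assumption; the text only says that both happen after the build.

```shell
#!/bin/sh
# Dry-run by default: the finalize command is printed, not executed.
# Set CODEQL=codeql to run it against a real CLI installation.
CODEQL="${CODEQL:-echo codeql}"

# $1 = database directory used for indirect build tracing
finish_tracing() {
  db="$1"
  end_script="$db/temp/tracingEnvironment/end-tracing.sh"
  # Optional: unset the tracing variables once the build has finished.
  if [ -f "$end_script" ]; then
    . "$end_script"
  fi
  $CODEQL database finalize "$db"
}
```

After this step the database can be analyzed like any other CodeQL database.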

felicitymay

As requested, I took another look at this after the technical updates. It looks nearly ready to merge. There was just one point that I missed in my original review. Once resolved, this looks ready to merge. ✨

felicitymay · 2021-09-16T07:21:16Z

docs/codeql/codeql-cli/creating-codeql-databases.rst

+       # If no language is specified, a GitHub Apps or personal access token must be passed through stdin.
+       # to autodetect the language.


This comment is from the original and I overlooked line 288 in my earlier review. I think a copy/paste operation went slightly awry here. Based on --language I suspect this ought to be:

Suggested change

# If no language is specified, a GitHub Apps or personal access token must be passed through stdin.

# to autodetect the language.

# If you omit `--language`, the CLI will call the GitHub API for language data.

# This will fail unless a GitHub Apps or personal access token is available in the

# environment variable GITHUB_TOKEN or passed through stdin using `--github-auth-stdin`.

However, we don't mention omitting the --language option anywhere else in this article, so I wonder if we really want to introduce it here.

I agree with Felicity here - I think we want to keep the example limited to talking about indirect tracing so I would propose that we do not need to mention --language at all and can just remove the rest of this comment after "In this example, the CodeQL CLI has been downloaded and placed on the PATH.".

Documenting exactly how codeql database init behaves belongs in the documentation for that command, I think.

edoardopirovano

I've given this another read through after your changes, and from a technical stand-point I think this is now good to go.

felicitymay

Thanks for the ✔️ @edoardopirovano.
It looks ready to merge to me too. 🚀

add indirect build tracing content and example

fb22931

github-actions bot added the documentation label Sep 10, 2021

Wahhe1 previously approved these changes Sep 11, 2021

edoardopirovano reviewed Sep 11, 2021

felicitymay reviewed Sep 13, 2021

Add reviewer feedback

47a543e

ethanpalm dismissed Wahhe1’s stale review via 47a543e September 13, 2021 16:02

Add example step for ending build tracing

930a36d

felicitymay reviewed Sep 14, 2021

Apply suggestions from code review

c62a21e

Co-authored-by: Felicity Chapman <felicitymay@github.com>

adityasharad reviewed Sep 14, 2021

Add reviewer feedback

080867a

felicitymay reviewed Sep 16, 2021

edoardopirovano previously approved these changes Sep 16, 2021

Update example note

4d7aa5c

ethanpalm dismissed edoardopirovano’s stale review via 4d7aa5c September 16, 2021 16:29

felicitymay approved these changes Sep 16, 2021

ethanpalm merged commit b73a2f7 into github:main Sep 16, 2021
