Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

efi-stub: when booting a kernel foo.efi then pack foo.efi.extra.d/*.{cred,raw} as an initrd #20789

Merged
merged 24 commits into from
Sep 23, 2021

Conversation

poettering
Copy link
Member

This adds support for the EFI stub to look for credential files and sysext files next to the EFI kernel image being loaded, and pack them up in an initrd cpio image, and pass them to the kernel.

Specifically, for a kernel image foo.efi it looks for foo.efi.extra.d/*.cred and packs these files up in an initrd, placing it inside a directory /.extra/credentials/. It then looks for foo.efi.extra.d/*.raw and pack these files up in an initrd, placing them inside a directory /.extra/sysexts/. It then concatenates any other initrd with these two initrds, so they are combined.

Or in other words auxiliary files placed next to the kernel image are picked up automatically by the EFI stub and be made available in the initrd in the /.extra/ directory.

What's the usecase for this? This is supposed to be useful in the context of implementing fully trusted initrds, i.e. initrds that are not built locally on the system and unsigned/unmeasured as currently, but instead are built by the vendor, and measures to TPM. The idea is that a basic initrd is always linked into the kernel EFI image anyway. This will already be sufficient for many cases. However, in some cases it is necessary to parameterize initrds, or to extend the basic initrds with additional subsystems (e.g. think complex storage, or passing server info/certificates/… to initrds). The idea is that the parameterization is done using the "credentials" logic we already have in systemd, with these credential files (which can optionally be encrypted+authenticated by TPM2) being placed in the ESP next to the kernel image. And the initrd extension via the "sysext" logic we already have in systemd too.

Note that the files read by this code are not verified immediately, they are copied as-is and placed into /.extra/ in the initrd. In a trusted environment they need to be validated later, but before first use. For the credentials logic this should be done via the TPM2 encryption/authentication logic. For the sysext stuff the idea is that this is done via signed images, as implemented by #20691.

@poettering
Copy link
Member Author

poettering commented Sep 20, 2021

(Note that this doesn't do anything with the data yet, once it is available in the initrd. We probably want to teach systemd-sysext to automatically pick up extensions from /.extra/sysexts/, but insist on signatures. And we probably want to move the credential files from /extra/credentials/ somewhere into /run/ so that they will survive into the host, (because the stuff dropped as initrd is all flushed out during the transition to the host) as a general mechanism to parameterize the boot.)

@lgtm-com
Copy link

lgtm-com bot commented Sep 20, 2021

This pull request introduces 1 alert when merging 520e21b into fa4f366 - view on LGTM.com

new alerts:

  • 1 for Unsigned comparison to zero

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very nice.

I didn't verify the code the generates the cpio payload… I wonder if we could somehow make this into library code, so that we can test it independently.

src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/cpio.c Show resolved Hide resolved
src/boot/efi/stub.c Outdated Show resolved Hide resolved
meson_options.txt Show resolved Hide resolved
src/boot/efi/stub.c Outdated Show resolved Hide resolved
@medhefgo
Copy link
Contributor

It feels like you're doing the job of userspace here. We could just expect userspace to have pre-assembled the extra initrd so we don't need to ship our own cpio generator. If it's about not trusting it to not override vendor files, you could just have a cmdline option that points userspace to the partition+folder or partition+cpio where it can then pick these up. Or another idea would be to hand it to them using volatile efi vars (I assume they are stored in ram, not nvram).

@poettering
Copy link
Member Author

It feels like you're doing the job of userspace here. We could just expect userspace to have pre-assembled the extra initrd so we don't need to ship our own cpio generator. If it's about not trusting it to not override vendor files, you could just have a cmdline option that points userspace to the partition+folder or partition+cpio where it can then pick these up. Or another idea would be to hand it to them using volatile efi vars (I assume they are stored in ram, not nvram).

But doing it via cpio is pretty easy, and places these resources where we want them: in the file system of the initrd. cpio is such a trivial format, it's just the files concatenated with super short text headers before. it fantastically easy to generate, places the data where we want it, in the format we want it, without size limits and pretty naturally maps to the right PCRs to enroll things in and operates with only minimal reliance of EFI features (i.e. just fs reading).

It makes everything so very simple: no pregeneration or special tools on the host. A simply "cp" of the sysext image or the credential file is enough. It makes everything so simple: placing the files where they are found, picking them up from there, measuring them, and passing them along, and then picking them up from the initrd.

Copy link
Contributor

@medhefgo medhefgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine I guess. I do feel I am being overzealous on assert()... :/

src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/util.c Show resolved Hide resolved
src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/util.c Outdated Show resolved Hide resolved
src/boot/efi/cpio.c Show resolved Hide resolved
src/boot/efi/cpio.c Outdated Show resolved Hide resolved
src/boot/efi/stub.c Show resolved Hide resolved
src/boot/efi/stub.c Outdated Show resolved Hide resolved
src/boot/efi/measure.c Outdated Show resolved Hide resolved
@poettering
Copy link
Member Author

It feels like you're doing the job of userspace here. We could just expect userspace to have pre-assembled the extra initrd so we don't need to ship our own cpio generator.

I forgot earlier: the killer argument against the idea of having userspace assemble the cpio archive here is trust isolation: the data we pick up here is initially untrusted, i.e. it's just a bunch of files from the ESP. Validation for them happens only when they are used (i.e. for the credentials when they are decrypted/authenticated via TPM; and for the sysext images the PKCS7 signature checked). This means we need to make sure to isolate them from the rest of the initrd data. This patch does this by packing them in a subdir /.extra/. We now have a "quarantine" subdir called /.extra/ where this untrusted data is placed, and we can give you the guarantee that these files we picked up are not mixed up with the otherwise trusted initrd data.

If we pre-generate the cpio archives then we can't do that: the esp is untrusted territory, thus if we load the cpio from there, it could have been manipulated to replace /bin/sh or so, and the kernel would then happily use that instead of the real /bin/sh.

The only way you can salvage that if these pre-generated initrds would establish trust some otherway. That could be: are signed (but with which key, and how do you communicate that key?) or run through an initrd validator before passed to the kernel that ensures everything in them is in a quarantined dir (and thus not replacing /bin/sh, …). But such a validator would be a lot more complex then the trivial generator the current PR has.

Hence: pre-generated cpios really don't deliver the same trust guarantees so easily.

@poettering
Copy link
Member Author

Or another idea would be to hand it to them using volatile efi vars (I assume they are stored in ram, not nvram).

I doubt you can pack potentially huge sysext GPT images in EFI vars. In particular as we want to free them after use, and I am pretty sure memory taken by EFI vars is not returned to the OS even if they are deleted.

@poettering
Copy link
Member Author

If it's about not trusting it to not override vendor files, you could just have a cmdline option that points userspace to the partition+folder or partition+cpio where it can then pick these up.

I am not sure I understand what you mean by this? Can you elaborate?

@medhefgo
Copy link
Contributor

medhefgo commented Sep 21, 2021

I am not sure I understand what you mean by this? Can you elaborate?

Add rd.extra=ESPUUID:/path/to/extra.d/ to cmdline and then let the initrd do with it whatever it wants. (Mount esp and verify/copy the payload, measure it to TPM if it wants and then use it)

@poettering
Copy link
Member Author

Add rd.extra=ESPUUID:/path/to/extra.d/ to cmdline and then let the initrd do with it whatever it wants. (Mount esp and verify/copy the payload, measure it to TPM if it wants and then use it)

I think it's really important to load these parameters and extension files at the same time as the main initrd itself from the same source. After all, they might contain the information need to even be able to mount the ESP. i.e. consider booting from UEFI iscsi stuff: to actually be able to find the ESP you need a full iscsi stack to be up, and that might require a sysext to your initrd (since iscsi is complex storage), or special credentials to be able to access the iscsi share.

it should be possible to use the creds and the sysexts to actually discover the root fs and then the ESP from the Linux envrionment. It should not be assumed that probing the hw, accessing it and finding the ESP is something we can trviailly easily do from the Linux environment, and might require additional info/drivers/subsystems for which we want to provide the mechanism here to provide to the initrd.

(also, right now we mount the ESP super late during late boot. It would require major reworking to make this something the initrd already does).

@lgtm-com
Copy link

lgtm-com bot commented Sep 21, 2021

This pull request introduces 1 alert when merging af860a5 into 71a80dc - view on LGTM.com

new alerts:

  • 1 for Unsigned comparison to zero

src/boot/efi/boot.c Outdated Show resolved Hide resolved
src/boot/efi/drivers.c Outdated Show resolved Hide resolved
src/boot/efi/boot.c Outdated Show resolved Hide resolved
continue;
if (dirent->Attribute & EFI_FILE_DIRECTORY)
continue;
if (match_suffix && !endswith_no_case(dirent->FileName, match_suffix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the first 3 checks here should be part of readdir_harder

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure that'd really make things ncier, since the invocations of readdir_harder() we have filter by different stuff...

dunno, maybe for a later cleanup patch, but i don't see it for now.

man/systemd-stub.xml Outdated Show resolved Hide resolved
man/systemd-stub.xml Outdated Show resolved Hide resolved
man/systemd-stub.xml Outdated Show resolved Hide resolved
@poettering poettering changed the title wip: efi-stub: when booting a kernel foo.efi then pack foo.efi.extra.d/*.{cred,raw} as an initrd efi-stub: when booting a kernel foo.efi then pack foo.efi.extra.d/*.{cred,raw} as an initrd Sep 21, 2021
@poettering
Copy link
Member Author

Should be ready for review now, has all the docs now. Please have a look.

I added a bunch of clean-up commits on top, if people want I can split those out.

…call

Just some refactoring, no real code changes.
Let's move conditionalization of tpm_log_load_options() into the
measure.h to encapsulate the ifdeffery a bit more.
This isn't trivial when trying to be compatible with 32bit archs, hence
add a set of helper macro-like functions that make the conversion safe.
Let's make some stuff const. Most importanly call AllocatePages() with
a pointer to an EFI_PHYSICAL_ADDRESS instead of a pointer to a
pointer. On 64bit this makes no difference, but on i386 this is simply
not correct, since EFI_PHYSICAL_ADDRESS is 64bit there, even though
pointers are 32bit.
We inquire the EFI var for this at two places, let's add a helper that
queries it and gracefully handles it if we can't get it, by returning a
zero mask, i.e. no features supported.
…nce don't lie

(also, most communication happens between boot loader and OS, only
seldom stuff goes the other way, hence mention that the boot loader
first)
@poettering poettering added good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed and removed ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR please-review labels Sep 23, 2021
@poettering
Copy link
Member Author

@medhefgo thanks for the review! addressed everything (but the TPM ifdef thing, see above)

given the changes were rather tiny this time, taking liberity to set green label

If the field exists it's probably the best version we have for sorting,
since it will change on every single OS image update.
/* key= */ 'l',
os_name_pretty ?: os_name,
path,
os_image_version ?: (os_version ?: (os_version_id ? : os_build_id)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work to add "n/a" at the end of this chain to fix #20820 ? Things wouldn't be sorted, but if one wants that then they need to provide a version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the fix for that should look different, i.e. we might just want to pass NULL in that case. the version is optional for the rest of the code and used for presentation purposes only (unlike i suggested elsewhere we actually order entries by the filename where they are defined in, not the version)

@poettering poettering merged commit 37eb710 into systemd:main Sep 23, 2021
@bluca
Copy link
Member

bluca commented Sep 24, 2021

The mkosi changes are making the CI sad:

‣ qemu-system-x86_64 -machine type=q35,accel=tcg,smm=off -smp 2 -m 1G -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -vga virtio -nic tap,script=no,downscript=no,ifname=vt-image,model=virtio-net-pci -drive if=pflash,format=raw,readonly=on,file=/usr/share/ovmf/OVMF.fd -drive if=none,id=hd,file=/home/runner/work/systemd/systemd/image.raw,format=raw -device virtio-scsi-pci,id=scsi -device scsi-hd,drive=hd,bootindex=1

VNC server running on ::1:5900
TIMED OUT
Error: Process completed with exit code 1.

@poettering this PR borked all mkosi CI jobs

@poettering
Copy link
Member Author

oh, hmm i fucked this up, but how so? i didn't actually make any changes to mkosi.

(i had assumed the timeouts were causes by something else.)

@bluca
Copy link
Member

bluca commented Sep 24, 2021

I don't know, maybe I missed that it happened somewhere else first? Haven't seen it in other PRs though

@bluca
Copy link
Member

bluca commented Sep 24, 2021

wait, won't it be this commit? a02c123 I thought it was part of this PR, but now github doesn't link it anymore?

@poettering
Copy link
Member Author

Ah, fuck it, I pushed that by accident when I pushed a TODO addition directly to main.

@bluca
Copy link
Member

bluca commented Sep 24, 2021

Are you going to push a revert? Or shall I?

@poettering
Copy link
Member Author

already did

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed new-feature sd-boot/sd-stub/bootctl
7 participants