[systemd-devel] Measured systemd-sysext

Mon May 27 07:00:45 UTC 2024

On 24/5/24 18:44, Lennart Poettering wrote:
> On Fr, 24.05.24 17:39, Dimitris Karakasilis (dimitris at karakasilis.me) wrote:
>
>> we (at kairos.io) are trying to understand how systemd-sysext
>> extensions can
> Hmm, I thought kairos wasn't so fond of systemd?
Why would you think that? Kairos is distro-agnostic, so it tries to
work on OpenRC-based distros as well, but the systemd-based ones are
better supported and tested, to be honest.
Thanks for the detailed information (below). We are not familiar enough
with these features to contribute an implementation, but we'll keep
an eye on development and contribute in any way we can.
>
>> also be made tamper-proof by being measured in a system that boots in UKI
>> mode.
> It's pretty simple: there's no nice support for comprehensively
> measuring sysext images right now. There's support for measuring into
> PCR 13 the sysext images passed into the UKI, but that's pretty much
> it: there's no support for measuring sysexts activated from other
> sources and later during runtime.
>
> So there are two issues:
>
> 1. Right now we don't really have another PCR to spare. The various
>     PCRs systemd measures stuff into right now contain measurements
>     that typically happen only once during boot. That makes them really
>     nice for validating/attesting boot success, or to bind policy to
>     and so on, as they are relatively stable: they "settle"
>     eventually. Measurements of sysexts on activation are different
>     from that. After all, sysexts are added/removed/updated during
>     runtime all the time, hence they probably should be expected to be
>     a continuing series of measurements, one for each activation during
>     runtime. That makes them nice for attestation, but much less useful
>     for binding policy to. Hence, I think there's a strong reason to
>     keep these measurements separate from the existing measurements,
>     i.e. place them in a separate PCR – but we have none left.
>
>     Now, TPM2 allows adding new "fake" PCRs via a special type of
>     nvindex so that this restriction goes away. It's high on our todo
>     list to have an API for "registering" such "fake" PCRs (which would
>     mean: allocating the nvindex with an appropriate locked down policy,
>     and then storing information about this somewhere). This should
>     probably be placed in systemd-pcrextend@.service (which already
>     provides an API to measure arbitrary stuff to arbitrary PCRs, so it
>     looks like it would be a nice place to allow measuring arbitrary
>     stuff to "fake" PCRs, and allocating them). This is probably not
>     particularly involved, but so far no one has worked on this.
>
> 2. The question is where (in which piece of code) the system
>     extensions should be measured. There are two potential places. The
>     first is userspace code, at the point where we activate them. That
>     would be trivial to add for us, since we have all the internal APIs
>     after all, i.e. we could just use the aforementioned pcrextend APIs
>     once we have them to allocate a fake PCR and then immediately
>     measure into it.
>
>     However, what might be nicer would be to measure this in kernel
>     space. I was discussing this at last week's LSFMMBPF conference
>     with various relevant folks, and one idea we came up with is
>     something like this:
>
>     a) introduce a BPF kfunc for TPM measurements in the kernel, so
>        that BPF code loaded into the kernel can do measurements. This
>        would require an upstream kernel patch, but the BPF folks seemed
>        kinda on board with that.
>
>     b) then put together a small BPF LSM for the Linux kernel that
>        hooks into the dm-verity activation, and does two things:
>        measures the root hash of the device (plus some metadata such as
>        the DM device name), and writes a quick log message into a bpf
>        ringbuffer to userspace. Userspace would then read that and
>        ensure the log ends up in the measurement logs systemd maintains
>        anyway.
>
>     In systemd we already ship and load some BPF LSMs, so adding another
>     like the above should be relatively straightforward.
>
>     (Of course, it's a bit more complicated than this, because a BPF
>     kfunc that can measure into a PCR is not going to be enough [NB:
>     the kernel already has general code to measure into PCRs]. After
>     all, we want to measure into a "fake PCR" nvindex, which the kernel
>     has no existing code for yet. Somebody would have to write that
>     first, but it should be manageable.)
>
> Putting this all together (under the assumption we go for the bpf-lsm
> option), the codeflow would be something like this:
>
> 1. early during boot, systemd allocates a "fake PCR" for dm-verity
>     measurements, from userspace
>
> 2. it then loads the small BPF LSM that makes sure all dm-verity
>     activations are measured, and parameterizes it with the allocated
>     fake PCR nvindex.
>
> 3. A bpf ringbuffer is kept in place that will receive the measurement
>     log from the bpf lsm, and some code in userspace picks the data up
>     from there and writes it to the usual measurement log.
>
> And then we should have a really nice, very comprehensive solution.
>
> Work towards making this a reality would be very welcome of course.
>
> (Full disclosure: you can use IMA today to measure all dm-verity root
> hashes into the IMA logs, but I personally am not a fan of IMA. It's a
> complex beast with so many features I find quite questionable today
> that I'd rather have a much, much simpler lsm-bpf as an alternative, one
> that just does this one thing and nothing else. IMA keeps its logs in
> kernel memory, unbounded, with no mechanism for rotation, which I
> personally find a complete dealbreaker.)
>
> So much for my current ideas regarding all this.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
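
For readers following along, here is a minimal sketch of the extend semantics behind point 1 above, i.e. why a PCR that only receives boot-time measurements "settles" while one that receives a measurement per sysext activation keeps changing. It is a plain software model, not systemd or TPM code. The names (extend, replay) and the sample log entries are made up for illustration, and hashing the data before extending is a simplification of what a real TPM interface does.

```python
import hashlib

# Size of a PCR in the SHA-256 bank (32 bytes).
PCR_SIZE = hashlib.sha256().digest_size


def extend(pcr: bytes, data: bytes) -> bytes:
    """Model of an extend operation: new = H(old || H(data))."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


def replay(log: list[bytes]) -> bytes:
    """Recompute the expected PCR value by replaying a measurement log."""
    pcr = bytes(PCR_SIZE)  # PCRs of this kind start out as all zeroes
    for entry in log:
        pcr = extend(pcr, entry)
    return pcr


# Hypothetical boot-time measurements: they happen once, so the value settles.
boot_log = [b"kernel", b"initrd"]

# Hypothetical runtime activations: every sysext (re)activation adds an entry,
# so the resulting value depends on the whole history and its order.
runtime_log = boot_log + [b"sysext-a root hash", b"sysext-b root hash"]

if __name__ == "__main__":
    print("after boot:   ", replay(boot_log).hex())
    print("after sysexts:", replay(runtime_log).hex())
```

A verifier that holds the log can replay it and compare the result against a quoted value, which is what makes such a series useful for attestation. A policy bound to one fixed value would break on the next activation, which matches the argument for keeping these measurements out of the existing PCRs.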