[systemd-devel] repart: Value too large for defined data type
Thayne Harbaugh
thayne at mastodonlabs.com
Wed Apr 2 00:51:13 UTC 2025
On Thu, 2025-01-09 at 14:47 -0700, Thayne Harbaugh wrote:
I *finally* got back around to investigating the below EOVERFLOW
failure - additional details in-line:
> I have a mkosi build that is failing with the following message:
>
> >$ mkosi build
> ...
> /var/tmp/.#repart020c1a929b048b02 successfully formatted as ext4 (label "root", uuid c8ed6cd3-04b0-4667-8f2f-af9487b8b986)
> Automatically determined minimal disk image size as 2.2G.
> Sized '/home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw' to 2.2G.
> Applying changes to /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw.
> Failed to read file attributes of /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw: Value too large for defined data type
> ‣ "systemd-repart --empty=allow --size=auto --dry-run=no --json=pretty --no-pager --offline=yes --seed 1feaec73-f24d-454c-bd15-4f80926e951e /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw --root=/buildroot --empty=create --defer-partitions esp,xbootldr --generate-fstab=/etc/fstab --generate-crypttab=/etc/crypttab --definitions /source/mkosi.repart" returned non-zero exit code 1.
The above failure is specifically triggered by the following line in
repart.c:prepare_temporary_file():
    r = read_attr_fd(fdisk_get_devfd(context->fdisk_context), &attrs);
chattr-util.c:read_attr_fd() then executes the following line, which
is where the EOVERFLOW originates:
    return RET_NERRNO(ioctl(fd, FS_IOC_GETFLAGS, ret));
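To check whether the ioctl alone reproduces the error, independent of
systemd-repart, a minimal sketch along these lines could be run against
the image file both inside the container and on the host. The names
probe_getflags() and report_getflags() are hypothetical helpers made up
for this illustration and are not part of systemd; only the
FS_IOC_GETFLAGS ioctl and FS_NOCOW_FL come from the code quoted above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: issue the same ioctl that read_attr_fd() uses.
 * Returns 0 and stores the inode flags in *ret, or -errno on failure. */
int probe_getflags(const char *path, unsigned *ret) {
        assert(path);
        assert(ret);

        int fd = open(path, O_RDONLY | O_CLOEXEC | O_NOCTTY);
        if (fd < 0)
                return -errno;

        /* Capture errno before close() has a chance to overwrite it. */
        int r = ioctl(fd, FS_IOC_GETFLAGS, ret) < 0 ? -errno : 0;
        close(fd);
        return r;
}

/* Hypothetical helper: print either the flags or the error string. */
void report_getflags(const char *path) {
        unsigned attrs = 0;
        int r = probe_getflags(path, &attrs);

        if (r < 0)
                printf("%s: FS_IOC_GETFLAGS failed: %s\n", path, strerror(-r));
        else
                printf("%s: flags=0x%x NOCOW=%s\n", path, attrs,
                       (attrs & FS_NOCOW_FL) ? "yes" : "no");
}
```

Calling report_getflags() on the path of the .raw image from a small
main() should show whether "Value too large for defined data type" comes
from the file system under the workspace directory itself.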
> It is a build that has been running without problems for some time.
> Recently it has changed from incorporating systemd v256 to v257.
> When I switch it back to v256 it succeeds and does not error.
The repart.c:prepare_temporary_file() changed recently with this commit:
    commit b9c0b6c011fd0b30d3484d21d70cef6f5ae2fc0a
    Author: Daan De Meyer <daan.j.demeyer at gmail.com>
    Date:   Tue Jul 23 21:43:13 2024 +0200
        repart: Make partition files NOCOW if the disk image is NOCOW
    https://github.com/systemd/systemd/commit/b9c0b6c011f
> I have tried to narrow it further.
>
> * The build runs inside of a Docker container with the mkosi source
> tree mounted inside. The container is started with the
> following command:
>
> >$ docker run -it --privileged -v "/dir/to/source:/source" ubuntu:24.04
>
> * The container has mkosi 24.3 inside
>
> * The failure occurs with both erofs+best and ext4+guess
>
> * The mkosi configuration is the following:
>
> mkosi.conf
> ==========
>
> [Distribution]
> Distribution=ubuntu
> # noble == 24.04
> Release=noble
> Repositories=noble,noble-security,noble-updates
> Architecture=x86-64
>
> [Output]
> Format=disk
> ImageId=test
> ImageVersion=1.2.3
>
> [Content]
> RootPassword=tomato
> Bootable=yes
> Bootloader=systemd-boot
>
> Packages=
> linux-image-6.8.0-51-generic
>
> mkosi.repart/10-esp.conf
> ========================
>
> [Partition]
> Type=esp
> Format=vfat
> CopyFiles=/boot:/
> CopyFiles=/efi:/
> SizeMinBytes=2048M
>
> mkosi.repart/20-rootfs.conf
> ===========================
>
> [Partition]
> Type=root
> Label=root
> #Format=erofs
> #Minimize=best
> Format=ext4
> Minimize=guess
> MountPoint=/:ro
> CopyFiles=/:/
> ReadOnly=on
> Encrypt=key-file
> EncryptedVolume=root:/run/fscrypt.sock:luks,headless,x-initrd.attach
>
> While initially the failure seemed to correlate directly with systemd
> v257 packages being injected into mkosi.packages I have since been
> able to reproduce the failure with the upstream Ubuntu 24.04/noble
> v255 version of systemd.
>
> It seems that building on a mount inside of a Docker container is
> a necessary factor to cause the failure. While I have taken a quick
> glance at src/repart/repart.c in prepare_temporary_file(),
> context_split() and context_minimize() I have not done any serious
> digging yet.
I'm running Linux kernel 6.11.10 and Docker 26.1.5.
> Any ideas about the specifics of what causes this failure and what it
> will take to fix it?
It seems to me that this is related to this comment from the Linux
kernel source tree in include/uapi/linux/fs.h:
/*
 * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
 *
 * Note: for historical reasons, these flags were originally used and
 * defined for use by ext2/ext3, and then other file systems started
 * using these flags so they wouldn't need to write their own version
 * of chattr/lsattr (which was shipped as part of e2fsprogs). You
 * should think twice before trying to use these flags in new
 * contexts, or trying to assign these flags, since they are used both
 * as the UAPI and the on-disk encoding for ext2/3/4. Also, we are
 * almost out of 32-bit flags. :-)
 *
 * We have recently hoisted FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR from
 * XFS to the generic FS level interface. This uses a structure that
 * has padding and hence has more room to grow, so it may be more
 * appropriate for many new use cases.
 ...
 */
I'm uncertain which file system layer - or other translation layer -
introduced by Docker is causing the EOVERFLOW error. A quick scan
through the kernel code hints that inodes, uid/gid translations and a
few other possibilities can return EOVERFLOW.
Perhaps chattr-util.c:read_attr_fd() would be better implemented using
the newer FS_IOC_FSGETXATTR ioctl? Or maybe there's a different way to
detect FS_NOCOW_FL?
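As a way to test the first question, the following sketch issues
FS_IOC_FSGETXATTR against a path so the result can be compared with
FS_IOC_GETFLAGS on the same file. The helper name probe_fsgetxattr() is
hypothetical and not part of systemd. One caveat, stated as an
assumption from reading include/uapi/linux/fs.h: there does not appear
to be an FS_XFLAG_* bit that corresponds to FS_NOCOW_FL, so this only
shows whether the newer ioctl survives the same layers, not whether it
could replace the NOCOW check.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: query the generic fsxattr structure for a path.
 * Returns 0 and fills *ret on success, or -errno on failure. */
int probe_fsgetxattr(const char *path, struct fsxattr *ret) {
        assert(path);
        assert(ret);

        memset(ret, 0, sizeof(*ret));

        int fd = open(path, O_RDONLY | O_CLOEXEC | O_NOCTTY);
        if (fd < 0)
                return -errno;

        /* Capture errno before close() has a chance to overwrite it. */
        int r = ioctl(fd, FS_IOC_FSGETXATTR, ret) < 0 ? -errno : 0;
        close(fd);
        return r;
}
```

If this call succeeds where FS_IOC_GETFLAGS returns -EOVERFLOW, that
would point at the handling of the older ioctl in whichever layer sits
under the bind mount; if both fail the same way, the ioctl choice is
probably not the deciding factor.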
I'm continuing to poke at this. Please send me suggestions.