
fsck during boot doesn't work
In progress, Normal, Public, BUG

Description

I am trying to run fsck on the partition where VyOS is installed, but every attempt fails.

  1. Adding "fsck.mode=force fsck.repair=yes" to the linux boot string in /boot/grub/grub.cfg and rebooting - failed!
  2. Adding an empty file named "forcefsck" to the root, like so: "sudo touch /forcefsck", and rebooting - failed!
  3. Altering the partition itself with "sudo tune2fs -c 1 /dev/sda1", which changes the max mount count to 1 (so that fsck should run on every boot), and rebooting - failed!

Running tune2fs -l /dev/sda1 | grep -i check still reports the following (and today is 19 Aug):

Last checked:             Sun Jul  2 19:58:46 2023
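
To make this check easy to repeat after each reboot attempt, a small filter along the following lines could be used. It is only a sketch: the function name fsck_status is made up for illustration, and it assumes the field labels that tune2fs -l normally prints ("Mount count", "Maximum mount count", "Last checked").

```sh
#!/bin/sh
# Hypothetical helper: read `tune2fs -l` output on stdin and print only the
# fields that show whether a check has happened, as key=value lines.
fsck_status()
{
	awk '
		/^Mount count:/         { sub(/^Mount count:[ \t]*/, "");         print "mount_count=" $0 }
		/^Maximum mount count:/ { sub(/^Maximum mount count:[ \t]*/, ""); print "max_mount_count=" $0 }
		/^Last checked:/        { sub(/^Last checked:[ \t]*/, "");        print "last_checked=" $0 }
	'
}
```

Usage would be something like: tune2fs -l /dev/sda1 | fsck_status. If fsck had run at boot, last_checked should move to the boot time and mount_count should drop back to a low value.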

Layout of current storage:

root@vyos:/home/vyos# lsblk -f
NAME   FSTYPE   FSVER LABEL       UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
loop0  squashfs 4.0                                                          0   100% /usr/lib/live/mount/rootfs/1.4-rolling-202308170317.squashfs
sda                                                                                   
└─sda1 ext4     1.0   persistence 8feee3c2-8263-4220-a18d-23c2d0e646bb   71.4G     3% /usr/lib/live/mount/persistence/boot/1.4-rolling-202308170317/grub
                                                                                      /boot/grub
                                                                                      /boot
                                                                                      /opt/vyatta/etc/config
                                                                                      /usr/lib/live/mount/persistence
sr0

Details

Difficulty level
Unknown (require assessment)
Version
VyOS 1.4-rolling-202308170317
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

My guess is that a whole bunch of systemd pieces are missing inside the initramfs.
For example, systemd-fsck-root.service has Before=local-fs.target shutdown.target and ConditionPathIsReadWrite=!/, which (to me) suggests it should be run from inside the initramfs, before the root partition is mounted.

Nope, scrap the above. (Even though it would not surprise me if systemd were able to perform such tasks in the initramfs or elsewhere.)

I dissected the VyOS initrd and compared it with the initrd from vanilla Debian Bookworm. They are similar for the most part, and the fsck binaries are available inside the initrd. The difference is that VyOS has its own, quite special, boot process. The cmdline option boot=live bypasses almost everything from a 'normal' (i.e. boot=local) boot, including the filesystem checks.
Instead, boot=live runs the Live function from https://github.com/vyos/live-boot/blob/current/components/9990-main.sh - I think. And these scripts don't include any fsck support that I can see.
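
To see at a glance which of the relevant parameters a given boot actually received, something like the sketch below could help. The function name cmdline_fsck_params is hypothetical, and the list of parameters is limited to the ones mentioned in this ticket.

```sh
#!/bin/sh
# Hypothetical helper: print the boot= and fsck.* parameters found in the
# kernel command line given as the first argument, one per line.
# The body runs in a subshell so that `set -f` does not leak to the caller.
cmdline_fsck_params()
(
	set -f    # no pathname expansion while splitting the command line
	for arg in $1; do
		case "$arg" in
			boot=*|fsck.mode=*|fsck.repair=*)
				printf '%s\n' "$arg"
				;;
		esac
	done
)
```

Usage would be something like: cmdline_fsck_params "$(cat /proc/cmdline)". Seeing boot=live next to fsck.mode=force would match the situation described here: the fsck parameters reach the kernel command line, but the boot=live code path never acts on them.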

So the bug is that "boot=live" is being used when VyOS is installed to a hard drive?

Is the proper fix to change this to "boot=local"?

No, setting boot=local will run a completely different set of ("vanilla") boot scripts, which (I guess) will not set up the special mounts that VyOS requires, so you will end up in the initramfs with an error.

The fix to enable fsck at boot would probably be to have the Live function mentioned above use some of the fsck-related functions exposed by the "vanilla" boot scripts, or to write a new one.

I do think I've seen, in a comment on a ticket somewhere, a wish to completely rebuild the boot process.

Will attempt to:

Available fsck binaries (and symlinks) from chroot/sbin:

lrwxrwxrwx 1 root root      8 feb  8  2021 dosfsck -> fsck.fat
-rwxr-xr-x 1 root root 356624 mar  5  2023 e2fsck
-rwxr-xr-x 1 root root  55664 mar 23  2023 fsck
-rwxr-xr-x 1 root root  43384 mar 23  2023 fsck.cramfs
-rwxr-xr-x 1 root root  56416 okt 28  2022 fsck.exfat
lrwxrwxrwx 1 root root      6 mar  5  2023 fsck.ext2 -> e2fsck
lrwxrwxrwx 1 root root      6 mar  5  2023 fsck.ext3 -> e2fsck
lrwxrwxrwx 1 root root      6 mar  5  2023 fsck.ext4 -> e2fsck
-rwxr-xr-x 1 root root  84208 feb  8  2021 fsck.fat
-rwxr-xr-x 1 root root 125272 mar 23  2023 fsck.minix
lrwxrwxrwx 1 root root      8 feb  8  2021 fsck.msdos -> fsck.fat
lrwxrwxrwx 1 root root      8 feb  8  2021 fsck.vfat -> fsck.fat

I chose to include most of them (leaving out fsck.cramfs and fsck.minix, since the host is unlikely to use those filesystems for the storage VyOS boots from) in data/live-build-config/includes.chroot/etc/initramfs-tools/hooks/10-vyos-addons:

# missing fsck in initramfs
copy_exec /sbin/dosfsck
copy_exec /sbin/e2fsck
copy_exec /sbin/fsck
copy_exec /sbin/fsck.exfat
copy_exec /sbin/fsck.ext2
copy_exec /sbin/fsck.ext3
copy_exec /sbin/fsck.ext4
copy_exec /sbin/fsck.fat
copy_exec /sbin/fsck.msdos
copy_exec /sbin/fsck.vfat
copy_exec /sbin/logsave
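
After rebuilding the image, it is worth confirming that the helpers really ended up inside the initramfs. The sketch below assumes a file listing with one path per line, such as lsinitramfs prints; the function name missing_fsck_helpers and the short list of wanted names are illustrative only.

```sh
#!/bin/sh
# Hypothetical helper: read an initramfs file listing (one path per line) on
# stdin and print the names of wanted fsck helpers that are NOT in it.
missing_fsck_helpers()
{
	# Keep only the last path component of every line (usr/sbin/fsck -> fsck).
	names=$(sed 's|.*/||')
	for name in fsck e2fsck fsck.ext4 fsck.fat logsave; do
		# -x: whole-line match, -F: fixed string, so the dot is literal.
		printf '%s\n' "$names" | grep -qxF "$name" || printf '%s\n' "$name"
	done
}
```

Usage would be something like: lsinitramfs /path/to/initrd.img | missing_fsck_helpers, where the initrd path is a placeholder. Empty output means every name in the list was found.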

Also adding data/live-build-config/includes.chroot/etc/initramfs-tools/initramfs.conf with the following changes from the default:

COMPRESS=zstd
# COMPRESSLEVEL=3

to:

COMPRESS=xz
COMPRESSLEVEL=9

Ref:

https://forum.armbian.com/topic/11207-include-fsck-on-a-init-ramdisk-espressobin/?do=findComment&comment=84245

https://community.centminmod.com/threads/round-3-compression-comparison-benchmarks-zstd-vs-brotli-vs-pigz-vs-bzip2-vs-xz-etc.17259/

https://www.phoronix.com/news/Ubuntu-2023-Initramfs-Compress

It turns out that packages/linux-kernel/arch/x86/configs/vyos_defconfig doesn't include xz as an option for the initrd:

CONFIG_RD_GZIP=y
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
CONFIG_RD_ZSTD=y
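
A mismatch like this can be caught before building by checking the kernel config against the intended COMPRESS value. The following is a sketch only: the function name rd_compress_supported is made up, and it assumes the algorithm name maps directly onto the CONFIG_RD_<NAME> symbol, which holds for gzip, xz and zstd but not for every value (lzop, for example, corresponds to CONFIG_RD_LZO).

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the kernel config file in $2
# enables initrd decompression for the algorithm named in $1.
# Assumes a direct name mapping, e.g. zstd -> CONFIG_RD_ZSTD.
rd_compress_supported()
{
	alg=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
	# -x: the whole line must match, so "# CONFIG_RD_XZ is not set" is ignored.
	grep -qx "CONFIG_RD_${alg}=y" "$2"
}
```

Usage would be something like: rd_compress_supported xz packages/linux-kernel/arch/x86/configs/vyos_defconfig || echo "kernel cannot unpack an xz initrd".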

So data/live-build-config/includes.chroot/etc/initramfs-tools/initramfs.conf will be tried with the following settings instead:

COMPRESS=zstd
COMPRESSLEVEL=19

I have filed the missing kernel config as https://vyos.dev/T5640

As @twan mentioned previously...

Comparing VyOS init-scripts (content of initrd.img) with vanilla Debian 12.2 scripts we can see that:

  • The /init file is the same.
  • The /scripts/functions file is the same.
  • The /scripts/local file is the same.
  • The /scripts/live file differs - however the differences have nothing to do with fsck:

VyOS 1.5-rolling:

#!/bin/sh

#set -e

. /bin/live-boot

. /scripts/functions

mountroot ()
{
        # initramfs-tools entry point for live-boot is mountroot(); function
        Live
}

Debian 12.2:

# Live system filesystem mounting			-*- shell-script -*-

. /bin/live-boot

live_top()
{
	if [ "${live_top_used}" != "yes" ]; then
		[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/live-top"
		run_scripts /scripts/live-top
		[ "$quiet" != "y" ] && log_end_msg
	fi
	live_top_used=yes
}

live_premount()
{
	if [ "${live_premount_used}" != "yes" ]; then
		[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/live-premount"
		run_scripts /scripts/live-premount
		[ "$quiet" != "y" ] && log_end_msg
	fi
	live_premount_used=yes
}

live_bottom()
{
	if [ "${live_premount_used}" = "yes" ] || [ "${live_top_used}" = "yes" ]; then
		[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/live-bottom"
		run_scripts /scripts/live-bottom
		[ "$quiet" != "y" ] && log_end_msg
	fi
	live_premount_used=no
	live_top_used=no
}


mountroot()
{
	# initramfs-tools entry point for live-boot is mountroot(); function
	Live
}

mount_top()
{
	# Note, also called directly in case it's overridden.
	live_top
}

mount_premount()
{
	# Note, also called directly in case it's overridden.
	live_premount
}

mount_bottom()
{
	# Note, also called directly in case it's overridden.
	live_bottom
}

The fsck handling happens in the /scripts/local file, in the following functions:

local_mount_root()
{
        local_top
        if [ -z "${ROOT}" ]; then
                panic "No root device specified. Boot arguments must include a root= parameter."
        fi
        local_device_setup "${ROOT}" "root file system"
        ROOT="${DEV}"

        # Get the root filesystem type if not set
        if [ -z "${ROOTFSTYPE}" ] || [ "${ROOTFSTYPE}" = auto ]; then
                FSTYPE=$(get_fstype "${ROOT}")
        else
                FSTYPE=${ROOTFSTYPE}
        fi

        local_premount

        if [ "${readonly?}" = "y" ]; then
                roflag=-r
        else
                roflag=-w
        fi

        checkfs "${ROOT}" root "${FSTYPE}"

        # Mount root
        # shellcheck disable=SC2086
        if ! mount ${roflag} ${FSTYPE:+-t "${FSTYPE}"} ${ROOTFLAGS} "${ROOT}" "${rootmnt?}"; then
                panic "Failed to mount ${ROOT} as root file system."
        fi
}

local_mount_fs()
{
        read_fstab_entry "$1"

        local_device_setup "$MNT_FSNAME" "$1 file system"
        MNT_FSNAME="${DEV}"

        local_premount

        if [ "${readonly}" = "y" ]; then
                roflag=-r
        else
                roflag=-w
        fi

        if [ "$MNT_PASS" != 0 ]; then
                checkfs "$MNT_FSNAME" "$MNT_DIR" "${MNT_TYPE}"
        fi

        # Mount filesystem
        if ! mount ${roflag} -t "${MNT_TYPE}" -o "${MNT_OPTS}" "$MNT_FSNAME" "${rootmnt}${MNT_DIR}"; then
                panic "Failed to mount ${MNT_FSNAME} as $MNT_DIR file system."
        fi
}

mountroot()
{
        local_mount_root
}

So, most likely, to make fsck work properly during boot in VyOS (with fsck.mode=force fsck.repair=yes in /boot/grub/grub.cfg), the content of the local_mount_root() function in scripts/local must be ported into the mountroot() function in scripts/live, before "Live" is called.
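
As a rough illustration of that idea (not a tested patch), the sketch below puts a checkfs call in front of Live inside mountroot(), using the same argument order as the checkfs call in local_mount_root(). VYOS_FSCK_DEV is a hypothetical variable: live-boot locates its medium itself, so there is no root= device to reuse, and how the persistence partition would be identified at this point is left open. The stub functions exist only so the snippet can be run outside an initramfs, where checkfs and get_fstype would come from /scripts/functions and Live from /bin/live-boot.

```sh
#!/bin/sh
# Stubs for running the sketch outside an initramfs. They are only defined
# when no command or function of that name already exists.
command -v get_fstype >/dev/null 2>&1 || get_fstype() { echo ext4; }
command -v checkfs    >/dev/null 2>&1 || checkfs()    { echo "checkfs $*"; }
command -v Live       >/dev/null 2>&1 || Live()       { echo "Live"; }

mountroot()
{
	# VYOS_FSCK_DEV is hypothetical: the device to check before live-boot
	# mounts anything. When it is unset or empty, behaviour is unchanged.
	if [ -n "${VYOS_FSCK_DEV:-}" ]; then
		fstype=$(get_fstype "${VYOS_FSCK_DEV}")
		checkfs "${VYOS_FSCK_DEV}" root "${fstype}"
	fi

	# initramfs-tools entry point for live-boot is mountroot(); function
	Live
}
```

The check is placed before Live because the filesystem should not be mounted while it is being checked. Whether a single call like this is enough, or whether more of local_mount_root() has to be carried over, would need testing on a real image.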

Viacheslav changed the task status from Open to In progress. Jan 20 2024, 1:09 PM
Viacheslav triaged this task as Normal priority.

Removed the assignee for now, in case somebody else wants to fix this.

Also, this seems to need more work than expected, since (back in Oct 2023) the rolling release was not based on the native live-boot from Debian but on a custom, broken one that is missing a lot of pieces, including the ability to run fsck during boot.

I suggest including the fixes from https://github.com/vyos/vyos-build/pull/435 in future attempts to fix this.

Most of the issues are probably fixed by using the live-boot that ships with Debian and adding only the few modifications that are unique to VyOS, along with a conf.d/10-vyos-compression file to maximize initramfs compression, such as:

COMPRESS=zstd
COMPRESSLEVEL=19

Also make sure that these fsck binaries are included in the initramfs (they should be by default, and some of the original ones could be removed depending on which partition types VyOS is expected to support):

copy_exec /sbin/dosfsck
copy_exec /sbin/e2fsck
copy_exec /sbin/fsck
copy_exec /sbin/fsck.exfat
copy_exec /sbin/fsck.ext2
copy_exec /sbin/fsck.ext3
copy_exec /sbin/fsck.ext4
copy_exec /sbin/fsck.fat
copy_exec /sbin/fsck.msdos
copy_exec /sbin/fsck.vfat
copy_exec /sbin/logsave