=======================================
FreeBSD zfs zpool will only boot from disk0
=======================================
ZFS can be configured to withstand multiple disk failures and still boot...
unless the first disk (disk0) is the one that fails.

Problem:         An installation on a multi-disk zpool (stripe/mirror/raidz)
                 with a partition scheme of GPT (UEFI) or GPT (BIOS+UEFI)
Primary Cause:   The EFI boot loader is only installed onto the first disk,
                 disk0 (ada0p1)
Secondary Cause: The /etc/fstab entry for the EFI msdosfs partition points at
                 the failed disk, so the mount hangs and the boot drops into
                 single-user mode
Remedy:          I think the best remedy (so far) is to mirror the EFI
                 partitions across all the disks

=======================================
Install of FreeBSD
- always mirror zroot (2 or more disks)
- always mirror swap YES
- force 4k sectors YES
- restart
- see what you get on a fresh system...
---------------------------------------
root@judo:~ # gpart show -l
=>      40  20479920  ada0  GPT  (9.8G)
        40    532480     1  efiboot0  (260M)
    532520      1024     2  gptboot0  (512K)
    533544       984        - free -  (492K)   <-- this free space is a mystery to me
    534528   8388608     3  swap0  (4.0G)
   8923136  11554816     4  zfs0  (5.5G)
  20477952      2008        - free -  (1.0M)   <-- this free space is also a mystery

=>      40  20479920  ada1  GPT  (9.8G)
        40    532480     1  efiboot1  (260M)
    532520      1024     2  gptboot1  (512K)
    533544       984        - free -  (492K)
    534528   8388608     3  swap1  (4.0G)
   8923136  11554816     4  zfs1  (5.5G)
  20477952      2008        - free -  (1.0M)

root@judo:~ # zpool status
  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p4  ONLINE       0     0     0
            ada1p4  ONLINE       0     0     0

errors: No known data errors

root@judo:~ # gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  ada0p3 (ACTIVE)
                       ada1p3 (ACTIVE)

root@judo:~ # cat /etc/fstab
# Device                Mountpoint   FStype   Options  Dump  Pass#
/dev/gpt/efiboot0       /boot/efi    msdosfs  rw       2     2   <-- mount (ada0p1) is necessary when upgrading
/dev/mirror/swap        none         swap     sw       0     0

root@judo:~ # mount | grep boot
/dev/gpt/efiboot0 on /boot/efi (msdosfs, local)

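The fstab check above can be scripted. What follows is only a rough sketch, not something from the install itself: efi_mount_source and classify_efi_source are made-up helper names, and the rule "anything under /dev/mirror/ counts as mirrored" is my assumption based on the device names in this write-up. It reads fstab-formatted text on stdin, so the example feeds it the fresh-install fstab instead of touching the real file.

```shell
#!/bin/sh
# Sketch: find which device backs /boot/efi in fstab-formatted input and
# say whether it is a single-disk device or a gmirror device.
# Both function names are hypothetical helpers, not FreeBSD tools.

efi_mount_source() {
    # Skip comment lines; print the device field of the /boot/efi entry.
    awk '$1 !~ /^#/ && $2 == "/boot/efi" { print $1 }'
}

classify_efi_source() {
    # Assumption: gmirror devices live under /dev/mirror/, as in this note.
    case "$1" in
        /dev/mirror/*) echo "mirrored" ;;
        "")            echo "not-mounted" ;;
        *)             echo "single-disk" ;;
    esac
}

# Example input: the fstab from the fresh install.
dev=$(printf '%s\n' \
    '# Device Mountpoint FStype Options Dump Pass#' \
    '/dev/gpt/efiboot0 /boot/efi msdosfs rw 2 2' \
    '/dev/mirror/swap none swap sw 0 0' | efi_mount_source)
echo "$dev: $(classify_efi_source "$dev")"
```

On a live system the same check would be `efi_mount_source < /etc/fstab`. A "single-disk" answer means the boot depends on that one disk being present.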
=======================================
GENTLY SIMULATE A DISK0 FAILURE (IF YOU WANT TO SEE IT FAIL TO BOOT)
- shutdown power off
- disconnect disk0
- restart - the system will fail to boot
- shutdown power off
- reconnect disk0
- restart
---------------------------------------

=======================================
Mirror the EFI partitions to fix the problem
---------------------------------------
root@judo:~ # umount /boot/efi

root@judo:~ # gmirror label -v efiboot /dev/ada0p1 /dev/ada1p1
Metadata value stored on /dev/ada0p1.
Metadata value stored on /dev/ada1p1.
Done.

root@judo:~ # ls -l /dev/mirror
total 0
crw-r-----  1 root operator 0x61 Jun  1 21:29 efiboot
crw-r-----  1 root operator 0x63 Jun  1 21:16 swap

root@judo:~ # newfs_msdos /dev/mirror/efiboot
newfs_msdos: cannot get number of sectors per track: Operation not supported
newfs_msdos: cannot get number of heads: Operation not supported
/dev/mirror/efiboot: 532288 sectors in 16634 FAT16 clusters (16384 bytes/cluster)
BytesPerSec=512 SecPerClust=32 ResSectors=1 FATs=2 RootDirEnts=512 Media=0xf0 FATsecs=65 SecPerTrack=63 Heads=16 HiddenSecs=0 HugeSectors=532479

root@judo:~ # mkdir /tmp/mirror
root@judo:~ # mount_msdosfs /dev/mirror/efiboot /tmp/mirror
root@judo:~ # mkdir -p /tmp/mirror/efi/boot
root@judo:~ # cp /boot/loader.efi /tmp/mirror/efi/boot/bootx64.efi
root@judo:~ # umount /tmp/mirror

root@judo:~ # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 /dev/ada1
partcode written to ada1p2
bootcode written to ada1

=======================================
Edit /etc/fstab
---------------------------------------
root@judo:~ # cat /etc/fstab
# Device                Mountpoint   FStype   Options  Dump  Pass#
# original
# /dev/gpt/efiboot0     /boot/efi    msdosfs  rw       2     2   <-- adding options "rw,late" still hangs if ada0 has failed
/dev/mirror/efiboot     /boot/efi    msdosfs  rw       2     2   <-- mounting the mirror is better
/dev/mirror/swap        none         swap     sw       0     0

root@judo:~ # mount -a

root@judo:~ # mount | grep boot
/dev/mirror/efiboot on /boot/efi (msdosfs, local)

root@judo:~ # gmirror status
          Name    Status  Components
   mirror/swap  COMPLETE  ada0p3 (ACTIVE)
                          ada1p3 (ACTIVE)
mirror/efiboot  COMPLETE  ada0p1 (ACTIVE)
                          ada1p1 (ACTIVE)

=======================================
CONFIRM THAT THE SYSTEM BOOTS AND WORKS PERFECTLY
---------------------------------------
root@judo:~ # shutdown -r now

=======================================
GENTLY SIMULATE A DISK0 FAILURE TEST (IF YOU WANT TO CONFIRM THIS SOLUTION WORKS)
- shutdown
- remove disk0
- restart - the system should boot successfully degraded
---------------------------------------
root@judo:~ # zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 280K in 00:00:00 with 0 errors on Sat Jun  1 20:31:00 2024
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            ada0p4  FAULTED      0     0     0  corrupted data
            ada0p4  ONLINE       0     0     0

errors: No known data errors

root@judo:~ # zpool status -g
  pool: zroot
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 280K in 00:00:00 with 0 errors on Sat Jun  1 20:31:00 2024
config:

        NAME                      STATE     READ WRITE CKSUM
        zroot                     DEGRADED     0     0     0
          16653767519316097051    DEGRADED     0     0     0
            12833965069643490787  FAULTED      0     0     0  corrupted data
            7501003034083070706   ONLINE       0     0     0

errors: No known data errors

root@judo:~ # mount | grep boot
/dev/mirror/efiboot on /boot/efi (msdosfs, local)

root@judo:~ # gmirror status
          Name    Status  Components
mirror/efiboot  DEGRADED  ada0p1 (ACTIVE)
   mirror/swap  DEGRADED  ada0p3 (ACTIVE)

=======================================
Works!
=======================================
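
The walkthrough above used two disks. For a pool with more, the same steps could be generated per disk. The following is a dry-run sketch only: it prints the commands and runs none of them. efi_mirror_plan is a made-up helper name, and it assumes every disk has the installer layout from `gpart show -l` (partition 1 = efiboot, partition 2 = gptboot). I have only tried the procedure by hand on the two-disk mirror shown here, so treat the three-disk output as untested.

```shell
#!/bin/sh
# Dry run only: prints the commands from this note, does not execute them.
# efi_mirror_plan is a hypothetical helper name. Assumed layout on every
# disk: partition 1 = efiboot, partition 2 = gptboot.

efi_mirror_plan() {
    if [ "$#" -lt 2 ]; then
        echo "usage: efi_mirror_plan disk0 disk1 [disk2 ...]" >&2
        return 1
    fi

    # Build the provider list for gmirror: one efiboot partition per disk.
    providers=""
    for disk in "$@"; do
        providers="$providers /dev/${disk}p1"
    done

    echo "umount /boot/efi"
    echo "gmirror label -v efiboot$providers"
    echo "newfs_msdos /dev/mirror/efiboot"
    echo "mkdir /tmp/mirror"
    echo "mount_msdosfs /dev/mirror/efiboot /tmp/mirror"
    echo "mkdir -p /tmp/mirror/efi/boot"
    echo "cp /boot/loader.efi /tmp/mirror/efi/boot/bootx64.efi"
    echo "umount /tmp/mirror"

    # As in the walkthrough, boot code is written to every disk but the first.
    shift
    for disk in "$@"; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 /dev/$disk"
    done
}

efi_mirror_plan ada0 ada1 ada2
```

Review the printed plan before running anything by hand. The /etc/fstab edit is left out on purpose, since that step is easier to do in an editor.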