u/mmarshall540

▲ 3 r/btrfs

Unable to boot due to btrfs errors that apparently started during a resume from suspend

I'm using btrfs in a RAID1 setup made up of two NVMe drives, 2 TB each. One of the drives is mounted directly on the motherboard. The other is connected using a PCIe adapter card.

So I just updated my Debian desktop system and restarted the computer. As the system rebooted, I noticed a BTRFS error on the console along the lines of "error writing primary super block". I crossed my fingers and hoped it would not be a serious problem. But of course, the system would not boot after that.

I assumed (incorrectly, as explained below) that it must have something to do with the Debian update, but I couldn't find any similar reports online. So I rebooted using SystemRescue. Fortunately, I was able to mount one of the drives read-only and am currently backing up everything from the @home volume in case things get worse.

Have not attempted to do anything with btrfs tools yet, because I want to understand the situation better before attempting to fix anything. But I was able to use journalctl -D to review the logs.
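
For readers who want to review an offline journal the same way, here is a minimal Python sketch that builds a journalctl command restricted to kernel messages. The journal path `/mnt/var/log/journal` and the time window are hypothetical placeholders, not values taken from this system, and the helper name `kernel_journal_cmd` is made up for illustration. It relies only on the standard journalctl options `-D`, `--since`, `--until`, `--no-pager`, `-o short`, and the `_TRANSPORT=kernel` field match.

```python
import shlex


def kernel_journal_cmd(journal_dir, since=None, until=None):
    """Build a journalctl command that reads an offline journal directory
    and keeps only messages that came from the kernel."""
    cmd = ["journalctl", "-D", journal_dir, "--no-pager", "-o", "short"]
    if since:
        cmd += ["--since", since]
    if until:
        cmd += ["--until", until]
    # Field match: only entries logged via the kernel transport.
    cmd.append("_TRANSPORT=kernel")
    return cmd


# Hypothetical path and time window, for illustration only.
example = kernel_journal_cmd(
    "/mnt/var/log/journal", since="yesterday", until="today"
)
print(shlex.join(example))
```

The returned list can be passed to `subprocess.run` on a machine where the old root filesystem is mounted; the output can then be searched for `BTRFS` and `nvme` lines.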

I was shocked to find a continuous stream of btrfs errors going back to yesterday morning, long before the Debian update. It seems to have begun during a resume from suspend. (I only recently enabled the suspend option in GNOME, figuring it would be good to reduce electricity use when not using the computer.)

Any suggestions for safely fixing this and preventing it in the future?

Here are some of the journalctl messages. Will provide more if needed.

First btrfs errors from yesterday:

May 15 10:55:14 vader kernel: nvme nvme1: Device not ready; aborting initialisation, CSTS=0x0
May 15 10:55:14 vader kernel: nvme nvme1: Disabling device after reset failure: -19
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 1, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 2, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 3, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 4, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 6, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 5, rd 4, flush 0, corrupt 0, gen 0
May 15 10:55:14 vader systemd[1]: Started systemd-rfkill.service - Load/Save RF Kill Switch Status.
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader systemd[1]: user.slice: Unit now thawed.
May 15 10:55:14 vader systemd[1]: user-1000.slice: Unit now thawed.
May 15 10:55:14 vader systemd[1]: user@1000.service: Unit now thawed.
May 15 10:55:14 vader systemd[1]: session-94.scope: Unit now thawed.
May 15 10:55:14 vader systemd-sleep[438122]: Successfully thawed unit 'user.slice'.
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader dbus-daemon[963]: [system] Rejected send message, 0 matched rules; type="error", sender=":1.3274" (uid=1000 pid=397855 comm="/usr/bi>
May 15 10:55:14 vader systemd[1]: systemd-suspend.service: Deactivated successfully.
May 15 10:55:14 vader systemd[1]: Finished systemd-suspend.service - System Suspend.
May 15 10:55:14 vader systemd[1]: Stopped target sleep.target - Sleep.
May 15 10:55:14 vader systemd[1]: Reached target suspend.target - Suspend.
May 15 10:55:14 vader systemd[1]: Starting grub-common.service - Record successful boot for GRUB...
May 15 10:55:14 vader systemd[1]: Stopped target suspend.target - Suspend.
May 15 10:55:14 vader systemd-logind[1292]: Operation 'suspend' finished.
May 15 10:55:14 vader NetworkManager[1172]: <info>  [1778842514.2639] manager: sleep: wake requested (sleeping: yes  enabled: yes)
May 15 10:55:14 vader ModemManager[1083]: <msg> [sleep-monitor-systemd] system is resuming
May 15 10:55:14 vader NetworkManager[1172]: <info>  [1778842514.2648] device (enp5s0): state change: unmanaged -> unavailable (reason 'managed', managed-t>
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: Generic FE-GE Realtek PHY r8169-0-500:00: attached PHY driver (mii_bus:phy_addr=r8169-0-500:00, irq=MAC)
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 10:55:14 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2

From there, I see a continuous stream of errors that lasts all the way until reboot earlier today. They are various combinations of the following lines:

May 15 11:13:03 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 11:13:03 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 11:13:03 vader kernel: BTRFS warning (device nvme0n1p3): lost super block write due to IO error on /dev/nvme1n1p1 (-5)
May 15 11:13:03 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2
May 15 11:13:17 vader kernel: btrfs_dev_stat_inc_and_print: 15 callbacks suppressed
May 15 11:13:17 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 557066, rd 12930, flush 593, corrupt 0, gen 0
May 15 11:13:17 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 557067, rd 12930, flush 593, corrupt 0, gen 0
May 15 11:13:17 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 557068, rd 12930, flush 593, corrupt 0, gen 0
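
To summarize a long stream of these counter lines, a small parser can help. The following is a minimal Python sketch, assuming the log lines follow the `bdev ... errs: wr N, rd N, flush N, corrupt N, gen N` format shown above; the function name `peak_counters` is made up for illustration. It keeps the highest value seen per counter rather than the last one, because lines can appear slightly out of order (the excerpt above shows `wr 6` logged before `wr 5`).

```python
import re

# Matches the per-device error counters that btrfs prints to the kernel log.
DEV_STAT = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def peak_counters(lines):
    """Return {device: {counter: highest value seen}} for matching lines."""
    peaks = {}
    for line in lines:
        m = DEV_STAT.search(line)
        if not m:
            continue  # not a device-stat line
        counts = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
        current = peaks.setdefault(m["dev"], dict.fromkeys(counts, 0))
        for name, value in counts.items():
            current[name] = max(current[name], value)
    return peaks


# Lines copied from the journal excerpts in this post.
SAMPLE = [
    "May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 6, rd 4, flush 0, corrupt 0, gen 0",
    "May 15 10:55:14 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 5, rd 4, flush 0, corrupt 0, gen 0",
    "May 15 11:13:03 vader kernel: BTRFS error (device nvme0n1p3): error writing primary super block to device 2",
    "May 15 11:13:17 vader kernel: BTRFS error (device nvme0n1p3): bdev /dev/nvme1n1p1 errs: wr 557068, rd 12930, flush 593, corrupt 0, gen 0",
]

print(peak_counters(SAMPLE))
```

In these excerpts only the `wr`, `rd`, and `flush` counters climb while `corrupt` and `gen` stay at 0, and every counter line refers to /dev/nvme1n1p1, the same device the kernel reported disabling after the reset failure.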

Thanks in advance for any help.

u/mmarshall540 — 6 days ago