From e1476260e16332a9cb23a9345a5fd9936b409eba Mon Sep 17 00:00:00 2001
From: Zygo Blaxell
Date: Fri, 24 May 2019 11:16:21 -0400
Subject: [PATCH] docs: update kernel compatibility page, now recommending 5.0.4

* comprehensive list of kernels with bees-triggered corruption bug fixes
* deadlock between dedupe and rename is now fixed (in some places)
* compressed data corruption is now fixed (in more places)
* btrfs send fix for one bug is now merged in 5.2-rc1, another bug remains
* retired the bcache/lvmcache bug (can't reproduce those bugs any more,
  although I *can* reproduce an interesting non-destructive bcache bug)
* new minor bug entries for two harmless kernel warnings
* new entry for storm-of-soft-lockups

Fixes: https://github.com/Zygo/bees/issues/107
Signed-off-by: Zygo Blaxell
---
 docs/btrfs-kernel.md | 178 +++++++++++++++++++++++++++++++------------
 1 file changed, 131 insertions(+), 47 deletions(-)

diff --git a/docs/btrfs-kernel.md b/docs/btrfs-kernel.md
index 7df5479..48c86c0 100644
--- a/docs/btrfs-kernel.md
+++ b/docs/btrfs-kernel.md
@@ -1,42 +1,81 @@
 Recommended kernel version
 ==========================
 
-Linux **4.14.34** or later.
+Currently 5.0.4, 5.1, and *chronologically* later versions are recommended
+to avoid all currently known and fixed kernel issues and obtain best
+performance. Older kernel versions can be used with bees with some
+caveats (see below).
 
-A Brief List Of Btrfs Kernel Bugs
+All unmaintained kernel trees (those which do not receive -stable updates)
+should be avoided due to potential data corruption bugs.
+
+**Kernels older than 4.2 cannot run bees at all** due to missing features.
+
+DATA CORRUPTION WARNING
+-----------------------
+
+There is a data corruption bug in older Linux kernel versions that can
+be triggered by bees. The bug can be triggered in other ways, but bees
+will trigger it especially often.
+
+This bug is **fixed** in the following kernel versions:
+
+* **5.1 or later** versions.
+
+* **5.0.4 or later 5.0.y** versions.
+
+* **4.19.31 or later 4.19.y** LTS versions.
+
+* **4.14.108 or later 4.14.y** LTS versions.
+
+* **4.9.165 or later 4.9.y** LTS versions.
+
+* **4.4.177 or later 4.4.y** LTS versions.
+
+* **v3.18.137 or later 3.18.y** LTS versions (note these versions cannot
+run bees).
+
+All older kernel versions (including 4.20.17, 4.18.20, 4.17.19, 4.16.18,
+4.15.18) have the data corruption bug.
+
+The commit that fixes the last known data corruption bug is
+8e928218780e2f1cf2f5891c7575e8f0b284fcce "btrfs: fix corruption reading
+shared and compressed extents after hole punching".
+
+
+Lockup/hang WARNING
+-------------------
+
+Kernel versions prior to 5.0.4 have a deadlock bug when file A is
+renamed to replace B while both files A and B are referenced in a
+dedupe operation. This situation may arise often while bees is running,
+which will make processes accessing the filesystem hang while writing.
+A reboot is required to recover. No data is lost when this occurs
+(other than unflushed writes due to the reboot).
+
+A common problem case is rsync receiving updates to large files when not
+in `--inplace` mode. If the file is sufficiently large, bees will start
+to dedupe the original file and rsync's temporary modified version of
+the file while rsync is still writing the modified version of the file.
+Later, when rsync renames the modified temporary file over the original
+file, the rename in rsync can occasionally deadlock with the dedupe
+in bees.
+
+This bug is **fixed** in the following kernel versions:
+
+* **5.1 or later** versions.
+
+* **5.0.4 or later 5.0.y** versions.
+
+The commit that fixes this bug is 4ea748e1d2c9f8a27332b949e8210dbbf392987e
+"btrfs: fix deadlock between clone/dedupe and rename".
+
+
+
+A Brief List Of btrfs Kernel Bugs
 ---------------------------------
 
-Recent kernel bug fixes:
-
-* 4.14.29: `WARN_ON(ref->count < 0)` in fs/btrfs/backref.c triggers
-  almost once per second. The `WARN_ON` is incorrect, and is now removed.
-
-Unfixed kernel bugs (as of 4.14.71):
-
-* **Bad _filesystem destroying_ interactions** with other Linux block
-  layers: `bcache` and `lvmcache` can fail spectacularly, and apparently
-  only do so while running bees. This is definitely a kernel bug,
-  either in btrfs or the lower block layers. **Avoid using bees with
-  these tools unless your filesystem is disposable and you intend to
-  debug the kernel.**
-
-* **Compressed data corruption** is possible when using the `fallocate`
-  system call to punch holes into compressed extents that contain long
-  runs of zeros. The [bug results in intermittent corruption during
-  reads](https://www.spinics.net/lists/linux-btrfs/msg81293.html), but
-  due to the bug, the kernel might sometimes mistakenly determine data
-  is duplicate, and deduplication will corrupt the data permanently.
-  This bug also affects compressed `kvm` raw images with the `discard`
-  feature on btrfs or any compressed file where `fallocate -d` or
-  `fallocate -p` has been used.
-
-* **Deadlock** when [simultaneously using the same files in dedupe and
-  `rename`](https://www.spinics.net/lists/linux-btrfs/msg81109.html).
-  There is no way for bees to reliably know when another process is
-  about to rename a file while bees is deduping it. In the `rsync` case,
-  bees will dedupe the new file `rsync` is creating using the old file
-  `rsync` is copying from, while `rsync` will rename the new file over
-  the old file to replace it.
+Unfixed kernel bugs (as of 5.0.21):
 
 Minor kernel problems with workarounds:
 
@@ -47,30 +86,75 @@ Minor kernel problems with workarounds:
   the kernel spends performing `LOGICAL_INO` operations and permanently
   blacklisting any extent or hash involved where the kernel starts
   to get slow. In the bees log, such blocks are labelled as 'toxic'
-  hash/block addresses.
+  hash/block addresses. Toxic extents are rare (about 1 in 100,000
+  extents become toxic), but toxic extents can become 8 orders of
+  magnitude more expensive to process than the fastest non-toxic
+  extents. This seems to affect all dedupe agents on btrfs; at this
+  time of writing only bees has a workaround for this bug.
 
-* **btrfs send** has various bugs that are triggered when bees is
+* **btrfs send** has bugs that are triggered when bees is
   deduping snapshots. bees provides the [`--workaround-btrfs-send`
   option](options.md) which should be used whenever `btrfs send` and
   bees are run on the same filesystem.
 
-  This issue affects:
-  * `btrfs send` (any mode) and bees active at the same time.
-  * `btrfs send` in incremental mode (using `-p` option) with bees
-    active at the same or different times.
+  Note `btrfs receive` is not affected, nor is any other btrfs operation
+  except `send`. It is OK to run bees with no workarounds on a filesystem
+  that receives btrfs snapshots.
 
-  Note `btrfs receive` is not affected. It is OK to run bees with no
-  workarounds on a filesystem that receives btrfs snapshots.
+  A fix for one problem has been [merged into kernel
+  5.2-rc1](https://github.com/torvalds/linux/commit/62d54f3a7fa27ef6a74d6cdf643ce04beba3afa7).
+  bees has not been updated to handle the new EAGAIN case optimally,
+  but the excess error messages that are produced are harmless.
+
+  The other problem is that [parent snapshots for incremental sends
+  are broken by bees](https://github.com/Zygo/bees/issues/115), even
+  when the snapshots are deduped while send is not running.
+
+* **btrfs send** also seems to have severe performance issues with
+  dedupe agents that produce toxic extents. bees has a workaround to
+  prevent this where possible.
 
 * **Systems with many CPU cores** may [lock up when bees runs with one
   worker thread for every core](https://github.com/Zygo/bees/issues/91).
   bees limits the number of threads it will try to create based on
   detected CPU core count. Users may override this limit with the
-  [`--thread-count` option](options.md).
+  [`--thread-count` option](options.md). It is possible this is the
+  same bug as the next one:
 
-Older kernels:
+* **Storm of Soft Lockups**, a bug that occurs when running the
+  `LOGICAL_INO` ioctl in a large number of threads, leads to a soft lockup
+  on all CPUs. Some details and analysis is available on [the btrfs
+  mailing list](https://www.spinics.net/lists/linux-btrfs/msg89326.html).
+  This occurs after hitting a BUG_ON in `fs/btrfs/ctree.c`:
 
-* Older kernels have various data corruption and deadlock/hang issues
-  that are no longer listed here, and older kernels are missing important
-  features such as `LOGICAL_INO_V2`. Using an older kernel is not
-  recommended.
+        switch (tm->op) {
+        case MOD_LOG_KEY_REMOVE_WHILE_FREEING:
+                BUG_ON(tm->slot < n);
+                /* Fallthrough */
+
+  The rate of incidence of this bug seems to depend on the total number
+  of bees threads running on the system, although occasionally other
+  processes such as `rsync` or `btrfs balance` are involved. A workaround
+  is to run only 1 bees thread, i.e. [`--thread-count=1`](options.md).
+
+* **Spurious warnings in `fs/fs-writeback.c`** on kernel 4.15 and later
+  when filesystem is mounted with `flushoncommit`. These
+  seem to be harmless (there are other locks which prevent
+  concurrent umount of the filesystem), but the underlying
+  problems that trigger the `WARN_ON` are [not trivial to
+  fix](https://www.spinics.net/lists/linux-btrfs/msg87752.html).
+  Workarounds:
+
+  1. mount with `-o noflushoncommit`
+  2. patch kernel to remove warning in `fs/fs-writeback.c`.
+
+  Note that using kernels 4.14 and earlier is *not* a viable workaround
+  for this issue, because kernels 4.14 and earlier will eventually
+  deadlock when a filesystem is mounted with `-o flushoncommit` (a single
+  commit fixes one bug and introduces the other).
+
+* **Spurious kernel warnings in `fs/btrfs/delayed-ref.c`** on 5.0.x.
+  This also seems harmless, but there have been [no comments
+  since this issue was reported to the `linux-btrfs` mailing
+  list](https://www.spinics.net/lists/linux-btrfs/msg89061.html).
+  Workaround: patch kernel to remove the warning.
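
Reviewer note: the fixed-version lists in the DATA CORRUPTION and Lockup/hang sections of this patch can be checked mechanically. The following is a minimal sketch in Python, not part of bees or of this patch. All function and table names are hypothetical, and it assumes an upstream-style release string such as the one returned by `uname -r`. Distribution kernels may backport fixes without changing the version number, and the page recommends *chronologically* later versions, so a version comparison like this is only a first approximation.

```python
import re

# Minimum patch level, per stable series, that contains the data corruption
# fix. Values are copied from the list added by this patch.
CORRUPTION_FIX_MIN_PATCH = {
    (5, 0): 4,
    (4, 19): 31,
    (4, 14): 108,
    (4, 9): 165,
    (4, 4): 177,
    (3, 18): 137,  # fixed, but too old to run bees
}


def parse_release(release):
    """Return (major, minor, patch) from a string like '4.19.31-generic'."""
    m = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def has_corruption_fix(release):
    """True if the release is in the 'fixed' list for the corruption bug."""
    major, minor, patch = parse_release(release)
    if (major, minor) >= (5, 1):
        return True
    min_patch = CORRUPTION_FIX_MIN_PATCH.get((major, minor))
    # Series missing from the table (4.20.y, 4.18.y, ...) never got the fix.
    return min_patch is not None and patch >= min_patch


def has_rename_deadlock_fix(release):
    """True if the release has the dedupe/rename deadlock fix (5.0.4, 5.1+)."""
    major, minor, patch = parse_release(release)
    return (major, minor) >= (5, 1) or ((major, minor) == (5, 0) and patch >= 4)


def can_run_bees(release):
    """Kernels older than 4.2 cannot run bees at all."""
    major, minor, _ = parse_release(release)
    return (major, minor) >= (4, 2)
```

On a running system the input could come from `platform.release()`, for example `has_corruption_fix(platform.release())`.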