mirror of https://github.com/Zygo/bees.git (synced 2025-05-17 21:35:45 +02:00)
docs: update kernel bugs and workarounds list for 6.2.0
Remove some of the repetition to make the document easier to edit.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
parent 3430f16998
commit 3d5ebe4d40
@@ -7,23 +7,24 @@ First, a warning that is not specific to bees:
 severe regression that can lead to fatal metadata corruption.**
 This issue is fixed in kernel 5.4.14 and later.
 
-**Recommended kernel versions for bees are 4.19, 5.4, 5.10, 5.11, or 5.12,
-with recent LTS and -stable updates.** The latest released kernel as
-of this writing is 5.18.18.
+**Recommended kernel versions for bees are 4.19, 5.4, 5.10, 5.11, 5.15,
+6.0, or 6.1, with recent LTS and -stable updates.** The latest released
+kernel as of this writing is 6.2.0.
 
-4.14, 4.9, and 4.4 LTS kernels with recent updates are OK with
-some issues. Older kernels will be slower (a little slower or a lot
-slower depending on which issues are triggered). Not all fixes are
-backported.
+4.14, 4.9, and 4.4 LTS kernels with recent updates are OK with some
+issues. Older kernels will be slower (a little slower or a lot slower
+depending on which issues are triggered). Not all fixes are backported.
 
 Obsolete non-LTS kernels have a variety of unfixed issues and should
 not be used with btrfs. For details see the table below.
 
 bees requires btrfs kernel API version 4.2 or higher, and does not work
-on older kernels.
+at all on older kernels.
 
-bees will detect and use btrfs kernel API up to version 4.15 if present.
-In some future bees release, this API version may become mandatory.
+Some bees features rely on kernel 4.15 to work, and these features will
+not be available on older kernels. Currently, bees is still usable on
+older kernels with degraded performance or with options disabled, but
+support for older kernels may be removed.
 
 
 
@@ -65,7 +66,8 @@ These bugs are particularly popular among bees users, though not all are specifi
 | - | 5.17 | crash during device removal can make filesystem unmountable | 5.15.54, 5.16.20, 5.17.3, 5.18 and later | bbac58698a55 btrfs: remove device item and update super block in the same transaction
 | - | 5.18 | wrong superblock num_devices makes filesystem unmountable | 4.14.283, 4.19.247, 5.4.198, 5.10.121, 5.15.46, 5.17.14, 5.18.3, 5.19 and later | d201238ccd2f btrfs: repair super block num_devices automatically
 | 5.18 | 5.19 | parent transid verify failed during log tree replay after a crash during a rename operation | 5.18.18, 5.19.2, 6.0 and later | 723df2bcc9e1 btrfs: join running log transaction when logging new name
-| 5.4 | - | kernel hang when multiple threads are running `LOGICAL_INO` and dedupe ioctl | - | workaround: reduce bees thread count to 1 with `-c1`
+| 5.12 | 6.0 | space cache corruption and potential double allocations | 5.15.65, 5.19.6, 6.0 and later | ced8ecf026fd btrfs: fix space cache corruption and potential double allocations
+| 5.4 | - | kernel hang when multiple threads are running `LOGICAL_INO` and dedupe ioctl on the same extent | - | workaround: avoid doing that
 
 "Last bad kernel" refers to that version's last stable update from
 kernel.org. Distro kernels may backport additional fixes. Consult
@@ -80,21 +82,45 @@ through 5.4.13 inclusive.
 A "-" for "first bad kernel" indicates the bug has been present since
 the relevant feature first appeared in btrfs.
 
-A "-" for "last bad kernel" indicates the bug has not yet been fixed as
-of 5.18.18.
+A "-" for "last bad kernel" indicates the bug has not yet been fixed in
+current kernels (see top of this page for which kernel version that is).
 
 In cases where issues are fixed by commits spread out over multiple
 kernel versions, "fixed kernel version" refers to the version that
-contains all components of the fix.
+contains the last committed component of the fix.
 
 
 Workarounds for known kernel bugs
 ---------------------------------
 
-* **Hangs with high worker thread counts**: On kernels newer than
-5.4, multiple threads running `LOGICAL_INO` and dedupe ioctls
-at the same time can lead to a kernel hang. The workaround is
-to reduce the thread count to 1 with `-c1`.
+* **Hangs with concurrent `LOGICAL_INO` and dedupe**: on all
+kernel versions so far, multiple threads running `LOGICAL_INO`
+and dedupe ioctls at the same time on the same inodes or extents
+can lead to a kernel hang. The kernel enters an infinite loop in
+`add_all_parents`, where `count` is 0, `ref->count` is 1, and
+`btrfs_next_item` or `btrfs_next_old_item` never find a matching ref).
+
+bees has two workarounds for this bug: 1. schedule work so that multiple
+threads do not simultaneously access the same inode or the same extent,
+and 2. use a brute-force global lock within bees that prevents any
+thread from running `LOGICAL_INO` while any other thread is running
+dedupe.
+
+Workaround #1 isn't really a workaround, since we want to do the same
+thing for unrelated performance reasons. If multiple threads try to
+perform dedupe operations on the same extent or inode, btrfs will make
+all the threads wait for the same locks anyway, so it's better to have
+bees find some other inode or extent to work on while waiting for btrfs
+to finish.
+
+Workaround #2 doesn't seem to be needed after implementing workaround
+#1, but it's better to be slightly slower than to hang one CPU core
+and the filesystem until the kernel is rebooted.
+
+It is still theoretically possible to trigger the kernel bug when
+running bees at the same time as other dedupers, or other programs
+that use `LOGICAL_INO` like `btdu`; however, it's extremely difficult
+to reproduce the bug without closely cooperating threads.
 
 * **Slow backrefs** (aka toxic extents): Under certain conditions,
 if the number of references to a single shared extent grows too
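The two bees workarounds for the `LOGICAL_INO`/dedupe hang can be illustrated with a minimal Python sketch. It is not taken from bees, which is written in C++, and every name in it (`try_claim`, `release`, `run_ioctl`, `process`, `ioctl_lock`) is hypothetical. It assumes workaround #1 can be modeled as a set of claimed extents that a worker skips when another thread already holds one, and workaround #2 as one global lock held around each stand-in ioctl call. A single lock is stricter than the text requires, since it also serializes `LOGICAL_INO` against `LOGICAL_INO`.

```python
import threading

# All names here are hypothetical; this is not bees code.
ioctl_lock = threading.Lock()    # workaround #2: brute-force global lock
claims_lock = threading.Lock()   # guards the set of claimed extents
claimed = set()                  # extents/inodes some thread is working on


def try_claim(key):
    """Workaround #1: claim an extent, or return False if it is already taken."""
    with claims_lock:
        if key in claimed:
            return False
        claimed.add(key)
        return True


def release(key):
    """Give the extent back so another thread may work on it."""
    with claims_lock:
        claimed.discard(key)


def run_ioctl(name, extent, log):
    """Stand-in for a LOGICAL_INO or dedupe ioctl; only one runs at a time."""
    with ioctl_lock:
        log.append((name, extent))


def process(extent, log):
    """Look up refs, then dedupe, unless another thread holds this extent."""
    if not try_claim(extent):
        return False             # caller should pick other work instead of waiting
    try:
        run_ioctl("LOGICAL_INO", extent, log)
        run_ioctl("dedupe", extent, log)
        return True
    finally:
        release(extent)
```

A worker that gets `False` back from `process` would move on to a different extent rather than wait, which matches the scheduling rationale given for workaround #1.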
@@ -110,8 +136,8 @@ Workarounds for known kernel bugs
 at this time of writing only bees has a workaround for this bug.
 
 This workaround is less necessary for kernels 5.4.96, 5.7 and later,
-though it can still take 2 ms of CPU to resolve each extent ref on a
-fast machine on a large, heavily fragmented file.
+though the bees workaround can still be triggered on newer kernels
+by changes in btrfs since kernel version 5.1.
 
 * **dedupe breaks `btrfs send` in old kernels**. The bees option
 `--workaround-btrfs-send` prevents any modification of read-only subvols
@@ -127,8 +153,6 @@ Workarounds for known kernel bugs
 Unfixed kernel bugs
 -------------------
 
-As of 5.18.18:
-
 * **The kernel does not permit `btrfs send` and dedupe to run at the
 same time**. Recent kernels no longer crash, but now refuse one
 operation with an error if the other operation was already running.