Emphasize that the option is relevant to old kernels, older than the
minimum supportable version threshold.
De-emphasize the use case of "send-workaround" as a synonym for "exclude
read-only".
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
One of the more obvious ways to reduce bees load is to simply not run
it all the time. Explicitly state using maintenance windows as a load
management option.
SIGUSR1 and SIGUSR2 should have been documented somewhere else before now.
Better late than never.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The theories behind bees slowing down when presented with a larger hash
table turned out to be wrong. The real cause was a very old bug which
submitted thousands of `LOGICAL_INO` requests when only a handful of
requests were needed.
"Compression on the filesystem" -> "Compression in files"
Don't be so "dramatic". Be "rapid" instead.
Remove "cannot avoid modifying read-only snapshots" as a distinction
between subvol and extent scans. Both modes support send workaround
and send waiting with no significant distinction.
Emphasize extent scan's better handling of many snapshots. Also reflinks.
Add some discussion of `--throttle-factor`.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Thread names have changed. Document some of the newer ones.
Don't jump immediately to blaming poor performance on qgroups or
autodefrag. These do sometimes have kernel regressions but not all
the time.
Emphasize advantage of controlling bees deferred work requests at the
source, before btrfs gets stuck committing them.
Avoid asserting that it's OK for gdb to crash.
Remove mention of lower-layer block device issues wrt corruption.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"Kernel" -> "Linux kernel". If you can run bees on a kernel that isn't
Linux, congratulations!
Emphasize the age of the data corruption warnings. Once 5.4 reaches
EOL we can remove those.
Simplify the discussion of old kernels and API levels. There's a
new optional kernel API for `openat2` support at 5.6. The absolute
minimum kernel version is still 4.2, and will not increase to 4.15
until the subvol scanners are removed.
Remove discussion of bees support for kernels 4.19 (which recently
reached EOL) and earlier.
The `LOGICAL_INO` vs dedupe bug is actually a `LOGICAL_INO` vs clone bug.
Dedupe isn't necessary to reproduce it.
Remove a stray ')'.
Strip out most of the discussion of slow backrefs, as they are no longer a
concern on the range of supported kernel versions. Leave some description
there because bees still has some vestigial workarounds.
Remove `btrfs send` from the "Unfixed kernel bugs" section, which makes
the section empty, so remove the section too. bees now handles send on
a subvol reasonably well.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Emphasize "large" is an upper bound on the size of filesystem bees
can handle.
New strengths: largest extent first for fixed maintenance windows,
scans data only once (ish), recovers more space.
Removed weaknesses: less temporary space.
Need more caps than `CAP_SYS_ADMIN`.
Emphasize DATA CORRUPTION WARNING is an old-kernel thing.
Update copyright year.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tested on larger filesystems than 100T too, but let's use Fermi
approximation. Next size is 1P.
Removed interaction with block-level SSD caching subsystems. These are
really btrfs metadata vs. a lower block layer, and have nothing to do
with bees.
Added mixed block groups to the tested list, as mixed block groups
required explicit support in the extent scanner.
Added btrfs-convert to the tested list. btrfs-convert has various
problems with space allocation in general, but these can be solved by
carefully ordered balances after conversion, and they have nothing to
do with bees.
In-kernel dedupe is dead and the stubs were removed years ago. Remove it
from the list.
btrfs send now plays nicely with bees on all supportable kernels, now
that stable/linux-4.19.y is dead. Send workaround is only needed for
kernels before v5.4 (technically v5.2, but nobody should ever mount a
btrfs with kernel v5.1 to v5.3). bees will pause automatically when
deduping a subvol that is currently running a send.
bees will no longer gratuitously refragment data that was defragmented
by autodefrag.
Explicitly list all the RAID profiles tested so far, as there have been
some new ones.
Explicitly list other deduplicators tested.
Sort the list of btrfs features alphabetically.
Add scrub and balance, which have been tested with bees since the
beginning.
New tested btrfs features: block-group-tree, raid1c3, raid1c4.
New untested btrfs features: squotas, raid-stripe-tree.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The extent scan mode has been implemented (partially, but close enough
to win benchmarks).
New features include several nuisance dedupe countermeasures.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Extent is a different kind of scan mode, so introduce the concept of
the two kinds of scan mode, and rearrange the description of scan modes
along the new boundaries.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We can no longer reliably determine the number of hash table matches,
since we'll stop counting after the first one.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The bug is:
v6.3-rc6: f349b15e183d mm: vmalloc: avoid warn_alloc noise caused by fatal signal
The fixes are:
v6.4: 95a301eefa82 mm/vmalloc: do not output a spurious warning when huge vmalloc() fails
v6.3.10: c189994b5dd3 mm/vmalloc: do not output a spurious warning when huge vmalloc() fails
The bug has been backported to LTS, but the fix has not:
v6.2.11: 61334bc29781 mm: vmalloc: avoid warn_alloc noise caused by fatal signal
v6.1.24: ef6bd8f64ce0 mm: vmalloc: avoid warn_alloc noise caused by fatal signal
v5.15.107: a184df0de132 mm: vmalloc: avoid warn_alloc noise caused by fatal signal
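
The version lists above can be turned into a quick check. This is a minimal
sketch in Python that uses only the releases named in this message; the table
names and `has_spurious_vmalloc_warning` are hypothetical, and the tables will
go stale as soon as the fix reaches the LTS branches.

```python
# Assumption: data is copied from the commit lists above and nothing else.
# (major, minor) -> first patch release that carries the bug commit.
BUG_INTRODUCED = {(5, 15): 107, (6, 1): 24, (6, 2): 11, (6, 3): 0}
# (major, minor) -> first patch release that carries the fix.
FIX_RELEASED = {(6, 3): 10}


def has_spurious_vmalloc_warning(release):
    """Return True if `release` (e.g. "6.1.24") has the bug but not the fix."""
    parts = [int(x) for x in release.split(".")[:3]]
    parts += [0] * (3 - len(parts))        # treat "6.4" as "6.4.0"
    series, patch = (parts[0], parts[1]), parts[2]
    if series >= (6, 4):                   # fix is in v6.4 and later
        return False
    first_bad = BUG_INTRODUCED.get(series)
    if first_bad is None or patch < first_bad:
        return False                       # bug never reached this release
    first_good = FIX_RELEASED.get(series)
    return first_good is None or patch < first_good


for release in ("5.15.106", "5.15.107", "6.1.24", "6.3.9", "6.3.10", "6.4"):
    print(release, has_spurious_vmalloc_warning(release))
```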
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
There was a bug in kernel 6.3 where LOGICAL_INO with IGNORE_OFFSET
sometimes failed to ignore the offset. That bug is now fixed, but
LOGICAL_INO still returns 0 refs much more often than seems appropriate.
This is most likely because bees frequently deletes extents while there
is still work waiting for them in Task queues. In this case, LOGICAL_INO
correctly returns an empty list, because every reference to some extent
is deleted, but the new extent tree with that extent removed is not yet
committed in btrfs.
Add a DEBUG-level log message and an event counter to track these events.
In the absence of a kernel bug, the debug message may indicate CPU time
was wasted performing a search whose outcome could have been predicted.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The critical kernel bugs in send have been fixed for years.
The limitations that remain aren't bugs, and bees has no sustainable
workaround for them.
Also update copyright year range.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
At least one user was significantly confused by "designed for large
filesystems".
The btrfs send workarounds aren't new any more.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Clarify that "too large" and "too small" are some distance away from each other.
The Goldilocks zone is _wide_.
The interval between cache drops is now shorter.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Also attempted to clarify the descriptions of the modes based on
feedback and questions from users over the years.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When two Tasks attempt to lock the same extent, append the later Task
to the earlier Task's post-exec work queue. This will guarantee that
all Tasks which attempt to manipulate the same extent will execute
sequentially, and free up threads to process other extents.
Similarly, if two scanner threads operate on the same inode, any dedupe
they perform will lock out other scanner threads in btrfs. Avoid this
by serializing Task objects that reference the same file.
This does theoretically use an unbounded amount of memory, but in practice
a Task that encounters a contended extent or inode quickly stops spawning
new Tasks that might increase the queue size, and all Tasks that might
contend for the same lock(s) end up on a single FIFO queue.
Note that the scope of inode locks is intentionally global, i.e. when
an inode is locked, it locks every inode with the same number in every
subvol. This avoids significant lock contention and task queue growth
when the same inode with the same file extents appears in snapshots.
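
The queueing rule described above can be illustrated with a small
single-threaded model. This is a sketch in Python, not the bees C++ code:
`Task`, `Scheduler`, `start`, and `finish` are hypothetical names chosen for
the illustration, and real worker threads are replaced by explicit calls.

```python
from collections import deque


class Task:
    def __init__(self, name, key):
        self.name = name
        self.key = key              # extent (or inode number) to lock
        self.post_exec = deque()    # Tasks deferred until this one finishes


class Scheduler:
    def __init__(self):
        self.owner = {}             # key -> Task currently holding that lock
        self.log = []               # completion order, for the demonstration

    def start(self, task):
        """Return True if the Task got the lock and may run now."""
        holder = self.owner.get(task.key)
        if holder is None:
            self.owner[task.key] = task
            return True
        # Contended: queue behind the holder so the worker is free again.
        holder.post_exec.append(task)
        return False

    def finish(self, task):
        """Release the lock; return the deferred Task that now holds it."""
        self.log.append(task.name)
        del self.owner[task.key]
        if not task.post_exec:
            return None
        nxt = task.post_exec.popleft()
        nxt.post_exec.extend(task.post_exec)   # keep one FIFO queue per lock
        task.post_exec.clear()
        self.owner[nxt.key] = nxt
        return nxt


sched = Scheduler()
a, b = Task("A", "extent-1"), Task("B", "extent-1")
c, d = Task("C", "extent-2"), Task("D", "extent-1")
started = [sched.start(t) for t in (a, b, c, d)]   # B and D are deferred
sched.finish(c)                  # unrelated extent, completes independently
after_a = sched.finish(a)        # lock passes to B, D stays queued behind B
after_b = sched.finish(after_a)  # lock passes to D
sched.finish(after_b)
print(started, sched.log)
```

In this model, a global inode lock corresponds to using the inode number alone
as `key`, rather than a (subvol, inode) pair.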
Fixes: https://github.com/Zygo/bees/issues/158
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Kernels that needed the balance workaround are frankly too buggy
to run bees at all. The workaround also makes the locking stories
around logical_ino calls and process exit complicated, so get rid of
it completely.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
https://github.com/Zygo/bees/pull/209
Edited: regenerate docs for the downstream change in index.md.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
For one thing, it should _say_ that there are too many duplicates.
We were making the user read the manual to find that out.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In commit d9e3c0070b8e6b382b7956d286e43e0e6643f360 "context: stop creating
new refs when there are too many already" we added a new counter, but didn't
document it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>