Tested on filesystems larger than 100T too, but let's use a Fermi
approximation. The next size up is 1P.
Removed interaction with block-level SSD caching subsystems. These are
really issues between btrfs metadata and a lower block layer, and have
nothing to do with bees.
Added mixed block groups to the tested list, as mixed block groups
required explicit support in the extent scanner.
Added btrfs-convert to the tested list. btrfs-convert has various
problems with space allocation in general, but these can be solved by
carefully ordered balances after conversion, and they have nothing to
do with bees.
In-kernel dedupe is dead and the stubs were removed years ago. Remove it
from the list.
Now that stable/linux-4.19.y is dead, btrfs send plays nicely with bees
on all supportable kernels. The send workaround is only needed for
kernels before v5.4 (technically v5.2, but nobody should ever mount a
btrfs with kernel v5.1 to v5.3). bees will pause automatically when
deduping a subvol that is currently running a send.
bees will no longer gratuitously refragment data that was defragmented
by autodefrag.
Explicitly list all the RAID profiles tested so far, as there have been
some new ones.
Explicitly list other deduplicators tested.
Sort the list of btrfs features alphabetically.
Add scrub and balance, which have been tested with bees since the
beginning.
New tested btrfs features: block-group-tree, raid1c3, raid1c4.
New untested btrfs features: squotas, raid-stripe-tree.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
flushoncommit or noflushoncommit isn't really a bees matter--it's
a sysadmin's tradeoff between reliability and performance. bees does
not affect that tradeoff because all dedupe src extents are flushed, so
bees introduces no *new* data loss risks in the noflushoncommit
case--i.e. any data that you could lose while running bees, you'd also
lose when not running bees.
Note that the converse is not true: bees might trigger flushing on
data that would not normally have been flushed with noflushoncommit,
and improve data integrity after a crash as a side-effect of dedupe
operations. The risks of noflushoncommit might be reduced by running
bees. I don't have evidence based on experimental data to support that
conclusion, so I'll just leave this possibility as a rumor in a commit
log message.
lvmcache can be moved from the "bad" list to the "good" list now.
bcache remains in the "bad" list due to some non-data-losing failures
that only seem to happen with bcache.
Add a note about CPUs with strange endianness or page sizes, as nobody
seems to have tried those.
Remove "at great cost" from the btrfs send workaround. The cost is
the cost; there is no need to editorialize.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
options.md was a disorganized mess that markdown couldn't parse properly.
Break the options list down into sections by theme. Add the new
'--workaround-btrfs-send' option to the new 'Workarounds' section.
Clean up the rest of the text and fix some inconsistencies.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Split the rather large README into smaller sections with a pitch and
a ToC at the top.
Move the sections into docs/ so that GitHub Pages can read them.
'make doc' produces a local HTML tree.
Update the kernel bugs and gotchas list.
Add some information that has been accumulating in GitHub comments.
Remove information about bugs in kernels earlier than 4.14.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>