1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

README: update the state of bees and the kernel for v4.14

Read-only snapshots have always just worked.  Remove them from the
"untested" list.

nodatasum (and therefore nodatacow) inodes are simply ignored.  This seems
like the right thing to do since deduping a nodatacow extent turns it
into a datacow extent, which seems contrary to administrator wishes
implied by the nodatacow bit.  We probably need an option to
override that assumption.

Clarify why converted ext[234] filesystems may cause problems and
the nature of those problems.

Assorted minor editorial changes.

Discuss calculation of the balance limit parameter when ensuring
sufficient metadata space.

Update kernel version bug/fix/feature lists, including LOGICAL_INO_V2.

Annotate kernel workaround list with known kernel versions that make
the workarounds necessary.

Remove reference to 'DEFRAG_RANGE' as bees requires much more control
over data placement than this interface can offer.  It's easy enough to
create a new ioctl to implement bees requirements once it's known what
those requirements are.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit is contained in:
Zygo Blaxell 2018-01-07 03:41:45 -05:00
parent 305ab5dbfa
commit dc7360397e

View File

@ -134,11 +134,6 @@ performance by caching, but really fixing this requires rewriting the
crawler to scan the btrfs extent tree directly instead of the subvol
FS trees.
* Bees had support for multiple worker threads in the past; however,
this was removed because it made Bees too aggressive to coexist with
other applications on the same machine. It also hit the *slow backrefs*
on N CPU cores instead of just one.
* Block reads are currently more allocation- and CPU-intensive than they
should be, especially for filesystems on SSD where the IO overhead is
much smaller. This is a problem for power-constrained environments
@ -171,6 +166,7 @@ Bees has been tested in combination with the following:
* Large (>16M) extents
* Huge files (>1TB--although Btrfs performance on such files isn't great in general)
* filesystems up to 25T bytes, 100M+ files
* btrfs read-only snapshots
Bad Btrfs Feature Interactions
------------------------------
@ -179,15 +175,13 @@ Bees has not been tested with the following, and undesirable interactions may oc
* Non-4K filesystem data block size (should work if recompiled)
* Non-equal hash (SUM) and filesystem data block (CLONE) sizes (probably never will work)
* btrfs read-only snapshots (never tested, probably wouldn't work well)
* btrfs send/receive (receive is probably OK, but send requires RO snapshots. See above)
* btrfs send/receive (receive is probably OK, but send could be confused?)
* btrfs qgroups (never tested, no idea what might happen)
* btrfs seed filesystems (does anyone even use those?)
* btrfs autodefrag mount option (never tested, could fight with Bees)
* btrfs nodatacow inode attribute (needs datasum detection on extents, skipped for now)
* btrfs nodatacow mount option (*could* work, but might not, skipped due to above note)
* btrfs nodatacow/nodatasum inode attribute or mount option (bees skips all nodatasum files)
* btrfs out-of-tree kernel patches (e.g. in-band dedup or encryption)
* btrfs-convert from ext2/3/4 (never tested)
* btrfs-convert from ext2/3/4 (never tested, might run out of space or ignore significant portions of the filesystem due to sanity checks)
* btrfs mixed block groups (don't know a reason why it would *not* work, but never tested)
* open(O_DIRECT)
* Filesystems mounted *without* the flushoncommit option
@ -195,7 +189,7 @@ Bees has not been tested with the following, and undesirable interactions may oc
Other Caveats
-------------
* btrfs balance will invalidate parts of the dedup table. Bees will
* btrfs balance will invalidate parts of the dedup hash table. Bees will
happily rebuild the table, but it will have to scan all the blocks
again.
@ -206,17 +200,35 @@ Other Caveats
* Bees creates temporary files (with O_TMPFILE) and uses them to split
and combine extents elsewhere in btrfs. These will take up to 2GB
during normal operation.
of disk space per thread during normal operation.
* Like all deduplicators, Bees will replace data blocks with metadata
references. It is a good idea to ensure there are several GB of
unallocated space (see `btrfs fi df`) on the filesystem before running
Bees for the first time. Use
references. It is a good idea to ensure there is sufficient unallocated
space (see `btrfs fi usage`) on the filesystem to allow the metadata
to multiply in size by the number of snapshots before running Bees
for the first time. Use
btrfs balance start -dusage=100,limit=1 /your/filesystem
btrfs balance start -dusage=100,limit=N /your/filesystem
If possible, raise the `limit` parameter to the current size of metadata
usage (from `btrfs fi df`) plus 1.
where the `limit` parameter 'N' should be calculated as follows:
* start with the current size of metadata usage (from `btrfs fi
df`) in GB, plus 1
* multiply by the proportion of disk space in subvols with
snapshots (i.e. if there are no snapshots, multiply by 0;
if all of the data is shared between at least one origin
and one snapshot subvol, multiply by 1)
* multiply by the number of snapshots (i.e. if there is only
one subvol, multiply by 0; if there are 3 snapshots and one
origin subvol, multiply by 3)
`limit = GB_metadata * (disk_space_in_snapshots / total_disk_space) * number_of_snapshots`
Monitor unallocated space to ensure that the filesystem never runs out
of metadata space (whether Bees is running or not--this is a general
btrfs requirement).
A Brief List Of Btrfs Kernel Bugs
@ -229,18 +241,27 @@ Missing features (usually not available in older LTS kernels):
* 3.16: `SEARCH_V2` ioctl added. Bees could use `SEARCH` instead.
* 4.2: `FILE_EXTENT_SAME` no longer updates mtime, can be used at EOF.
Future features (kernel features Bees does not yet use, but may rely on
in the future):
* 4.14: `LOGICAL_INO_V2` allows userspace to create forward and backward
reference maps to entire physical extents with a single ioctl call,
and raises the limit of 2730 references per extent. Bees has not yet
been rewritten to take full advantage of these features.
Bug fixes (sometimes included in older LTS kernels):
* Bugs fixed prior to 4.4.3 are not listed here.
* 4.5: hang in the `INO_PATHS` ioctl used by Bees.
* 4.5: use-after-free in the `FILE_EXTENT_SAME` ioctl used by Bees.
* 4.6: lost inodes after a rename, crash, and log tree replay
(triggered by the fsync() while writing `beescrawl.dat`).
* 4.7: *slow backref* bug no longer triggers a softlockup panic. It still
too long to resolve a block address to a root/inode/offset triple.
takes too long to resolve a block address to a root/inode/offset triple.
* 4.10: reduced CPU time cost of the LOGICAL_INO ioctl and dedup
backref processing in general.
* 4.13 integration trees: 053582a7d423 btrfs: add cond_resched() calls
when resolving backrefs
* 4.11: yet another dedup deadlock case is fixed.
* 4.14: backref performance improvements make LOGICAL_INO even faster.
Unfixed kernel bugs (as of 4.11.9) with workarounds in Bees:
@ -251,7 +272,7 @@ Unfixed kernel bugs (as of 4.11.9) with workarounds in Bees:
measuring the time the kernel spends performing certain operations
and permanently blacklisting any extent or hash where the kernel
starts to get slow. Inside Bees, such blocks are marked as 'toxic'
hash/block addresses.
hash/block addresses. *Needs to be retested after v4.14.*
* `LOGICAL_INO` output is arbitrarily limited to 2730 references
even if more buffer space is provided for results. Once this number
@ -262,6 +283,7 @@ Unfixed kernel bugs (as of 4.11.9) with workarounds in Bees:
This places an obvious limit on dedup efficiency for extremely common
blocks or filesystems with many snapshots (although this limit is
far greater than the effective limit imposed by the *slow backref* bug).
*Fixed in v4.14.*
* `LOGICAL_INO` on compressed extents returns a list of root/inode/offset
tuples matching the extent bytenr of its argument. On uncompressed
@ -271,7 +293,7 @@ Unfixed kernel bugs (as of 4.11.9) with workarounds in Bees:
references requires calling `LOGICAL_INO` for every single block of
the extent. This is undesirable behavior for Bees, which wants a
list of all extent refs referencing a data extent (i.e. Bees wants
the compressed-extent behavior in all cases).
the compressed-extent behavior in all cases). *Fixed in v4.14.*
* `LOGICAL_INO` is only called from one thread at any time per process.
This means at most one core is irretrievably stuck in this ioctl.
@ -281,13 +303,6 @@ Unfixed kernel bugs (as of 4.11.9) with workarounds in Bees:
or prealloc. Bees avoids feedback loops this can generate while
attempting to replace extents over 16MB in length.
* `DEFRAG_RANGE` is useless. The ioctl attempts to implement `btrfs
fi defrag` in the kernel, and will arbitrarily defragment more or
less than the range requested to match the behavior expected from the
userspace tool. Bees implements its own defrag instead, copying data
to a temporary file and using the `FILE_EXTENT_SAME` ioctl to replace
precisely the specified range of offending fragmented blocks.
* If the `fsync()` in `BeesTempFile::make_copy` is removed, the filesystem
hangs within a few hours, requiring a reboot to recover. On the other
hand, the `fsync()` only costs about 8% of overall performance.