mirror of https://github.com/Zygo/bees.git synced 2025-08-09 17:13:47 +02:00

Zygo Blaxell 3e7eb43b51 BeesStringFile: figure out when to call--or _not_ call--fsync

Older kernel versions featured some bugs in btrfs `fsync`, which could
leave behind "ghost dirents", orphan filename items that did not have
a corresponding inode.  These dirents were created by log replay on the
first mount after a crash, as a result of several different bugs in
the log tree and its use over the years.  The last known bug of this
kind was fixed in kernel 5.16.  As of this writing, no fixes for this
bug have been backported to any earlier LTS kernel.

Some filesystems, including btrfs, will flush the contents of a new
file before renaming it over an old file.  On paper, btrfs can do this
very cheaply since the contents of the new file are not referenced, and
the old file not dereferenced, until a tree commit which includes both
actions atomically; however, in real life, btrfs provides `fsync`-like
semantics and uses the log-tree infrastructure to implement them, which
compromises performance and acts as a magnet for bugs.

The benefit of this trade-off is that `rename` can be used as a
synchronization point for data outside of btrfs, which would not
happen if everything `rename` does were simply deferred to the next
tree commit.  The cost of this trade-off is that for the first 8 years
of its existence, bees would trigger the bug so often that the project
recommended its users put $BEESHOME in its own subvol to make it easy
to remove ghost dirents left behind by the bug.

Some other filesystems, such as xfs, don't have any special semantics for
`rename`, and require `fsync` to avoid garbage or missing data after
a crash.  Even filesystems which do have a special case for `rename`
can be configured to turn it off.

btrfs will silently delete data from files in the event that an
unrecoverable data block write error occurs.  Kernel version 6.2 adds
important new and unexpected cases where this can happen on filesystems
using raid56 data, but it also happens in all usable btrfs versions
(the silent deletion behavior was introduced in kernel version 3.9).

Unrecoverable write errors are currently reported to userspace only
through `fsync`.  Since the failed extents are deleted, they cannot be
detected via csum failures or scrub after the fact--and by then it is
too late, because the data is already gone.  `fsync` is the last opportunity
to detect the write failure before the `rename`.  If the error is not
detected, the contents of the file will be silently discarded in btrfs.
The impact on bees is that scans will abruptly restart from zero after
a crash combined with some other reasonably common failures.
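
The write-then-`fsync`-then-`rename` ordering described above can be
sketched roughly as follows.  This is an illustration only, not the code
in BeesStringFile: the function name `write_file_atomically` and the
`.tmp` suffix are assumptions made for the example.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>        // std::rename
#include <string>
#include <system_error>

#include <fcntl.h>       // open
#include <unistd.h>      // write, fsync, close, unlink

// Hypothetical helper (not the bees implementation): replace `path` with
// `contents`, and skip the rename if any earlier step reports an error.
void write_file_atomically(const std::string &path, const std::string &contents)
{
    const std::string tmp = path + ".tmp";   // assumed temporary file name

    const int fd = open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0600);
    if (fd < 0) {
        throw std::system_error(errno, std::generic_category(), "open " + tmp);
    }

    // Remove the temporary file and report which step failed.
    auto fail = [&](const char *step, int err, bool fd_open) {
        if (fd_open) close(fd);
        unlink(tmp.c_str());
        throw std::system_error(err, std::generic_category(), step + (" " + tmp));
    };

    std::size_t done = 0;
    while (done < contents.size()) {
        const ssize_t rv = write(fd, contents.data() + done, contents.size() - done);
        if (rv < 0) {
            if (errno == EINTR) continue;    // interrupted, retry
            fail("write", errno, true);
        }
        done += static_cast<std::size_t>(rv);
    }

    // Last chance to learn about a failed data write before the rename
    // makes the new file visible under the old name.
    if (fsync(fd) != 0) fail("fsync", errno, true);

    if (close(fd) != 0) fail("close", errno, false);

    // Dependent write: only reached when every step above succeeded.
    if (std::rename(tmp.c_str(), path.c_str()) != 0) fail("rename", errno, false);
}
```

If `fsync` reports an error, the old file stays in place and the
caller gets an exception instead of a silently truncated state file.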

Putting all of this together leads to a rather complex workaround:
if the filesystem under $BEESHOME (specifically, the filesystem where
BeesStringFile objects such as `beescrawl.dat` are written) is a btrfs
filesystem, and the host kernel is a version prior to 5.16, then don't
call `fsync` before `rename`.  In all other cases, do call `fsync`,
and prevent dependent writes (i.e. the following `rename`) in the event
of errors.
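
A minimal sketch of that decision, assuming the filesystem type is read
with `fstatfs` and the kernel version with `uname`.  The function names
below are hypothetical and the real BeesStringFile code may differ.
Treating an unparseable kernel release as "call `fsync`" is also an
assumption of this sketch.

```cpp
#include <cstdint>
#include <cstdio>          // std::sscanf
#include <string>

#include <sys/utsname.h>   // uname
#include <sys/vfs.h>       // fstatfs

// Value of BTRFS_SUPER_MAGIC from <linux/magic.h>, repeated here so the
// sketch does not depend on kernel headers being installed.
constexpr std::uint32_t kBtrfsSuperMagic = 0x9123683EU;

// True if the filesystem holding `fd` reports the btrfs magic number.
// Assumption: if fstatfs fails, treat the filesystem as "not btrfs".
bool is_btrfs(int fd)
{
    struct statfs sfs;
    if (fstatfs(fd, &sfs) != 0) return false;
    return static_cast<std::uint32_t>(sfs.f_type) == kBtrfsSuperMagic;
}

// True if `release` (e.g. "5.15.0-91-generic") parses as a version older
// than 5.16.  Unparseable strings count as "not older".
bool kernel_older_than_5_16(const std::string &release)
{
    unsigned major = 0, minor = 0;
    if (std::sscanf(release.c_str(), "%u.%u", &major, &minor) != 2) return false;
    return major < 5 || (major == 5 && minor < 16);
}

// The rule from the text: skip fsync only on btrfs with a kernel older
// than 5.16; call fsync in every other case.
bool decide_fsync(bool on_btrfs, const std::string &release)
{
    return !(on_btrfs && kernel_older_than_5_16(release));
}

// Convenience wrapper for a directory fd opened on $BEESHOME.
bool should_fsync_before_rename(int home_fd)
{
    struct utsname uts;
    if (uname(&uts) != 0) return true;   // assumption: default to fsync
    return decide_fsync(is_btrfs(home_fd), uts.release);
}
```

Keeping `decide_fsync` free of system calls makes the rule easy to
check against fixed inputs, for example `decide_fsync(true, "5.15.0")`
is false while `decide_fsync(true, "5.16.0")` is true.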

Since present kernel versions still require `fsync`, we don't need
an upper bound on the kernel version check until someone fixes btrfs
`rename` (or perhaps adds a flag to `renameat2` which prevents use of
the log tree) in the kernel.  Once that fix happens, we can drop the
`fsync` call for kernels after that fixed version.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>

2025-02-10 21:04:20 -05:00

bin               2016-11-17 12:12:13 -05:00  bees: remove local cruft, throw at github
docs              2025-02-06 22:43:22 -05:00  docs: update event counters after extent scan refactoring and crawl skipping
include/crucible  2025-02-10 21:00:31 -05:00  hexdump: fix pointer cast const mismatch
lib               2025-02-08 21:17:15 -05:00  fs: improve compatibility with linux-libc-dev 5.4
scripts           2025-01-20 01:00:41 -05:00  scripts/beesd: harden the mount options
src               2025-02-10 21:04:20 -05:00  BeesStringFile: figure out when to call--or _not_ call--fsync
test              2024-11-30 23:30:33 -05:00  table: add a simple text table renderer
.gitignore        2021-11-29 21:27:48 -05:00  gitignore: clang creates a lot of *.tmp files
COPYING           2016-11-17 12:12:15 -05:00  GPL-3: license it
Defines.mk        2022-12-23 11:10:17 +08:00  beesd: Honor DESTDIR on installation.
Makefile          2023-01-28 11:21:51 +01:00  Makefile: also drop fiemap and fiewalk from main Makefile
makeflags         2021-11-29 21:27:48 -05:00  lib: deprecate memset_zero template, use C99 compound literals instead
README.md         2025-01-11 23:39:55 -05:00  docs: update README.md

README.md

BEES

Best-Effort Extent-Same, a btrfs deduplication agent.

About bees

bees is a block-oriented userspace deduplication agent designed to scale up to large btrfs filesystems. It is an offline dedupe combined with an incremental data scan capability to minimize the time data spends on disk between write and dedupe.

Strengths

Space-efficient hash table - can use as little as 1 GB hash table per 10 TB unique data (0.1GB/TB)
Daemon mode - incrementally dedupes new data as it appears
Largest extents first - recover more free space during fixed maintenance windows
Works with btrfs compression - dedupe any combination of compressed and uncompressed files
Whole-filesystem dedupe - scans data only once, even with snapshots and reflinks
Persistent hash table for rapid restart after shutdown
Constant hash table size - no increased RAM usage if data set becomes larger
Works on live data - no scheduled downtime required
Automatic self-throttling - reduces system load
btrfs support - recovers more free space from btrfs than naive dedupers

Weaknesses

Whole-filesystem dedupe - has no include/exclude filters, does not accept file lists
Requires root privilege (CAP_SYS_ADMIN plus the usual filesystem read/modify caps)
First run may increase metadata space usage if many snapshots exist
Constant hash table size - no decreased RAM usage if data set becomes smaller
btrfs only

Installation and Usage

More Information

Bug Reports and Contributions

Email bug reports and patches to Zygo Blaxell <bees@furryterror.org>.

You can also use GitHub:

    https://github.com/Zygo/bees

Copyright & License

GPL (version 3 or later).
