mirror of
https://github.com/Zygo/bees.git
synced 2025-07-01 08:12:27 +02:00
d5e805ab8dbfa683087e360e40e61196fad6b21c
This seek_backward failed in bees because an extent appeared during the search:

    fetch(probe_pos = 6821971036, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821971004 = probe_pos - have_delta 32 (want_delta 32)
    fetch(probe_pos = 6821971004, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821970972 = probe_pos - have_delta 32 (want_delta 32)
    fetch(probe_pos = 6821970972, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821970908 = probe_pos - have_delta 64 (want_delta 64)
    fetch(probe_pos = 6821970908, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821970780 = probe_pos - have_delta 128 (want_delta 128)
    fetch(probe_pos = 6821970780, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821970524 = probe_pos - have_delta 256 (want_delta 256)
    fetch(probe_pos = 6821970524, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821970012 = probe_pos - have_delta 512 (want_delta 512)
    fetch(probe_pos = 6821970012, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821968988 = probe_pos - have_delta 1024 (want_delta 1024)
    fetch(probe_pos = 6821968988, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821966940 = probe_pos - have_delta 2048 (want_delta 2048)
    fetch(probe_pos = 6821966940, target_pos = 6821971036) = 6822575316..6822575316
    probe_pos 6821962844 = probe_pos - have_delta 4096 (want_delta 4096)
    fetch(probe_pos = 6821962844, target_pos = 6821971036) = 6821962845..6821962848
    found_low = true, lower_bound = 6821962845
    lower_bound = high_pos 6821962848
    loop: lower_bound 6821962848, probe_pos 6821966942, upper_bound 6821971036
    fetch(probe_pos = 6821966942, target_pos = 6821971036) = 6822575316..6822575316
    upper_bound = probe_pos 6821966942
    loop: lower_bound 6821962848, probe_pos 6821964895, upper_bound 6821966942
    fetch(probe_pos = 6821964895, target_pos = 6821971036) = 6822575316..6822575316
    upper_bound = probe_pos 6821964895
    loop: lower_bound 6821962848, probe_pos 6821963871, upper_bound 6821964895
    fetch(probe_pos = 6821963871, target_pos = 6821971036) = 6822575316..6822575316
    upper_bound = probe_pos 6821963871
    loop: lower_bound 6821962848, probe_pos 6821963359, upper_bound 6821963871
    fetch(probe_pos = 6821963359, target_pos = 6821971036) = 6821963411..6821963422
    lower_bound = high_pos 6821963422
    loop: lower_bound 6821963422, probe_pos 6821963646, upper_bound 6821963871
    fetch(probe_pos = 6821963646, target_pos = 6821971036) = 6822575316..6822575316

Here, we found nothing between 6821963646 and 6822575316, so upper_bound is reduced to 6821963646...

    upper_bound = probe_pos 6821963646
    loop: lower_bound 6821963422, probe_pos 6821963534, upper_bound 6821963646
    fetch(probe_pos = 6821963534, target_pos = 6821971036) = 6821963536..6821963539
    lower_bound = high_pos 6821963539
    loop: lower_bound 6821963539, probe_pos 6821963592, upper_bound 6821963646
    fetch(probe_pos = 6821963592, target_pos = 6821971036) = 6821963835..6821963841

...but here, we found 6821963835 and 6821963841, which are between 6821963646 and 6822575316. They were not there before, so the binary search result is now invalid because new extent items were added while it was running. This results in an exception:

    lower_bound = high_pos 6821963841
    --- BEGIN TRACE --- exception ---
    objectid = 27942759813120, adjusted to 27942793363456 at bees-roots.cc:1103
    Crawling extent BeesCrawlState 250:0 offset 0x0 transid 1311734..1311735 at bees-roots.cc:991
    get_state_end at bees-roots.cc:988
    find_next_extent 250 at bees-roots.cc:929
    --- END TRACE --- exception ---
    *** EXCEPTION ***
    exception type std::out_of_range: lower_bound = 6821963841, upper_bound = 6821963646 failed constraint check (lower_bound <= upper_bound) at ../include/crucible/seeker.h:139

The exception prevents seek_backward from returning a value, so no consumer of that value can act on a nonsense result.

Copy the details of this search into a test case. Note that the test case won't reproduce the exception because the simulated fetch() does not change its results partway through.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
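The failure described in the commit message above can be illustrated with a simplified sketch. This is not the code in crucible/seeker.h: the names `seek_backward_sketch`, `fetch_from`, `SeekResult`, and the `Fetch` signature are hypothetical, and the stepping details differ from the log. It only shows the general shape: probe backward with doubling steps until an item at or below the target is found, then binary search, and throw `std::out_of_range` if the bounds ever cross.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <limits>
#include <set>
#include <stdexcept>

using Pos = std::uint64_t;
// Hypothetical fetch signature: return some items at or after probe_pos.
using Fetch = std::function<std::set<Pos>(Pos probe_pos)>;

struct SeekResult {
    bool found;  // false if no item exists at or below target_pos
    Pos pos;     // highest item position <= target_pos when found
};

// Stand-in for a real fetch: the first `limit` items at or after probe_pos.
std::set<Pos> fetch_from(const std::set<Pos> &items, Pos probe_pos, std::size_t limit = 2)
{
    std::set<Pos> rv;
    for (auto it = items.lower_bound(probe_pos); it != items.end() && rv.size() < limit; ++it) {
        rv.insert(*it);
    }
    return rv;
}

// Highest element of `items` that is <= target_pos; `out` is untouched if there is none.
static bool highest_at_or_below(const std::set<Pos> &items, Pos target_pos, Pos &out)
{
    auto it = items.upper_bound(target_pos);  // first element > target_pos
    if (it == items.begin()) return false;
    out = *std::prev(it);
    return true;
}

SeekResult seek_backward_sketch(Pos target_pos, const Fetch &fetch)
{
    assert(fetch);  // the caller must supply a callable fetch
    Pos lower_bound = 0;
    Pos upper_bound = target_pos;
    Pos probe_pos = target_pos;
    Pos delta = 1;

    // Phase 1: step backward from target_pos, doubling the step, until a
    // fetch returns an item at or below target_pos.
    while (!highest_at_or_below(fetch(probe_pos), target_pos, lower_bound)) {
        upper_bound = probe_pos;             // nothing seen in [probe_pos, target_pos]
        if (probe_pos == 0) return SeekResult{false, 0};
        delta = std::min(delta, probe_pos);  // do not step below position 0
        probe_pos -= delta;
        if (delta <= std::numeric_limits<Pos>::max() / 2) delta *= 2;
    }

    // Phase 2: binary search between a known item and a known-empty probe.
    for (;;) {
        if (lower_bound > upper_bound) {
            // Only possible if fetch results changed while the search ran.
            throw std::out_of_range("lower_bound > upper_bound: items changed during the search");
        }
        if (upper_bound - lower_bound <= 1) return SeekResult{true, lower_bound};
        probe_pos = lower_bound + (upper_bound - lower_bound) / 2;
        if (!highest_at_or_below(fetch(probe_pos), target_pos, lower_bound)) {
            upper_bound = probe_pos;         // nothing seen in [probe_pos, target_pos]
        }
    }
}
```

With a fetch backed by an unchanging set, the sketch returns the highest item at or below the target. If an item is inserted above an already-established upper_bound while the search is running, a later probe can return it, lower_bound jumps past upper_bound, and the constraint check throws instead of returning a misleading position.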
BEES
Best-Effort Extent-Same, a btrfs deduplication agent.
About bees
bees is a block-oriented userspace deduplication agent designed to scale up to large btrfs filesystems. It combines offline dedupe with an incremental data scan capability to minimize the time data spends on disk between write and dedupe.
Strengths
- Space-efficient hash table - can use as little as 1 GB of hash table per 10 TB of unique data (0.1 GB/TB)
- Daemon mode - incrementally dedupes new data as it appears
- Largest extents first - recover more free space during fixed maintenance windows
- Works with btrfs compression - dedupe any combination of compressed and uncompressed files
- Whole-filesystem dedupe - scans data only once, even with snapshots and reflinks
- Persistent hash table for rapid restart after shutdown
- Constant hash table size - no increased RAM usage if data set becomes larger
- Works on live data - no scheduled downtime required
- Automatic self-throttling - reduces system load
- btrfs support - recovers more free space from btrfs than naive dedupers
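The "1 GB per 10 TB" figure in the list above can be turned into a quick lower-bound calculation. The sketch below is only arithmetic on that ratio, not a bees API: the function name `min_hash_table_bytes` is hypothetical, and it assumes binary units (1 GiB = 2^30 bytes, 1 TiB = 2^40 bytes), which the text does not specify.

```cpp
#include <cstdint>

// Assumption: binary units. The source text says only "GB" and "TB".
constexpr std::uint64_t GiB = 1ULL << 30;
constexpr std::uint64_t TiB = 1ULL << 40;

// 1 GiB of hash table per 10 TiB of unique data is a ratio of 1 : 10240.
constexpr std::uint64_t kDataBytesPerTableByte = (10 * TiB) / GiB;
static_assert(kDataBytesPerTableByte == 10240, "ratio follows from the unit assumption");

// Smallest hash table size implied by the ratio, rounded up to a whole byte.
std::uint64_t min_hash_table_bytes(std::uint64_t unique_data_bytes)
{
    const std::uint64_t whole = unique_data_bytes / kDataBytesPerTableByte;
    const std::uint64_t remainder = unique_data_bytes % kDataBytesPerTableByte;
    return whole + (remainder != 0 ? 1 : 0);
}
```

For example, `min_hash_table_bytes(10 * TiB)` equals `GiB`. This is the "as little as" floor from the list; it says nothing about what size works best for a given data set.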
Weaknesses
- Whole-filesystem dedupe - has no include/exclude filters, does not accept file lists
- Requires root privilege (CAP_SYS_ADMIN plus the usual filesystem read/modify caps)
- First run may increase metadata space usage if many snapshots exist
- Constant hash table size - no decreased RAM usage if data set becomes smaller
- btrfs only
Installation and Usage
Recommended Reading
- bees Gotchas
- btrfs kernel bugs - especially DATA CORRUPTION WARNING for old kernels
- bees vs. other btrfs features
- What to do when something goes wrong
More Information
Bug Reports and Contributions
Email bug reports and patches to Zygo Blaxell <bees@furryterror.org>.
You can also use GitHub:
https://github.com/Zygo/bees
Copyright & License
Copyright 2015-2025 Zygo Blaxell <bees@furryterror.org>.
GPL (version 3 or later).