Enable use of the ioctl to probe whether two fds refer to the same btrfs,
without throwing an exception.
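A minimal sketch of such a probe, assuming the check compares fsids returned by `BTRFS_IOC_FS_INFO` (the helper name is invented); any ioctl failure is reported as `false` rather than thrown:

```cpp
#include <cstring>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

// Hypothetical helper: returns true iff both fds refer to the same btrfs.
// Any ioctl failure (e.g. the fd is not on btrfs at all) yields false
// instead of throwing an exception.
static bool same_btrfs(int fd_a, int fd_b)
{
	struct btrfs_ioctl_fs_info_args info_a = {}, info_b = {};
	if (ioctl(fd_a, BTRFS_IOC_FS_INFO, &info_a)) return false;
	if (ioctl(fd_b, BTRFS_IOC_FS_INFO, &info_b)) return false;
	// Two fds are on the same filesystem iff the fsids match
	return std::memcmp(info_a.fsid, info_b.fsid, sizeof info_a.fsid) == 0;
}
```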
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The debug log is only revealed when something goes wrong, but it is
created and discarded every time `seek_backward` is called, and
creating it is quite CPU-intensive.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Remove dubious comments and #if 0 section. Document new event counters,
and add one for read failures.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
1% is a lot of data on a petabyte filesystem, and a long time to wait for an
ETA.
After 1 GiB we should have some idea of how fast we're reading the data.
Increase the time to 10 seconds to avoid a nonsense result just after a scan
starts.
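The gating described above might look like this (the name and the exact way the thresholds are wired in are illustrative):

```cpp
#include <cstdint>

// Suppress the ETA until enough data and wallclock time have accumulated
// for the read rate to be meaningful: 1 GiB of data and 10 seconds.
static bool eta_is_meaningful(uint64_t bytes_read, double seconds_elapsed)
{
	const uint64_t min_bytes = uint64_t(1) << 30;  // 1 GiB
	const double min_seconds = 10.0;
	return bytes_read >= min_bytes && seconds_elapsed >= min_seconds;
}
```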
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
* `nodev`: This reduces rename attack surface by preventing bees from
opening any device file on the target filesystem.
* `noexec`: This prevents access to the mount point from being leveraged
to execute setuid binaries, or execute anything at all through the
mount point.
These options are not required because they duplicate features in the
bees binary (assuming that the mount namespace remains private):
* `noatime`: bees always opens every file with `O_NOATIME`, making
this option redundant.
* `nosymfollow`: bees uses `openat2` on kernels 5.6 and later with
flags that prevent symlink attacks. `nosymfollow` was introduced in
kernel 5.10, so every kernel that can do `nosymfollow` can already do
`openat2`. Also, historically, `$BEESHOME` can be a relative path with
symlinks in any path component except the last one, and `nosymfollow`
doesn't allow that.
Between `openat2` and `nodev`, all symlink attacks are prevented, and
rename attacks cannot be used to force bees to open a device file.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We _recommend_ that `$BEESHOME` should be a subvol, and we'll create a
subvol if no directory exists; however, there's no reason to reject an
existing plain directory if the user chooses to use one.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If the beesd script is started without systemd, the mount point won't
be unmounted automatically when the script is cancelled with ctrl+c.
Fixes: https://github.com/Zygo/bees/issues/281
Signed-off-by: Kai Krakow <kai@kaishome.de>
There's a pathological case where all of the extent scan crawlers except
one are at the end of a crawl cycle, but the one crawler that is still
running is keeping the Task queue full. The result is that bees never
starts the other extent scan crawlers, because the queue is always
full at the instant a new transid triggers the start of a new scan.
That's bad because it will result in bees falling behind when new data
from the inactive size tiers appears.
To fix this, check for throttling _after_ creating at least one scan task
in each crawler. That will keep the crawlers running, and possibly allow
them to claw back some space in the Task queue. It slightly overcommits
the Task queue, so there will be a few more Tasks than nominally allowed.
Also (re)introduce some hysteresis in the queue size limit and reduce it
a little, so that bees isn't continually stopping and restarting crawls
every time one task is created or completed, and so that we stay under
the configured Task limit despite overcommitting.
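A sketch of the hysteresis (the watermark values and names are invented, not bees's actual constants): the crawler stops queueing above the high watermark and doesn't resume until the queue drains below the low watermark, so a single Task starting or completing can't toggle crawlers on and off.

```cpp
#include <cstddef>

struct ThrottleState {
	size_t high_water = 90;  // stop queueing above this many Tasks
	size_t low_water  = 75;  // don't resume until below this many
	bool   throttled  = false;

	// Returns true when the crawler should pause after its current Task.
	bool should_throttle(size_t queue_size)
	{
		if (throttled) {
			if (queue_size < low_water) throttled = false;
		} else {
			if (queue_size > high_water) throttled = true;
		}
		return throttled;
	}
};
```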
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Toxic extent workarounds are going away because the underlying kernel
bugs have been fixed. They are no longer worthy of spamming non-developer
logs.
INO_PATHS can return no paths if an inode has been deleted. It doesn't
need a log message at all, much less one at WARN level.
Dedupe failure can be INFO, the same level as dedupe itself, especially
since the "NO dedupe" message doesn't mention what was [not] deduped.
Inspired by Kai Krakow's "context: demote "abandoned toxic match" to
debug log level".
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This log message creates an overwhelming number of messages in the
system journal, leading to write-back flushing storms under high
activity. As it is a workaround message, it is probably only useful
to developers, so demote it to debug level.
This fixes latency spikes in desktop usage after adding a lot of new
files, especially since systemd-journal starts to flush caches if it
sees memory pressure.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Tasks are not allowed to be queued more than once, but it is allowed
to queue a Task while it's already running, which means a Task can be
executed on two threads in parallel. Tasks detect this and handle it
by queueing the Task on its own post-exec queue. That in turn leads
to Workers which continually execute the same Task if that Task doesn't
create any new Tasks, while other Tasks sit on the Master queue waiting
for a Worker to dequeue them.
For idle Tasks, we don't want the Task to be rescheduled immediately.
We want the idle Task to execute again after every available Task on
both the main and idle queues has been executed.
Fix these by having each Task reschedule itself on the appropriate
queue when it finishes executing.
Priority-queued Tasks should be executed in priority order across not
just one Task's post-exec queue, but the entire local queue of the
TaskConsumer.
Fix this by moving the sort into either the TaskConsumer that receives
a post-exec queue, if there is one, or into the Task that is created
to insert the post-exec queue into a TaskConsumer when one becomes
available in the future.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
next_transid tasks don't respect queue selection very well, because
they effectively end up spinning in a loop until all other worker
threads become busy.
Back this out, and fix the priority handling in the Task library.
This reverts commit 58db4071de5f524c35b1362bfb5b1fceedea503f.
Tasks using non-priority FIFO dependency tracking can insert themselves
into their own queue, to run the Task again immediately after it exits.
For priority queues, this attempts to splice the post-exec queue into
itself, which doesn't seem like a good idea.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Suppose Task A, B, and C are created in that order, and currently running.
Task T acquires Exclusion E. Task B, A, and C attempt to acquire the
same Exclusion, in that order, but fail because Task T holds it.
The result is Task T with a post-exec queue:
T, [ B, A, C ] sort_requested
Now suppose Task U acquires Exclusion F, then Task T attempts to acquire
Exclusion F. Task T fails to acquire F, so T is inserted into U's
post-exec queue. The result at the end of the execution of T is a tree:
U, [ T ] sort_requested
\-> [ B, A, C ] sort_requested
Task T exits after failing to acquire a lock. When T exits, T will
sort its post-exec queue and submit the post-exec queue for execution
immediately:
Worker 1: U, [ T ] sort_requested
Worker 2: A, B, C
This isn't ideal because T, A, B, and C all depend on at least one
common Exclusion, so they are likely to immediately conflict with T
when U exits and T runs again.
Ideally, A, B, and C would at least remain in a common queue with T,
and ideally that queue is sorted.
Instead of inserting T into U's post-exec queue, insert T and all
of T's post-exec queue, which creates a single flattened Task list:
U, [ T, B, A, C ] sort_requested
Then when U exits, it will sort [ T, B, A, C ] into [ A, B, C, T ],
and run all of the queued Tasks in age priority order:
U exited, [ T, B, A, C ] sort_requested
U exited, [ A, B, C, T ]
[ A, B, C, T ] on TaskConsumer queue
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Task started out as a self-organizing parallel-make algorithm, but ended
up becoming a half-broken wait-die algorithm. When a contended object
is already locked, Tasks enter a FIFO queue to restart and acquire the
lock. This is the "die" part of wait-die (all locks on an Exclusion are
non-blocking, so no Task ever does "wait"). The lock queue is FIFO wrt
_lock acquisition order_, not _Task age_ as required by the wait-die
algorithm.
Make it a 25%-broken wait-die algorithm by sorting the Tasks on lock
queues in order of Task ID, i.e. oldest-first, or FIFO wrt Task age.
This ensures the oldest Task waiting for an object is the one to get
it when it becomes available, as expected from the wait-die algorithm.
This should reduce the amount of time Tasks spend on the execution queue,
and reduce memory usage by avoiding the accumulation of Tasks that cannot
make forward progress.
Note that turning `TaskQueue` into an ordered container would have
undesirable side-effects:
* `std::list` has some useful properties wrt stability of object
location and cost of splicing. Other containers may not have these,
and `std::list` does have a `sort` method.
* Some Task objects are created at the beginning and reused continually,
but we really do want those Tasks to be executed in FIFO order wrt
submission, not Task ID. We can exclude these tasks by only doing the
sorting when a Task is queued for an Exclusion object.
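The sort itself can be sketched as follows (`TaskRec` stands in for the real Task state; Task IDs increase monotonically, so a lower id means an older Task):

```cpp
#include <list>

struct TaskRec {
	unsigned long id;  // monotonically increasing: lower id == older Task
};

// Restore wait-die order on a lock queue: oldest Task first. std::list
// keeps splicing cheap and provides its own sort method.
static void sort_lock_queue(std::list<TaskRec> &queue)
{
	queue.sort([](const TaskRec &a, const TaskRec &b) { return a.id < b.id; });
}
```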
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Emphasize that the option is relevant to old kernels, older than the
minimum supportable version threshold.
De-emphasize the use case of "send-workaround" as a synonym for "exclude
read-only".
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
One of the more obvious ways to reduce bees load is to simply not run
it all the time. Explicitly state using maintenance windows as a load
management option.
SIGUSR1 and SIGUSR2 should have been documented somewhere else before now.
Better late than never.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The theories behind bees slowing down when presented with a larger hash
table turned out to be wrong. The real cause was a very old bug which
submitted thousands of `LOGICAL_INO` requests when only a handful of
requests were needed.
"Compression on the filesystem" -> "Compression in files"
Don't be so "dramatic". Be "rapid" instead.
Remove "cannot avoid modifying read-only snapshots" as a distinction
between subvol and extent scans. Both modes support send workaround
and send waiting with no significant distinction.
Emphasize extent scan's better handling of many snapshots. Also reflinks.
Add some discussion of `--throttle-factor`.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Thread names have changed. Document some of the newer ones.
Don't jump immediately to blaming poor performance on qgroups or
autodefrag. These do sometimes have kernel regressions but not all
the time.
Emphasize advantage of controlling bees deferred work requests at the
source, before btrfs gets stuck committing them.
Avoid asserting that it's OK for gdb to crash.
Remove mention of lower-layer block device issues wrt corruption.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"Kernel" -> "Linux kernel". If you can run bees on a kernel that isn't
Linux, congratulations!
Emphasize the age of the data corruption warnings. Once 5.4 reaches
EOL we can remove those.
Simplify the discussion of old kernels and API levels. There's a
new optional kernel API for `openat2` support at 5.6. The absolute
minimum kernel version is still 4.2, and will not increase to 4.15
until the subvol scanners are removed.
Remove discussion of bees support for kernels 4.19 (which recently
reached EOL) and earlier.
The `LOGICAL_INO` vs dedupe bug is actually a `LOGICAL_INO` vs clone bug.
Dedupe isn't necessary to reproduce it.
Remove a stray ')'.
Strip out most of the discussion of slow backrefs, as they are no longer a
concern on the range of supported kernel versions. Leave some description
there because bees still has some vestigial workarounds.
Remove `btrfs send` from the "Unfixed kernel bugs" section, which makes
the section empty, so remove the section too. bees now handles send on
a subvol reasonably well.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Emphasize "large" is an upper bound on the size of filesystem bees
can handle.
New strengths: largest extent first for fixed maintenance windows,
scans data only once (ish), recovers more space
Removed weaknesses: less temporary space
Need more caps than `CAP_SYS_ADMIN`.
Emphasize DATA CORRUPTION WARNING is an old-kernel thing.
Update copyright year.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tested on larger filesystems than 100T too, but let's use Fermi
approximation. Next size is 1P.
Removed interaction with block-level SSD caching subsystems. These are
really btrfs metadata vs. a lower block layer, and have nothing to do
with bees.
Added mixed block groups to the tested list, as mixed block groups
required explicit support in the extent scanner.
Added btrfs-convert to the tested list. btrfs-convert has various
problems with space allocation in general, but these can be solved by
carefully ordered balances after conversion, and they have nothing to
do with bees.
In-kernel dedupe is dead and the stubs were removed years ago. Remove it
from the list.
btrfs send now plays nicely with bees on all supportable kernels, now
that stable/linux-4.19.y is dead. Send workaround is only needed for
kernels before v5.4 (technically v5.2, but nobody should ever mount a
btrfs with kernel v5.1 to v5.3). bees will pause automatically when
deduping a subvol that is currently running a send.
bees will no longer gratuitously refragment data that was defragmented
by autodefrag.
Explicitly list all the RAID profiles tested so far, as there have been
some new ones.
Explicitly list other deduplicators tested.
Sort the list of btrfs features alphabetically.
Add scrub and balance, which have been tested with bees since the
beginning.
New tested btrfs features: block-group-tree, raid1c3, raid1c4.
New untested btrfs features: squotas, raid-stripe-tree.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This records the time when the progress data was calculated, to help
indicate when the data might be very old.
While we're here, move "now" out of the loop so there's only one value.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This increases resistance to symlink and mount attacks.
Previously, bees could follow a symlink or a mount point in a directory
component of a subvol or file name. Once the file is opened, the open
file descriptor would be checked to see if its subvol and inode matches
the expected file in the target filesystem. Files that fail to match
would be immediately closed.
With openat2 resolve flags, symlinks and mount points terminate path
resolution in the kernel. Paths that lead through symlinks or onto
mount points cannot be opened at all.
Fall back to openat() if openat2() returns ENOSYS, so bees will still
run on kernels before v5.6.
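A sketch of the call and its fallback, going through a raw syscall since libc may not expose `openat2()`; the struct layout and resolve-flag values are copied from `linux/openat2.h` so the sketch builds even with older userspace headers:

```cpp
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

// Definitions matching linux/openat2.h, copied here for illustration
struct open_how_compat { uint64_t flags, mode, resolve; };
#define RESOLVE_NO_XDEV_COMPAT     0x01  // terminate resolution at mount points
#define RESOLVE_NO_SYMLINKS_COMPAT 0x04  // terminate resolution at symlinks

static int open_no_follow(int dirfd, const char *path, int flags)
{
	struct open_how_compat how = {};
	how.flags = static_cast<uint64_t>(flags);
	how.resolve = RESOLVE_NO_XDEV_COMPAT | RESOLVE_NO_SYMLINKS_COMPAT;
#ifdef SYS_openat2
	const long nr = SYS_openat2;
#else
	const long nr = 437;  // openat2 has the same number on all architectures
#endif
	int fd = static_cast<int>(syscall(nr, dirfd, path, &how, sizeof how));
	if (fd < 0 && errno == ENOSYS) {
		// Kernel older than v5.6: fall back to plain openat()
		fd = openat(dirfd, path, flags);
	}
	return fd;
}
```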
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Since we're now using weak symbols for dodgy libc functions, we might
as well do it for gettid() too.
Use the ::gettid() global namespace and let libc override it.
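The pattern looks like this; to keep the sketch self-contained it references a deliberately hypothetical symbol name (so the fallback path is always taken here), where the real code would declare libc's `gettid` itself:

```cpp
#include <sys/syscall.h>
#include <unistd.h>

// Weak reference: resolves to libc's definition when one exists, and is a
// null pointer otherwise. The symbol name here is hypothetical.
extern "C" __attribute__((weak)) pid_t hypothetical_libc_gettid();

static pid_t portable_gettid()
{
	if (hypothetical_libc_gettid)
		return hypothetical_libc_gettid();           // libc wins if present
	return static_cast<pid_t>(syscall(SYS_gettid));      // raw-syscall fallback
}
```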
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
openat2 allows closing more TOCTOU holes, but we can only use it when
the kernel supports it.
This should disappear seamlessly when libc implements the function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"ctime", an abbreviation of "cycle time", collides with "ctime", an
abbreviation of "st_ctime", a well-known filesystem term.
"tm_left" fits in the column, so use that.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
* Report position within cycle in units that cannot be mistaken for size or percentage
* Put the total/maximum values in their own row
* Add a start time column
* Change column titles to reference "cycles"
* Use "idle" instead of "finished" when a crawler is not running
* Replace "transid" with "gen" because it's shorter
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The scanners which finish early can become stuck behind scanners that are
able to keep the queue full. Switch the next_transid task to the normal
Task queues so that we force scanners to restart on every new transaction,
possibly deferring already queued work to do so.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add yet another field to the scan/skip report line: the wallclock
time used to process the extent ref.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The total data size should not include metadata or system block groups,
and already does not; however, we still have these block groups in the map
for mapping the crawl pointer to a logical offset within the filesystem.
Rearrange a few lines around the `if` statement so that the map doesn't
contain anything it should not.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The progress indicator was failing on a mixed-bg filesystem because those
filesystems have block groups which have both _DATA and _METADATA bits,
and the filesystem size calculation was excluding block groups that have
_METADATA set. It should exclude block groups that have _DATA not set.
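The corrected predicate, sketched with flag values copied from `linux/btrfs_tree.h` for illustration:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Flag bits as defined in linux/btrfs_tree.h
static const uint64_t BG_DATA     = 1ull << 0;  // BTRFS_BLOCK_GROUP_DATA
static const uint64_t BG_METADATA = 1ull << 2;  // BTRFS_BLOCK_GROUP_METADATA

// Sum the sizes of data block groups. Testing "_DATA not set" keeps mixed
// block groups (which have both _DATA and _METADATA) in the total, where
// the old "_METADATA set" test wrongly excluded them.
static uint64_t total_data_bytes(const std::vector<std::pair<uint64_t, uint64_t>> &bgs)
{
	uint64_t total = 0;
	for (const auto &bg : bgs) {  // pair of {flags, length}
		if (!(bg.first & BG_DATA)) continue;
		total += bg.second;
	}
	return total;
}
```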
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Running bees with no arguments complains about "Only one" path argument.
Replace this with "Exactly one" which uses similar terminology to other
btrfs tools.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
`getopt_long` already supplies a message when an option cannot be parsed,
so there isn't a need to distinguish option parse failures from help
requests.
Fixes: https://github.com/Zygo/bees/pull/277
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Longer latency testing runs are not showing a consistent gain from a
throttle factor of 1.0. Make the default more conservative.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Decaying averages by 10% every 5 minutes gives roughly a half-hour
half-life to the rolling average. Speed that up to once per minute.
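The half-life arithmetic, for reference: a decay of fraction d once per interval t gives a half-life of t·ln 2 / −ln(1 − d), so 10% every 5 minutes is about 33 minutes and 10% every minute is about 6.6 minutes.

```cpp
#include <cmath>

// Half-life of a rolling average multiplied by (1 - decay) once per
// interval.
static double half_life_minutes(double interval_minutes, double decay)
{
	return interval_minutes * std::log(2.0) / -std::log(1.0 - decay);
}
```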
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We're not adding any more short options, but the debugging code doesn't
work with optvals above 255. Also clean up constness and variable
lifetimes.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Measure the time spent running various operations that extend btrfs
transaction completion times (`LOGICAL_INO`, tmpfiles, and dedupe)
and arrange for each operation to run for not less than the average
amount of time by adding a sleep after each operation that takes less
than the average.
The delay after each operation is intended to slow down the rate of
deferred and long-running requests from bees to match the rate at which
btrfs is actually completing them. This may help avoid big spikes in
latency if btrfs has so many requests queued that it has to force a
commit to release memory.
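The delay computation might be sketched as follows (the name and the way the throttle factor is applied are illustrative):

```cpp
#include <algorithm>

// An operation that finished faster than the rolling average sleeps for
// the difference, scaled by the throttle factor, so deferred btrfs work
// is issued no faster than btrfs has been completing it.
static double throttle_delay_secs(double op_secs, double avg_secs, double factor)
{
	return std::max(0.0, (avg_secs - op_secs) * factor);
}
```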
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Test machines keep blowing past the 32k file limit. 16 worker
threads at 10,000 files each is much larger than 32k.
Other high-FD-count services like DNS servers ask for million-file
rlimits.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
While a snapshot is being deleted, there will be a continuous stream of
"No ref for extent" messages. This is a common event that does not need
to be reported.
There is an analogous situation when a call to open() fails with ENOENT.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Dedupe is not possible on a subvol where a btrfs send is running:
BTRFS warning (device dm-22): cannot deduplicate to root 259417 while send operations are using it (1 in progress)
btrfs informs a process with EAGAIN that a dedupe could not be performed
due to a running send operation.
It would be possible to save the crawler state at the affected point,
fork a new crawler that avoids the subvol under send, and resume the
crawler state after a successful dedupe is detected; however, this only
helps the intersection of the set of users who have unrelated subvols
that don't share extents, and the set of users who cannot simply delay
dedupe until send is finished. The simplest approach is to stop and
wait until the send goes away, and that is the approach taken here:
when a dedupe fails with EAGAIN, affected Tasks will poll, approximately
once per transaction, until the dedupe succeeds or fails with a
different error.
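The polling loop can be sketched as follows (`attempt` and `wait_for_new_transid` are stand-ins for the real dedupe ioctl and bees's transid-watching machinery):

```cpp
#include <cerrno>
#include <functional>

// Retry a dedupe that fails with EAGAIN (send in progress), roughly once
// per btrfs transaction, until it succeeds or fails with another error.
// `attempt` returns 0 on success or an errno value on failure.
static int dedupe_with_send_wait(const std::function<int()> &attempt,
                                 const std::function<void()> &wait_for_new_transid)
{
	for (;;) {
		const int err = attempt();
		if (err != EAGAIN)
			return err;           // success (0) or a real error
		wait_for_new_transid();       // send still running: poll again
	}
}
```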
bees dedupe performance corresponds with the availability of subvols that
can accept dedupe requests. While the dedupe is paused, no new Tasks can
be performed by the worker thread. If subvols are small and isolated
from the bulk of the filesystem data, the result will be a small,
partial loss of dedupe performance during the send as some worker threads
get stuck on the sending subvol. If subvols heavily share extents with
duplicate data in other subvols, worker threads will all become blocked,
and the entire bees process will pause until at least some of the running
sends terminate.
During the polling for btrfs send, the dedupe Task will hold its dst
file open. This open FD won't interfere with snapshot or file delete
because send subvols are always read-only (it is not possible to delete
a file on a RO subvol, open or otherwise) and send itself holds the
affected subvol open, preventing its deletion. Once the send terminates,
the dedupe will terminate soon after, and the normal FD release can occur.
This pausing during btrfs send is unrelated to the
`--workaround-btrfs-send` option, although `--workaround-btrfs-send` will
cause the pausing to trigger less often. It applies to all scan modes.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>