Fix the missing symbols that popped up when adding the chunk tree to
lib/fs.cc: define the missing symbols instead of merely trying to
avoid them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Rearrange the logic in `rlower_bound` so it can cope with a tree
that contains mostly block-aligned objects, with a few exceptions
filtered out by `hdr_stop`.
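Roughly, the idea looks like this (a sketch only, using a plain ordered
map in place of the real tree fetcher; names and types are assumed, not
the actual crucible code):

    #include <cstdint>
    #include <functional>
    #include <map>

    // Find the last object at or before `target`, skipping the few
    // exceptional entries that the stop filter (hdr_stop's role) rejects.
    template <class V>
    const V *rlower_bound(const std::map<uint64_t, V> &tree, uint64_t target,
                          const std::function<bool(const V &)> &stop)
    {
            auto it = tree.upper_bound(target);     // first key strictly after target
            while (it != tree.begin()) {
                    --it;                           // step back through keys <= target
                    if (!stop(it->second)) {
                            return &it->second;
                    }
            }
            return nullptr;                         // nothing acceptable at or below target
    }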
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In some cases, functions already had debug stream support that can be
redirected to the new interface. In other cases, new debug messages
are added.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This allows plugging in an ostream at run time so that we can audit all
the search calls we are doing.
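A minimal sketch of the mechanism, with assumed names (the real
interface in lib may differ):

    #include <atomic>
    #include <iostream>

    // Run-time pluggable debug stream: callers install an ostream, and
    // the search code logs through it only while one is installed.
    static std::atomic<std::ostream *> s_debug_stream{nullptr};

    void set_debug_stream(std::ostream *os) { s_debug_stream = os; }

    void log_search(const char *what)
    {
            if (auto *os = s_debug_stream.load()) {
                    *os << "search: " << what << std::endl;
            }
    }

    // set_debug_stream(&std::cerr) turns the audit on;
    // set_debug_stream(nullptr) turns it off again.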
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BtrfsFsTreeFetcher was used for early versions of the extent scanner, but
neither subvol nor extent scan now needs an object that is both persistent
and configured to access only one subvol. BtrfsExtentDataFetcher does
the same thing in that case.
Clarify the comments on what the remaining classes do, so that
BtrfsFsTreeFetcher doesn't get inadvertently reinvented in the future.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Binary searches can be extremely slow if the target bytenr is near a
metadata block group, because metadata items are not visible to the
binary search algorithm. In a non-mixed-bg filesystem, there can be
hundreds of thousands of metadata items between data extent items, and
since the binary search algorithm can't see them, it will run searches
that iterate over hundreds of thousands of objects about a dozen times.
This is less of a problem for mixed-bg filesystems because the data and
metadata blocks are not isolated from each other. The binary search
algorithm still can't see the metadata items, but there are usually
some data items close by to prevent the linear item filter from running
too long.
Introduce a new fetcher class (all the good names were taken) that tracks
where the end of the current block group is. When the end of the current
block group is reached in the linear search, skip ahead to a block group
that can contain data items.
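A rough sketch of the skip-ahead logic, assuming a map of block group
start -> (length, flags) built from the chunk tree (the real fetcher
class and its names differ):

    #include <cstdint>
    #include <iterator>
    #include <map>

    struct BlockGroup { uint64_t length; bool has_data; };
    using BlockGroupMap = std::map<uint64_t, BlockGroup>;

    // If bytenr falls inside a block group that cannot contain data
    // items, return the start of the next block group that can;
    // otherwise return bytenr unchanged.
    uint64_t skip_to_data(const BlockGroupMap &bgs, uint64_t bytenr)
    {
            auto it = bgs.upper_bound(bytenr);
            if (it != bgs.begin()) {
                    const auto prev = std::prev(it);
                    if (bytenr < prev->first + prev->second.length && prev->second.has_data) {
                            return bytenr;  // already in a data block group
                    }
            }
            while (it != bgs.end() && !it->second.has_data) {
                    ++it;                   // the linear item filter never sees these
            }
            return it == bgs.end() ? bytenr : it->first;
    }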
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Enable use of the ioctl to probe whether two fds refer to the same btrfs,
without throwing an exception.
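The commit message doesn't name the ioctl, but one plausible shape for
the probe is comparing fsids from BTRFS_IOC_FS_INFO and reporting
failure instead of throwing (a sketch; function name assumed):

    #include <linux/btrfs.h>
    #include <sys/ioctl.h>
    #include <cstring>

    // True only if both fds are on btrfs and their fsids match; any
    // ioctl failure (e.g. not a btrfs at all) simply returns false.
    bool same_btrfs(int fd_a, int fd_b)
    {
            btrfs_ioctl_fs_info_args a = {}, b = {};
            if (ioctl(fd_a, BTRFS_IOC_FS_INFO, &a)) return false;
            if (ioctl(fd_b, BTRFS_IOC_FS_INFO, &b)) return false;
            return memcmp(a.fsid, b.fsid, sizeof(a.fsid)) == 0;
    }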
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tasks are not allowed to be queued more than once, but it is allowed
to queue a Task while it's already running, which means a Task can be
executed on two threads in parallel. Tasks detect this and handle it
by queueing the Task on its own post-exec queue. That in turn leads
to Workers which continually execute the same Task if that Task doesn't
create any new Tasks, while other Tasks sit on the Master queue waiting
for a Worker to dequeue them.
For idle Tasks, we don't want the Task to be rescheduled immediately.
We want the idle Task to execute again after every available Task on
both the main and idle queues has been executed.
Fix these by having each Task reschedule itself on the appropriate
queue when it finishes executing.
Priority-queued Tasks should be executed in priority order across not
just one Task's post-exec queue, but the entire local queue of the
TaskConsumer.
Fix this by moving the sort into either the TaskConsumer that receives
a post-exec queue, if there is one, or into the Task that is created
to insert the post-exec queue into a TaskConsumer when one becomes
available in the future.
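A toy model of the rescheduling fix, with all names assumed (the real
Task library tracks this with internal state and locks):

    #include <deque>
    #include <functional>
    #include <memory>

    struct Task {
            std::function<void()> fn;
            bool idle = false;      // idle Tasks run only when nothing else is queued
            bool requeue = false;   // set when the Task was queued while running
    };

    std::deque<std::shared_ptr<Task>> main_queue, idle_queue;

    void run_one(const std::shared_ptr<Task> &t)
    {
            t->fn();
            if (t->requeue) {
                    t->requeue = false;
                    // Back of the appropriate queue, not the same Worker again.
                    (t->idle ? idle_queue : main_queue).push_back(t);
            }
    }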
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tasks using non-priority FIFO dependency tracking can insert themselves
into their own queue, to run the Task again immediately after it exits.
For priority queues, this attempts to splice the post-exec queue into
itself, which doesn't seem like a good idea.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Suppose Tasks A, B, and C are created in that order, and are currently
running. Task T acquires Exclusion E. Tasks B, A, and C attempt to
acquire the same Exclusion, in that order, but fail because Task T
holds it.
The result is Task T with a post-exec queue:
    T, [ B, A, C ] sort_requested
Now suppose Task U acquires Exclusion F, then Task T attempts to acquire
Exclusion F. Task T fails to acquire F, so T is inserted into U's
post-exec queue. The result at the end of the execution of T is a tree:
    U, [ T ] sort_requested
        \-> [ B, A, C ] sort_requested
Task T exits after failing to acquire a lock. When T exits, T will
sort its post-exec queue and submit the post-exec queue for execution
immediately:
    Worker 1: U, [ T ] sort_requested
    Worker 2: A, B, C
This isn't ideal because T, A, B, and C all depend on at least one
common Exclusion, so they are likely to immediately conflict with T
when U exits and T runs again.
Ideally, A, B, and C would at least remain in a common queue with T,
and ideally that queue is sorted.
Instead of inserting T into U's post-exec queue, insert T and all
of T's post-exec queue, which creates a single flattened Task list:
    U, [ T, B, A, C ] sort_requested
Then when U exits, it will sort [ T, B, A, C ] into [ A, B, C, T ],
and run all of the queued Tasks in age priority order:
    U exited, [ T, B, A, C ] sort_requested
    U exited, [ A, B, C, T ]
    [ A, B, C, T ] on TaskConsumer queue
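With `std::list` as the queue type, the flattening step is a
constant-time splice (a sketch, with names assumed):

    #include <list>
    #include <memory>

    struct Task;
    using TaskQueue = std::list<std::shared_ptr<Task>>;
    struct Task {
            TaskQueue post_exec_queue;
    };

    // Instead of inserting only T, move T's entire post-exec queue with
    // it, leaving a single flat, sortable list behind U.
    void append_task_with_queue(TaskQueue &holder_queue, const std::shared_ptr<Task> &t)
    {
            holder_queue.push_back(t);                                      // U, [ ..., T ]
            holder_queue.splice(holder_queue.end(), t->post_exec_queue);    // U, [ ..., T, B, A, C ]
            // splice is O(1) and leaves t->post_exec_queue empty.
    }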
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Task started out as a self-organizing parallel-make algorithm, but ended
up becoming a half-broken wait-die algorithm. When a contended object
is already locked, Tasks enter a FIFO queue to restart and acquire the
lock. This is the "die" part of wait-die (all locks on an Exclusion are
non-blocking, so no Task ever does "wait"). The lock queue is FIFO wrt
_lock acquisition order_, not _Task age_ as required by the wait-die
algorithm.
Make it a 25%-broken wait-die algorithm by sorting the Tasks on lock
queues in order of Task ID, i.e. oldest-first, or FIFO wrt Task age.
This ensures the oldest Task waiting for an object is the one to get
it when it becomes available, as expected from the wait-die algorithm.
This should reduce the amount of time Tasks spend on the execution queue,
and reduce memory usage by avoiding the accumulation of Tasks that cannot
make forward progress.
Note that turning `TaskQueue` into an ordered container would have
undesirable side-effects:
* `std::list` has some useful properties wrt stability of object
location and cost of splicing. Other containers may not have these,
and `std::list` does have a `sort` method.
* Some Task objects are created at the beginning and reused continually,
but we really do want those Tasks to be executed in FIFO order wrt
submission, not Task ID. We can exclude these Tasks by only doing the
sorting when a Task is queued for an Exclusion object (see the sketch
below).
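A sketch of the sort itself, using `std::list::sort` so object
locations stay stable (Task fields assumed):

    #include <cstdint>
    #include <list>
    #include <memory>

    struct Task {
            uint64_t id;    // monotonically increasing: lower ID means older Task
    };
    using TaskQueue = std::list<std::shared_ptr<Task>>;

    // Called only when a Task is queued on an Exclusion, so reused FIFO
    // Tasks keep their submission order.
    void sort_oldest_first(TaskQueue &q)
    {
            q.sort([](const std::shared_ptr<Task> &a, const std::shared_ptr<Task> &b) {
                    return a->id < b->id;
            });
    }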
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Since we're now using weak symbols for dodgy libc functions, we might
as well do it for gettid() too.
Declare ::gettid() in the global namespace as a weak symbol and let
libc override it.
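A sketch of the weak-symbol pattern (wrapper name assumed):

    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    // Weak declaration: if libc defines gettid(), the linker binds to
    // it; otherwise the symbol is null and we use the raw syscall.
    extern "C" pid_t gettid() __attribute__((weak));

    pid_t crucible_gettid()
    {
            if (gettid) {
                    return gettid();
            }
            return static_cast<pid_t>(syscall(SYS_gettid));
    }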
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
openat2 allows closing more TOCTOU holes, but we can only use it when
the kernel supports it.
This should disappear seamlessly when libc implements the function.
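A sketch of the same weak-symbol approach applied to openat2 (wrapper
name assumed; requires kernel headers that provide linux/openat2.h and
SYS_openat2):

    #include <linux/openat2.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    // If libc ever defines openat2(), the weak symbol resolves to it
    // and this wrapper steps aside; until then, use the raw syscall.
    // Kernels without openat2 return ENOSYS and callers fall back to
    // plain openat().
    extern "C" int openat2(int dirfd, const char *pathname,
                           struct open_how *how, size_t size) __attribute__((weak));

    int crucible_openat2(int dirfd, const char *pathname,
                         struct open_how *how, size_t size)
    {
            if (openat2) {
                    return openat2(dirfd, pathname, how, size);
            }
            return static_cast<int>(syscall(SYS_openat2, dirfd, pathname, how, size));
    }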
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
* Allow RateLimiter to change rate after construction.
* Check range of rate argument in constructor.
* Atomic increment for RateEstimator.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
bees might be unpaused at any time, so make sure that the dynamic load
calculation is ready with a non-zero thread count.
This avoids a delay of up to 5 seconds when responding to SIGUSR2
when loadavg tracking is enabled.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When paused, TaskConsumer threads will eventually notice the paused
condition and exit; however, there's nothing to restart threads when
exiting the paused state.
When unpausing, and while the lock is already held, create TaskConsumer
threads as needed to reach the target thread count.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Commit 72c3bf8438830b65cae7bdaff126053e562280e5 ("fs: handle ENOENT
within lib") was meant to prevent exceptions when a subvol is deleted.
If the search ioctl fails, the kernel won't set nr_items in the
ioctl output, which means `nr_items` still has the input value. When
ENOENT is detected, `this->nr_items` is set to 0, then later `*this =
ioctl_ptr->key` overwrites `this->nr_items` with the original requested
number of items.
This replaced the ENOENT exception with an exception triggered by
interpreting garbage in the memory buffer. The number of exceptions
was reduced because the memory buffers are frequently reused, but upper
layers would then reject the data or ignore it because it didn't match
the key range.
Fix by setting `ioctl_ptr->key.nr_items` to zero, which then overwrites
`this->nr_items`, so the loop that extracts items from the ioctl data
gets the right number of items (i.e. zero).
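A sketch of the fixed sequence (field names from the btrfs ioctl ABI;
the surrounding class is assumed):

    #include <linux/btrfs.h>
    #include <sys/ioctl.h>
    #include <cerrno>

    // Run the search; on ENOENT, zero the item count in the ioctl
    // buffer itself, so that the later copy of ioctl_ptr->key
    // propagates nr_items == 0 instead of the requested count.
    bool do_search(int fd, btrfs_ioctl_search_args_v2 *ioctl_ptr)
    {
            if (ioctl(fd, BTRFS_IOC_TREE_SEARCH_V2, ioctl_ptr)) {
                    if (errno != ENOENT) {
                            return false;           // real error, caller throws
                    }
                    ioctl_ptr->key.nr_items = 0;    // "we got zero items", not garbage
            }
            return true;    // caller does *this = ioctl_ptr->key as before
    }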
Fixes: 72c3bf8438830b65cae7bdaff126053e562280e5 ("fs: handle ENOENT within lib")
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
operator<< was a friend function that locked the ByteVector, then
invoked hexdump on the ByteVector, which used ByteVector::operator[]...
which locked the ByteVector again, resulting in a deadlock.
operator<< shouldn't be a friend anyway. Make hexdump use the normal
public access methods for ByteVector.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a second level queue which is only serviced when the local and global
queues are empty.
At some point there might be a need to implement a full priority queue,
but for now two classes are sufficient.
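A toy model of the dequeue order (queue names assumed):

    #include <deque>
    #include <functional>
    #include <optional>

    std::deque<std::function<void()>> local_queue, global_queue, idle_queue;

    // Idle Tasks are serviced only when both other queues are empty.
    std::optional<std::function<void()>> next_task()
    {
            for (auto *q : { &local_queue, &global_queue, &idle_queue }) {
                    if (!q->empty()) {
                            auto fn = std::move(q->front());
                            q->pop_front();
                            return fn;
                    }
            }
            return std::nullopt;    // nothing to do at any priority
    }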
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This should help clean up some of the uglier status outputs.
Supports:
* multi-line table cells
* character fills
* sparse tables
* insert, delete by row and column
* vertical separators
and not much else.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a master switch to turn off the entire MultiLock infrastructure for
testing, without having to remove and add all the individual entry points.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This prevents the storms of exceptions that occur when a subvol is
deleted. We simply treat the entire tree as if it were empty.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The kernel has not imposed a 16 MiB limit on dedupe requests since
v4.18-rc1 b67287682688 ("Btrfs: dedupe_file_range ioctl: remove 16MiB
restriction").
Kernels before v4.18 would truncate the request and return the size
actually deduped in `bytes_deduped`. Kernel v4.18 and later will loop
in the kernel until the entire request is satisfied (although still
in 16 MiB chunks, so larger extents will be split).
Modify the loop in userspace to measure the size the kernel actually
deduped, instead of assuming the kernel will only accept 16 MiB.
On current kernels this will always loop exactly once.
Since we now rely on `bytes_deduped`, make sure it has a sane value.
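A sketch of the loop (helper name assumed; uses the FIDEDUPERANGE
structures from linux/fs.h):

    #include <linux/fs.h>
    #include <sys/ioctl.h>
    #include <cstdint>

    bool dedupe_range(int src_fd, uint64_t src_off, uint64_t length,
                      int dst_fd, uint64_t dst_off)
    {
            while (length > 0) {
                    alignas(file_dedupe_range) char buf[sizeof(file_dedupe_range)
                            + sizeof(file_dedupe_range_info)] = {};
                    auto *args = reinterpret_cast<file_dedupe_range *>(buf);
                    args->src_offset = src_off;
                    args->src_length = length;
                    args->dest_count = 1;
                    args->info[0].dest_fd = dst_fd;
                    args->info[0].dest_offset = dst_off;
                    args->info[0].bytes_deduped = 0;        // make sure it's sane
                    if (ioctl(src_fd, FIDEDUPERANGE, args)) return false;
                    if (args->info[0].status != FILE_DEDUPE_RANGE_SAME) return false;
                    const uint64_t done = args->info[0].bytes_deduped;
                    if (done == 0) return false;            // no forward progress
                    src_off += done;
                    dst_off += done;
                    length -= done;
            }
            return true;    // on v4.18+ kernels this loop runs exactly once
    }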
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Apparently reinterpret_cast<uint64_t> sign-extends 32-bit pointers.
This is OK when running on a 32-bit kernel that will truncate the pointer
to 32 bits, but when running on a 64-bit kernel, the extra bits are
interpreted as part of the (now very invalid) address.
Use <uintptr_t> instead, which is an unsigned integer type of the same
width as the arch's pointer type. Ordinary numeric conversion can take
it from there, filling the rest of the word with zeros.
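An illustration of the difference:

    #include <cstdint>

    // On a 32-bit arch, reinterpret_cast<uint64_t>(p) may sign-extend;
    // going through uintptr_t zero-fills the upper half instead.
    uint64_t to_ioctl_u64(void *p)
    {
            return reinterpret_cast<uintptr_t>(p);  // ordinary integer conversion zero-extends
    }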
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some malloc implementations will try to mmap() and munmap() large buffers
every time they are used, causing a severe loss of performance.
Nothing ever overrode the virtual methods, and there was no virtual
destructor, so they caused compiler warnings at build time when used
with a template that tries to delete pointers to them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
crucible::VERSION doesn't make much sense now that libcrucible no
longer exists as a shared library. Nothing ever referenced it, so
it can go away.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
According to ioctl_iflags(2):

    The type of the argument given to the FS_IOC_GETFLAGS and
    FS_IOC_SETFLAGS operations is int *, notwithstanding the
    implication in the kernel source file include/uapi/linux/fs.h
    that the argument is long *.

So this code doesn't work on be64 machines.
Also, Valgrind complains about it.
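The corrected pattern, per the man page:

    #include <linux/fs.h>
    #include <sys/ioctl.h>

    // The third ioctl argument really is int *; a long would read or
    // write the wrong 4 bytes on big-endian 64-bit machines.
    int get_iflags(int fd, int *flags)
    {
            *flags = 0;
            return ioctl(fd, FS_IOC_GETFLAGS, flags);
    }

    int set_iflags(int fd, int flags)
    {
            return ioctl(fd, FS_IOC_SETFLAGS, &flags);
    }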
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
A subtle distinction, and not one that is particularly relevant to bees,
but it does make toolchains complain.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Another instance of the pattern where we derived a crucible class
from a btrfs struct. Make it an automatic variable instead.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We only use BtrfsExtentInfo when it's exactly equivalent to the
base, so drop the derived class.
While we're here, fix BtrfsExtentSame::add so it uses a btrfs-compatible
uint64_t instead of an off_t.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
It turns out I've been using pthread_setname_np wrong the whole time:
* on Linux, the thread name length is 15 characters.
TASK_COMM_LEN is 16 bytes, and the last one is always 0.
This is now hardcoded in many places and cannot be changed.
* pthread_setname_np doesn't return -errno, so DIE_IF_MINUS_ERRNO
was the wrong macro. On the other hand, we never want to do anything
differently when pthread_setname_np fails, so we never needed to
check the return value.
Also, libc silently ignores attempts to set the thread name when it is too
long. That's almost certainly a libc bug, but libc probably suppresses
the error result for the same reasons I ignore the error result.
Wrap the pthread_setname function with a C++ std::string overload that
truncates the argument at 15 characters, so we at least get the first
part of the task name in the thread name field. Later commits can deal
with making the bees thread names shorter.
Also wrap pthread_getname for symmetry.
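A sketch of the wrappers (names assumed):

    #include <pthread.h>
    #include <string>

    // Truncate to the 15-character Linux limit and deliberately ignore
    // the return value: there is nothing useful to do when setting a
    // thread name fails.
    void pthread_setname(const std::string &name)
    {
            (void)pthread_setname_np(pthread_self(), name.substr(0, 15).c_str());
    }

    std::string pthread_getname()
    {
            char buf[16] = { 0 };   // TASK_COMM_LEN, including the trailing NUL
            if (pthread_getname_np(pthread_self(), buf, sizeof(buf))) {
                    return std::string();
            }
            return buf;
    }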
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
btrfs-tree provides classes for low-level access to btrfs tree objects.
An item class is provided to decode polymorphic btrfs item fields.
Several tree classes provide forward and backward iteration over raw
object items at different tree levels.
A csum tree class provides convenient access to csums by bytenr,
supporting all current btrfs csum types.
Wrapper classes for inode and subvol items provide direct access to
btrfs metadata fields without clumsy stat() wrappers or ioctls.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We are using ByteVectors from multiple threads in some cases. Mostly
these are the status and progress threads which read the ByteVector
object references embedded in BEESNOTE macros.
Since it's not clear what the data race implications are, protect
the shared_ptr in ByteVector with a mutex for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tasks are often running longer than 5 seconds (especially extents with
multiple references requiring copy operations), so the load tracking
algorithm needs to average several samples over a longer period of time
than 5 seconds. If the sample period is 60 seconds, we end up recomputing
the original load average from current_load, so skip the rounding error
and use the original load average value.
Arguably the real fix is to break up the more complex extent operations
over several downstream Task objects, but that's a more significant
design change.
Tweak the attack and decay rates so that threads are started a little
more slowly, but still stopped rapidly when load spikes up.
Remove the hysteresis to provide support for load average targets
below 1, or with fractional components, with a PWM-like effect.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>