GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2025-08-23 22:42:20 +02:00

Author	SHA1	Message	Date
Kai Krakow	081a6af278	bees: Avoid unused result with -Werror=unused-result Fixes: commit `20b8f8ae0b` ("bees: use helper function for readahead") Signed-off-by: Kai Krakow <kai@kaishome.de>	2021-06-19 10:35:28 +02:00
Zygo Blaxell	3d95460eb7	fiemap: don't force flush so we can see the delalloc shenanigans Like filefrag, fiemap was defaulting to FIEMAP_FLAG_SYNC, and providing no option to turn it off. This prevents observation of delayed allocations, making fiemap less useful. Override the default flag setting so fiemap gets the current (i.e. unflushed) extent map state. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 21:09:14 -04:00
Zygo Blaxell	d9e3c0070b	context: stop creating new refs when there are too many already LOGICAL_INO_V2 has a maximum limit of 655050 references per extent. Although it no longer has a crippling performance problem, at roughly two seconds to process such an extent, it's too slow to be useful. When an extent gains an absurd number of references, stop making any more. Returning zero extent refs will make bees believe the extent was deleted, and it will remove the block from the hash table. This helps speed processing of highly duplicated large files like VM images, at the cost of a slightly lower dedupe hit rate. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 21:05:55 -04:00
Zygo Blaxell	955b8ae459	task: set the name of consumer threads so it is not "load_tracker" The default name of a newly constructed thread is apparently the name of the thread that created it. That's very misleading when there are a lot of TaskConsumer threads and they have nothing to do, so set the name of each TaskConsumer thread as soon as it is created. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 21:02:00 -04:00
Zygo Blaxell	08899052ad	trace: current_exception() is not a replacement for uncaught_exception() In `15ab981d9e` "bees: replace uncaught_exception(), deprecated in C++17", uncaught_exception() was replaced with current_exception(); however, current_exception() is only valid after an exception has been captured by a catch block. BeesTracer wants to know about exceptions _before_ they are caught, so current_exception() is not useful here. Instead, conditionally compile using uncaught_exception() or uncaught_exceptions(), selected by C++ standard version, and make bees stack traces work again. Fixes: `15ab981d9e` "bees: replace uncaught_exception(), deprecated in C++17" Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	03532effed	trace: move BeesTrace and BeesNote into their own translation unit This allows these components to be used by test executables without pulling in all of bees, and more rapidly iterate their code. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	6adaedeecd	extentwalker: fix the binary search and add some debug infrastructure Add some conditionally-compiled debug code, including an in-memory log of what ExtentWalker does. Dump that log on exceptions. If we loop too many times in a debug build, kill the process so we can stack trace. In non-debug builds just throw a normal exception. Grow the step size instead of shrinking it, to reduce the number of binary search iterations. Prevent a bug where the step size bottoms out before positioning the target extent in the middle of the result vector. Use the first extent for "first_extent", instead of the 3rd. Get rid of some redundant checks. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	54f03a0297	extentwalker: fix missing characters "C" in LOGICAL_INO, and avoid writing "flags=" in the log. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	52279656cf	extentwalker: fix the hole position logic When a file ends with a hole, ExtentWalker synthesizes a hole extent record to cover the distance between the last ipos and EOF. Unfortunately, ipos was incremented by the number of items in the result vector instead. Fix that by incrementing by hole_extent.size(). While we're here, fix up some of the other data quality logic, including a useless THROW_CHECK that was nothing but workarounds for earlier bugs. Fixes: https://github.com/Zygo/bees/issues/26 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	1fd26a03b2	tracer: annotate both ends of the stack trace Add a matching "--- BEGIN TRACE..." line to complement the "--- END TRACE..." line. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	b083003cf7	docs: update kernel bugs table as of 5.12.3 Two new tree mod log bugs #5 and #6 (uncovered by the zoned IO work, though #6 has been seen in the wild on 5.10.29). Tweak the text of some of the workarounds. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	b2d4a07c6f	roots: add a TRACE for transid_max search and crawl_transid thread Some users are hitting an exception somewhere in crawl_transid, which forces bees to return back to the transid_max calculation over and over. Also out-of-range transids. Add some BEESTRACE so we can see what we were doing in the exception handler. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	7008c74113	bees: trace and log improvements during roots and context startup Currently if crawl throws an exception, we don't have basic information about what was being crawled or even if the crawler was running at all. These traces also help identify the causes of early exception failures. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	5f0f7a8319	bees: increase StringFile size limit If we are going to dedupe thousands of subvols, we are going to need a bigger beescrawl.dat. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	ee86b585a5	bees: use a reserved symbol name in BEESLOG "c" could be a local variable name, which would do interesting things to some log messages. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	cf4b5417c9	context: remove unnecessary copies These were added while debugging a crash that was fixed years ago. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	77ef6a0638	roots: split constructor into separate start method This allows us to use the fd cache and inode resolve functions without starting crawler threads. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	0f0da21198	context: track record extent reference counts This might be interesting information, though most of the motivation for this evaporated when kernel 5.7 came out. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	8a70bca011	bees: misc comment updates These have been accumulating in unpublished bees commits. Squash them all into one. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	20b8f8ae0b	bees: use helper function for readahead There seem to be multiple ways to do readahead in Linux, and only some of them work. Hopefully reading the actual data is one of them. This is an attempt to avoid page-by-page reads in the generic dedupe code. We load both extents into the VFS cache (read sequentially) and hope they are still there by the time we call dedupe on them. We also call readahead(2) and hopefully that either helps or does nothing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	0afd2850f4	cache: emit log messages when clearing FD cache This enables us to correlate FD cache clears with external events such as btrfs inode eviction storms. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:46 -04:00
Zygo Blaxell	ffac407a9b	roots: clean up crawl_master Remove some broken #if 0 code, and take advantage of new Task non-repeating execution semantics. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	4f032ab85b	context: report Task instance count Report the number of Task objects that currently exist as well as the number on the global work queue. THREADS (work queue 298 of 2385 tasks, 16 workers): This helps spot leaks, since Task objects that are blocked on other Task post-exec queues are otherwise invisible. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	5f763f6d41	task: handle thread lifecycle more strictly Testing sometimes crashes during exec of the first Task object, which triggers construction of TaskConsumer threads. Manage the life cycle of the thread more strictly--don't access any methods of TaskConsumer or std::thread until the constructor's caller's lock on TaskMaster is released. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	0928362aab	task: replace waiting state with run/exec counter Task::run() would schedule a new execution of Task, unless it was waiting on a queue for execution. This cannot be implemented with a bool, since a Task might be included in multiple queues, and should still be in waiting state even when executed in that case. Replace the bool with a counter. run() and append() (but not append_nolock) increment the counter, exec() decrements the counter. If the counter is non-zero when run() or append() is called, the Task is not scheduled. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	d5ff35eacf	task: track number of Task objects in program and provide report This is a simple lightweight counter that tracks the number of Task objects that exist. Useful for leak detection. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	b7f9ce3f08	task: serialize Task execution when Tasks block due to mutex contention Quite often we want to execute task B after task A finishes executing, especially if tasks A and B attempt to acquire locks on the same objects. Implement that capability in Task directly: each Task holds a queue of Tasks which will be executed strictly after this Task has finished executing, or if the Task is destroyed. Add a local queue to each TaskConsumer. This queue contains a list of Tasks which are to be executed by a single thread in sequential order. These tasks are executed before fetching any tasks from TaskMaster. Each time a Task finishes executing, the list of tasks appended to the recently executed Task are spliced at the beginning of the thread's TaskConsumer local queue. These tasks will be executed in the same thread in the same order they were appended to the recently executed Task. If a Task is destroyed with a post-execution queue, that queue is also inserted at the front of the current TaskConsumer's local queue. If a Task is destroyed or somehow executed outside of a TaskConsumer thread, or a TaskConsumer thread is destroyed, the local queue of Tasks is wrapped in a "rescue_task" Task, and spliced before the head of the global queue. This preserves the sequential ordering of tasks. In all cases the order of sequential execution of Tasks that are appended to another Task is preserved. The unused queue insertion functions are removed. Exclusion is now simply a mutex, a bool, and a Task with an empty function. Tasks that queue up waiting for the mutex are stored in Exclusion's Task, and Exclusion simply runs that task when the ExclusionState is released. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	592580369e	docs: btrfs-kernel: add the extent ref hash bug Fixed in 5.11 and 5.10 but _not_ 5.10 or 5.4 (yet). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	0bbaddd54c	docs: finally concede that the consensus spelling is "dedupe" Change documentation and comments to use the word "dedupe," not "dedup" as found in circa-3.15 kernel sources. No changes in code or program output--if they used "dedup" before, they will continue to be spelled "dedup" now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	06a46e2736	chatter: add option to remove log level prefix Some projects use only one log level, so there is no need to repeat it for every line. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	45afce72e3	test: fd: note when bad cast exception is expected Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	e4c95d618a	crucible: use '#include "crucible/...' everywhere Make the #include syntax more consistent (even if it has no effect). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	7cffad5fc3	fd: make the close method on IOHandle private Fd's cache does not handle changes in the state of its IOHandle parameter. If we allow: Fd f; f->close(); then Fd ends up caching a pointer to a closed Fd, and will become very badly confused if a new Fd appears with the same int identifier. Fix by removing the close method. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	06062cfd96	pool: use weak_ptr to run destructor earlier Drop the ListType alias because we only use it once. Rename ListRep to PoolRep to better reflect what it does. We don't need the Pool to be available to handle destroyed Pool::Handle objects. A weak_ptr in the Handle would detect the Pool has been destroyed, so we don't need to track that ourselves. As a bonus, we can destroy the PoolRep object as soon as the Pool has been destroyed, delayed only if there is a Handle object currently executing its destructor. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	fbd1091052	options: remove default 8 CPU thread limit Higher CPU core counts became more common, and kernel bugs became less common, since the arbitrary 8-thread limit was introduced. We can remove the limit now, and treat any remaining scaling inefficiency as a bug to be removed. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	032c740678	process: SIGCLD is not portable MUSL libc doesn't have it, for instance. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	5b72f35657	src: bees depends on libcrucible.a The dependency was missing, so changes to the library would not trigger a rebuild of the bees binary. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
SeerLite	3bf6db0354	install.md: Update Arch Linux instructions bees is now available in the community repository. Also changed AUR installation line to something more generic.	2021-06-11 13:21:41 -04:00
Zygo Blaxell	80c69f1ce4	context: get rid of shared_ptr<BeesContext> in every single cached Fd object Support for multiple BeesContext objects sharing a FdCache was wasting significant space and atomic inc/dec memory cycles for no good reason since the shared-FdCache feature was deprecated. open_root and open_root_ino still need a BeesContext to work. Pass the BeesContext pointer through the function object instead of the cache key arguments. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-28 21:54:00 -04:00
Zygo Blaxell	db65031c2b	context: get rid of all instances of pthread_cancel pthread_cancel doesn't really work properly. It was only being used in bees to bring threads to a stop if the BeesContext is destroyed early. It is frequently implicated in core dump reports because of the fragility of the C++ iostream / C stdio / library infrastructure, particularly surrounding upgrades on the host running bees. The pthread_cancel call itself often simply fails even when it doesn't call terminate(). Defer creation of the status and progress threads until after the BeesContext::start method is invoked. At that point, the existing ask-threads-nicely-to-stop code is up and running, and normal condvars can be used to bring bees to a stop, without having to resort to pthread_cancel. Since we're deleting half of the BeesContext constructor in this change, let's remove the other half too, and put an end to the deprecated support for multiple BeesContexts sharing a process. It's still possible to run multiple BeesContexts, but they will not share a FD cache. This will allow the FD cache's keys to become smaller and hopefully save some memory later on. Fixes: #171 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-28 21:42:03 -04:00
Zygo Blaxell	243480b515	ntoa: fix comment disparaging gcc for not implementing C99 compound literals in C++ C99's "{ 0 }" notation for filling in a struct with all zeros was not included in the C++11 standard, so gcc doesn't implement it and neither does clang. gcc does (did?) have issues with warnings on the same code in C99, complaining about uninitialized struct members when "{0}" explicitly initializes every member to a zero value. These issues don't apply in the C++ code where NTOA_TABLE_ENTRY_END is used. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:20:03 -04:00
Zygo Blaxell	f8a8704135	ntoa: fix bits_ntoa formatting and error handling Get rid of an assert in bits_ntoa. Throw an exception instead. Fix hex formatting (adding "0x" before a decimal number is not the correct way to format hex strings). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:20:03 -04:00
Zygo Blaxell	8a60850e32	docs: note that FIEMAP is also affected by backref performance issue Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:20:03 -04:00
Zygo Blaxell	9d21e6b456	docs: drop incomplete build recipe for ubuntu 14.04 The kernel from such an old distro version likely has several unfixed bugs. Better not to support it at all. Users who can upgrade the kernel are probably also sophisticated enough to fix the build issues too. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:20:03 -04:00
Zygo Blaxell	bcf3e7de3e	uuid: drop dependency on uuid.h The weird things distros do to the path where uuid.h gets installed have broken bees builds for the last time. We were only using uuid to support a legacy feature that was removed over four years ago. Hypothetical users who are upgrading directly from bees v0.1 should probably restart all the crawlers anyway--there were bugs. Also, if any such users exist, I respect their tremendous patience with the horrible performance all these years--bees got about 30x faster since v0.1. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:16:50 -04:00
Zygo Blaxell	6465a7c37c	docs: btrfs-kernel: update recommended kernels list, slow backrefs bug has been backported The slow backrefs performance improvement is confirmed by reports from multiple users: * Me (5.4.60 + backref patches, 5.7 to 5.11) * https://github.com/Zygo/bees/issues/161 (5.8) * https://github.com/Zygo/bees/issues/162 (5.8) * IRC user S0rin (5.4.88 + backref patches) The issue still exists, but at a significantly reduced scale: now about 2 ms of CPU per ref on a fast machine. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-04 14:01:55 -04:00
Zygo Blaxell	177f393ed6	docs: btrfs-kernel: add the 5.10 performance regression, the Ctrl-C on balance kernel crash has been fixed Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-02-23 17:37:51 -05:00
Zygo Blaxell	5f40f9edb0	docs: remove libbtrfs-dev as a build-time dependency We no longer require ctree.h from libbtrfs-dev. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-02-22 20:07:06 -05:00
Zygo Blaxell	7f660f50b8	lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks The Linux kernel's btrfs headers are better than the libbtrfs-dev headers: - the libbtrfs-dev headers have C++ language compatibility issues - upstream version in Linux kernel is more accurate and up to date - macros in libbtrfs-dev's ctree.h hide information that would enable bees to perform runtime buffer length checking - enum types whose presence cannot be detected with #ifdef When accessing members of metadata items from the filesystem, we want to verify that the member we are accessing is within the boundaries of the item that was retrieved; otherwise, a memory access violation may occur or garbage may be returned to the caller. A simple C++ template, given a pointer to a structure member and a buffer, can determine that the buffer contains enough bytes to safely access a struct member. This was implemented back in 2016, but left unused due to ctree.h issues. Some btrfs metadata structures have variable length despite using a fixed-size in-memory structure. The members that appear earliest in the structure contain information about which following members of the structure are used. The item stored in the filesystem is truncated after the last used member, and all following members must not be accessed. 'btrfs_stack_*' accessor macros obscure the memory boundaries of the members they access, which makes it impossible for a C++ template to verify the memory access. If the template checks the length of the entire structure, it will find an access violation for variable-length metadata items because the item is rarely large enough for the entire structure. Get rid of all the libbtrfs-dev accessor macros and reimplement them with the necessary buffer length checks. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-02-22 20:06:43 -05:00
Zygo Blaxell	6eb7afa65c	build: include localconf everywhere Overriding makeflags did not work from localconf in the src, lib, or test directories. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-02-22 20:06:43 -05:00


474 Commits
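
Commit 08899052ad describes selecting uncaught_exception() or uncaught_exceptions() by C++ standard version, because current_exception() only reports an exception once a catch block has captured it. The following is a minimal sketch of that idea, not the code from bees: the names exception_in_flight, ScopeTracer, and observe are hypothetical and exist only for this illustration.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical helper: true while an exception has been thrown but not yet caught.
inline bool exception_in_flight()
{
#if __cplusplus >= 201703L
    // C++17 and later: uncaught_exception() is deprecated (removed in C++20).
    return std::uncaught_exceptions() > 0;
#else
    return std::uncaught_exception();
#endif
}

// What a tracer-like object observed from inside its destructor.
struct Observation {
    bool in_flight = false;     // result of exception_in_flight()
    bool have_current = false;  // whether std::current_exception() was non-null
};

// Hypothetical stand-in for a tracer: it inspects exception state in its destructor.
struct ScopeTracer {
    Observation *m_out;
    explicit ScopeTracer(Observation *out) : m_out(out) {}
    ~ScopeTracer()
    {
        m_out->in_flight = exception_in_flight();
        m_out->have_current = static_cast<bool>(std::current_exception());
    }
};

// Run a scope that optionally throws, and report what the tracer saw.
inline Observation observe(bool do_throw)
{
    Observation obs;
    try {
        ScopeTracer tracer(&obs);
        if (do_throw) {
            throw std::runtime_error("example");
        }
    } catch (const std::exception &) {
        // The tracer's destructor has already run by the time we get here.
    }
    return obs;
}
```

When the scope throws, the destructor runs during stack unwinding, before any handler is entered, so exception_in_flight() is true while current_exception() is still empty. That is the distinction the commit message draws.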
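
Commit 7f660f50b8 describes a C++ template that, given a structure member and a buffer, verifies the buffer holds enough bytes to read that member, so that truncated variable-length items are not read past their end. A rough sketch of such a check follows. It is an assumption-laden illustration, not the bees implementation: ExampleItem and get_member_checked are hypothetical names, and the sketch uses a pointer-to-member where the real code may use something else.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-size in-memory layout. The stored item may be truncated
// after the last member that is actually in use.
struct ExampleItem {
    uint64_t generation;
    uint64_t flags;
    uint64_t optional_tail;
};

// Copy one member of S out of buf, but only if every byte of that member
// lies inside the buffer. The whole struct does not have to fit.
template <class S, class T>
T get_member_checked(const std::vector<uint8_t> &buf, T S::*member)
{
    // Find the member's byte offset using a throwaway object of type S.
    S probe{};
    const char *base = reinterpret_cast<const char *>(&probe);
    const char *field = reinterpret_cast<const char *>(&(probe.*member));
    const std::size_t offset = static_cast<std::size_t>(field - base);

    if (offset + sizeof(T) > buf.size()) {
        throw std::out_of_range("member extends past end of buffer");
    }

    // memcpy avoids alignment and aliasing problems with raw byte buffers.
    T value;
    std::memcpy(&value, buf.data() + offset, sizeof(T));
    return value;
}
```

Checking only the accessed member, rather than sizeof(S), is the point made in the commit message: a length check against the entire structure would reject most variable-length items, because the stored item is rarely as large as the full structure.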
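
Commit f8a8704135 notes that adding "0x" before a decimal number is not the correct way to format hex strings. The difference can be shown with two small helpers. Both function names are hypothetical and written only for this example; they are not taken from bits_ntoa.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// The mistake described in the commit: a hex prefix on decimal digits.
inline std::string hex_prefix_on_decimal(uint64_t n)
{
    return "0x" + std::to_string(n);
}

// A corrected version: switch the stream to hexadecimal before writing the number.
inline std::string to_hex(uint64_t n)
{
    std::ostringstream oss;
    oss << "0x" << std::hex << n;
    return oss.str();
}
```

For the value 255, the first helper yields "0x255", which a reader would parse as 597, while the second yields "0xff".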