GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 20:00:22 +01:00

Author	SHA1	Message	Date
Zygo Blaxell	152e69a6d1	bytevector: validate length in get<T>() Don't allow a pointer to T to be taken from a ByteVector that is not at least sizeof(T) bytes long. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:58 -05:00
Zygo Blaxell	a59d89ea81	bytevector: add some fugly mutexes We are using ByteVectors from multiple threads in some cases. Mostly these are the status and progress threads which read the ByteVector object references embedded in BEESNOTE macros. Since it's not clear what the data race implications are, protect the shared_ptr in ByteVector with a mutex for now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:58 -05:00
Zygo Blaxell	d1015b683f	bytevector: add ostream output with hexdump There is a hexdump template in fs. Move hexdump to its own header, then ByteVector can use it too. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:58 -05:00
Zygo Blaxell	a85ada3a49	task: export load tracking statistics Provide an interface so that programs can monitor the Task load average calculations. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:57 -05:00
Zygo Blaxell	7873988dac	task: add a pause() method as an alternative to cancel() pause(true) stops the TaskMaster from processing any more Tasks, but does not destroy any queued Tasks. pause(false) re-enables Task processing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:57 -05:00
Zygo Blaxell	2f25f89067	task: get rid of separate Exclusion and ExclusionState Exclusion was generating a new Task every time a lock was contended. That results in thousands of empty Task objects which contain a single Task item. Get rid of ExclusionState. Exclusion is now a simple weak_ptr to a Task. If the weak_ptr is expired, the Exclusion is unlocked. If the weak_ptr is not expired, it points to the Task which owns the Exclusion. try_lock now appends the Task attempting to lock the Exclusion directly to the owning Task, eliminating the need for Exclusion to have one. This also removes the need to call insert_task separately, though insert_task remains for other use cases. With no ExclusionState there is no need for a string argument to Exclusion's constructor, so get rid of that too. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:56 -05:00
Zygo Blaxell	7fdb87143c	task: get rid of the separate Barrier and BarrierLock Make one class Barrier which is copiable, so we don't have to have users making shared Barrier all the time. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:55 -05:00
Zygo Blaxell	4a4a2de89f	multilocker: serialize conflicting parallel operations For performance or workaround reasons we sometimes have to avoid doing two conflicting operations at the same time, but we can still run any number of non-conflicting operations in parallel. MultiLocker (suggestions for a better class name welcome) blocks the calling thread until there are no threads attempting to run a conflicting operation. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:54 -05:00
Zygo Blaxell	942800ad00	fd: add some doxygen Still very incomplete, but better than it was before. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:54 -05:00
Zygo Blaxell	21c08008e6	namedptr: add some doxygen, fix the #endif comment Document the overall purpose of the class and what some of the methods do, particularly the ones with terrible names like 'insert_item' (which only inserts an item after calling the Function). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:54 -05:00
Zygo Blaxell	30ece57116	fs: export btrfs_compress_type_ntoa We already had a function that was _similar_, so add decoding for compress type NONE, give it a less specific name, and declare it in fs.h. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:54 -05:00
Zygo Blaxell	6556566f54	ntoa: fix type of mask It really needs to be uint64_t, but at least it now doesn't contradict the definition in the earlier header. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:53 -05:00
Zygo Blaxell	ece58cc910	cache: add a method to get estimated cache size Estimated because there is no lock preventing the result from changing before it is used. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-12-20 20:50:53 -05:00
Zygo Blaxell	5953ea6d3c	fs: update btrfs compatibility header: add csum types, BTRFS_FS_INFO_FLAG_GENERATION and _METADATA_UUID I guess this means it's "args_v3" now? Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-10-25 12:56:16 -04:00
Zygo Blaxell	972721016b	fs: get rid of base class fiemap Yet another build failure of the form: error: flexible array member fiemap... not at end of struct crucible::Fiemap... bees doesn't use fiemap any more, so the fixes here are minimal changes to make it build, not shining examples of C++ class design. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-10-25 12:56:16 -04:00
Zygo Blaxell	5040303f50	fs: get rid of base class btrfs_data_container This fixes another build failure of the form: error: flexible array member btrfs_... not at end of struct crucible::Btrfs... Fixes: https://github.com/Zygo/bees/issues/236 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-10-23 22:42:57 -04:00
Zygo Blaxell	be3c54e14c	extentwalker: drop explicit default constructors They're all public because it's a struct, so there's no need to make them explicit. clang-14 deprecates these. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-10-23 22:39:59 -04:00
Zygo Blaxell	14ce81c081	fs: get rid of silly base class that causes build failures now The base class thing was an ugly way to get around the lack of C99 compound literals in C++, and also to make the bare ioctls usable with the derived classes. Today, both clang and gcc have C99 compound literals, so there's no need to do crazy things with memset. We never used the derived classes for ioctls, and for this specific ioctl it would have been a very, very bad idea, so there's no need to support that either. We do need to jump through hoops for ostream& operator<<() but we had to do those anyway as there are other members in the derived type. So we can simply drop the base class, and build the args object on the stack in `do_ioctl`. This also removes the need to verify initialization. There's no bug here since the `info` member of the base class was never used in place by the derived class, but new compilers reject the flexible array member in the base class because the derived class makes `info` be not at the end of the struct any more: error: flexible array member btrfs_ioctl_same_args::info not at end of struct crucible::BtrfsExtentSame Fixes: https://github.com/Zygo/bees/issues/232 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-10-09 20:39:15 -04:00
Zygo Blaxell	a3d2bc26d5	progress: lock down some const methods begin() and end() don't mutate their object Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	5d7e815eb4	lib: add Uname, a constructor for utsname Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	73f94750ec	namedptr: concurrency and const cleanup Fix the locking order for the case where an exception is thrown in shared_ptr's allocator. More const. Drop the explicit closure return type since the compiler can deduce it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	6325f9ed72	lib: deprecate memset_zero template, use C99 compound literals instead Sprinkle in some asserts to make sure compilers aren't getting creative. This may introduce a new compiler dependency, as I suspect older versions of GCC don't support this syntax. It definitely needs a new compiler flag to suppress a warning when some fields are not explicitly initialized. If we've omitted a field, it's because it's a field we don't know (or care) about, and we want that thing initialized to zero. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	fcd847bbf9	fs: add an item type parameter to next_min When we are searching the btrfs metadata trees, we usually want only one type of item. If the last item in a search result is not of the desired type, we can restart the search at the next possible key with that item type, potentially skipping over some uninteresting items we would otherwise have to fetch, process, and discard. Also remove a bug in the previous next_min code that would skip over items if the offset overflowed and the next objectid in the tree had a lower item type number than the previous objectid. This doesn't seem to be a bug that has ever happened, as it would require a file to roll over in the offset field. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:56:04 -04:00
Zygo Blaxell	fb0e676ee8	string: drop vector_copy_struct, obsoleted by ByteVector vector_copy_struct constructed a std::vector<uint8_t> from a fixed-size struct. ByteVector replaces std::vector<uint8_t> and has a template constructor which does the same thing as vector_copy_struct, so there is no longer a need for this function. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:56:04 -04:00
Zygo Blaxell	b2db140666	spanner: drop Spanner, replaced by ByteVector Spanner was a workaround for terrible std::vector _copy_ performance, but it turns out that std::vector has terrible _allocator_ performance (compared to an implementation based on malloc and memcpy). Spanner is a workaround for the copy performance issue, so it doesn't help very much. Refraining from using vector at all is much better. Now that all code that used Spanner has been converted to ByteVector, there's no further need for Spanner<uint8_t>, which was the only type it was ever used for. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:50:25 -04:00
Zygo Blaxell	55dc98e21a	fd: finish deprecating vector<uint8_t> in IO wrapper functions We can simply remove the template specializations, but if we do that, then existing code might accidentally write out the vector<uint8_t> struct. Prevent regressions by deleting the vector specializations, making any code that uses them fail to build. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	99709d889f	fd: start deprecating vector<uint8_t> for p{read,write}_or_die Add support for pread and pwrite of ByteVector objects alongside vector<uint8_t>. A later commit will delete the template specializations for vector<uint8_t>, but existing users have to be updated to use ByteVector first. Nothing currently uses vector<char>, so we can delete that immediately. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	bba6f4f183	fs: convert vector<uint8_t> and Spanner to ByteVector and rewrite TREE_SEARCH_V2 wrapper Switch various methods in fs to use ByteVector to cut down on the number of slow allocations and copies. Automatically determine the correct size for TREE_SEARCH_V2 buffers based on the number of items requested, and grow the buffer as needed. This eliminates the need to cache some objects that were heavy to create. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	ba1f3b93e4	fs: drop virtual do_ioctl methods for btrfs_ioctl_search_key These were never used, and they make the object very slightly heavier. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	f0eb9b202f	lib: introduce ByteVector as a replacement for vector<uint8_t> and Spanner After some benchmarking, it turns out that std::vector<uint8_t> is about 160 times slower than malloc(). malloc() is faster than "new uint8_t[]" too. Get rid of std::vector<uint8_t> and replace it with a lightweight wrapper around malloc(), free(), and memcpy(). ByteVector has helpful methods for the common case of moving data to and from ioctl calls that use a fixed-length header placed contiguously with a variable-length input/output buffer. Data bytes are shared between copied ByteVector objects, allowing a large single buffer to be cheaply chopped up into smaller objects without memory copies. ByteVector implements the more useful parts of the std::vector API, so it can replace std::vector objects without needing an awkward adaptor class like Spanner. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	2e36dd2d58	error: introduce THROW_CHECK4, the long-awaited sequel to THROW_CHECK3 Sometimes we need to check constraints on 4 variables at once. It would be nice if variadic macros in C++ were also polymorphic. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	cf4091b352	endian: fix uint16_t specialization of le_to_cpu Fortunately, we have not had cause to read any 16-bit fields out of btrfs structures yet. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	12e80658a8	fs: fix FIEMAP_MAX_OFFSET type silliness in fiemap.h In fiemap.h the members of struct fiemap are declared as __u64, but the FIEMAP_MAX_OFFSET macro is an unsigned long long value: $ grep FIEMAP_MAX_OFFSET -r /usr/include/ /usr/include/linux/fiemap.h:#define FIEMAP_MAX_OFFSET (~0ULL) $ grep fe_length -r /usr/include/ /usr/include/linux/fiemap.h: __u64 fe_length; /* length in bytes for this extent */ This results in a type mismatch error on architectures like ppc64le: fiemap.cc:31:35: note: deduced conflicting types for parameter 'const _Tp' ('long unsigned int' and 'long long unsigned int') 31 \| fm.fm_length = min(fm.fm_length, FIEMAP_MAX_OFFSET - fm.fm_start); \| ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Work around this by copying the macro into a uint64_t constant, and not using the macro any more. Fixes: https://github.com/Zygo/bees/issues/194 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-06 15:17:02 -04:00
Zygo Blaxell	6adaedeecd	extentwalker: fix the binary search and add some debug infrastructure Add some conditionally-compiled debug code, including an in-memory log of what ExtentWalker does. Dump that log on exceptions. If we loop too many times in a debug build, kill the process so we can stack trace. In non-debug builds just throw a normal exception. Grow the step size instead of shrinking it, to reduce the number of binary search iterations. Prevent a bug where the step size bottoms out before positioning the target extent in the middle of the result vector. Use the first extent for "first_extent", instead of the 3rd. Get rid of some redundant checks. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	0928362aab	task: replace waiting state with run/exec counter Task::run() would schedule a new execution of Task, unless it was waiting on a queue for execution. This cannot be implemented with a bool, since a Task might be included in multiple queues, and should still be in waiting state even when executed in that case. Replace the bool with a counter. run() and append() (but not append_nolock) increment the counter, exec() decrements the counter. If the counter is non-zero when run() or append() is called, the Task is not scheduled. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	d5ff35eacf	task: track number of Task objects in program and provide report This is a simple lightweight counter that tracks the number of Task objects that exist. Useful for leak detection. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	b7f9ce3f08	task: serialize Task execution when Tasks block due to mutex contention Quite often we want to execute task B after task A finishes executing, especially if tasks A and B attempt to acquire locks on the same objects. Implement that capability in Task directly: each Task holds a queue of Tasks which will be executed strictly after this Task has finished executing, or if the Task is destroyed. Add a local queue to each TaskConsumer. This queue contains a list of Tasks which are to be executed by a single thread in sequential order. These tasks are executed before fetching any tasks from TaskMaster. Each time a Task finishes executing, the list of tasks appended to the recently executed Task are spliced at the beginning of the thread's TaskConsumer local queue. These tasks will be executed in the same thread in the same order they were appended to the recently executed Task. If a Task is destroyed with a post-execution queue, that queue is also inserted at the front of the current TaskConsumer's local queue. If a Task is destroyed or somehow executed outside of a TaskConsumer thread, or a TaskConsumer thread is destroyed, the local queue of Tasks is wrapped in a "rescue_task" Task, and spliced before the head of the global queue. This preserves the sequential ordering of tasks. In all cases the order of sequential execution of Tasks that are appended to another Task is preserved. The unused queue insertion functions are removed. Exclusion is now simply a mutex, a bool, and a Task with an empty function. Tasks that queue up waiting for the mutex are stored in Exclusion's Task, and Exclusion simply runs that task when the ExclusionState is released. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	0bbaddd54c	docs: finally concede that the consensus spelling is "dedupe" Change documentation and comments to use the word "dedupe," not "dedup" as found in circa-3.15 kernel sources. No changes in code or program output--if they used "dedup" before, they will continue to be spelled "dedup" now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	06a46e2736	chatter: add option to remove log level prefix Some projects use only one log level, so there is no need to repeat it for every line. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	e4c95d618a	crucible: use '#include "crucible/...' everywhere Make the #include syntax more consistent (even if it has no effect). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	7cffad5fc3	fd: make the close method on IOHandle private Fd's cache does not handle changes in the state of its IOHandle parameter. If we allow: Fd f; f->close(); then Fd ends up caching a pointer to a closed Fd, and will become very badly confused if a new Fd appears with the same int identifier. Fix by removing the close method. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	06062cfd96	pool: use weak_ptr to run destructor earlier Drop the ListType alias because we only use it once. Rename ListRep to PoolRep to better reflect what it does. We don't need the Pool to be available to handle destroyed Pool::Handle objects. A weak_ptr in the Handle would detect the Pool has been destroyed, so we don't need to track that ourselves. As a bonus, we can destroy the PoolRep object as soon as the Pool has been destroyed, delayed only if there is a Handle object currently executing its destructor. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	243480b515	ntoa: fix comment disparaging gcc for not implementing C99 compound literals in C++ C99's "{ 0 }" notation for filling in a struct with all zeros was not included in the C++11 standard, so gcc doesn't implement it and neither does clang. gcc does (did?) have issues with warnings on the same code in C99, complaining about uninitialized struct members when "{0}" explicitly initializes every member to a zero value. These issues don't apply in the C++ code where NTOA_TABLE_ENTRY_END is used. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:20:03 -04:00
Zygo Blaxell	bcf3e7de3e	uuid: drop dependency on uuid.h The weird things distros do to the path where uuid.h gets installed have broken bees builds for the last time. We were only using uuid to support a legacy feature that was removed over four years ago. Hypothetical users who are upgrading directly from bees v0.1 should probably restart all the crawlers anyway--there were bugs. Also, if any such users exist, I respect their tremendous patience with the horrible performance all these years--bees got about 30x faster since v0.1. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-04-23 08:16:50 -04:00
Zygo Blaxell	7f660f50b8	lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks The Linux kernel's btrfs headers are better than the libbtrfs-dev headers: - the libbtrfs-dev headers have C++ language compatibility issues - upstream version in Linux kernel is more accurate and up to date - macros in libbtrfs-dev's ctree.h hide information that would enable bees to perform runtime buffer length checking - enum types whose presence cannot be detected with #ifdef When accessing members of metadata items from the filesystem, we want to verify that the member we are accessing is within the boundaries of the item that was retrieved; otherwise, a memory access violation may occur or garbage may be returned to the caller. A simple C++ template, given a pointer to a structure member and a buffer, can determine that the buffer contains enough bytes to safely access a struct member. This was implemented back in 2016, but left unused due to ctree.h issues. Some btrfs metadata structures have variable length despite using a fixed-size in-memory structure. The members that appear earliest in the structure contain information about which following members of the structure are used. The item stored in the filesystem is truncated after the last used member, and all following members must not be accessed. 'btrfs_stack_*' accessor macros obscure the memory boundaries of the members they access, which makes it impossible for a C++ template to verify the memory access. If the template checks the length of the entire structure, it will find an access violation for variable-length metadata items because the item is rarely large enough for the entire structure. Get rid of all the libbtrfs-dev accessor macros and reimplement them with the necessary buffer length checks. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-02-22 20:06:43 -05:00
Zygo Blaxell	c0149d72b7	fs: use Spanner to refer to ioctl arg buffer instead of making vector copies This avoids some allocations and copying. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 18:07:36 -05:00
Zygo Blaxell	333aea9822	lib: introduce Spanner, a pointer and size delimiting a range Spanner<Iterator> turns a pair of pointers into a sequence container with several of vector's methods. A partial specialization of make_spanner is provided which uses shared_ptr as the beginning of the range. Some of the Spanner code is a questionable hack in support of this. C++20 has ranges and span, but neither is worth moving the minimum C++ standard forward. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 18:07:36 -05:00
Zygo Blaxell	9ca69bb7ff	fs: remove buffer overrun check in get_struct_ptr for non-copying containers When we are using non-copying containers, we can't call resize() on them. get_struct_ptr is essentially a pointer cast, so we will end up with a pointer to a struct that extends beyond the boundaries of the container. As long as the btrfs metadata is not corrupted, we should not have too many problems. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 18:07:36 -05:00
Zygo Blaxell	f45e379802	fs: deprecate vector<char> Use uint8_t when we mean uint8_t, i.e. vector<uint8_t> instead of vector<char>. Add a template parameter instead of vector so we can swap in a non-copying data type. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 18:07:36 -05:00
Zygo Blaxell	180bb60cde	fs: add support and workarounds for btrfs fs_info v2 Define a local copy of the header that has fields for the csum type and length, so we can build in places that haven't caught up to kernel 5.5 headers yet. The reason why the csum type and length are not unconditionally filled in eludes me. csum_length is necessarily non-zero, and the cost of the conditional is worse than the cost of the copy, so the whole flags dance is a WTF...but it's part of the kernel API now, so it's too late to NAK it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 18:07:36 -05:00
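
Commits 152e69a6d1 and 7f660f50b8 above both describe refusing to read a struct out of a buffer that is too short to hold it. The sketch below illustrates that idea only. It is not the crucible `ByteVector::get<T>()` code: `checked_get`, its `offset` parameter, and the use of `std::vector<uint8_t>` as the buffer are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of crucible): copy a T out of a byte buffer,
// throwing instead of reading when fewer than sizeof(T) bytes remain at offset.
template <class T>
T checked_get(const std::vector<uint8_t> &buf, size_t offset = 0)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be safe to copy with memcpy");
    // Written as two comparisons so that offset + sizeof(T) cannot overflow.
    if (offset > buf.size() || buf.size() - offset < sizeof(T)) {
        throw std::out_of_range("buffer too short for requested type");
    }
    T rv;
    // memcpy avoids alignment and aliasing problems of a plain pointer cast.
    std::memcpy(&rv, buf.data() + offset, sizeof(T));
    return rv;
}
```

Returning a copy rather than a pointer is a simplification chosen for this sketch. The commit message talks about taking a pointer to T, which needs the same length check before the cast.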

125 Commits
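
Commit 12e80658a8 in the list above works around a `std::min` deduction failure by copying `FIEMAP_MAX_OFFSET` into a `uint64_t` constant. A minimal sketch of that workaround follows, assuming a local stand-in for the macro so that it builds without `<linux/fiemap.h>`. The names `s_fiemap_max_offset` and `clamp_length` are hypothetical and are not taken from the bees sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Stand-in for FIEMAP_MAX_OFFSET, which the kernel header defines as (~0ULL).
// Storing the value in a uint64_t gives it the same type as the uint64_t
// arguments below, whichever of unsigned long or unsigned long long the
// platform uses for 64-bit integers.
static const uint64_t s_fiemap_max_offset = ~0ULL;

// Hypothetical helper: limit length so that start + length cannot exceed
// the maximum offset.
uint64_t clamp_length(uint64_t start, uint64_t length)
{
    // Both arguments are uint64_t, so std::min deduces a single type.
    return std::min(length, s_fiemap_max_offset - start);
}
```

Calling `std::min<uint64_t>(...)` with an explicit template argument would also sidestep the deduction conflict. The constant matches the approach the commit message describes.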