GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 20:00:22 +01:00

Author	SHA1	Message	Date
Khalil Santana	dffd6e0b13	Get rid of errors by using grep -E "egrep: warning: egrep is obsolescent; using grep -E"	2022-10-05 23:00:37 -04:00
Zygo Blaxell	a32cd5247f	docs: update kernel bugs list for 5.18 ptvf fix Also correct my own style for the fixed version column. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-08-17 13:04:06 -04:00
Zygo Blaxell	9c68f15474	README: update copyright year 2022 It has been some years since the copyright statement was updated. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-07-29 22:20:02 -04:00
Zygo Blaxell	5f3cb9b374	docs: update kernel bugs list for 2022-07-29 * RAID1 device count problems fixed * log tree replay parent transid verify failure in 5.18 and 5.19 added, patches available but not upstream yet * flushoncommit issues fixed, discussion section removed * LOGICAL_INO vs dedupe hang added Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2022-07-29 22:07:26 -04:00
Ayla Ounce	a52062822a	Fix beesd script arg parsing to respect PREFIX Without this, if you install to a different PREFIX such as /usr/local it will fail to recognize any arguments and if you use the systemd unit, that makes --no-timestamps the first NOT_SUPPORTED_ARG which will get passed to uuidparse, which doesn't recognize it and errors.	2022-04-10 14:12:24 -07:00
Zygo Blaxell	fbf6b395c8	types: member m_fd in BeesFileRange must be protected against data races We had an unfortunate pattern of: const BeesFileRange bfr; shared_ptr<BeesContext> ctx; // ... BEESNOTE("foo " << bfr); bfr.fd(ctx); BEESNOTE("foo after opening: " << bfr); If dump_status started running after the first BEESNOTE, but before the second, then bfr.fd() might expose a single Fd object's shared_ptr member to two threads at the same time (the thread running dump_status and the thread running BEESNOTE) without protection by a lock. One of the threads would see a partially-initialized Fd object, and the other thread would crash on an assertion failure, e.g. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f4c4fde5537 in __GI_abort () at abort.c:79 #2 0x00007f4c4fde540f in __assert_fail_base (fmt=0x7f4c4ff4e128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5557605629dd "!m_destroyed", file=0x5557605627c0 "../include/crucible/namedptr.h", line=77, function=<optimized out>) at assert.c:92 #3 0x00007f4c4fdf4662 in __GI___assert_fail (assertion=assertion@entry=0x5557605629dd "!m_destroyed", file=file@entry=0x5557605627c0 "../include/crucible/namedptr.h", line=line@entry=77, function=function@entry=0x555760562970 "crucible::NamedPtr<Return, Arguments>::Value::~Value() [with Return = crucible::IOHandle; Arguments = {int}]") at assert.c:101 #4 0x00005557605306f6 in crucible::NamedPtr<crucible::IOHandle, int>::Value::~Value (this=0x7f4a3c2ff0d0, __in_chrg=<optimized out>) at ../include/crucible/namedptr.h:77 #5 0x00005557605137da in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f4a3c2ff0c0) at /usr/include/c++/10/bits/shared_ptr_base.h:151 #6 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f4a3c2ff0c0) at /usr/include/c++/10/bits/shared_ptr_base.h:151 #7 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f4c4c5b5f28, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr_base.h:733 #8 std::__shared_ptr<crucible::IOHandle, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr_base.h:1183 #9 std::shared_ptr<crucible::IOHandle>::~shared_ptr (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr.h:121 #10 crucible::Fd::~Fd (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at ../include/crucible/fd.h:46 #11 BeesFileRange::file_size (this=0x7f4c4e5ba4a0) at bees-types.cc:156 #12 0x0000555760513950 in operator<< (os=..., bfr=...) at bees-types.cc:80 #13 0x000055576050d662 in std::function<void (std::ostream&)>::operator()(std::ostream&) const (__args#0=..., this=0x7f4c4e5b9f60) at /usr/include/c++/10/bits/std_function.h:622 #14 BeesNote::get_status[abi:cxx11]() () at bees-trace.cc:165 #15 0x00005557604c9676 in BeesContext::dump_status (this=0x5557611c4de0) at bees-context.cc:89 #16 0x00005557605206fb in std::function<void ()>::operator()() const (this=this@entry=0x7f4c4c5b65f0) at /usr/include/c++/10/bits/std_function.h:622 #17 crucible::catch_all(std::function<void ()> const&, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)> const&) (f=..., explainer=...) at error.cc:55 #18 0x000055576050aaa7 in operator() (__closure=0x5557611c52c8) at bees-thread.cc:22 #19 0x00007f4c501beed0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #20 0x00007f4c502c8ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477 #21 0x00007f4c4febddef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 Fix by making BeesFileRange::m_fd really const (not just mutable), then fix all the broken code referencing it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	26acc6adfd	bytevector: introduce BEES_VALGRIND to help work around valgrind valgrind doesn't understand ioctl arguments, so it does not know if or when they initialize memory, and it complains about conditionals depending on data that comes out of ioctls. That's a problem for bees, where every decision we ever make is based on data an ioctl gave us. Fix the initialization issue by using calloc instead of malloc for ByteVectors when we are building for valgrind. Don't enable this by default because all the callocs aren't necessary (assuming the rest of the code is correct) and hurt performance. Define BEES_VALGRIND in localconf to activate, e.g. echo CCFLAGS += -DBEES_VALGRIND=1 >> localconf Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	01734e6d4b	hash: initialize m_dirty in BeesHashTable It turns out we never set m_dirty's initial value. This is not a practical problem because 1) it's mostly harmless if m_dirty is spuriously true, 2) we set it to true every time bees scans a data block, and 3) the allocation happens early in startup when most memory allocations are using zero-filled pages, so it's probably getting a false value at construction in most cases. valgrind complains about it, so it has to go. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	84094c7cb9	context: use consistent status for dedupe in log and thread note Once the physical addresses are known, put them where they can be seen in BEESTATUS as well as the log. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	a3d2bc26d5	progress: lock down some const methods begin() and end() don't mutate their object Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	d0c35b4734	fs: yet another const References to the search key do not need to be modified. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	a83c68eb18	bees: style cleanups: const, size_t, symbolic names No functional changes. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	6d6686eb5b	context: get rid of resolve (LOGICAL_INO) serializer There are kernel bugs in LOGICAL_INO from time to time; however, we can't avoid these bugs by serializing LOGICAL_INO calls. It hasn't been used for some time, so remove the code and less-than-completely-accurate comments. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
Zygo Blaxell	007067b83f	docs: add missing 'adjust_offset_hit' counter Reported by York-Simon Johannsen via github issue 208. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:10:02 -05:00
suorcd	bb5160987e	docs: spell "snapshot" correctly https://github.com/Zygo/bees/pull/209 Edited: regenerate docs for the downstream change in index.md. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-12-19 15:08:26 -05:00
Zygo Blaxell	feed04c944	gitignore: clang creates a lot of *.tmp files Also sort the list of extensions. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	670fce5be5	resolve: reword the too-many-duplicates exception message For one thing, it should _say_ that there are too many duplicates. We were making the user read the manual to find that out. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	ff3b5a7a1b	hash: drop bees_unreadahead Forcing the entire hash table into immediate writeback causes crippling write latencies at shutdown. Even discarding pages as they are read in at startup can trigger a writeback latency spike if the pages are dirty at read time. Better to let the VM subsystem handle this on its own. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	13ec4b5165	hash: add utsname fields to log output Putting this information in the logs saves us from having to ask for the kernel version and machine name every time. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	5d7e815eb4	lib: add Uname, a constructor for utsname Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	7f67f55746	docs: remove some stray whitespace Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	0103a04ca0	task: concurrency cleanups Update thread_local task state pointers while locked. This avoids potential concurrent access of the pointers while making copies of them. Verify that the queue is really empty after splicing lists, and the current consumer is really gone after swapping the empty one. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	5e346beb2d	task: delete the move constructor for TaskState Move-constructing isn't good for that class either. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	85c93c10e6	bees: clean up #include list No need for atomic, and sort the Linux headers. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	ba694b4881	hash: move the random generator out of bees-hash.cc We need random numbers in more places, so centralize the engines. Initialize with a proper random seed so every worker thread gets different behavior. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	73f94750ec	namedptr: concurrency and const cleanup Fix the locking order for the case where an exception is thrown in shared_ptr's allocator. More const. Drop the explicit closure return type since the compiler can deduce it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	5e379b4c48	readahead: update comments to reflect bakeoff results It turns out that readahead() alone is fastest. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	6325f9ed72	lib: deprecate memset_zero template, use C99 compound literals instead Sprinkle in some asserts to make sure compilers aren't getting creative. This may introduce a new compiler dependency, as I suspect older versions of GCC don't support this syntax. It definitely needs a new compiler flag to suppress a warning when some fields are not explicitly initialized. If we've omitted a field, it's because it's a field we don't know (or care) about, and we want that thing initialized to zero. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	c698fd7211	context: stop using deprecated memset_zero template Use ordinary literal initialization instead. The ioctl doesn't need initialization of args at all. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	95347a08bb	fd: better error messages for pread/pwrite Include file name and offset. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	eb2630dee6	docs: document resolve_overflow In commit `d9e3c0070b` "context: stop creating new refs when there are too many already" we added a new counter, but didn't document it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	b828f14dd1	task: optimize for common case of single following Task If there is only one Task in the post exec queue, we can simply insert that Task instead of creating a task to hold a post exec queue of one item. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	ecf110f377	context: add a comment explaining why we are not adding bees_unreadahead At the end of scanning one extent, in theory we do not need that extent any more. In practice, it hurts benchmark scores if we drop the extents after reading them. Add a comment to note this where we put the bees_unreadhead call. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	7f7f919d08	context: fix the status message that will never be seen BEESNOTE can only be seen if the status thread is running at the time, making the log of activities during shutdown incomplete. Wake up the status thread early during shutdown so the logged sequence of shutdown actions is complete. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	11fabd66a8	context: add experimental code for avoiding tiny extents In the current architecture we can't directly measure the physical extent size, and we can't make good decisions with the extent data (reference) item alone. If the early return is enabled here, there is a small speedup and a large drop in dedupe hit rate, especially when extent splits occur. Leave the early return commented for now, but collect the event statistics. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Zygo Blaxell	a60c53a9e1	fs: dump the TREE_SEARCH_V2 parameters on exception The current error message is useless. At least say which tree we were searching. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-11-29 21:27:48 -05:00
Javi Vilarroig	01cb75ac0e	Minimal changes in beesd script to make it functional in my system	2021-11-29 20:53:04 +01:00
Zygo Blaxell	7a8d98f94d	roots: use the new type argument to next_min Tree searches are all looking for specific item types. Skip over any item types we are not interested in when resetting the search key for the next search. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:59:09 -04:00
Zygo Blaxell	fcd847bbf9	fs: add an item type parameter to next_min When we are searching the btrfs metadata trees, we usually want only one type of item. If the last item in a search result is not of the desired type, we can restart the search at the next possible key with that item type, potentially skipping over some uninteresting items we would otherwise have to fetch, process, and discard. Also remove a bug in the previous next_min code that would skip over items if the offset overflowed and the next objectid in the tree had a lower item type number than the previous objectid. This doesn't seem to be a bug that has ever happened, as it would require a file to roll over in the offset field. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:56:04 -04:00
Zygo Blaxell	e861957632	roots: use default nr_items BtrfsIoctlSearchKeyV2's constructor now fills in nr_items = 1, so we don't need to set it explicitly any more. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:56:04 -04:00
Zygo Blaxell	fb0e676ee8	string: drop vector_copy_struct, obsoleted by ByteVector vector_copy_struct constructed a std::vector<uint8_t> from a fixed-size struct. ByteVector replaces std::vector<uint8_t> and has a template constructor which does the same thing as vector_copy_struct, so there is no longer a need for this function. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:56:04 -04:00
Zygo Blaxell	b2db140666	spanner: drop Spanner, replaced by ByteVector Spanner was a workaround for terrible std::vector _copy_ performance, but it turns out that std::vector has terrible _allocator_ performance (compared to an implementation based on malloc and memcpy). Spanner is a workaround for the copy performance issue, so it doesn't help very much. Refraining from using vector at all is much better. Now that all code that used Spanner has been converted to ByteVector, there's no further need for Spanner<uint8_t>, which was the only type it was ever used for. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:50:25 -04:00
Zygo Blaxell	55dc98e21a	fd: finish deprecating vector<uint8_t> in IO wrapper functions We can simply remove the template specializations, but if we do that, then existing code might accidentally write out the vector<uint8_t> struct. Prevent regressions by deleting the vector specializations, making any code that uses them fail to build. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	14cd6ed033	bees: deprecate vector<uint8_t> and replace with ByteVector The vector<uint8_t> in the hash table doesn't hurt very much--only a few microseconds per 128K hash block. The vector<uint8_t> in BeesBlockData hurts a bit more--we run that constructor thousands of times per second. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	99709d889f	fd: start deprecating vector<uint8_t> for p{read,write}_or_die Add support for pread and pwrite of ByteVector objects alongside vector<uint8_t>. A later commit will delete the template specializations for vector<uint8_t>, but existing users have to be updated to use ByteVector first. Nothing currently uses vector<char>, so we can delete that immediately. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	bba6f4f183	fs: convert vector<uint8_t> and Spanner to ByteVector and rewrite TREE_SEARCH_V2 wrapper Switch various methods in fs to use ByteVector to cut down on the number of slow allocations and copies. Automatically determine the correct size for TREE_SEARCH_V2 buffers based on the number of items requested, and grow the buffer as needed. This eliminates the need to cache some objects that were heavy to create. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	daf8a2cde1	extentwalker: use default sizing of TREE_SEARCH_V2 buffers Now that we can guess the size more or less automatically, there's no need to make it unnecessarily large. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	ba1f3b93e4	fs: drop virtual do_ioctl methods for btrfs_ioctl_search_key These were never used, and they make the object very slightly heavier. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	f0eb9b202f	lib: introduce ByteVector as a replacement for vector<uint8_t> and Spanner After some benchmarking, it turns out that std::vector<uint8_t> is about 160 times slower than malloc(). malloc() is faster than "new uint8_t[]" too. Get rid of std:;vector<uint8_t> and replace it with a lightweight wrapper around malloc(), free(), and memcpy(). ByteVector has helpful methods for the common case of moving data to and from ioctl calls that use a fixed-length header placed contiguously with a variable-length input/output buffer. Data bytes are shared between copied ByteVector objects, allowing a large single buffer to be cheaply chopped up into smaller objects without memory copies. ByteVector implements the more useful parts of the std::vector API, so it can replace std::vector objects without needing an awkward adaptor class like Spanner. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00
Zygo Blaxell	2e36dd2d58	error: introduce THROW_CHECK4, the long-awaited sequel to THROW_CHECK3 Sometimes we need to check constraints on 4 variables at once. It would be nice if variadic macros in C++ were also polymorphic. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-10-31 19:42:01 -04:00

1 2 3 4 5 ...

543 Commits