GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 20:00:22 +01:00

Author	SHA1	Message	Date
Zygo Blaxell	ab5316a3da	src: use correct flags for compiling .c files, fix missing dependencies fiewalk and fiemap depend on a lot of crucible, and incremental builds fail hard without proper dependency tracking. All binaries must be rebuilt when makeflags changes. This dependency exists already in lib and test, but src was missing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:52 -05:00
Zygo Blaxell	420c218c83	cache: remove unused #includes Also fix bees-roots's missing headers. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:52 -05:00
Zygo Blaxell	5248985da0	context: fix shutdown log messages identifying the wrong thread We are waiting for the status thread, not the progress thread. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	d1f1c386bc	tempfile: remove size limit in realign() Now that tempfiles are using pool checkin functions to control their size, we don't need a size limit in realign(). We keep the limit in make_copy because it's a sanity check against letting a multi-terabyte copy operation slip through. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	6705cd9c26	context: move TempFile from TLS to Pool and fix some FdCache issues Get rid of the thread-local TempFiles and use Pool instead. This eliminates a potential FD leak when the loadavg governor repeatedly creates and destroys threads. With the old per-thread TempFiles, we were guaranteed to have exclusive ownership of the TempFile object within the current thread. Pool is somewhat stricter: it only guarantees ownership while the checked-out Handle exists. Adjust the users of TempFile objects to ensure they hold the Handle object until they are finished using the TempFile. It appears that maintaining large, heavily-reflinked, long-lived temporary files costs more than truncating after every use: btrfs has to write multiple references to the temporary file's extents, then some commits later, remove references as the temporary file is deleted or truncated. Using the temporary file in a dedupe operation flushes the data to disk, so nothing is saved by pretending that there is writeback pipelining and trying to avoid flushes in truncate. Pool provides usage tracking and a checkin callback, so use it to truncate the temporary file immediately after every use. Redesign TempFile so that every instance creates exactly one Fd which persists over the lifetime of the TempFile object. Provide a reset() method which resets the file back to the initial state and call it from the Pool checkin callback. This makes TempFile's lifetime equivalent to its Fd's lifetime, which simplifies interactions with FdCache and Roots. This change means we can now blacklist temporary files without having an effective memory leak, so do that. We also have a reason to ever remove something from the blacklist, so add a method for that too. In order to move to extent-centric addressing, we need to be able to reliably open temporary files by root and inode number. Previously we would place TempFile fd's into the cache with insert_root_ino, but the cache would be cleared periodically, and it would not be possible to reopen temporary files after that happened. Now that the TempFile's lifetime is the same as the TempFile Fd's lifetime, we can have TempFile manage a separate FileId -> Fd map in Roots which is unaffected by the periodic cache clearing. BeesRoots::open_root_ino_nocache will check this map before attempting to open the file via btrfs root+ino lookup, and return it through the cache as if Roots had opened the file via btrfs. Hold a reference to BeesRoots in BeesTempFile because the usual way to get such a reference now throws an exception in BeesTempFile's destructor. These changes make method BeesTempFile::create() and all methods named insert_root_ino unnecessary, so delete them. We construct and destroy TempFiles much less often now, so make their constructor and destructor more informative. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	b49d458792	context: move prealloc dedupe to a separate Task A prealloc extent reference can be deduped immediately and asynchronously. There is no need to slow down extent scanning to do it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	1e7dbc6f97	tempfile: remove old comments about fsync and deadlock bugs I was never able to prove a connection between fsync() and deadlock bugs. There were too many deadlock bugs to be able to isolate a bug that is triggered specifically by fsync. Update the comment (which has been unchanged since kernel 4.14). We still may want to do fsync() on temporary files someday, but there's a full internal API rewrite between here and there. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	de6282c6cd	roots: separate crawl sizes into bytes and items Number of items should be low enough that we don't have too many stale items, but high enough to amortize system call overhead to a reasonable ratio. Number of bytes should be constant: one worst-case metadata page (the btrfs limit is 64K, though 16K is much more common) so that we always have enough space for one worst-case item; otherwise, we get EOVERFLOW if we set the number of items too low and there's a big item in the tree, and we can't make further progress. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	d332616eff	roots: report the search parameters on tree search ioctl error There are lots of ways the search can fail, but it's hard to pick one without knowing the parameters. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	e654e29f45	bees: move usage message out of source file and fix a few inaccuracies It's a pain to read, edit, and format large blocks of text in C++ code, so rip the usage message out of bees.cc and put it in a plain text file. Use a minimal translator to convert it into a C string. While we're here, remove the multiple roots feature from the command line synopsis, as we don't really support it any more. Also clarify that "id 5" is "subvol id 5", and describe in one sentence what workaround-btrfs-send does. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	bbaf55b2b0	roots: make it build with clang Remove an unnecessary cast that was breaking namespace lookup for clang. Closes: #159 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	7ec19d1eff	clang: fix struct/class declaration/definition mismatches clang does not like a defined class to be declared as a struct. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	f263c8751e	bees context: make it build with clang Remove unused function getenv_or_die. All of our environment variable parameters are optional or have default values. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	17d8759011	bees: make it build with clang Remove unused "addr check" functions. We have ranged_cast for detecting overflow bits. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	8e9b53b3fd	stats: remove nonsense dedup_unique_bytes stat A long time ago, when bees used dedicated threads to scan each subvol, the calculation of the "dedup_unique_bytes" statistic was still wrong. This stat can only be calculated when dedupe runs on extent data items instead of extent reference items. Remove the stat variable until that happens. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-12-17 17:54:51 -05:00
Zygo Blaxell	15ab981d9e	bees: replace uncaught_exception(), deprecated in C++17 uncaught_exception() had only the one valid use case, and it can be reimplemented by literally calling current_exception() instead. current_exception() has several valid use cases, so it is not likely to be deprecated any time soon. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-10-09 12:07:10 -04:00
Zygo Blaxell	05bd65444d	bees: initialize context in the correct order We cannot use BeesContext::roots() until after BeesContext::set_root_path() has been called. Save up the parameter settings until then. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2020-08-31 22:35:17 -04:00
Zygo Blaxell	c4f0e4abee	context: workaround to prevent LOGICAL_INO and btrfs balance from running concurrently This avoids some kernel bugs. One of them is fixed in 5.3.4 and later: efad8a853a "Btrfs: fix use-after-free when using the tree modification log" There are apparently others in current kernels, so for now just put bees on pause until the balance is done. At some point we may want to provide an option to disable this workaround; however, running bees and balance at the same time makes neither particularly fast, so maybe we'll just leave it this way. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-11-28 00:13:15 -05:00
Kai Krakow	44f446e17e	bees-context: Remove confusing log message Saying just "This feature" at some log levels could be puzzling. Let's remove this message, the feature works without problems for a year. Signed-off-by: Kai Krakow <kai@kaishome.de>	2019-11-06 09:04:52 +01:00
Zygo Blaxell	4363463342	process: Fix gettid() ambiguity with glibc >= 2.30 In version 2.30 glibc added it's own gettid() function. This resulted in "error: call of overloaded ‘gettid()’ is ambiguous" because gettid() now exists in both namespace crucible and std. For now, use explicit references to namespace crucible. This continues to work with new and old libc without having to test specific library versions. At some point, glibc gettid() will be deployed widely enough that we can remove the crucible version entirely. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-10-30 00:12:33 -04:00
Zygo Blaxell	7117cb40c5	hash: prepare for user-selectable hash functions Localize the hash function in bees to a single spot to make it easier to change later (or at runtime). Remove some code that was using a property of CRC as an optimization. The optimization doesn't work for other hash functions, and running the CRC function takes more CPU time than the optimization saved. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-06-12 22:48:06 -04:00
Zygo Blaxell	2c3d1822f7	bees: don't try to print si_lower and si_upper Some build environments (ARM? AARCH64?) do not have the fields si_lower and si_upper in siginfo. bees doesn't need them, so don't try to access them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-06-12 22:48:05 -04:00
Zygo Blaxell	978c577412	status: report number of active worker threads in status output This is especially useful when dynamic load management allocates more worker threads than active tasks, so the extra threads are effectively invisible. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-01-07 22:52:12 -05:00
Zygo Blaxell	be2c55119e	bees: make exceptions less prominent in log output Introduce a mechanism to suppress exceptions which do not produce a full stack trace for common known cases where a loop should be aborted. Use this mechanism to suppress the infamous "FIXME" exception. Reduce the log level to at most NOTICE, and in some cases DEBUG. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-01-06 01:48:35 -05:00
Zygo Blaxell	7933ccb660	build: make libcrucible a static library libcrucible at one time in the distant past had to be a shared library to force global C++ object initialization; however, this is no longer required. Make libcrucible static to solve various rpath and soname versioning issues, especially when distros try (unwisely) to package the library separately. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-12-09 23:39:44 -05:00
Zygo Blaxell	f17cf084e6	hash: clean up comments, audit for bugs We stopped supporting shared hash tables a long time ago. Remove comments describing the behavior of shared hash tables. Add an event counter for pushing a hash to the front when it is already at the front. Audited the code for a bug related to bucket handling that impairs space efficiency when the bucket size is greater than 1. Didn't find one. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-12-09 23:39:44 -05:00
Zygo Blaxell	570b3f7de0	bees: handle SIGTERM and SIGINT, force immediate flush and exit Capture SIGINT and SIGTERM and shut down, preserving current completed crawl and hash table state. * Executing tasks are completed, queued tasks are paused. * Crawl state is saved. * The crawl master and crawl writeback threads are terminated. * The task queue is flushed. * Dirty hash table extents are flushed. * Hash prefetch and writeback threads are terminated. * Hash table is deallocated. * FD caches and tmpfiles are destroyed. * Assuming the above didn't crash or deadlock, bees exits. The above order isn't the fastest, but it does roughly follow the shared_ptr dependencies and avoids data races--especially those that might lead to bees reporting an extent scanned when it was only queued for future scanning that did not occur. In case of a violation of expected shared_ptr dependency order, exceptions in BeesContext child object accessor methods (i.e. roots(), hash_table(), etc) prevent any further progress in threads that somehow remain unexpectedly active. Move some threads from main into BeesContext so they can be stopped via BeesContext. The main thread now runs a loop waiting for signals. A slow FD leak was discovered in TempFile handling. This has not been fixed yet, but an implementation detail of the C++ runtime library makes the leak so slow it may never be important enough to fix. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-12-09 23:39:44 -05:00
Zygo Blaxell	389dd52cc1	tempfile: drop the fsync() The deadlock seems to be fixed now (if there ever was one--there certainly were deadlocks, but matching deadlocks to root causes is non-trivial and a number of distinct deadlock cases have been fixed in recent years). The benchmark data is inconclusive about whether it is better to fsync or not to fsync. A paranoia option might be useful here. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-12-09 01:00:36 -05:00
Zygo Blaxell	f4464c6896	roots: quick fix for task scheduling bug leading to loss of crawl_master The crawl_master task had a simple atomic variable that was supposed to prevent duplicate crawl_master tasks from ending up in the queue; however, this had a race condition that could lead to m_task_running being set with no crawl_master task running to clear it. This would in turn prevent crawl_thread from scheduling any further crawl_master tasks, and bees would eventually stop doing any more work. A proper fix is to modify the Task class and its friends such that Task::run() guarantees that 1) at most one instance of a Task is ever scheduled or running at any time, and 2) if a Task is scheduled while an instance of the Task is running, the scheduling is deferred until after the current instance completes. This is part of a fairly large planned change set, but it's not ready to push now. So instead, unconditionally push a new crawl_master Task into the queue on every poll, then silently and quickly exit if the queue is too full or the supply of new extents is empty. Drop the scheduling-related members of BeesRoots as they will not be needed when the proper fix lands. Fixes: `4f0bc78a` "crawl: don't block a Task waiting for new transids" Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-25 23:46:55 -05:00
Zygo Blaxell	bf2a014607	roots: improve "RO root 6094" message This sequence of log messages isn't clear: crawl_master: WORKAROUND: Avoiding RO subvol 6094 crawl_master: WORKAROUND: RO root 6094 The first is from a cache miss, and appears wherever a root is opened (dedupe or crawl). The second is skipping an entire subvol scan, and only happens in crawl_master. Elaborate on the second message a little. Also use the term "root" consistently when referring to subvol tree IDs. btrfs refers to these objects by (at least) three distinct names: tree, subvol, and root. Using three different words for the same thing is worse than using a single wrong word consistently to refer to the same concept. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-22 21:10:15 -05:00
Zygo Blaxell	cdca2bcdcd	main: single BeesContext instance per process After weeks of testing I copied part of a change to main without copying the rest of the change, leading to an immediate segfault on startup. So here is the rest of the change: limit the number of BeesContexts per process to 1. This change was discussed at https://github.com/Zygo/bees/issues/54#issuecomment-360332529 but there are more reasons to do it now: the candidates to replace the current hash table format are less forgiving of sharing hash tables, and it may even become necessary to have more than one hash table per BeesContext instance (e.g. to keep datasum and nodatasum data separate). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-22 20:40:30 -05:00
Zygo Blaxell	34b04f4255	bees: soft-limit computed thread counts to 8 https://github.com/Zygo/bees/issues/91 describes problems encountered when running bees on systems with many CPU cores. Limit the computed number of threads (using --thread-factor or the default) to a maximum of 8 (i.e. the number of logical cores in a modern laptop). Users can override the limit by using --thread-count. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-21 21:49:16 -05:00
Zygo Blaxell	23f3e4ec42	workarounds: add workaround for btrfs send Introduce --workaround options which trade performance or effectiveness to avoid triggering kernel bugs. The first such option is --workaround-btrfs-send, which avoids making any modification to read-only subvols to avoid btrfs send bugs. Clean up usage message: no tabs for formatting, split options into sections by theme. Make scan mode a non-static data member like all (most?) other options. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-21 21:49:16 -05:00
Zygo Blaxell	e74122b512	resolver: don't log hash collision incidents The log message is quite CPU-intensive to generate, and some data sets have enough hash collisions to throw off benchmarks. Keep the event counter but drop the log message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-16 17:20:49 -05:00
Zygo Blaxell	e3247d3471	stats: streamline add_count Perf was blaming BeesStats::add_count for >1% of instructions. Trim the instruction count a little. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-08 23:31:50 -05:00
Kai Krakow	c69a954d8f	Makefile: Bring back -O3 in a downstream-compatible way This commit brings back -O3 but in an overridable way. This should make downstream distributions happy enough to accept it. While at the subject, let's apply the same fixup logic to LDFLAGS, too. This commit also properly gets rid of the implicit rules which collided too easily with the depends.mk. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-11-08 03:23:40 +01:00
Kai Krakow	f2dec480a6	Makefile: mkdir .depends only when needed Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-11-08 02:56:48 +01:00
Zygo Blaxell	c2762740ef	context: remove limit on the number of references to an extent Better toxic extent detection means we can now handle extents with many more references--easily hundreds of thousands. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-11-05 21:12:11 -05:00
Zygo Blaxell	aa74a238b3	hash: remove preloaded toxic hash blacklist Faster and more reliable toxic extent detection means we can now be much less paranoid about creating toxic extents. The paranoia has significant impact on dedupe hit rates because every extent that contains even one toxic hash is abandoned. The preloaded toxic hashes were chosen because they occur more frequently than any other block contents in typical filesystem data. The combination of these resulted in as much as 30% of duplicate extents being left untouched. Remove the preloaded toxic extent blacklist, and rely on the new kernel-CPU-usage-based workaround instead. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-31 23:03:01 -04:00
Zygo Blaxell	542371684c	context: better detection for toxic extents We detect toxic extents by measuring how long the LOGICAL_INO ioctl takes to run. If it is above some threshold, we consider the extent toxic, and blacklist it; otherwise, we process the extent normally. The detector was using the execution time of the ioctl, which detects toxic extents, but it also detects pauses of the bees process and transaction commit latency due to load. This leads to a significant number of false positives. The detection threshold was also very long, burning a lot of kernel CPU before the detection was triggered. Use the per-thread system CPU statistics to measure the kernel CPU usage of the LOGICAL_INO call directly. This is much more reliable because it is not confounded by other threads, and it's faster because we can set the time threshold two orders of magnitude lower. Also remove the lock and mutex added in "context: serialize LOGICAL_INO calls" because we theoretically no longer need it (but leave the code there with #if 0 in case we do need it in practice). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-31 21:12:16 -04:00
Zygo Blaxell	9a97699dd9	roots: reimplement transid_max_nocache using extent tree root ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE. Every modification (that we care about) to a btrfs must go through EXTENT_TREE, and must modify the page in ROOT_TREE pointing to the root of EXTENT_TREE... which makes that a very good source for the filesystem transid. Remove the loop and the root lookups, and just look at one item for max_transid. Also note that every caller of transid_max_nocache() immediately feeds the return value to m_transid_re.update(), so don't do that inside transid_max_nocache(). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-31 00:09:49 -04:00
Zygo Blaxell	0e8b591232	Revert "roots: simplify BeesRoots::transid_max_nocache" It turns out that we do need to scan all the subvols in order to find transid_max. Keep the bug fix though. This reverts commit `bf6ae80eee`. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 23:29:05 -04:00
Zygo Blaxell	bf6ae80eee	roots: simplify BeesRoots::transid_max_nocache BeesRoots::transid_max_nocache calls btrfs_get_root_transid() which retrieves the transid of the root of the given Fd. Since the FS_TREE (subvol 5) is the root of the subvol hierarchy, it will always have the highest transid on the filesystem, and we do not need to look at any others. Also fix a bug where we pass BTRFS_FS_TREE_OBJECTID instead of the file descriptor root_fd() to btrfs_get_root_transid(). If BEESHOME is somewhere on the same btrfs filesystem, and there are no leaked FDs at bees startup, then BTRFS_FS_TREE_OBJECTID (5) usually has the same integer value as a valid file descriptor of some object on the filesystem that has a regularly increasing transid value. If Fd 5 happens to be a file in BEESHOME then bees itself drives the transid increments. This, combined with the search of all subvol roots, hides the bug (unless Fd 5 gets closed somehow). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:17 -04:00
Zygo Blaxell	1a51bb53bf	context: cache result of home_fd() BeesContext::home_fd() is supposed to open $BEESHOME once and cache the Fd for later calls; however, instead it was reopening a new Fd each time it was called, and _also_ holding that Fd in a BeesContext member. Fds clean themselves up when they are forgotten, so it was not leaking per se, but it certainly had more open Fds than it needed to. Check to see if we have m_home_fd open, and return that if so. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:16 -04:00
Zygo Blaxell	35b21687bc	bees: drop unused member m_uuid There is a m_root_uuid which is used. m_uuid is not, so drop it and save a tiny amount of memory. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:16 -04:00
Zygo Blaxell	63ddbb9a4f	context: serialize LOGICAL_INO calls LOGICAL_INO can trip over the btrfs slow-backrefs bug, resulting in some very long in-kernel runtimes. If too many threads are executing LOGICAL_INO then there may be no cores left on the system to run other tasks. Toxic extent detection is done by a very rudimentary algorithm which can be confused by unrelated sources of latency within btrfs (especially commit latency). The algorithm can also be confused by other threads executing the LOGICAL_INO ioctl. These are two good reasons to prevent any two threads in a single bees process instance from executing LOGICAL_INO at the same time, so let's do that. It is possible to limit the number of threads executing LOGICAL_INO with the -c and -C options; however, this also limits the number of threads which can perform any operation, while only LOGICAL_INO () has such a profound effect on the rest of system operation. Also make the status message clearer about exactly when LOGICAL_INO is executed, as opposed to merely waiting to acquire a lock before executing the ioctl. () or maybe FILE_EXTENT_SAME. The problem function that keeps showing up in kernel stack traces is find_parent_nodes, which is called by both the LOGICAL_INO and FILE_EXTENT_SAME ioctls. We'll try this change first and see if it prevents any recurrences of forced watchdog reboots; if it does not, then we'll limit FILE_EXTENT_SAME the same way. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:16 -04:00
Zygo Blaxell	373b9ef038	roots: fix subvol scan rollover on subvols with empty transid range The ordering function for BeesCrawlState did not consider root 292 inode 0 min_transid 2345 max_transid 3456 to be larger than root 292 inode 258 min_transid 2345 max_transid 2345 so when we attempted to update the end pointer for the crawl progress, the new state was not considered newer than the old state because the min_transid was equal, but the new crawl state's inode number was smaller. Normally this is not a problem because subvol scans typically begin and end in separate transactions (in part because we don't start a subvol scan until at least two transactions are available); however, the cleanup code for the aftermath of the recent transid_min() bug can create crawlers with equal max_transid and min_transid records. Fix this by ordering both transid fields before any others in the crawl state. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:14 -04:00
Zygo Blaxell	866a35c7fb	roots: do not accept 18446744073709551615 as max_transid in beescrawl.dat Due to an earlier bug some beescrawl.dat files will contain uint64_t max as max_transid. This prevents any further scanning on the subvol because there is no possibiity of having a real transid (or any other uint64_t number) larger than uint64_t max. If we detect a bad transid in beescrawl.dat, log a warning, then use some more plausible value: either min_transid to repeat the previous incremental crawl, or 0 to restart the subvol scan from the beginning. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:14 -04:00
Zygo Blaxell	90132182fd	roots: do not allow transid_min to be numeric_limits<uint64_t>::max() On a few test machines max_transid on subvols is getting set to 18446744073709551615 (aka uint64_t max). Prevent transid_min() from ever returning this value. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-30 21:12:14 -04:00
Zygo Blaxell	90f98250c2	hash: remove pointless copy "saved" is used only during hash table correctness analysis, which is normally not enabled at compile time, and requires source modification to enable. Remove the pointless copy and save a tiny bit of CPU. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-10-19 20:21:04 -04:00

1 2 3 4 5 ...

275 Commits