GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2025-11-03 11:40:34 +01:00

Author	SHA1	Message	Date
Zygo Blaxell	8bc4bee8a3	crucible: progress: drop the set() method set() was broken and redundant. Calling hold() and discarding the returned object has the correct effect. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-09-14 23:49:54 -04:00
Zygo Blaxell	c3effe0a20	crawl: use custom order instead of (ab)using BeesFileRange::operator< This makes the code clearer and keeps changes to BeesFileRange ordering isolated. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:16:08 -04:00
Zygo Blaxell	5bdad7fc93	crucible: progress: a progress tracker for worker queues The task queue can become very large with many subvols, requiring hours for the queue to clear. 'beescrawl.dat' saves in the meantime will save the work currently scheduled, not the work currently completed. Fix by tracking progress with ProgressTracker. ProgressTracker::begin() gives the last completed crawl position. ProgressTracker::end() gives the last scheduled crawl position. begin() does not advance if any item between begin() and end() is not yet completed. In between are crawled extents that are on the task queue but not yet processed. The file 'beescrawl.dat' saves the begin() position while the extent scanning task queue is fed from the end() position. Also remove an unused method crawl_state_get() and repurpose the operator<(BeesCrawlState) that nobody was using. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:49:39 -05:00
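The begin()/end() behaviour described in commit 5bdad7fc93 can be illustrated with a minimal sketch. This is not the crucible ProgressTracker API: the class name `SimpleProgress`, the `hold()`/`release()` signatures, and the use of strictly increasing integer positions are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical, single-threaded illustration of the begin()/end() idea.
// Positions are scheduled in strictly increasing order.
class SimpleProgress {
public:
    explicit SimpleProgress(uint64_t start) : m_begin(start), m_end(start) {}

    // Schedule a new position: end() moves forward immediately.
    void hold(uint64_t pos) {
        assert(pos > m_end);
        m_pending.insert(pos);
        m_end = pos;
    }

    // Mark a scheduled position as completed. begin() only advances past
    // completed positions that have no older position still pending.
    void release(uint64_t pos) {
        const std::size_t erased = m_pending.erase(pos);
        assert(erased == 1);
        (void)erased;
        m_done.insert(pos);
        while (!m_done.empty() &&
               (m_pending.empty() || *m_done.begin() < *m_pending.begin())) {
            m_begin = *m_done.begin();
            m_done.erase(m_done.begin());
        }
    }

    uint64_t begin() const { return m_begin; } // last completed position
    uint64_t end() const { return m_end; }     // last scheduled position

private:
    uint64_t m_begin;
    uint64_t m_end;
    std::set<uint64_t> m_pending; // scheduled, not yet completed
    std::set<uint64_t> m_done;    // completed, but blocked by older pending work
};
```

In this sketch, saving `begin()` corresponds to what the message says is written to 'beescrawl.dat', while new work would be fed from `end()`.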
Zygo Blaxell	8f0e88433e	roots: get rid of common error messages, add more error counters One very common case is losing a race to open a file that was deleted. No need to spam the logs with mere ENOENT reports. Other errors are more significant. Log those with errno, and add event counters to record them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-07 23:12:01 -05:00
Zygo Blaxell	6aad124241	crawl: somebody should set max_transid The previous commit had both max_transid assignments commented out. It happens to work because we set max_transid in the constructor and it doesn't change after that, but it's cleaner to assign it explicitly. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:52:12 -05:00
Zygo Blaxell	087ec26c44	crawl: filter extents correctly When an extent ref is modified, all of the refs in the same metadata page get the same transid in the TREE_SEARCH_V2 header. This causes two problems: - Extents with generation < min_transid are included if they happen to be referenced by pages with generation >= min_transid. - Extent refs with generation > max_transid are excluded even if they reference extents with generation <= max_transid. Both of these are wrong: the first causes some extents to be repeatedly scanned, the second causes some extents to not be scanned at all. Change the TREE_SEARCH_V2 parameters so that Crawl sees all extents newer than min_transid (i.e. set max_transid to max). The TREE_SEARCH_V2 kernel logic already operates this way, i.e. it fetches every page with transid >= min_transid and discards newer items if they are too new for max_transid. Filter strictly by the extent reference generation field (i.e. the copy of the extent generation that is in the extent reference). Note this still scans extent data multiple times, but it should now be exactly once per extent reference. A proper fix for this requires extent-based scanning instead of extent-ref-based scanning. Formerly commit `5a8c655fc4` "roots: filter out obsolete extents from extent refs" which landed in the subvol-threads branch but not master. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:48:39 -05:00
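The filtering rule in commit 087ec26c44 (search for everything at or above min_transid, then filter on the generation stored in each extent reference) can be sketched as below. The struct `ExtentRefSketch`, the function `filter_by_generation`, and the inclusive bounds are assumptions for illustration; they are not taken from the bees source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an extent reference returned by a tree search.
struct ExtentRefSketch {
    uint64_t bytenr;     // logical address of the referenced extent
    uint64_t generation; // copy of the extent generation stored in the ref
};

// Keep only refs whose own generation field lies in [min_transid, max_transid].
// The transid of the metadata page that contains the ref is deliberately ignored.
std::vector<ExtentRefSketch> filter_by_generation(
    const std::vector<ExtentRefSketch> &refs,
    uint64_t min_transid,
    uint64_t max_transid)
{
    assert(min_transid <= max_transid);
    std::vector<ExtentRefSketch> kept;
    std::copy_if(refs.begin(), refs.end(), std::back_inserter(kept),
                 [=](const ExtentRefSketch &r) {
                     return r.generation >= min_transid &&
                            r.generation <= max_transid;
                 });
    return kept;
}
```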
Zygo Blaxell	af250f7732	roots: determine transid_max without open()ing every subvol root Scan the roots tree directly for roots other than 5 (the FS root), and use btrfs_get_root_transid on root_fd for root 5. This avoids filling up the root FD cache every time we want a new transid_max. Now the only reason we open a subvol root FD is to open a file within the subvol. transid_max may be the same as the FS root's transid, in which case the search loop is not necessary. Place a counter (transid_max_miss) to see if we ever need to look at root items. If this counter never goes above zero, or does so very rarely, we can delete the search loop. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	4f0bc78a4c	crawl: don't block a Task waiting for new transids Task should not block for extended periods of time. Remove the RateEstimator::wait_for() in crawl_roots. When crawl_roots runs out of data, let the last crawl_task end without rescheduling. Schedule crawl_task again on transid polls if it was not already running. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	636328fdc2	roots: add scan-mode 2 "oldest crawler first" Add a third scan mode with alternative trade-offs. Benefits: Good sequential read performance. Avoids race conditions described in https://github.com/Zygo/bees/issues/27. Avoids diverting scan resources into short-lived snapshots before their long-lived origin subvols are fully scanned. Drawbacks: Takes the longest time of the three implemented scan-modes to free space in extents that are shared between snapshots. Uses the maximum amount of temporary space. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:05 -05:00
Zygo Blaxell	ef44947145	roots: move common code for creating crawl Tasks into a method Duplicated code between the different scan modes has slowly been becoming less and less trivial. Move the code to a method and make both scan-modes call it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-28 22:52:17 -05:00
Zygo Blaxell	762f833ab0	roots: poll every 10 transids Restarting scans for each transid is a bit aggressive. Scan every 10 transids for a polling rate close to the former BEES_COMMIT_INTERVAL. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	48e78bbe82	roots: use RateEstimator as a transid_max cache and clean up logs transid_max is now measured at a single point in the crawl_transid thread. Move the Crawl deferred logic into BeesRoots so it restarts all crawls when transid_max increases. Gets rid of some messy time arithmetic. Change name of Crawl thread to "crawl_master" in both thread name and log messages. Replace "Next transid" with "Crawl started". Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	ded26ff044	FdCache: clear cache on every new transid / crawl cycle The periodic cache age check was not protected by a lock, so multiple threads may decide to concurrently clear the cache. This led to duplicate log messages. Fix by moving the cache expiry trigger out of FdCache and into Roots, which knows when transids change and can perform cache clears at exactly the time they are most relevant, i.e. after something that was deleted becomes permanently so. This removes the last references to BEES_COMMIT_INTERVAL, so get rid of its definition too. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	72857e84c0	crawl: combine two messages per crawl cycle into one Now that the polling interval is up to 30 times faster, next_transid seems too verbose again. Make it clearer that the interval quoted in the "Deferring..." message is the computed transaction polling interval. Combine "Next transid" and "Restarted crawl" into a single message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	0fdae37962	roots: use RateEstimator to track transids Make the crawl polling interval more closely track the commit interval on the btrfs filesystem. In the future this will provide opportunities to do things like clear FD caches and stop crawls on deleted subvols, but triggered by transaction commits instead of arbitrary time intervals. Rename the "crawl" thread so it no longer has the same name as the "crawl" task, and repurpose it for dedicated transid polling. Cancel the deletion of crawl_thread and repurpose it to trigger new crawls and wake up the main crawl Task when it runs out of data. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	a3f02d5dec	roots: comment updates and general cleanup Fix discussion of nodatasum files, clarifying what we can and cannot do. Get rid of some BEESNOTE and BEESTRACE calls which cannot be observed (well, BEESNOTE can, but you have to be quick!). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	f6909dac17	bees: drop BEESINFO Having too many "write a message to the log" primitives is confusing, and having one that intermittently and silently discards output is even _more_ confusing. Replace all BEESINFO with appropriate BEESLOG*s. Usually DEBUG. Except for one or two that occur too often. Just delete those. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	f64fc78e36	Task: convert print_fn to a string Since we are now unconditionally rendering the print_fn as a static string, there is no need for it to be a function. We also need it to be brief and mostly constant. Use a string instead. Put the string before the function in the Task constructor arguments so that the title string appears as a heading in code, since we are making a breaking API change already. Drop TASK_MACRO as it is broken by this change, but there is no similar usage of Task anywhere to make it worth fixing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	4c05c53d28	roots: update Task print functions for new usage This restores the old "crawl" prefix in the case of Crawler log messages. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 14:00:52 -05:00
Zygo Blaxell	e970ac6c02	crawl: make logging less verbose Silence the three(!) log messages per crawl increment and an extra one at the end of the subvol. The three critical messages per subvol crawl cycle are: Next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <A>..<B> started <T> (<AGO>s ago) Subvol has been completely scanned and a new transaction range will be created. CrawlState is the state of the old subvol. Restarted crawl BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago) Subvol has been restarted. CrawlState is the state of the new subvol. Deferring next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago) Subvol has been completely scanned, but it is too soon to start a new scan. Fix the "Restart..." message to use the correct verb tense and to use the correct BeesCrawlState data. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 13:50:47 -05:00
Kai Krakow	677da5de45	Logging: Add log levels to output This commit adds log levels to the output. In systemd, it makes colored lines, otherwise it's probably just a number. Bees is very chatty, so this paves the road for log level filtering. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-18 23:41:29 +01:00
Zygo Blaxell	56c23c4517	crawl: implement two crawler algorithms and adjust scheduling parameters There are two subvol scan algorithms implemented so far. The two modes are unimaginatively named 0 and 1. 0: sorts extents by (inode, subvol, offset), 1: scans extents round-robin from all subvols. Algorithm 0 scans references to the same extent at close to the same time, which is good for performance; however, whenever a snapshot is created, the scan of the entire filesystem restarts at the beginning of the new snapshot. Algorithm 1 makes continuous forward progress even when new snapshots are created, but it does not benefit from caching and will force the kernel to reread data multiple times when there are snapshots. The algorithm can be selected at run-time using the -m or --scan-mode option. We can collect some field data on these before replacing them with an extent-tree-based scanner. Alternatively, for pre-4.14 kernels, we can keep these two modes as non-default options. Currently these algorithms have terrible names. TODO: fix that, but also TODO: delete all that code and do scans directly from the extent tree instead. Augment the scan algorithms relative to their earlier implementation by batching multiple extents to scan from each subvol before switching to a different subvol. Sprinkle some BEESNOTEs on the Task objects so that they don't disappear from the thread status output. Adjust some timing constants to deal with the increased latency from competing threads. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:53:49 -05:00
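The ordering that commit 56c23c4517 describes for scan mode 0, sorting by (inode, subvol, offset), might look roughly like this. The struct `RefKeySketch` and both function names are hypothetical; the actual bees data structures are not shown in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical key for one extent reference in a subvol.
struct RefKeySketch {
    uint64_t subvol;
    uint64_t inode;
    uint64_t offset;
};

// Compare by (inode, subvol, offset), so that the same inode in different
// snapshots sorts next to each other.
bool mode0_less(const RefKeySketch &a, const RefKeySketch &b)
{
    return std::tie(a.inode, a.subvol, a.offset) <
           std::tie(b.inode, b.subvol, b.offset);
}

void sort_mode0(std::vector<RefKeySketch> &refs)
{
    std::sort(refs.begin(), refs.end(), mode0_less);
    assert(std::is_sorted(refs.begin(), refs.end(), mode0_less));
}
```

Putting inode first is what groups references to the same file across snapshots, which matches the caching benefit the message attributes to algorithm 0.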
Zygo Blaxell	055c8d4c75	roots: scan in parallel using Tasks Distribute incoming extents across a thread pool for faster execution on multi-core, multi-disk environments. Switch extent enumeration model to scan extent refs consecutively(ish). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:52:00 -05:00
Zygo Blaxell	796aaed7f8	roots: remove dead code and #if blocks In both instances the code contained within (or the conditional compilation surrounding it) is no longer controversial. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:52:00 -05:00
Zygo Blaxell	42a6053229	roots: remove open_root_cache correctly BEESNOTE puts a message on the status message stack. BEESINFO logs a message with rate limiting. The message that was flooding the logs was coming from BEESINFO not BEESNOTE. Fix earlier commit which removed the wrong message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:30:07 -05:00
Zygo Blaxell	5afbcb99e3	roots: drop open_root_nocache log entry After a few hundred subvol threads start running, the inode cache starts to thrash, and the log gets spammed with messages of the form: "open_root_nocache <subvolid>: <path>" Ideally there would be some way to schedule work to minimize inode thrashing. Until that gets done, just silence the messages for now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2017-09-16 21:16:40 -04:00
Zygo Blaxell	5275249396	roots: trace transid_max calculation transid_max calculations can take considerable time. Report their progress in more detail. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2017-09-16 17:30:45 -04:00
Zygo Blaxell	339579096f	roots: move flags check after file identity checks and make error message style consistent If we lose a race and open the wrong file, we will not retry with the next path if the file we opened had incompatible flags. We need to keep trying paths until we open the correct file or run out of paths. Fix by moving the inode flag check after the checks for file identity. Output attributes in hex to be consistent with other attribute error messages. There is no need to report root and file paths separately in the error message for incompatible flags because we have confirmed the identity of the file before the incompatible flag error is detected. Other messages in this loop still output root path and file_path separately because the identity of 'rv' is unknown at the time these messages are emitted. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2017-09-16 14:49:09 -04:00
Kai Krakow	a5e2bdff47	Skip nocow files to speed up processing If you have a lot of or a few big nocow files (like vm images) which contain a lot of potential deduplication candidates, bees becomes incredibly slow running through a lot of "invalid operation" exceptions. Let's just skip over such files to get more bang for the buck. I did no regression testing as this patch seems trivial (and I cannot imagine any pitfalls either). The process progresses much faster for me now.	2017-09-12 02:09:22 +02:00
Timofey Titovets	5350b0f113	Bees: fix [-Werror=implicit-fallthrough=] GCC 7+ added the implicit-fallthrough warning. In some places fallthrough is expected, so disable the warning there. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>	2017-06-13 18:05:38 +03:00
Zygo Blaxell	db8ea92133	bees: fix further instances of copy-after-unlock bug Before: unique_lock<mutex> lock(some_mutex); // run lock.~unique_lock() because return // return reference to unprotected heap return foo[bar]; After: unique_lock<mutex> lock(some_mutex); // make copy of object on heap protected by mutex lock auto tmp_copy = foo[bar]; // run lock.~unique_lock() because return // pass locally allocated object to copy constructor return tmp_copy; Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2017-01-22 22:00:27 -05:00
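The before/after pattern in commit db8ea92133 can be sketched as follows. The class `Registry` and its members are hypothetical. In this sketch the hazardous variant returns a reference, so the caller reads the element after the lock has been released; the safe variant copies the element while the lock is still held and returns the copy.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical container guarded by a mutex.
class Registry {
public:
    void put(int key, const std::string &value) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_map[key] = value;
    }

    // Hazard: the returned reference points into the map, but the lock is
    // released when this function returns, so later reads are unprotected.
    const std::string &get_unsafe(int key) {
        std::unique_lock<std::mutex> lock(m_mutex);
        return m_map[key];
    }

    // Safe: copy the element while the lock is held, then return the copy.
    std::string get_copy(int key) {
        std::unique_lock<std::mutex> lock(m_mutex);
        auto tmp_copy = m_map[key];
        return tmp_copy;
    }

private:
    std::mutex m_mutex;
    std::map<int, std::string> m_map;
};

// Usage sketch: the copy is unaffected by later changes to the map.
inline std::string example()
{
    Registry r;
    r.put(1, "a");
    std::string s = r.get_copy(1);
    r.put(1, "b");
    assert(s == "a");
    return s;
}
```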
Zygo Blaxell	c1e31004b6	crawl: change scan order to make forward progress at all times Previously, the scan order processed each subvol in order. This required very large amounts of temporary disk space, as a full filesystem scan was required before any shared extents could be deduped. If the hash table RAM was underprovisioned this would mean some shared dup blocks were removed from the hash table before they could be deduped. Currently the scan order takes the first unscanned extent from each subvol. This works well if--and only if--the subvols are either empty or children of a common ancestor. It forces the same inode/offset pairs to be read at close to the same time from each subvol. When a new snapshot is created, this ordering diverts scanning to the new subvol until it catches up to the existing subvols. For large filesystems with frequent snapshot creation this means that the scanner never reaches the end of all subvols. Each new subvol effectively resets the current scan position for the entire filesystem to zero. This prevents bees from ever completing the first filesystem scan. Change the order again, so that we now read one unscanned extent from each subvol in round-robin fashion. When a new subvol is created, we share scan time between old and new subvols. This ensures we eventually finish scanning initial subvols and enter the incremental scanning state. The cost of this change is more repeated reading of shared extents at scan time with less benefit from disk-device-level caching; however, the only way to really fix this problem is to implement scanning on tree 2 (the btrfs extent tree) instead of the subvol trees. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2016-12-27 15:15:42 -05:00
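The round-robin order introduced by commit c1e31004b6 (one unscanned extent from each subvol in turn) can be sketched as below. Modelling each subvol as a queue of extent addresses, and the names `SubvolQueue` and `round_robin_order`, are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical model: each subvol is a queue of unscanned extent addresses.
using SubvolQueue = std::deque<uint64_t>;

// Return (subvol index, extent) pairs in the order a round-robin scanner
// would visit them: one extent from each non-empty subvol per pass.
std::vector<std::pair<std::size_t, uint64_t>> round_robin_order(
    std::vector<SubvolQueue> subvols)
{
    std::size_t total = 0;
    for (const auto &q : subvols) {
        total += q.size();
    }

    std::vector<std::pair<std::size_t, uint64_t>> order;
    order.reserve(total);
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (std::size_t i = 0; i < subvols.size(); ++i) {
            if (subvols[i].empty()) {
                continue;
            }
            order.emplace_back(i, subvols[i].front());
            subvols[i].pop_front();
            progressed = true;
        }
    }
    assert(order.size() == total);
    return order;
}
```

Because every pass visits every non-empty queue, a newly added subvol shares scan time with the existing ones instead of pre-empting them, which is the property the message relies on to finish the initial scan.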
Zygo Blaxell	efda609f66	log: remove path from thread name The thread name has an arbitrarily limited size, and we are eventually removing support for multiple paths in a single bees daemon process. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2016-12-27 15:15:16 -05:00
Zygo Blaxell	7782b79e4b	crucible: reduce buffer size and CPU overhead for BtrfsIoctlSearchKey We really do need some large buffers for BtrfsIoctlSearchKey in some cases, but we don't need to zero them out first. Don't do that so we save some CPU. Reduce the default buffer size to 4K because most BISK users don't need much more than 1K. Set the buffer size explicitly to the product of the number of items and the desired item size in the places that really need a lot of items.	2016-12-13 21:46:35 -05:00
Zygo Blaxell	eec80944cd	roots: add a counter for crawl_ms, open_root and open_root_ino Linux kernel commit 7f8e406 ("btrfs: improve delayed refs iterations") seems to dramatically improve LOGICAL_INO performance. Hopefully this commit will find its way into mainline Linux soon. This means that most of the time in Bees is now spent on block reading (50-75%); however, there is still a big gap between block read and the sum of everything else we are measuring with the "*_ms" counters. This gap is about 30% of the run time, so it would be good to find out what's in the gap. Add ms counters around the crawl and open calls to capture where we are spending all the time.	2016-12-08 23:55:39 -05:00
Zygo Blaxell	06e111c229	crawl: remove UUID from file names Unfortunately we don't get to remove the libuuid dependency because we still want to read a file that exists in the legacy location.	2016-12-02 00:16:03 -05:00
Zygo Blaxell	cca0ee26a8	bees: remove local cruft, throw at github	2016-11-17 12:12:13 -05:00

37 Commits