GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 20:00:22 +01:00

Author	SHA1	Message	Date
Zygo Blaxell	c21518d8ff	stats: rename "chase_wrong_data" to "chase_no_data" An empty BeesBlockData from the chasing algorithm used to mean that data was found at the expected location but it does not match; however, there are now other reasons for this and they occur much more often. The name is misleading. Change the name to report more correctly what happens: no data, without any guess about the reason. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-03-01 00:01:13 -05:00
Zygo Blaxell	082f04818f	BeesBlockData: fix data type issues Not sure if these cause any problems, but they are theoretically incorrect data types. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:58:28 -05:00
Zygo Blaxell	5bdad7fc93	crucible: progress: a progress tracker for worker queues The task queue can become very large with many subvols, requiring hours for the queue to clear. 'beescrawl.dat' saves in the meantime will save the work currently scheduled, not the work currently completed. Fix by tracking progress with ProgressTracker. ProgressTracker::begin() gives the last completed crawl position. ProgressTracker::end() gives the last scheduled crawl position. begin() does not advance if any item between begin() and end() is not yet completed. In between are crawled extents that are on the task queue but not yet processed. The file 'beescrawl.dat' saves the begin() position while the extent scanning task queue is fed from the end() position. Also remove an unused method crawl_state_get() and repurpose the operator<(BeesCrawlState) that nobody was using. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:49:39 -05:00
Zygo Blaxell	33d274eabd	resolve: break up long intra-extent dedup loops When both block candidates for dedup are located in the same extent, bees excludes them from deduplication because the dedup operation would not free any space (both blocks are still referenced, so neither is deleted). Candidates in other extents are still considered. Typically a few blocks are duplicated many thousands or even millions of times within a filesystem. Many of these blocks appear in the same extent as each other. In cases where an extent contains an extremely common duplicate block, it may appear multiple times in many extents. bees can get into a loop with a very bad worst-case running time: 32768 blocks per extent * 2560 bees reference limit * 256 distinct hash table entries = 21.5 billion iterations...squared, because this loop happens every time bees encounters any of the references. Not an infinite number, but close enough. In each iteration of the loop, replace_dst detects that both src and dst block are part of the same btrfs extent data item and therefore should not be deduped; however, this occurs after the block has been allocated and read by chase_extent_ref. This dst is discarded, but the outer loop tries again with another reference to the same block and gets the same result. An easy fix for this problem is to stop the loop immediately when the same physical extent is found in both src and dst. The condition is rare enough to ignore the negligible space efficiency loss, and filesystem scan stops dead if the loop is allowed to proceed. An exception is thrown to terminate the loop at scan_one_extent from within replace_dst. It would be better to determine the extent bytenr of each candidate extent and filter them out in scan_one_extent (which reduces the number of LOGICAL_INO calls as a side-effect), but bees has no code capable of doing extent data tree lookups with backward iteration yet.
Even better would be to change the hash table format so that the extent bytenr can be decoded directly from the hash table entry (this already exists for compressed extents). Both of these changes are too large for v0.6. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-25 10:08:42 -05:00
Zygo Blaxell	8f0e88433e	roots: get rid of common error messages, add more error counters One very common case is losing a race to open a file that was deleted. No need to spam the logs with mere ENOENT reports. Other errors are more significant. Log those with errno, and add event counters to record them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-07 23:12:01 -05:00
Zygo Blaxell	6aad124241	crawl: somebody should set max_transid The previous commit had both max_transid assignments commented out. It happens to work because we set max_transid in the constructor and it doesn't change after that, but it's cleaner to assign it explicitly. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:52:12 -05:00
Zygo Blaxell	087ec26c44	crawl: filter extents correctly When an extent ref is modified, all of the refs in the same metadata page get the same transid in the TREE_SEARCH_V2 header. This causes two problems: - Extents with generation < min_transid are included if they happen to be referenced by pages with generation >= min_transid. - Extent refs with generation > max_transid are excluded even if they reference extents with generation <= max_transid. Both of these are wrong: the first causes some extents to be repeatedly scanned, the second causes some extents to not be scanned at all. Change the TREE_SEARCH_V2 parameters so that Crawl sees all extents newer than min_transid (i.e. set max_transid to max). The TREE_SEARCH_V2 kernel logic already operates this way, i.e. it fetches every page with transid >= min_transid and discards newer items if they are too new for max_transid. Filter strictly by the extent reference generation field (i.e. the copy of the extent generation that is in the extent reference). Note this still scans extent data multiple times, but it should now be exactly once per extent reference. A proper fix for this requires extent-based scanning instead of extent-ref-based scanning. Formerly commit `5a8c655fc4` "roots: filter out obsolete extents from extent refs" which landed in the subvol-threads branch but not master. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:48:39 -05:00
Kai Krakow	408b6ae138	Code style: Fix wrong indentation This had spaces instead of tabs by accident. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	5590fc0b13	Cmdline: Fix text alignment Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	29d40ca359	Cmdline: Rename "relative-paths" to "strip-paths" The previous name didn't match what this option really does. Affects: #41 Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	b164717a25	Cmdline: Rename "notimestamps" to "no-timestamps" That aligns better with the other options. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Zygo Blaxell	af250f7732	roots: determine transid_max without open()ing every subvol root Scan the roots tree directly for roots other than 5 (the FS root), and use btrfs_get_root_transid on root_fd for root 5. This avoids filling up the root FD cache every time we want a new transid_max. Now the only reason we open a subvol root FD is to open a file within the subvol. transid_max may be the same as the FS root's transid, in which case the search loop is not necessary. Place a counter (transid_max_miss) to see if we ever need to look at root items. If this counter never goes above zero, or does so very rarely, we can delete the search loop. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	4f0bc78a4c	crawl: don't block a Task waiting for new transids Task should not block for extended periods of time. Remove the RateEstimator::wait_for() in crawl_roots. When crawl_roots runs out of data, let the last crawl_task end without rescheduling. Schedule crawl_task again on transid polls if it was not already running. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	b67fba0acd	log: BEESLOGNOTE doesn't do what we think it does BEESLOGNOTE was intended to combine BEESLOG and BEESNOTE, i.e. write a log message and set the task status message from a single expression. With the log levels we would now need several more variants (BEESLOGNOTEDEBUG, BEESLOGNOTEERR...) or a parameter (BEESNOTELOG(DEBUG, ...)). Or we give up on the idea. This combination was used only 3 times so far. The log messages and the note message have different editorial styles. Remove the three instances of BEESLOGNOTE, and make the BEESLOGNOTE definition equivalent to BEESLOG at LOG_NOTICE level for consistency. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:38 -05:00
Zygo Blaxell	d367c6364c	context: improve toxic match logs Reword log message for discovery of new toxic extents vs. lookup of previously known toxic extents. Also add the block data (especially filename) to the discovery message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:06 -05:00
Zygo Blaxell	591a44e59a	resolve: drop support for old-style compressed BeesAddr No public version of bees ever created old-style compressed hash table entries. Remove the code that supports them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:06 -05:00
Zygo Blaxell	636328fdc2	roots: add scan-mode 2 "oldest crawler first" Add a third scan mode with alternative trade-offs. Benefits: Good sequential read performance. Avoids race conditions described in https://github.com/Zygo/bees/issues/27. Avoids diverting scan resources into short-lived snapshots before their long-lived origin subvols are fully scanned. Drawbacks: Takes the longest time of the three implemented scan-modes to free space in extents that are shared between snapshots. Uses the maximum amount of temporary space. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:05 -05:00
Zygo Blaxell	ef44947145	roots: move common code for creating crawl Tasks into a method Duplicated code between the different scan modes has slowly been becoming less and less trivial. Move the code to a method and make both scan-modes call it. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-28 22:52:17 -05:00
Zygo Blaxell	e74c0a9d80	scan: fix length mismatch exception for prealloc extents at EOF Prealloc extent sizes were taken from the Extent object and did not take the file size into account. If a file with a non-4K-aligned size is preallocated, the resulting dedup fails with an exception because the size of both ranges of the BeesRangePair do not match. Limit the size of the replacement hole extent to not extend past the end of the file. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-28 01:46:08 -05:00
Zygo Blaxell	762f833ab0	roots: poll every 10 transids Restarting scans for each transid is a bit aggressive. Scan every 10 transids for a polling rate close to the former BEES_COMMIT_INTERVAL. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	48e78bbe82	roots: use RateEstimator as a transid_max cache and clean up logs transid_max is now measured at a single point in the crawl_transid thread. Move the Crawl deferred logic into BeesRoots so it restarts all crawls when transid_max increases. Gets rid of some messy time arithmetic. Change name of Crawl thread to "crawl_master" in both thread name and log messages. Replace "Next transid" with "Crawl started". Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	ded26ff044	FdCache: clear cache on every new transid / crawl cycle The periodic cache age check was not protected by a lock, so multiple threads may decide to concurrently clear the cache. This led to duplicate log messages. Fix by moving the cache expiry trigger out of FdCache and into Roots, which knows when transids change and can perform cache clears at exactly the time they are most relevant, i.e. after something that was deleted becomes permanently so. This removes the last references to BEES_COMMIT_INTERVAL, so get rid of its definition too. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	72857e84c0	crawl: combine two messages per crawl cycle into one Now that the polling interval is up to 30 times faster, next_transid seems too verbose again. Make it clearer that the interval quoted in the "Deferring..." message is the computed transaction polling interval. Combine "Next transid" and "Restarted crawl" into a single message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	0fdae37962	roots: use RateEstimator to track transids Make the crawl polling interval more closely track the commit interval on the btrfs filesystem. In the future this will provide opportunities to do things like clear FD caches and stop crawls on deleted subvols, but triggered by transaction commits instead of arbitrary time intervals. Rename the "crawl" thread so it no longer has the same name as the "crawl" task, and repurpose it for dedicated transid polling. Cancel the deletion of crawl_thread and repurpose it to trigger new crawls and wake up the main crawl Task when it runs out of data. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	a3f02d5dec	roots: comment updates and general cleanup Fix discussion of nodatasum files, clarifying what we can and cannot do. Get rid of some BEESNOTE and BEESTRACE calls which cannot be observed (well, BEESNOTE can, but you have to be quick!). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	f6909dac17	bees: drop BEESINFO Having too many "write a message to the log" primitives is confusing, and having one that intermittently and silently discards output is even _more_ confusing. Replace all BEESINFO with appropriate BEESLOG*s. Usually DEBUG. Except for one or two that occur too often. Just delete those. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:05 -05:00
Zygo Blaxell	4ecd467ca0	BeesBlockData: don't leak file contents in the log The data field of BeesBlockData is only interesting to those who want to debug the BeesBlockData implementation or other battle-tested parts of bees. Users who want to do this can modify and rebuild the source to enable the output. To everyone else, the data field is a huge, ongoing infoleak through the log. Don't bother with an option, just output the length of the data field and nothing else. Fixes: https://github.com/Zygo/bees/issues/53 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	71be53eff6	types: don't throw an exception when it's likely we are already reporting an exception Empty files are a thing that can happen. Don't bomb out just reporting one's existence. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	f64fc78e36	Task: convert print_fn to a string Since we are now unconditionally rendering the print_fn as a static string, there is no need for it to be a function. We also need it to be brief and mostly constant. Use a string instead. Put the string before the function in the Task constructor arguments so that the title string appears as a heading in code, since we are making a breaking API change already. Drop TASK_MACRO as it is broken by this change, but there is no similar usage of Task anywhere to make it worth fixing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	0710208354	BeesNote: thread naming fixes Move pthread_setname_np to the same place we do pthread_getname_np. Detect errors in pthread_getname_np--but don't throw an exception because we would call ourself recursively from the exception handler when it tries to log the exception. Fix the order of set_name and the first BEESNOTE/BEESLOG call in threads, closing small time intervals where logs have the wrong thread name, and that wrong name becomes persistent for the thread. Make the main thread's name "bees" because Linux kernel stack traces use the pthread name of the main thread instead of the name of the process. Anonymous threads get the process name (usually "bees"). We should not have any such threads, but we do. This appears to occur mostly during exception stack unwinding. GCC/pthread bug? Fixes: https://github.com/Zygo/bees/issues/51 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:47:47 -05:00
Zygo Blaxell	5533d09b3d	Merge remote-tracking branch 'kakra/proposal/prepare-for-more-libs'	2018-01-20 14:23:55 -05:00
Zygo Blaxell	4c05c53d28	roots: update Task print functions for new usage This restores the old "crawl" prefix in the case of Crawler log messages. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 14:00:52 -05:00
Zygo Blaxell	5063a635fc	logging: get Task names for log messages When a Task worker thread is executing a Task, the thread name is less useful than the Task description. Use the Task description instead of the thread name if the thread has no BeesThread name and the thread is currently executing a task. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 14:00:51 -05:00
Zygo Blaxell	fef7aed8fa	BeesNote: if thread name was not set, get it from Task or pthread_getname_np Threads from the Task module in libcrucible don't set BeesNote::tl_name. Even if they did, in Task context the thread name is unspecific to the point of meaninglessness. Use the Task::print method as the name for such threads, and be sure that future Task print functions are designed for that usage. The extra complexity in BeesNote::get_name() seems preferable to bombarding pthread_setname_np hundreds or thousands of times per second. FIXME: we are now calling Task::print() on every BeesNote, which is effectively unconditionally. Maybe we should have Task::print() and get_name() return a closure, or just evaluate Task::print() once and cache it in TaskState, or define Task's constructor with a string argument instead of the current print_fn closure. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 13:57:51 -05:00
Zygo Blaxell	e970ac6c02	crawl: make logging less verbose Silence the three(!) log messages per crawl increment and an extra one at the end of the subvol. The three critical messages per subvol crawl cycle are: Next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <A>..<B> started <T> (<AGO>s ago) Subvol has been completely scanned and a new transaction range will be created. CrawlState is the state of the old subvol. Restarted crawl BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago) Subvol has been restarted. CrawlState is the state of the new subvol. Deferring next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago) Subvol has been completely scanned, but it is too soon to start a new scan. Fix the "Restart..." message to use the correct verb tense and to use the correct BeesCrawlState data. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 13:50:47 -05:00
Zygo Blaxell	38ccf5c921	counters: track pair growing time When we find a matching block we attempt to extend ("grow") the matched pair around the first matching block. This function takes the IO hit of reading the second extent from each duplicate extent pair. It's also very slow--too many allocations, too small reads, reads in the wrong order, an order of magnitude too many calls to TREE_SEARCH_V2, and it is usually in the top 3 most frequent PERFORMANCE warnings. Start tracking the running time of grows using the pairforward_ms and pairbackward_ms counters so that we can compare it to various replacements. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 13:04:56 -05:00
Kai Krakow	826b27fde2	Makefile: Fix some dependencies Some deps are already referenced by depends.mk, some were actually missing. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-19 01:50:13 +01:00
Kai Krakow	8a5f790a03	Makefile: Some cleanups Reorder and reformat some arguments so it looks more streamlined during the build process. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-19 01:50:13 +01:00
Kai Krakow	677da5de45	Logging: Add log levels to output This commit adds log levels to the output. In systemd, it makes colored lines, otherwise it's probably just a number. Bees is very chatty, so this paves the road for log level filtering. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-18 23:41:29 +01:00
Kai Krakow	d6b847db0d	Makefile: speedup dependency generation Dependencies can be generated in parallel which can be much faster. It also puts away the problem that for may fail multiple times in a row and leave behind a broken intermediate file which would be picked up by successive runs. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-18 22:53:00 +01:00
Kai Krakow	c8787fecd2	Makefile: depends.mk is not an optional include We really need depends.mk in the following Makefile reorganization. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-18 22:53:00 +01:00
Zygo Blaxell	00d9b8ed76	hash: do the mlock after loading the table The mlock runs much faster, probably because the hash fetches are doing most of the work that mlock does. It makes bees startup latency for testing smaller, even if it takes more time in absolute terms. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:58:44 -05:00
Zygo Blaxell	56c23c4517	crawl: implement two crawler algorithms and adjust scheduling parameters There are two subvol scan algorithms implemented so far. The two modes are unimaginatively named 0 and 1. 0: sorts extents by (inode, subvol, offset), 1: scans extents round-robin from all subvols. Algorithm 0 scans references to the same extent at close to the same time, which is good for performance; however, whenever a snapshot is created, the scan of the entire filesystem restarts at the beginning of the new snapshot. Algorithm 1 makes continuous forward progress even when new snapshots are created, but it does not benefit from caching and will force the kernel to reread data multiple times when there are snapshots. The algorithm can be selected at run-time using the -m or --scan-mode option. We can collect some field data on these before replacing them with an extent-tree-based scanner. Alternatively, for pre-4.14 kernels, we can keep these two modes as non-default options. Currently these algorithms have terrible names. TODO: fix that, but also TODO: delete all that code and do scans directly from the extent tree instead. Augment the scan algorithms relative to their earlier implementation by batching multiple extents to scan from each subvol before switching to a different subvol. Sprinkle some BEESNOTEs on the Task objects so that they don't disappear from the thread status output. Adjust some timing constants to deal with the increased latency from competing threads. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:53:49 -05:00
Zygo Blaxell	055c8d4c75	roots: scan in parallel using Tasks Distribute incoming extents across a thread pool for faster execution on multi-core, multi-disk environments. Switch extent enumeration model to scan extent refs consecutively(ish). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:52:00 -05:00
Zygo Blaxell	090d79e13b	crucible: remove unused TimeQueue and WorkQueue classes WorkQueue is superseded by Task. TimeQueue will be replaced by something based on Tasks. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:52:00 -05:00
Zygo Blaxell	796aaed7f8	roots: remove dead code and #if blocks In both instances the code contained within (or the conditional compilation surrounding it) is no longer controversial. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:52:00 -05:00
Zygo Blaxell	a175ee0689	bees: clean up #if 0 ... fsync ... #endif code Remove some dead code because dedup-related deadlocks have not been observed since Linux kernel v4.11. Preserve rationale of remaining #if 0 block (why we do write/rename instead of write/fsync/rename) so that people don't try to replace the "missing" fsync() there. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:30:07 -05:00
Zygo Blaxell	8d3a27bf85	subvol-threads: increase resource and thread limits With kernel 4.14 there is no sign of the previous LOGICAL_INO performance problems, so there seems to be no need to throttle threads using this ioctl. Increase the FD cache size limits and scan thread count. Let the kernel figure out scheduling. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:30:07 -05:00
Zygo Blaxell	42a6053229	roots: remove open_root_cache correctly BEESNOTE puts a message on the status message stack. BEESINFO logs a message with rate limiting. The message that was flooding the logs was coming from BEESINFO not BEESNOTE. Fix earlier commit which removed the wrong message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-17 22:30:07 -05:00
Zygo Blaxell	4aa5978a89	hash: reduce mutex contention using one mutex per hash table extent This avoids PERFORMANCE warnings when large hash tables are used on slow CPUs or with lots of worker threads. It also simplifies the code (no locksets, only one object-wide mutex instead of two). Fixed a few minor bugs along the way (e.g. we were not setting the dirty flag on the right hash table extent when we detected hash table errors). Simplified error handling: IO errors on the hash table are ignored, instead of throwing an exception into the function that tried to use the hash table. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-10 23:25:45 -05:00
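
Commit 5bdad7fc93 in the list above describes the begin() and end() positions of ProgressTracker. The following is a minimal Python sketch of that behaviour, not the crucible C++ implementation. The method names hold() and release(), and the requirement that positions are scheduled in strictly increasing order, are assumptions made for illustration only.

```python
class ProgressTracker:
    """Toy model of the begin()/end() semantics from commit 5bdad7fc93.

    hold() and release() are hypothetical names; the real crucible API
    may differ.
    """

    def __init__(self, start):
        self._begin = start   # last completed position (what would be saved)
        self._end = start     # last scheduled position (where new work is fed from)
        self._items = {}      # position -> done flag, kept in scheduling order

    def hold(self, position):
        """Schedule work at `position`; only end() moves forward."""
        if position <= self._end:
            raise ValueError("positions must be scheduled in increasing order")
        self._items[position] = False
        self._end = position

    def release(self, position):
        """Mark `position` completed, then advance begin() over the
        leading run of completed items. begin() stops at the first item
        that is still outstanding."""
        self._items[position] = True
        for pos in list(self._items):
            if not self._items[pos]:
                break
            self._begin = pos
            del self._items[pos]

    def begin(self):
        return self._begin

    def end(self):
        return self._end


def demo():
    tracker = ProgressTracker(0)
    for pos in (10, 20, 30):
        tracker.hold(pos)
    tracker.release(20)   # 10 is still outstanding, so begin() stays at 0
    tracker.release(10)   # 10 and 20 are both done, so begin() moves to 20
    return tracker.begin(), tracker.end()
```

In this model a state file that saves begin() never records a position past unfinished work, while the scheduler keeps feeding the queue from end().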

212 Commits