Remove a number of #if 0's.
Remove the redundant thread yield after implementing the same or better
in LockSet.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When an extent ref is modified, all of the refs in the same metadata
page get the same transid in the TREE_SEARCH_V2 header. All of the
extents are rescanned by later subvol scans. This causes up to 80%
overhead due to redundant reads of the same extents.
A proper fix for this requires extent-based scanning instead of
extent-ref-based scanning. Until that happens, filter out new references
to old extents.
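The interim filter can be sketched roughly as follows. The function name and the exact comparison are illustrative assumptions, not the real crawl code: the idea is that a ref whose extent was created at or before our last scan position points at an extent we have already scanned through another ref, so it can be skipped.

```cpp
#include <cstdint>

// Hypothetical predicate: skip a "new reference to an old extent".
// extent_generation is the transid at which the extent data was created;
// last_scanned_transid is the crawler's current scan position.
bool skip_ref(uint64_t extent_generation, uint64_t last_scanned_transid) {
    // Extent predates the scan position: already seen via another ref.
    return extent_generation <= last_scanned_transid;
}
```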
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BEESNOTE puts a message on the status message stack. BEESINFO logs a
message with rate limiting. The message that was flooding the logs
was coming from BEESINFO not BEESNOTE.
Fix earlier commit which removed the wrong message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
After a few hundred subvol threads start running, the inode cache starts
to thrash, and the log gets spammed with messages of the form:
"open_root_nocache <subvolid>: <path>"
Ideally there would be some way to schedule work to minimize inode
thrashing. Until that gets done, just silence the messages for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tune the concurrency model to work a little better with large numbers
of subvols. This is much less than the full rewrite Bees desperately
needs, but it provides a marginal improvement until the new code is ready.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we lose a race and open the wrong file, we will not retry with the
next path if the file we opened had incompatible flags. We need to keep
trying paths until we open the correct file or run out of paths.
Fix by moving the inode flag check after the checks for file identity.
Output attributes in hex to be consistent with other attribute error
messages.
There is no need to report root and file paths separately in the error
message for incompatible flags because we have confirmed the identity of
the file before the incompatible flag error is detected. Other messages
in this loop still output root path and file_path separately because
the identity of 'rv' is unknown at the time these messages are emitted.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If you have a lot of nocow files, or a few big ones (like VM images),
which contain a lot of potential deduplication candidates, bees becomes
incredibly slow, churning through a stream of "invalid operation"
exceptions. Let's just skip over such files to get more bang for the
buck. I did no regression testing as this patch seems trivial (and I
cannot imagine any pitfalls either). The process progresses much faster
for me now.
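The skip test can be sketched with the standard inode-flags ioctl. This is a standalone illustration, not the patch itself: the real code checks flags it already reads elsewhere, and on an ioctl failure this sketch conservatively treats the file as scannable.

```cpp
#include <linux/fs.h>
#include <sys/ioctl.h>

// Returns true if the file behind fd has the nocow attribute set
// (chattr +C), in which case btrfs rejects dedupe on its extents.
bool is_nocow(int fd) {
    long flags = 0;
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) != 0) {
        return false; // flags unknown: don't skip the file
    }
    return (flags & FS_NOCOW_FL) != 0;
}
```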
All testing so far indicates more crawlers go faster up to a limit
much larger than btrfs's performance limitations on subvols, even on
spinning rust. Remove the artificial constraint.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
gcc 7+ added the implicit-fallthrough warning.
In some places fallthrough is intentional, so disable the warning there.
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
This might improve performance on systems with more than 3 CPU cores...or
it might bring such a machine to its knees.
TODO: find out which of those two things happens.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This prevents two threads from attempting to dispose of the same physical
extent at the same time. This is a more precise exclusion than the
general lock on all tmpfiles.
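A minimal sketch of per-extent exclusion, assuming a set of in-flight physical addresses guarded by a mutex and condition variable. bees's real LockSet is a more general template; the class and method names here are illustrative:

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <set>

class ExtentLockSet {
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::set<uint64_t> m_busy; // physical extents currently being disposed of
public:
    // Block until no other thread holds this physical extent.
    void lock(uint64_t physical) {
        std::unique_lock<std::mutex> lk(m_mutex);
        m_cv.wait(lk, [&] { return m_busy.count(physical) == 0; });
        m_busy.insert(physical);
    }
    // Non-blocking variant: claim the extent only if it is free.
    bool try_lock(uint64_t physical) {
        std::unique_lock<std::mutex> lk(m_mutex);
        if (m_busy.count(physical)) return false;
        m_busy.insert(physical);
        return true;
    }
    void unlock(uint64_t physical) {
        std::unique_lock<std::mutex> lk(m_mutex);
        m_busy.erase(physical);
        m_cv.notify_all();
    }
};
```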
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BeesRoots::crawl_state_erase may invoke BeesCrawl::~BeesCrawl, which
will do a join on its crawl thread, which might be trying to lock
BeesRoots::m_mutex, which is locked by crawl_state_erase at the time.
Fix this by creating an extra reference to the BeesCrawl object, then
releasing the lock on BeesRoots::m_mutex, then deleting the reference.
The BeesCrawl object may still call methods on BeesRoots, but the only
such method is BeesRoots::crawl_state_set_dirty, and that method has
no dependency on the erased BeesCrawl shared_ptr.
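The fix is the standard lifetime-extension pattern: hold an extra shared_ptr, drop the lock, then let the destructor run outside the critical section. This is a simplified sketch with illustrative names, not the real BeesRoots code:

```cpp
#include <map>
#include <memory>
#include <mutex>

// Stand-in for BeesCrawl; in the real code its destructor joins the
// crawl thread, which may itself be waiting on m_mutex.
struct Crawl {};

std::mutex m_mutex;
std::map<int, std::shared_ptr<Crawl>> m_crawl_map;

void crawl_state_erase(int root) {
    std::shared_ptr<Crawl> keep_alive;
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_crawl_map.find(root);
        if (it != m_crawl_map.end()) {
            keep_alive = it->second; // extra reference taken under lock
            m_crawl_map.erase(it);   // map entry removed under lock
        }
    } // m_mutex released here
    // keep_alive's destructor (and any thread join) runs here,
    // after the lock is released, so no deadlock is possible.
}
```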
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This is yet another multi-threaded Bees experiment.
This time we are dividing the work by subvol: one thread is created to
process each subvol in the filesystem. There is no change in behavior
on filesystems containing only one subvol.
In order to avoid or mitigate the impact of kernel bugs and performance
issues, the btrfs ioctls FILE_EXTENT_SAME, SEARCH_V2, and LOGICAL_INO are
serialized. Only one thread may execute any of these ioctls at any time.
All three ioctls share a single lock.
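The serialization amounts to one mutex shared by all three ioctl call sites. A hedged sketch, with a generic wrapper standing in for the real call sites (the mutex and wrapper names are illustrative):

```cpp
#include <mutex>
#include <utility>

// One lock shared by FILE_EXTENT_SAME, SEARCH_V2, and LOGICAL_INO:
// only one thread may be inside any of these ioctls at a time.
std::mutex btrfs_ioctl_mutex;

// Run any callable (e.g. a lambda wrapping an ioctl) under the shared lock.
template <class F, class... Args>
auto run_serialized(F &&f, Args &&...args) {
    std::lock_guard<std::mutex> hold(btrfs_ioctl_mutex);
    return std::forward<F>(f)(std::forward<Args>(args)...);
}
```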
In order to simplify the implementation, only one thread is permitted to
create a temporary file during one call to scan_one_extent. This prevents
multiple threads from racing to replace the same physical extent with
separate physical copies.
The single "crawl" thread is replaced by one "crawl_<root_number>"
thread for each subvol.
The crawl size is reduced from 4096 items to 1024. This reduces the
memory requirement per subvol and keeps the data in memory fresher.
It also increases the number of log messages, so turn some of them off.
TODO: Currently there is no configurable limit on the total number
of threads. The number of CPUs is used as an upper bound on the number
of active threads; however, we still have one thread per subvol even if
all that most of those threads do is wait for locks.
TODO: Some of the single-threaded code is left behind until I make up
my mind about whether this experiment is successful.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Before:

    unique_lock<mutex> lock(some_mutex);

    // lock.~unique_lock() runs as part of the return,
    // so this returns a reference to unprotected heap memory
    return foo[bar];

After:

    unique_lock<mutex> lock(some_mutex);

    // make a copy of the object on the heap protected by the mutex
    auto tmp_copy = foo[bar];

    // lock.~unique_lock() runs as part of the return,
    // and the locally allocated copy is passed to the copy constructor
    return tmp_copy;
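The same pattern as a small compilable sketch. The map, key type, and function name are illustrative, not taken from the bees source:

```cpp
#include <map>
#include <mutex>
#include <string>

std::mutex some_mutex;
std::map<int, std::string> foo; // heap data protected by some_mutex

// Return by value: the copy is made while the lock is held, so the
// caller never touches the mutex-protected object after unlock.
std::string get_copy(int bar) {
    std::unique_lock<std::mutex> lock(some_mutex);
    auto tmp_copy = foo[bar];
    return tmp_copy; // lock released only after tmp_copy is moved out
}
```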
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Previously, the scan order processed each subvol in order. This required
very large amounts of temporary disk space, as a full filesystem scan
was required before any shared extents could be deduped. If the hash
table RAM was underprovisioned this would mean some shared dup blocks
were removed from the hash table before they could be deduped.
Currently the scan order takes the first unscanned extent from each
subvol. This works well if--and only if--the subvols are either empty
or children of a common ancestor. It forces the same inode/offset pairs
to be read at close to the same time from each subvol.
When a new snapshot is created, this ordering diverts scanning to the
new subvol until it catches up to the existing subvols. For large
filesystems with frequent snapshot creation this means that the scanner
never reaches the end of all subvols. Each new subvol effectively
resets the current scan position for the entire filesystem to zero.
This prevents bees from ever completing the first filesystem scan.
Change the order again, so that we now read one unscanned extent from
each subvol in round-robin fashion. When a new subvol is created, we
share scan time between old and new subvols. This ensures we eventually
finish scanning initial subvols and enter the incremental scanning state.
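The round-robin selection can be sketched as a trivial rotating index. This is an illustration of the scheduling idea only, not the real crawl loop:

```cpp
#include <cstddef>

// Each call returns the next subvol index in rotation, so a newly
// created subvol shares scan time with the existing ones instead of
// monopolizing the scanner until it catches up.
struct RoundRobin {
    size_t m_next = 0;
    size_t pick(size_t subvol_count) {
        size_t chosen = m_next % subvol_count;
        m_next = chosen + 1;
        return chosen;
    }
};
```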
The cost of this change is more repeated reading of shared extents at
scan time with less benefit from disk-device-level caching; however, the
only way to really fix this problem is to implement scanning on tree 2
(the btrfs extent tree) instead of the subvol trees.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The thread name has an arbitrarily limited size, and we are eventually
removing support for multiple paths in a single bees daemon process.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We really do need some large buffers for BtrfsIoctlSearchKey in some
cases, but we don't need to zero them out first. Stop doing that to
save some CPU.
Reduce the default buffer size to 4K because most BISK users don't
need much more than 1K. Set the buffer size explicitly to the product of
the number of items and the desired item size in the places that really
need a lot of items.
Linux kernel commit 7f8e406 ("btrfs: improve delayed refs iterations")
seems to dramatically improve LOGICAL_INO performance. Hopefully this
commit will find its way into mainline Linux soon.
This means that most of the time in Bees is now spent on block reading
(50-75%); however, there is still a big gap between block read and
the sum of everything else we are measuring with the "*_ms" counters.
This gap is about 30% of the run time, so it would be good to find out
what's in the gap.
Add ms counters around the crawl and open calls to capture where we are
spending all the time.