Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.
Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache): forcibly close all cached
FDs at regular intervals. The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths. The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.
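The workaround can be sketched as a cache whose entries are dropped wholesale on a timer; the names below are illustrative, not the actual bees FD-cache API:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Minimal sketch: a cache of shared handles that can be forcibly
// emptied at intervals.  Dropping the cached shared_ptr closes the FD
// once the last user releases it; the next lookup reopens the file by
// path, or fails cleanly if the path no longer exists.
template <class Key, class Value>
class ClearableCache {
	std::mutex m_mutex;
	std::map<Key, std::shared_ptr<Value>> m_map;
public:
	template <class Fn>
	std::shared_ptr<Value> at(const Key &key, Fn make) {
		std::unique_lock<std::mutex> lock(m_mutex);
		auto it = m_map.find(key);
		if (it == m_map.end()) {
			it = m_map.emplace(key, make(key)).first;
		}
		return it->second;
	}
	// Called from a timer thread: discard every cached entry.
	void clear() {
		std::unique_lock<std::mutex> lock(m_mutex);
		m_map.clear();
	}
};
```

Callers holding a shared_ptr keep their FD alive across a clear(); only the cache's own reference is dropped.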
Fixes: https://github.com/Zygo/bees/issues/18
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Interesting things happen when blindly swapping the release-build CCFLAGS
with the debug-build commented-out CCFLAGS. None of the things that
happen are good.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The bugs in other parts of the code have been identified and fixed,
so the overprotective locks around shared_ptr can be removed.
Keep the other improvements to the Resource class.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This might improve performance on systems with more than 3 CPU cores...or
it might bring such a machine to its knees.
TODO: find out which of those two things happens.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Dedup was spending a lot of time waiting for the ioctl mutex while it
was held by non-dedup ioctls; however, when dedup finally locked the
mutex, its average run time was comparatively short and the variance
was low.
With the various workarounds and kernel fixes in place, FILE_EXTENT_SAME
and bees play well enough together that we can allow multiple threads
to do dedup at the same time. The extent bytenr lockset should be
sufficient to prevent undesirable results (i.e. dup data not removed,
or deadlocks on old kernels).
Remove the ioctl lock on dedup.
LOGICAL_INO and SEARCH_V2 (as used by BeesCrawl) remain under the ioctl
mutex because they can still have arbitrarily large run times.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This prevents two threads from attempting to dispose of the same physical
extent at the same time. This is a more precise exclusion than the
general lock on all tmpfiles.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BeesRoots::crawl_state_erase may invoke BeesCrawl::~BeesCrawl, which
will do a join on its crawl thread, which might be trying to lock
BeesRoots::m_mutex, which is locked by crawl_state_erase at the time.
Fix this by creating an extra reference to the BeesCrawl object, then
releasing the lock on BeesRoots::m_mutex, then deleting the reference.
The BeesCrawl object may still call methods on BeesRoots, but the only
such method is BeesRoots::crawl_state_set_dirty, and that method has
no dependency on the erased BeesCrawl shared_ptr.
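The fix described above can be sketched like this (illustrative names, not the exact bees code):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Stand-in for BeesCrawl: its destructor joins a worker thread.
struct Crawl {
	static int live;
	Crawl() { ++live; }
	~Crawl() { --live; }  // join would happen here
};
int Crawl::live = 0;

std::mutex m_mutex;
std::map<int, std::shared_ptr<Crawl>> m_crawl_map;

void crawl_state_erase(int root) {
	std::shared_ptr<Crawl> keep_alive;       // extra reference
	{
		std::unique_lock<std::mutex> lock(m_mutex);
		auto it = m_crawl_map.find(root);
		if (it == m_crawl_map.end()) return;
		keep_alive = it->second;         // hold the object...
		m_crawl_map.erase(it);           // ...while erasing the map entry
	}   // m_mutex released here
	// keep_alive goes out of scope next: ~Crawl (and its thread join)
	// runs with m_mutex unlocked, so the joined thread can lock it.
}
```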
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This is yet another multi-threaded Bees experiment.
This time we are dividing the work by subvol: one thread is created to
process each subvol in the filesystem. There is no change in behavior
on filesystems containing only one subvol.
In order to avoid or mitigate the impact of kernel bugs and performance
issues, the btrfs ioctls FILE_EXTENT_SAME, SEARCH_V2, and LOGICAL_INO are
serialized. Only one thread may execute any of these ioctls at any time.
All three ioctls share a single lock.
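A minimal sketch of that serialization, with hypothetical names:

```cpp
#include <cassert>
#include <mutex>

// One global mutex serializes the three btrfs ioctls
// (FILE_EXTENT_SAME, SEARCH_V2, LOGICAL_INO).
std::mutex btrfs_ioctl_mutex;

template <class IoctlFn>
auto serialized_ioctl(IoctlFn fn) -> decltype(fn()) {
	std::unique_lock<std::mutex> lock(btrfs_ioctl_mutex);
	return fn();  // only one of the three ioctls runs at a time
}
```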
In order to simplify the implementation, only one thread at a time is
permitted to create a temporary file during a call to scan_one_extent.
This prevents
multiple threads from racing to replace the same physical extent with
separate physical copies.
The single "crawl" thread is replaced by one "crawl_<root_number>"
thread for each subvol.
The crawl size is reduced from 4096 items to 1024. This reduces the
memory requirement per subvol and keeps the data in memory fresher.
It also increases the number of log messages, so turn some of them off.
TODO: Currently there is no configurable limit on the total number
of threads. The number of CPUs is used as an upper bound on the number
of active threads; however, we still have one thread per subvol even if
all that most of those threads do is wait for locks.
TODO: Some of the single-threaded code is left behind until I make up
my mind about whether this experiment is successful.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
check_overflow() will invalidate iterators if it decides there are too
many cache entries.
If items are deleted from the cache, search for the inserted item again
to ensure the iterator is valid.
Increase size of timestamp to size_t.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some whitespace fixes. Remove some duplicate code. Don't lock
two BeesStats objects in the operator- method.
Get the locking for T& at(const K&) right to avoid locking a mutex
recursively. Make the non-const version of the function private.
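The locking shape described above, sketched with a hypothetical stats class (not the real BeesStats):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

class Stats {
	mutable std::mutex m_mutex;
	std::map<std::string, uint64_t> m_stats_map;
	// Private non-const accessor: callers must already hold m_mutex,
	// so it never takes the lock itself (no recursive locking).
	uint64_t &at_unlocked(const std::string &name) {
		return m_stats_map[name];
	}
public:
	// Public const accessor: the only place the mutex is taken.
	uint64_t at(const std::string &name) const {
		std::unique_lock<std::mutex> lock(m_mutex);
		auto it = m_stats_map.find(name);
		return it != m_stats_map.end() ? it->second : 0;
	}
	void add(const std::string &name, uint64_t delta) {
		std::unique_lock<std::mutex> lock(m_mutex);
		at_unlocked(name) += delta;  // no second lock here
	}
};
```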
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Before:
unique_lock<mutex> lock(some_mutex);
// run lock.~unique_lock() because return
// return reference to unprotected heap
return foo[bar];
After:
unique_lock<mutex> lock(some_mutex);
// make copy of object on heap protected by mutex lock
auto tmp_copy = foo[bar];
// run lock.~unique_lock() because return
// pass locally allocated object to copy constructor
return tmp_copy;
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we release the lock first (and C++ destructor order says we do), then
the return value will be constructed from data living in an unprotected
container object. That data might be destroyed before we get to the
copy constructor for the return value.
Make a temporary copy of the return value that won't be destroyed by any
other thread, then unlock the mutex, then return the copy object.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Get rid of the ResourceHolder class.
Fix GCC static template member instantiation issues.
Replace assert() with exceptions.
shared_ptr can't seem to do reference counting in a multi-threaded
environment. The code looks correct (for both ResourceHandle and
std::shared_ptr); however, continual segfaults don't lie.
Carpet-bomb with mutex locks to reduce the likelihood of losing shared_ptr
races.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Found by valgrind. It was mostly harmless because the range of
usable values is limited by m_burst (which was initialized) and 0.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"s_name" was a thread_local variable, not static, and did not require a
mutex to protect access. A deadlock is possible if a thread triggers an
exception with a handler that attempts to log a message (as the top-level
exception handler in bees does).
Remove multiple unnecessary mutex locks. Rename the thread_local variables
to make their scope clearer.
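For illustration, a thread_local variable has one instance per thread, so the owning thread can read and write it without a mutex (names here are hypothetical):

```cpp
#include <cassert>
#include <string>
#include <thread>

// One instance per thread; no lock needed for access from the
// owning thread, even inside an exception handler.
thread_local std::string tl_name = "bees";

void set_thread_name(const std::string &name) {
	tl_name = name;  // no mutex: only this thread sees tl_name
}
```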
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add MADV_DONTDUMP to the list of advice flags.
There are now three flags which may or may not be supported by the
target kernel. Try each one and log its success or failure separately.
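The per-flag probing might look like this (Linux-specific sketch; the exact flag list bees uses is an assumption here):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>

// Try each advice flag with its own madvise() call, so one
// unsupported flag doesn't hide the result of the others.
bool try_advice(void *addr, size_t len, int advice, const char *name) {
	if (madvise(addr, len, advice) == 0) {
		fprintf(stderr, "madvise(%s): OK\n", name);
		return true;
	}
	perror(name);
	return false;
}

void advise_hash_table(void *addr, size_t len) {
	try_advice(addr, len, MADV_WILLNEED, "MADV_WILLNEED");
	try_advice(addr, len, MADV_RANDOM,   "MADV_RANDOM");
#ifdef MADV_DONTDUMP
	// MADV_DONTDUMP requires Linux 3.4+, hence the compile-time guard.
	try_advice(addr, len, MADV_DONTDUMP, "MADV_DONTDUMP");
#endif
}
```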
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The hash table statistics calculation in BeesHashTable::prefetch_loop
and the data-driven operation of the extent scanner always pulls the
hash table into RAM as fast as the disk will push the data. We never
use the prefetch rate limit, so remove it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This fixes a bug where bees tries to process itself as a btrfs filesystem.
This is a species of bug that I only notice *after* pushing to a public
git repo.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Extend the LockSet class so that the total number of locked (active)
items can be limited. When the limit is reached, no new items can be
locked until some existing locked items are unlocked.
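A minimal sketch of such a limited lock set (the real crucible LockSet API may differ):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <set>

// lock(item) blocks while the item is already held or while max_size
// items are held; unlock(item) wakes waiters.
template <class T>
class LimitedLockSet {
	std::mutex m_mutex;
	std::condition_variable m_cond;
	std::set<T> m_set;
	size_t m_max_size;
public:
	explicit LimitedLockSet(size_t max_size) : m_max_size(max_size) {}
	void lock(const T &item) {
		std::unique_lock<std::mutex> lock(m_mutex);
		m_cond.wait(lock, [&] {
			return m_set.count(item) == 0
			    && m_set.size() < m_max_size;
		});
		m_set.insert(item);
	}
	void unlock(const T &item) {
		std::unique_lock<std::mutex> lock(m_mutex);
		m_set.erase(item);
		m_cond.notify_all();
	}
	size_t size() {
		std::unique_lock<std::mutex> lock(m_mutex);
		return m_set.size();
	}
};
```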
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Every git commit was causing bees.cc and bees-hash.cc to be rebuilt,
which was expensive and unnecessary.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The btrfs LOGICAL_INO ioctl has no way to report references to compressed
blocks precisely, so we must always consider all references to a
compressed block, and discard those that do not have the desired offset.
When we encounter compressed shared extents containing a mix of unique
and duplicate data, we attempt to replace all references to the mixed
extent with the same number of references to multiple extents consisting
entirely of unique or duplicate blocks. An early exit from the loop
in BeesResolver::for_each_extent_ref was stopping this operation early,
after replacing as few as one shared reference. This left other shared
references to the unique data on the filesystem, effectively creating
new dup data.
The failing pattern looks like this:
dedup: replace 0x14000..0x18000 from some other extent
copy: 0x10000..0x14000
dedup: replace 0x10000..0x14000 with the copy
[may be multiple dedup lines due to multiple shared references]
copy: 0x18000..0x1c000
[missing dedup 0x18000..0x1c000 with the copy here]
scan: 0x10000 [++++dddd++++] 0x1c000
If the extent 0x10000..0x1c000 is shared and compressed, we will make
a copy of the extent at 0x18000..0x1c000. When we try to dedup this
copy extent, LOGICAL_INO will return a mix of references to the data
at logical 0x10000 and 0x18000 (which are both references to the
original shared extent with different offsets). If we break out
of the loop too early, we will stop as soon as a reference to 0x10000
is found, and ignore all other references to the extent we are trying
to remove.
The copy at the beginning of the extent (0x10000..0x14000) usually
works because all references to the extent cover the entire extent.
When bees performs the dedup at 0x14000..0x18000, bees itself creates
the shared references with different offsets.
Uncompressed extents were not affected because LOGICAL_INO can locate
physical blocks precisely if they reside in uncompressed extents.
This change will hurt performance when looking up old physical addresses
that belong to new data, but that is a much less urgent problem.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
btrfs provides a flush on rename when the rename target exists, so the
fsync is not necessary. In the initialization case (when the rename
target does not exist and the implicit flush does not occur), the file
may be empty or a hole after a crash. Bees treats this case the same
as if the file did not exist. Since this condition occurs for only the
first 15 minutes of the lifetime of a bees installation, it's not worth
bothering to fix.
If we attempt to fsync the file ourselves, on a crash with log replay,
btrfs will end up with a directory entry pointing to a non-existent inode.
This directory entry cannot be deleted or renamed except by deleting
the entire subvol. On large filesystems this bug is triggered by nearly
every crash (verified on kernels up to 4.5.7).
Remove the fsync to avoid the btrfs bug, and accept the failure mode
that occurs in the first 15 minutes after a bees install.
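The resulting save path, sketched with a hypothetical helper (not the exact bees code):

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Write the new state to a temporary file, then rename() it over the
// old one.  Deliberately no fsync: when the rename target already
// exists, btrfs flushes the new data as part of the rename, and the
// fsync would trigger the log-replay bug described above.
bool save_state(const std::string &path, const std::string &data) {
	std::string tmp = path + ".tmp";
	int fd = open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) return false;
	bool ok = write(fd, data.data(), data.size())
	       == static_cast<ssize_t>(data.size());
	close(fd);
	// Atomic replace of the old state file.
	return ok && rename(tmp.c_str(), path.c_str()) == 0;
}
```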
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Previously, the scan order processed each subvol in order. This required
very large amounts of temporary disk space, as a full filesystem scan
was required before any shared extents could be deduped. If the hash
table RAM was underprovisioned this would mean some shared dup blocks
were removed from the hash table before they could be deduped.
Currently the scan order takes the first unscanned extent from each
subvol. This works well if, and only if, the subvols are either empty
or children of a common ancestor. It forces the same inode/offset pairs
to be read at close to the same time from each subvol.
When a new snapshot is created, this ordering diverts scanning to the
new subvol until it catches up to the existing subvols. For large
filesystems with frequent snapshot creation this means that the scanner
never reaches the end of all subvols. Each new subvol effectively
resets the current scan position for the entire filesystem to zero.
This prevents bees from ever completing the first filesystem scan.
Change the order again, so that we now read one unscanned extent from
each subvol in round-robin fashion. When a new subvol is created, we
share scan time between old and new subvols. This ensures we eventually
finish scanning initial subvols and enter the incremental scanning state.
The cost of this change is more repeated reading of shared extents at
scan time with less benefit from disk-device-level caching; however, the
only way to really fix this problem is to implement scanning on tree 2
(the btrfs extent tree) instead of the subvol trees.
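The round-robin order can be sketched as follows (hypothetical types, not the actual BeesCrawl code):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Take one unscanned item from each subvol in turn, so a newly
// created subvol shares scan time with the old ones instead of
// monopolizing the scanner until it catches up.
std::vector<std::string>
round_robin(std::map<int, std::vector<std::string>> queues) {
	std::vector<std::string> order;
	bool any = true;
	while (any) {
		any = false;
		for (auto &p : queues) {  // one item per subvol per pass
			if (!p.second.empty()) {
				order.push_back(p.second.front());
				p.second.erase(p.second.begin());
				any = true;
			}
		}
	}
	return order;
}
```

A subvol with a long backlog no longer starves the others; each pass takes at most one item from every subvol.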
Signed-off-by: Zygo Blaxell <bees@furryterror.org>