GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 19:00:22 +00:00

Author	SHA1	Message	Date
Timofey Titovets	06d41fd518	Rewrite beesd arg parser Signed-off-by: Timofey Titovets <timofey.titovets@synesis.ru>	2018-09-15 00:21:06 +03:00
Kai Krakow	788774731b	Gentoo: Rework Gentoo ebuild into overlay This commit squashes all the little changes from the previous integration branch into one, adjusts to the new Makefile changes, and introduces an overlay layout so that the contrib/gentoo-bees subtree can be directly added as a Portage overlay to the system. The following list contains the previous commit descriptions: sys-fs/bees: Keyword tested architecture ~amd64 Bees was tested on this platform. sys-fs/bees: Add kernel version checks Add checking the kernel versions and write some info and/or warnings before building and installing the package. Running bees on older kernels may have some serious performance and stability impacts, let's tell the user about it. Closes #55 sys-fs/bees: Add metadata.xml sys-fs/bees: There's no configure script So, there's no point in calling "default". sys-fs/bees: Simplify src_configure() sys-fs/bees: Don't depend on markdown It makes no sense to install both README.md and README.html, and we can get rid of one dependency. Dependencies: btrfs-progs is no longer a buildtime-only dep It is actually needed by the bees service wrapper script, as pointed out by Gentoo QA review. sys-fs/bees: DOCS is not needed "COPYING" is already covered by the licensing. The ebuild defaults already include README* sys-fs/bees: Make warnings exclusive It was recommended by Gentoo QA to show only either one or another warning, and change the texts accordingly. sys-fs/bees: RDEPEND is not implicit RDEPEND does not implicitly default to DEPEND. Let's explicitly set the variable. sys-fs/bees: IUSE=test is only needed for explicit dependencies Thus, remove it. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 05:06:39 +02:00
Kai Krakow	679a327ac5	Makefile: Do not force optimizations by default Make life easier for package maintainers by not forcing architecture or compiler optimizations by default. E.g., Gentoo QA refuses to accept both "-march=native" and "-O3". These are usually provided by the package tooling. Instead, we provide easily accessible templates in "makeflags". Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 04:05:15 +02:00
Kai Krakow	31b41bb3c2	Makefile: Do not force making README.html This forces us to depend on markdown which would be otherwise optional. Most of the time it is sufficient to let package managers just install the README.md file. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 03:34:48 +02:00
Kai Krakow	d7e235c178	Makefile: "which" is not portable It was pointed out by Gentoo QA that "type -P" is a better choice. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 03:14:18 +02:00
Kai Krakow	51108f839d	Makefile: Due to VPATH, libcrucible links to hard-coded libuuid path Due to VPATH and how make resolves source paths, libcrucible.so ends up with a hard-coded path to link against libuuid.so. Let's fix it by turning the general rule into an explicit rule for libcrucible.so. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 03:07:20 +02:00
Kai Krakow	8d102abf8b	Makefile: create a template compiler This creates a simple template compiler using sed as a reusable variable. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:59:54 +02:00
Kai Krakow	83e8f87dc9	Scripts: Don't prefix timestamps when running with systemd Since systemd prefixes its own timestamps, we can unconditionally remove timestamps when bees is executed by systemd. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:59:54 +02:00
Kai Krakow	4417b18d9e	Makefile: .version.o is made from a generated file We should probably not put it into the objects list. Let's instead explicitly put it as a depend of libcrucible.so. This allows us to not use *.cc as a depend for .version.cc which makes more sense as CRUCIBLE_OBJS is also explicitly defined and not built from wildcards. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:59:54 +02:00
Kai Krakow	8636312cab	Compilation: Let the code know about package config This commit adds support for putting package configuration options into header files. This is needed to prepare reading config files from /etc. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:59:54 +02:00
Kai Krakow	17e1171464	Installation: Remove USR_PREFIX from Makefile This commit removes USR_PREFIX and introduces ETC_PREFIX instead. The purpose of PREFIX is the installation prefix in the system, not the installation destination. The latter one is what DESTDIR is used for. This should clear up the confusion. PREFIX was already mis-used as installation destination. But that doesn't mix well with how the make targets are designed. CC: Timofey Titovets <nefelim4ag@gmail.com> Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:59:52 +02:00
Kai Krakow	9069201036	Scripts: Fix systemd unit not being templated Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:21:08 +02:00
Kai Krakow	ace814321f	Makefile: Auto-detect systemd unit path This uses pkg-config to detect the system unit dir. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:21:08 +02:00
Kai Krakow	451f0ad9aa	Makefile: Allow installation of fiemap/fiewalk support tools There's now a new make target called "install_tools" which would not run by default on installation. One can add "OPTIONAL_INSTALL_TARGETS=install_tools" into localconf to install these by default. fiewalk would be installed to sbin, as only root can run it, the other goes to bin. Gentoo can use this to optionally install these tools as a package feature. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:20:59 +02:00
Kai Krakow	85f9265034	Makefile: make installing libs a separate target This will allow installing fiemap/fiewalk support tools as an optional install target. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:13:27 +02:00
Kai Krakow	5b28aad27f	Makefile: Run install tests only for default target "reallyall" Otherwise, tests would still run during "make install". Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:13:27 +02:00
Kai Krakow	6c47bb61c1	Makefile: remove tests from "make all" Instead, introduce "make reallyall" and make it the default target. Now, one can override the default target using localconf. Needed for preparing Gentoo ebuild test behavior. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-09-08 02:13:27 +02:00
Timofey Titovets	2d14fd90e4	Update options in sample config Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>	2018-08-29 11:44:25 +03:00
Timofey Titovets	e0f315d47a	Make beesd -h useful Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>	2018-08-29 11:44:25 +03:00
Zygo Blaxell	e564d27dda	README: update known bugs and issues list Also split "bad feature interactions" into "unknown" (which is what it really was before) and "bad" (which includes some filesystem-destroying problems). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:16:09 -04:00
Zygo Blaxell	c3effe0a20	crawl: use custom order instead of (ab)using BeesFileRange::operator< This makes the code clearer and keeps changes to BeesFileRange ordering isolated. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:16:08 -04:00
Zygo Blaxell	f8c27f5c6a	bees: revert TOXIC_INTERVAL back to pre-4.14 levels Linux kernel 4.14, while resistant to extent toxicity, is not immune to it. Go back to the paranoid setting to avoid tying up filesystems in ridiculously long kernel loops in find_parent_nodes. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:16:08 -04:00
Zygo Blaxell	26039cd559	tempfile: update comments around bees_sync Deadlock reproduced on kernel 4.14.34. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:16:04 -04:00
Zygo Blaxell	e9aef89293	fs: fix FTBFS on GCC 8 The memset is just doing an assignment from one dereferenced pointer to another, so do an assignment to keep GCC 8 happy. Fixes: https://github.com/Zygo/bees/issues/64 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-05-18 00:15:37 -04:00
Zygo Blaxell	c21518d8ff	stats: rename "chase_wrong_data" to "chase_no_data" An empty BeesBlockData from the chasing algorithm used to mean that data was found at the expected location but it does not match; however, there are now other reasons for this and they occur much more often. The name is misleading. Change the name to report more correctly what happens: no data, without any guess about the reason. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-03-01 00:01:13 -05:00
Zygo Blaxell	082f04818f	BeesBlockData: fix data type issues Not sure if these cause any problems, but they are theoretically incorrect data types. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:58:28 -05:00
Zygo Blaxell	5bdad7fc93	crucible: progress: a progress tracker for worker queues The task queue can become very large with many subvols, requiring hours for the queue to clear. 'beescrawl.dat' saves in the meantime will save the work currently scheduled, not the work currently completed. Fix by tracking progress with ProgressTracker. ProgressTracker::begin() gives the last completed crawl position. ProgressTracker::end() gives the last scheduled crawl position. begin() does not advance if any item between begin() and end() is not yet completed. In between are crawled extents that are on the task queue but not yet processed. The file 'beescrawl.dat' saves the begin() position while the extent scanning task queue is fed from the end() position. Also remove an unused method crawl_state_get() and repurpose the operator<(BeesCrawlState) that nobody was using. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:49:39 -05:00
Zygo Blaxell	90c32c3f05	crucible: MAP_32BIT is not defined on ARM Also fix a stray #if that should be #ifdef. Closes: https://github.com/Zygo/bees/issues/59 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-25 10:08:44 -05:00
Zygo Blaxell	33d274eabd	resolve: break up long intra-extent dedup loops When both block candidates for dedup are located in the same extent, bees excludes them from deduplication because the dedup operation would not free any space (both blocks are still referenced, so neither is deleted). Candidates in other extents are still considered. Typically a few blocks are duplicated many thousands or even millions of times within a filesystem. Many of these blocks appear in the same extent as each other. In cases where an extent contains an extremely common duplicate block, it may appear multiple times in many extents. bees can get into a loop with a very bad worst-case running time: 32768 blocks per extent * 2560 bees reference limit * 256 distinct hash table entries = 21.5 billion iterations...squared, because this loop happens every time bees encounters any of the references. Not an infinite number, but close enough. In each iteration of the loop, replace_dst detects that both src and dst block are part of the same btrfs extent data item and therefore should not be deduped; however, this occurs after the block has been allocated and read by chase_extent_ref. This dst is discarded, but the outer loop tries again with another reference to the same block and gets the same result. An easy fix for this problem is to stop the loop immediately when the same physical extent is found in both src and dst. The condition is rare enough to ignore the negligible space efficiency loss, and filesystem scan stops dead if the loop is allowed to proceed. An exception is thrown to terminate the loop at scan_one_extent from within replace_dst. It would be better to determine the extent bytenr of each candidate extent and filter them out in scan_one_extent (which reduces the number of LOGICAL_INO calls as a side-effect), but bees has no code capable of doing extent data tree lookups with backward iteration yet. Even better would be to change the hash table format so that the extent bytenr can be decoded directly from the hash table entry (this already exists for compressed extents). Both of these changes are too large for v0.6. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-25 10:08:42 -05:00
Zygo Blaxell	2ac94438bd	README: FD caches are now cleared every 10 transactions Also some other minor editorial changes. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-14 21:09:05 -05:00
Zygo Blaxell	9063c6442f	README: clarify that bees is not to be used on old kernels Also note that there is currently no released Linux kernel that is free of relevant bugs. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-14 20:54:48 -05:00
Zygo Blaxell	86afa69cd1	cache: release lock before clearing Clearing the FD cache could trigger a lot of inode evicts in the kernel, which will block the cache entry destructors called by map::clear(). This prevents any cache lookups or new file opens while it happens. Move the map to an auto variable and destroy it after releasing the mutex lock. This probably has the same net result (all the bees threads will be blocked in the kernel instead of on a bees mutex), but at least the problem is outside of userspace now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-07 23:14:38 -05:00
Zygo Blaxell	8f0e88433e	roots: get rid of common error messages, add more error counters One very common case is losing a race to open a file that was deleted. No need to spam the logs with mere ENOENT reports. Other errors are more significant. Log those with errno, and add event counters to record them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-07 23:12:01 -05:00
Zygo Blaxell	5c1b45d67c	extentwalker: remove wrong constraint check Extents that extend past EOF will have ipos = (file size rounded up to next block) and e.end() = (file size not rounded), which fails this constraint check. The constraint check is wrong. Remove it for now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-07 00:07:57 -05:00
Zygo Blaxell	6aad124241	crawl: somebody should set max_transid The previous commit had both max_transid assignments commented out. It happens to work because we set max_transid in the constructor and it doesn't change after that, but it's cleaner to assign it explicitly. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:52:12 -05:00
Zygo Blaxell	087ec26c44	crawl: filter extents correctly When an extent ref is modified, all of the refs in the same metadata page get the same transid in the TREE_SEARCH_V2 header. This causes two problems: - Extents with generation < min_transid are included if they happen to be referenced by pages with generation >= min_transid. - Extent refs with generation > max_transid are excluded even if they reference extents with generation <= max_transid. Both of these are wrong: the first causes some extents to be repeatedly scanned, the second causes some extents to not be scanned at all. Change the TREE_SEARCH_V2 parameters so that Crawl sees all extents newer than min_transid (i.e. set max_transid to max). The TREE_SEARCH_V2 kernel logic already operates this way, i.e. it fetches every page with transid >= min_transid and discards newer items if they are too new for max_transid. Filter strictly by the extent reference generation field (i.e. the copy of the extent generation that is in the extent reference). Note this still scans extent data multiple times, but it should now be exactly once per extent reference. A proper fix for this requires extent-based scanning instead of extent-ref-based scanning. Formerly commit `5a8c655fc4` "roots: filter out obsolete extents from extent refs" which landed in the subvol-threads branch but not master. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-31 22:48:39 -05:00
Kai Krakow	408b6ae138	Code style: Fix wrong indentation This had spaces instead of tabs by accident. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	e3c4a07216	Makefile: Unclutter "make test" output This adds a .txt Makefile target to create a text file which receives the test program output. In case the test failed, it will cat the contents and fail the target. Execution of each test itself is forced, so it would run every time make is invoked, thus no failing test would be missed. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	d8241a7720	README: Add notes about packaging Give some pointers on how to package bees for a distribution. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	5590fc0b13	Cmdline: Fix text alignment Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	29d40ca359	Cmdline: Rename "relative-paths" to "strip-paths" The previous name didn't match what this option really does. Affects: #41 Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Kai Krakow	b164717a25	Cmdline: Rename "notimestamps" to "no-timestamps" That aligns better with the other options. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-29 21:37:40 -05:00
Zygo Blaxell	af250f7732	roots: determine transid_max without open()ing every subvol root Scan the roots tree directly for roots other than 5 (the FS root), and use btrfs_get_root_transid on root_fd for root 5. This avoids filling up the root FD cache every time we want a new transid_max. Now the only reason we open a subvol root FD is to open a file within the subvol. transid_max may be the same as the FS root's transid, in which case the search loop is not necessary. Place a counter (transid_max_miss) to see if we ever need to look at root items. If this counter never goes above zero, or does so very rarely, we can delete the search loop. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	4f0bc78a4c	crawl: don't block a Task waiting for new transids Task should not block for extended periods of time. Remove the RateEstimator::wait_for() in crawl_roots. When crawl_roots runs out of data, let the last crawl_task end without rescheduling. Schedule crawl_task again on transid polls if it was not already running. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:39 -05:00
Zygo Blaxell	b67fba0acd	log: BEESLOGNOTE doesn't do what we think it does BEESLOGNOTE was intended to combine BEESLOG and BEESNOTE, i.e. write a log message and set the task status message from a single expression. With the log levels we would now need several more variants (BEESLOGNOTEDEBUG, BEESLOGNOTEERR...) or a parameter (BEESNOTELOG(DEBUG, ...)). Or we give up on the idea. This combination was used only 3 times so far. The log messages and the note message have different editorial styles. Remove the three instances of BEESLOGNOTE, and make the BEESLOGNOTE definition equivalent to BEESLOG at LOG_NOTICE level for consistency. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 21:37:38 -05:00
Zygo Blaxell	92fda34a68	task: allow user access to ID and default constructor The default constructor makes it more convenient to use Task as a class member. The ID is useful to disambiguate Task references. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:54:06 -05:00
Zygo Blaxell	2aacdcd95f	time: add update_monotonic to RateEstimator update_monotonic does not reset the counter if a new count is smaller than earlier counts. Useful when consuming an unsorted stream of event counts. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:51:13 -05:00
Zygo Blaxell	d367c6364c	context: improve toxic match logs Reword log message for discovery of new toxic extents vs. lookup of previously known toxic extents. Also add the block data (especially filename) to the discovery message. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:06 -05:00
Zygo Blaxell	591a44e59a	resolve: drop support for old-style compressed BeesAddr No public version of bees ever created old-style compressed hash table entries. Remove the code that supports them. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:06 -05:00
Zygo Blaxell	27125b8140	README: add scan-mode 2 and expand descriptions of modes 0 and 1 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-29 00:48:06 -05:00
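
The begin()/end() behaviour described in commit 5bdad7fc93 can be illustrated with a minimal sketch. This is not the crucible ProgressTracker itself: the class name SimpleProgressTracker and the methods hold() and release() are hypothetical, and positions are plain integers rather than crawl states. It only shows the rule that begin() may not pass any item that is scheduled but not yet completed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified tracker (not the crucible API).
// begin(): last position with no unfinished work at or before it.
// end():   last position handed out to the work queue.
class SimpleProgressTracker {
public:
    explicit SimpleProgressTracker(uint64_t start) : m_begin(start), m_end(start) {}

    // Record that the item at `pos` has been scheduled.
    void hold(uint64_t pos) {
        m_in_flight.insert(pos);
        m_end = std::max(m_end, pos);
    }

    // Record that the item at `pos` has completed.
    void release(uint64_t pos) {
        auto it = m_in_flight.find(pos);
        if (it == m_in_flight.end()) return;  // unknown item, ignore
        m_in_flight.erase(it);
        m_done.insert(pos);
        // Advance begin over completed items, but stop at the oldest
        // item that is still in flight.
        while (!m_done.empty()) {
            const uint64_t oldest_done = *m_done.begin();
            if (!m_in_flight.empty() && *m_in_flight.begin() <= oldest_done) break;
            m_begin = std::max(m_begin, oldest_done);
            m_done.erase(m_done.begin());
        }
    }

    uint64_t begin() const { return m_begin; }
    uint64_t end() const { return m_end; }

private:
    uint64_t m_begin;
    uint64_t m_end;
    std::multiset<uint64_t> m_in_flight;  // scheduled, not yet completed
    std::set<uint64_t> m_done;            // completed, not yet passed by begin
};
```

In this sketch a saved checkpoint would use begin(), so a restart repeats at most the work that was still queued, while new work would be fed from end().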

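
Commit 86afa69cd1 describes moving the cache map into a local variable and destroying it after the mutex is released. A minimal sketch of that idiom follows, assuming a plain std::map guarded by a std::mutex. The class name SketchCache and its methods are hypothetical and are not the bees cache API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical cache used only to illustrate the locking pattern.
template <class Key, class Value>
class SketchCache {
public:
    void insert(const Key &key, Value value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_map[key] = std::move(value);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_map.size();
    }

    void clear() {
        std::map<Key, Value> doomed;
        {
            // Hold the lock only long enough to swap the contents out.
            std::lock_guard<std::mutex> lock(m_mutex);
            doomed.swap(m_map);
        }
        // `doomed` is destroyed when this function returns, after the lock
        // is released, so slow entry destructors do not block other
        // lookups or inserts on m_mutex.
    }

private:
    std::mutex m_mutex;
    std::map<Key, Value> m_map;
};
```

As the commit message notes, threads may still end up blocked elsewhere while entries are destroyed; the swap only keeps that wait off the cache mutex.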

434 Commits
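
The worst-case figure quoted in commit 33d274eabd can be checked with a few lines of arithmetic. The three factors come from the commit message; the constant and function names are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Factors as stated in the commit message (names are illustrative only).
constexpr uint64_t kBlocksPerExtent = 32768;
constexpr uint64_t kReferenceLimit = 2560;
constexpr uint64_t kHashTableEntries = 256;

// Product of the three factors: iterations for one pass of the loop.
constexpr uint64_t worst_case_iterations() {
    return kBlocksPerExtent * kReferenceLimit * kHashTableEntries;
}

// 21,474,836,480, i.e. roughly the 21.5 billion in the commit message.
static_assert(worst_case_iterations() == 21474836480ULL,
              "product of the three factors");
```

The commit message adds that this count is effectively squared, because the loop runs again each time bees encounters any of the references.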