GGLinnk/bees - bees - Virtual World Git

mirror of https://github.com/Zygo/bees.git synced 2025-07-01 00:02:27 +02:00

Author	SHA1	Message	Date
Zygo Blaxell	20b8f8ae0b	bees: use helper function for readahead There seem to be multiple ways to do readahead in Linux, and only some of them work. Hopefully reading the actual data is one of them. This is an attempt to avoid page-by-page reads in the generic dedupe code. We load both extents into the VFS cache (read sequentially) and hope they are still there by the time we call dedupe on them. We also call readahead(2) and hopefully that either helps or does nothing. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:56:54 -04:00
Zygo Blaxell	0bbaddd54c	docs: finally concede that the consensus spelling is "dedupe" Change documentation and comments to use the word "dedupe," not "dedup" as found in circa-3.15 kernel sources. No changes in code or program output--if they used "dedup" before, they will continue to be spelled "dedup" now. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2021-06-11 20:49:15 -04:00
Zygo Blaxell	7117cb40c5	hash: prepare for user-selectable hash functions Localize the hash function in bees to a single spot to make it easier to change later (or at runtime). Remove some code that was using a property of CRC as an optimization. The optimization doesn't work for other hash functions, and running the CRC function takes more CPU time than the optimization saved. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2019-06-12 22:48:06 -04:00
Zygo Blaxell	96eb100ded	bees: use readahead instead of posix_fadvise Other btrfs utils use readahead() not posix_fadvise(). There does not appear to be a performance or correctness difference between the three (none, posix_fadvise, or readahead()). Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-09-14 23:50:00 -04:00
Zygo Blaxell	082f04818f	BeesBlockData: fix data type issues Not sure if these cause any problems, but they are theoretically incorrect data types. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-02-28 23:58:28 -05:00
Zygo Blaxell	4ecd467ca0	BeesBlockData: don't leak file contents in the log The data field of BeesBlockData is only interesting to those who want to debug the BeesBlockData implementation or other battle-tested parts of bees. Users who want to do this can modify and rebuild the source to enable the output. To everyone else, the data field is a huge, ongoing infoleak through the log. Don't bother with an option, just output the length of the data field and nothing else. Fixes: https://github.com/Zygo/bees/issues/53 Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	71be53eff6	types: don't throw an exception when it's likely we are already reporting an exception Empty files are a thing that can happen. Don't bomb out just reporting one's existence. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-26 23:48:04 -05:00
Zygo Blaxell	38ccf5c921	counters: track pair growing time When we find a matching block we attempt to extend ("grow") the matched pair around the first matching block. This function takes the IO hit of reading the second extent from each duplicate extent pair. It's also very slow--too many allocations, too small reads, reads in the wrong order, an order of magnitude too many calls to TREE_SEARCH_V2, and it is usually in the top 3 most frequent PERFORMANCE warnings. Start tracking the running time of grows using the pairforward_ms and pairbackward_ms counters so that we can compare it to various replacements. Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2018-01-20 13:04:56 -05:00
Kai Krakow	677da5de45	Logging: Add log levels to output This commit adds log levels to the output. In systemd, it makes colored lines, otherwise it's probably just a number. Bees is very chatty, so this paves the road for log level filtering. Signed-off-by: Kai Krakow <kai@kaishome.de>	2018-01-18 23:41:29 +01:00
Zygo Blaxell	8cde833863	bees: make a thread note when we read data Reads can block indefinitely due to bugs, low io priority, or poor storage performance. Record the block origin data in the thread state so we can see which reads are problematic. Signed-off-by: Zygo Blaxell <bees@furryterror.org> (cherry picked from commit `f56f736d28`)	2017-06-17 10:15:11 -04:00
Zygo Blaxell	e0951ed4ba	bees: use C++11 syntax for constant initializers This lets us use more default constructors. Signed-off-by: Zygo Blaxell <bees@furryterror.org> (cherry picked from commit `8a932a632f`)	2017-06-17 10:15:11 -04:00
Zygo Blaxell	c479b361cd	bees: remove file open serialization mutex It is no longer necessary. Signed-off-by: Zygo Blaxell <bees@furryterror.org> (cherry picked from commit `5c91045557`)	2017-06-17 10:15:11 -04:00
Zygo Blaxell	c6c3990d19	bees: types: improve serialization of byte ranges Use () instead of [] when the respective end of the byte range touches the beginning or end of the file. Also omit the '0' at beginning of file. Signed-off-by: Zygo Blaxell <bees@furryterror.org> (cherry picked from commit `3023b7f57a`)	2017-06-17 10:15:11 -04:00
Zygo Blaxell	db8ea92133	bees: fix further instances of copy-after-unlock bug Before: unique_lock<mutex> lock(some_mutex); // run lock.~unique_lock() because return // return reference to unprotected heap return foo[bar]; After: unique_lock<mutex> lock(some_mutex); // make copy of object on heap protected by mutex lock auto tmp_copy = foo[bar]; // run lock.~unique_lock() because return // pass locally allocated object to copy constructor return tmp_copy; Signed-off-by: Zygo Blaxell <bees@furryterror.org>	2017-01-22 22:00:27 -05:00
Zygo Blaxell	cca0ee26a8	bees: remove local cruft, throw at github	2016-11-17 12:12:13 -05:00
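
The "After" pattern quoted in commit db8ea92133 can be sketched as a small self-contained class. This is only an illustration under stated assumptions: `GuardedTable`, `set`, and `get` are hypothetical names and do not come from the bees source. The point it shows is the one the commit message makes, namely that the copy of the protected object is taken explicitly while the `unique_lock` is still held, and only that local copy is returned.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical example class; not taken from the bees source.
class GuardedTable {
public:
    void set(const std::string &key, const std::string &value)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_table[key] = value;
    }

    // Returns by value. The copy is made explicitly while the lock is held,
    // so the caller never touches the shared map entry itself.
    std::string get(const std::string &key)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        // make copy of object protected by the mutex lock
        auto tmp_copy = m_table[key];
        // lock is released when this function returns;
        // tmp_copy is a local object, independent of m_table
        return tmp_copy;
    }

private:
    std::mutex m_mutex;
    std::map<std::string, std::string> m_table;
};
```

A caller would write something like `GuardedTable t; t.set("a", "1"); auto v = t.get("a");`. Returning a reference from `get` instead would hand the caller a pointer into the map that is no longer protected once the lock is released, which is the situation the explicit local copy avoids.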

15 Commits
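
The approach described in commit 20b8f8ae0b (call `readahead(2)`, then read the data sequentially so it is likely to be in the VFS cache before dedupe is called) might look roughly like the sketch below. This is not the bees helper: the function name `warm_cache`, the 1 MiB scratch buffer, and the return convention are all assumptions made for illustration. `readahead` is Linux-specific, and its result is deliberately ignored here because the commit message treats it as something that "either helps or does nothing".

```cpp
#include <fcntl.h>    // readahead (Linux-specific; g++ defines _GNU_SOURCE by default)
#include <unistd.h>   // pread
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the function used in bees.
// Returns the number of bytes actually read, or -1 on a read error.
ssize_t warm_cache(int fd, off_t offset, size_t length)
{
    // Advisory hint to the kernel; failure is ignored on purpose.
    ssize_t hint = readahead(fd, offset, length);
    (void) hint;

    // Read the range sequentially into a scratch buffer (size is an assumption).
    // The data itself is discarded; the goal is only to populate the page cache.
    std::vector<char> buf(std::min<size_t>(length, size_t(1) << 20));
    size_t total = 0;
    while (total < length) {
        size_t want = std::min(buf.size(), length - total);
        ssize_t got = pread(fd, buf.data(), want, offset + static_cast<off_t>(total));
        if (got < 0) {
            return -1;          // read error
        }
        if (got == 0) {
            break;              // end of file reached before 'length' bytes
        }
        total += static_cast<size_t>(got);
    }
    return static_cast<ssize_t>(total);
}
```

In this sketch a caller would invoke `warm_cache` once for each of the two extents before issuing the dedupe request. Nothing guarantees the pages are still cached by then, which matches the "hope they are still there" wording of the commit message.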