To make bees friendlier to use with syslog/systemd, add an option to
omit timestamps from the log output.
Signed-off-by: Kai Krakow <kai@kaishome.de>
This commit adds a simple getopt-based option parser that shows help
output. It can be used as boilerplate for adding more options later.
Signed-off-by: Kai Krakow <kai@kaishome.de>
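For reference, a minimal sketch of what such a boilerplate can look like.
The option names and help text here are illustrative, not necessarily the
ones bees actually ships with:

    #include <getopt.h>
    #include <cstdlib>
    #include <iostream>

    static void print_help(const char *name)
    {
        std::cerr << "Usage: " << name << " [options] fs-root-path\n"
                  << "    -h, --help            show this help\n"
                  << "    -t, --notimestamps    omit timestamps from log output\n";
    }

    int parse_args(int argc, char *argv[])
    {
        static const struct option long_options[] = {
            { "help",         no_argument, nullptr, 'h' },
            { "notimestamps", no_argument, nullptr, 't' },
            { nullptr, 0, nullptr, 0 },
        };
        int c;
        while ((c = getopt_long(argc, argv, "ht", long_options, nullptr)) != -1) {
            switch (c) {
            case 't':
                // a real implementation would disable log timestamps here
                break;
            case 'h':
            default:
                print_help(argv[0]);
                std::exit(EXIT_SUCCESS);
            }
        }
        return optind;  // index of the first non-option argument
    }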
Remove a number of #if 0's.
Remove the redundant thread yield now that LockSet implements the same
behavior or better.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When an extent ref is modified, all of the refs in the same metadata
page get the same transid in the TREE_SEARCH_V2 header. All of the
extents are rescanned by later subvol scans. This causes up to 80%
overhead due to redundant reads of the same extents.
A proper fix for this requires extent-based scanning instead of
extent-ref-based scanning. Until that happens, filter out new references
to old extents.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
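The workaround amounts to judging each ref by the extent's own generation
instead of trusting the transid of its metadata page alone. A simplified
sketch of that idea; the struct and field names are illustrative, not
bees's actual types:

    #include <cstdint>

    struct ExtentRef {
        uint64_t data_gen;    // generation recorded in the extent item itself
    };

    struct TransidWindow {
        uint64_t min_transid; // lower bound of the current crawl
        uint64_t max_transid; // upper bound of the current crawl
    };

    // A ref in a freshly modified metadata page may still point at an old
    // extent; the extent's own generation says whether the data is new.
    static bool should_scan(const ExtentRef &ref, const TransidWindow &w)
    {
        return ref.data_gen >= w.min_transid && ref.data_gen < w.max_transid;
    }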
BEESNOTE puts a message on the status message stack. BEESINFO logs a
message with rate limiting. The message that was flooding the logs
was coming from BEESINFO, not BEESNOTE.
Fix an earlier commit which removed the wrong message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
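For reference, the intended division of labor between the two macros.
These stand-in definitions are heavily simplified; the real macros in
bees keep a per-thread status stack and apply rate limiting:

    #include <iostream>
    #include <sstream>
    #include <string>

    static thread_local std::string bees_status_line;

    // BEESNOTE: update this thread's entry in the status output; no log traffic.
    #define BEESNOTE(x) do { \
        std::ostringstream oss_; oss_ << x; \
        bees_status_line = oss_.str(); \
    } while (0)

    // BEESINFO: write a line to the log (the real macro rate-limits it).
    #define BEESINFO(x) do { \
        std::cerr << "INFO: " << x << std::endl; \
    } while (0)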
After a few hundred subvol threads start running, the inode cache starts
to thrash, and the log gets spammed with messages of the form:
"open_root_nocache <subvolid>: <path>"
Ideally there would be some way to schedule work to minimize inode
thrashing. Until that gets done, just silence the messages for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
With many threads it is inconvenient to reassemble the elided parts of
the dedup src/dst and scan filenames output. Simply output them
unconditionally, and balance the line lengths.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tune the concurrency model to work a little better with large numbers
of subvols. This is much less than the full rewrite bees desperately
needs, but it provides a marginal improvement until the new code is ready.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This code has been #if 0 for a long time, and it seems unlikely it
will ever be useful.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we lose a race and open the wrong file, we will not retry with the
next path if the file we opened had incompatible flags. We need to keep
trying paths until we open the correct file or run out of paths.
Fix by moving the inode flag check after the checks for file identity.
Output attributes in hex to be consistent with other attribute error
messages.
There is no need to report root and file paths separately in the error
message for incompatible flags because we have confirmed the identity of
the file before the incompatible flag error is detected. Other messages
in this loop still output root path and file_path separately because
the identity of 'rv' is unknown at the time these messages are emitted.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
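In outline, the corrected loop checks file identity before it rejects a
candidate for its inode flags. A simplified sketch; the real code also
verifies root, inode number, and other attributes before accepting a path:

    #include <fcntl.h>
    #include <unistd.h>
    #include <string>
    #include <vector>

    int open_by_any_path(const std::vector<std::string> &paths,
                         bool (*is_right_file)(int fd),
                         bool (*has_incompatible_flags)(int fd))
    {
        for (const auto &path : paths) {
            int fd = open(path.c_str(), O_RDONLY);
            if (fd < 0) {
                continue;                        // path gone, try the next one
            }
            if (!is_right_file(fd)) {            // identity checks come first...
                close(fd);
                continue;                        // ...a lost race means keep trying
            }
            if (has_incompatible_flags(fd)) {    // only a confirmed match can be
                close(fd);                       // rejected for its inode flags
                return -1;
            }
            return fd;
        }
        return -1;
    }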
If you have many nocow files, or a few big ones (like VM images), that
contain a lot of potential deduplication candidates, bees becomes
incredibly slow as it churns through a lot of "invalid operation"
exceptions.
Let's just skip over such files to get more bang for the buck. I did no
regression testing, as this patch seems trivial (and I cannot imagine any
pitfalls either). The process progresses much faster for me now.
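Roughly, the nocow attribute can be detected with a single ioctl on the
open file; whether the patch checks it at open time or at scan time is an
implementation detail, so treat this as a sketch:

    #include <linux/fs.h>
    #include <sys/ioctl.h>

    // Returns true if the file has the nocow attribute set; dedup on such
    // files fails with "invalid operation".
    static bool file_is_nocow(int fd)
    {
        int flags = 0;
        if (ioctl(fd, FS_IOC_GETFLAGS, &flags) != 0) {
            return false;    // can't tell; let later code handle the error
        }
        return flags & FS_NOCOW_FL;
    }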
This helps identify causes of the "same physical address in dedup"
exception.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit cc7b4f22b5df3a1f52d27060ee8a6a3352b8cd10)
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used. Remove them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit a3d7032edaf5fc584412d0dcf8773f1cafa8f2dc)
Add time spent in file create and copy operations to the stats.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f01c20f97269083175a74d1a1fd3ebaced2d9560)
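The usual pattern for this kind of accounting is a scoped timer that adds
its elapsed time to a named counter when it goes out of scope. A
stand-alone sketch; the counter name and helper types are stand-ins, not
bees's actual Timer or stats machinery:

    #include <chrono>
    #include <map>
    #include <mutex>
    #include <string>

    static std::map<std::string, double> stats_seconds;
    static std::mutex stats_mutex;

    struct ScopedTimer {
        std::string m_name;
        std::chrono::steady_clock::time_point m_start;
        explicit ScopedTimer(std::string name)
            : m_name(std::move(name)), m_start(std::chrono::steady_clock::now()) {}
        ~ScopedTimer()
        {
            double secs = std::chrono::duration<double>(
                std::chrono::steady_clock::now() - m_start).count();
            std::lock_guard<std::mutex> lock(stats_mutex);
            stats_seconds[m_name] += secs;       // accumulates into the stats report
        }
    };

    void make_tmpfile_example()
    {
        ScopedTimer t("tmp_create");             // counter name is illustrative
        // ... create the temporary file here ...
    }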
A BEESTRACE closure could throw an exception. Trap those so we don't
end up in terminate().
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 59660cfc00b9ca233eeb1a7cdf6df34a45a2deba)
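The defensive pattern is to never let a trace closure's own exception
propagate while another exception is being reported. A minimal sketch of
that pattern, not the exact bees code:

    #include <exception>
    #include <functional>
    #include <iostream>

    // Invoked while reporting another exception, so nothing may escape from
    // here; a second in-flight exception would call std::terminate().
    void run_trace_closure(const std::function<void()> &closure) noexcept
    {
        try {
            closure();
        } catch (const std::exception &e) {
            std::cerr << "exception in trace closure: " << e.what() << std::endl;
        } catch (...) {
            std::cerr << "unknown exception in trace closure" << std::endl;
        }
    }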
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance. Record the block origin data in the thread state
so we can see which reads are problematic.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f56f736d28970a0f03ee887a5bd5515cc749d413)
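One way to do this is to stash the origin of the in-flight read in
thread-local state, so a stuck read shows up in the status output with its
location. A sketch with illustrative names, not bees's actual helpers:

    #include <unistd.h>
    #include <cstdint>
    #include <string>

    struct ReadNote {
        std::string where;      // e.g. the file or subsystem issuing the read
        uint64_t    offset = 0;
        uint64_t    size   = 0;
    };

    static thread_local ReadNote current_read;

    // Wrap pread so the thread's status reflects what it is blocked on.
    ssize_t noted_pread(int fd, void *buf, size_t count, off_t offset,
                        const std::string &where)
    {
        current_read.where  = where;
        current_read.offset = static_cast<uint64_t>(offset);
        current_read.size   = count;
        ssize_t rv = pread(fd, buf, count, offset);   // may block for a long time
        current_read = ReadNote();                    // clear once the read returns
        return rv;
    }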
This lets us use more default constructors.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 8a932a632ff4602a0357ed5fbcd3f86b6bc50283)
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file. Also omit the '0' at beginning of
file.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 3023b7f57a3003242bc770bcfe55f666227680ff)
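The convention described above, written out as a small formatting helper.
The '..' separator and function name are just for illustration:

    #include <cstdint>
    #include <sstream>
    #include <string>

    // '[begin..end]' for an interior range, '(..end]' when the range starts
    // at offset 0 (the 0 is omitted), '[begin..)' when it runs to EOF.
    std::string format_byte_range(uint64_t begin, uint64_t end, uint64_t file_size)
    {
        std::ostringstream oss;
        if (begin == 0) {
            oss << "(";              // touches beginning of file: '(' and no 0
        } else {
            oss << "[" << begin;
        }
        oss << "..";
        if (end == file_size) {
            oss << ")";              // touches end of file
        } else {
            oss << end << "]";
        }
        return oss.str();
    }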
Use a different character to make it easier to search for bytenr ranges
in the logs.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit d43199e3d6e6469264eb10de8b0a783f8573e0e8)
This will allow the default size limit for cache objects to be changed
with impunity.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 9daa51edaab44c02ce0917ff94b20683036d7594)
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used. Remove them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance. Record the block origin data in the thread state
so we can see which reads are problematic.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file. Also omit the '0' at beginning of
file.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
All testing so far indicates that more crawlers go faster, up to a limit
much larger than btrfs's performance limitations on subvols, even on
spinning rust. Remove the artificial constraint.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
gcc 7+ added the implicit-fallthrough warning. In some places fallthrough
is intentional, so disable the warning there.
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
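For reference, the two usual ways to mark an intentional fallthrough so
gcc 7+ stays quiet; whether this change annotates the code or simply
adjusts the warning flags is a build detail:

    #include <iostream>

    void describe(int level)
    {
        switch (level) {
        case 2:
            std::cout << "two ";
            [[fallthrough]];         // C++17 attribute, recognized by gcc 7+
        case 1:
            std::cout << "one ";
            // fall through
        case 0:
            std::cout << "zero\n";
            break;
        default:
            std::cout << "many\n";
            break;
        }
    }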
Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.
Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache): forcibly close all cached
FDs at regular intervals. The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths. The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.
Fixes: https://github.com/Zygo/bees/issues/18
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
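In outline, the cache gets a timer and drops everything it holds when the
timer expires. A simplified sketch of the mechanism; the real cache is
keyed by root and inode and refills itself via path lookup:

    #include <chrono>
    #include <map>
    #include <mutex>

    template <class Key, class Fd>
    class TimedFdCache {
        std::map<Key, Fd> m_map;
        std::mutex m_mutex;
        std::chrono::steady_clock::time_point m_last_clear = std::chrono::steady_clock::now();
        std::chrono::seconds m_max_age { 900 };  // interval is illustrative

    public:
        // Returns nullptr on a miss; callers then reopen by path and insert.
        Fd *lookup(const Key &key)
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            auto now = std::chrono::steady_clock::now();
            if (now - m_last_clear > m_max_age) {
                m_map.clear();                   // close all cached FDs so deleted
                m_last_clear = now;              // inodes can finally be destroyed
            }
            auto found = m_map.find(key);
            return found == m_map.end() ? nullptr : &found->second;
        }

        void insert(const Key &key, Fd fd)
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_map[key] = std::move(fd);
        }
    };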
This might improve performance on systems with more than 3 CPU cores...or
it might bring such a machine to its knees.
TODO: find out which of those two things happens.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>