mirror of https://github.com/Zygo/bees.git synced 2025-06-16 17:46:16 +02:00
Commit Graph

70 Commits

Author SHA1 Message Date
56c23c4517 crawl: implement two crawler algorithms and adjust scheduling parameters
There are two subvol scan algorithms implemented so far.  The two modes
are unimaginatively named 0 and 1.

	0:  sorts extents by (inode, subvol, offset),

	1:  scans extents round-robin from all subvols.

Algorithm 0 scans references to the same extent at close to the same
time, which is good for performance; however, whenever a snapshot is
created, the scan of the entire filesystem restarts at the beginning of
the new snapshot.

Algorithm 1 makes continuous forward progress even when new snapshots
are created, but it does not benefit from caching and will force the
kernel to reread data multiple times when there are snapshots.

The algorithm can be selected at run-time using the -m or --scan-mode
option.
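
A minimal sketch of the two orderings described above, not the bees implementation. The `ExtentRef` struct and both function names are hypothetical, introduced only to illustrate the difference between sorting by (inode, subvol, offset) and taking extents round-robin from each subvol:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical extent reference; field names are illustrative only.
struct ExtentRef {
    uint64_t subvol;
    uint64_t inode;
    uint64_t offset;
};

// Mode 0 (sketch): order all refs by (inode, subvol, offset), so that
// references to the same inode in different subvols end up adjacent.
std::vector<ExtentRef> order_mode0(std::vector<ExtentRef> refs) {
    std::sort(refs.begin(), refs.end(),
              [](const ExtentRef &a, const ExtentRef &b) {
                  return std::tie(a.inode, a.subvol, a.offset) <
                         std::tie(b.inode, b.subvol, b.offset);
              });
    return refs;
}

// Mode 1 (sketch): take one ref from each subvol in turn until every
// subvol is exhausted.  Each inner vector holds one subvol's refs.
std::vector<ExtentRef> order_mode1(
        const std::vector<std::vector<ExtentRef>> &per_subvol) {
    std::vector<ExtentRef> out;
    for (size_t i = 0; ; ++i) {
        bool any = false;
        for (const auto &sv : per_subvol) {
            if (i < sv.size()) {
                out.push_back(sv[i]);
                any = true;
            }
        }
        if (!any) break;
    }
    return out;
}
```

In the round-robin sketch a newly added subvol simply becomes one more inner vector, so the other subvols keep advancing. In the sorted sketch the order depends on the complete set of refs.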

We can collect some field data on these before replacing them with
an extent-tree-based scanner.  Alternatively, for pre-4.14 kernels,
we can keep these two modes as non-default options.

Currently these algorithms have terrible names.  TODO:  fix that, but
also TODO: delete all that code and do scans directly from the extent
tree instead.

Augment the scan algorithms relative to their earlier implementation by
batching multiple extents to scan from each subvol before switching to
a different subvol.

Sprinkle some BEESNOTEs on the Task objects so that they don't
disappear from the thread status output.

Adjust some timing constants to deal with the increased latency from
competing threads.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:53:49 -05:00
055c8d4c75 roots: scan in parallel using Tasks
Distribute incoming extents across a thread pool for faster execution
on multi-core, multi-disk environments.

Switch extent enumeration model to scan extent refs consecutively(ish).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:52:00 -05:00
090d79e13b crucible: remove unused TimeQueue and WorkQueue classes
WorkQueue is superseded by Task.  TimeQueue will be replaced by
something based on Tasks.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:52:00 -05:00
796aaed7f8 roots: remove dead code and #if blocks
In both instances the code contained within (or the conditional
compilation surrounding it) is no longer controversial.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:52:00 -05:00
a175ee0689 bees: clean up #if 0 ... fsync ... #endif code
Remove some dead code because dedup-related deadlocks have not been
observed since Linux kernel v4.11.

Preserve rationale of remaining #if 0 block (why we do write/rename
instead of write/fsync/rename) so that people don't try to replace the
"missing" fsync() there.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:30:07 -05:00
8d3a27bf85 subvol-threads: increase resource and thread limits
With kernel 4.14 there is no sign of the previous LOGICAL_INO performance
problems, so there seems to be no need to throttle threads using this
ioctl.

Increase the FD cache size limits and scan thread count.  Let the kernel
figure out scheduling.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:30:07 -05:00
42a6053229 roots: remove open_root_cache correctly
BEESNOTE puts a message on the status message stack.  BEESINFO logs a
message with rate limiting.  The message that was flooding the logs
was coming from BEESINFO, not BEESNOTE.

Fix earlier commit which removed the wrong message.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:30:07 -05:00
4aa5978a89 hash: reduce mutex contention using one mutex per hash table extent
This avoids PERFORMANCE warnings when large hash tables are used on slow
CPUs or with lots of worker threads.  It also simplifies the code (no
locksets, only one object-wide mutex instead of two).

Fixed a few minor bugs along the way (e.g. we were not setting the dirty
flag on the right hash table extent when we detected hash table errors).

Simplified error handling:  IO errors on the hash table are ignored,
instead of throwing an exception into the function that tried to use the
hash table.
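
As a rough illustration of the one-mutex-per-extent idea, assuming a table whose cells are grouped into fixed-size extents: the class name, members, and sizes below are hypothetical and do not reflect the actual BeesHashTable layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical table: each extent has its own mutex, so threads that
// touch different extents do not contend with each other.
class ShardedTable {
public:
    ShardedTable(size_t extents, size_t cells_per_extent)
        : m_cells_per_extent(cells_per_extent),
          m_cells(extents * cells_per_extent, 0),
          m_locks(extents),
          m_dirty(extents, 0) {}

    void store(size_t cell, uint64_t value) {
        const size_t extent = cell / m_cells_per_extent;
        std::lock_guard<std::mutex> lock(m_locks.at(extent));
        m_cells.at(cell) = value;
        m_dirty[extent] = 1;  // mark the extent that was actually modified
    }

    uint64_t load(size_t cell) {
        const size_t extent = cell / m_cells_per_extent;
        std::lock_guard<std::mutex> lock(m_locks.at(extent));
        return m_cells.at(cell);  // copy is made while the lock is held
    }

    bool is_dirty(size_t extent) {
        std::lock_guard<std::mutex> lock(m_locks.at(extent));
        return m_dirty.at(extent) != 0;
    }

private:
    size_t m_cells_per_extent;
    std::vector<uint64_t> m_cells;
    std::vector<std::mutex> m_locks;  // one mutex per extent
    std::vector<char> m_dirty;        // char, not bool: no shared bits
};
```

The dirty flags use `std::vector<char>` rather than `std::vector<bool>` so that flags for different extents never share storage guarded by different mutexes.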

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-10 23:25:45 -05:00
3a24cd3010 Installation: Fix soname QA warning in Gentoo
Gentoo warns about libs missing a proper soname during QA phase. Let's
fix this.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-01-11 02:30:12 +01:00
ba981c133a Merge remote-tracking branches 'kakra/feature/add-relative-path-option' and 'kakra/integration'
2018-01-07 21:39:01 -05:00
77614a0e99 scan: insert toxic matched extents into hash table as they are discovered
When a toxic extent is discovered, insert the offending hash/address/toxic
entry into the hash table.

When a previously discovered toxic extent is encountered, do nothing,
i.e. allow the offending hash/address/toxic entry in the hash table
to expire.

Previously both inserts were removed from the code, but the former one
is required.  The latter prevents bees from forgiving toxic extents
(or any hash matching one) should they be relocated, deleted, or simply
become non-toxic.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-12-21 13:56:15 -05:00
bfb768a079 Fix a fallthrough error in GCC 7+
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.
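
For illustration only: the commit message does not say which hint was used. One standard way to mark an intentional fallthrough is the C++17 `[[fallthrough]]` attribute. The function and option letters below are hypothetical.

```cpp
#include <string>

// Hypothetical option handler: 'v' implies everything 'l' does, plus more.
std::string describe(char opt) {
    std::string out;
    switch (opt) {
        case 'v':
            out += "verbose,";
            [[fallthrough]];  // intentional: 'v' also enables logging
        case 'l':
            out += "logging";
            break;
        default:
            out = "unknown";
            break;
    }
    return out;
}
```

Without the attribute, GCC 7 and later report this case under `-Wimplicit-fallthrough`, which becomes a build failure when warnings are treated as errors.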

Signed-off-by: Kai Krakow <kai@kaishome.de>
(cherry picked from commit 270a91cf17)
2017-11-14 11:26:47 -05:00
3024e43355 Fix a fallthrough error in GCC 7+
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2017-11-14 07:00:28 +01:00
270a91cf17 Fix a fallthrough error in GCC 7+
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2017-11-14 06:58:43 +01:00
f7320baa56 Fix indentation/alignment after integration
2017-11-14 06:58:43 +01:00
52997936d5 getopt: Add logic to set relative path from $CWD
This commit adds a new option to set relative path output for name_fd().

Signed-off-by: Kai Krakow <kai@kaishome.de>
2017-11-14 01:16:06 +01:00
71514e7229 main: use static function to control timestamps in log output
Adjust bees to match changes in Chatter's interface.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 66fd28830d)
2017-11-11 15:18:46 -05:00
c6be07e158 Add option for prefixing timestamps
To make bees more friendly to use with syslog/systemd, we add an option
to omit timestamps from the log output.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2017-10-27 23:02:47 +02:00
c6bf6bfe1d Implement getopt options parser
This commit adds a simple getopt options parser to show help. This can
be used as a boilerplate for adding more options later.
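
A minimal sketch of such a parser built on `getopt_long`. The `Options` struct, the function name, and the `--no-timestamps` flag are hypothetical placeholders, not the options bees actually defines:

```cpp
#include <getopt.h>

// Hypothetical result of option parsing.
struct Options {
    bool show_help = false;
    bool timestamps = true;
};

Options parse_options(int argc, char **argv) {
    static const option long_opts[] = {
        {"help",          no_argument, nullptr, 'h'},
        {"no-timestamps", no_argument, nullptr, 'T'},  // illustrative only
        {nullptr,         0,           nullptr, 0},
    };
    Options opts;
    optind = 1;  // restart scanning so the function can be called again
    int c;
    while ((c = getopt_long(argc, argv, "hT", long_opts, nullptr)) != -1) {
        switch (c) {
            case 'h': opts.show_help = true;  break;
            case 'T': opts.timestamps = false; break;
            default:  opts.show_help = true;  break;  // unknown option
        }
    }
    return opts;
}
```

Adding an option later means one more row in `long_opts`, one more letter in the short-option string, and one more `case`.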

Signed-off-by: Kai Krakow <kai@kaishome.de>
2017-10-27 22:36:00 +02:00
5afbcb99e3 roots: drop open_root_nocache log entry
After a few hundred subvol threads start running, the inode cache starts
to thrash, and the log gets spammed with messages of the form:

	"open_root_nocache <subvolid>: <path>"

Ideally there would be some way to schedule work to minimize inode
thrashing.  Until that gets done, just silence the messages for now.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 21:16:40 -04:00
5275249396 roots: trace transid_max calculation
transid_max calculations can take considerable time.  Report their
progress in more detail.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:45 -04:00
a07728bc7e tmpfiles: note that kernel race condition is not yet fixed
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:36 -04:00
732896b471 log: simplify output for dedup and scan
With many threads it is inconvenient to reassemble the elided parts of
the dedup src/dst and scan filenames output.  Simply output them
unconditionally, and balance the line lengths.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:30 -04:00
5cc5a44661 bees: drop unused BeesWorkQueue classes
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:22 -04:00
339579096f roots: move flags check after file identity checks and make error message style consistent
If we lose a race and open the wrong file, we will not retry with the
next path if the file we opened had incompatible flags.  We need to keep
trying paths until we open the correct file or run out of paths.
Fix by moving the inode flag check after the checks for file identity.

Output attributes in hex to be consistent with other attribute error
messages.

There is no need to report root and file paths separately in the error
message for incompatible flags because we have confirmed the identity of
the file before the incompatible flag error is detected.  Other messages
in this loop still output root path and file_path separately because
the identity of 'rv' is unknown at the time these messages are emitted.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:49:09 -04:00
702a8eec8c bees: use ioctl_iflags_get and ioctl_iflags_set instead of opencoded versions
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:31:43 -04:00
a5e2bdff47 Skip nocow files to speed up processing
If you have many nocow files, or a few big ones (like VM images), which
contain a lot of potential deduplication candidates, bees becomes
incredibly slow running through a lot of "invalid operation" exceptions.

Let's just skip over such files to get more bang for the buck. I did no
regression testing as this patch seems trivial (and I cannot imagine any
pitfalls either). The process progresses much faster for me now.
2017-09-12 02:09:22 +02:00
703bb7c1a3 bees: use handle type for hash table extent locks
Fixes build breakage after "crucible: lockset: track lockers and use
handle type".

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:22:06 -04:00
3901962379 bees: trace calls to BeesResolver
This helps identify causes of the "same physical address in dedup"
exception.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit cc7b4f22b5)
2017-06-17 10:15:11 -04:00
48aac8a99a bees: drop unused constants
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used.  Remove them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit a3d7032eda)
2017-06-17 10:15:11 -04:00
b0ba4c4f38 bees: time tmpfile create and copy operations
Add time spent in file create and copy operations to the stats.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f01c20f972)
2017-06-17 10:15:11 -04:00
74d256f0fe bees: handle trace functions that throw exceptions
A BEESTRACE closure could throw an exception.  Trap those so we don't
end up in terminate().
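
A sketch of the idea under stated assumptions: the `Trace` class below is a hypothetical stand-in for a trace object that holds a closure, not the BEESTRACE macro itself. Exceptions thrown by the closure are caught and replaced with a placeholder string instead of propagating.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical trace object: holds a closure describing the current
// operation, to be invoked later when an error is being reported.
class Trace {
public:
    explicit Trace(std::function<std::string()> describe)
        : m_describe(std::move(describe)) {}

    // Exceptions thrown by the closure are caught here and turned into
    // a placeholder message.
    std::string report() const {
        try {
            return m_describe();
        } catch (const std::exception &e) {
            return std::string("<trace closure threw: ") + e.what() + ">";
        } catch (...) {
            return "<trace closure threw unknown exception>";
        }
    }

private:
    std::function<std::string()> m_describe;
};
```

This matters most when the closure runs during stack unwinding, where a second exception escaping a destructor would call `std::terminate()`.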

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 59660cfc00)
2017-06-17 10:15:11 -04:00
8cde833863 bees: make a thread note when we read data
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance.  Record the block origin data in the thread state
so we can see which reads are problematic.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f56f736d28)
2017-06-17 10:15:11 -04:00
e0951ed4ba bees: use C++11 syntax for constant initializers
This lets us use more default constructors.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 8a932a632f)
2017-06-17 10:15:11 -04:00
c479b361cd bees: remove file open serialization mutex
It is no longer necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 5c91045557)
2017-06-17 10:15:11 -04:00
c6c3990d19 bees: types: improve serialization of byte ranges
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file.  Also omit the '0' at beginning of
file.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 3023b7f57a)
2017-06-17 10:15:11 -04:00
3fdc217b4f bees: change formatting for physical bytenr ranges in dedup
Use a different character to make it easier to search for bytenr ranges
in the logs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit d43199e3d6)
2017-06-17 10:15:08 -04:00
6c8d2bf428 bees: limit FD cache size explicitly
This will allow the default size limit for cache objects to be changed
with impunity.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 9daa51edaa)
2017-06-17 10:15:08 -04:00
5350b0f113 Bees: fix [-Werror=implicit-fallthrough=]
GCC 7+ adds the implicit-fallthrough warning.
Fallthrough is expected in some places, so disable the warning there.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
2017-06-13 18:05:38 +03:00
dc00dce842 context: purge FD cache every COMMIT_INTERVAL
Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.

Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache):  forcibly close all cached
FDs at regular intervals.  The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths.  The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.

Fixes: https://github.com/Zygo/bees/issues/18

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-02-08 22:01:00 -05:00
5713fcd770 bees: clean up statistics class
Some whitespace fixes.  Remove some duplicate code.  Don't lock
two BeesStats objects in the - operator method.

Get the locking for T& at(const K&) right to avoid locking a mutex
recursively.  Make the non-const version of the function private.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:28 -05:00
db8ea92133 bees: fix further instances of copy-after-unlock bug
Before:

        unique_lock<mutex> lock(some_mutex);
        // run lock.~unique_lock() because return
        // return reference to unprotected heap
        return foo[bar];

After:

        unique_lock<mutex> lock(some_mutex);
        // make copy of object on heap protected by mutex lock
        auto tmp_copy = foo[bar];
        // run lock.~unique_lock() because return
        // pass locally allocated object to copy constructor
        return tmp_copy;
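
The snippets above do not show the enclosing function signatures, so the following is only a sketch of the "After" pattern in a complete class. `Registry` and its members are hypothetical names; the getter returns by value, and the copy is made while the mutex is still held.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical mutex-protected map.
class Registry {
public:
    void set(const std::string &key, const std::string &value) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_map[key] = value;
    }

    // Returns a copy, never a reference into m_map: a reference would
    // outlive the lock and could be read while another thread writes.
    std::string get(const std::string &key) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // copy the protected object while the lock is held
        auto tmp_copy = m_map[key];
        // lock is released after the return value has been initialized
        return tmp_copy;
    }

private:
    std::mutex m_mutex;
    std::map<std::string, std::string> m_map;
};
```

A value obtained from `get()` stays unchanged when another caller later updates the same key, which is the property the fix is after.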

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:27 -05:00
5de3b15daa src: Update bees-version.c more often
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-18 22:17:03 -05:00
9f120e326b bees: fix deadlock in thread status reporting
"s_name" was a thread_local variable, not static, and did not require a
mutex to protect access.  A deadlock is possible if a thread triggers an
exception with a handler that attempts to log a message (as the top-level
exception handler in bees does).

Remove multiple unnecessary mutex locks.  Rename the thread_local variables
to make their scope clearer.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-15 01:55:34 -05:00
382f8bf06a hash: prevent eleventy-gigabyte core dumps
Add MADV_DONTDUMP to the list of advice flags.

There are now three flags which may or may not be supported by the
target kernel.  Try each one and log its success or failure separately.
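
A sketch of trying each advice flag separately, assuming a Linux target. Only MADV_DONTDUMP is named in this commit message; the helper name and the caller-supplied flag list are hypothetical.

```cpp
#include <sys/mman.h>
#include <cerrno>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Apply each advice flag on its own and record the outcome per flag:
// 0 on success, otherwise the errno value reported by madvise().
// The caller supplies (name, flag) pairs; the names are used for logging.
std::vector<std::pair<std::string, int>> advise_each(
        void *addr, size_t len,
        const std::vector<std::pair<std::string, int>> &flags) {
    std::vector<std::pair<std::string, int>> results;
    for (const auto &f : flags) {
        const int rv = madvise(addr, len, f.second);
        results.emplace_back(f.first, rv == 0 ? 0 : errno);
    }
    return results;
}
```

Calling `madvise` once per flag means a kernel that rejects one flag still gets the benefit of the others, and each result can be logged on its own line.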

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-12 22:55:08 -05:00
5e91529ad2 hash: remove the unused m_prefetch_rate_limit
The hash table statistics calculation in BeesHashTable::prefetch_loop
and the data-driven operation of the extent scanner always pulls the
hash table into RAM as fast as the disk will push the data.  We never
use the prefetch rate limit, so remove it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-11 21:15:12 -05:00
bddc07bd28 hash: make thread status message more consistent
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-11 21:15:12 -05:00
845267821c main: count arguments correctly
Replace one braindead mistake with another.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-10 01:10:38 -05:00
3138002a1f main: ArgList would silently drop the first argument
This fixes a bug where bees tries to process itself as a btrfs filesystem.
This is a species of bug that I only notice *after* pushing to a public
git repo.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-09 23:42:02 -05:00
fa8607bae0 crucible: get rid of DefaultBool, just use C++11 initializer syntax
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-09 23:23:32 -05:00