Duplicated code between the different scan modes has slowly been
becoming less and less trivial. Move the code to a method and
make both scan-modes call it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Restartng scans for each transid is a bit aggressive. Scan every 10
transids for a polling rate close to the former BEES_COMMIT_INTERVAL.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
transid_max is now measured at a single point in the crawl_transid thread.
Move the Crawl deferred logic into BeesRoots so it restarts all crawls
when transid_max increases. Gets rid of some messy time arithmetic.
Change name of Crawl thread to "crawl_master" in both thread name and
log messages.
Replace "Next transid" with "Crawl started".
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The periodic cache age check was not protected by a lock, so multiple
threads may decide to concurrently clear the cache. This led to
duplicate log messages.
Fix by moving the cache expiry trigger out of FdCache and into Roots,
which knows when transids change and can perform cache clears at exactly
the time they are most relevant, i.e. after something that was deleted
becomes permanently so.
This removes the last references to BEES_COMMIT_INTERVAL, so get rid
of its definition too.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Make the crawl polling interval more closely track the commit interval
on the btrfs filesystem. In the future this will provide opportunities
to do things like clear FD caches and stop crawls on deleted subvols,
but triggered by transaction commits instead of arbitrary time intervals.
Rename the "crawl" thread so it no longer has the same name as the "crawl"
task, and repurpose it for dedicated transid polling. Cancel the deletion
of crawl_thread and repurpose it to trigger new crawls and wake up the
main crawl Task when it runs out of data.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Having too many "write a message to the log" primitives is confusing,
and having one that intermittently and silently discards output is even
_more_ confusing.
Replace all BEESINFO with appropriate BEESLOG*s. Usually DEBUG.
Except for one or two that occur too often. Just delete those.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit adds log levels to the output. In systemd, it makes colored
lines, otherwise it's probably just a number. Bees is very chatty, so
this paves the road for log level filtering.
Signed-off-by: Kai Krakow <kai@kaishome.de>
There are two subvol scan algorithms implemented so far. The two modes
are unimaginatively named 0 and 1.
0: sorts extents by (inode, subvol, offset),
1: scans extents round-robin from all subvols.
Algorithm 0 scans references to the same extent at close to the same
time, which is good for performance; however, whenever a snapshot is
created, the scan of the entire filesystem restarts at the beginning of
the new snapshot.
Algorithm 1 makes continuous forward progress even when new snapshots
are created, but it does not benefit from caching and will force the
kernel to reread data multiple times when there are snapshots.
The algorithm can be selected at run-time using the -m or --scan-mode
option.
We can collect some field data on these before replacing them with
an extent-tree-based scanner. Alternatively, for pre-4.14 kernels,
we can keep these two modes as non-default options.
Currently these algorithms have terrible names. TODO: fix that, but
also TODO: delete all that code and do scans directly from the extent
tree instead.
Augment the scan algorithms relative to their earlier implementation by
batching multiple extents to scan from each subvol before switching to
a different subvol.
Sprinkle some BEESNOTEs on the Task objects so that they don't
disappear from the thread status output.
Adjust some timing constants to deal with the increased latency from
competing threads.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Distribute incoming extents across a thread pool for faster execution
on multi-core, multi-disk environments.
Switch extent enumeration model to scan extent refs consecutively(ish).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
With kernel 4.14 there is no sign of the previous LOGICAL_INO performance
problems, so there seems to be no need to throttle threads using this
ioctl.
Increase the FD cache size limits and scan thread count. Let the kernel
figure out scheduling.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This avoids PERFORMANCE warnings when large hash tables are used on slow
CPUs or with lots of worker threads. It also simplifies the code (no
locksets, only one object-wide mutex instead of two).
Fixed a few minor bugs along the way (e.g. we were not setting the dirty
flag on the right hash table extent when we detected hash table errors).
Simplified error handling: IO errors on the hash table are ignored,
instead of throwing an exception into the function that tried to use the
hash table.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used. Remove them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit a3d7032edaf5fc584412d0dcf8773f1cafa8f2dc)
This lets us use more default constructors.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 8a932a632ff4602a0357ed5fbcd3f86b6bc50283)
This will allow the default size limit for cache objects to be changed
with impunity.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 9daa51edaab44c02ce0917ff94b20683036d7594)
Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.
Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache): forcibly close all cached
FDs at regular intervals. The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths. The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.
Fixes: https://github.com/Zygo/bees/issues/18
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some whitespace fixes. Remove some duplicate code. Don't lock
two BeesStats objects in the - operator method.
Get the locking for T& at(const K&) right to avoid locking a mutex
recursively. Make the non-const version of the function private.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"s_name" was a thread_local variable, not static, and did not require a
mutex to protect access. A deadlock is possible if a thread triggers an
exception with a handler that attempts to log a message (as the top-level
exception handler in bees does).
Remove multiple unnecessary mutex locks. Rename the thread_local variables
to make their scope clearer.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The hash table statistics calculation in BeesHashTable::prefetch_loop
and the data-driven operation of the extent scanner always pulls the
hash table into RAM as fast as the disk will push the data. We never
use the prefetch rate limit, so remove it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Every git commit was causing bees.cc and bees-hash.cc to be rebuilt,
which was expensive and unnecessary.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The experiments are over, and the results were not success.
Having two filesystems cohabiting in the same hash table results in a
lot of false positives, each of which requires some heavy IO to resolve.
Using MAP_SHARED to share a beeshash.dat between processes results in
catastrophically bad performance.
These features were abandoned long ago, but some of the code--and even
worse, its documentation--still remains.
Bees wants a hash table false positive rate below 0.1%. With a shared
hash table the FP rate is about the same as the dedup rate. Typically
duplicate files on one filesystem are duplicate on many filesystems.
One or more of Linux VFS and the btrfs mmap(MAP_SHARED) implementation
produce extremely poor performance results. A five-order-of-magnitude
speedup was achieved by implementing paging in userspace with worker
threads. We no longer need the support code for the MAP_SHARED case.
It is still possible to run many BeesContexts in a single process,
but now the only thing contexts share is the FD cache.
Allow relative paths with BEESHOME. These paths will be relative
to the root of the dedup target filesystem.
BEESHOME is now optional. If not specified, '.beeshome' is used.
We don't try to create BEESHOME if it doesn't exist. BEESHOME might
not be on a btrfs filesystem, so we can't insist it be a subvol.
BeesHashTable can now create a beeshash.dat if the file does not already
exist. Currently the default size is one hash table extent (16MB) and
there's no way to change that (yet), so users should still create their
own hash tables for now.
The opening of the hash table is deferred (slightly) in preparation for
hash table resizing.
No doc as the feature is currently unfinished.