mirror of https://github.com/Zygo/bees.git synced 2025-05-18 13:55:44 +02:00

82 Commits

Author SHA1 Message Date
Zygo Blaxell
23c16aa978 BeesFileRange: coalesce is not used, subtract was never implemented
Less dead code to maintain.  Also more Doxygen comments.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:59 -05:00
Zygo Blaxell
9cdeb608f5 bees: drop the balance/logical workaround that has been disabled for two years
Kernels that needed the balance workaround frankly are too buggy
to run bees at all.  The workaround also makes the locking stories
around logical_ino calls and process exit complicated, so get rid of
it completely.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:58 -05:00
Zygo Blaxell
31b2aa3c0d context: speed up orderly process termination
Quite often bees exceeds its service timeout for termination because
it is waiting for a loop embedded in a Task to finish some long-running
btrfs operation.  This can cause bees to be aborted by SIGKILL before
it can completely flush the hash table or save crawl state.

There are only two important things SIGTERM does when bees terminates:
 1.  Save crawl progress
 2.  Flush out the hash table

Everything else is automatically handled by the kernel when the process
is terminated by SIGKILL, so we don't have to bother doing it ourselves.
This can save considerable time at shutdown since we don't have to wait
for every thread to reach a point where it becomes idle, or force loops
to terminate by throwing exceptions, or check a condition every time we
access a pointer.  Instead, we need do only the things in the list
above, and then call _exit() to clean up everything else.

Hash table and crawl state writeback can happen in their background
threads instead of the foreground one.  Separate the "stop" method for
these classes into "stop_request" and "stop_wait" so that these writebacks
can run at the same time.

Deprecate and remove all references to the BeesHalt exception, and remove
several unnecessary checks for BeesContext::stop_requested.

Pause the task queue instead of cancelling it, which preserves the
crawl progress state and stops new Tasks from competing for iops and
CPU during writeback.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:58 -05:00
Zygo Blaxell
a2e1887c52 bees: use MultiLocker to serialize dedupe and logical_ino
In current kernels there is a bug which leads to an infinite loop in
add_all_parents().  The bug is triggered by one thread running dedupe
while another runs logical_ino.

Work around this by ensuring that the bees process never runs dedupe and
logical_ino ioctls at the same time.  Any number of either can run
at the same time, but not one of both.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:55 -05:00
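
The rule described in a2e1887c52 (any number of callers of one ioctl type at a time, but never both types at once) can be sketched as a small group lock. This is a minimal illustration under stated assumptions, not the MultiLocker code in bees: the class name `GroupLockSketch`, its methods, and the group names are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>

// Hypothetical sketch: holders that share a group name may overlap;
// holders with different group names exclude each other.
class GroupLockSketch {
public:
    // Non-blocking: returns true if the lock was taken for `group`.
    bool try_acquire(const std::string &group) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_holders > 0 && m_group != group) return false;
        m_group = group;
        ++m_holders;
        return true;
    }

    // Blocking: waits until the lock is free or held by the same group.
    void acquire(const std::string &group) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [&] { return m_holders == 0 || m_group == group; });
        m_group = group;
        ++m_holders;
    }

    // Call once for each successful acquire() or try_acquire().
    void release() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_holders > 0 && --m_holders == 0) m_cond.notify_all();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::string m_group;     // group currently holding the lock, if any
    unsigned m_holders = 0;  // number of current holders
};
```

A dedupe caller would acquire the lock with one group name and a logical_ino caller with another. This sketch makes no fairness guarantee: a steady stream of one group can starve the other.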
Zygo Blaxell
cc87125e41 bees: drop bees_sync, we will not need it
bees_sync() was an exception-trapping wrapper around fsync() which is
not needed in any of the contexts from which it was called:

	1.  dedupe operations implicitly flush the src data, so there is
	no need to call fsync() to do that twice.

	2.  crawl position is written to a temporary file and renamed
	over the original, which always forces a flush when the original
	exists.  On the first write, where there is no original, a
	crash would result in starting over with an empty or hole-filled
	beescrawl file, which is the initial state of bees.  There is also
	a long history of kernel bugs triggered by fsync() in this case.

	3.  we use unreadahead to trigger writeback for flushing the
	hash table to persistent storage.  Here is a space where we might
	use fsync after all, as part of bees_unreadahead's emulation of
	POSIX_FADV_DONTNEED, but we need to get read-once behavior from
	the scanner before we can use this capability.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:54 -05:00
Zygo Blaxell
be9321cdb3 roots: correctly track crawl dirty state
If there's an error while writing the crawl state, the state should
remain dirty.  If the crawl state is successfully written, the state
is only clean if there were no changes to crawl state since the write
was committed.  We need to release the lock while writing the state but
correctly set the dirty flag when the state is written successfully.

Replace the bool with a version number counter.  Track the last version
successfully saved and the current version of the crawl state.  The state
is dirty if these counters disagree and clean if they agree.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:54 -05:00
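
The version-counter scheme described in be9321cdb3 can be sketched as follows. This is an illustration only, not the bees code: `CrawlStateSketch`, its members, and the `writer` callback are hypothetical names, and the state is reduced to a single string.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>

// Hypothetical sketch: the state is dirty whenever the current version
// differs from the last version that was saved successfully.
class CrawlStateSketch {
public:
    // Every change bumps the current version.
    void set(const std::string &value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_state = value;
        ++m_version;
    }

    bool dirty() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_version != m_saved_version;
    }

    // `writer` returns true on success. The lock is released while it runs,
    // so set() may be called concurrently with the write.
    bool save(const std::function<bool(const std::string &)> &writer) {
        std::unique_lock<std::mutex> lock(m_mutex);
        const std::uint64_t version = m_version;  // version being written
        const std::string snapshot = m_state;
        lock.unlock();
        const bool ok = writer(snapshot);
        lock.lock();
        // Record only the version that was actually written; a failed write
        // or a change made during the write leaves the state dirty.
        if (ok) m_saved_version = version;
        return ok;
    }

private:
    std::mutex m_mutex;
    std::string m_state;
    std::uint64_t m_version = 0;
    std::uint64_t m_saved_version = 0;
};
```

A plain bool cannot express "written successfully, but changed again during the write". Comparing two counters covers that case without holding the lock across the write.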
Zygo Blaxell
a9c81e5531 bees: drop m_parent_ctx
It has not been used since 2016.

Also drop the explicit default constructor.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:54 -05:00
Zygo Blaxell
3654738f56 bees: fix deprecated-copy warnings for clang-14
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-23 22:39:59 -04:00
Zygo Blaxell
fbf6b395c8 types: member m_fd in BeesFileRange must be protected against data races
We had an unfortunate pattern of:

	const BeesFileRange bfr;
	shared_ptr<BeesContext> ctx;
	// ...
	BEESNOTE("foo " << bfr);
	bfr.fd(ctx);
	BEESNOTE("foo after opening: " << bfr);

If dump_status started running after the first BEESNOTE, but before
the second, then bfr.fd() might expose a single Fd object's shared_ptr
member to two threads at the same time (the thread running dump_status
and the thread running BEESNOTE) without protection by a lock.  One of
the threads would see a partially-initialized Fd object, and the other
thread would crash on an assertion failure, e.g.

	#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
	#1  0x00007f4c4fde5537 in __GI_abort () at abort.c:79
	#2  0x00007f4c4fde540f in __assert_fail_base (fmt=0x7f4c4ff4e128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5557605629dd "!m_destroyed", file=0x5557605627c0 "../include/crucible/namedptr.h", line=77, function=<optimized out>) at assert.c:92
	#3  0x00007f4c4fdf4662 in __GI___assert_fail (assertion=assertion@entry=0x5557605629dd "!m_destroyed", file=file@entry=0x5557605627c0 "../include/crucible/namedptr.h", line=line@entry=77,
	    function=function@entry=0x555760562970 "crucible::NamedPtr<Return, Arguments>::Value::~Value() [with Return = crucible::IOHandle; Arguments = {int}]") at assert.c:101
	#4  0x00005557605306f6 in crucible::NamedPtr<crucible::IOHandle, int>::Value::~Value (this=0x7f4a3c2ff0d0, __in_chrg=<optimized out>) at ../include/crucible/namedptr.h:77
	#5  0x00005557605137da in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f4a3c2ff0c0) at /usr/include/c++/10/bits/shared_ptr_base.h:151
	#6  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f4a3c2ff0c0) at /usr/include/c++/10/bits/shared_ptr_base.h:151
	#7  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f4c4c5b5f28, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr_base.h:733
	#8  std::__shared_ptr<crucible::IOHandle, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr_base.h:1183
	#9  std::shared_ptr<crucible::IOHandle>::~shared_ptr (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at /usr/include/c++/10/bits/shared_ptr.h:121
	#10 crucible::Fd::~Fd (this=0x7f4c4c5b5f20, __in_chrg=<optimized out>) at ../include/crucible/fd.h:46
	#11 BeesFileRange::file_size (this=0x7f4c4e5ba4a0) at bees-types.cc:156
	#12 0x0000555760513950 in operator<< (os=..., bfr=...) at bees-types.cc:80
	#13 0x000055576050d662 in std::function<void (std::ostream&)>::operator()(std::ostream&) const (__args#0=..., this=0x7f4c4e5b9f60) at /usr/include/c++/10/bits/std_function.h:622
	#14 BeesNote::get_status[abi:cxx11]() () at bees-trace.cc:165
	#15 0x00005557604c9676 in BeesContext::dump_status (this=0x5557611c4de0) at bees-context.cc:89
	#16 0x00005557605206fb in std::function<void ()>::operator()() const (this=this@entry=0x7f4c4c5b65f0) at /usr/include/c++/10/bits/std_function.h:622
	#17 crucible::catch_all(std::function<void ()> const&, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)> const&) (f=..., explainer=...) at error.cc:55
	#18 0x000055576050aaa7 in operator() (__closure=0x5557611c52c8) at bees-thread.cc:22
	#19 0x00007f4c501beed0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
	#20 0x00007f4c502c8ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
	#21 0x00007f4c4febddef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Fix by making BeesFileRange::m_fd really const (not just mutable),
then fix all the broken code referencing it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
01734e6d4b hash: initialize m_dirty in BeesHashTable
It turns out we never set m_dirty's initial value.  This is not a
practical problem because 1) it's mostly harmless if m_dirty is spuriously
true, 2) we set it to true every time bees scans a data block, and 3)
the allocation happens early in startup when most memory allocations
are using zero-filled pages, so it's probably getting a false value at
construction in most cases.

valgrind complains about it, so it has to go.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
a83c68eb18 bees: style cleanups: const, size_t, symbolic names
No functional changes.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
6d6686eb5b context: get rid of resolve (LOGICAL_INO) serializer
There are kernel bugs in LOGICAL_INO from time to time; however, we
can't avoid these bugs by serializing LOGICAL_INO calls.

It hasn't been used for some time, so remove the code and
less-than-completely-accurate comments.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
85c93c10e6 bees: clean up #include list
No need for atomic, and sort the Linux headers.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
ba694b4881 hash: move the random generator out of bees-hash.cc
We need random numbers in more places, so centralize the engines.
Initialize with a proper random seed so every worker thread gets
different behavior.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
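
One common way to give every worker thread its own differently seeded engine, as ba694b4881 describes, is a thread-local engine seeded from `std::random_device`. The sketch below assumes that approach; the function name `sketch_random_engine` and the choice of `std::mt19937_64` are illustrative and not taken from bees.

```cpp
#include <random>

// Hypothetical helper: one engine per thread, seeded once on first use.
std::mt19937_64 &sketch_random_engine() {
    thread_local std::mt19937_64 engine(std::random_device{}());
    return engine;
}

// Example use: pick a random index in [0, size), for size > 0.
unsigned long sketch_random_index(unsigned long size) {
    std::uniform_int_distribution<unsigned long> dist(0, size - 1);
    return dist(sketch_random_engine());
}
```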
Zygo Blaxell
14cd6ed033 bees: deprecate vector<uint8_t> and replace with ByteVector
The vector<uint8_t> in the hash table doesn't hurt very much--only a few
microseconds per 128K hash block.

The vector<uint8_t> in BeesBlockData hurts a bit more--we run that
constructor thousands of times per second.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
2f14a5a9c7 roots: reduce number of objects per TREE_SEARCH_V2, drop BEES_MAX_CRAWL_ITEMS and BEES_MAX_CRAWL_BYTES
This makes better use of dynamic buffer sizing, and reduces the amount
of stale data lying around.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
a353d8cc6e hash: use POSIX_FADV_WILLNEED and POSIX_FADV_DONTNEED
The hash table is one of the few cases in bees where a non-trivial amount
of page cache memory will be used in a predictable way, so we can advise
the kernel about our IO demands in advance.

Use WILLNEED to prefetch hash table pages at startup.

Use DONTNEED to trigger writeback on hash table pages at shutdown.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-04 20:41:09 -04:00
Zygo Blaxell
d9e3c0070b context: stop creating new refs when there are too many already
LOGICAL_INO_V2 has a maximum limit of 655050 references per extent.
Although it no longer has a crippling performance problem, at roughly
two seconds to process such an extent, it's too slow to be useful.

When an extent gains an absurd number of references, stop making any
more.  Returning zero extent refs will make bees believe the extent
was deleted, and it will remove the block from the hash table.

This helps speed processing of highly duplicated large files like
VM images, at the cost of a slightly lower dedupe hit rate.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 21:05:55 -04:00
Zygo Blaxell
1fd26a03b2 tracer: annotate both ends of the stack trace
Add a matching "--- BEGIN TRACE..." line to complement the "---  END
TRACE..." line.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
5f0f7a8319 bees: increase StringFile size limit
If we are going to dedupe thousands of subvols, we are going to need a
bigger beescrawl.dat.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
ee86b585a5 bees: use a reserved symbol name in BEESLOG
"c" could be a local variable name, which would do interesting things
to some log messages.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
8a70bca011 bees: misc comment updates
These have been accumulating in unpublished bees commits.  Squash them all
into one.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
20b8f8ae0b bees: use helper function for readahead
There seem to be multiple ways to do readahead in Linux, and only some
of them work.  Hopefully reading the actual data is one of them.

This is an attempt to avoid page-by-page reads in the generic dedupe code.
We load both extents into the VFS cache (read sequentially) and hope they
are still there by the time we call dedupe on them.

We also call readahead(2) and hopefully that either helps or does nothing.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
0bbaddd54c docs: finally concede that the consensus spelling is "dedupe"
Change documentation and comments to use the word "dedupe," not "dedup"
as found in circa-3.15 kernel sources.

No changes in code or program output--if they used "dedup" before, they
will continue to be spelled "dedup" now.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
fbd1091052 options: remove default 8 CPU thread limit
Higher CPU core counts became more common, and kernel bugs became less
common, since the arbitrary 8-thread limit was introduced.  We can remove
the limit now, and treat any remaining scaling inefficiency as a bug to
be removed.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
80c69f1ce4 context: get rid of shared_ptr<BeesContext> in every single cached Fd object
Support for multiple BeesContext objects sharing a FdCache was wasting
significant space and atomic inc/dec memory cycles for no good reason
since the shared-FdCache feature was deprecated.

open_root and open_root_ino still need a BeesContext to work.  Pass the
BeesContext pointer through the function object instead of the cache
key arguments.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-28 21:54:00 -04:00
Zygo Blaxell
db65031c2b context: get rid of all instances of pthread_cancel
pthread_cancel doesn't really work properly.  It was only being used in
bees to bring threads to a stop if the BeesContext is destroyed early.
It is frequently implicated in core dump reports because of the fragility
of the C++ iostream / C stdio / library infrastructure, particularly
surrounding upgrades on the host running bees.  The pthread_cancel call
itself often simply fails even when it doesn't call terminate().

Defer creation of the status and progress threads until after the
BeesContext::start method is invoked.  At that point, the existing
ask-threads-nicely-to-stop code is up and running, and normal condvars
can be used to bring bees to a stop, without having to resort to
pthread_cancel.

Since we're deleting half of the BeesContext constructor in this change,
let's remove the other half too, and put an end to the deprecated support
for multiple BeesContexts sharing a process.  It's still possible to run
multiple BeesContexts, but they will not share a FD cache.  This will
allow the FD cache's keys to become smaller and hopefully save some
memory later on.

Fixes: #171

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-28 21:42:03 -04:00
Zygo Blaxell
bcf3e7de3e uuid: drop dependency on uuid.h
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.

We were only using uuid to support a legacy feature that was removed
over four years ago.

Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs.  Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:16:50 -04:00
Zygo Blaxell
636e69267e resolve: add bees.h constants for balance and logical_ino serialization
Make these workarounds configurable in src/bees.h instead of #if 0
code blocks.  Someday we'll make the constants in bees.h configurable
through a file or similar.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
6705cd9c26 context: move TempFile from TLS to Pool and fix some FdCache issues
Get rid of the thread-local TempFiles and use Pool instead.  This
eliminates a potential FD leak when the loadavg governor repeatedly
creates and destroys threads.

With the old per-thread TempFiles, we were guaranteed to have exclusive
ownership of the TempFile object within the current thread.  Pool is
somewhat stricter:  it only guarantees ownership while the checked-out
Handle exists.  Adjust the users of TempFile objects to ensure they hold
the Handle object until they are finished using the TempFile.

It appears that maintaining large, heavily-reflinked, long-lived temporary
files costs more than truncating after every use: btrfs has to write
multiple references to the temporary file's extents, then some commits
later, remove references as the temporary file is deleted or truncated.
Using the temporary file in a dedupe operation flushes the data to disk,
so nothing is saved by pretending that there is writeback pipelining and
trying to avoid flushes in truncate.  Pool provides usage tracking and
a checkin callback, so use it to truncate the temporary file immediately
after every use.

Redesign TempFile so that every instance creates exactly one Fd which
persists over the lifetime of the TempFile object.  Provide a reset()
method which resets the file back to the initial state and call it from
the Pool checkin callback.  This makes TempFile's lifetime equivalent to
its Fd's lifetime, which simplifies interactions with FdCache and Roots.

This change means we can now blacklist temporary files without having
an effective memory leak, so do that.  We also now have a reason to
remove something from the blacklist, so add a method for that too.

In order to move to extent-centric addressing, we need to be able to
reliably open temporary files by root and inode number.  Previously we
would place TempFile fd's into the cache with insert_root_ino, but the
cache would be cleared periodically, and it would not be possible to
reopen temporary files after that happened.  Now that the TempFile's
lifetime is the same as the TempFile Fd's lifetime, we can have TempFile
manage a separate FileId -> Fd map in Roots which is unaffected by the
periodic cache clearing.  BeesRoots::open_root_ino_nocache will check
this map before attempting to open the file via btrfs root+ino lookup,
and return it through the cache as if Roots had opened the file via btrfs.

Hold a reference to BeesRoots in BeesTempFile because the usual way
to get such a reference now throws an exception in BeesTempFile's
destructor.

These changes make method BeesTempFile::create() and all methods named
insert_root_ino unnecessary, so delete them.

We construct and destroy TempFiles much less often now, so make their
constructor and destructor more informative.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
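
The checkout Handle and checkin callback described in 6705cd9c26 can be sketched with a `std::shared_ptr` custom deleter. This is a simplified illustration, not crucible's Pool: `PoolSketch` and its type aliases are hypothetical, and the pool is assumed to outlive every Handle it gives out.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch: an object is exclusively owned only while its
// Handle exists; destroying the last copy of the Handle runs the checkin
// callback and returns the object to the free list.
template <class T>
class PoolSketch {
public:
    using Handle = std::shared_ptr<T>;
    using Generator = std::function<std::shared_ptr<T>()>;
    using Checkin = std::function<void(T &)>;

    PoolSketch(Generator generator, Checkin checkin)
        : m_generator(std::move(generator)), m_checkin(std::move(checkin)) {}

    Handle checkout() {
        std::shared_ptr<T> object;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (!m_free.empty()) {
                object = m_free.back();
                m_free.pop_back();
            }
        }
        if (!object) object = m_generator();
        // The deleter keeps the object alive and checks it back in,
        // instead of deleting it.
        return Handle(object.get(), [this, object](T *) {
            m_checkin(*object);  // e.g. truncate a temporary file
            std::lock_guard<std::mutex> lock(m_mutex);
            m_free.push_back(object);
        });
    }

private:
    std::mutex m_mutex;
    std::vector<std::shared_ptr<T>> m_free;
    Generator m_generator;
    Checkin m_checkin;
};
```

Callers must keep the Handle until they have finished using the object, which matches the ownership rule stated in the commit message.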
Zygo Blaxell
de6282c6cd roots: separate crawl sizes into bytes and items
Number of items should be low enough that we don't have too many stale
items, but high enough to amortize system call overhead to a reasonable
ratio.

Number of bytes should be constant:  one worst-case metadata page (the
btrfs limit is 64K, though 16K is much more common) so that we always
have enough space for one worst-case item; otherwise, we get EOVERFLOW
if we set the number of items too low and there's a big item in the tree,
and we can't make further progress.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
e654e29f45 bees: move usage message out of source file and fix a few inaccuracies
It's a pain to read, edit, and format large blocks of text in C++ code,
so rip the usage message out of bees.cc and put it in a plain text file.
Use a minimal translator to convert it into a C string.

While we're here, remove the multiple roots feature from the command
line synopsis, as we don't really support it any more.  Also clarify
that "id 5" is "subvol id 5", and describe in one sentence what
workaround-btrfs-send does.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
7ec19d1eff clang: fix struct/class declaration/definition mismatches
clang does not like a defined class to be declared as a struct.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
c4f0e4abee context: workaround to prevent LOGICAL_INO and btrfs balance from running concurrently
This avoids some kernel bugs.  One of them is fixed in 5.3.4 and later:

	efad8a853a "Btrfs: fix use-after-free when using the tree modification log"

There are apparently others in current kernels, so for now just put bees
on pause until the balance is done.

At some point we may want to provide an option to disable this
workaround; however, running bees and balance at the same time makes
neither particularly fast, so maybe we'll just leave it this way.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-11-28 00:13:15 -05:00
Zygo Blaxell
7117cb40c5 hash: prepare for user-selectable hash functions
Localize the hash function in bees to a single spot to make it easier
to change later (or at runtime).

Remove some code that was using a property of CRC as an optimization.
The optimization doesn't work for other hash functions, and running the
CRC function takes more CPU time than the optimization saved.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-06-12 22:48:06 -04:00
Zygo Blaxell
be2c55119e bees: make exceptions less prominent in log output
Introduce a mechanism to suppress exceptions which do not produce a
full stack trace for common known cases where a loop should be aborted.
Use this mechanism to suppress the infamous "FIXME" exception.

Reduce the log level to at most NOTICE, and in some cases DEBUG.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-01-06 01:48:35 -05:00
Zygo Blaxell
570b3f7de0 bees: handle SIGTERM and SIGINT, force immediate flush and exit
Capture SIGINT and SIGTERM and shut down, preserving current completed
crawl and hash table state.

  * Executing tasks are completed, queued tasks are paused.
  * Crawl state is saved.
  * The crawl master and crawl writeback threads are terminated.
  * The task queue is flushed.
  * Dirty hash table extents are flushed.
  * Hash prefetch and writeback threads are terminated.
  * Hash table is deallocated.
  * FD caches and tmpfiles are destroyed.
  * Assuming the above didn't crash or deadlock, bees exits.

The above order isn't the fastest, but it does roughly follow the
shared_ptr dependencies and avoids data races--especially those that
might lead to bees reporting an extent scanned when it was only queued
for future scanning that did not occur.

In case of a violation of expected shared_ptr dependency order,
exceptions in BeesContext child object accessor methods (i.e. roots(),
hash_table(), etc) prevent any further progress in threads that somehow
remain unexpectedly active.

Move some threads from main into BeesContext so they can be stopped
via BeesContext.  The main thread now runs a loop waiting for signals.

A slow FD leak was discovered in TempFile handling.  This has not been
fixed yet, but an implementation detail of the C++ runtime library makes
the leak so slow it may never be important enough to fix.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-12-09 23:39:44 -05:00
Zygo Blaxell
f4464c6896 roots: quick fix for task scheduling bug leading to loss of crawl_master
The crawl_master task had a simple atomic variable that was supposed
to prevent duplicate crawl_master tasks from ending up in the queue;
however, this had a race condition that could lead to m_task_running
being set with no crawl_master task running to clear it.  This would in
turn prevent crawl_thread from scheduling any further crawl_master tasks,
and bees would eventually stop doing any more work.

A proper fix is to modify the Task class and its friends such that
Task::run() guarantees that 1) at most one instance of a Task is ever
scheduled or running at any time, and 2) if a Task is scheduled while
an instance of the Task is running, the scheduling is deferred until
after the current instance completes.  This is part of a fairly large
planned change set, but it's not ready to push now.

So instead, unconditionally push a new crawl_master Task into the queue
on every poll, then silently and quickly exit if the queue is too full
or the supply of new extents is empty.  Drop the scheduling-related
members of BeesRoots as they will not be needed when the proper fix lands.

Fixes: 4f0bc78a "crawl: don't block a Task waiting for new transids"
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-25 23:46:55 -05:00
Zygo Blaxell
34b04f4255 bees: soft-limit computed thread counts to 8
https://github.com/Zygo/bees/issues/91 describes problems encountered
when running bees on systems with many CPU cores.

Limit the computed number of threads (using --thread-factor or the
default) to a maximum of 8 (i.e. the number of logical cores in a modern
laptop).  Users can override the limit by using --thread-count.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-21 21:49:16 -05:00
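
The soft limit from 34b04f4255 (later removed in fbd1091052) amounts to a small computation, sketched below. The function name, parameter names, and the round-up rule are assumptions made for illustration; only the limit of 8 and the `--thread-count` override come from the commit message.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical sketch. thread_count == 0 means "--thread-count not given".
unsigned compute_thread_count(unsigned thread_count, double thread_factor,
                              unsigned cores, unsigned soft_limit = 8) {
    // An explicit --thread-count overrides the soft limit.
    if (thread_count > 0) return thread_count;
    // Otherwise scale by the number of cores (rounding up is an assumption)...
    const double computed = std::ceil(thread_factor * cores);
    const unsigned n = computed < 1.0 ? 1u : static_cast<unsigned>(computed);
    // ...and cap the computed value.
    return std::min(n, soft_limit);
}
```

In practice `cores` could come from `std::thread::hardware_concurrency()`.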
Zygo Blaxell
23f3e4ec42 workarounds: add workaround for btrfs send
Introduce --workaround options which trade performance or effectiveness to
avoid triggering kernel bugs.

The first such option is --workaround-btrfs-send, which avoids making any
modification to read-only subvols to avoid btrfs send bugs.

Clean up usage message:  no tabs for formatting, split options into
sections by theme.

Make scan mode a non-static data member like all (most?) other options.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-21 21:49:16 -05:00
Zygo Blaxell
c2762740ef context: remove limit on the number of references to an extent
Better toxic extent detection means we can now handle extents with
many more references--easily hundreds of thousands.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-05 21:12:11 -05:00
Zygo Blaxell
aa74a238b3 hash: remove preloaded toxic hash blacklist
Faster and more reliable toxic extent detection means we can now be much
less paranoid about creating toxic extents.

The paranoia has significant impact on dedupe hit rates because every
extent that contains even one toxic hash is abandoned.  The preloaded
toxic hashes were chosen because they occur more frequently than any
other block contents in typical filesystem data.  The combination of these
resulted in as much as 30% of duplicate extents being left untouched.

Remove the preloaded toxic extent blacklist, and rely on the new
kernel-CPU-usage-based workaround instead.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 23:03:01 -04:00
Zygo Blaxell
542371684c context: better detection for toxic extents
We detect toxic extents by measuring how long the LOGICAL_INO ioctl takes
to run.  If it is above some threshold, we consider the extent toxic,
and blacklist it; otherwise, we process the extent normally.

The detector was using the execution time of the ioctl, which detects
toxic extents, but it also detects pauses of the bees process and
transaction commit latency due to load.  This leads to a significant
number of false positives.  The detection threshold was also very long,
burning a lot of kernel CPU before the detection was triggered.

Use the per-thread system CPU statistics to measure the kernel CPU usage
of the LOGICAL_INO call directly.  This is much more reliable because it
is not confounded by other threads, and it's faster because we can set
the time threshold two orders of magnitude lower.

Also remove the lock and mutex added in "context: serialize LOGICAL_INO
calls" because we theoretically no longer need it (but leave the code
there with #if 0 in case we do need it in practice).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 21:12:16 -04:00
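
The measurement described in 542371684c can be sketched with `getrusage(RUSAGE_THREAD)`, which reports CPU time for the calling thread only. This is a minimal, Linux-specific illustration; the function names and the threshold parameter are hypothetical, and the callable stands in for the LOGICAL_INO ioctl.

```cpp
#include <sys/resource.h>
#include <sys/time.h>

#include <functional>
#include <stdexcept>

// Kernel (system) CPU seconds consumed so far by the calling thread.
// RUSAGE_THREAD is a Linux extension (g++ defines _GNU_SOURCE by default).
double thread_system_cpu_seconds() {
    struct rusage usage;
    if (getrusage(RUSAGE_THREAD, &usage) != 0) {
        throw std::runtime_error("getrusage(RUSAGE_THREAD) failed");
    }
    return static_cast<double>(usage.ru_stime.tv_sec) +
           static_cast<double>(usage.ru_stime.tv_usec) / 1e6;
}

// Runs `call` and reports whether it used more kernel CPU time than
// `threshold_seconds`. Time spent sleeping, waiting for a transaction
// commit, or running other threads is not counted.
bool exceeds_kernel_cpu(const std::function<void()> &call,
                        double threshold_seconds) {
    const double before = thread_system_cpu_seconds();
    call();
    const double after = thread_system_cpu_seconds();
    return (after - before) > threshold_seconds;
}
```

Because only this thread's kernel CPU time is counted, a process pause or a slow commit does not inflate the measurement, which is what allows a much lower threshold than a wall-clock timer.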
Zygo Blaxell
35b21687bc bees: drop unused member m_uuid
There is a m_root_uuid which is used.  m_uuid is not, so drop it
and save a tiny amount of memory.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:16 -04:00
Zygo Blaxell
924008603e hash: reduce hash table extent size to 128KB
The 16MB hash table extent size did not serve any useful defragmentation
or compression purpose, and for very small filesystems (under 100GB),
16MB is much larger than necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
041ad717a5 bees: configurable log verbosity
Log messages were already labelled with log levels, but there was no
way to filter by log level at run time.

Implement the filter inside the bees process so it can skip evaluation
of the BEESLOG* arguments if the log messages would not be emitted.

Fixes: https://github.com/Zygo/bees/issues/67

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
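
Filtering inside the process, as 041ad717a5 describes, lets a logging macro skip evaluating its arguments entirely. The sketch below shows the general technique; `SKETCH_LOG`, `sketch_log_level`, and `sketch_log_output` are hypothetical names, not the BEESLOG* macros, and messages are collected in a vector rather than written to a real log.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical globals. Levels are syslog-style: lower is more severe.
int sketch_log_level = 6;
std::vector<std::string> sketch_log_output;

// `expr` is only evaluated when the message would actually be emitted.
#define SKETCH_LOG(level, expr)                                  \
    do {                                                         \
        if ((level) <= sketch_log_level) {                       \
            std::ostringstream sketch_log_oss_;                  \
            sketch_log_oss_ << expr;                             \
            sketch_log_output.push_back(sketch_log_oss_.str());  \
        }                                                        \
    } while (0)
```

With `sketch_log_level` set to 5, a call such as `SKETCH_LOG(7, "detail " << expensive())` never calls `expensive()`.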
Zygo Blaxell
f8c27f5c6a bees: revert TOXIC_INTERVAL back to pre-4.14 levels
Linux kernel 4.14, while resistant to extent toxicity, is not immune to it.

Go back to the paranoid setting to avoid tying up filesystems in
ridiculously long kernel loops in find_parent_nodes.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-05-18 00:16:08 -04:00
Zygo Blaxell
082f04818f BeesBlockData: fix data type issues
Not sure if these cause any problems, but they are theoretically
incorrect data types.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-02-28 23:58:28 -05:00
Zygo Blaxell
5bdad7fc93 crucible: progress: a progress tracker for worker queues
The task queue can become very large with many subvols, requiring hours
for the queue to clear.  Any 'beescrawl.dat' save in the meantime records
the work currently scheduled, not the work currently completed.

Fix by tracking progress with ProgressTracker.  ProgressTracker::begin()
gives the last completed crawl position.  ProgressTracker::end() gives
the last scheduled crawl position.  begin() does not advance while any
item between begin() and end() is not yet completed.  In between
are crawled extents that are on the task queue but not yet processed.
The file 'beescrawl.dat' saves the begin() position while the extent
scanning task queue is fed from the end() position.

Also remove an unused method crawl_state_get() and repurpose the
operator<(BeesCrawlState) that nobody was using.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-02-28 23:49:39 -05:00
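
The begin()/end() behaviour described in 5bdad7fc93 can be sketched as follows. This is a simplified illustration, not crucible's ProgressTracker: `ProgressSketch`, `schedule()`, and `complete()` are hypothetical names, and positions are assumed to be distinct integers scheduled in increasing order.

```cpp
#include <set>

// Hypothetical sketch: end() follows the newest scheduled position, while
// begin() only advances past positions with nothing older outstanding.
class ProgressSketch {
public:
    // Record that `position` has been placed on the task queue.
    void schedule(long position) {
        m_queued.insert(position);
        if (position > m_end) m_end = position;
    }

    // Record that the work at `position` has finished.
    void complete(long position) {
        if (m_queued.erase(position) == 0) return;  // never scheduled
        m_done.insert(position);
        // Advance begin() over completed positions that are older than
        // every position still waiting in the queue.
        while (!m_done.empty() &&
               (m_queued.empty() || *m_done.begin() < *m_queued.begin())) {
            m_begin = *m_done.begin();
            m_done.erase(m_done.begin());
        }
    }

    long begin() const { return m_begin; }  // safe position to save
    long end() const { return m_end; }      // position to feed the queue from

private:
    std::set<long> m_queued;  // scheduled, not yet completed
    std::set<long> m_done;    // completed, but blocked by older queued work
    long m_begin = 0;
    long m_end = 0;
};
```

Saving begin() rather than end() means a restart may repeat some work that had already completed out of order, but it never skips work that was only queued.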
Zygo Blaxell
4f0bc78a4c crawl: don't block a Task waiting for new transids
Task should not block for extended periods of time.

Remove the RateEstimator::wait_for() in crawl_roots.  When crawl_roots
runs out of data, let the last crawl_task end without rescheduling.
Schedule crawl_task again on transid polls if it was not already running.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-29 21:37:39 -05:00