1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 13:25:45 +02:00

436 Commits

Author SHA1 Message Date
Zygo Blaxell
80c69f1ce4 context: get rid of shared_ptr<BeesContext> in every single cached Fd object
Support for multiple BeesContext objects sharing a FdCache was wasting
significant space and atomic inc/dec memory cycles for no good reason
since the shared-FdCache feature was deprecated.

open_root and open_root_ino still need a BeesContext to work.  Pass the
BeesContext pointer through the function object instead of the cache
key arguments.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-28 21:54:00 -04:00
Zygo Blaxell
db65031c2b context: get rid of all instances of pthread_cancel
pthread_cancel doesn't really work properly.  It was only being used in
bees to bring threads to a stop if the BeesContext is destroyed early.
It is frequently implicated in core dump reports because of the fragility
of the C++ iostream / C stdio / library infrastructure, particularly
surrounding upgrades on the host running bees.  The pthread_cancel call
itself often simply fails even when it doesn't call terminate().

Defer creation of the status and progress threads until after the
BeesContext::start method is invoked.  At that point, the existing
ask-threads-nicely-to-stop code is up and running, and normal condvars
can be used to bring bees to a stop, without having to resort to
pthread_cancel.

Since we're deleting half of the BeesContext constructor in this change,
let's remove the other half too, and put an end to the deprecated support
for multiple BeesContexts sharing a process.  It's still possible to run
multiple BeesContexts, but they will not share a FD cache.  This will
allow the FD cache's keys to become smaller and hopefully save some
memory later on.

Fixes: #171

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-28 21:42:03 -04:00
Zygo Blaxell
243480b515 ntoa: fix comment disparaging gcc for not implementing C99 compound literals in C++
C99's "{ 0 }" notation for filling in a struct with all zeros was not
included in the C++11 standard, so gcc doesn't implement it and neither
does clang.

gcc does (did?) have issues with warnings on the same code in C99,
complaining about uninitialized struct members when "{0}" explicitly
initializes every member to a zero value.  These issues don't apply in
the C++ code where NTOA_TABLE_ENTRY_END is used.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
f8a8704135 ntoa: fix bits_ntoa formatting and error handling
Get rid of an assert in bits_ntoa.  Throw an exception instead.

Fix hex formatting (adding "0x" before a decimal number is not
the correct way to format hex strings).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
8a60850e32 docs: note that FIEMAP is also affected by backref performance issue
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
9d21e6b456 docs: drop incomplete build recipe for ubuntu 14.04
The kernel from such an old distro version likely has several unfixed
bugs.  Better not to support it at all.

Users who can upgrade the kernel are probably also sophisticated enough
to fix the build issues too.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
bcf3e7de3e uuid: drop dependency on uuid.h
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.

We were only using uuid to support a legacy feature that was removed
over four years ago.

Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs.  Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:16:50 -04:00
Zygo Blaxell
6465a7c37c docs: btrfs-kernel: update recommended kernels list, slow backrefs bug has been backported
The slow backrefs performance improvement is confirmed by reports from
multiple users:

	* Me (5.4.60 + backref patches, 5.7 to 5.11)

	* https://github.com/Zygo/bees/issues/161 (5.8)

	* https://github.com/Zygo/bees/issues/162 (5.8)

	* IRC user S0rin (5.4.88 + backref patches)

The issue still exists, but at a significantly reduced scale:  now about
2 ms of CPU per ref on a fast machine.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-04 14:01:55 -04:00
Zygo Blaxell
177f393ed6 docs: btrfs-kernel: add the 5.10 performance regression, the Ctrl-C on balance kernel crash has been fixed
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-23 17:37:51 -05:00
Zygo Blaxell
5f40f9edb0 docs: remove libbtrfs-dev as a build-time dependency
We no longer require ctree.h from libbtrfs-dev.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:07:06 -05:00
Zygo Blaxell
7f660f50b8 lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks
The Linux kernel's btrfs headers are better than the libbtrfs-dev headers:

	- the libbtrfs-dev headers have C++ language compatibility issues

	- upstream version in Linux kernel is more accurate and up to date

	- macros in libbtrfs-dev's ctree.h hide information that would
	enable bees to perform runtime buffer length checking

	- enum types whose presence cannot be detected with #ifdef

When accessing members of metadata items from the filesystem, we want
to verify that the member we are accessing is within the boundaries of
the item that was retrieved; otherwise, a memory access violation may
occur or garbage may be returned to the caller.  A simple C++ template,
given a pointer to a structure member and a buffer, can determine that
the buffer contains enough bytes to safely access a struct member.
This was implemented back in 2016, but left unused due to ctree.h issues.

Some btrfs metadata structures have variable length despite using a
fixed-size in-memory structure.  The members that appear earliest in
the structure contain information about which following members of the
structure are used.  The item stored in the filesystem is truncated after
the last used member, and all following members must not be accessed.

'btrfs_stack_*' accessor macros obscure the memory boundaries of the
members they access, which makes it impossible for a C++ template to
verify the memory access.  If the template checks the length of the
entire structure, it will find an access violation for variable-length
metadata items because the item is rarely large enough for the entire
structure.

Get rid of all the libbtrfs-dev accessor macros and reimplement them
with the necessary buffer length checks.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:06:43 -05:00
Zygo Blaxell
6eb7afa65c build: include localconf everywhere
Overriding makeflags did not work from localconf in the src, lib, or
test directories.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:06:43 -05:00
Zygo Blaxell
10af3f9763 bees: remove si_addr_lsb from siginfo debug message to fix FTBFS
Apparently it is missing in newer Linux headers, making
builds fail.  We don't need it, so remove it.

Closes: #160
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-21 19:26:22 -05:00
Zygo Blaxell
636e69267e resolve: add bees.h constants for balance and logical_ino serialization
Make these workarounds configurable in src/bees.h instead of #if 0
code blocks.  Someday we'll make the constants in bees.h configurable
through a file or similar.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
c0149d72b7 fs: use Spanner to refer to ioctl arg buffer instead of making vector copies
This avoids some allocations and copying.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
333aea9822 lib: introduce Spanner, a pointer and size delimiting a range
Spanner<Iterator> turns a pair of pointers into a sequence container
with several of vector's methods.

A partial specialization of make_spanner is provided which uses
shared_ptr as the beginning of the range.  Some of the Spanner code
is a questionable hack in support of this.

C++20 has ranges and span, but neither is worth moving the minimum
C++ standard forward.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
9ca69bb7ff fs: remove buffer overrun check in get_struct_ptr for non-copying containers
When we are using non-copying containers, we can't call resize() on them.
get_struct_ptr is essentially a pointer cast, so we will end up with a
pointer to a struct that extends beyond the boundaries of the container.

As long as the btrfs metadata is not corrupted, we should not have too
many problems.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
f45e379802 fs: deprecate vector<char>
Use uint8_t when we mean uint8_t, i.e. vector<uint8_t> instead of
vector<char>.

Add a template parameter instead of vector so we can swap in a
non-copying data type.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
180bb60cde fs: add support and workarounds for btrfs fs_info v2
Define a local copy of the header that has fields for the csum type
and length, so we can build in places that haven't caught up to kernel
5.5 headers yet.

The reason why the csum type and length are not unconditionally filled
in eludes me.  csum_length is necessarily non-zero, and the cost of
the conditional is worse than the cost of the copy, so the whole flags
dance is a WTF...but it's part of the kernel API now, so it's too late
to NAK it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:07:36 -05:00
Zygo Blaxell
c80af1cb4f fd: deprecate Resource in favor of NamedPtr
Rewrite Fd using a much simpler named resource template class with
a more straightforward derivation strategy.

Behavior change:  we no longer throw an exception while calling get_fd()
on a closed Fd.  This does not seem to bother any current callers except
for the tests.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 18:06:44 -05:00
Zygo Blaxell
ab5316a3da src: use correct flags for compiling .c files, fix missing dependencies
fiewalk and fiemap depend on a lot of crucible, and incremental builds
fail hard without proper dependency tracking.

All binaries must be rebuilt when makeflags changes.  This dependency
exists already in lib and test, but src was missing.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:52 -05:00
Zygo Blaxell
d2ecb4c9ea lib: namedptr: thread-safe reference counted named object store
NamedPtr provides reference-counted handles to named objects.  The object
is created the first time the associated name is used, and stored under
the associated name until the last handle is destroyed.  NamedPtr may
itself be destroyed while handles are still active.

This template is intended to replace ResourceHandle with a more general
and less invasive implementation.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:52 -05:00
Zygo Blaxell
8a2fb75462 fd: move relative path string to library
Use a single static variable located in the library, instead of
having a separate one for each compilation unit.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:52 -05:00
Zygo Blaxell
420c218c83 cache: remove unused #includes
Also fix bees-roots's missing headers.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:52 -05:00
Zygo Blaxell
6ee5da7d77 cache: clean up pointer mangling and duplicate code
std::list and std::map both have stable iterators, and list has the
splice() method, so we don't need a hand-rolled double-linked list here.

Coalesce insert() and operator() into a single function.

Drop the unused prune() method.

Move destructor calls for cached objects out from under the cache lock.
Closing a lot of files at once is already expensive, might as well not
stop the world while we do it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
b1bdd9e056 test: rebuild the tests if libcrucible.a changes
Due to a missing dependency, tests are not rebuilt when the library
changes, so tests return false results after library source changes.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
0c84302d9a lib: don't rebuild libcrucible unless there is a version change
If we create an identical .version.cc then don't bother keeping it.
This prevents libcrucible from rebuilding if there are no other changes,
which in turn prevents all the binaries from rebuilding unconditionally.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
03627503ec include: #undef crc32c
Some versions of linux-libc header files define a macro named 'crc32c'.
We want to use that name too, so #undef it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
5248985da0 context: fix shutdown log messages identifying the wrong thread
We are waiting for the status thread, not the progress thread.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
d1f1c386bc tempfile: remove size limit in realign()
Now that tempfiles are using pool checkin functions to control their
size, we don't need a size limit in realign().

We keep the limit in make_copy because it's a sanity check against
letting a multi-terabyte copy operation slip through.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
6705cd9c26 context: move TempFile from TLS to Pool and fix some FdCache issues
Get rid of the thread-local TempFiles and use Pool instead.  This
eliminates a potential FD leak when the loadavg governor repeatedly
creates and destroys threads.

With the old per-thread TempFiles, we were guaranteed to have exclusive
ownership of the TempFile object within the current thread.  Pool is
somewhat stricter:  it only guarantees ownership while the checked-out
Handle exists.  Adjust the users of TempFile objects to ensure they hold
the Handle object until they are finished using the TempFile.

It appears that maintaining large, heavily-reflinked, long-lived temporary
files costs more than truncating after every use: btrfs has to write
multiple references to the temporary file's extents, then some commits
later, remove references as the temporary file is deleted or truncated.
Using the temporary file in a dedupe operation flushes the data to disk,
so nothing is saved by pretending that there is writeback pipelining and
trying to avoid flushes in truncate.  Pool provides usage tracking and
a checkin callback, so use it to truncate the temporary file immediately
after every use.

Redesign TempFile so that every instance creates exactly one Fd which
persists over the lifetime of the TempFile object.  Provide a reset()
method which resets the file back to the initial state and call it from
the Pool checkin callback.  This makes TempFile's lifetime equivalent to
its Fd's lifetime, which simplifies interactions with FdCache and Roots.

This change means we can now blacklist temporary files without having
an effective memory leak, so do that.  We also have a reason to ever
remove something from the blacklist, so add a method for that too.

In order to move to extent-centric addressing, we need to be able to
reliably open temporary files by root and inode number.  Previously we
would place TempFile fd's into the cache with insert_root_ino, but the
cache would be cleared periodically, and it would not be possible to
reopen temporary files after that happened.  Now that the TempFile's
lifetime is the same as the TempFile Fd's lifetime, we can have TempFile
manage a separate FileId -> Fd map in Roots which is unaffected by the
periodic cache clearing.  BeesRoots::open_root_ino_nocache will check
this map before attempting to open the file via btrfs root+ino lookup,
and return it through the cache as if Roots had opened the file via btrfs.

Hold a reference to BeesRoots in BeesTempFile because the usual way
to get such a reference now throws an exception in BeesTempFile's
destructor.

These changes make method BeesTempFile::create() and all methods named
insert_root_ino unnecessary, so delete them.

We construct and destroy TempFiles much less often now, so make their
constructor and destructor more informative.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
3ce00a5ebe lib: introduce Pool, a class for storing reusable anonymous objects
Pool is a place to store shared_ptrs to generated objects (T) that are
too expensive to create and destroy between individual uses, such as
temporary files.  Objects in a Pool have no distinct identity
(contrast with Cache or NamedPtr).

Users of the Pool invoke the Pool function call overload and "check out"
a shared_ptr<T> for a T object from the Pool.  When the last referencing
shared_otr<T> is destroyed, the T object is "checked in" to the Pool.

Each call of the Pool function overload checks out a shared_ptr<T> to a T
object that is not currently referenced by any other public shared_ptr<T>.

If there are no existing T objects in the Pool, a new T is constructed
by calling the generator function.

The clear() method destroys all checked in T objects owned by the Pool
at the time the method is called.  T objects that are checked out are
not affected by clear(), and they will be stored in the Pool when they
are checked in.

If the checkout function is provided, it is called on a shared_ptr<T>
during checkout, before returning to the caller.

If the checkin function is provided, it is called on a shared_ptr<T>
before returning it to the Pool.  The checkin function must not throw
exceptions.

The Pool may be destroyed while T objects are checked out of the Pool.
In that case, when the T objects are checked in, the T object is
immediately destroyed without calling the checkin function.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
1086900a9d string: second argument to stoull is technically a nullptr
This comes up if too many compiler warnings are enabled.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
97c167d63a fs: don't zero-fill btrfs data containers
The kernel does it already, and we gain a little performance here because
we do it so often.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
b49d458792 context: move prealloc dedupe to a separate Task
A prealloc extent reference can be deduped immediately and asynchronously.
There is no need to slow down extent scanning to do it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
1e7dbc6f97 tempfile: remove old comments about fsync and deadlock bugs
I was never able to prove a connection between fsync() and deadlock bugs.
There were too many deadlock bugs to be able to isolate a bug that is
triggered specifically by fsync.

Update the comment (which has been unchanged since kernel 4.14).  We still
may want to do fsync() on temporary files someday, but there's a full
internal API rewrite between here and there.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
459071597b fs: make operator<() for search ioctl inline
Perf blames this operator for >1% of instructions with -O2, and
70% of instructions without -O2.

Let the compiler inline the function.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
187d12fc25 fs: always use container's actual size not requested size
The requested size may not match the final size of the container,
so consistently use the container's size after prepare(), not the
requested size.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
de6282c6cd roots: separate crawl sizes into bytes and items
Number of items should be low enough that we don't have too many stale
items, but high enough to amortize system call overhead to a reasonable
ratio.

Number of bytes should be constant:  one worst-case metadata page (the
btrfs limit is 64K, though 16K is much more common) so that we always
have enough space for one worst-case item; otherwise, we get EOVERFLOW
if we set the number of items too low and there's a big item in the tree,
and we can't make further progress.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
d332616eff roots: report the search parameters on tree search ioctl error
There are lots of ways the search can fail, but it's hard to pick one
without knowing the parameters.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
e654e29f45 bees: move usage message out of source file and fix a few inaccuracies
It's a pain to read, edit, and format large blocks of text in C++ code,
so rip the usage message out of bees.cc and put it in a plain text file.
Use a minimal translator to convert it into a C string.

While we're here, remove the multiple roots feature from the command
line synopsis, as we don't really support it any more.  Also clarify
that "id 5" is "subvol id 5", and describe in one sentence what
workaround-btrfs-send does.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
bbaf55b2b0 roots: make it build with clang
Remove an unnecessary cast that was breaking namespace lookup for clang.

Closes: #159

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
62a20ebf9c chatter: make it build with clang
Silence the unused variable warning.  The compiler is correct, but we
may implement line-level debug at some point in the future, so we
want to keep the member and parameters.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
7ec19d1eff clang: fix struct/class declaration/definition mismatches
clang does not like a defined class to be declared as a struct.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
b7b18d9fa1 extentwalker: make it build with clang
Remove unused MAX_OFFSET.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
f263c8751e bees context: make it build with clang
Remove unused function getenv_or_die.  All of our environment variable
parameters are optional or have default values.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
17d8759011 bees: make it build with clang
Remove unused "addr check" functions.  We have ranged_cast for detecting
overflow bits.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
e34237886d process: make it build with clang
Get rid of unused template instantiation.

Drop the unused realtime signals from the ntoa table.  If in the future
we really need to solve clang's issue with them, we'll address it then.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
29b051b131 task: make it build with clang
Remove unused closure captures.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
8e9b53b3fd stats: remove nonsense dedup_unique_bytes stat
A long time ago, when bees used dedicated threads to scan each subvol, the
calculation of the "dedup_unique_bytes" statistic was still wrong.

This stat can only be calculated when dedupe runs on extent data items
instead of extent reference items.  Remove the stat variable until
that happens.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00