Make these workarounds configurable in src/bees.h instead of #if 0
code blocks. Someday we'll make the constants in bees.h configurable
through a file or similar.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Spanner<Iterator> turns a pair of pointers into a sequence container
with several of vector's methods.
A partial specialization of make_spanner is provided which uses
shared_ptr as the beginning of the range. Some of the Spanner code
is a questionable hack in support of this.
C++20 has ranges and span, but neither is worth moving the minimum
C++ standard forward.
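
To illustrate the idea, here is a minimal sketch of a pointer-pair view with a few of vector's methods. The names SpanSketch, make_span_sketch, and span_sketch_demo are hypothetical and are not the crucible API; the shared_ptr specialization mentioned above is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a Spanner-like class:
// a non-owning view over [begin, end) with some vector-style methods.
template <class T>
class SpanSketch {
	T *m_begin;
	T *m_end;
public:
	SpanSketch(T *begin, T *end) : m_begin(begin), m_end(end) {}
	T *begin() const { return m_begin; }
	T *end() const { return m_end; }
	T *data() const { return m_begin; }
	std::size_t size() const { return static_cast<std::size_t>(m_end - m_begin); }
	bool empty() const { return m_begin == m_end; }
	T &operator[](std::size_t i) const { return m_begin[i]; }
};

// Build a view from a pointer and an element count.
template <class T>
SpanSketch<T> make_span_sketch(T *begin, std::size_t count)
{
	return SpanSketch<T>(begin, begin + count);
}

// Iterate over a raw buffer as if it were a container; no bytes are copied.
inline std::size_t span_sketch_demo()
{
	std::uint8_t buf[4] = {1, 2, 3, 4};
	auto s = make_span_sketch(buf, 4);
	assert(s.size() == 4 && !s.empty());
	std::size_t sum = 0;
	for (auto b : s) {
		sum += b;
	}
	return sum;
}
```

The view does not own the buffer, so the caller has to keep the underlying storage alive for as long as the view is used.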
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When we are using non-copying containers, we can't call resize() on them.
get_struct_ptr is essentially a pointer cast, so we will end up with a
pointer to a struct that extends beyond the boundaries of the container.
As long as the btrfs metadata is not corrupted, we should not have too
many problems.
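
The following sketch shows the kind of pointer cast involved, under the assumption that the container exposes data() and size() over bytes. The names get_struct_ptr_sketch, struct_fits_sketch, and ItemHeaderSketch are hypothetical; the real get_struct_ptr may differ. The bounds check is a separate helper because a non-copying container cannot be resized to make the struct fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size record, standing in for a btrfs item struct.
struct ItemHeaderSketch {
	std::uint32_t a;
	std::uint32_t b;
};

// Essentially a pointer cast: no bytes are copied and nothing is resized.
// The result may point past the end of the container if the data is short.
template <class T, class Container>
const T *get_struct_ptr_sketch(const Container &c, std::size_t offset = 0)
{
	static_assert(sizeof(typename Container::value_type) == 1, "byte container expected");
	return reinterpret_cast<const T *>(c.data() + offset);
}

// Optional check a caller could make before dereferencing the pointer.
template <class T, class Container>
bool struct_fits_sketch(const Container &c, std::size_t offset = 0)
{
	return offset <= c.size() && sizeof(T) <= c.size() - offset;
}

inline bool get_struct_ptr_sketch_demo()
{
	std::vector<std::uint8_t> buf(12);
	const auto *p = get_struct_ptr_sketch<ItemHeaderSketch>(buf, 4);
	const bool same_address = reinterpret_cast<const std::uint8_t *>(p) == buf.data() + 4;
	// 8 bytes at offset 4 fit in 12 bytes; at offset 8 they would overrun.
	return same_address
		&& struct_fits_sketch<ItemHeaderSketch>(buf, 4)
		&& !struct_fits_sketch<ItemHeaderSketch>(buf, 8);
}
```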
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Use uint8_t when we mean uint8_t, i.e. vector<uint8_t> instead of
vector<char>.
Add a template parameter instead of vector so we can swap in a
non-copying data type.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Define a local copy of the header that has fields for the csum type
and length, so we can build in places that haven't caught up to kernel
5.5 headers yet.
The reason why the csum type and length are not unconditionally filled
in eludes me. csum_length is necessarily non-zero, and the cost of
the conditional is worse than the cost of the copy, so the whole flags
dance is a WTF...but it's part of the kernel API now, so it's too late
to NAK it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Rewrite Fd using a much simpler named resource template class with
a more straightforward derivation strategy.
Behavior change: we no longer throw an exception while calling get_fd()
on a closed Fd. This does not seem to bother any current callers except
for the tests.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
fiewalk and fiemap depend on a lot of crucible, and incremental builds
fail hard without proper dependency tracking.
All binaries must be rebuilt when makeflags changes. This dependency
exists already in lib and test, but src was missing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
NamedPtr provides reference-counted handles to named objects. The object
is created the first time the associated name is used, and stored under
the associated name until the last handle is destroyed. NamedPtr may
itself be destroyed while handles are still active.
This template is intended to replace ResourceHandle with a more general
and less invasive implementation.
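
A minimal sketch of the behaviour described above, assuming a map of weak_ptr keyed by name. NamedPtrSketch and its demo are hypothetical and simplified: expired map entries are only replaced on reuse, and the factory runs while the lock is held. The real NamedPtr may be implemented differently.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

template <class Key, class Value>
class NamedPtrSketch {
	using Ptr = std::shared_ptr<Value>;
	std::function<Ptr(const Key &)> m_make;
	std::mutex m_mutex;
	std::map<Key, std::weak_ptr<Value>> m_map;
public:
	explicit NamedPtrSketch(std::function<Ptr(const Key &)> make) : m_make(std::move(make)) {}

	// Return the live object for this name, creating it on first use.
	Ptr operator()(const Key &key)
	{
		std::unique_lock<std::mutex> lock(m_mutex);
		auto found = m_map.find(key);
		if (found != m_map.end()) {
			if (Ptr live = found->second.lock()) {
				return live;
			}
		}
		Ptr made = m_make(key);
		m_map[key] = made;
		return made;
	}
};

inline bool named_ptr_sketch_demo()
{
	int created = 0;
	std::shared_ptr<std::string> survivor;
	{
		NamedPtrSketch<std::string, std::string> names([&](const std::string &k) {
			++created;
			return std::make_shared<std::string>("object:" + k);
		});
		auto a = names("x");
		auto b = names("x");	// same name, same object
		assert(a == b && created == 1);
		a.reset();
		b.reset();		// last handle gone, object destroyed
		survivor = names("x");	// created again under the same name
	}
	// The handle outlives the NamedPtrSketch that produced it.
	return *survivor == "object:x" && created == 2;
}
```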
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Use a single static variable located in the library, instead of
having a separate one for each compilation unit.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
std::list and std::map both have stable iterators, and list has the
splice() method, so we don't need a hand-rolled double-linked list here.
Coalesce insert() and operator() into a single function.
Drop the unused prune() method.
Move destructor calls for cached objects out from under the cache lock.
Closing a lot of files at once is already expensive; we might as well
not stop the world while we do it.
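
The approach can be sketched as follows, assuming a std::list ordered from most to least recently used and a std::map from key to list iterator. LruSketch is a hypothetical, simplified class, not the crucible cache: it calls the factory while holding the lock, which the real code may avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <mutex>
#include <utility>

template <class Key, class Value>
class LruSketch {
	using List = std::list<std::pair<Key, Value>>;
	std::function<Value(const Key &)> m_make;
	std::size_t m_max;
	std::mutex m_mutex;
	List m_list;	// most recently used entry at the front
	std::map<Key, typename List::iterator> m_index;
public:
	LruSketch(std::function<Value(const Key &)> make, std::size_t max)
		: m_make(std::move(make)), m_max(max ? max : 1) {}

	// Lookup and insert are one function: a miss creates the entry.
	Value operator()(const Key &key)
	{
		// Declared before the lock, so evicted values are destroyed
		// after the lock has been released.
		List evicted;
		std::unique_lock<std::mutex> lock(m_mutex);
		auto found = m_index.find(key);
		if (found != m_index.end()) {
			// splice() relinks the node; the stored iterator stays valid.
			m_list.splice(m_list.begin(), m_list, found->second);
			return found->second->second;
		}
		m_list.emplace_front(key, m_make(key));
		m_index[key] = m_list.begin();
		while (m_list.size() > m_max) {
			auto last = std::prev(m_list.end());
			m_index.erase(last->first);
			evicted.splice(evicted.begin(), m_list, last);
		}
		return m_list.front().second;
	}
};

inline int lru_sketch_demo()
{
	int created = 0;
	LruSketch<int, int> cache([&](const int &k) { ++created; return k * 10; }, 2);
	cache(1);
	cache(2);
	cache(1);	// hit: 1 becomes most recently used
	cache(3);	// miss: evicts 2
	cache(1);	// hit
	cache(2);	// miss: 2 has to be created again
	return created;
}
```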
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Due to a missing dependency, tests are not rebuilt when the library
changes, so tests return false results after library source changes.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we create an identical .version.cc then don't bother keeping it.
This prevents libcrucible from rebuilding if there are no other changes,
which in turn prevents all the binaries from rebuilding unconditionally.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some versions of linux-libc header files define a macro named 'crc32c'.
We want to use that name too, so #undef it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Now that tempfiles are using pool checkin functions to control their
size, we don't need a size limit in realign().
We keep the limit in make_copy because it's a sanity check against
letting a multi-terabyte copy operation slip through.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Get rid of the thread-local TempFiles and use Pool instead. This
eliminates a potential FD leak when the loadavg governor repeatedly
creates and destroys threads.
With the old per-thread TempFiles, we were guaranteed to have exclusive
ownership of the TempFile object within the current thread. Pool is
somewhat stricter: it only guarantees ownership while the checked-out
Handle exists. Adjust the users of TempFile objects to ensure they hold
the Handle object until they are finished using the TempFile.
It appears that maintaining large, heavily-reflinked, long-lived temporary
files costs more than truncating after every use: btrfs has to write
multiple references to the temporary file's extents, then some commits
later, remove references as the temporary file is deleted or truncated.
Using the temporary file in a dedupe operation flushes the data to disk,
so nothing is saved by pretending that there is writeback pipelining and
trying to avoid flushes in truncate. Pool provides usage tracking and
a checkin callback, so use it to truncate the temporary file immediately
after every use.
Redesign TempFile so that every instance creates exactly one Fd which
persists over the lifetime of the TempFile object. Provide a reset()
method which resets the file back to the initial state and call it from
the Pool checkin callback. This makes TempFile's lifetime equivalent to
its Fd's lifetime, which simplifies interactions with FdCache and Roots.
This change means we can now blacklist temporary files without having
an effective memory leak, so do that. We also now have a reason to
remove entries from the blacklist, so add a method for that too.
In order to move to extent-centric addressing, we need to be able to
reliably open temporary files by root and inode number. Previously we
would place TempFile fd's into the cache with insert_root_ino, but the
cache would be cleared periodically, and it would not be possible to
reopen temporary files after that happened. Now that the TempFile's
lifetime is the same as the TempFile Fd's lifetime, we can have TempFile
manage a separate FileId -> Fd map in Roots which is unaffected by the
periodic cache clearing. BeesRoots::open_root_ino_nocache will check
this map before attempting to open the file via btrfs root+ino lookup,
and return it through the cache as if Roots had opened the file via btrfs.
Hold a reference to BeesRoots in BeesTempFile because the usual way
to get such a reference now throws an exception in BeesTempFile's
destructor.
These changes make method BeesTempFile::create() and all methods named
insert_root_ino unnecessary, so delete them.
We construct and destroy TempFiles much less often now, so make their
constructor and destructor more informative.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Pool is a place to store shared_ptrs to generated objects (T) that are
too expensive to create and destroy between individual uses, such as
temporary files. Objects in a Pool have no distinct identity
(contrast with Cache or NamedPtr).
Users of the Pool invoke the Pool function call overload and "check out"
a shared_ptr<T> for a T object from the Pool. When the last referencing
shared_ptr<T> is destroyed, the T object is "checked in" to the Pool.
Each call of the Pool function overload checks out a shared_ptr<T> to a T
object that is not currently referenced by any other public shared_ptr<T>.
If there are no existing T objects in the Pool, a new T is constructed
by calling the generator function.
The clear() method destroys all checked in T objects owned by the Pool
at the time the method is called. T objects that are checked out are
not affected by clear(), and they will be stored in the Pool when they
are checked in.
If the checkout function is provided, it is called on a shared_ptr<T>
during checkout, before returning to the caller.
If the checkin function is provided, it is called on a shared_ptr<T>
before returning it to the Pool. The checkin function must not throw
exceptions.
The Pool may be destroyed while T objects are checked out of the Pool.
In that case, when the T objects are checked in, the T object is
immediately destroyed without calling the checkin function.
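
One possible way to get these semantics is a shared_ptr with a custom deleter that returns the object to the Pool. The sketch below assumes that design; PoolSketch and its member names are hypothetical and the real Pool may be built differently. The checkin callback here must not throw and must not keep a copy of the pointer it is given.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <utility>

template <class T>
class PoolSketch {
public:
	using Ptr = std::shared_ptr<T>;
	using Generator = std::function<Ptr()>;
	using Callback = std::function<void(Ptr)>;
private:
	struct State {
		std::mutex mutex;
		std::list<Ptr> idle;	// checked-in objects
		Callback checkin;
	};
	Generator m_generator;
	Callback m_checkout;
	std::shared_ptr<State> m_state;
public:
	explicit PoolSketch(Generator g, Callback checkout = Callback(), Callback checkin = Callback())
		: m_generator(std::move(g)), m_checkout(std::move(checkout)),
		  m_state(std::make_shared<State>())
	{
		m_state->checkin = std::move(checkin);
	}

	// Check out an object that no other handle currently references.
	Ptr operator()()
	{
		Ptr owner;
		{
			std::unique_lock<std::mutex> lock(m_state->mutex);
			if (!m_state->idle.empty()) {
				owner = m_state->idle.front();
				m_state->idle.pop_front();
			}
		}
		if (!owner) {
			owner = m_generator();
		}
		std::weak_ptr<State> weak_state = m_state;
		// The deleter runs when the last copy of the handle is destroyed.
		Ptr handle(owner.get(), [owner, weak_state](T *) {
			auto state = weak_state.lock();
			if (!state) {
				return;	// Pool destroyed: the object dies with this deleter
			}
			if (state->checkin) {
				state->checkin(owner);
			}
			std::unique_lock<std::mutex> lock(state->mutex);
			state->idle.push_back(owner);
		});
		if (m_checkout) {
			m_checkout(handle);
		}
		return handle;
	}

	// Destroy checked-in objects only; destruction happens after unlock.
	void clear()
	{
		std::list<Ptr> doomed;
		std::unique_lock<std::mutex> lock(m_state->mutex);
		doomed.swap(m_state->idle);
	}
};

inline int pool_sketch_demo()
{
	int generated = 0, checkins = 0;
	PoolSketch<int> pool(
		[&] { ++generated; return std::make_shared<int>(0); },
		nullptr,
		[&](std::shared_ptr<int> p) { ++checkins; *p = 0; });
	{
		auto a = pool();
		auto b = pool();
		assert(a != b);	// two checkouts never share an object
		*a = 42;
	}	// both handles destroyed: two checkins
	auto c = pool();	// reuses a checked-in object
	assert(*c == 0);
	return generated * 10 + checkins;
}
```

In this sketch the caller keeps the returned handle for as long as it uses the object; dropping the handle is what triggers the checkin.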
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
A prealloc extent reference can be deduped immediately and asynchronously.
There is no need to slow down extent scanning to do it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
I was never able to prove a connection between fsync() and deadlock bugs.
There were too many deadlock bugs to be able to isolate a bug that is
triggered specifically by fsync.
Update the comment (which has been unchanged since kernel 4.14). We still
may want to do fsync() on temporary files someday, but there's a full
internal API rewrite between here and there.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Perf blames this operator for >1% of instructions with -O2, and
70% of instructions without -O2.
Let the compiler inline the function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The requested size may not match the final size of the container,
so consistently use the container's size after prepare(), not the
requested size.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Number of items should be low enough that we don't have too many stale
items, but high enough to amortize system call overhead to a reasonable
ratio.
Number of bytes should be constant: one worst-case metadata page (the
btrfs limit is 64K, though 16K is much more common) so that we always
have enough space for one worst-case item; otherwise, we get EOVERFLOW
if we set the number of items too low and there's a big item in the tree,
and we can't make further progress.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
There are lots of ways the search can fail, but it's hard to pick one
without knowing the parameters.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
It's a pain to read, edit, and format large blocks of text in C++ code,
so rip the usage message out of bees.cc and put it in a plain text file.
Use a minimal translator to convert it into a C string.
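
As an illustration only, a translator of this kind can be as small as the function below. It is a hypothetical sketch, not the translator used by the build: it quotes each input line, escapes backslashes and double quotes, and appends a newline escape.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Turn plain text into a sequence of adjacent C string literals,
// one per input line, suitable for pasting into a C or C++ source file.
inline std::string text_to_c_string_sketch(const std::string &text)
{
	std::istringstream in(text);
	std::string line;
	std::string out;
	while (std::getline(in, line)) {
		out += '"';
		for (char c : line) {
			if (c == '\\' || c == '"') {
				out += '\\';
			}
			out += c;
		}
		out += "\\n\"\n";
	}
	return out;
}
```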
While we're here, remove the multiple roots feature from the command
line synopsis, as we don't really support it any more. Also clarify
that "id 5" is "subvol id 5", and describe in one sentence what
workaround-btrfs-send does.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Silence the unused variable warning. The compiler is correct, but we
may implement line-level debug at some point in the future, so we
want to keep the member and parameters.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Remove unused function getenv_or_die. All of our environment variable
parameters are optional or have default values.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Get rid of unused template instantiation.
Drop the unused realtime signals from the ntoa table. If in the future
we really need to solve clang's issue with them, we'll address it then.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
A long time ago, when bees used dedicated threads to scan each subvol, the
calculation of the "dedup_unique_bytes" statistic was still wrong.
This stat can only be calculated when dedupe runs on extent data items
instead of extent reference items. Remove the stat variable until
that happens.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
There was a 4th tree mod log crash that showed up in testing. It can
be reproduced or eliminated by applying or reverting d2311e698578
("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
to a 5.4.x kernel before 5.4.54.
Unfortunately, the test can only run if several other patches that
fixed other bugs in d2311e698578 are applied or removed at the same time.
Commit d2311e698578 introduces a bug which destroys filesystems under test
long before tree mod log failures can be reproduced in testing. One of
those patches also fixes tree mod log issue #4. I do not know which one,
but since kernels after 5.1 cannot run without all of those patches, I do
not think it matters.
Tree mod log issue #4 is the reason why the tree mod log workaround is still
required on all kernels before 5.4. The issue still exists on older
LTS kernels, e.g. 4.9.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Rewrite the text related to 'btrfs send' to clarify that the send
workaround is no longer necessary to avoid kernel crashes, but still
useful because send and dedupe still do not work at the same time.
Replace "many backref code changes" with a specific commit reference,
and improve the grammar of some issue descriptions.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Apparently there's GitHub Flavored Markdown, and there's the markup
language that GitHub uses, and they are distinct things.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Present known kernel bugs in table form with issue descriptions,
fixed and broken kernel versions, and references to fixes.
Update kernel version recommendations to include information on kernel
versions up to 5.8.14.
Reduce emphasis on data corruption bugs which are 1) two or more
years old now, and 2) much less bad than the bugs in kernel 5.1.
Add deprecation warning for kernels before 4.15.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Prefer to use cmark-gfm with extension 'table' so we can use tables in
locally-generated HTML files. If cmark-gfm is not installed then
fall back to some other Markdown implementation, but the tables will
be broken on every other implementation I have tried so far.
Also make the HTML output depend on the Makefile, since there may be
document translation options specified there (like '-e table' or an
entirely different Markdown implementation).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
uncaught_exception() had only the one valid use case, and it can be
reimplemented by literally calling current_exception() instead.
current_exception() has several valid use cases, so it is not likely
to be deprecated any time soon.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We cannot use BeesContext::roots() until after
BeesContext::set_root_path() has been called.
Save up the parameter settings until then.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
"Storm of softlockups" starts with a simple BUG_ON, but after the
BUG_ON, all cores that are waiting on spinlocks get stuck.
The _first_ kernel call trace is required to identify the bug.
At least two such bugs have been identified.
Add some notes about the conflict between LOGICAL_INO and balance,
and the recently added bees workaround.
Update the gotchas page for balances to point to the kernel bugs page.
Remove "bees and the full balance will both work correctly" as that
statement is not true.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This avoids some kernel bugs. One of them is fixed in 5.3.4 and later:
efad8a853a "Btrfs: fix use-after-free when using the tree modification log"
There are apparently others in current kernels, so for now just put bees
on pause until the balance is done.
At some point we may want to provide an option to disable this
workaround; however, running bees and balance at the same time makes
neither particularly fast, so maybe we'll just leave it this way.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Saying just "This feature" at some log levels could be puzzling. Let's
remove this message; the feature has worked without problems for a year.
Signed-off-by: Kai Krakow <kai@kaishome.de>
In version 2.30 glibc added its own gettid() function. This resulted in
"error: call of overloaded ‘gettid()’ is ambiguous" because gettid()
now exists in both namespace crucible and the global namespace.
For now, use explicit references to namespace crucible. This continues
to work with new and old libc without having to test specific library
versions.
At some point, glibc gettid() will be deployed widely enough that we can
remove the crucible version entirely.
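
A minimal sketch of the fix on Linux, using a hypothetical namespace crucible_sketch in place of the real one. With a using-directive in effect, an unqualified gettid() call can collide with the declaration that newer glibc provides through <unistd.h>; the qualified call names exactly one function on old and new libc alike.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

namespace crucible_sketch {
	// Local wrapper around the raw system call, for libc versions
	// that do not provide gettid() themselves.
	inline pid_t gettid()
	{
		return static_cast<pid_t>(syscall(SYS_gettid));
	}
}

using namespace crucible_sketch;

inline pid_t current_tid_sketch()
{
	// An unqualified gettid() here could be ambiguous on newer glibc,
	// so name the namespace explicitly.
	return crucible_sketch::gettid();
}
```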
Signed-off-by: Zygo Blaxell <bees@furryterror.org>