Fd's cache does not handle changes in the state of its IOHandle parameter.
If we allow:
Fd f;
f->close();
then Fd ends up caching a pointer to a closed Fd, and will become very
badly confused if a new Fd appears with the same int identifier.
Fix by removing the close method.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Drop the ListType alias because we only use it once. Rename ListRep to
PoolRep to better reflect what it does.
We don't need the Pool to be available to handle destroyed Pool::Handle
objects. A weak_ptr in the Handle would detect the Pool has been
destroyed, so we don't need to track that ourselves. As a bonus, we can
destroy the PoolRep object as soon as the Pool has been destroyed, delayed
only if there is a Handle object currently executing its destructor.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
C99's "{ 0 }" notation for filling in a struct with all zeros was not
included in the C++11 standard, so gcc doesn't implement it and neither
does clang.
gcc does (did?) have issues with warnings on the same code in C99,
complaining about uninitialized struct members when "{0}" explicitly
initializes every member to a zero value. These issues don't apply in
the C++ code where NTOA_TABLE_ENTRY_END is used.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.
We were only using uuid to support a legacy feature that was removed
over four years ago.
Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs. Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The Linux kernel's btrfs headers are better than the libbtrfs-dev headers:
- the libbtrfs-dev headers have C++ language compatibility issues
- upstream version in Linux kernel is more accurate and up to date
- macros in libbtrfs-dev's ctree.h hide information that would
enable bees to perform runtime buffer length checking
- enum types whose presence cannot be detected with #ifdef
When accessing members of metadata items from the filesystem, we want
to verify that the member we are accessing is within the boundaries of
the item that was retrieved; otherwise, a memory access violation may
occur or garbage may be returned to the caller. A simple C++ template,
given a pointer to a structure member and a buffer, can determine that
the buffer contains enough bytes to safely access a struct member.
This was implemented back in 2016, but left unused due to ctree.h issues.
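
A minimal sketch of the kind of check described above. The names `demo_item` and `get_struct_member` are hypothetical, chosen for illustration; this is not the template used in bees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical item layout for illustration only; not a real btrfs structure.
struct demo_item {
    uint64_t generation;
    uint64_t flags;
    uint64_t optional_tail;   // absent when the stored item is truncated
};

// Return a copy of one member, after checking that the member's bytes lie
// entirely inside the buffer that was actually retrieved.
template <class Struct, class Member>
Member get_struct_member(Member Struct::*member, const std::vector<uint8_t> &buf)
{
    // Measure the member's offset on a local probe object, never on buf.
    static const Struct probe{};
    const char *base = reinterpret_cast<const char *>(&probe);
    const char *field = reinterpret_cast<const char *>(&(probe.*member));
    assert(field >= base);
    const std::size_t offset = static_cast<std::size_t>(field - base);

    // Only the accessed member must fit, not the entire structure.
    if (offset + sizeof(Member) > buf.size()) {
        throw std::out_of_range("struct member extends past end of buffer");
    }
    Member rv;
    std::memcpy(&rv, buf.data() + offset, sizeof(Member));
    return rv;
}
```

With a 16-byte buffer, `get_struct_member(&demo_item::flags, buf)` succeeds while `get_struct_member(&demo_item::optional_tail, buf)` throws, which is the behavior wanted for truncated variable-length items.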
Some btrfs metadata structures have variable length despite using a
fixed-size in-memory structure. The members that appear earliest in
the structure contain information about which following members of the
structure are used. The item stored in the filesystem is truncated after
the last used member, and all following members must not be accessed.
'btrfs_stack_*' accessor macros obscure the memory boundaries of the
members they access, which makes it impossible for a C++ template to
verify the memory access. If the template checks the length of the
entire structure, it will find an access violation for variable-length
metadata items because the item is rarely large enough for the entire
structure.
Get rid of all the libbtrfs-dev accessor macros and reimplement them
with the necessary buffer length checks.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Spanner<Iterator> turns a pair of pointers into a sequence container
with several of vector's methods.
A partial specialization of make_spanner is provided which uses
shared_ptr as the beginning of the range. Some of the Spanner code
is a questionable hack in support of this.
C++20 has ranges and span, but neither is worth moving the minimum
C++ standard forward.
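
The basic idea can be sketched as follows. `DemoSpanner` and `make_demo_spanner` are hypothetical names, the shared_ptr specialization is omitted, and the method set is only a guess at "several of vector's methods".

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>

// Non-owning view over [begin, end) with a few vector-like methods.
template <class Iterator>
class DemoSpanner {
    Iterator m_begin;
    Iterator m_end;
public:
    using reference = typename std::iterator_traits<Iterator>::reference;

    DemoSpanner(Iterator b, Iterator e) : m_begin(b), m_end(e) {}

    Iterator begin() const { return m_begin; }
    Iterator end() const { return m_end; }
    bool empty() const { return m_begin == m_end; }
    std::size_t size() const {
        return static_cast<std::size_t>(std::distance(m_begin, m_end));
    }

    // Unchecked access, like vector::operator[].
    reference operator[](std::size_t i) const {
        assert(i < size());
        return m_begin[i];
    }

    // Checked access, like vector::at.
    reference at(std::size_t i) const {
        if (i >= size()) throw std::out_of_range("DemoSpanner::at");
        return m_begin[i];
    }
};

template <class Iterator>
DemoSpanner<Iterator> make_demo_spanner(Iterator b, Iterator e)
{
    return DemoSpanner<Iterator>(b, e);
}
```

The view does not copy or own the underlying data, so writes through it are visible in the original storage, and resize() has no meaning for it.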
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When we are using non-copying containers, we can't call resize() on them.
get_struct_ptr is essentially a pointer cast, so we will end up with a
pointer to a struct that extends beyond the boundaries of the container.
As long as the btrfs metadata is not corrupted, we should not have too
many problems.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Use uint8_t when we mean uint8_t, i.e. vector<uint8_t> instead of
vector<char>.
Add a template parameter instead of vector so we can swap in a
non-copying data type.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Define a local copy of the header that has fields for the csum type
and length, so we can build in places that haven't caught up to kernel
5.5 headers yet.
The reason why the csum type and length are not unconditionally filled
in eludes me. csum_length is necessarily non-zero, and the cost of
the conditional is worse than the cost of the copy, so the whole flags
dance is a WTF...but it's part of the kernel API now, so it's too late
to NAK it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Rewrite Fd using a much simpler named resource template class with
a more straightforward derivation strategy.
Behavior change: we no longer throw an exception while calling get_fd()
on a closed Fd. This does not seem to bother any current callers except
for the tests.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
NamedPtr provides reference-counted handles to named objects. The object
is created the first time the associated name is used, and stored under
the associated name until the last handle is destroyed. NamedPtr may
itself be destroyed while handles are still active.
This template is intended to replace ResourceHandle with a more general
and less invasive implementation.
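
A rough sketch of the behavior described above, under assumed names (`DemoNamedPtr`, `size()`) and an assumed generator signature. It calls the generator while holding the lock, which a real implementation may prefer to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

template <class Key, class Value>
class DemoNamedPtr {
    struct Rep {
        std::mutex m_mutex;
        std::map<Key, std::weak_ptr<Value>> m_map;
    };
    std::shared_ptr<Rep> m_rep = std::make_shared<Rep>();
    std::function<std::shared_ptr<Value>(const Key &)> m_make;
public:
    explicit DemoNamedPtr(std::function<std::shared_ptr<Value>(const Key &)> make)
        : m_make(std::move(make)) {}

    // Return a handle to the object named by key, creating it on first use.
    std::shared_ptr<Value> operator()(const Key &key) {
        std::lock_guard<std::mutex> lock(m_rep->m_mutex);
        auto found = m_rep->m_map.find(key);
        if (found != m_rep->m_map.end()) {
            if (auto live = found->second.lock()) return live;
        }
        std::shared_ptr<Value> inner = m_make(key);
        assert(inner);
        std::weak_ptr<Rep> weak_rep = m_rep;
        // The handle's deleter forgets the name when the last handle dies.
        std::shared_ptr<Value> handle(inner.get(),
            [inner, weak_rep, key](Value *) mutable {
                if (auto rep = weak_rep.lock()) {
                    std::lock_guard<std::mutex> lock(rep->m_mutex);
                    auto it = rep->m_map.find(key);
                    // Erase only a dead entry; a newer object may own the name.
                    if (it != rep->m_map.end() && it->second.expired()) {
                        rep->m_map.erase(it);
                    }
                }
                inner.reset();   // destroy the object outside the lock
            });
        m_rep->m_map[key] = handle;
        return handle;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(m_rep->m_mutex);
        return m_rep->m_map.size();
    }
};
```

Because the deleter holds only a weak_ptr to the shared state, handles remain valid after the `DemoNamedPtr` itself is destroyed; they simply skip the map cleanup.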
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Use a single static variable located in the library, instead of
having a separate one for each compilation unit.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
std::list and std::map both have stable iterators, and list has the
splice() method, so we don't need a hand-rolled double-linked list here.
Coalesce insert() and operator() into a single function.
Drop the unused prune() method.
Move destructor calls for cached objects out from under the cache lock.
Closing a lot of files at once is already expensive; we might as well
not stop the world while we do it.
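
The approach can be sketched roughly as below. `DemoLRUCache` is a hypothetical, simplified class: it calls the generator function under the lock and leaves out most of what a real cache needs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <mutex>
#include <utility>

template <class Key, class Value>
class DemoLRUCache {
    using List = std::list<std::pair<Key, Value>>;
    std::function<Value(const Key &)> m_fn;
    std::size_t m_max_size;
    std::mutex m_mutex;
    List m_list;                                    // most recently used first
    std::map<Key, typename List::iterator> m_map;   // iterators stay valid
public:
    DemoLRUCache(std::function<Value(const Key &)> fn, std::size_t max_size)
        : m_fn(std::move(fn)), m_max_size(max_size) {}

    // Single entry point for both lookup and insert.
    Value operator()(const Key &key) {
        // Declared before the lock, so it is destroyed after the unlock:
        // destructors of evicted values run outside the cache lock.
        List doomed;
        std::unique_lock<std::mutex> lock(m_mutex);

        auto found = m_map.find(key);
        if (found != m_map.end()) {
            // Hit: relink the node at the head without invalidating iterators.
            m_list.splice(m_list.begin(), m_list, found->second);
            return found->second->second;
        }
        // Miss: erase from the tail first, then insert.
        while (!m_list.empty() && m_list.size() >= m_max_size) {
            auto last = std::prev(m_list.end());
            m_map.erase(last->first);
            doomed.splice(doomed.begin(), m_list, last);
        }
        m_list.emplace_front(key, m_fn(key));
        m_map[key] = m_list.begin();
        assert(m_map.size() == m_list.size());
        return m_list.front().second;
    }
};
```

splice() moves list nodes between lists without copying or destroying the elements, which is what allows the evicted values to outlive the critical section.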
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some versions of linux-libc header files define a macro named 'crc32c'.
We want to use that name too, so #undef it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Pool is a place to store shared_ptrs to generated objects (T) that are
too expensive to create and destroy between individual uses, such as
temporary files. Objects in a Pool have no distinct identity
(contrast with Cache or NamedPtr).
Users of the Pool invoke the Pool function call overload and "check out"
a shared_ptr<T> for a T object from the Pool. When the last referencing
shared_ptr<T> is destroyed, the T object is "checked in" to the Pool.
Each call of the Pool function overload checks out a shared_ptr<T> to a T
object that is not currently referenced by any other public shared_ptr<T>.
If there are no existing T objects in the Pool, a new T is constructed
by calling the generator function.
The clear() method destroys all checked in T objects owned by the Pool
at the time the method is called. T objects that are checked out are
not affected by clear(), and they will be stored in the Pool when they
are checked in.
If the checkout function is provided, it is called on a shared_ptr<T>
during checkout, before returning to the caller.
If the checkin function is provided, it is called on a shared_ptr<T>
before returning it to the Pool. The checkin function must not throw
exceptions.
The Pool may be destroyed while T objects are checked out of the Pool.
In that case, when the T objects are checked in, the T object is
immediately destroyed without calling the checkin function.
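
A simplified sketch of the checkout and checkin cycle, assuming hypothetical names (`DemoPool`, `idle_count()`) and omitting the optional checkout and checkin functions. The deleter may allocate when it returns an object to the list, which a production version would need to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <mutex>
#include <utility>

template <class T>
class DemoPool {
    struct Rep {
        std::mutex m_mutex;
        std::list<std::shared_ptr<T>> m_idle;   // checked-in objects
    };
    std::shared_ptr<Rep> m_rep = std::make_shared<Rep>();
    std::function<std::shared_ptr<T>()> m_generator;
public:
    explicit DemoPool(std::function<std::shared_ptr<T>()> generator)
        : m_generator(std::move(generator)) {}

    // Check out an object that no other caller currently holds.
    std::shared_ptr<T> operator()() {
        std::shared_ptr<T> owned;
        {
            std::lock_guard<std::mutex> lock(m_rep->m_mutex);
            if (!m_rep->m_idle.empty()) {
                owned = std::move(m_rep->m_idle.front());
                m_rep->m_idle.pop_front();
            }
        }
        if (!owned) owned = m_generator();
        assert(owned);
        std::weak_ptr<Rep> weak_rep = m_rep;
        // The public handle checks the object back in when the last copy dies.
        return std::shared_ptr<T>(owned.get(),
            [owned, weak_rep](T *) mutable {
                if (auto rep = weak_rep.lock()) {
                    std::lock_guard<std::mutex> lock(rep->m_mutex);
                    rep->m_idle.push_back(std::move(owned));
                }
                owned.reset();   // Pool is gone: destroy the object now
            });
    }

    // Destroy all checked-in objects; checked-out objects are unaffected.
    void clear() {
        std::list<std::shared_ptr<T>> doomed;
        std::lock_guard<std::mutex> lock(m_rep->m_mutex);
        doomed.swap(m_rep->m_idle);
    }

    std::size_t idle_count() {
        std::lock_guard<std::mutex> lock(m_rep->m_mutex);
        return m_rep->m_idle.size();
    }
};
```

The handle refers to the shared state through a weak_ptr, so destroying the pool while objects are checked out is safe: the later checkin finds no pool and destroys the object.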
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Perf blames this operator for >1% of instructions with -O2, and
70% of instructions without -O2.
Let the compiler inline the function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In version 2.30 glibc added its own gettid() function. This resulted in
"error: call of overloaded ‘gettid()’ is ambiguous" because gettid()
now exists in both namespace crucible and the global namespace.
For now, use explicit references to namespace crucible. This continues
to work with new and old libc without having to test specific library
versions.
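
An illustrative sketch of the explicit-qualification approach on Linux. The wrapper body and the `current_thread_id` helper are assumptions made for the example, not the crucible source.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

namespace crucible {
    // Thin wrapper over the raw system call (Linux-specific).
    inline pid_t gettid()
    {
        return static_cast<pid_t>(syscall(SYS_gettid));
    }
}

// Hypothetical caller. The explicit crucible:: qualifier picks the wrapper
// whether or not the installed libc also declares a gettid() of its own.
inline pid_t current_thread_id()
{
    const pid_t tid = crucible::gettid();
    assert(tid > 0);
    return tid;
}
```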
At some point, glibc gettid() will be deployed widely enough that we can
remove the crucible version entirely.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
CityHash64 appears to be the fastest available block hashing algorithm
that is good enough for dedupe. It takes much less CPU than the CRC64
function, and avoids hash-collision problems with file formats that use
CRC64 as an integrity check on 4K block boundaries.
Extracted from git://github.com/google/cityhash with the "CRC" hash
functions (which require Intel/AMD CPU support) removed. We don't
need those, and they introduce a new (if only theoretical) build-time
dependency.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
It is not possible to emulate extent-same by clone in a safe way.
EXTENT_SAME has been supported in btrfs since kernel 3.13, which
is much too old to contemplate running bees on.
Remove this dangerous and unused function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This is especially useful when dynamic load management allocates more
worker threads than active tasks, so the extra threads are effectively
invisible.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Enable much simpler Task management: each time a Task needs to be done
at least once in the future, simply invoke the run() method on the Task.
The Task will ensure that it only runs once, only appears in a queue
once, and will run again if a run request is made while the Task is
already running.
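
The run() contract above can be sketched as a small state machine. This is a single-threaded illustration with hypothetical names (`DemoTask`, `DemoQueue`, `run_next`); a real implementation would guard the flags and the queue with locks.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

class DemoTask;
using DemoQueue = std::deque<DemoTask *>;

class DemoTask {
    DemoQueue &m_queue;
    std::function<void()> m_fn;
    bool m_queued = false;      // currently waiting in the queue
    bool m_running = false;     // currently executing m_fn
    bool m_run_again = false;   // run() was requested while running
public:
    DemoTask(DemoQueue &queue, std::function<void()> fn)
        : m_queue(queue), m_fn(std::move(fn)) {}

    // Request that the task execute at least once in the future.
    void run() {
        if (m_running) { m_run_again = true; return; }
        if (m_queued) return;           // never appears in the queue twice
        m_queued = true;
        m_queue.push_back(this);
    }

    // Called by a worker after removing the task from the queue.
    void exec() {
        assert(!m_running);
        m_queued = false;
        m_running = true;
        m_fn();
        m_running = false;
        if (m_run_again) {
            m_run_again = false;
            run();                      // reschedule exactly once
        }
    }
};

// Minimal stand-in for a worker thread: run one queued task, if any.
inline bool run_next(DemoQueue &queue)
{
    if (queue.empty()) return false;
    DemoTask *task = queue.front();
    queue.pop_front();
    task->exec();
    return true;
}
```

Any number of run() calls before execution collapse into one queue entry, and any number during execution collapse into one rerun.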
Make the queue policy a member of the Task rather than a method. This
enables Tasks to reschedule themselves, possibly on the appropriate queue
if we have more than one of those some day.
This happens to make Tasks more similar to Linux kernel workers.
This similarity is coincidental, but not undesirable.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We need to replace nanosleeps with condition variables so that we
can implement BeesContext::stop. Export the time calculation from
sleep_for() into a new method called sleep_time().
If the thread executing RateLimiter::sleep_for() is interrupted, it will
no longer be able to restart, as the sleep_time() method is destructive.
This calls for further refactoring of sleep_time() into destructive
and non-destructive parts; however, there are currently no users of
sleep_for() which rely on being able to restart after being interrupted
by a signal.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a method to have TaskMaster discard any entries in its queue, terminate
all worker threads, and prevent any new Tasks from being queued.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we are not zero-filling containers then the overhead of allocating them
on each use is negligible. The effect that the thread_local containers
were having on RAM usage was very non-negligible.
Use dynamic containers (members or stack objects) for better control
of object lifetimes and much lower peak RAM usage. They're a tiny bit
faster, too.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Automatically fall back to LOGICAL_INO if LOGICAL_INO_V2 fails and no
_V2 flags are used.
Add methods to set the flags argument with build portability to older
headers.
Use thread_local storage for the somewhat large buffers used by
LOGICAL_INO_V2 (and other users of BtrfsDataContainer like INO_PATHS).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
ExtentWalker doesn't gain significant benefits from caching, and the
extra SEARCH_V2 ioctls were blamed for a 33% kernel CPU overhead by perf.
Reduce the number of extents to 16 in lieu of fixing the caching.
This gives a significant speed boost on CPU-bound workloads compared
to the original 1024--almost 40% faster on a single SSD with a filesystem
consisting of raw VM images mounted with compress=zstd.
This also seems to reduce LOGICAL_INO overhead. Perhaps SEARCH_V2 and
LOGICAL_INO were trying to lock the same extents, and interfering with
each other?
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The -g option limits the number of worker threads when the target load
average is exceeded. On some systems the load normally runs high, and
continuous bees operation is required to avoid running out of disk space.
Add a -G/--thread-min option to force at least some threads to continue
running.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add -g / --loadavg-target parameter to track system load and add or
remove bees worker threads dynamically to keep system load close to the
loadavg target. Thread count may vary from zero to the maximum
specified by -c or -C, and is adjusted every 5 seconds.
This is better than implementing a similar load average scheme from
outside of the process (though that is still possible) because the
in-process load tracker does not disrupt the performance timing feedback
mechanisms as a freezer cgroup or SIGSTOP would when controlling bees
from outside. The internal load average tracker can also adjust the
number of active threads while an external tracker can only choose from
the maximum or zero.
Also fix a bug where a Task could deadlock waiting for itself to exit
if it tries to insert a new Task after the number of worker threads has
been set to zero.
Also correct usage message for --scan-mode (values are 0..2) since
we are touching adjacent lines anyway.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
set() was broken and redundant. Calling hold() and discarding the
returned object has the correct effect.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The task queue can become very large with many subvols, requiring hours
for the queue to clear. Any 'beescrawl.dat' saves in the meantime record
the work currently scheduled, not the work currently completed.
Fix by tracking progress with ProgressTracker. ProgressTracker::begin()
gives the last completed crawl position. ProgressTracker::end() gives
the last scheduled crawl position. begin() does not advance if any
item between begin() and end() is not yet completed. In between
are crawled extents that are on the task queue but not yet processed.
The file 'beescrawl.dat' saves the begin() position while the extent
scanning task queue is fed from the end() position.
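
A simplified sketch of the begin()/end() bookkeeping. `DemoProgressTracker` and its `schedule()`/`complete()` methods are hypothetical, and the sketch assumes that resuming from begin() reprocesses the position at begin() itself.

```cpp
#include <cassert>
#include <set>

template <class T>
class DemoProgressTracker {
    std::multiset<T> m_in_flight;   // scheduled but not yet completed
    T m_begin;
    T m_end;
public:
    explicit DemoProgressTracker(const T &start)
        : m_begin(start), m_end(start) {}

    // Record a position handed to the task queue.
    void schedule(const T &pos) {
        m_in_flight.insert(pos);
        if (m_end < pos) m_end = pos;
    }

    // Record that the work at a position has finished.
    void complete(const T &pos) {
        auto found = m_in_flight.find(pos);
        assert(found != m_in_flight.end());
        m_in_flight.erase(found);       // erase one instance only
        // begin() may not pass the oldest position still in flight.
        const T limit = m_in_flight.empty() ? m_end : *m_in_flight.begin();
        if (m_begin < limit) m_begin = limit;
    }

    T begin() const { return m_begin; }   // safe position to save
    T end() const { return m_end; }       // position to schedule from
};
```

Completing items out of order does not move begin() past an unfinished item, so a saved begin() never skips scheduled work that has not run.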
Also remove an unused method crawl_state_get() and repurpose the
operator<(BeesCrawlState) that nobody was using.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Clearing the FD cache could trigger a lot of inode evicts in the kernel,
which will block the cache entry destructors called by map::clear().
This prevents any cache lookups or new file opens while it happens.
Move the map to an auto variable and destroy it after releasing the
mutex lock. This probably has the same net result (all the bees threads
will be blocked in the kernel instead of on a bees mutex), but at least
the problem is outside of userspace now.
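
The pattern can be sketched as follows, with a hypothetical `DemoFdCache` standing in for the real cache class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

class DemoFdCache {
    std::mutex m_mutex;
    std::map<int, std::shared_ptr<int>> m_map;
public:
    void insert(int key, std::shared_ptr<int> value) {
        assert(value);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_map[key] = std::move(value);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_map.size();
    }

    void clear() {
        // Local ("auto") map that takes over the entries.
        std::map<int, std::shared_ptr<int>> doomed;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            doomed.swap(m_map);     // cheap: no entry is destroyed here
        }
        // doomed is destroyed on return, after the lock was released, so
        // slow entry destructors no longer block lookups and inserts.
    }
};
```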
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The default constructor makes it more convenient to use Task as a
class member.
The ID is useful to disambiguate Task references.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
update_monotonic does not reset the counter if a new count is smaller than
earlier counts. Useful when consuming an unsorted stream of event counts.
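
The difference between the two update styles, in a minimal sketch. `DemoCounter` is hypothetical and models only the counter, not the rate estimation around it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

class DemoCounter {
    uint64_t m_last = 0;
public:
    // Plain update: a smaller value resets the counter.
    void update(uint64_t n) { m_last = n; }

    // Monotonic update: smaller values are ignored, so stale or
    // out-of-order counts cannot move the counter backward.
    void update_monotonic(uint64_t n) {
        m_last = std::max(m_last, n);
        assert(m_last >= n);
    }

    uint64_t count() const { return m_last; }
};
```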
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Perf was blaming more than 50% of cycles on TREE_SEARCH_V2. strace
showed 4 TREE_SEARCH_V2 calls for every pread in grow_backward().
Fix by increasing the extent fetch batch size so it is more likely
to include the desired items in the first fetch attempt.
This removes TREE_SEARCH_V2 from the top 10 list of cycle consumers.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
RateEstimator estimates the rate of external events by sampling a
counter.
Conversion functions are provided to predict the time when the
event counter will be incremented to particular values based on past
observations of the event counter.
Synchronization functions are provided to block a thread until a specific
counter value is reached.
Event polling is supported using the history of previous event counts
to determine the predicted time of the next event. A decay function
emphasizes more recent event history.
Polling delays are bounded by minimum and maximum values in the constructor
parameters.
wait_for() and wait_until() block the calling thread until the target
event count is reached (or the counter is reset). These functions are
not bounded by min_delay or max_delay, and require a separate thread
to call update(). wait_for() waits for the counter to be incremented
from its current value by the given count. wait_until() waits for the
counter to reach an absolute value.
update() counts external events and unblocks threads that are blocked
in wait_for() or wait_until(). If the event counter decreases then it
is reset to the new value.
duration() and time_point() convert relative and absolute event counts
into relative and absolute C++11 time quantities based on the last update
time, last observed event count, and the observed event rate.
Convenience functions seconds_for() and seconds_until() calculate
polling delays for the desired relative and absolute event counts
respectively. These delays are bounded by max and min delay parameters.
rate() and ratio() provide conversion factors based on the current
estimated event rate.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Since we are now unconditionally rendering the print_fn as a static
string, there is no need for it to be a function. We also need it to
be brief and mostly constant.
Use a string instead. Put the string before the function in the Task
constructor arguments so that the title string appears as a heading in
code, since we are making a breaking API change already.
Drop TASK_MACRO as it is broken by this change, but there is no similar
usage of Task anywhere to make it worth fixing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This enables bees' thread introspection to use task descriptions in
status and log messages.
BeesNote will be calling Task::current_task() from non-Task contexts,
which means we need to allow Task's shared state pointer to be null.
Remove some asserts that will ruin our day in that case.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit adds log levels to the output. Under systemd, this produces
colored lines; elsewhere it's probably just a number. Bees is very chatty, so
this paves the road for log level filtering.
Signed-off-by: Kai Krakow <kai@kaishome.de>
We need a better cache expiration algorithm than "make a copy of
the entire thing, sort it while holding a lock, and delete half
the items in a single burst."
Replace the Lamport clock with a double-linked list. Each insert
or lookup operation moves the affected item to the head of the list.
Each erase operation deletes one single item at the tail of the list.
Also sort out some iterator invalidation nonsense by doing erases before
inserts instead of "insert, erase, find the inserted item again because
we invalidated the found iterator during the erase."
The new implementation adds a second word-sized member to each Value
as well as a copy of the Key. Hopefully the enlarged size is not
a deal-breaker.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We need a mechanism for distributing work across processor cores and
disks.
Task implements a simple FIFO/LIFO queue model for executing closures.
Some locking primitives are included (mutex and barrier).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We were holding weak refs until the next time the resource ID was used.
This is a bad thing if resource IDs are sparse (e.g. pointers or hashes)
because we'll never see an ID twice.
To fix, determine whether we released the last instance of a resource,
and if so, free its weak ref immediately.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The bugs in other parts of the code have been identified and fixed,
so the overprotective locks around shared_ptr can be removed.
Keep the other improvements to the Resource class.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>