Another hit from the exotic compiler collection: build fails on GCC 9,
from Ubuntu 20...but not later versions of GCC.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Fix the missing symbols that popped up when adding chunk tree to
lib/fs.cc. Also define the missing symbols instead of merely trying to
avoid them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This allows plugging in an ostream at run time so that we can audit all
the search calls we are doing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BtrfsFsTreeFetcher was used for early versions of the extent scanner, but
neither subvol nor extent scan now needs an object that is both persistent
and configured to access only one subvol. BtrfsExtentDataFetcher does
the same thing in that case.
Clarify the comments on what the remaining classes do, so that
BtrfsFsTreeFetcher doesn't get inadvertently reinvented in the future.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Binary searches can be extremely slow if the target bytenr is near a
metadata block group, because metadata items are not visible to the
binary search algorithm. In a non-mixed-bg filesystem, there can be
hundreds of thousands of metadata items between data extent items, and
since the binary search algorithm can't see them, it will run searches
that iterate over hundreds of thousands of objects about a dozen times.
This is less of a problem for mixed-bg filesystems because the data and
metadata blocks are not isolated from each other. The binary search
algorithm still can't see the metadata items, but there are usually
some data items close by to prevent the linear item filter from running
too long.
Introduce a new fetcher class (all the good names were taken) that tracks
where the end of the current block group is. When the end of the current
block group is reached in the linear search, skip ahead to a block group
that can contain data items.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Enable use of the ioctl to probe whether two fds refer to the same btrfs,
without throwing an exception.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The debug log is only revealed when something goes wrong, but it is
created and discarded every time `seek_backward` is called, and it
is quite CPU-intensive.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Task started out as a self-organizing parallel-make algorithm, but ended
up becoming a half-broken wait-die algorithm. When a contended object
is already locked, Tasks enter a FIFO queue to restart and acquire the
lock. This is the "die" part of wait-die (all locks on an Exclusion are
non-blocking, so no Task ever does "wait"). The lock queue is FIFO wrt
_lock acquisition order_, not _Task age_ as required by the wait-die
algorithm.
Make it a 25%-broken wait-die algorithm by sorting the Tasks on lock
queues in order of Task ID, i.e. oldest-first, or FIFO wrt Task age.
This ensures the oldest Task waiting for an object is the one to get
it when it becomes available, as expected from the wait-die algorithm.
This should reduce the amount of time Tasks spend on the execution queue,
and reduce memory usage by avoiding the accumulation of Tasks that cannot
make forward progress.
Note that turning `TaskQueue` into an ordered container would have
undesirable side-effects:
* `std::list` has some useful properties wrt stability of object
location and cost of splicing. Other containers may not have these,
and `std::list` does have a `sort` method.
* Some Task objects are created at the beginning and reused continually,
but we really do want those Tasks to be executed in FIFO order wrt
submission, not Task ID. We can exclude these tasks by only doing the
sorting when a Task is queued for an Exclusin object.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Since we're now using weak symbols for dodgy libc functions, we might
as well do it for gettid() too.
Use the ::gettid() global namespace and let libc override it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
openat2 allows closing more TOCTOU holes, but we can only use it when
the kernel supports it.
This should disappear seamlessly when libc implements the function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
* Allow RateLimiter to change rate after construction.
* Check range of rate argument in constructor.
* Atomic increment for RateEstimator.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
hexdump was moved into a template in its own header years ago, but
the declaration of the implementation that used to be in fs.cc remains.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
hexdump processes a vector as a contiguous sequence of bytes, regardless
of V's value type, so hexdump should get a pointer and use uint8_t to
read the data.
Some vector types have a lock and some atomics in their operator[], so
let's avoid hammering those.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
operator<< was a friend class that locked the ByteVector, then invoked
hexdump on the bytevector, which used ByteVector::operator[]...which
locked the ByteVector, resulting in a deadlock.
operator<< shouldn't be a friend class anyway. Make hexdump use the
normal public access methods for ByteVector.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a second level queue which is only serviced when the local and global
queues are empty.
At some point there might be a need to implement a full priority queue,
but for now two classes are sufficient.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This should help clean up some of the uglier status outputs.
Supports:
* multi-line table cells
* character fills
* sparse tables
* insert, delete by row and column
* vertical separators
and not much else.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a master switch to turn off the entire MultiLock infrastructure for
testing, without having to remove and add all the individual entry points.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
try_lock allows specification of a different Task to be run instead of
the current Task when the lock is busy.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some malloc implementations will try to mmap() and munmap() large buffers
every time they are used, causing a severe loss of performance.
Nothing ever overrode the virtual methods, and there was no virtual
destructor, so they cause compiler warnings at build time when used with
a template that tries to delete pointers to them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
ProgressTracker was only freeing memory for work items when they reach
the head of the work tracking queue. If the first work item takes
hours to complete, and thousands of items are processed every second,
this leads to millions of completed items tracked in memory at a time,
wasting gigabytes of system RAM.
Rewrite ProgressHolderState methods to keep only incomplete work items
in memory, regardless of the order in which they are added or removed.
Also fix the unit tests which were relying on the memory leak to work,
and add test cases for code coverage.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Another instance of the pattern where we derived a crucible class
from a btrfs struct. Make it an automatic variable instead.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This was fixed in
7f660f50b lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks
but apparently some copies live on.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We only use BtrfsExtentInfo when it's exactly equivalent to the
base, so drop the derived class.
While we're here, fix BtrfsExtentSame::add so it uses a btrfs-compatible
uint64_t instead of an off_t.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
It turns out I've been using pthread_setname_np wrong the whole time:
* on Linux, the thread name length is 15 characters.
TASK_COMM_LEN is 16 bytes, and the last one is always 0.
This is now hardcoded in many places and cannot be changed.
* pthread_setname_np doesn't return -errno, so DIE_IF_MINUS_ERRNO
was the wrong macro. On the other hand, we never want to do anything
differently when pthread_setname_np fails, so we never needed to
check the return value.
Also, libc silently ignores attempts to set the thread name when it is too
long. That's almost certainly a libc bug, but libc probably suppresses
the error result for the same reasons I ignore the error result.
Wrap the pthread_setname function with a C++ std::string overload that
truncates the argument at 15 characters, so we at least get the first
part of the task name in the thread name field. Later commits can deal
with making the bees thread names shorter.
Also wrap pthread_getname for symmetry.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
btrfs-tree provides classes for low-level access to btrfs tree objects.
An item class is provided to decode polymorphic btrfs item fields.
Several tree classes provide forward and backward iteration over raw
object items at different tree levels.
A csum tree class provides convenient access to csums by bytenr,
supporting all current btrfs csum types.
Wrapper classes for inode and subvol items provide direct access to
btrfs metadata fields without clumsy stat() wrappers or ioctls.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This template turns a forward search primitive (e.g. lower_bound, FIEMAP,
TREE_SEARCH_V2) into a backward search primitive.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We are using ByteVectors from multiple threads in some cases. Mostly
these are the status and progress threads which read the ByteVector
object references embedded in BEESNOTE macros.
Since it's not clear what the data race implications are, protect
the shared_ptr in ByteVector with a mutex for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
pause(true) stops the TaskMaster from processing any more Tasks,
but does not destroy any queued Tasks.
pause(false) re-enables Task processing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Exclusion was generating a new Task every time a lock was contended.
That results in thousands of empty Task objects which contain a single
Task item.
Get rid of ExclusionState. Exclusion is now a simple weak_ptr to a Task.
If the weak_ptr is expired, the Exclusion is unlocked. If the weak_ptr
is not expired, it points to the Task which owns the Exclusion.
try_lock now appends the Task attempting to lock the Exclusion directly
to the owning Task, eliminating the need for Exclusion to have one.
This also removes the need to call insert_task separately, though
insert_task remains for other use cases.
With no ExclusionState there is no need for a string argument to
Exclusion's constructor, so get rid of that too.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Make one class Barrier which is copiable, so we don't have to
have users making shared Barrier all the time.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
For performance or workaround reasons we sometimes have to avoid doing
two conflicting operations at the same time, but we can still run any
number of non-conflicting operations in parallel.
MultiLocker (suggestions for a better class name welcome) blocks the
calling thread until there are no threads attempting to run a conflicting
operation.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Document the overall purpose of the class and what some of the methods do,
particularly the ones with terrible names like 'insert_item' (which only
inserts an item after calling the Function).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We already had a function that was _similar_, so add decoding for compress
type NONE, give it a less specific name, and declare it in fs.h.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
It really needs to be uint64_t, but at least it now doesn't contradict
the definition in the earlier header.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Yet another build failure of the form:
error: flexible array member fiemap... not at end of struct crucible::Fiemap...
bees doesn't use fiemap any more, so the fixes here are minimal changes
to make it build, not shining examples of C++ class design.
Signer-off-by: Zygo Blaxell <bees@furryterror.org>