1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

193 Commits

Author SHA1 Message Date
Zygo Blaxell
a85ada3a49 task: export load tracking statistics
Provide an interface so that programs can monitor the Task load
average calculations.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
46a38fe016 task: rescue post-exec queue on Task destruction
task1.append(task2) is supposed to run task2 after task1 is executed;
however, if task1 was just executed, and its last reference was owned by
a TaskConsumer, then task2 will be appended to a Task that will never
run again.

A similar problem arises in Exclusion, which can cause blocked tasks
to occasionally be dropped without executing them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
2aafa802a9 task: increase saved thread name length to 64
24 bytes seems a little low.  64 is a rounder (and more square) number.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
cdef59e2f3 task: add more Doxygen comments for PairLock
I need to remind myself why it's there, and not just std::lock.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
dc2dc8d08a task: delete the queue after deleting all of its children
This was resulting in an assertion failure later on if a queue was
being rescued from a deleted task with only one post-exec queue.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
7873988dac task: add a pause() method as an alternative to cancel()
pause(true) stops the TaskMaster from processing any more Tasks,
but does not destroy any queued Tasks.

pause(false) re-enables Task processing.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:57 -05:00
Zygo Blaxell
3f740d6b2d task: simplify clear_queue
Simplify the loop in clear_queue because we can't be modifying a
queue while we are clearing it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:56 -05:00
Zygo Blaxell
c0a7533dd4 task: use const for current_consumer
The const version of this code has much more testing, but any
effect at run time is unlikely.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:56 -05:00
Zygo Blaxell
090fa39995 task: don't hold the mutex while disposing of pending Tasks
In the event that someday Barrier allows users to force execution of
its pending tasks prior to the destruction of the BarrierState object,
we'll be ready to submit those Tasks for execution without waiting for
the BarrierState mutex lock.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:56 -05:00
Zygo Blaxell
2f25f89067 task: get rid of separate Exclusion and ExclusionState
Exclusion was generating a new Task every time a lock was contended.
That results in thousands of empty Task objects which contain a single
Task item.

Get rid of ExclusionState.  Exclusion is now a simple weak_ptr to a Task.
If the weak_ptr is expired, the Exclusion is unlocked.  If the weak_ptr
is not expired, it points to the Task which owns the Exclusion.

try_lock now appends the Task attempting to lock the Exclusion directly
to the owning Task, eliminating the need for Exclusion to have one.
This also removes the need to call insert_task separately, though
insert_task remains for other use cases.

With no ExclusionState there is no need for a string argument to
Exclusion's constructor, so get rid of that too.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:56 -05:00
Zygo Blaxell
7fdb87143c task: get rid of the separate Barrier and BarrierLock
Make one class Barrier which is copiable, so we don't have to
have users making shared Barrier all the time.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:55 -05:00
Zygo Blaxell
4a4a2de89f multilocker: serialize conflicting parallel operations
For performance or workaround reasons we sometimes have to avoid doing
two conflicting operations at the same time, but we can still run any
number of non-conflicting operations in parallel.

MultiLocker (suggestions for a better class name welcome) blocks the
calling thread until there are no threads attempting to run a conflicting
operation.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:54 -05:00
Zygo Blaxell
30ece57116 fs: export btrfs_compress_type_ntoa
We already had a function that was _similar_, so add decoding for compress
type NONE, give it a less specific name, and declare it in fs.h.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:54 -05:00
Zygo Blaxell
331cb142e3 fs: make dedupe work again after a really unfortunate build fix
In commit 14ce81c08 "fs: get rid of silly base class that causes build
failures now" I neglected to set the dest_count field in the ioctl
arg structure, so bees master hasn't been deduping anything for about
three weeks.

I'd put a THROW_CHECK in here to catch this kind of bug in the future,
but it would be placed at exactly the point where this fix is.

Fixes: 14ce81c08
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-11-05 13:43:21 -04:00
Zygo Blaxell
5953ea6d3c fs: update btrfs compatibility header: add csum types, BTRFS_FS_INFO_FLAG_GENERATION and _METADATA_UUID
I guess this means it's "args_v3" now?

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-25 12:56:16 -04:00
Zygo Blaxell
972721016b fs: get rid of base class fiemap
Yet another build failure of the form:

	error: flexible array member fiemap... not at end of struct crucible::Fiemap...

bees doesn't use fiemap any more, so the fixes here are minimal changes
to make it build, not shining examples of C++ class design.

Signer-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-25 12:56:16 -04:00
Zygo Blaxell
5040303f50 fs: get rid of base class btrfs_data_container
This fixes another build failure of the form:

	error: flexible array member btrfs_... not at end of struct crucible::Btrfs...

Fixes: https://github.com/Zygo/bees/issues/236
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-23 22:42:57 -04:00
Zygo Blaxell
587588d53f bytevector: fix length check
ByteVectors, and shared subranges thereof, might be empty.  The parameter
check should allow that.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-10 17:40:33 -04:00
Zygo Blaxell
14ce81c081 fs: get rid of silly base class that causes build failures now
The base class thing was an ugly way to get around the lack of C99
compound literals in C++, and also to make the bare ioctls usable with
the derived classes.

Today, both clang and gcc have C99 compound literals, so there's no need
to do crazy things with memset.  We never used the derived classes for
ioctls, and for this specific ioctl it would have been a very, very bad
idea, so there's no need to support that either.  We do need to jump
through hoops for ostream& operator<<() but we had to do those anyway
as there are other members in the derived type.

So we can simply drop the base class, and build the args object on the
stack in `do_ioctl`.  This also removes the need to verify initialization.

There's no bug here since the `info` member of the base class was
never used in place by the derived class, but new compilers reject the
flexible array member in the base class because the derived class makes
`info` be not at the end of the struct any more:

	error: flexible array member btrfs_ioctl_same_args::info not at end of struct crucible::BtrfsExtentSame

Fixes: https://github.com/Zygo/bees/issues/232
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-10-09 20:39:15 -04:00
Zygo Blaxell
26acc6adfd bytevector: introduce BEES_VALGRIND to help work around valgrind
valgrind doesn't understand ioctl arguments, so it does not know if
or when they initialize memory, and it complains about conditionals
depending on data that comes out of ioctls.  That's a problem for bees,
where every decision we ever make is based on data an ioctl gave us.

Fix the initialization issue by using calloc instead of malloc for
ByteVectors when we are building for valgrind.  Don't enable this by
default because all the callocs aren't necessary (assuming the rest
of the code is correct) and hurt performance.

Define BEES_VALGRIND in localconf to activate, e.g.

	echo CCFLAGS += -DBEES_VALGRIND=1 >> localconf

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
d0c35b4734 fs: yet another const
References to the search key do not need to be modified.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
Zygo Blaxell
5d7e815eb4 lib: add Uname, a constructor for utsname
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
0103a04ca0 task: concurrency cleanups
Update thread_local task state pointers while locked.  This avoids
potential concurrent access of the pointers while making copies of them.

Verify that the queue is really empty after splicing lists, and the
current consumer is really gone after swapping the empty one.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
5e346beb2d task: delete the move constructor for TaskState
Move-constructing isn't good for that class either.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
6325f9ed72 lib: deprecate memset_zero template, use C99 compound literals instead
Sprinkle in some asserts to make sure compilers aren't getting creative.

This may introduce a new compiler dependency, as I suspect older versions
of GCC don't support this syntax.

It definitely needs a new compiler flag to suppress a warning when some
fields are not explicitly initialized.  If we've omitted a field, it's
because it's a field we don't know (or care) about, and we want that
thing initialized to zero.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
95347a08bb fd: better error messages for pread/pwrite
Include file name and offset.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
b828f14dd1 task: optimize for common case of single following Task
If there is only one Task in the post exec queue, we can
simply insert that Task instead of creating a task to hold
a post exec queue of one item.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
a60c53a9e1 fs: dump the TREE_SEARCH_V2 parameters on exception
The current error message is useless.  At least say which tree we were
searching.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
fcd847bbf9 fs: add an item type parameter to next_min
When we are searching the btrfs metadata trees, we usually want only
one type of item.  If the last item in a search result is not of the
desired type, we can restart the search at the next possible key with
that item type, potentially skipping over some uninteresting items we
would otherwise have to fetch, process, and discard.

Also remove a bug in the previous next_min code that would skip over
items if the offset overflowed and the next objectid in the tree had a
lower item type number than the previous objectid.  This doesn't seem
to be a bug that has ever happened, as it would require a file to roll
over in the offset field.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:56:04 -04:00
Zygo Blaxell
55dc98e21a fd: finish deprecating vector<uint8_t> in IO wrapper functions
We can simply remove the template specializations, but if we do that, then
existing code might accidentally write out the vector<uint8_t> struct.

Prevent regressions by deleting the vector specializations, making any
code that uses them fail to build.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
99709d889f fd: start deprecating vector<uint8_t> for p{read,write}_or_die
Add support for pread and pwrite of ByteVector objects alongside
vector<uint8_t>.  A later commit will delete the template specializations
for vector<uint8_t>, but existing users have to be updated to use
ByteVector first.

Nothing currently uses vector<char>, so we can delete that immediately.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
bba6f4f183 fs: convert vector<uint8_t> and Spanner to ByteVector and rewrite TREE_SEARCH_V2 wrapper
Switch various methods in fs to use ByteVector to cut down on the number
of slow allocations and copies.

Automatically determine the correct size for TREE_SEARCH_V2 buffers
based on the number of items requested, and grow the buffer as needed.
This eliminates the need to cache some objects that were heavy to create.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
daf8a2cde1 extentwalker: use default sizing of TREE_SEARCH_V2 buffers
Now that we can guess the size more or less automatically, there's
no need to make it unnecessarily large.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
f0eb9b202f lib: introduce ByteVector as a replacement for vector<uint8_t> and Spanner
After some benchmarking, it turns out that std::vector<uint8_t> is
about 160 times slower than malloc().  malloc() is faster than "new
uint8_t[]" too.  Get rid of std:;vector<uint8_t> and replace it with
a lightweight wrapper around malloc(), free(), and memcpy().

ByteVector has helpful methods for the common case of moving data to and
from ioctl calls that use a fixed-length header placed contiguously with a
variable-length input/output buffer.  Data bytes are shared between copied
ByteVector objects, allowing a large single buffer to be cheaply chopped
up into smaller objects without memory copies.  ByteVector implements the
more useful parts of the std::vector API, so it can replace std::vector
objects without needing an awkward adaptor class like Spanner.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-31 19:42:01 -04:00
Zygo Blaxell
a9cd19a5fe fs: avoid unaligned access when copying btrfs search headers
The assignment operator will use member-wise assignment, which
assumes the object's this pointer is aligned.  That doesn't
happen when the object in question is part of a btrfs search
result, and aarch64 faults over it.

Use memcpy instead, which has no alignment constraints.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-04 20:19:00 -04:00
Zygo Blaxell
955b8ae459 task: set the name of consumer threads so it is not "load_tracker"
The default name of a newly constructed thread is apparently the name
of the thread that created it.  That's very misleading when there are
a lot of TaskConsumer threads and they have nothing to do, so set the
name of each TaskConsumer thread as soon as it is created.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 21:02:00 -04:00
Zygo Blaxell
6adaedeecd extentwalker: fix the binary search and add some debug infrastructure
Add some conditionally-compiled debug code, including an in-memory log
of what ExtentWalker does.  Dump that log on exceptions.

If we loop too many times in a debug build, kill the process so we can
stack trace.  In non-debug builds just throw a normal exception.

Grow the step size instead of shrinking it, to reduce the number of
binary search iterations.

Prevent a bug where the step size bottoms out before positioning the
target extent in the middle of the result vector.

Use the first extent for "first_extent", instead of the 3rd.

Get rid of some redundant checks.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
54f03a0297 extentwalker: fix missing characters
"C" in LOGICAL_INO, and avoid writing "flags=" in the log.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
52279656cf extentwalker: fix the hole position logic
When a file ends with a hole, ExtentWalker synthesizes a hole extent record
to cover the distance between the last ipos and EOF.  Unfortunately, ipos
was incremented by the number of items in the result vector instead.  Fix
that by incrementing by hole_extent.size().

While we're here, fix up some of the other data quality logic, including
a useless THROW_CHECK that was nothing but workarounds for earlier bugs.

Fixes: https://github.com/Zygo/bees/issues/26
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
5f763f6d41 task: handle thread lifecycle more strictly
Testing sometimes crashes during exec of the first Task object, which
triggers construction of TaskConsumer threads.  Manage the life cycle
of the thread more strictly--don't access any methods of TaskConsumer
or std::thread until the constructor's caller's lock on TaskMaster
is released.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
0928362aab task: replace waiting state with run/exec counter
Task::run() would schedule a new execution of Task, unless it was waiting
on a queue for execution.  This cannot be implemented with a bool,
since a Task might be included in multiple queues, and should still be
in waiting state even when executed in that case.

Replace the bool with a counter.  run() and append() (but not
append_nolock) increment the counter, exec() decrements the counter.
If the counter is non-zero when run() or append() is called, the Task
is not scheduled.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
d5ff35eacf task: track number of Task objects in program and provide report
This is a simple lightweight counter that tracks the number of Task
objects that exist.  Useful for leak detection.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
b7f9ce3f08 task: serialize Task execution when Tasks block due to mutex contention
Quite often we want to execute task B after task A finishes executing,
especially if tasks A and B attempt to acquire locks on the same objects.

Implement that capability in Task directly:  each Task holds a queue
of Tasks which will be executed strictly after this Task has finished
executing, or if the Task is destroyed.

Add a local queue to each TaskConsumer.  This queue contains a list
of Tasks which are to be executed by a single thread in sequential
order.  These tasks are executed before fetching any tasks from
TaskMaster.

Each time a Task finishes executing, the list of tasks appended to the
recently executed Task are spliced at the beginning of the thread's
TaskConsumer local queue.  These tasks will be executed in the same
thread in the same order they were appended to the recently executed Task.

If a Task is destroyed with a post-execution queue, that queue is
also inserted at the front of the current TaskConsumer's local queue.

If a Task is destroyed or somehow executed outside of a TaskConsumer
thread, or a TaskConsumer thread is destroyed, the local queue of Tasks
is wrapped in a "rescue_task" Task, and spliced before the head of the
global queue.  This preserves the sequential ordering of tasks.

In all cases the order of sequential execution of Tasks that are
appended to another Task is preserved.

The unused queue insertion functions are removed.

Exclusion is now simply a mutex, a bool, and a Task with an empty
function.  Tasks that queue up waiting for the mutex are stored in
Exclusion's Task, and Exclusion simply runs that task when the
ExclusionState is released.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
06a46e2736 chatter: add option to remove log level prefix
Some projects use only one log level, so there is no need to repeat it
for every line.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
e4c95d618a crucible: use '#include "crucible/...' everywhere
Make the #include syntax more consistent (even if it has no effect).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
032c740678 process: SIGCLD is not portable
MUSL libc doesn't have it, for instance.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
f8a8704135 ntoa: fix bits_ntoa formatting and error handling
Get rid of an assert in bits_ntoa.  Throw an exception instead.

Fix hex formatting (adding "0x" before a decimal number is not
the correct way to format hex strings).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
bcf3e7de3e uuid: drop dependency on uuid.h
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.

We were only using uuid to support a legacy feature that was removed
over four years ago.

Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs.  Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:16:50 -04:00
Zygo Blaxell
7f660f50b8 lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks
The Linux kernel's btrfs headers are better than the libbtrfs-dev headers:

	- the libbtrfs-dev headers have C++ language compatibility issues

	- upstream version in Linux kernel is more accurate and up to date

	- macros in libbtrfs-dev's ctree.h hide information that would
	enable bees to perform runtime buffer length checking

	- enum types whose presence cannot be detected with #ifdef

When accessing members of metadata items from the filesystem, we want
to verify that the member we are accessing is within the boundaries of
the item that was retrieved; otherwise, a memory access violation may
occur or garbage may be returned to the caller.  A simple C++ template,
given a pointer to a structure member and a buffer, can determine that
the buffer contains enough bytes to safely access a struct member.
This was implemented back in 2016, but left unused due to ctree.h issues.

Some btrfs metadata structures have variable length despite using a
fixed-size in-memory structure.  The members that appear earliest in
the structure contain information about which following members of the
structure are used.  The item stored in the filesystem is truncated after
the last used member, and all following members must not be accessed.

'btrfs_stack_*' accessor macros obscure the memory boundaries of the
members they access, which makes it impossible for a C++ template to
verify the memory access.  If the template checks the length of the
entire structure, it will find an access violation for variable-length
metadata items because the item is rarely large enough for the entire
structure.

Get rid of all the libbtrfs-dev accessor macros and reimplement them
with the necessary buffer length checks.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:06:43 -05:00
Zygo Blaxell
6eb7afa65c build: include localconf everywhere
Overriding makeflags did not work from localconf in the src, lib, or
test directories.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:06:43 -05:00