Like filefrag, fiemap was defaulting to FIEMAP_FLAG_SYNC, and providing no
option to turn it off. This prevents observation of delayed allocations,
making fiemap less useful.
Override the default flag setting so fiemap gets the current
(i.e. unflushed) extent map state.
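A minimal sketch of the flag choice, assuming the Linux <linux/fiemap.h>
header. The helper name make_fiemap_buffer and its parameters are
hypothetical and only illustrate the difference between requesting a
sync and leaving fm_flags at zero:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>
#include <linux/fiemap.h>

// Hypothetical helper (not from the bees sources): build a FIEMAP request
// buffer with room for max_extents extent records after the header.
// With sync == false, fm_flags stays 0, so the kernel is not asked to
// flush dirty data before reporting the extent map.
inline std::vector<uint8_t> make_fiemap_buffer(uint64_t start, uint64_t length,
	uint32_t max_extents, bool sync)
{
	std::vector<uint8_t> buf(sizeof(struct fiemap)
		+ static_cast<std::size_t>(max_extents) * sizeof(struct fiemap_extent));
	struct fiemap header;
	std::memset(&header, 0, sizeof(header));
	header.fm_start = start;
	header.fm_length = length;
	header.fm_flags = sync ? FIEMAP_FLAG_SYNC : 0;
	header.fm_extent_count = max_extents;
	std::memcpy(buf.data(), &header, sizeof(header));
	return buf;
}
```

The resulting buffer would then be passed to
ioctl(fd, FS_IOC_FIEMAP, buf.data()).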
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
LOGICAL_INO_V2 has a limit of 655050 references per extent.
Although it no longer has a crippling performance problem, at roughly
two seconds to process such an extent, it's still too slow to be useful.
When an extent gains an absurd number of references, stop making any
more. Returning zero extent refs will make bees believe the extent
was deleted, and it will remove the block from the hash table.
This helps speed processing of highly duplicated large files like
VM images, at the cost of a slightly lower dedupe hit rate.
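The cutoff can be sketched roughly as follows. ExtentRef, limit_refs and
the max_refs parameter are hypothetical names for illustration; the
actual threshold is not stated here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one reference returned by a backref lookup.
struct ExtentRef {
	uint64_t root;
	uint64_t inode;
	uint64_t offset;
};

// If the extent has more references than max_refs, report none at all.
// The caller then treats the extent as deleted and stops adding
// references to it.
inline std::vector<ExtentRef> limit_refs(std::vector<ExtentRef> refs, std::size_t max_refs)
{
	if (refs.size() > max_refs) {
		refs.clear();
	}
	return refs;
}
```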
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The default name of a newly constructed thread is apparently the name
of the thread that created it. That's very misleading when there are
a lot of TaskConsumer threads and they have nothing to do, so set the
name of each TaskConsumer thread as soon as it is created.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In 15ab981d9e "bees: replace uncaught_exception(), deprecated in C++17",
uncaught_exception() was replaced with current_exception(); however,
current_exception() is only valid after an exception has been captured
by a catch block.
BeesTracer wants to know about exceptions _before_ they are caught,
so current_exception() is not useful here.
Instead, conditionally compile using uncaught_exception() or
uncaught_exceptions(), selected by C++ standard version, and make
bees stack traces work again.
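A minimal sketch of the conditional compilation, assuming a tracer object
whose destructor runs during stack unwinding. exception_in_flight,
ScopeTracer and tracer_sees_throw are hypothetical names, not the
BeesTracer code itself:

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// True while an exception has been thrown but not yet caught.
inline bool exception_in_flight() noexcept
{
#if __cplusplus >= 201703L
	return std::uncaught_exceptions() > 0;
#else
	return std::uncaught_exception();
#endif
}

// Hypothetical tracer: records whether it was destroyed by unwinding.
class ScopeTracer {
	bool &m_saw_exception;
public:
	explicit ScopeTracer(bool &flag) : m_saw_exception(flag) {}
	~ScopeTracer()
	{
		if (exception_in_flight()) {
			m_saw_exception = true;
		}
	}
};

// Returns true if the tracer noticed the exception before it was caught.
inline bool tracer_sees_throw()
{
	bool saw = false;
	try {
		ScopeTracer tracer(saw);
		throw std::runtime_error("boom");
	} catch (const std::exception &) {
		// The tracer's destructor has already run by this point.
	}
	return saw;
}
```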
Fixes: 15ab981d9e "bees: replace uncaught_exception(), deprecated in C++17"
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This allows these components to be used by test executables without
pulling in all of bees, and makes iterating on their code faster.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add some conditionally-compiled debug code, including an in-memory log
of what ExtentWalker does. Dump that log on exceptions.
If we loop too many times in a debug build, kill the process so we can
stack trace. In non-debug builds just throw a normal exception.
Grow the step size instead of shrinking it, to reduce the number of
binary search iterations.
Prevent a bug where the step size bottoms out before positioning the
target extent in the middle of the result vector.
Use the first extent for "first_extent", instead of the 3rd.
Get rid of some redundant checks.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When a file ends with a hole, ExtentWalker synthesizes a hole extent record
to cover the distance between the last ipos and EOF. Unfortunately, ipos
was incremented by the number of items in the result vector instead. Fix
that by incrementing by hole_extent.size().
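The corrected increment can be sketched like this. The Extent struct and
append_trailing_hole are hypothetical simplifications, not the
ExtentWalker code:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplified extent record covering [begin, end).
struct Extent {
	uint64_t begin;
	uint64_t end;
	bool hole;
	uint64_t size() const { return end - begin; }
};

// Append a synthetic hole covering [ipos, eof) and return the new ipos.
inline uint64_t append_trailing_hole(std::vector<Extent> &result, uint64_t ipos, uint64_t eof)
{
	if (ipos >= eof) {
		return ipos;
	}
	Extent hole_extent{ipos, eof, true};
	result.push_back(hole_extent);
	// The bug was equivalent to: ipos += result.size();
	// which adds a count of records, not a number of bytes.
	return ipos + hole_extent.size();
}
```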
While we're here, fix up some of the other data quality logic, including
a useless THROW_CHECK that was nothing but workarounds for earlier bugs.
Fixes: https://github.com/Zygo/bees/issues/26
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Two new tree mod log bugs #5 and #6 (uncovered by the zoned IO work,
though #6 has been seen in the wild on 5.10.29).
Tweak the text of some of the workarounds.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some users are hitting an exception somewhere in crawl_transid, which
forces bees to return to the transid_max calculation over and over.
Users are also reporting out-of-range transids.
Add some BEESTRACE so we can see what we were doing in the exception
handler.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Currently if crawl throws an exception, we don't have basic information
about what was being crawled or even if the crawler was running at all.
These traces also help identify the causes of early exception failures.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This might be interesting information, though most of the motivation for
this evaporated when kernel 5.7 came out.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
There seem to be multiple ways to do readahead in Linux, and only some
of them work. Hopefully reading the actual data is one of them.
This is an attempt to avoid page-by-page reads in the generic dedupe code.
We load both extents into the VFS cache (read sequentially) and hope they
are still there by the time we call dedupe on them.
We also call readahead(2) and hopefully that either helps or does nothing.
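A rough sketch of the idea, assuming Linux with glibc. preload_range is
a hypothetical helper, and the 1 MiB chunk size is an arbitrary choice
for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // readahead() (glibc; g++ defines _GNU_SOURCE by default)
#include <sys/types.h>
#include <unistd.h>     // pread()
#include <vector>

// Hypothetical helper: hint the kernel with readahead(2), then read the
// range sequentially so the pages are (hopefully) in the page cache
// afterwards. Returns the number of bytes actually read.
inline std::size_t preload_range(int fd, off_t offset, std::size_t length)
{
	// Only a hint; the result is deliberately ignored.
	(void) readahead(fd, offset, length);

	std::vector<char> buf(std::min<std::size_t>(length, 1 << 20));
	std::size_t total = 0;
	while (total < length) {
		const std::size_t want = std::min(buf.size(), length - total);
		const ssize_t got = pread(fd, buf.data(), want, offset + static_cast<off_t>(total));
		if (got <= 0) {
			break;  // EOF or error: stop, the data is discarded anyway
		}
		total += static_cast<std::size_t>(got);
	}
	return total;
}
```

The data read is thrown away; the only goal is to populate the cache
before the dedupe call.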
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This enables us to correlate FD cache clears with external events such
as btrfs inode eviction storms.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Report the number of Task objects that currently exist as well as the number
on the global work queue.
	THREADS (work queue 298 of 2385 tasks, 16 workers):
This helps spot leaks, since Task objects that are blocked on other Task
post-exec queues are otherwise invisible.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Testing sometimes crashes during exec of the first Task object, which
triggers construction of TaskConsumer threads. Manage the life cycle
of the thread more strictly--don't access any methods of TaskConsumer
or std::thread until the constructor's caller's lock on TaskMaster
is released.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Task::run() would schedule a new execution of Task, unless it was waiting
on a queue for execution. This cannot be implemented with a bool,
since a Task might be included in multiple queues, and should still be
in waiting state even when executed in that case.
Replace the bool with a counter. run() and append() (but not
append_nolock) increment the counter, exec() decrements the counter.
If the counter is non-zero when run() or append() is called, the Task
is not scheduled.
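The counter rule can be sketched as below. WaitCounter and its method
names are hypothetical, and locking is omitted; a real implementation
would need to hold a lock around these operations:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the scheduling rule only (single-threaded sketch).
class WaitCounter {
	std::size_t m_count = 0;
public:
	// Called from run() or append(). Returns true if the Task should be
	// scheduled now, i.e. the counter was zero before this call.
	bool request_run()
	{
		return m_count++ == 0;
	}

	// Called from exec().
	void exec_done()
	{
		assert(m_count > 0);
		--m_count;
	}

	std::size_t pending() const { return m_count; }
};
```

A bool cannot represent the case where the Task sits on two queues at
once: the first exec() would clear the flag while the Task is still
waiting on the second queue.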
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This is a simple lightweight counter that tracks the number of Task
objects that exist. Useful for leak detection.
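One way to build such a counter, as a sketch: a static atomic that is
incremented in constructors and decremented in the destructor.
CountedTask is a hypothetical class name:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical class that counts its own live instances.
class CountedTask {
	static std::atomic<std::size_t> s_instances;
public:
	CountedTask() { ++s_instances; }
	CountedTask(const CountedTask &) { ++s_instances; }
	~CountedTask() { --s_instances; }

	static std::size_t instance_count() { return s_instances.load(); }
};

std::atomic<std::size_t> CountedTask::s_instances{0};
```

If instance_count() keeps growing while the work queue stays small,
objects are probably being held somewhere they should not be.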
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Quite often we want to execute task B after task A finishes executing,
especially if tasks A and B attempt to acquire locks on the same objects.
Implement that capability in Task directly: each Task holds a queue
of Tasks which will be executed strictly after this Task has finished
executing, or if the Task is destroyed.
Add a local queue to each TaskConsumer. This queue contains a list
of Tasks which are to be executed by a single thread in sequential
order. These tasks are executed before fetching any tasks from
TaskMaster.
Each time a Task finishes executing, the list of tasks appended to the
recently executed Task is spliced at the beginning of the thread's
TaskConsumer local queue. These tasks will be executed in the same
thread in the same order they were appended to the recently executed Task.
If a Task is destroyed with a post-execution queue, that queue is
also inserted at the front of the current TaskConsumer's local queue.
If a Task is destroyed or somehow executed outside of a TaskConsumer
thread, or a TaskConsumer thread is destroyed, the local queue of Tasks
is wrapped in a "rescue_task" Task, and spliced before the head of the
global queue. This preserves the sequential ordering of tasks.
In all cases the order of sequential execution of Tasks that are
appended to another Task is preserved.
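The splice-at-front behaviour can be sketched in a single thread as
follows. Task and Consumer here are hypothetical reductions of the real
classes; the global queue, locking, and the rescue_task path are not
modeled:

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical reduced Task: a function plus a post-execution queue.
struct Task {
	std::function<void()> fn;
	std::list<std::shared_ptr<Task>> post_exec;

	// Queue t to run strictly after this Task.
	void append(std::shared_ptr<Task> t) { post_exec.push_back(std::move(t)); }
};

using TaskPtr = std::shared_ptr<Task>;

// Hypothetical reduced TaskConsumer with only a local queue.
class Consumer {
	std::list<TaskPtr> m_local;
public:
	void run(TaskPtr first)
	{
		m_local.push_back(std::move(first));
		while (!m_local.empty()) {
			TaskPtr t = m_local.front();
			m_local.pop_front();
			t->fn();
			// Move the finished Task's post-exec queue to the front
			// of the local queue, preserving its internal order.
			m_local.splice(m_local.begin(), t->post_exec);
		}
	}
};
```

With A followed by B and C, and B followed by D, the execution order is
A, B, D, C: D runs before C because it is spliced in ahead of whatever
was already waiting.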
The unused queue insertion functions are removed.
Exclusion is now simply a mutex, a bool, and a Task with an empty
function. Tasks that queue up waiting for the mutex are stored in
Exclusion's Task, and Exclusion simply runs that task when the
ExclusionState is released.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Change documentation and comments to use the word "dedupe," not "dedup"
as found in circa-3.15 kernel sources.
No changes in code or program output--if they used "dedup" before, they
will continue to be spelled "dedup" now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Fd's cache does not handle changes in the state of its IOHandle parameter.
If we allow:
	Fd f;
	f->close();
then Fd ends up caching a pointer to a closed Fd, and will become very
badly confused if a new Fd appears with the same int identifier.
Fix by removing the close method.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Drop the ListType alias because we only use it once. Rename ListRep to
PoolRep to better reflect what it does.
We don't need the Pool to be available to handle destroyed Pool::Handle
objects. A weak_ptr in the Handle would detect the Pool has been
destroyed, so we don't need to track that ourselves. As a bonus, we can
destroy the PoolRep object as soon as the Pool has been destroyed, delayed
only if there is a Handle object currently executing its destructor.
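The weak_ptr arrangement might look roughly like this. It is a
single-threaded sketch with hypothetical member names and no locking,
not the actual Pool implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

template <class T>
class Pool {
	// Shared state; outlives the Pool only while a Handle destructor
	// holds a locked reference to it.
	struct PoolRep {
		std::vector<std::unique_ptr<T>> free_list;
	};
	std::shared_ptr<PoolRep> m_rep = std::make_shared<PoolRep>();
public:
	class Handle {
		std::unique_ptr<T> m_obj;
		std::weak_ptr<PoolRep> m_rep;
	public:
		Handle(std::unique_ptr<T> obj, std::weak_ptr<PoolRep> rep) :
			m_obj(std::move(obj)), m_rep(std::move(rep)) {}
		Handle(Handle &&) = default;
		~Handle()
		{
			if (!m_obj) {
				return;  // moved-from Handle
			}
			if (auto rep = m_rep.lock()) {
				// Pool still exists: return the object for reuse.
				rep->free_list.push_back(std::move(m_obj));
			}
			// Otherwise the Pool is gone and m_obj deletes the object.
		}
		T &operator*() { return *m_obj; }
	};

	Handle get()
	{
		std::unique_ptr<T> obj;
		if (!m_rep->free_list.empty()) {
			obj = std::move(m_rep->free_list.back());
			m_rep->free_list.pop_back();
		} else {
			obj.reset(new T());
		}
		return Handle(std::move(obj), m_rep);
	}

	std::size_t idle() const { return m_rep->free_list.size(); }
};
```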
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Higher CPU core counts have become more common, and kernel bugs have
become less common, since the arbitrary 8-thread limit was introduced.
We can remove the limit now, and treat any remaining scaling
inefficiency as a bug to be removed.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The dependency was missing, so changes to the library would not trigger
a rebuild of the bees binary.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Support for multiple BeesContext objects sharing a FdCache was wasting
significant space and atomic inc/dec memory cycles for no good reason
since the shared-FdCache feature was deprecated.
open_root and open_root_ino still need a BeesContext to work. Pass the
BeesContext pointer through the function object instead of the cache
key arguments.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
pthread_cancel doesn't really work properly. It was only being used in
bees to bring threads to a stop if the BeesContext is destroyed early.
It is frequently implicated in core dump reports because of the fragility
of the C++ iostream / C stdio / library infrastructure, particularly
surrounding upgrades on the host running bees. The pthread_cancel call
itself often simply fails even when it doesn't call terminate().
Defer creation of the status and progress threads until after the
BeesContext::start method is invoked. At that point, the existing
ask-threads-nicely-to-stop code is up and running, and normal condvars
can be used to bring bees to a stop, without having to resort to
pthread_cancel.
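A generic sketch of stopping a periodic thread with a condvar instead of
pthread_cancel. StopRequest is a hypothetical class, not the bees
implementation:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stop flag for threads that sleep between rounds of work.
class StopRequest {
	std::mutex m_mutex;
	std::condition_variable m_cv;
	bool m_stop = false;
public:
	void request_stop()
	{
		{
			std::lock_guard<std::mutex> lock(m_mutex);
			m_stop = true;
		}
		m_cv.notify_all();
	}

	// Sleep for up to `timeout`. Returns true as soon as a stop has been
	// requested, false if the timeout expired first.
	template <class Rep, class Period>
	bool wait_for_stop(const std::chrono::duration<Rep, Period> &timeout)
	{
		std::unique_lock<std::mutex> lock(m_mutex);
		return m_cv.wait_for(lock, timeout, [this] { return m_stop; });
	}
};
```

A status thread would loop on something like
"while (!stop.wait_for_stop(std::chrono::seconds(1))) { ... }" and exit
normally, so its destructors run and no cancellation point is needed.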
Since we're deleting half of the BeesContext constructor in this change,
let's remove the other half too, and put an end to the deprecated support
for multiple BeesContexts sharing a process. It's still possible to run
multiple BeesContexts, but they will not share a FD cache. This will
allow the FD cache's keys to become smaller and hopefully save some
memory later on.
Fixes: #171
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
C99's "{ 0 }" notation for filling in a struct with all zeros was not
included in the C++11 standard, so gcc doesn't implement it and neither
does clang.
gcc does (did?) have issues with warnings on the same code in C99,
complaining about uninitialized struct members when "{0}" explicitly
initializes every member to a zero value. These issues don't apply in
the C++ code where NTOA_TABLE_ENTRY_END is used.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Get rid of an assert in bits_ntoa. Throw an exception instead.
Fix hex formatting (adding "0x" before a decimal number is not
the correct way to format hex strings).
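Both points can be illustrated with a small sketch. BitName, to_hex and
bits_to_string are hypothetical names loosely modeled on the idea of a
bit-name table, and the specific check that throws is an assumption:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Format a value as hex: the prefix and the digits must agree.
inline std::string to_hex(unsigned long long v)
{
	std::ostringstream oss;
	oss << "0x" << std::hex << v;
	return oss.str();
}

// Hypothetical table entry mapping a bit mask to a name.
struct BitName {
	unsigned long long mask;
	const char *name;
};

// Name the known bits in value; report leftover bits in hex.
inline std::string bits_to_string(unsigned long long value, const std::vector<BitName> &table)
{
	std::string out;
	auto add = [&out](const std::string &s) {
		if (!out.empty()) {
			out += "|";
		}
		out += s;
	};
	for (const auto &entry : table) {
		// Throw rather than assert, so a bad table cannot abort the process.
		if (entry.name == nullptr || entry.mask == 0) {
			throw std::invalid_argument("bad bit name table entry");
		}
		if ((value & entry.mask) == entry.mask) {
			add(entry.name);
			value &= ~entry.mask;
		}
	}
	if (value != 0) {
		add(to_hex(value));
	}
	return out.empty() ? "0" : out;
}
```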
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The kernel from such an old distro version likely has several unfixed
bugs. Better not to support it at all.
Users who can upgrade the kernel are probably also sophisticated enough
to fix the build issues too.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.
We were only using uuid to support a legacy feature that was removed
over four years ago.
Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs. Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The slow backrefs performance improvement is confirmed by reports from
multiple users:
* Me (5.4.60 + backref patches, 5.7 to 5.11)
* https://github.com/Zygo/bees/issues/161 (5.8)
* https://github.com/Zygo/bees/issues/162 (5.8)
* IRC user S0rin (5.4.88 + backref patches)
The issue still exists, but at a significantly reduced scale: now about
2 ms of CPU per ref on a fast machine.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The Linux kernel's btrfs headers are better than the libbtrfs-dev headers:
- the libbtrfs-dev headers have C++ language compatibility issues
- upstream version in Linux kernel is more accurate and up to date
- macros in libbtrfs-dev's ctree.h hide information that would
  enable bees to perform runtime buffer length checking
- enum types whose presence cannot be detected with #ifdef
When accessing members of metadata items from the filesystem, we want
to verify that the member we are accessing is within the boundaries of
the item that was retrieved; otherwise, a memory access violation may
occur or garbage may be returned to the caller. A simple C++ template,
given a pointer to a structure member and a buffer, can determine that
the buffer contains enough bytes to safely access a struct member.
This was implemented back in 2016, but left unused due to ctree.h issues.
Some btrfs metadata structures have variable length despite using a
fixed-size in-memory structure. The members that appear earliest in
the structure contain information about which following members of the
structure are used. The item stored in the filesystem is truncated after
the last used member, and all following members must not be accessed.
'btrfs_stack_*' accessor macros obscure the memory boundaries of the
members they access, which makes it impossible for a C++ template to
verify the memory access. If the template checks the length of the
entire structure, it will find an access violation for variable-length
metadata items because the item is rarely large enough for the entire
structure.
Get rid of all the libbtrfs-dev accessor macros and reimplement them
with the necessary buffer length checks.
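A minimal sketch of a length-checked accessor. For simplicity it takes a
byte offset (from offsetof) rather than a pointer to the member, which
is a substitution made here; DemoItem and get_member are hypothetical
and DemoItem is not a real btrfs structure:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical variable-length item: `extra` is only stored on disk when
// `flags` says so, in which case the stored item is truncated before it.
struct DemoItem {
	uint64_t flags;
	uint64_t always;
	uint64_t extra;
};

// Read a member of type T at `offset`, but only if the member lies
// entirely inside the buffer that was actually retrieved.
template <class T>
T get_member(const std::vector<uint8_t> &buf, std::size_t offset)
{
	if (offset > buf.size() || buf.size() - offset < sizeof(T)) {
		throw std::out_of_range("member extends past end of item");
	}
	T value;
	std::memcpy(&value, buf.data() + offset, sizeof(T));
	return value;
}
```

Checking only the member, not sizeof(DemoItem), is what lets a truncated
item pass for `always` while still rejecting an access to `extra`.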
Signed-off-by: Zygo Blaxell <bees@furryterror.org>