BEESNOTE can only be seen if the status thread is running at the time,
making the log of activities during shutdown incomplete.
Wake up the status thread early during shutdown so the logged sequence
of shutdown actions is complete.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In the current architecture we can't directly measure the physical extent
size, and we can't make good decisions with the extent data (reference)
item alone. If the early return is enabled here, there is a small speedup
and a large drop in dedupe hit rate, especially when extent splits occur.
Leave the early return commented for now, but collect the event statistics.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Tree searches are all looking for specific item types. Skip over any
item types we are not interested in when resetting the search key for
the next search.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When we are searching the btrfs metadata trees, we usually want only
one type of item. If the last item in a search result is not of the
desired type, we can restart the search at the next possible key with
that item type, potentially skipping over some uninteresting items we
would otherwise have to fetch, process, and discard.
Also remove a bug in the previous next_min code that would skip over
items if the offset overflowed and the next objectid in the tree had a
lower item type number than the previous objectid. This doesn't seem
to be a bug that has ever happened, as it would require a file to roll
over in the offset field.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BtrfsIoctlSearchKeyV2's constructor now fills in nr_items = 1, so we
don't need to set it explicitly any more.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
vector_copy_struct constructed a std::vector<uint8_t> from a fixed-size
struct. ByteVector replaces std::vector<uint8_t> and has a template
constructor which does the same thing as vector_copy_struct, so there
is no longer a need for this function.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Spanner was a workaround for terrible std::vector _copy_ performance,
but it turns out that std::vector has terrible _allocator_ performance
(compared to an implementation based on malloc and memcpy). Spanner is a
workaround for the copy performance issue, so it doesn't help very much.
Refraining from using vector at all is much better.
Now that all code that used Spanner has been converted to ByteVector,
there's no further need for Spanner<uint8_t>, which was the only type
it was ever used for.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We can simply remove the template specializations, but if we do that, then
existing code might accidentally write out the vector<uint8_t> struct.
Prevent regressions by deleting the vector specializations, making any
code that uses them fail to build.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The vector<uint8_t> in the hash table doesn't hurt very much--only a few
microseconds per 128K hash block.
The vector<uint8_t> in BeesBlockData hurts a bit more--we run that
constructor thousands of times per second.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add support for pread and pwrite of ByteVector objects alongside
vector<uint8_t>. A later commit will delete the template specializations
for vector<uint8_t>, but existing users have to be updated to use
ByteVector first.
Nothing currently uses vector<char>, so we can delete that immediately.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Switch various methods in fs to use ByteVector to cut down on the number
of slow allocations and copies.
Automatically determine the correct size for TREE_SEARCH_V2 buffers
based on the number of items requested, and grow the buffer as needed.
This eliminates the need to cache some objects that were heavy to create.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Now that we can guess the size more or less automatically, there's
no need to make it unnecessarily large.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
After some benchmarking, it turns out that std::vector<uint8_t> is
about 160 times slower than malloc(). malloc() is faster than "new
uint8_t[]" too. Get rid of std:;vector<uint8_t> and replace it with
a lightweight wrapper around malloc(), free(), and memcpy().
ByteVector has helpful methods for the common case of moving data to and
from ioctl calls that use a fixed-length header placed contiguously with a
variable-length input/output buffer. Data bytes are shared between copied
ByteVector objects, allowing a large single buffer to be cheaply chopped
up into smaller objects without memory copies. ByteVector implements the
more useful parts of the std::vector API, so it can replace std::vector
objects without needing an awkward adaptor class like Spanner.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Sometimes we need to check constraints on 4 variables at once.
It would be nice if variadic macros in C++ were also polymorphic.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Previously, when the bees send workaround is enabled, bees would
immediately advance the subvol's crawl status as if the entire subvol
had been scanned.
If the subvol is later made read-write, or if the workaround is disabled,
bees sees that the subvol has already been marked as scanned. This is
an unfortunate result if the subvol is inadvertently marked read-only
or if bees is inadvertently run with the send workaround disabled.
Instead, (almost) completely ignore the subvol: don't advance the crawl
pointer, don't consider the subvol in the list if searchable roots, and
don't consider the subvol when calculating min_transid for new subvols.
The "almost" part is: if the subvol scan has not yet started, keep its
start timestamp current so it won't mess up subvol traversal performance
metrics.
Also handle exceptions while determining whether a subvol is read-only,
as those apparently do happen.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
install -Dm644 scripts/beesd.conf.sample $(DESTDIR)/$(ETC_PREFIX)/bees/beesd.conf.sample
will expand to //etc/bees/beesd.conf.sample. This patch removes the duplicated /
Since we started locking down the beesd service, we no longer have
privileges to do some things. Have systemd do it for us instead.
Fixes: #195
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In fiemap.h the members of struct fiemap are declared as __u64, but the
FIEMAP_MAX_OFFSET macro is an unsigned long long value:
$ grep FIEMAP_MAX_OFFSET -r /usr/include/
/usr/include/linux/fiemap.h:#define FIEMAP_MAX_OFFSET (~0ULL)
$ grep fe_length -r /usr/include/
/usr/include/linux/fiemap.h: __u64 fe_length; /* length in bytes for this extent */
This results in a type mismatch error on architectures like ppc64le:
fiemap.cc:31:35: note: deduced conflicting types for parameter 'const _Tp' ('long unsigned int' and 'long long unsigned int')
31 | fm.fm_length = min(fm.fm_length, FIEMAP_MAX_OFFSET - fm.fm_start);
| ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work around this by copying the macro into a uint64_t constant,
and not using the macro any more.
Fixes: https://github.com/Zygo/bees/issues/194
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The hash table is one of the few cases in bees where a non-trivial amount
of page cache memory will be used in a predictable way, so we can advise
the kernel about our IO demands in advance.
Use WILLNEED to prefetch hash table pages at startup.
Use DONTNEED to trigger writeback on hash table pages at shutdown.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In theory, we don't need the pread() loop, because the kernel will do a
better job with readahead().
In practice, we might still need the pread() code, as the readahead will
occur at idle IO priority, which could adversely affect bees performance.
More testing is required.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The assignment operator will use member-wise assignment, which
assumes the object's this pointer is aligned. That doesn't
happen when the object in question is part of a btrfs search
result, and aarch64 faults over it.
Use memcpy instead, which has no alignment constraints.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Btrfs mount options effects all mount points using the same Btrfs
partition, so specifing it per-mount is useless.
Also, common mount options like `noatime,nosuid,nodev,noexec` has little
to no effect on beesd, so it's just better and simpler to remove this.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
I've verified that using this setup, user will be able to access the log
in /run/bees, but cannot access the mounted filesystem.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Like filefrag, fiemap was defaulting to FIEMAP_FLAG_SYNC, and providing no
option to turn it off. This prevents observation of delayed allocations,
making fiemap less useful.
Override the default flag setting so fiemap gets the current
(i.e. unflushed) extent map state.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
LOGICAL_INO_V2 has a maximum limit of 655050 references per extent.
Although it no longer has a crippling performance problem, at roughly
two seconds to process extent, it's too slow to be useful.
When an extent gains an absurd number of references, stop making any
more. Returning zero extent refs will make bees believe the extent
was deleted, and it will remove the block from the hash table.
This helps speed processing of highly duplicated large files like
VM images, and the cost of a slightly lower dedupe hit rate.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The default name of a newly constructed thread is apparently the name
of the thread that created it. That's very misleading when there are
a lot of TaskConsumer threads and they have nothing to do, so set the
name of each TaskConsumer thread as soon as it is created.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In 15ab981d9e "bees: replace uncaught_exception(), deprecated in C++17",
uncaught_exception() was replaced with current_exception(); however,
current_exception() is only valid after an exception has been captured
by a catch block.
BeesTracer wants to know about exceptions _before_ they are caught,
so current_exception() is not useful here.
Instead, conditionally compile using uncaught_exception() or
uncaught_exceptions(), selected by C++ standard version, and make
bees stack traces work again.
Fixes: 15ab981d9e "bees: replace uncaught_exception(), deprecated in C++17"
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This allows these components to be used by test executables without
pulling in all of bees, and more rapidly iterate their code.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add some conditionally-compiled debug code, including an in-memory log
of what ExtentWalker does. Dump that log on exceptions.
If we loop too many times in a debug build, kill the process so we can
stack trace. In non-debug builds just throw a normal exception.
Grow the step size instead of shrinking it, to reduce the number of
binary search iterations.
Prevent a bug where the step size bottoms out before positioning the
target extent in the middle of the result vector.
Use the first extent for "first_extent", instead of the 3rd.
Get rid of some redundant checks.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When a file ends with a hole, ExtentWalker synthesizes a hole extent record
to cover the distance between the last ipos and EOF. Unfortunately, ipos
was incremented by the number of items in the result vector instead. Fix
that by incrementing by hole_extent.size().
While we're here, fix up some of the other data quality logic, including
a useless THROW_CHECK that was nothing but workarounds for earlier bugs.
Fixes: https://github.com/Zygo/bees/issues/26
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Two new tree mod log bugs #5 and #6 (uncovered by the zoned IO work,
though #6 has been seen in the wild on 5.10.29).
Tweak the next of some of the workarounds.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Some users are hitting an exception somewhere in crawl_transid, which
forces bees to return back to the transid_max calculation over and over.
Also out-of-range transids.
Add some BEESTRACE so we can see what we were doing in the exception
handler.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Currently if crawl throws an exception, we don't have basic information
about what was being crawled or even if the crawler was running at all.
These traces also help identify the causes of early exception failures.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>