1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

797 Commits

Author SHA1 Message Date
Zygo Blaxell
c354e77634 docs: simplify the exit-with-SIGTERM description
The description now matches the code again.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-25 03:36:44 -05:00
Zygo Blaxell
f21569e88c docs: update the feature interactions page
Fixing the obviously out-of-date and no-longer-tested things.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-25 03:34:22 -05:00
Zygo Blaxell
3d5ebe4d40 docs: update kernel bugs and workarounds list for 6.2.0
Remove some of the repetition to make the document easier to edit.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-25 03:32:52 -05:00
Zygo Blaxell
3430f16998 context: create a Pool of BtrfsIoctlLogicalInoArgs objects
Each object contains a 16 MiB buffer, which is very heavy for some
malloc implementations.

Keep the objects in a Pool so that their buffers are only allocated and
deallocated once in the process lifetime.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.9.3
2023-02-23 22:45:31 -05:00
Zygo Blaxell
7c764a73c8 fs: allow BtrfsIoctlLogicalInoArgs to be reused, remove virtual methods
Some malloc implementations will try to mmap() and munmap() large buffers
every time they are used, causing a severe loss of performance.

Nothing ever overrode the virtual methods, and there was no virtual
destructor, so they cause compiler warnings at build time when used with
a template that tries to delete pointers to them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-23 22:40:12 -05:00
Zygo Blaxell
a9a5cd03a5 ProgressTracker: reduce memory usage with long-running work items
ProgressTracker was only freeing memory for work items when they reach
the head of the work tracking queue.  If the first work item takes
hours to complete, and thousands of items are processed every second,
this leads to millions of completed items tracked in memory at a time,
wasting gigabytes of system RAM.

Rewrite ProgressHolderState methods to keep only incomplete work items
in memory, regardless of the order in which they are added or removed.

Also fix the unit tests which were relying on the memory leak to work,
and add test cases for code coverage.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-23 22:33:35 -05:00
Zygo Blaxell
299509ce32 seeker: fix the test for ILP32 platforms
Not sure what I was thinking, but the argument here should clearly
be uint64_t.

Fixes: https://github.com/Zygo/bees/issues/248
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.9.2
2023-02-20 11:30:56 -05:00
Zygo Blaxell
d5a99c2f5e roots: don't share a RootFetcher between threads
If the send workaround is enabled, it is possible for two threads (a
thread running the crawl_new task, and a thread attempting to apply the
send workaround) to access the same RootFetcher object at the same time.
That never ends well.

Give each function its own BtrfsRootFetcher object.

Fixes: https://github.com/Zygo/bees/issues/250
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-02-20 11:14:34 -05:00
Kai Krakow
fd6c3b3769
Makefile: also drop fiemap and fiewalk from main Makefile
Fixes: ccd8dcd43f
Signed-off-by: Kai Krakow <kai@kaishome.de>
v0.9.1
2023-01-28 11:21:51 +01:00
Zygo Blaxell
849c071146 hash: flush the table more slowly
With SIGTERM and fast exit, the trickle writeback is less important.
We don't want to flood people's IO subsystems with continuous writes.
This really should be configurable at runtime.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.9
2023-01-27 22:16:02 -05:00
Zygo Blaxell
85ff543695 test: simplify Makefile
Make can build dependencies in parallel, so let Make do that.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
8147f80a5a src: bees-version.cc cleanups
Do rebuild bees-version.cc if libcrucible changes.
Don't rebuild bees-version.cc if it doesn't change.
Also use the standard suffix for new files.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
cbde237f79 src: simplify Makefile
Make can build dependencies in parallel, so let Make do that.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
3b85fc8bc7 lib: drop version.cc entirely
crucible::VERSION doesn't make much sense now that libcrucible no
longer exists as a shared library.  Nothing ever referenced it, so
it can go away.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
4df1b2c834 lib: simplify dependency generation
We don't need to run all the dependencies first, Make can do those in parallel.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
495218104a fd: FS_IOC_SETFLAGS takes an int* argument not a long*
According to ioctl_iflags(2):

	The type of the argument given to the FS_IOC_GETFLAGS and
	FS_IOC_SETFLAGS  operations is int *, notwithstanding the
	implication in the kernel source file include/uapi/linux/fs.h
	that the argument is long *.

So this code doesn't work on be64 machines.

Also, Valgrind complains about it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
e82ce3c06e fd: pwrite returns ssize_t not int
A subtle distinction, and not one that is particularly relevant to bees,
but it does make toolchains complain.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
bd336e81a6 fs: get rid of base class btrfs_ioctl_logical_ino_args
Another instance of the pattern where we derived a crucible class
from a btrfs struct.  Make it an automatic variable instead.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
ea17c89165 fs: remove duplicate BTRFS_COMPRESS_ definitions
This was fixed in

	7f660f50b lib: fs: stop using libbtrfs-dev helper functions to re-enable buffer length checks

but apparently some copies live on.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-27 22:16:02 -05:00
Zygo Blaxell
ccd8dcd43f fiemap, fiewalk: drop dead example/test code
These tools are obsolete.  fiemap was a thin wrapper around FIEMAP,
but FIEMAP is not useful on btrfs.  fiewalk was a thin wrapper around
BtrfsExtentWalker, but development on BtrfsExtentWalker has been
abandoned.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-23 00:09:26 -05:00
Zygo Blaxell
facf4121a6 context: remove the one call to operator vector<> method in BtrfsIoctlLogicalInoArgs
There's only one user of this method.  Open-code it so we can kill the
method in libcrucible.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-23 00:09:26 -05:00
Zygo Blaxell
cbc76a7457 hash: don't spin when writes fail
When a hash table write fails, we skip over the write throttling because
we didn't report that we successfully wrote an extent.  This can be bad
if the filesystem is full and the allocations for writes are burning a
lot of CPU time searching for free space.

We also don't retry the write later on since we assume the extent is
clean after a write attempt whether it was successful or not, so the
extent might not be written out later when writes are possible again.

Check whether a hash extent is dirty, and always throttle after
attempting the write.

If a write fails, leave the extent dirty so we attempt to write it out
the next time flush cycles through the hash table.  During shutdown
this will reattempt each failing write once, after that the updated hash
table data will be dropped.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-23 00:09:26 -05:00
Zygo Blaxell
28ee2ae1a8 docs: fix broken link in options.md
Links in docs/ are relative to docs/, not the top level.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-23 00:08:54 -05:00
Zygo Blaxell
d27621b779 main: catch exceptions and exit gracefully
Calling 'bees -m4' should not call 'std::terminate()', but it does.

Use catch_all instead.  It will still pass the exit value to return
from main.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
cb2c20ccc9 fs: get rid of base class btrfs_ioctl_same_extent_info
We only use BtrfsExtentInfo when it's exactly equivalent to the
base, so drop the derived class.

While we're here, fix BtrfsExtentSame::add so it uses a btrfs-compatible
uint64_t instead of an off_t.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
ded5bf0148 btrfs-tree: fix whitespace and const
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
d5de012a17 btrfs-tree: translate item types for error messages
Look up the name when filling in the what() field for the exception.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
66d1e8a89b btrfs-tree: add chunk items: length and type
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
c327e0bb10 readahead: report the original size in BEESTOOLONG
BEESTOOLONG was always reporting a size of zero, and the offset of the
end of the readahead region.  Report the original size instead (and also
in BEESTRACE and BEESNOTE).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
9587c40677 docs: add crawl_again, drop crawl_restart
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
a115587fad roots: fix extent lock failure handling
Drop the crawl_restart counter, it doesn't happen here (or anywhere else).

Add the crawl_again counter for extents that are restarted due to an
extent-level lock.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
af6ecbc69b trace: use pthread_setname wrapper
libcrucible can deal with the Linux kernel and/or libc's thread name
limitations.  No need to duplicate that work in bees.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
563e584da4 task: use pthread_setname_np correctly
It turns out I've been using pthread_setname_np wrong the whole time:

 * on Linux, the thread name length is 15 characters.
   TASK_COMM_LEN is 16 bytes, and the last one is always 0.
   This is now hardcoded in many places and cannot be changed.

 * pthread_setname_np doesn't return -errno, so DIE_IF_MINUS_ERRNO
   was the wrong macro.  On the other hand, we never want to do anything
   differently when pthread_setname_np fails, so we never needed to
   check the return value.

Also, libc silently ignores attempts to set the thread name when it is too
long.  That's almost certainly a libc bug, but libc probably suppresses
the error result for the same reasons I ignore the error result.

Wrap the pthread_setname function with a C++ std::string overload that
truncates the argument at 15 characters, so we at least get the first
part of the task name in the thread name field.  Later commits can deal
with making the bees thread names shorter.

Also wrap pthread_getname for symmetry.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Zygo Blaxell
c5889049f0 docs: remove duplicate (and wrong) default scan mode
The default scan mode is found in config.md.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2023-01-05 01:10:17 -05:00
Adam Faiz
ecaed09128 docs: fix reference direction
The Dependencies list is above the Packaging section, not below.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-29 06:25:33 -05:00
Zygo Blaxell
64dab81e42 Merge github PR #148
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-23 00:26:33 -05:00
Zygo Blaxell
cfcdac110b context: don't count MultiLock waiting time in dedup_ms
This was inflating the dedup_ms statistic because it was counting all
the resolve time too.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-22 23:46:36 -05:00
Zygo Blaxell
c3b664fea5 context: don't forget to retry locked extents
The caller of scan_forward has to stop advancing the BeesFileCrawl
position when an extent lock blocks a scan, so that it will resume
from the same position when the Task is scheduled again; otherwise,
bees simply skips over the extent and leave it incompletely deduped.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-22 23:46:36 -05:00
Hilton Chain
66b00f8a97
beesd: Honor DESTDIR on installation.
Co-authored-by: Adam Faiz <adam.faiz@disroot.org>
Signed-off-by: Hilton Chain <hako@ultrarare.space>
2022-12-23 11:10:17 +08:00
Zygo Blaxell
bbcfd9daa6 roots: replace BEES_TRANSID_FACTOR with BEES_TRANSID_POLL_INTERVAL
Restart crawl_more (and update crawl roots and flush FD caches) every
time the transid changes, and only when the transid changes, but
not more often than a reasonable minimum poll interval.

Clean up the log message:  use the proper thread name and remove
the wildly inaccurate estimate of when crawl will resume.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
d6d3e1045e context: keep the resolve cache smaller
We don't need to cache 65536 extent maps, especially if each one
can have almost 700K references.

Valgrind's massif tool points to the extent map cache as a very
large memory allocator, but test runs with memcg disagree.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
d5d17cbe62 roots: run insert_new_crawl from within a Task
If we have loadavg targeting enabled, there may be no worker threads
available to respond to new subvols, so we should not bother updating
the subvols list.

Put insert_new_crawl into a Task so it only executes when a worker
is available.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
48dd2a45fe docs: remove the line discussing 'max_transid' in recent scan mode
This makes the doc match the code again.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
7267707687 roots: disable recent sorting by max_transid
On large filesystems where the min_transid of all subvols gets stuck at 0,
bees may lose the ability to effectively track recent data.  A secondary sort
by max_transid will allow scanning newer subvols that were created after bees
started running on the filesystem, but before bees completed the first scan
of all subvols.

On the other hand, the secondary sort does a reverse version of the
sequential scan mode, and the sequential scan mode is simply awful.

Disable it for now.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
984ceeb2a5 docs: update documentation for new 'recent' scan mode
Also attempted to clarify the descriptions of the modes based on
feedback and questions from users over the years.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
03f809bf22 roots: reimplement scan modes using virtual base and methods
Split each scan mode into two distinct phases:

    1.  A heavy discovery phase, where we search the entire filesystem
    for something (new items in subvol trees in this case).

    2.  A light consuming phase, where we fetch extents to dedupe
    from places that we found in the discovery phase.

Part 1 recomputes the subvol ordering every time there is a new transid.
For some scan modes this computation is quite expensive, far too costly
to pay for every extent, so we do it no more than once per transaction.

Part 2 is run every time a worker thread hits the crawl_more Task.
It simply pulls one extent from the first crawler off a sorted list,
removing the crawler from the list when the crawler runs out of data.

Part 1 creates a new structure and swaps it into place, while Part 2
continues to run using the previous strucuture.  Neither of these
need to block the other, so they don't.

The separate class and base pointer also make it easer to add new scan
modes that are not based on subvol trees or that don't use BeesCrawl.

While we're here, fix up some method visibility in BeesRoots.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
0dca6f74b0 roots: remove duplicate default scan mode setting
Set the constructor's default scan mode to an invalid mode, so if we
change the default, we don't have to update two places.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:01 -05:00
Zygo Blaxell
f5c4714a28 roots: add 'recent' crawl mode for a mix of new and old data
Crawl mode 3 'recent' prioritizes data from new updates to previously
scanned subvols over subvols that have not been completely scanned yet.
If no such new data exists, falls back to a variation of 'lockstep'
scan mode.

This enables us to keep up with new data as it arrives, a key weakness
of all the other scan modes, and worth violating our unwritten "no new
scan modes until we have extent-tree dedupe working" policy for.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:00 -05:00
Zygo Blaxell
de96a38460 roots: emit "crawl finished" at the correct time
The correct time is when we set the deferred bit after a tree
search returns empty.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:00 -05:00
Zygo Blaxell
82c2b5bafe roots: improve thread status tracking messages
Don't dereference a shared_ptr inside a thread status function.

Do trace the crawl start events.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:51:00 -05:00