Reword log message for discovery of new toxic extents vs. lookup of
previously known toxic extents. Also add the block data (especially
filename) to the discovery message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
No public version of bees ever created old-style compressed hash table
entries. Remove the code that supports them.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add a third scan mode with alternative trade-offs.
Benefits: Good sequential read performance. Avoids race conditions
described in https://github.com/Zygo/bees/issues/27. Avoids diverting
scan resources into short-lived snapshots before their long-lived
origin subvols are fully scanned.
Drawbacks: Takes the longest time of the three implemented scan-modes
to free space in extents that are shared between snapshots. Uses the
maximum amount of temporary space.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Duplicated code between the different scan modes has slowly been
becoming less and less trivial. Move the code to a method and
make both scan-modes call it.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Prealloc extent sizes were taken from the Extent object and did not
take the file size into account. If a file with a non-4K-aligned
size is preallocated, the resulting dedup fails with an exception
because the size of both ranges of the BeesRangePair do not match.
Limit the size of the replacement hole extent to not extend past the
end of the file.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Restartng scans for each transid is a bit aggressive. Scan every 10
transids for a polling rate close to the former BEES_COMMIT_INTERVAL.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
transid_max is now measured at a single point in the crawl_transid thread.
Move the Crawl deferred logic into BeesRoots so it restarts all crawls
when transid_max increases. Gets rid of some messy time arithmetic.
Change name of Crawl thread to "crawl_master" in both thread name and
log messages.
Replace "Next transid" with "Crawl started".
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The periodic cache age check was not protected by a lock, so multiple
threads may decide to concurrently clear the cache. This led to
duplicate log messages.
Fix by moving the cache expiry trigger out of FdCache and into Roots,
which knows when transids change and can perform cache clears at exactly
the time they are most relevant, i.e. after something that was deleted
becomes permanently so.
This removes the last references to BEES_COMMIT_INTERVAL, so get rid
of its definition too.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Now that the polling interval is up to 30 times faster,
next_transid seems too verbose again.
Make it clearer that the interval quoted in the "Deferring..."
message is the computed transaction polling interval.
Combine "Next transid" and "Restarted crawl" into a single message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Make the crawl polling interval more closely track the commit interval
on the btrfs filesystem. In the future this will provide opportunities
to do things like clear FD caches and stop crawls on deleted subvols,
but triggered by transaction commits instead of arbitrary time intervals.
Rename the "crawl" thread so it no longer has the same name as the "crawl"
task, and repurpose it for dedicated transid polling. Cancel the deletion
of crawl_thread and repurpose it to trigger new crawls and wake up the
main crawl Task when it runs out of data.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Fix discussion of nodatasum files, clarifying what we can and cannot do.
Get rid of some BEESNOTE and BEESTRACE calls which cannot be observed
(well, BEESNOTE can, but you have to be quick!).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Having too many "write a message to the log" primitives is confusing,
and having one that intermittently and silently discards output is even
_more_ confusing.
Replace all BEESINFO with appropriate BEESLOG*s. Usually DEBUG.
Except for one or two that occur too often. Just delete those.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The data field of BeesBlockData is only interesting to those who want
to debug the BeesBlockData implementation or other battle-tested parts
of bees. Users who want to do this can modify and rebuild the source
to enable the output.
To everyone else, the data field is a huge, ongoing infoleak through
the log.
Don't bother with an option, just output the length of the data field
and nothing else.
Fixes: https://github.com/Zygo/bees/issues/53
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Since we are now unconditionally rendering the print_fn as a static
string, there is no need for it to be a function. We also need it to
be brief and mostly constant.
Use a string instead. Put the string before the function in the Task
constructor arguments so that the title string appears as a heading in
code, since we are making a breaking API change already.
Drop TASK_MACRO as it is broken by this change, but there is no similar
usage of Task anywhere to make it worth fixing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Move pthread_setname_np to the same place we do pthread_getname_np.
Detect errors in pthread_getname_np--but don't throw an exception
because we would call ourself recursively from the exception handler
when it tries to log the exception.
Fix the order of set_name and the first BEESNOTE/BEESLOG call in threads,
closing small time intervals where logs have the wrong thread name,
and that wrong name becomes persistent for the thread.
Make the main thread's name "bees" because Linux kernel stack traces use
the pthread name of the main thread instead of the name of the process.
Anonymous threads get the process name (usually "bees"). We should not
have any such threads, but we do. This appears to occur mostly during
exception stack unwinding. GCC/pthread bug?
Fixes: https://github.com/Zygo/bees/issues/51
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When a Task worker thread is executing a Task, the thread name is less
useful than the Task description.
Use the Task description instead of the thread name if the thread has
no BeesThread name and the thread is currently executing a task.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Threads from the Task module in libcrucible don't set BeesNote::tl_name.
Even if they did, in Task context the thread name is unspecific to the point
of meaninglessness.
Use the Task::print method as the name for such threads, and be sure
that future Task print functions are designed for that usage.
The extra complexity in BeesNote::get_name() seems preferable to
bombarding pthread_setname_np hundreds or thousands of times per second.
FIXME: we are now calling Task::print() on every BeesNote, which
is effectively unconditionally. Maybe we should have Task::print()
and get_name() return a closure, or just evaluate Task::print() once
and cache it in TaskState, or define Task's constructor with a string
argument instead of the current print_fn closure.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Silence the three(!) log messages per crawl increment an extra one at
the end of the subvol.
The three critical messages per subvol crawl cycle are:
Next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <A>..<B> started <T> (<AGO>s ago)
Subvol has been completely scanned and a new transaction range will
be created. CrawlState is the state of the old subvol.
Restarted crawl BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago)
Subvol has been restarted. CRawlState is the state of the new subvol.
Deferring next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago)
Subvol has been completely scanned, but it is too soon to start a
new scan.
Fix the "Restart..." message to use the correct verb tense and to use
the correct BeesCrawlState data.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When we find a matching block we attempt to extend ("grow") the matched
pair around the first matching block. This function takes the IO hit of
reading the second extent from each duplicate extent pair. It's also
very slow--too many allocations, too small reads, reads in the wrong
order, an order of magnitude too many calls to TREE_SEARCH_V2, and it
is usually in the top 3 most frequent PERFORMANCE warnings.
Start tracking the running time of grows using the pairforward_ms
and pairbackward_ms counters so that we can compare it to various
replacements.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit adds log levels to the output. In systemd, it makes colored
lines, otherwise it's probably just a number. Bees is very chatty, so
this paves the road for log level filtering.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Dependencies can be generated in parallel which can be much faster. It
also puts away the problem that for may fail multiple times in a row and
leaving behind a broken intermediate file which would be picked up by
successive runs.
Signed-off-by: Kai Krakow <kai@kaishome.de>
The mlock runs much faster, probably because the hash fetches are
doing most of the work that mlock does.
It makes bees startup latency for testing smaller, even if it takes more
time in absolute terms.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
There are two subvol scan algorithms implemented so far. The two modes
are unimaginatively named 0 and 1.
0: sorts extents by (inode, subvol, offset),
1: scans extents round-robin from all subvols.
Algorithm 0 scans references to the same extent at close to the same
time, which is good for performance; however, whenever a snapshot is
created, the scan of the entire filesystem restarts at the beginning of
the new snapshot.
Algorithm 1 makes continuous forward progress even when new snapshots
are created, but it does not benefit from caching and will force the
kernel to reread data multiple times when there are snapshots.
The algorithm can be selected at run-time using the -m or --scan-mode
option.
We can collect some field data on these before replacing them with
an extent-tree-based scanner. Alternatively, for pre-4.14 kernels,
we can keep these two modes as non-default options.
Currently these algorithms have terrible names. TODO: fix that, but
also TODO: delete all that code and do scans directly from the extent
tree instead.
Augment the scan algorithms relative to their earlier implementation by
batching multiple extents to scan from each subvol before switching to
a different subvol.
Sprinkle some BEESNOTEs on the Task objects so that they don't
disappear from the thread status output.
Adjust some timing constants to deal with the increased latency from
competing threads.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Distribute incoming extents across a thread pool for faster execution
on multi-core, multi-disk environments.
Switch extent enumeration model to scan extent refs consecutively(ish).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In both instances the code contained within (or the conditional
compilation surrounding it) is no longer controversial.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Remove some dead code because dedup-related deadlocks have not been
observed since Linux kernel v4.11.
Preserve rationale of remaining #if 0 block (why we do write/rename
instead of write/fsync/rename) so that people don't try to replace the
"missing" fsync() there.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
With kernel 4.14 there is no sign of the previous LOGICAL_INO performance
problems, so there seems to be no need to throttle threads using this
ioctl.
Increase the FD cache size limits and scan thread count. Let the kernel
figure out scheduling.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BEESNOTE puts a message on the status message stack. BEESINFO logs a
message with rate limiting. The message that was flooding the logs
was coming from BEESINFO not BEESNOTE.
Fix earlier commit which removed the wrong message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This avoids PERFORMANCE warnings when large hash tables are used on slow
CPUs or with lots of worker threads. It also simplifies the code (no
locksets, only one object-wide mutex instead of two).
Fixed a few minor bugs along the way (e.g. we were not setting the dirty
flag on the right hash table extent when we detected hash table errors).
Simplified error handling: IO errors on the hash table are ignored,
instead of throwing an exception into the function that tried to use the
hash table.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When a toxic extent is discovered, insert the offending hash/address/toxic
entry into the hash table.
When a previously discovered toxic extent is encountered, do nothing,
i.e. allow the offending hash/address/toxic entry in the hash table
to expire.
Previously both inserts were removed from the code, but the former one
is required. The latter prevents bees from forgiving toxic extents
(or any hash matching one) should they be relocated, deleted, or simply
become non-toxic.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.
Signed-off-by: Kai Krakow <kai@kaishome.de>
(cherry picked from commit 270a91cf17783813edaa75fb9766858dcf8e1689)
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.
Signed-off-by: Kai Krakow <kai@kaishome.de>
GCC 7 and higher turn a previous warning into an error for implicit
fallthrough. Let's hint the compiler that this is intentional here.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Adjust bees to match changes in Chatter's interface.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 66fd28830d42acbd837dfb08b89a1f9c05c6d474)
To make bees more friendly to use with syslog/systemd, we add an option
to omit timestamps from the log output.
Signed-off-by: Kai Krakow <kai@kaishome.de>
This commit adds a simple getopt options parser to show help. This can
be used as a boilerplate for adding more options later.
Signed-off-by: Kai Krakow <kai@kaishome.de>
After a few hundred subvol threads start running, the inode cache starts
to thrash, and the log gets spammed with messages of the form:
"open_root_nocache <subvolid>: <path>"
Ideally there would be some way to schedule work to minimize inode
thrashing. Until that gets done, just silence the messages for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>