Silence the three(!) log messages per crawl increment, and an extra one at
the end of the subvol.
The three critical messages per subvol crawl cycle are:
Next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <A>..<B> started <T> (<AGO>s ago)
Subvol has been completely scanned and a new transaction range will
be created. CrawlState is the state of the old subvol.
Restarted crawl BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago)
Subvol has been restarted. CrawlState is the state of the new subvol.
Deferring next transid in BeesCrawlState <SUBVOL>:0 offset 0x0 transid <B>..<C> started <T+AGO> (0s ago)
Subvol has been completely scanned, but it is too soon to start a
new scan.
Fix the "Restart..." message to use the correct verb tense and to use
the correct BeesCrawlState data.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit adds log levels to the output. Under systemd, the levels
produce colored lines; otherwise they probably show up as just a number.
Bees is very chatty, so this paves the road for log level filtering.
Signed-off-by: Kai Krakow <kai@kaishome.de>
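A minimal sketch of how a log level prefix might be attached to each line.
It assumes the sd-daemon convention, where systemd reads a leading "<N>"
on a stdout/stderr line as a syslog priority. The names LogLevel,
format_log_line and should_log are hypothetical and are not taken from
the bees source.

```cpp
#include <sstream>
#include <string>

// Syslog-style priorities (same numeric values as in <syslog.h>).
// The enumerator names are hypothetical, chosen to avoid the LOG_* macros.
enum LogLevel {
    LEVEL_ERR = 3,
    LEVEL_WARNING = 4,
    LEVEL_NOTICE = 5,
    LEVEL_INFO = 6,
    LEVEL_DEBUG = 7
};

// Prefix a message with "<N>". Under systemd the prefix is consumed and
// turned into a priority; elsewhere it stays visible as a number.
std::string format_log_line(LogLevel level, const std::string &msg)
{
    std::ostringstream oss;
    oss << "<" << static_cast<int>(level) << ">" << msg;
    return oss.str();
}

// Hypothetical filter for later use: a smaller number is more severe,
// so a message passes when its level does not exceed the threshold.
bool should_log(LogLevel level, LogLevel max_level)
{
    return level <= max_level;
}
```

For example, format_log_line(LEVEL_INFO, "crawl started") yields
"<6>crawl started".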
There are two subvol scan algorithms implemented so far. The two modes
are unimaginatively named 0 and 1.
0: sorts extents by (inode, subvol, offset),
1: scans extents round-robin from all subvols.
Algorithm 0 scans references to the same extent at close to the same
time, which is good for performance; however, whenever a snapshot is
created, the scan of the entire filesystem restarts at the beginning of
the new snapshot.
Algorithm 1 makes continuous forward progress even when new snapshots
are created, but it does not benefit from caching and will force the
kernel to reread data multiple times when there are snapshots.
The algorithm can be selected at run-time using the -m or --scan-mode
option.
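The two orderings described above can be sketched roughly as follows.
This is an illustration only: ExtentRef, order_mode0 and order_mode1 are
hypothetical names, and the real scanner works on crawl state rather
than on in-memory vectors.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one extent reference found in a subvol tree.
struct ExtentRef {
    uint64_t subvol;
    uint64_t inode;
    uint64_t offset;
};

// Mode 0: a single global order sorted by (inode, subvol, offset), so
// references to the same inode in different subvols end up close together.
std::vector<ExtentRef> order_mode0(std::vector<ExtentRef> refs)
{
    std::sort(refs.begin(), refs.end(),
              [](const ExtentRef &a, const ExtentRef &b) {
                  return std::tie(a.inode, a.subvol, a.offset)
                       < std::tie(b.inode, b.subvol, b.offset);
              });
    return refs;
}

// Mode 1: take the next unscanned extent from each subvol in turn, so
// every subvol keeps making forward progress.
std::vector<ExtentRef> order_mode1(
    const std::vector<std::vector<ExtentRef>> &per_subvol)
{
    std::vector<ExtentRef> out;
    std::vector<std::size_t> next(per_subvol.size(), 0);
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (std::size_t i = 0; i < per_subvol.size(); ++i) {
            if (next[i] < per_subvol[i].size()) {
                out.push_back(per_subvol[i][next[i]++]);
                progressed = true;
            }
        }
    }
    return out;
}
```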
We can collect some field data on these before replacing them with
an extent-tree-based scanner. Alternatively, for pre-4.14 kernels,
we can keep these two modes as non-default options.
Currently these algorithms have terrible names. TODO: fix that, but
also TODO: delete all that code and do scans directly from the extent
tree instead.
Augment the scan algorithms relative to their earlier implementation by
batching multiple extents to scan from each subvol before switching to
a different subvol.
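As a rough illustration of the batching idea, the sketch below pulls up
to `batch` items from one queue before moving to the next. The function
name and the use of plain ints as stand-ins for extents are assumptions
made for the example.

```cpp
#include <cstddef>
#include <vector>

// Round-robin over per-subvol queues, taking up to `batch` items from
// each queue per visit. With batch == 1 this degenerates to plain
// round-robin; with batch == 0 nothing is taken and the result is empty.
std::vector<int> round_robin_batched(
    const std::vector<std::vector<int>> &queues, std::size_t batch)
{
    std::vector<int> out;
    std::vector<std::size_t> next(queues.size(), 0);
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (std::size_t i = 0; i < queues.size(); ++i) {
            for (std::size_t n = 0;
                 n < batch && next[i] < queues[i].size(); ++n) {
                out.push_back(queues[i][next[i]++]);
                progressed = true;
            }
        }
    }
    return out;
}
```

A larger batch keeps consecutive work within one subvol for longer, at
the price of coarser interleaving between subvols.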
Sprinkle some BEESNOTEs on the Task objects so that they don't
disappear from the thread status output.
Adjust some timing constants to deal with the increased latency from
competing threads.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Distribute incoming extents across a thread pool for faster execution
on multi-core, multi-disk environments.
Switch extent enumeration model to scan extent refs consecutively(ish).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
In both instances the code contained within (or the conditional
compilation surrounding it) is no longer controversial.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
BEESNOTE puts a message on the status message stack. BEESINFO logs a
message with rate limiting. The message that was flooding the logs
was coming from BEESINFO not BEESNOTE.
Fix earlier commit which removed the wrong message.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
After a few hundred subvol threads start running, the inode cache starts
to thrash, and the log gets spammed with messages of the form:
"open_root_nocache <subvolid>: <path>"
Ideally there would be some way to schedule work to minimize inode
thrashing. Until that gets done, just silence the messages for now.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we lose a race and open the wrong file, we will not retry with the
next path if the file we opened had incompatible flags. We need to keep
trying paths until we open the correct file or run out of paths.
Fix by moving the inode flag check after the checks for file identity.
Output attributes in hex to be consistent with other attribute error
messages.
There is no need to report root and file paths separately in the error
message for incompatible flags because we have confirmed the identity of
the file before the incompatible flag error is detected. Other messages
in this loop still output root path and file_path separately because
the identity of 'rv' is unknown at the time these messages are emitted.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
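The reordered checks can be sketched as below. All names here
(FileInfo, Opener, OpenStatus, open_matching) are hypothetical; the
point is only the order of the tests: identity first, so that a lost
race moves on to the next path, and flags second, once the file is
known to be the right one.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical summary of what we learn after opening a path.
struct FileInfo {
    uint64_t root;
    uint64_t inode;
    uint32_t flags;
};

enum class OpenStatus { Opened, IncompatibleFlags, NoPathMatched };

// Hypothetical opener: returns false if the path could not be opened.
using Opener = std::function<bool(const std::string &path, FileInfo &out)>;

OpenStatus open_matching(const std::vector<std::string> &paths,
                         const Opener &open_path,
                         uint64_t want_root, uint64_t want_inode,
                         uint32_t incompatible_mask, FileInfo &result)
{
    for (const auto &path : paths) {
        FileInfo info{};
        if (!open_path(path, info)) {
            continue;  // could not open, try the next path
        }
        // Identity check first: a mismatch means we lost a race and
        // opened some other file, so keep trying the remaining paths.
        if (info.root != want_root || info.inode != want_inode) {
            continue;
        }
        // Only now is it meaningful to look at the flags, because we
        // know this is the file we were asked for.
        if (info.flags & incompatible_mask) {
            return OpenStatus::IncompatibleFlags;
        }
        result = info;
        return OpenStatus::Opened;
    }
    return OpenStatus::NoPathMatched;
}
```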
If you have many nocow files, or a few big ones (like VM images), that
contain a lot of potential deduplication candidates, bees becomes
incredibly slow, running through a lot of "invalid operation" exceptions.
Let's just skip over such files to get more bang for the buck. I did no
regression testing as this patch seems trivial (and I cannot imagine any
pitfalls either). The process progresses much faster for me now.
GCC 7+ adds the implicit-fallthrough warning.
In some places the fallthrough is expected, so disable the warning there.
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
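The commit above disables the warning. For comparison only, a different
technique is to mark each intended fallthrough with the standard C++17
[[fallthrough]] attribute, which keeps the warning active everywhere
else. The function below is a made-up example, not code from bees.

```cpp
// Hypothetical example: each case intentionally falls into the next,
// and the attribute tells the compiler that this is deliberate.
int count_down_from(int kind)
{
    int n = 0;
    switch (kind) {
    case 3:
        ++n;
        [[fallthrough]];
    case 2:
        ++n;
        [[fallthrough]];
    case 1:
        ++n;
        break;
    default:
        break;
    }
    return n;
}
```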
Before:
unique_lock<mutex> lock(some_mutex);
// run lock.~unique_lock() because return
// return reference to unprotected heap
return foo[bar];
After:
unique_lock<mutex> lock(some_mutex);
// make copy of object on heap protected by mutex lock
auto tmp_copy = foo[bar];
// run lock.~unique_lock() because return
// pass locally allocated object to copy constructor
return tmp_copy;
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
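A self-contained sketch of the "After" pattern, using a hypothetical
Registry class in place of the real foo/bar objects: the accessor
returns by value and makes its copy while the mutex is still held.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical container guarded by a mutex.
class Registry {
    mutable std::mutex m_mutex;
    std::map<int, std::string> m_map;

public:
    void set(int key, const std::string &value)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_map[key] = value;
    }

    // Returns a copy, never a reference into m_map, so the caller holds
    // nothing that another thread could modify after the lock is gone.
    std::string get(int key) const
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        auto it = m_map.find(key);
        // Copy made explicitly while the lock is held.
        std::string tmp_copy =
            (it == m_map.end()) ? std::string() : it->second;
        return tmp_copy;
    }
};
```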
Previously, the scan order processed each subvol in order. This required
very large amounts of temporary disk space, as a full filesystem scan
was required before any shared extents could be deduped. If the hash
table RAM was underprovisioned this would mean some shared dup blocks
were removed from the hash table before they could be deduped.
Currently the scan order takes the first unscanned extent from each
subvol. This works well if--and only if--the subvols are either empty
or children of a common ancestor. It forces the same inode/offset pairs
to be read at close to the same time from each subvol.
When a new snapshot is created, this ordering diverts scanning to the
new subvol until it catches up to the existing subvols. For large
filesystems with frequent snapshot creation this means that the scanner
never reaches the end of all subvols. Each new subvol effectively
resets the current scan position for the entire filesystem to zero.
This prevents bees from ever completing the first filesystem scan.
Change the order again, so that we now read one unscanned extent from
each subvol in round-robin fashion. When a new subvol is created, we
share scan time between old and new subvols. This ensures we eventually
finish scanning initial subvols and enter the incremental scanning state.
The cost of this change is more repeated reading of shared extents at
scan time with less benefit from disk-device-level caching; however, the
only way to really fix this problem is to implement scanning on tree 2
(the btrfs extent tree) instead of the subvol trees.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The thread name has an arbitrarily limited size, and we are eventually
removing support for multiple paths in a single bees daemon process.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
We really do need some large buffers for BtrfsIoctlSearchKey in some
cases, but we don't need to zero them out first. Don't do that so we
save some CPU.
Reduce the default buffer size to 4K because most BISK users don't
need much more than 1K. Set the buffer size explicitly to the product of
the number of items and the desired item size in the places that really
need a lot of items.
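One way to get a large buffer without paying for zero-fill, shown as a
sketch under assumptions: RawBuffer and make_search_buffer are
hypothetical names, and the real BtrfsIoctlSearchKey may manage its
memory differently. The relevant C++ detail is that `new char[n]`
default-initializes (contents indeterminate), whereas
std::vector<char>(n) or std::make_unique<char[]>(n) would zero the
memory.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical buffer holder.
struct RawBuffer {
    std::unique_ptr<char[]> data;
    std::size_t size;
};

// Default to 4K; callers that need many items pass a count and a
// per-item size, and the buffer is sized to their product.
RawBuffer make_search_buffer(std::size_t item_count = 0,
                             std::size_t item_size = 0)
{
    const std::size_t default_size = 4096;
    const std::size_t size = (item_count != 0 && item_size != 0)
                                 ? item_count * item_size
                                 : default_size;
    // new char[size] leaves the bytes uninitialized: no memset cost.
    // The contents must be written (e.g. by the ioctl) before reading.
    return RawBuffer{std::unique_ptr<char[]>(new char[size]), size};
}
```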
Linux kernel commit 7f8e406 ("btrfs: improve delayed refs iterations")
seems to dramatically improve LOGICAL_INO performance. Hopefully this
commit will find its way into mainline Linux soon.
This means that most of the time in Bees is now spent on block reading
(50-75%); however, there is still a big gap between block read and
the sum of everything else we are measuring with the "*_ms" counters.
This gap is about 30% of the run time, so it would be good to find out
what's in the gap.
Add ms counters around the crawl and open calls to capture where we are
spending all the time.
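A sketch of how such millisecond counters could be wrapped around a
call with an RAII timer. MsCounters, ScopedMsTimer and the counter
names "crawl_ms" and "open_ms" are assumptions for illustration; the
bees counters may be implemented differently.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical thread-safe table of accumulated milliseconds.
class MsCounters {
    std::mutex m_mutex;
    std::map<std::string, uint64_t> m_ms;

public:
    void add(const std::string &name, uint64_t ms)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ms[name] += ms;
    }

    uint64_t get(const std::string &name)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        auto it = m_ms.find(name);
        return it == m_ms.end() ? 0 : it->second;
    }
};

// Adds the elapsed wall-clock time of its scope to one counter.
class ScopedMsTimer {
    MsCounters &m_counters;
    std::string m_name;
    std::chrono::steady_clock::time_point m_start;

public:
    ScopedMsTimer(MsCounters &counters, std::string name)
        : m_counters(counters), m_name(std::move(name)),
          m_start(std::chrono::steady_clock::now())
    {
    }

    ~ScopedMsTimer()
    {
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - m_start);
        m_counters.add(m_name, static_cast<uint64_t>(elapsed.count()));
    }
};
```

Usage would look like `{ ScopedMsTimer t(counters, "open_ms"); /* open call */ }`,
so the time is recorded even if the call throws.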