1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-18 05:45:45 +02:00
Zygo Blaxell 56c23c4517 crawl: implement two crawler algorithms and adjust scheduling parameters
There are two subvol scan algorithms implemented so far.  The two modes
are unimaginatively named 0 and 1.

	0:  sorts extents by (inode, subvol, offset),

	1:  scans extents round-robin from all subvols.

Algorithm 0 scans references to the same extent at close to the same
time, which is good for performance; however, whenever a snapshot is
created, the scan of the entire filesystem restarts at the beginning of
the new snapshot.

Algorithm 1 makes continuous forward progress even when new snapshots
are created, but it does not benefit from caching and will force the
kernel to reread data multiple times when there are snapshots.

The algorithm can be selected at run-time using the -m or --scan-mode
option.

We can collect some field data on these before replacing them with
an extent-tree-based scanner.  Alternatively, for pre-4.14 kernels,
we can keep these two modes as non-default options.

Currently these algorithms have terrible names.  TODO:  fix that, but
also TODO: delete all that code and do scans directly from the extent
tree instead.

Augment the scan algorithms relative to their earlier implementation by
batching multiple extents to scan from each subvol before switching to
a different subvol.

Sprinkle some BEESNOTEs on the Task objects so that they don't
disappear from the thread status output.

Adjust some timing constants to deal with the increased latency from
competing threads.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-01-17 22:53:49 -05:00
..