1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-18 05:45:45 +02:00
Zygo Blaxell 63ddbb9a4f context: serialize LOGICAL_INO calls
LOGICAL_INO can trip over the btrfs slow-backrefs bug, resulting in
some very long in-kernel runtimes.  If too many threads are executing
LOGICAL_INO then there may be no cores left on the system to run other
tasks.

Toxic extent detection is done by a very rudimentary algorithm which
can be confused by unrelated sources of latency within btrfs (especially
commit latency).  The algorithm can also be confused by other threads
executing the LOGICAL_INO ioctl.

These are two good reasons to prevent any two threads in a single bees
process instance from executing LOGICAL_INO at the same time, so let's
do that.

It is possible to limit the number of threads executing LOGICAL_INO with
the -c and -C options; however, this also limits the number of threads
which can perform any operation, while only LOGICAL_INO (*) has such a
profound effect on the rest of system operation.

Also make the status message clearer about exactly when LOGICAL_INO is
executed, as opposed to merely waiting to acquire a lock before executing
the ioctl.

(*) or maybe FILE_EXTENT_SAME.  The problem function that keeps showing
up in kernel stack traces is find_parent_nodes, which is called by both
the LOGICAL_INO and FILE_EXTENT_SAME ioctls.  We'll try this change
first and see if it prevents any recurrences of forced watchdog reboots;
if it does not, then we'll limit FILE_EXTENT_SAME the same way.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:16 -04:00
..
2018-10-19 20:21:04 -04:00
2018-01-26 23:47:47 -05:00