mirror of https://github.com/Zygo/bees.git synced 2025-07-07 02:42:27 +02:00

context: speed up orderly process termination

Quite often bees exceeds its service timeout for termination because
it is waiting for a loop embedded in a Task to finish some long-running
btrfs operation.  This can cause bees to be aborted by SIGKILL before
it can completely flush the hash table or save crawl state.

There are only two important things SIGTERM does when bees terminates:
 1.  Save crawl progress
 2.  Flush out the hash table

Everything else is automatically handled by the kernel when the process
is terminated by SIGKILL, so we don't have to bother doing it ourselves.
This can save considerable time at shutdown since we don't have to wait
for every thread to reach a point where it becomes idle, or force loops
to terminate by throwing exceptions, or check a condition every time we
access a pointer.  Instead, we need do only the things in the list
above, and then call _exit() to clean up everything else.
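
A minimal sketch of that shutdown order, not the bees implementation. `terminate_quickly` and its three callable parameters are hypothetical names; the exit step is injectable only so the sequence can be exercised without ending the process. The two saves run one after the other here for simplicity.

```cpp
#include <cassert>
#include <cstdlib>      // EXIT_SUCCESS
#include <functional>
#include <unistd.h>     // _exit()

// Hypothetical helper (not a bees function): perform only the two
// essential shutdown steps, then leave the process immediately.
void terminate_quickly(const std::function<void()> &save_crawl_progress,
                       const std::function<void()> &flush_hash_table,
                       const std::function<void(int)> &exit_fn =
                           [](int rc) { _exit(rc); })
{
    assert(save_crawl_progress && flush_hash_table && exit_fn);

    save_crawl_progress();   // 1. persist crawl state
    flush_hash_table();      // 2. write out dirty hash table data

    // No thread joins and no destructors: with the default exit_fn,
    // _exit() returns control to the kernel, which releases memory,
    // file descriptors and threads on its own.
    exit_fn(EXIT_SUCCESS);
}
```

With the default `exit_fn` the call never returns, so anything that must reach disk has to be finished before that last line.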

Hash table and crawl state writeback can happen in their background
threads instead of the foreground one.  Separate the "stop" method for
these classes into "stop_request" and "stop_wait" so that these writebacks
can run at the same time.
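
The split can be illustrated with a toy class, assuming each writer owns one background thread. `BackgroundWriter` and `flushed()` are hypothetical; only the `stop_request`/`stop_wait` names and the flag-plus-condvar pattern come from this commit.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical background writer, standing in for the hash table or
// crawl state classes. Not the bees implementation.
class BackgroundWriter {
public:
    BackgroundWriter() : m_thread([this] { run(); }) {}
    ~BackgroundWriter() { stop_request(); stop_wait(); }

    // Phase 1: ask the background thread to do its final writeback.
    // Returns immediately, so several writers can be asked in a row.
    void stop_request() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_stop_requested = true;
        m_condvar.notify_all();
    }

    // Phase 2: block until the background thread has finished.
    void stop_wait() {
        if (m_thread.joinable()) m_thread.join();
    }

    bool flushed() const {
        std::unique_lock<std::mutex> lock(m_mutex);
        return m_flushed;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_condvar.wait(lock, [this] { return m_stop_requested; });
        // A real writer would drop the lock and do slow I/O here.
        m_flushed = true;
    }

    mutable std::mutex m_mutex;
    std::condition_variable m_condvar;
    bool m_stop_requested = false;
    bool m_flushed = false;
    std::thread m_thread;   // declared last: starts after the other members exist
};
```

Calling `stop_request()` on every writer before the first `stop_wait()` lets the final writebacks overlap instead of running back to back.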

Deprecate and remove all references to the BeesHalt exception, and remove
several unnecessary checks for BeesContext::stop_requested.

Pause the task queue instead of cancelling it, which preserves the
crawl progress state and stops new Tasks from competing for iops and
CPU during writeback.
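
The difference between pausing and cancelling can be shown with a toy queue. `PausableQueue` is a hypothetical, caller-driven model and makes no claim about how the bees Task queue is built.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical task queue used only to contrast pause() with cancel().
class PausableQueue {
public:
    void push(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push_back(std::move(task));
    }

    // While paused, no new task is started; queued tasks are kept.
    void pause(bool paused = true) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_paused = paused;
    }

    // Cancelling throws the queued tasks away.
    void cancel() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.clear();
    }

    // Run the next task unless paused. Returns true if a task ran.
    bool run_one() {
        std::function<void()> task;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (m_paused || m_tasks.empty()) return false;
            task = std::move(m_tasks.front());
            m_tasks.pop_front();
        }
        task();   // run outside the lock
        return true;
    }

    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_tasks.size();
    }

private:
    mutable std::mutex m_mutex;
    std::deque<std::function<void()>> m_tasks;
    bool m_paused = false;
};
```

After `pause()` the pending count is unchanged and nothing new competes with writeback; after `cancel()` the queued work is gone.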

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Zygo Blaxell
2022-11-19 02:45:15 -05:00
parent 594ad1786d
commit 31b2aa3c0d
4 changed files with 72 additions and 106 deletions

@@ -595,7 +595,7 @@ BeesRoots::start()
 }
 void
-BeesRoots::stop()
+BeesRoots::stop_request()
 {
 	BEESLOGDEBUG("BeesRoots stop requested");
 	BEESNOTE("stopping BeesRoots");
@@ -603,7 +603,11 @@ BeesRoots::stop()
 	m_stop_requested = true;
 	m_stop_condvar.notify_all();
 	lock.unlock();
+}
+void
+BeesRoots::stop_wait()
+{
 	// Stop crawl writeback first because we will break progress
 	// state tracking when we cancel the TaskMaster queue
 	BEESLOGDEBUG("Waiting for crawl writeback");