mirror of https://github.com/Zygo/bees.git
synced 2025-05-17 21:35:45 +02:00

docs: simplify the exit-with-SIGTERM description

The description now matches the code again.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>

parent f21569e88c
commit c354e77634
@@ -8,9 +8,10 @@ are reasonable in most cases.
 Hash Table Sizing
 -----------------
 
-Hash table entries are 16 bytes per data block. The hash table stores
-the most recently read unique hashes. Once the hash table is full,
-each new entry in the table evicts an old entry.
+Hash table entries are 16 bytes per data block. The hash table stores the
+most recently read unique hashes. Once the hash table is full, each new
+entry added to the table evicts an old entry. This makes the hash table
+a sliding window over the most recently scanned data from the filesystem.
 
 Here are some numbers to estimate appropriate hash table sizes:
 
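The sliding-window behaviour added in this hunk can be sketched with a toy model (an illustration only; the real bees hash table is a fixed-size on-disk structure, not a Python dict, and `SlidingHashTable` is a hypothetical name):

```python
from collections import OrderedDict

class SlidingHashTable:
    """Toy model: keep the most recently read hashes, evict the oldest."""
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # block hash -> block address

    def insert(self, block_hash, addr):
        if block_hash in self.entries:
            self.entries.move_to_end(block_hash)  # refresh on re-read
        self.entries[block_hash] = addr
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # full: evict the oldest entry

    def lookup(self, block_hash):
        return self.entries.get(block_hash)

t = SlidingHashTable(max_entries=2)
t.insert(0xAA, 1)
t.insert(0xBB, 2)
t.insert(0xCC, 3)
print(t.lookup(0xAA))  # None -- the oldest hash slid out of the window
print(t.lookup(0xCC))  # 3
```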
@@ -25,9 +26,11 @@ Here are some numbers to estimate appropriate hash table sizes:
 Notes:
 
-* If the hash table is too large, no extra dedupe efficiency is
-obtained, and the extra space just wastes RAM. Extra space can also slow
-bees down by preventing old data from being evicted, so bees wastes time
-looking for matching data that is no longer present on the filesystem.
+* If the hash table is too large, no extra dedupe efficiency is
+obtained, and the extra space wastes RAM. If the hash table contains
+more block records than there are blocks in the filesystem, the extra
+space can slow bees down. A table that is too large prevents obsolete
+data from being evicted, so bees wastes time looking for matching data
+that is no longer present on the filesystem.
 
 * If the hash table is too small, bees extrapolates from matching
 blocks to find matching adjacent blocks in the filesystem that have been
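The "more block records than there are blocks in the filesystem" bound above is simple arithmetic; a hypothetical helper, assuming 4KiB blocks and the 16-byte entry size stated earlier in this document:

```python
def max_useful_entries(filesystem_bytes, block_size=4096):
    """Beyond one entry per filesystem block, extra table space cannot help."""
    return filesystem_bytes // block_size

# A 16 GiB hash table holds (16 GiB / 16 B) = 2**30 entries.  On a 2 TiB
# filesystem there are only 2**29 4 KiB blocks, so half the table is wasted.
entries = (16 << 30) // 16
print(entries > max_useful_entries(2 << 40))  # True -- table is oversized
```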
@@ -36,6 +39,10 @@ one block in common between two extents in order to be able to dedupe
 the entire extents. This provides significantly more dedupe hit rate
 per hash table byte than other dedupe tools.
 
+* There is a fairly wide range of usable hash table sizes, and performance
+degrades according to a smooth probabilistic curve in both directions.
+Double or half the optimum size usually works just as well.
+
 * When counting unique data in compressed data blocks to estimate
 optimum hash table size, count the *uncompressed* size of the data.
 
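The extrapolation step that makes small tables workable (one block in common is enough to dedupe whole extents) can be sketched as follows; a simplified illustration, not the actual bees matching code:

```python
def extend_match(src, dst, i, j):
    """Toy extrapolation: given src[i] == dst[j], grow the match in both
    directions by comparing adjacent blocks directly (no hashes needed)."""
    lo = 0
    while i - lo > 0 and j - lo > 0 and src[i - lo - 1] == dst[j - lo - 1]:
        lo += 1
    hi = 1
    while i + hi < len(src) and j + hi < len(dst) and src[i + hi] == dst[j + hi]:
        hi += 1
    return (i - lo, i + hi)  # half-open matched block range in src

# Only block "C" needs a surviving hash table entry; "B" and "D" are
# found by comparing neighbours of the single hash hit.
src = ["A", "B", "C", "D", "E"]
dst = ["x", "B", "C", "D", "y"]
print(extend_match(src, dst, 2, 2))  # (1, 4): blocks B..D deduped from one hit
```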
@@ -66,11 +73,11 @@ data on an uncompressed filesystem. Dedupe efficiency falls dramatically
 with hash tables smaller than 128MB/TB as the average dedupe extent size
 is larger than the largest possible compressed extent size (128KB).
 
-* **Short writes** also shorten the average extent length and increase
-optimum hash table size. If a database writes to files randomly using
-4K page writes, all of these extents will be 4K in length, and the hash
-table size must be increased to retain each one (or the user must accept
-a lower dedupe hit rate).
+* **Short writes or fragmentation** also shorten the average extent
+length and increase optimum hash table size. If a database writes to
+files randomly using 4K page writes, all of these extents will be 4K
+in length, and the hash table size must be increased to retain each one
+(or the user must accept a lower dedupe hit rate).
 
 Defragmenting files that have had many short writes increases the
 extent length and therefore reduces the optimum hash table size.
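The sizing figures above follow from keeping roughly one retained entry per dedupe extent; a hypothetical helper reproducing the arithmetic (16 bytes per entry, as stated earlier in this document):

```python
def hash_table_bytes(unique_data_bytes, avg_extent_size=4096, entry_size=16):
    """Hash table size needed to keep roughly one entry per extent.

    One surviving entry per extent is enough, because bees extends a
    single block match across the whole extent.
    """
    return (unique_data_bytes // avg_extent_size) * entry_size

# 1 TiB of unique data in maximal 128K compressed extents -> 128 MiB/TiB
print(hash_table_bytes(1 << 40, 128 * 1024))  # 134217728
# the same data rewritten as 4K database pages -> 4 GiB/TiB
print(hash_table_bytes(1 << 40, 4096))        # 4294967296
```

This is why short writes push the optimum table size up by three orders of magnitude in the database example.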
@@ -51,73 +51,36 @@ loops early. The exception text in this case is:
 Terminating bees with SIGTERM
 -----------------------------
 
-bees is designed to survive host crashes, so it is safe to terminate
-bees using SIGKILL; however, when bees next starts up, it will repeat
-some work that was performed between the last bees crawl state save point
-and the SIGKILL (up to 15 minutes). If bees is stopped and started less
-than once per day, then this is not a problem as the proportional impact
-is quite small; however, users who stop and start bees daily or even
-more often may prefer to have a clean shutdown with SIGTERM so bees can
-restart faster.
-
-bees handling of SIGTERM can take a long time on machines with some or
-all of:
-
-* Large RAM and `vm.dirty_ratio`
-* Large number of active bees worker threads
-* Large number of bees temporary files (proportional to thread count)
-* Large hash table size
-* Large filesystem size
-* High IO latency, especially "low power" spinning disks
-* High filesystem activity, especially duplicate data writes
-
-Each of these factors individually increases the total time required
-to perform a clean bees shutdown. When combined, the factors can
-multiply with each other, dramatically increasing the time required to
-flush bees state to disk.
-
-On a large system with many of the above factors present, a "clean"
-bees shutdown can take more than 20 minutes. Even a small machine
-(16GB RAM, 1GB hash table, 1TB NVME disk) can take several seconds to
-complete a SIGTERM shutdown.
-
-The shutdown procedure performs potentially long-running tasks in
-this order:
-
-1. Worker threads finish executing their current Task and exit.
-Threads executing `LOGICAL_INO` ioctl calls usually finish quickly,
-but btrfs imposes no limit on the ioctl's running time, so it
-can take several minutes in rare bad cases. If there is a btrfs
-commit already in progress on the filesystem, then most worker
-threads will be blocked until the btrfs commit is finished.
-
-2. Crawl state is saved to `$BEESHOME`. This normally completes
-relatively quickly (a few seconds at most). This is the most
-important bees state to save to disk as it directly impacts
-restart time, so it is done as early as possible (but no earlier).
-
-3. Hash table is written to disk. Normally the hash table is
-trickled back to disk at a rate of about 2GB per hour;
-however, SIGTERM causes bees to attempt to flush the whole table
-immediately. If bees has recently been idle then the hash table is
-likely already flushed to disk, so this step will finish quickly;
-however, if bees has recently been active and the hash table is
-large relative to RAM size, the blast of rapidly written data
-can force the Linux VFS to block all writes to the filesystem
-for sufficient time to complete all pending btrfs metadata
-writes which accumulated during the btrfs commit before bees
-received SIGTERM...and _then_ let bees write out the hash table.
-The time spent here depends on the size of RAM, speed of disks,
-and aggressiveness of competing filesystem workloads.
-
-4. bees temporary files are closed, which implies deletion of their
-inodes. These are files which consist entirely of shared extent
-structures, and btrfs takes an unusually long time to delete such
-files (up to a few minutes for each on slow spinning disks).
-
-If bees is terminated with SIGKILL, only step #1 and #4 are performed (the
-kernel performs these automatically if bees exits). This reduces the
-shutdown time at the cost of increased startup time.
+bees is designed to survive host crashes, so it is safe to terminate bees
+using SIGKILL; however, when bees next starts up, it will repeat some
+work that was performed between the last bees crawl state save point
+and the SIGKILL (up to 15 minutes), and a large hash table may not be
+completely written back to disk, so some duplicate matches will be lost.
+
+If bees is stopped and started less than once per week, then this is not
+a problem as the proportional impact is quite small; however, users who
+stop and start bees daily or even more often may prefer to have a clean
+shutdown with SIGTERM so bees can restart faster.
+
+The shutdown procedure performs these steps:
+
+1. Crawl state is saved to `$BEESHOME`. This is the most
+important bees state to save to disk as it directly impacts
+restart time, so it is done as early as possible.
+
+2. Hash table is written to disk. Normally the hash table is
+trickled back to disk at a rate of about 128KiB per second;
+however, SIGTERM causes bees to attempt to flush the whole table
+immediately. The time spent here depends on the size of RAM, speed
+of disks, and aggressiveness of competing filesystem workloads.
+
+3. The bees process calls `_exit`, which terminates all running
+worker threads and closes and deletes all temporary files. This
+can take a while _after_ the bees process exits, especially on
+slow spinning disks.
 
 Balances
 --------
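The trickle rate in step 2 gives a feel for why the SIGTERM fast flush matters; a back-of-the-envelope calculation using the 128KiB/s figure above (the helper name is made up for illustration):

```python
def trickle_seconds(table_bytes, rate_bytes_per_sec=128 * 1024):
    """Time for the background writeback to cover the whole hash table."""
    return table_bytes / rate_bytes_per_sec

# A 1 GiB hash table takes over two hours to trickle back at 128 KiB/s,
# so a SIGTERM arriving after heavy activity may have a lot left to flush.
print(round(trickle_seconds(1 << 30) / 3600, 1))  # 2.3
```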