bees/src at 8d08a3c06f340a8678727518f957141e08150696 - bees

mirror of https://github.com/Zygo/bees.git synced 2026-01-08 19:00:22 +00:00

Files


Zygo Blaxell 8d08a3c06f readahead: inject some sanity at the foundation of an insane architecture

This solves some of the worst problems with bees reads:

1.  The kernel readahead doesn't work.  More precisely, it's much better
adapted for a very different use case:  a single thread alternating
between reading a file sequentially and processing the data that was read.
bees has multiple threads which compete for access to IO and then issue
reads in random order immediately after the call to readahead.  The kernel
uses idle ioprio scheduling for the readaheads, so the readaheads get
preempted by the random reads, or the kernel cancels the readaheads because
the data access pattern isn't sequential after the readahead was issued.

2.  Seeking drives perform terribly with multiple competing readers,
especially with btrfs striped profiles where the iops are broken into
tiny stripe-sized pieces.  At one point I intended to read the btrfs
device map and figure out which devices can be read in parallel, but to
make that useful, the user needs to have an array with multiple drives
in single profile, or 4+ drives in raid1 profile.  In all other cases,
the elaborate calculations always return the same result:  there can be
only one reader at a time.

This commit fixes both problems:

1.  Don't use the kernel readahead.  Use normal reads into a dummy
buffer instead.

2.  Allow only one thread to readahead at any time.  Once the read is
completed, the data is in the page cache, and all the random-order small
reads that bees does will hit the page cache, not a spinning disk.
In some cases we need to read two things close together, so add a
`bees_readahead_pair` which holds one lock across both reads.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
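
The two fixes described in the commit message can be illustrated with a minimal sketch. This is not the code from the bees source tree: the names `sketch_readahead`, `sketch_readahead_pair`, `readahead_unlocked` and `s_readahead_mutex`, the 1 MiB scratch buffer size, and the return values are all assumptions made for illustration. The sketch only shows the general idea of replacing kernel readahead with ordinary reads into a throwaway buffer, serialized by a single process-wide lock.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstddef>
#include <mutex>
#include <vector>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical process-wide lock: only one thread may prefetch at a time.
static std::mutex s_readahead_mutex;

// Read [offset, offset + length) from fd into a scratch buffer and discard
// the data. The only goal is to pull the range into the page cache.
// Returns the number of bytes actually read (short at EOF or on error).
static size_t readahead_unlocked(int fd, off_t offset, size_t length)
{
    // Assumed scratch size of at most 1 MiB; the contents are never used.
    std::vector<char> dummy(std::min<size_t>(length, 1 << 20));
    size_t total = 0;
    while (total < length) {
        const size_t want = std::min(dummy.size(), length - total);
        const ssize_t got = pread(fd, dummy.data(), want,
                                  offset + static_cast<off_t>(total));
        if (got < 0 && errno == EINTR) continue;  // interrupted, retry
        if (got <= 0) break;                      // EOF or error: stop
        total += static_cast<size_t>(got);
    }
    return total;
}

// Prefetch one range while holding the global lock.
size_t sketch_readahead(int fd, off_t offset, size_t length)
{
    std::lock_guard<std::mutex> lock(s_readahead_mutex);
    return readahead_unlocked(fd, offset, length);
}

// Prefetch two ranges under a single lock acquisition, so that no other
// thread's prefetch is scheduled between the two reads.
size_t sketch_readahead_pair(int fd1, off_t offset1, size_t length1,
                             int fd2, off_t offset2, size_t length2)
{
    std::lock_guard<std::mutex> lock(s_readahead_mutex);
    return readahead_unlocked(fd1, offset1, length1)
         + readahead_unlocked(fd2, offset2, length2);
}
```

In this sketch a caller would invoke `sketch_readahead` once for the range it is about to examine and then issue its small random-order reads, which should then be served from the page cache. Holding one lock across both reads in the pair variant mirrors the commit's stated reason for adding `bees_readahead_pair`.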

2024-11-30 23:30:33 -05:00

.gitignore

bees: move usage message out of source file and fix a few inaccuracies

2020-12-17 17:54:51 -05:00

bees-context.cc

context: when a task fails to acquire an extent lock, don't go ahead and scan the extent anyway

2024-11-30 23:30:33 -05:00

bees-hash.cc

hash: use kernel readahead instead of bees_readahead to prefetch hash table

2024-11-30 23:30:33 -05:00

bees-resolve.cc

types: member m_fd in BeesFileRange must be protected against data races