From 82474b4ef44ab0049990c123b901084a6b66850b Mon Sep 17 00:00:00 2001
From: Zygo Blaxell
Date: Sat, 25 Feb 2023 03:12:27 -0500
Subject: [PATCH] docs: update front page

At least one user was significantly confused by "designed for large
filesystems".

The btrfs send workarounds aren't new any more.

Signed-off-by: Zygo Blaxell
---
 docs/index.md | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index 5d90f10..c607db7 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -6,10 +6,11 @@ Best-Effort Extent-Same, a btrfs deduplication agent.
 About bees
 ----------
 
-bees is a block-oriented userspace deduplication agent designed for large
-btrfs filesystems. It is an offline dedupe combined with an incremental
-data scan capability to minimize time data spends on disk from write
-to dedupe.
+bees is a block-oriented userspace deduplication agent designed to scale
+up to large btrfs filesystems. It is a daemon that performs offline
+dedupe automatically as required. It uses an incremental data scan
+capability to minimize memory usage and dedupe new data soon after it
+appears in the filesystem.
 
 Strengths
 ---------
@@ -17,23 +18,23 @@ Strengths
  
  * Space-efficient hash table and matching algorithms - can use as little as 1 GB hash table per 10 TB unique data (0.1GB/TB)
  * Daemon incrementally dedupes new data using btrfs tree search
  * Works with btrfs compression - dedupe any combination of compressed and uncompressed files
- * **NEW** [Works around `btrfs send` problems with dedupe and incremental parent snapshots](options.md)
- * Works around btrfs filesystem structure to free more disk space
- * Persistent hash table for rapid restart after shutdown
+ * Works around btrfs filesystem structure issues to free more disk space than generic dedupe tools
+ * Persistent hash table and checkpoint for rapid restart after shutdown
  * Whole-filesystem dedupe - including snapshots
  * Constant hash table size - no increased RAM usage if data set becomes larger
  * Works on live data - no scheduled downtime required
  * Automatic self-throttling based on system load
+ * Low memory footprint (excluding the hash table)
 
 Weaknesses
 ----------
 
- * Whole-filesystem dedupe - has no include/exclude filters, does not accept file lists
- * Requires root privilege (or `CAP_SYS_ADMIN`)
+ * Whole-filesystem dedupe - has no include/exclude filters, does not accept file lists, terminates only when explicitly requested
+ * Requires root privilege (or `CAP_SYS_ADMIN`) to work
  * First run may require temporary disk space for extent reorganization
  * [First run may increase metadata space usage if many snapshots exist](gotchas.md)
  * Constant hash table size - no decreased RAM usage if data set becomes smaller
- * btrfs only
+ * btrfs only (bcachefs and xfs are missing various features)
 
 Installation and Usage
 ----------------------
@@ -70,6 +71,6 @@ You can also use Github:
 Copyright & License
 -------------------
 
-Copyright 2015-2022 Zygo Blaxell .
+Copyright 2015-2023 Zygo Blaxell .
 
 GPL (version 3 or later).
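
The Strengths list in the patched page quotes a hash-table sizing ratio of roughly 1 GB of hash table per 10 TB of unique data (0.1 GB/TB). As a rough illustration of how that quoted ratio scales, here is a minimal sketch assuming only the stated figure; the helper name and defaults below are invented for this example and are not part of bees or of this patch:

```python
#!/usr/bin/env python3
"""Back-of-the-envelope sizing based on the ~0.1 GB of hash table per
TB of unique data figure quoted in the Strengths list. Illustrative
only: the function name and defaults are not bees interfaces."""


def suggested_hash_table_bytes(unique_data_bytes: float,
                               gb_per_tb: float = 0.1) -> int:
    """Scale the quoted 0.1 GB/TB ratio to an arbitrary amount of data."""
    terabytes = unique_data_bytes / 1e12   # decimal TB of unique data
    gigabytes = terabytes * gb_per_tb      # hash table size in decimal GB
    return int(gigabytes * 1e9)            # back to bytes


if __name__ == "__main__":
    # 10 TB of unique data maps to roughly a 1 GB hash table,
    # matching the figure quoted in the Strengths list.
    print(suggested_hash_table_bytes(10e12))  # 1000000000
```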