
Commit 1dcb77a

2.8
1 parent b39bd8d commit 1dcb77a

File tree

7 files changed: +139 −10 lines changed


astro.config.mjs

Lines changed: 10 additions & 9 deletions

```diff
@@ -1,33 +1,34 @@
+import mdx from "@astrojs/mdx";
 import sitemap from "@astrojs/sitemap";
 import solidJs from "@astrojs/solid-js";
 import { defineConfig } from "astro/config";
 import unocss from "unocss/astro";
-import config from "./src/config";
 
-import mdx from "@astrojs/mdx";
+import config from "./src/config";
 
 // https://astro.build/config
 export default defineConfig({
   integrations: [
     unocss({
-      injectReset: true
+      injectReset: true,
     }),
     sitemap(),
     solidJs(),
-    mdx()
+    mdx(),
   ],
   site: config.site.url,
   base: config.site.baseUrl,
   prefetch: {
     prefetchAll: true,
-    defaultStrategy: "hover"
+    defaultStrategy: "hover",
   },
   markdown: {
     shikiConfig: {
-      themes: config.post.code.theme
-    }
+      themes: config.post.code.theme,
+    },
   },
   redirects: {
+    "/post/announcing-fjall-2": "/post/fjall-2",
     "/post/announcing-fjall-22": "/post/fjall-2-2",
-  }
-});
+  },
+});
```

4 binary files changed (images): 30 KB, 19.9 KB, 34 KB, 27.1 KB

src/content/blog/2024-09-27_fjall-2.mdx

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,6 +1,6 @@
 ---
 title: Announcing Fjall 2.0
-slug: announcing-fjall-2
+slug: fjall-2
 description: Available in all Cargo registries near you
 tags:
 - major
```

Lines changed: 128 additions & 0 deletions (new file)

---
title: Fjall 2.8
slug: fjall-2-8
description: "Better cache API & fast bulk loading"
tags:
- release
- block cache
- performance
published_at: 2025-03-30T19:06:53.186Z
last_modified_at: 2025-03-30T19:06:53.186Z
image: /media/thumbs/kawaii.png
---

Fjall is an embeddable, LSM-based, forbid-unsafe Rust key-value storage engine.
Its goal is to be a reliable & predictable, yet performant, general-purpose KV storage engine.

---

## Bulk loading API

For bulk loading, we want fast insertion of an existing data set.
Because the data set already exists, we can insert it in ascending key order, which makes bulk loading faster.
The write path of a naive LSM-tree is already notoriously short:
any written object is appended to the write-ahead journal, followed by an insert into the in-memory memtable (e.g. a skip list).
Most writes do not even trigger disk I/O when we do not explicitly sync to disk, which is desirable to make bulk loads fast.

However, this write path is still unnecessarily slow for bulk loads.
Logging to the journal and periodically flushing memtables gives a write amplification of 2\* (for ascending data, we never need to compact): every byte is written once to the journal and once more when the memtable is flushed into a segment.

If we can guarantee that:

1. the data is inserted in ascending key order
2. our tree starts out empty

we can skip the journal, flushing, and compaction machinery altogether, while also needing less temporary memory.

A new API gives us direct access to the LSM-tree's internal segment writing mechanism.
It takes a sorted iterator, creates a list of disk segments (a.k.a. `SSTables`) from it, and registers them atomically in the tree.

This API is very useful for:

- schema migrations from tree A to tree B (see the sketch further below)
- migrating from a different DB
- restoring data from a backup

```rs
// The target tree must start out empty
let new_tree = /* ... */;

// Keys must be produced in ascending order
let stream = (0..1_000_000).map(|x| /* create KV-tuples */);

new_tree.ingest(stream)?;

assert_eq!(new_tree.len(), 1_000_000, "bulk load was not correct");
```

> \* Actually, the write amplification in this case is currently 3, but [that will change to 2 in the future](https://github.com/fjall-rs/lsm-tree/issues/121).
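
As a concrete illustration of the schema-migration use case, a migration can be expressed as a streaming transform over a full scan of the old tree. This is a hypothetical sketch: the exact shape of `old_tree.iter()` and the `migrate_value` helper are assumptions for illustration, not Fjall's verbatim API.

```rs
// Hypothetical sketch: stream tree A into tree B, rewriting values on the fly.
// A full scan of an LSM-tree yields keys in ascending order, which is
// exactly the ordering guarantee `ingest` needs, so no sorting step is required.
let stream = old_tree
    .iter()
    .map(|kv| kv.expect("read error")) // assumed: iteration yields Result<(key, value)>
    .map(|(key, value)| (key, migrate_value(&value))); // hypothetical per-item transform

new_tree.ingest(stream)?;
```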

### Benchmark

This benchmark writes 100 million monotonically ascending `u128` integer keys with 100-byte values.
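
A key stream of that shape could be generated like this (a sketch of the workload, not the actual benchmark harness):

```rs
// 100 million ascending u128 keys with 100-byte values.
// Big-endian encoding makes byte order match numeric order,
// so the keys really are monotonically ascending.
let stream = (0..100_000_000u128).map(|x| (x.to_be_bytes(), [0u8; 100]));
```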

<div style="margin-top: 10px; width: 100%; display: flex; justify-content: center">
  <img style="border-radius: 16px; max-height: 500px" src="/media/posts/fjall-28/ingest_cpu.png" />
</div>
<div style="margin-top: 10px; width: 100%; display: flex; justify-content: center">
  <img style="border-radius: 16px; max-height: 500px" src="/media/posts/fjall-28/ingest_write_amp.png" />
</div>
<div style="margin-top: 10px; width: 100%; display: flex; justify-content: center">
  <img style="border-radius: 16px; max-height: 500px" src="/media/posts/fjall-28/ingest_write_buffer.png" />
</div>

Note:

- `redb` used a single write transaction to load all the data, which is the fastest way to bulk load into it
- `sled` does not have a bulk loading mechanism, so comparing it would be unfair

## Unified cache API

Setting the cache capacity has always been a bit of an awkward API.
Originally, it was intended to eventually allow different types of caches, but that never really materialized.
Not only is the API verbose, it also makes it impossible to tune the cache capacity well when key-value separation is used.

With key-value separation, we have two different types of storage: the index tree and the value log.
The index tree is typically much smaller than the value log because it only stores pointers into the value log;
only small values are stored directly in the index tree.
Depending on the average value size, an index-to-value-log size ratio of 1:30,000 is easily possible (in case all values are ~1 MB).
So for 100 GB of blobs, the index tree would be just ~3 MB, which makes it pretty trivial to fully cache the index tree.

However, unless we know the size ratio and data set size _perfectly_, it is impossible to set the block cache size properly such that the index tree can be fully cached.

Consider the example below: we want to spend 1 GB of memory on caching, so we allow 900 MB for blob caching, while reserving 100 MB for the index tree's blocks.
However, if we stored the 100 GB mentioned above, the index tree would be much smaller than its configured cache size, leaving essentially 97 MB of wasted cache that could have been used for blobs (or bloom filters) instead.

```rs
// Before (< 2.8); now deprecated
use std::sync::Arc;
use fjall::{BlobCache, BlockCache, Config};

let keyspace = Config::new(&folder)
    .block_cache(Arc::new(BlockCache::with_capacity_bytes(/* 100 MB */ 100_000_000)))
    .blob_cache(Arc::new(BlobCache::with_capacity_bytes(/* 900 MB */ 900_000_000)))
    .open()?;

// After (2.8+)
use fjall::Config;

let keyspace = Config::new(&folder)
    .cache_size(/* 1 GB */ 1_000_000_000)
    .open()?;
```

Internally, the cache is now one unified cache that stores all types of items (blocks and blobs).
That way, we only have to set its capacity and let the caching algorithm decide which data to evict.
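
Conceptually (a simplified sketch, *not* Fjall's actual internals), a unified cache stores a tagged item type under a single capacity budget, so blocks and blobs compete for space and the eviction policy determines the split dynamically:

```rs
// Simplified sketch of a unified cache entry; names are illustrative.
enum CachedItem {
    Block(Vec<u8>), // an index tree disk block
    Blob(Vec<u8>),  // a value-log blob
}

impl CachedItem {
    // A size-aware cache charges each entry by its byte weight,
    // regardless of which kind of item it is
    fn weight(&self) -> usize {
        match self {
            Self::Block(bytes) | Self::Blob(bytes) => bytes.len(),
        }
    }
}
```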

## Replaced `std::slice::partition_point`

The implementation of `std::slice::partition_point` was changed around rustc **1.82**, which caused performance regressions in binary searches.

Using a less smart, cookie-cutter implementation seems to perform better for `lsm-tree`, restoring some read performance in cached, random-key scenarios:

<div style="margin-top: 10px; width: 100%; display: flex; justify-content: center">
  <img style="border-radius: 16px; max-height: 500px" src="/media/posts/fjall-28/ycsb_c_binary_search.png" />
</div>
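
For reference, the "cookie-cutter" approach is a classic binary search over a predicate; a minimal sketch (not `lsm-tree`'s literal code) looks like this:

```rs
/// Returns the index of the first element for which `pred` is false
/// (the partition point), via a plain branch-per-iteration binary search.
fn partition_point<T>(slice: &[T], pred: impl Fn(&T) -> bool) -> usize {
    let mut lo = 0;
    let mut hi = slice.len();

    while lo < hi {
        let mid = lo + (hi - lo) / 2;

        if pred(&slice[mid]) {
            lo = mid + 1; // the partition point is right of mid
        } else {
            hi = mid; // mid may be the partition point
        }
    }

    lo
}

assert_eq!(partition_point(&[1, 2, 3, 10, 10], |&x| x < 5), 3);
```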

## 2.7

2.7 was mostly a maintenance release with minor, uninteresting features.

0 commit comments
