Skip to content

feat(fs/mem): Tarfs for Hermit images implementation#2390

Open
fogti wants to merge 2 commits intohermit-os:mainfrom
fogti:tarfs-v2
Open

feat(fs/mem): Tarfs for Hermit images implementation#2390
fogti wants to merge 2 commits intohermit-os:mainfrom
fogti:tarfs-v2

Conversation

@fogti
Copy link
Copy Markdown
Contributor

@fogti fogti commented Apr 21, 2026

Currently untested. Supersedes #2077.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark Results

Details
Benchmark Current: 680cc9c Previous: 10515fe Performance Ratio
startup_benchmark Build Time 92.19 s 104.77 s 0.88
startup_benchmark File Size 0.79 MB 0.79 MB 1.00
Startup Time - 1 core 0.83 s (±0.03 s) 0.81 s (±0.03 s) 1.02
Startup Time - 2 cores 0.84 s (±0.03 s) 0.79 s (±0.03 s) 1.06
Startup Time - 4 cores 0.85 s (±0.02 s) 0.81 s (±0.04 s) 1.05
multithreaded_benchmark Build Time 93.17 s 93.44 s 1.00
multithreaded_benchmark File Size 0.89 MB 0.89 MB 1.00
Multithreaded Pi Efficiency - 2 Threads 90.47 % (±6.02 %) 89.01 % (±7.12 %) 1.02
Multithreaded Pi Efficiency - 4 Threads 45.28 % (±2.82 %) 44.63 % (±2.54 %) 1.01
Multithreaded Pi Efficiency - 8 Threads 25.44 % (±1.54 %) 25.22 % (±1.90 %) 1.01
micro_benchmarks Build Time 102.35 s 100.50 s 1.02
micro_benchmarks File Size 0.90 MB 0.90 MB 1.00
Scheduling time - 1 thread 72.68 ticks (±4.61 ticks) 71.92 ticks (±4.10 ticks) 1.01
Scheduling time - 2 threads 40.07 ticks (±4.09 ticks) 38.20 ticks (±4.02 ticks) 1.05
Micro - Time for syscall (getpid) 3.04 ticks (±0.30 ticks) 2.97 ticks (±0.26 ticks) 1.03
Memcpy speed - (built_in) block size 4096 73651.60 MByte/s (±51141.15 MByte/s) 76558.69 MByte/s (±53084.00 MByte/s) 0.96
Memcpy speed - (built_in) block size 1048576 29892.52 MByte/s (±24596.94 MByte/s) 29990.76 MByte/s (±24588.55 MByte/s) 1.00
Memcpy speed - (built_in) block size 16777216 24495.00 MByte/s (±20488.10 MByte/s) 25249.82 MByte/s (±20983.81 MByte/s) 0.97
Memset speed - (built_in) block size 4096 72997.52 MByte/s (±50713.72 MByte/s) 76908.69 MByte/s (±53298.22 MByte/s) 0.95
Memset speed - (built_in) block size 1048576 30652.68 MByte/s (±25023.63 MByte/s) 30735.57 MByte/s (±25013.96 MByte/s) 1.00
Memset speed - (built_in) block size 16777216 25082.00 MByte/s (±20814.18 MByte/s) 25986.32 MByte/s (±21452.40 MByte/s) 0.97
Memcpy speed - (rust) block size 4096 67089.77 MByte/s (±46757.82 MByte/s) 66829.79 MByte/s (±46725.25 MByte/s) 1.00
Memcpy speed - (rust) block size 1048576 29583.44 MByte/s (±24438.50 MByte/s) 30155.93 MByte/s (±24776.94 MByte/s) 0.98
Memcpy speed - (rust) block size 16777216 25002.11 MByte/s (±20882.76 MByte/s) 24764.33 MByte/s (±20682.04 MByte/s) 1.01
Memset speed - (rust) block size 4096 67707.95 MByte/s (±47218.70 MByte/s) 67280.60 MByte/s (±47088.88 MByte/s) 1.01
Memset speed - (rust) block size 1048576 30349.98 MByte/s (±24862.47 MByte/s) 30925.42 MByte/s (±25213.93 MByte/s) 0.98
Memset speed - (rust) block size 16777216 25742.30 MByte/s (±21349.72 MByte/s) 25520.22 MByte/s (±21174.56 MByte/s) 1.01
alloc_benchmarks Build Time 94.41 s 92.85 s 1.02
alloc_benchmarks File Size 0.86 MB 0.86 MB 1.00
Allocations - Allocation success 100.00 % 100.00 % 1
Allocations - Deallocation success 100.00 % 100.00 % 1
Allocations - Pre-fail Allocations 100.00 % 100.00 % 1
Allocations - Average Allocation time 12855.97 Ticks (±163.05 Ticks) 8790.75 Ticks (±102.27 Ticks) 1.46
Allocations - Average Allocation time (no fail) 12855.97 Ticks (±163.05 Ticks) 8790.75 Ticks (±102.27 Ticks) 1.46
Allocations - Average Deallocation time 1016.07 Ticks (±244.61 Ticks) 2262.71 Ticks (±174.78 Ticks) 0.45
mutex_benchmark Build Time 95.24 s 91.47 s 1.04
mutex_benchmark File Size 0.90 MB 0.90 MB 1.00
Mutex Stress Test Average Time per Iteration - 1 Threads 13.44 ns (±0.94 ns) 13.04 ns (±0.75 ns) 1.03
Mutex Stress Test Average Time per Iteration - 2 Threads 14.66 ns (±1.50 ns) 14.50 ns (±1.43 ns) 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@fogti fogti force-pushed the tarfs-v2 branch 3 times, most recently from 250cae7 to 0cff0d2 Compare April 22, 2026 14:30
@fogti
Copy link
Copy Markdown
Contributor Author

fogti commented Apr 28, 2026

@mkroening Would it be possible to merge just part of this (namely the changes to src/fs/mem.rs, but not to other files in src)? That part is basically done. (And the rest suffers from memory management issues which I don't really know yet how to solve.) It would be nice to have a test for that, could I just add an example uncompressed tar archive to the repository for that and basically do something that amount essentially to snapshot testing (tho manual)?

@fogti fogti force-pushed the tarfs-v2 branch 3 times, most recently from f9992f1 to 38c05a2 Compare April 29, 2026 12:38
@fogti fogti marked this pull request as ready for review April 29, 2026 12:42
@mkroening mkroening self-assigned this May 6, 2026
@mkroening mkroening self-requested a review May 7, 2026 09:12
@mkroening
Copy link
Copy Markdown
Member

Currently untested. Supersedes #2077.

Is this tested now?

Would it be possible to merge just part of this (namely the changes to src/fs/mem.rs, but not to other files in src)? That part is basically done. (And the rest suffers from memory management issues which I don't really know yet how to solve.)

Is this still up to date?

It would be nice to have a test for that, could I just add an example uncompressed tar archive to the repository for that and basically do something that amount essentially to snapshot testing (tho manual)?

I would like to avoid binary files in the repo. Generating a file for the test is fine, though.

@fogti
Copy link
Copy Markdown
Contributor Author

fogti commented May 7, 2026

I could repeat a test similar to the one that is now in hermit-entry, is that okay?

Copy link
Copy Markdown
Member

@mkroening mkroening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! :)

It would be great if you could add a high-level description of the changes to the PR description.

Comment thread Cargo.toml Outdated
Comment thread src/mm/mod.rs Outdated
Comment thread src/fs/mem.rs Outdated
Comment thread src/fs/mem.rs Outdated
Comment thread src/fs/mod.rs
Comment on lines +311 to +312
root_filesystem.root =
MemDirectory::try_from_image(tar_image).expect("Unable to parse Hermit Image");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not print an error instead of panicking and continue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really make sense to boot a kernel that expects a very specific file system environment, and letting the kernel later run into errors when the image loading fails.

Comment thread src/mm/mod.rs
Comment on lines +101 to +103
let fdt::node::NodeProperty {
value: image_reg, ..
} = chosen.property("image_reg")?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a spec on the design of this FDT node?

Why is it so much more complicated to parse than a Linux initrd (https://github.com/hermit-os/loader/blob/v0.5.6/src/arch/riscv64/mod.rs#L21-L31)?

Should we model it like a Linux initrd instead? We could do either straight up linux,initrd-start and linux,initrd-end as that is what we use with other VMMs or adapt Hermit-specific names just for Uhyve (hermit,... or uhyve,...).

Copy link
Copy Markdown
Contributor Author

@fogti fogti May 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Doing it the way I described above would make sense, I think.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, hermit-specific naming would make sense.

Comment thread src/mm/mod.rs
Comment on lines +134 to +136
// technically, the following is UB, because the kernel might be contained within...
HERMIT_IMAGE_START_AND_LEN
.map(|(addr, len)| unsafe { core::slice::from_raw_parts(addr.as_ptr(), len) })
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the FDT node include the kernel? Why not only put the initrd into the node?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kernel doesn't know if it is contained in it or not, it is simply possible that the ranges overlap (at least currently we don't outlaw that because it might provide memory usage savings in case of using the hermit-loader; uhyve always copies the kernel anyways).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. How can they ever overlap? Sure, the image could contain the kernel ELF disk image, but not the loaded image, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For uhyve, the case is clear: https://github.com/fogti/uhyve/blob/278bfd24a8744adc762fd598bcd025495bcc1d9b/src/vm.rs#L365
For loader, I'm not sure. There is no final implementation yet, and idk how the loading (dyn-linking?) of the kernel works there.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loader works very similarly to Uhyve. It's just happening from inside the VM.

We get an ELF image and load the ELF image to memory by copying. The running executable is not running straight from the ELF image.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, then normally, no overlap should be possible (except by loader error).

Comment thread src/mm/mod.rs

/// Physical and virtual address range of the Hermit image, in case it is present
/// (indicated via FDT).
static HERMIT_IMAGE_START_AND_LEN: Lazy<Option<(VirtAddr, usize)>> = Lazy::new(|| {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is having a lazy static for this worth it? Or would a function suffice?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to avoid parsing this information multiple times;

Normally, we need it twice: once during the memory setup (to ensure that memory is reserved for it, even tho the FDT reserved memory regions should take care of that), and once during filesystem setup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you measure any benefit here? I expect the parsing overhead to be minimal.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably; but I prefer an implementation that makes it harder to find out when stuff can panic and when not.

And in general, it might be possible to "thread" this information through the call-stack via argument passing, but mm/mod.rs and fs/mod.rs are quite far apart from each other in terms of invocation sites.

Comment thread src/mm/mod.rs Outdated
@mkroening
Copy link
Copy Markdown
Member

I could repeat a test similar to the one that is now in hermit-entry, is that okay?

Yeah, sounds good! 👍

@fogti fogti force-pushed the tarfs-v2 branch 2 times, most recently from 48c5fd7 to 72a19ad Compare May 7, 2026 10:46
This implements the internal Hermit image mode.

The decompressed Hermit image gets passed from the
bootloader (hermit-loader) or hypervisor (uhyve) to the kernel,
the location is announced via the FDT property `/chosen,image_reg`,
and the kernel parses the contained tar (expected in ustar format) file,
placing its content at the filesystem root, similar to an initrd.

The access permissions get copied from the tar archive,
directories get default permissions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants