A very minimal distributed filesystem meant solely for write-once-read-many workloads on differently sized drives in my homelab. It takes inspiration from lizardfs, moosefs, and razorfs.
hak8or 069d424455 Moving even more around! 1 day ago
CMake Added Ctest handler for Catch2 tests! 8 months ago
Core Moving even more around! 1 day ago
FileUtility Moving even more around! 1 day ago
Fuse_Server Moving even more around! 1 day ago
Mojette Core FileUtility Mojette: Fix sanitizer errors and improvements to APIs 1 month ago
Tests Moving even more around! 1 day ago
docs Added catch2+spdlog, first working mojette transform+inverse 1 year ago
libs Bump XXHash to v0.7.3 which is RC of 128 bit 6 months ago
misc_source_data Added catch2+spdlog, first working mojette transform+inverse 1 year ago
.gitignore Monstrous WIP (moving stuff around, finalizing serialized structures) 2 days ago
.gitmodules Added CLI11 as optional library + more CMake cleaning 8 months ago
CMakeLists.txt Moving even more around! 1 day ago
Makefile Beginnings of CMake cleanup for multiple targets, etc 8 months ago
compile_commands.json Add symlink for compile_commands.json from top of source for clangd 5 months ago
readme.md Monstrous WIP (moving stuff around, finalizing serialized structures) 2 days ago


MyFS

A dead simple file system optimized for homelab use, where we care primarily about write-once-read-many use cases while making use of many variably sized drives via configurable erasure-coded chunks.

Core serialized data structures

These are the data structures we save to disk. Keep in mind that we try to align as much as we can to 8192-byte boundaries, because we address locations on the disks in units of pages (each page is 8192 bytes).

Our metadisk will contain these structures, tying hashes to inodes.

|        Header         | Source File Inodes[0..n] | Hash Super Groups[0..n] |    Hash Group      |
| --------------------- | ------------------------ | ----------------------- | ------------------ |
|  8B MagicNumber       | 256B FileName            | 8192B Hash Group 00     | 16B Hash 000       |
|  1B Version           |   8B Parent Node         |       ....              |     ....           |
|  3B Padding           |   8B FileSize            | 8192B Hash Group 15     | 16B Hash 510       |
|  4B Reserved          |   1B Inode Type          |                         | 16B Checksum       |
|  Sections:            |   3B Padding0            |                         |                    |
|  - 8B Inodes          |   8B Reserved            |                         |                    |
|  - 8B Hash SuperGroups |   8B Accessed_ms        |                         |                    |
|  - 8B CPID SuperGroups |   8B Modified_ms        |                         |                    |
|  - 8B Projection Rows |   8B Created_ms          |                         |                    |
| 16B Checksum          |  12B Padding1            |                         |                    |
|                       |  16B SFO_Hashes 00       |                         |                    |
|                       |    ... 688B ...          |                         |                    |
|                       |  16B SFO_Hashes 42       |                         |                    |
|                       |   4B HashTable_Page 0000 |                         |                    |
|                       |   ... 15376B ...         |                         |                    |
|                       |   4B HashTable_Page 3843 |                         |                    |
|                       |  16B Checksum            |                         |                    |
|                       |                          |                         |                    |
| Total: 64 Bytes       | Total: 16,384 Bytes      | Total: 131,072 Bytes    | Total: 8,192 Bytes |
 
- "SFO_IDs": Short File Optimization IDs
- Super groups are just pages that contain a bunch of serialized types and a checksum.
- Hash refers to a Mojette projection
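
As a sanity check on the sizes above, the fixed-size header and one hash group can be written as plain C++ structs whose sizes are verified at compile time. This is only a sketch: every type and field name below is hypothetical, and only the byte counts come from the table. The page budget of 8192 bytes leaves room for 511 16-byte hashes next to the 16-byte checksum, which matches the 511 IDs per group used in the capacity calculation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; only the field sizes come from the README.
constexpr std::size_t kPageBytes = 8192;

struct Checksum128 { std::uint8_t bytes[16]; };  // 16B Checksum
struct Hash128     { std::uint8_t bytes[16]; };  // 16B Hash

struct DiskHeader {
    std::uint64_t magic_number;       // 8B MagicNumber
    std::uint8_t  version;            // 1B Version
    std::uint8_t  padding[3];         // 3B Padding
    std::uint32_t reserved;           // 4B Reserved
    // Sections: four 8-byte fields. What each value encodes (offset, count,
    // ...) is not stated in the README, so they are left as raw integers.
    std::uint64_t inodes;
    std::uint64_t hash_super_groups;
    std::uint64_t cpid_super_groups;
    std::uint64_t projection_rows;
    Checksum128   checksum;           // 16B Checksum
};
static_assert(sizeof(DiskHeader) == 64, "header must be 64 bytes");

// As many hashes as fit in one page once the checksum is accounted for.
constexpr std::size_t kHashesPerGroup =
    (kPageBytes - sizeof(Checksum128)) / sizeof(Hash128);

struct HashGroup {
    Hash128     hashes[kHashesPerGroup];
    Checksum128 checksum;
};
static_assert(sizeof(HashGroup) == kPageBytes, "hash group must fill one page");

struct HashSuperGroup {
    HashGroup groups[16];             // Hash Group 00 .. 15
};
static_assert(sizeof(HashSuperGroup) == 16 * kPageBytes,
              "super group must be 16 pages");
```

Keeping the checks as `static_assert`s means a layout change that breaks page alignment fails at compile time rather than showing up as a corrupted disk image.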

Our storage disks contain a header (the same as on the metadisk), followed by CPID super groups that describe which projection rows are stored on this disk, followed by the projection rows themselves.

| CPID Super Groups[0..n] |    CPID Group      | Compressed ProjectionID | Projection Rows [0..n] |
| ----------------------- | ------------------ | ----------------------- | ---------------------- |
| 8192B CPID Group_00     | - 21B CPID_000     | 16B Hash                | Projection Row 0000000 |
|         ....            |       ....         |  4B Index (8kB Page)    |          ....          |
| 8192B CPID Group_15     | - 21B CPID_388     |  1B Angle + WordSize    |          ....          |
|                         | -  7B Padding      |                         |          ....          |
|                         | - 16B Checksum     |                         | Projection Row xxxxxxx |
|                         |                    | Total: 21 Bytes         |                        |
| Total: 131,072 Bytes    | Total: 8,192 Bytes | * Also called "CPID"    |                        |
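
The CPID group arithmetic can be checked the same way. The sketch below assumes a 21-byte CPID (16 + 4 + 1 bytes, as in the "Compressed ProjectionID" column) stored without alignment padding; all names are hypothetical, and the little-endian byte order of the page index is an assumption, not something the layout above specifies. With those sizes, 389 CPIDs plus 7 bytes of padding and a 16-byte checksum fill exactly one 8192-byte page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPageBytes     = 8192;
constexpr std::size_t kChecksumBytes = 16;

// Hypothetical layout: every member is a byte (array), so the compiler adds
// no padding and the struct is exactly 21 bytes.
struct CompressedProjectionID {
    std::uint8_t hash[16];             // 16B Hash
    std::uint8_t page_index_le[4];     // 4B Index (8 kB page); byte order assumed
    std::uint8_t angle_and_word_size;  // 1B Angle + WordSize
};
static_assert(sizeof(CompressedProjectionID) == 21, "CPID must be 21 bytes");

constexpr std::size_t kCpidsPerGroup =
    (kPageBytes - kChecksumBytes) / sizeof(CompressedProjectionID);
constexpr std::size_t kGroupPadding =
    kPageBytes - kChecksumBytes - kCpidsPerGroup * sizeof(CompressedProjectionID);

struct CpidGroup {
    CompressedProjectionID cpids[kCpidsPerGroup];
    std::uint8_t padding[kGroupPadding];
    std::uint8_t checksum[kChecksumBytes];
};
static_assert(sizeof(CpidGroup) == kPageBytes, "CPID group must fill one page");

// Store the 32-bit page index byte by byte (little-endian, by assumption).
inline void set_page_index(CompressedProjectionID& c, std::uint32_t index) {
    for (int i = 0; i < 4; ++i) {
        c.page_index_le[i] = static_cast<std::uint8_t>((index >> (8 * i)) & 0xFFu);
    }
}

// Reassemble the 32-bit page index from its four stored bytes.
inline std::uint32_t page_index(const CompressedProjectionID& c) {
    std::uint32_t index = 0;
    for (int i = 0; i < 4; ++i) {
        index |= static_cast<std::uint32_t>(c.page_index_le[i]) << (8 * i);
    }
    return index;
}
```

Storing the index as four bytes avoids compiler-specific packing attributes at the cost of an explicit encode and decode step.
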

Maximum size of each disk:
	UINT32_MAX Pages * 8192 BytesPerPage =
	(2^32 - 1) * 8192 = 35,184,372,080,640 Bytes ≈ 35 TB (32 TiB)

Bytes per file using short file optimization (Inode with no tables):
	43 SFO_IDs * BytesPerID =
	43 SFO_IDs * 196,608 BytesPerID = 8,454,144 Bytes ≈ 8 MiB

Maximum ChunkIDs per file:
	43 SFO_IDs + (3834 Hash Super Groups * 16 Hash Groups * 511 IDs) =
	43 SFO_IDs + 31,346,784 IDs = 31,346,827 IDs

Maximum Bytes per file:
	31,346,827 IDs * OrigBytesPerID =
	31,346,827 IDs * 196,608 OrigBytesPerID = 6,163,036,962,816 Bytes ≈ 6.2 TB (5.6 TiB)
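
These limits are easy to get wrong by hand, so here is a small sketch that evaluates the same formulas at compile time. The constant names are made up for illustration; the input values (page size, 43 SFO IDs, 3834 hash super groups, 16 groups of 511 IDs, 196,608 original bytes per ID) are taken from the calculations above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Inputs copied from the README; constant names are hypothetical.
constexpr std::uint64_t kPageBytes           = 8192;
constexpr std::uint64_t kOrigBytesPerId      = 196'608;
constexpr std::uint64_t kSfoIds              = 43;
constexpr std::uint64_t kHashSuperGroups     = 3834;
constexpr std::uint64_t kGroupsPerSuperGroup = 16;
constexpr std::uint64_t kIdsPerGroup         = 511;

// Pages are addressed with a 32-bit index, so a disk holds at most
// UINT32_MAX pages.
constexpr std::uint64_t kMaxDiskBytes =
    std::uint64_t{std::numeric_limits<std::uint32_t>::max()} * kPageBytes;

// A file that fits entirely in the inode's short-file-optimization slots.
constexpr std::uint64_t kMaxSfoBytes = kSfoIds * kOrigBytesPerId;

// SFO slots plus every ID reachable through the hash super groups.
constexpr std::uint64_t kMaxIdsPerFile =
    kSfoIds + kHashSuperGroups * kGroupsPerSuperGroup * kIdsPerGroup;

constexpr std::uint64_t kMaxFileBytes = kMaxIdsPerFile * kOrigBytesPerId;

static_assert(kMaxDiskBytes  == 35'184'372'080'640ULL, "max disk size");
static_assert(kMaxSfoBytes   == 8'454'144ULL,          "max short-file size");
static_assert(kMaxIdsPerFile == 31'346'827ULL,         "43 + 31,346,784 IDs");
static_assert(kMaxFileBytes  == 6'163'036'962'816ULL,  "max file size");
```

Changing any input, for example the number of hash super groups an inode can reference, updates every derived limit and trips the matching `static_assert` until the documented number is revised.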

Benchmarks

Below is a rough benchmark of our Mojette transform and Mojette inverse, measured with Catch2's BENCHMARK macro. Performance could be better: for example, I bet we could replace the overflow bit handling with a full byte per column, so we won't have to do bitwise operations. There is also no SIMD processing yet (which this is a perfect fit for). Throughput levels off somewhere between 64 and 256 words per bin, which means we can work on smaller chunks without losing much throughput and so spread the work across threads more easily.

| Words/Row | Mode | Transform time | Transform Chunk/s | Transform MB/s | Inverse time | Inverse Chunk/s | Inverse MB/s |
| --------- | ---- | -------------- | ----------------- | -------------- | ------------ | --------------- | ------------ |
| 5         | Rel  | 32 ns          | 31,250K           | 625            | 144 ns       | 6,944K          | 139          |
| 64        | Rel  | 0.8 µs         | 1,250K            | 320            | 1.7 µs       | 588K            | 151          |
| 96        | Rel  | 1.2 µs         | 833K              | 320            | 2.5 µs       | 400K            | 154          |
| 128       | Rel  | 1.6 µs         | 625K              | 320            | 2.9 µs       | 345K            | 177          |
| 256       | Rel  | 3.0 µs         | 333K              | 341            | 5.7 µs       | 175K            | 179          |
| 1024      | Rel  | 12 µs          | 83K               | 338            | 23 µs        | 44K             | 176          |
| 4096      | Rel  | 48 µs          | 21K               | 341            | 92 µs        | 11K             | 178          |
| 16384     | Rel  | 197 µs         | 5K                | 332            | 375 µs       | 3K              | 175          |
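
The Chunk/s and MB/s figures appear to follow directly from the time per iteration: one chunk per iteration, and words per row times 4 bytes per chunk. That reading is inferred from the numbers (for example 64 words at 0.8 µs gives 320 MB/s), not stated by the benchmark itself, so treat the helper below as a sketch with hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Derived figures for one benchmark row (names are illustrative only).
struct Throughput {
    double chunks_per_second;
    double megabytes_per_second;  // decimal megabytes, 1 MB = 1e6 bytes
};

// Assumes one chunk is a single row of `words_per_row` 4-byte words and that
// one benchmark iteration processes exactly one chunk.
inline Throughput throughput(std::size_t words_per_row,
                             double seconds_per_iteration) {
    constexpr double kBytesPerWord = 4.0;
    const double chunks_per_second = 1.0 / seconds_per_iteration;
    const double bytes_per_chunk =
        static_cast<double>(words_per_row) * kBytesPerWord;
    return {chunks_per_second, bytes_per_chunk * chunks_per_second / 1e6};
}
```

For instance, `throughput(5, 32e-9)` reproduces the first row's transform figures: 31.25 million chunks per second and 625 MB/s.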

Keep in mind the following:

  • Time is how long one iteration of a Mojette Inverse or Transform takes.
  • Each word is 4 bytes long, with 2 overflow bits stored in a separate bitmap.
  • Mojette Transform generates 5 projections, not the minimum 3.
  • Mojette Inverse includes a copy of the projections.
  • This is with 3 data rows and 5 projections.
  • “Rel” mode is CMake's “Release” build type with all sanitizers disabled.
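
For readers unfamiliar with the transform being measured, here is a minimal and deliberately unoptimized sketch of one forward Mojette projection along a direction (p, q = 1). It is not this repository's implementation: the names are hypothetical, the bin index convention (bin = column + p * row) is one of several equivalent ones, and it accumulates into 64-bit bins instead of 32-bit words with a separate overflow bitmap. The sum of three 32-bit words needs at most 34 bits, which is consistent with the 2 overflow bits mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// A rows x words_per_row block of 32-bit words, stored row-major.
struct Grid {
    std::size_t rows;
    std::size_t words_per_row;
    std::vector<std::uint32_t> words;

    std::uint32_t at(std::size_t row, std::size_t col) const {
        return words[row * words_per_row + col];
    }
};

// One projection along direction (p, 1): the word at (row, col) is added to
// bin col + p * row, shifted so that the smallest bin index is 0.
inline std::vector<std::uint64_t> mojette_project(const Grid& g, int p) {
    if (g.rows == 0 || g.words_per_row == 0) {
        return {};
    }
    const long long span =
        static_cast<long long>(p) * static_cast<long long>(g.rows - 1);
    const long long shift = span < 0 ? -span : 0;  // keeps indices non-negative
    const std::size_t bin_count =
        g.words_per_row + static_cast<std::size_t>(std::llabs(span));

    std::vector<std::uint64_t> bins(bin_count, 0);
    for (std::size_t row = 0; row < g.rows; ++row) {
        for (std::size_t col = 0; col < g.words_per_row; ++col) {
            const long long bin = static_cast<long long>(col) +
                                  static_cast<long long>(p) *
                                      static_cast<long long>(row) +
                                  shift;
            bins[static_cast<std::size_t>(bin)] += g.at(row, col);
        }
    }
    return bins;
}
```

With 3 data rows, calling this for p in {-2, -1, 0, 1, 2} would give 5 projections of the kind benchmarked here. The p = 0 projection is simply the column sums, and every projection's bins add up to the sum of all words in the grid, which makes a cheap consistency check.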

Tests

To run the tests, first compile, then run ctest (or ctest -V for verbose output), and that's it! When CMake includes the CTest and Catch2Parse CMake files, it adds a new command. This command runs our Catch2 test binaries with their list-tests option and parses the output. The parsed test list then appears to be handed to ctest, which runs the tests.