A very minimal distributed filesystem meant solely for write-once-read-many workloads on differently sized drives in my homelab. Takes inspiration from lizardfs, moosefs, and razorfs.

MyFS

A dead simple file system optimized for homelab use, where we care primarily about read-many-write-once use cases while making use of many variably sized drives via configurable erasure-encoded chunks.

FileSystem Configuration

Using the following assumptions.

| Property | Value |
|---|---|
| Rows per Chunk | 3 |
| Word Size (Bytes) | 4 |
| Total Storage | 35 TB |
| Total Files | 1,000,000 |
| Inode Base Size (Bytes) | 1024 |
| ChunkCacheEntry Size (Bytes) | 16 |
| ChunkIDs per Inode | 16 |
| ChunkIDs per Table | 8192 |
| ChunkID Tables per Inode | 2048 |
| Inode Size (Bytes) | 17,536 |
| Inode Total Size (MB) | 16,724 |
| ChunkTable Size (Bytes) | 65,536 |

We get the following data. This will be useful for getting a sense of memory needs and lookup times.

| Words/Row | Data/Chunk (Bytes) | Total Chunks | Total Projections | Chunk Cache (MB) | Data/Inode (KB) | ChunkTables | Data/ChunkTable (MB) | Max File Size (MB) | ChunkTables Total (MB) |
|---|---|---|---|---|---|---|---|---|---|
| 64 | 768 | 45,812,984,491 | 229,064,922,453 | 699,050.67 | 12 | 5,592,406 | 6 | 12,288 | 349,525 |
| 128 | 1,536 | 22,906,492,245 | 114,532,461,227 | 349,525.33 | 24 | 2,796,203 | 12 | 24,576 | 174,763 |
| 256 | 3,072 | 11,453,246,123 | 57,266,230,613 | 174,762.67 | 48 | 1,398,102 | 24 | 49,152 | 87,381 |
| 512 | 6,144 | 5,726,623,061 | 28,633,115,307 | 87,381.33 | 96 | 699,051 | 48 | 98,304 | 43,691 |
| 1024 | 12,288 | 2,863,311,531 | 14,316,557,653 | 43,690.67 | 192 | 349,526 | 96 | 196,608 | 21,845 |
| 2048 | 24,576 | 1,431,655,765 | 7,158,278,827 | 21,845.33 | 384 | 174,763 | 192 | 393,216 | 10,923 |
| **4096** | **49,152** | **715,827,883** | **3,579,139,413** | **10,922.67** | **768** | **87,382** | **384** | **786,432** | **5,461** |
| 8192 | 98,304 | 357,913,941 | 1,789,569,707 | 5,461.33 | 1,536 | 43,691 | 768 | 1,572,864 | 2,731 |
| 16384 | 196,608 | 178,956,971 | 894,784,853 | 2,730.67 | 3,072 | 21,846 | 1,536 | 3,145,728 | 1,365 |
| 32768 | 393,216 | 89,478,485 | 447,392,427 | 1,365.33 | 6,144 | 10,923 | 3,072 | 6,291,456 | 683 |
| 65536 | 786,432 | 44,739,243 | 223,696,213 | 682.67 | 12,288 | 5,462 | 6,144 | 12,582,912 | 341 |
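The derived columns follow from the assumptions by plain arithmetic, which the sketch below tries to reproduce. It is not code from this repository, and all names are hypothetical. Two values are inferred rather than stated: the total storage is taken as 2^45 bytes (about 35.18 × 10^12 bytes, which is what the Total Chunks column works out to), and a ChunkID is taken as 8 bytes (65,536 bytes per table / 8192 IDs). The 5 projections per chunk come from the benchmark notes, and "MB"/"KB" are treated as 2^20 and 2^10 bytes.

```cpp
#include <cstdint>

// Values from the assumptions table. Entries marked "inferred" are not stated there.
constexpr std::uint64_t kRowsPerChunk         = 3;
constexpr std::uint64_t kWordBytes            = 4;
constexpr std::uint64_t kTotalStorageBytes    = 1ULL << 45;  // inferred; listed as "35 TB"
constexpr std::uint64_t kProjectionsPerChunk  = 5;           // from the benchmark notes
constexpr std::uint64_t kInodeBaseBytes       = 1024;
constexpr std::uint64_t kChunkCacheEntryBytes = 16;
constexpr std::uint64_t kChunkIdsPerInode     = 16;
constexpr std::uint64_t kChunkIdsPerTable     = 8192;
constexpr std::uint64_t kChunkTablesPerInode  = 2048;
constexpr std::uint64_t kChunkIdBytes         = 8;           // inferred: 65,536 / 8192
constexpr std::uint64_t kMiB                  = 1ULL << 20;

constexpr std::uint64_t ceil_div(std::uint64_t a, std::uint64_t b) {
    return (a + b - 1) / b;
}

// Data/Chunk (Bytes): payload of one chunk.
constexpr std::uint64_t data_per_chunk_bytes(std::uint64_t words_per_row) {
    return words_per_row * kWordBytes * kRowsPerChunk;
}

// Total Chunks, kept fractional; the table shows it rounded to the nearest integer.
constexpr double total_chunks(std::uint64_t words_per_row) {
    return static_cast<double>(kTotalStorageBytes) /
           static_cast<double>(data_per_chunk_bytes(words_per_row));
}

// Total Projections: 5 projections per chunk.
constexpr double total_projections(std::uint64_t words_per_row) {
    return total_chunks(words_per_row) * kProjectionsPerChunk;
}

// Chunk Cache (MB): one 16-byte cache entry per chunk.
constexpr double chunk_cache_mib(std::uint64_t words_per_row) {
    return total_chunks(words_per_row) * kChunkCacheEntryBytes / kMiB;
}

// Data/Inode: data reachable through the ChunkIDs stored directly in the inode.
constexpr std::uint64_t data_per_inode_bytes(std::uint64_t words_per_row) {
    return kChunkIdsPerInode * data_per_chunk_bytes(words_per_row);
}

// Data/ChunkTable: data reachable through one full ChunkID table.
constexpr std::uint64_t data_per_chunk_table_bytes(std::uint64_t words_per_row) {
    return kChunkIdsPerTable * data_per_chunk_bytes(words_per_row);
}

// ChunkTables: tables needed to address all storage, rounded up.
constexpr std::uint64_t chunk_tables(std::uint64_t words_per_row) {
    return ceil_div(kTotalStorageBytes, data_per_chunk_table_bytes(words_per_row));
}

// Max File Size: data reachable through the inode's ChunkID tables only
// (the table's figure does not add the 16 direct ChunkIDs).
constexpr std::uint64_t max_file_size_bytes(std::uint64_t words_per_row) {
    return kChunkTablesPerInode * data_per_chunk_table_bytes(words_per_row);
}

// Inode Size and ChunkTable Size from the assumptions table.
constexpr std::uint64_t inode_bytes() {
    return kInodeBaseBytes + (kChunkIdsPerInode + kChunkTablesPerInode) * kChunkIdBytes;
}
constexpr std::uint64_t chunk_table_bytes() {
    return kChunkIdsPerTable * kChunkIdBytes;
}
```

For example, `chunk_tables(4096)` gives 87,382 and `max_file_size_bytes(4096) / kMiB` gives 786,432, matching the 4096 row.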

Benchmarks

Below is a rough benchmark using Catch2's BENCHMARK macro. Performance could be better: for example, I bet we could optimize the overflow bit handling by using a whole byte per column instead, so we would not have to do bitwise operations. Throughput levels off somewhere between 64 and 256 words per bin, which makes smaller chunks viable and in turn makes it easier to spread the work across multiple threads.

| Words/Row | Mode | Mojette Transform | Mojette Inverse |
|---|---|---|---|
| 5 | Release | 32 ns (31,250K Chunk/s): 625 MB/s | 144 ns (6,944K Chunk/s): 139 MB/s |
| 64 | Release | 0.8 us (1,250K Chunk/s): 320 MB/s | 1.7 us (588K Chunk/s): 151 MB/s |
| 96 | Release | 1.2 us (833K Chunk/s): 320 MB/s | 2.5 us (400K Chunk/s): 154 MB/s |
| 128 | Release | 1.6 us (625K Chunk/s): 320 MB/s | 2.9 us (345K Chunk/s): 177 MB/s |
| 256 | Release | 3.0 us (333K Chunk/s): 341 MB/s | 5.7 us (175K Chunk/s): 179 MB/s |
| 1024 | Release | 12 us (83K Chunk/s): 338 MB/s | 23 us (44K Chunk/s): 176 MB/s |
| 4096 | Release | 48 us (21K Chunk/s): 341 MB/s | 92 us (11K Chunk/s): 178 MB/s |
| 16384 | Release | 197 us (5K Chunk/s): 332 MB/s | 375 us (3K Chunk/s): 175 MB/s |
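As a reading aid, the Chunk/s and MB/s figures can be recomputed from the time per iteration. The sketch below is an inference from the numbers, not the repository's benchmark code: the listed rates match Words × 4 bytes per iteration with 1 MB = 10^6 bytes (5 words in 32 ns gives 625 MB/s), so that is what it assumes. The function names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint64_t kWordBytes = 4;  // each word is 4 bytes

// Iterations (chunks) per second for a given time per iteration in nanoseconds.
constexpr double chunks_per_second(double ns_per_iteration) {
    return 1e9 / ns_per_iteration;
}

// Throughput in MB/s (10^6 bytes), assuming `words` * 4 bytes are processed
// per iteration. Bytes per nanosecond equals GB/s, so multiply by 1000.
constexpr double megabytes_per_second(std::uint64_t words, double ns_per_iteration) {
    return static_cast<double>(words * kWordBytes) * 1e3 / ns_per_iteration;
}
```

For the 64-word row, `megabytes_per_second(64, 800.0)` gives 320 and `chunks_per_second(800.0)` gives 1,250,000, in line with the table.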

Keep in mind the following

  • Time is how long to do one iteration of a Mojette Inverse or Transform.
  • Each word is 4 bytes long, with 2 overflow bits stored in a separate bitmap
  • Mojette Transform generates 5 projections, not the minimum 3
  • Mojette Inverse includes a copy of the projections
  • This is with 3 data rows and 5 projections.
  • Release mode is CMake's “Release” mode with all sanitizers disabled.
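To make the notes above concrete, here is a minimal sketch of a generic additive (Dirac) Mojette forward transform. It is not the implementation in the Mojette directory. The projection directions (p, 1) for p from -2 to 2, the bin indexing convention, the 64-bit bins (instead of 32-bit words plus a separate overflow bitmap), and all names are assumptions made for illustration. It does show why 2 overflow bits are enough with 3 data rows: a bin sums at most 3 words, and 3 × (2^32 − 1) is below 2^34.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// One chunk: `rows` rows of `words_per_row` 32-bit words, stored row-major.
// Assumes rows >= 1 and words.size() == rows * words_per_row.
struct Chunk {
    std::size_t rows;
    std::size_t words_per_row;
    std::vector<std::uint32_t> words;
};

// Projection along direction (p, q = 1): the word at (row, col) is added to
// bin col + p * row, shifted so that the smallest bin index is 0.
// The result has words_per_row + |p| * (rows - 1) bins.
std::vector<std::uint64_t> mojette_project(const Chunk& chunk, int p) {
    const std::size_t step = static_cast<std::size_t>(std::abs(p));
    std::vector<std::uint64_t> bins(chunk.words_per_row + step * (chunk.rows - 1), 0);
    for (std::size_t row = 0; row < chunk.rows; ++row) {
        // For negative p the rows shift the other way, so the last row starts at bin 0.
        const std::size_t shift = (p >= 0) ? step * row : step * (chunk.rows - 1 - row);
        for (std::size_t col = 0; col < chunk.words_per_row; ++col) {
            bins[col + shift] += chunk.words[row * chunk.words_per_row + col];
        }
    }
    return bins;
}

// Forward transform producing 5 projections, one per assumed direction.
std::vector<std::vector<std::uint64_t>> mojette_transform(const Chunk& chunk) {
    std::vector<std::vector<std::uint64_t>> projections;
    for (int p = -2; p <= 2; ++p) {
        projections.push_back(mojette_project(chunk, p));
    }
    return projections;
}
```

With a 3 × 4 chunk holding the values 1 through 12 row by row, `mojette_project(chunk, 0)` yields the column sums 15, 18, 21, 24, and `mojette_project(chunk, 1)` yields 1, 7, 18, 21, 19, 12. Every projection sums to the same total as the chunk itself.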

Tests

To run the tests, first compile, then run ctest (or ctest -V for verbose output), and that's it! When CMake includes the CTest and Catch2Parse CMake files, it gains a new command. That command runs our Catch2 test binaries with their list-tests option and parses the output. The parsed test list is then handed to ctest, which runs the tests.