
Deflux - a streaming ODS parser for .NET with pause/resume across process restarts, parallel processing across sheets, and instant sheet switching in large ODS files, all built on checkpointable streaming.
Built this because I needed to parse large ODS files (up to 1 GB in practice) with the ability to stop, persist progress to disk, restart the process, and continue exactly where I left off - without re-decompressing or re-parsing anything.
ODS keeps all sheets inside a single content.xml entry, so jumping to a specific sheet normally means decompressing everything before it. Deflux does one ScanSheets() pass that snapshots a checkpoint at the start of each sheet (45 KB each), and OpenSheet(name) then restores from that checkpoint instantly, without re-decompressing the preceding sheets.
Checkpoints are fully self-contained byte arrays - they outlive the process, and any reader (in the same process or a different one) can restore from them. Parallel processing is just "open the file N times, restore each reader to a different checkpoint" - no shared state, no coordination.
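To make the "N readers, N checkpoints" idea concrete, here is a rough concept sketch in Python. It is not Deflux code and does not use its API. It assumes Python's standard zlib, whose decompressobj().copy() acts as an in-memory stand-in for a checkpoint (unlike Deflux's byte arrays, it cannot be written to disk). The function names scan, read_segment and parallel_read are made up for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def scan(compressed, offsets):
    """One sequential pass: snapshot the inflater at each compressed offset.
    `offsets` must be sorted and start at 0 (hypothetical 'sheet starts')."""
    d = zlib.decompressobj()
    checkpoints, pos = [], 0
    for off in offsets:
        d.decompress(compressed[pos:off])  # output is discarded during the scan
        checkpoints.append(d.copy())       # independent copy of the inflater state
        pos = off
    return checkpoints

def read_segment(checkpoint, chunk, is_last):
    """Resume from a checkpoint and inflate only this segment's bytes."""
    out = checkpoint.decompress(chunk)
    return out + checkpoint.flush() if is_last else out

def parallel_read(compressed, offsets):
    """Each worker owns its own restored reader; nothing is shared."""
    checkpoints = scan(compressed, offsets)
    bounds = list(offsets) + [len(compressed)]
    jobs = [
        (checkpoints[i], compressed[bounds[i]:bounds[i + 1]], i == len(offsets) - 1)
        for i in range(len(offsets))
    ]
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda job: read_segment(*job), jobs))
    return b"".join(parts)

if __name__ == "__main__":
    data = b"".join(b"row %d,value %d\n" % (i, i * i) for i in range(5000))
    compressed = zlib.compress(data)
    n = len(compressed)
    assert parallel_read(compressed, [0, n // 4, n // 2, 3 * n // 4]) == data
```

The workers never coordinate: each one starts from its own snapshot and reads only its own slice, and the concatenated output matches a single sequential read.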
The checkpoint captures the full vertical state: DEFLATE sliding window + Huffman trees + bit buffer, plus the XML parser's element stack and namespace bindings. Restore seeks the compressed stream to the saved bit position and rebuilds state.
Invariant:
Read(0→P) + Save(P) + [restart] + Restore(P) + Read(P→end)
=== Read(0→end)
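The invariant can be checked on a toy scale with plain zlib. This is only a sketch of the property, not of Deflux itself: it assumes Python's zlib.decompressobj().copy() as an in-memory stand-in for Save/Restore, so the "restart" here is simulated by discarding the original reader rather than by restarting a process. The name read_with_checkpoint is hypothetical.

```python
import zlib

def read_with_checkpoint(compressed: bytes, split: int) -> bytes:
    """Read(0->P), Save(P), drop the reader, Restore(P), Read(P->end)."""
    reader = zlib.decompressobj()
    head = reader.decompress(compressed[:split])  # Read(0 -> P)
    checkpoint = reader.copy()                    # Save(P): snapshot inflater state
    del reader                                    # simulated restart
    # Restore(P) + Read(P -> end): continue from the snapshot only.
    tail = checkpoint.decompress(compressed[split:]) + checkpoint.flush()
    return head + tail

if __name__ == "__main__":
    data = b"".join(b"row %d,value %d\n" % (i, i * i) for i in range(5000))
    compressed = zlib.compress(data)
    # The split point is arbitrary: it can fall in the middle of a DEFLATE block.
    for split in (1, len(compressed) // 3, len(compressed) - 1):
        assert read_with_checkpoint(compressed, split) == data
```

The split offsets are deliberately arbitrary, which is the point of capturing the bit buffer and Huffman state: resuming must work from any position, not only at block boundaries.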
Forked SharpZipLib's inflater to access internal state - no algorithmic changes, just exposed the fields needed for serialization.
Pure C#, .NET 8+.
https://github.com/daniilvaino/Deflux
Happy to answer questions. I'm curious whether this co-checkpointing approach (saving the inflater and the XML parser state together) would be of interest to authors of perf-focused readers like Sylvan.Data.Excel.