I wrote a PNG decompressor and neural network from scratch for MNIST
About a year ago, I kept reading about people implementing a NN from "scratch" and then using PyTorch or other tools along the way. I wondered what it would actually be like to implement one from sticks and stones (Zig). Well, there goes my sanity lmao.
I first attempted this a while back and finished the PNG decompression part almost a year ago, but the neural network never worked properly, so I abruptly quit the project, supposedly for good. I came back two days ago; hours of debugging later, my neural network finally trains correctly on MNIST.
I'm very proud that I built the entire stack myself in Zig:
PNG decompression, MNIST PNG parsing, matrix implementation, feedforward, backpropagation, gradient descent (with heavy inspiration AND help from Sebastian Lague).
No ML libraries, no NumPy, no PyTorch.
The math was annoying, but a lot of it (such as partial derivatives) was being taught to me in school at the time, so that was alright. Other issues, namely tiny implementation bugs like
- weights not actually updating
- wrong indexing dimensions
- hidden layer gradients using incorrect inputs
- sigmoid saturation from bad initialization
were a lot more annoying.
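For anyone chasing the same "hidden layer gradients using incorrect inputs" kind of bug, here is a minimal sketch of the idea. It is not the code from the repo: the layer sizes and the names `hiddenDeltas` and `hiddenWeightGrads` are made up for illustration, and it assumes sigmoid activations. The point is that a layer's weight gradient is its own delta multiplied by whatever was fed INTO that layer, not by its own output.

```zig
const std = @import("std");

// Hypothetical tiny layer sizes, for illustration only.
const n_in = 3;
const n_hidden = 2;
const n_out = 2;

/// Derivative of the sigmoid, written in terms of its output a = sigmoid(z).
fn sigmoidPrime(a: f32) f32 {
    return a * (1.0 - a);
}

/// delta_hidden[j] = (sum over k of w_out[k][j] * delta_out[k]) * sigmoid'(hidden[j])
fn hiddenDeltas(
    w_out: [n_out][n_hidden]f32,
    delta_out: [n_out]f32,
    hidden: [n_hidden]f32,
) [n_hidden]f32 {
    var delta_hidden: [n_hidden]f32 = undefined;
    for (0..n_hidden) |j| {
        var sum: f32 = 0.0;
        for (0..n_out) |k| {
            sum += w_out[k][j] * delta_out[k];
        }
        delta_hidden[j] = sum * sigmoidPrime(hidden[j]);
    }
    return delta_hidden;
}

/// Gradient for the hidden layer's weights: this layer's delta times the
/// activations that were fed into this layer (here: the input pixels).
fn hiddenWeightGrads(
    delta_hidden: [n_hidden]f32,
    input: [n_in]f32,
) [n_hidden][n_in]f32 {
    var grads: [n_hidden][n_in]f32 = undefined;
    for (0..n_hidden) |j| {
        for (0..n_in) |i| {
            grads[j][i] = delta_hidden[j] * input[i];
        }
    }
    return grads;
}

test "hidden gradients use the layer's own inputs" {
    const w_out = [n_out][n_hidden]f32{ .{ 1.0, 0.0 }, .{ 0.0, 1.0 } };
    const delta_out = [n_out]f32{ 0.5, -0.25 };
    const hidden = [n_hidden]f32{ 0.5, 0.5 }; // sigmoid'(0.5) = 0.25
    const input = [n_in]f32{ 1.0, 2.0, 0.0 };

    const d = hiddenDeltas(w_out, delta_out, hidden);
    try std.testing.expectApproxEqAbs(@as(f32, 0.125), d[0], 1e-6);
    try std.testing.expectApproxEqAbs(@as(f32, -0.0625), d[1], 1e-6);

    const g = hiddenWeightGrads(d, input);
    try std.testing.expectApproxEqAbs(@as(f32, 0.25), g[0][1], 1e-6);
    try std.testing.expectApproxEqAbs(@as(f32, 0.0), g[1][2], 1e-6);
}
```

Running it with `zig test` on hand-computable numbers like these is a cheap way to catch swapped indices or the wrong activation vector before touching MNIST.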
At one point the network would only appear to learn something depending on the RNG seed, which turned out to be a mix of actual bugs + terrible weight initialization (the init is mostly still dependent on luck, cuz Xavier init resulted in NaN output values...).
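For reference, this is roughly what Xavier (Glorot) uniform initialization looks like as a standalone sketch. It is not taken from the repo and makes no claim about where the NaNs came from; the names `xavierLimit` and `xavierInit` and the 784 x 16 layer shape are assumptions for illustration, and it uses the `std.Random` names as of Zig 0.14. Weights are drawn uniformly from [-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)).

```zig
const std = @import("std");

/// Xavier/Glorot uniform bound: limit = sqrt(6 / (fan_in + fan_out)).
fn xavierLimit(fan_in: usize, fan_out: usize) f32 {
    // Convert to float BEFORE dividing, so this is not integer division.
    const total: f32 = @floatFromInt(fan_in + fan_out);
    return @sqrt(6.0 / total);
}

/// Fills `weights` with values drawn uniformly from [-limit, limit).
fn xavierInit(random: std.Random, weights: []f32, fan_in: usize, fan_out: usize) void {
    const limit = xavierLimit(fan_in, fan_out);
    for (weights) |*w| {
        // random.float(f32) is uniform in [0, 1); rescale to [-limit, limit).
        w.* = (random.float(f32) * 2.0 - 1.0) * limit;
    }
}

test "xavier init stays finite and inside the limit" {
    var prng = std.Random.DefaultPrng.init(42);
    var weights: [784 * 16]f32 = undefined; // hypothetical 784 -> 16 layer
    xavierInit(prng.random(), &weights, 784, 16);

    const limit = xavierLimit(784, 16);
    for (weights) |w| {
        try std.testing.expect(std.math.isFinite(w));
        try std.testing.expect(@abs(w) <= limit);
    }
}
```

A test like this only proves the initial weights are finite and small, so if NaNs still show up with it, the init itself is probably not the culprit and the forward or backward pass is the next place to look.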
After fixing the training pipeline, it now reliably learns all digits 0-9 from MNIST.
Zig was probably the hardest possible language choice for this, but also the reason I understand it now.
The code is open source over here:
https://github.com/XerWoho/Triarch
There aren't any docs in docs/. My brain back then was solely focused on learning PNGs, as it appears lol. I would still suggest reading through the code, even though it is really old (0.14.0), because it was a big learning journey for me, and could be for you too. This is basically what happens if you execute it:
(correct guesses / total attempts). (ignore the fact that the correct guesses are larger than total attempts... I wrote / 100 instead of / 1000... gotta stop using magic numbers)
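On the magic-number thing, one way to make that bug impossible is to derive the denominator from the data instead of typing it. A minimal sketch (the `accuracy` function and the sample digits are hypothetical, not from the repo):

```zig
const std = @import("std");

/// Fraction of predictions that match their labels, in [0, 1].
/// The denominator is the actual sample count, not a hard-coded number.
fn accuracy(predictions: []const u8, labels: []const u8) f32 {
    std.debug.assert(predictions.len == labels.len);
    if (labels.len == 0) return 0.0;

    var correct: usize = 0;
    for (predictions, labels) |p, l| {
        if (p == l) correct += 1;
    }
    const correct_f: f32 = @floatFromInt(correct);
    const total_f: f32 = @floatFromInt(labels.len);
    return correct_f / total_f;
}

test "accuracy is correct guesses over actual attempts" {
    const predictions = [_]u8{ 7, 2, 1, 0, 4 };
    const labels = [_]u8{ 7, 2, 1, 0, 9 };
    // 4 of 5 match.
    try std.testing.expectApproxEqAbs(@as(f32, 0.8), accuracy(&predictions, &labels), 1e-6);
}
```

Since `correct` can never exceed `labels.len`, the printed ratio cannot go above 1 no matter how many test images are used.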
I would love feedback, as this was a total of about 7-8 months of work (excluding the long break). Be nice to me though, as the PNG decompression code (as mentioned) is almost older than me.