Tensor manipulation lib in Zig
Hello, I've been working on my tensor lib and thought maybe it's time to see what other people think:
https://github.com/RubyBit/aion
This is a project I've been working on for about half a year now, and I am really happy with its current state. My goal was to take the approach of onnx (portable models that can be launched anywhere) but without the multi-vendor approach they took, and also to provide a high-level API for building models, similar to pytorch. It is currently an inference library only, no training.
Most of my focus has been on the tiling strategy (to maximize cache locality) and on optimized kernels. Towards that goal, I really like Zig's portable SIMD (@Vector), as it lets me write the kernels at a high level and reuse them across CPUs.
My landmark goal was to run some models of interest to me, such as gemma 4 (e2b) and silero-vad, with performance near onnx and pytorch (ON CPU):
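To make the tiling + @Vector idea concrete for readers who haven't looked at the repo, here is a minimal sketch. It is not aion's kernel: the function name matmulTiled, the tile size of 32, and the lane count of 8 are all assumptions picked for illustration, and it only handles square row-major matrices whose size is a multiple of the tile. It assumes a recent Zig (0.11 or later).

```zig
const std = @import("std");

// Hypothetical parameters for illustration only, not taken from aion.
const lanes = 8; // f32 lanes per vector; the compiler lowers this for whatever the target supports
const tile = 32; // square tile edge, must be a multiple of `lanes`
const Vec = @Vector(lanes, f32);

/// C += A * B for row-major n x n matrices, where n is a multiple of `tile`.
fn matmulTiled(n: usize, a: []const f32, b: []const f32, c: []f32) void {
    std.debug.assert(n % tile == 0);
    std.debug.assert(a.len == n * n and b.len == n * n and c.len == n * n);

    // Outer loops walk over tiles so each block of A, B and C is reused
    // while it is still hot in cache.
    var i0: usize = 0;
    while (i0 < n) : (i0 += tile) {
        var k0: usize = 0;
        while (k0 < n) : (k0 += tile) {
            var j0: usize = 0;
            while (j0 < n) : (j0 += tile) {
                // Inner loops: one tile x tile block.
                var i = i0;
                while (i < i0 + tile) : (i += 1) {
                    var k = k0;
                    while (k < k0 + tile) : (k += 1) {
                        // Broadcast one element of A across all lanes.
                        const a_ik: Vec = @splat(a[i * n + k]);
                        var j = j0;
                        while (j < j0 + tile) : (j += lanes) {
                            // Load `lanes` contiguous floats from a row of B and of C.
                            const b_vec: Vec = b[k * n + j ..][0..lanes].*;
                            var c_vec: Vec = c[i * n + j ..][0..lanes].*;
                            c_vec += a_ik * b_vec;
                            c[i * n + j ..][0..lanes].* = c_vec;
                        }
                    }
                }
            }
        }
    }
}
```

The same source compiles for any CPU Zig targets, because @Vector operations are lowered to whatever SIMD width is available (or to scalar code if there is none). A real kernel would also need edge handling for sizes that are not a multiple of the tile, plus tile sizes tuned per cache level.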
On my hardware (i9-14900HX, DDR5 5600 MT/s), the current version of the lib runs silero faster than onnx and pytorch: ~65ns per chunk (576 samples at 16kHz, 1 thread; check out the examples), while onnx runs at ~100ns. Gemma 4 q8_0 runs at ~13 tokens/s on 32 threads (tbh I haven't benched it against another lib yet). The next models I'm looking at are some ASR ones of interest (and also Kolmogorov-Arnold networks / neural operators).
If you want to run the models yourself, I can post the .aion versions on HF (there are also scripts in the repo to convert them yourself).
I have also used AI in this project, as of course I wouldn't have been able to make this much progress in 6 months otherwise, so this post might be removed, but hopefully I pass the effort benchmark.
My questions to you are mainly what you think of my tiling strategy/arch (check out the tiled tensor in src/storage and src/graph for details) and of my AOT approach to optimized kernels, and which other models would be of interest. My next steps are to add some control flow nodes to the runtime and improve the API (and maaaybe look at GPU support).
Check it out, and sorry for the top-level README (check out bindings/python); I haven't had time to write proper docs.