u/Haza_rd

My friend and I built a real-time, hardware-accelerated video anomaly detection system on an FPGA, using a hybrid MobileNet + GRU architecture deployed on a Xilinx Zynq UltraScale+ ZCU104 board.

The pipeline works like this:

MobileNet is used for spatial feature extraction from 224×224 video frames.
A GRU processes the temporal sequence information for anomaly detection.
The accelerator was implemented on the FPGA fabric, while the quad-core ARM processor on the Zynq handled camera integration and system-level control.
We later integrated a 30 FPS camera feed to demonstrate real-time inference.
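
As a reference for the temporal stage, a single GRU update over per-frame feature vectors can be sketched in plain Python. This is an illustrative sketch, not the accelerator's actual implementation: the gate formulation follows the common PyTorch-style convention, and the parameter names (`W_z`, `U_z`, `b_z`, and so on), the helper `zero_params`, and the vector sizes are all assumptions, since the post does not specify them.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def gru_step(x, h, p):
    """One GRU update.

    x: feature vector for the current frame (e.g. the CNN output).
    h: previous hidden state.
    p: dict with W_* (hidden x input), U_* (hidden x hidden), b_* (hidden)
       for the update (z), reset (r) and candidate (n) paths.
    """
    def gate(name):
        wx = matvec(p["W_" + name], x)
        uh = matvec(p["U_" + name], h)
        return [sigmoid(a + b + c) for a, b, c in zip(wx, uh, p["b_" + name])]

    z = gate("z")  # update gate
    r = gate("r")  # reset gate
    wx_n = matvec(p["W_n"], x)
    uh_n = matvec(p["U_n"], h)
    # Candidate state: the reset gate scales the recurrent contribution.
    n = [math.tanh(a + ri * u + c)
         for a, ri, u, c in zip(wx_n, r, uh_n, p["b_n"])]
    # Blend the candidate with the previous state.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def run_sequence(features, p, hidden_size):
    # Feed the per-frame features through the GRU, one frame at a time.
    h = [0.0] * hidden_size
    for x in features:
        h = gru_step(x, h, p)
    return h

def zero_params(input_size, hidden_size):
    # Hypothetical helper: all-zero weights, handy for sanity checks.
    p = {}
    for g in "zrn":
        p["W_" + g] = [[0.0] * input_size for _ in range(hidden_size)]
        p["U_" + g] = [[0.0] * hidden_size for _ in range(hidden_size)]
        p["b_" + g] = [0.0] * hidden_size
    return p
```

With all-zero weights both gates evaluate to 0.5 and the candidate to 0, so each step simply halves the hidden state. That makes a convenient sanity check when comparing a software reference against hardware output.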

Because the GRU was trained only on a hockey-fight anomaly dataset, we tested by pointing the camera at a laptop playing YouTube hockey-fight videos and checking that the pipeline flagged the fights in real time.

Current performance:

Input resolution: 224×224
Inference latency: ~620 ms per frame
Platform: ZCU104 / PYNQ framework
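
To put these numbers in context, here is a quick back-of-the-envelope calculation comparing the measured latency with the camera's frame period. It is only a sketch and assumes inferences run back-to-back with no overlap between frames; the function name `latency_budget` is made up for illustration.

```python
def latency_budget(latency_ms, camera_fps):
    """Compare per-frame inference latency with the camera frame period."""
    frame_period_ms = 1000.0 / camera_fps
    return {
        # Time available per frame if every frame is to be processed.
        "frame_period_ms": frame_period_ms,
        # Throughput if inferences run strictly one after another.
        "achieved_fps": 1000.0 / latency_ms,
        # Speedup needed to keep up with the camera (also the number of
        # frames that arrive while one inference is running).
        "speedup_to_keep_up": latency_ms / frame_period_ms,
    }

budget = latency_budget(latency_ms=620.0, camera_fps=30.0)
print(f"frame period:  {budget['frame_period_ms']:.1f} ms")    # 33.3 ms
print(f"achieved rate: {budget['achieved_fps']:.2f} FPS")      # 1.61 FPS
print(f"speedup needed: {budget['speedup_to_keep_up']:.1f}x")  # 18.6x
```

So at ~620 ms per frame the accelerator processes roughly one frame in every 18 to 19 that the 30 FPS camera delivers.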

One optimization we already implemented was using AXI CDMA (memory-mapped to memory-mapped DMA) instead of a stream-based AXI DMA. The goal was to reduce unnecessary data movement through BRAM/URAM and to simplify memory transfers between the PS and PL.

I’d really appreciate feedback from the FPGA/embedded AI community on:

Whether this is considered a solid FPGA project for research/industry portfolios.
Suggestions to improve inference latency on the PYNQ/Zynq platform.
Whether moving more preprocessing into PL would help significantly.
Ideas like quantization, pruning, pipelining, double-buffering, AXI-Stream architectures, or using DPU/Vitis AI instead of custom logic.
Whether the MobileNet+GRU architecture is a good fit for FPGA deployment or if there are better temporal models for low-latency anomaly detection.
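
For readers less familiar with the quantization idea mentioned above, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is not what the current design does and not how any particular toolflow (such as Vitis AI) implements its quantizer; the function names and the per-tensor scaling scheme are assumptions chosen purely for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    # Map the integers back to approximate float values.
    return [qi * scale for qi in q]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, worst_error)
```

The rounding error per value is bounded by half the scale, which is why tensors with a few large outliers quantize poorly under a single per-tensor scale. On the hardware side, 8-bit weights and activations reduce both memory traffic and the width of the multipliers compared with wider formats.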

I’m especially interested in opinions from people who have worked with:

AMD Zynq platforms
Xilinx ZCU104
PYNQ
FPGA-based CNN acceleration
Video analytics pipelines
AXI DMA/CDMA optimization

Does ~620 ms latency sound reasonable for a first custom implementation, or is there likely a major bottleneck in the architecture/design flow that we should investigate?
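
One way to narrow down where the time goes is to time each stage of the frame loop separately on the PS side. The sketch below uses only the Python standard library; the stage names and the `time.sleep` calls are placeholders standing in for the real capture, transfer, and accelerator calls, which the post does not show.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StageTimer:
    """Accumulate wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return (stage, mean_ms) pairs, slowest stage first."""
        means = {n: 1000.0 * self.totals[n] / self.counts[n]
                 for n in self.totals}
        return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

timer = StageTimer()
for _ in range(3):  # a few iterations of a stand-in frame loop
    with timer.stage("capture_and_preprocess"):
        time.sleep(0.002)   # placeholder for camera read + resize
    with timer.stage("ps_to_pl_transfer"):
        time.sleep(0.001)   # placeholder for the DMA transfer
    with timer.stage("accelerator"):
        time.sleep(0.010)   # placeholder for the PL inference call
for name, mean_ms in timer.report():
    print(f"{name:24s} {mean_ms:7.2f} ms")
```

If most of the ~620 ms turns out to sit in Python-side preprocessing or in transfers rather than in the accelerator itself, that points to a different set of fixes than a slow compute kernel would.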

GitHub (other projects): CraftedByDavid GitHub
LinkedIn: David Paul LinkedIn

Built a Real-Time FPGA Anomaly Detection System on ZCU104 Using MobileNet + GRU — Looking for Optimization Advice