u/Hamza-bkd09

Is multi-camera person tracking + re-identification actually feasible today? How close are we to “movie-style” systems?

I’m coming more from an NLP background and recently started digging into computer vision, so I might be missing some context here.

I’m trying to understand how realistic multi-camera person tracking systems are in practice — the kind where a person is consistently identified and followed across different cameras (like surveillance systems or what we see in movies).

From my current understanding, such a system would typically involve:

  • Person detection (YOLO / RT-DETR etc.)
  • Multi-object tracking within each camera (ByteTrack / DeepSORT / BoT-SORT)
  • Cross-camera re-identification using embeddings (OSNet / TorchReID / ViT-based models)
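
To make the third step concrete, here is a minimal sketch of how I understand cross-camera association: each per-camera track is summarized by one appearance embedding, and tracks from two cameras are paired by cosine similarity. Everything in it is an assumption for illustration. The function names (`cosine_similarity`, `mean_embedding`, `match_tracklets`), the 0.7 threshold, and the 3-dimensional toy vectors are hypothetical; a real system would take embeddings from a re-ID model such as OSNet and would likely add spatio-temporal constraints and optimal (Hungarian) assignment instead of greedy matching.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_embedding(embeddings):
    """Average the per-frame embeddings of one track into a single descriptor."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def match_tracklets(cam_a, cam_b, threshold=0.7):
    """Greedy one-to-one matching of track IDs in cam_a to track IDs in cam_b.

    cam_a and cam_b map track ID -> embedding. Pairs below `threshold`
    (a hypothetical value, not a recommended setting) are never matched.
    """
    candidates = []
    for id_a, emb_a in cam_a.items():
        for id_b, emb_b in cam_b.items():
            sim = cosine_similarity(emb_a, emb_b)
            if sim >= threshold:
                candidates.append((sim, id_a, id_b))

    # Take the most similar pairs first; each track is used at most once.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    matches = {}
    used_b = set()
    for sim, id_a, id_b in candidates:
        if id_a in matches or id_b in used_b:
            continue
        matches[id_a] = id_b
        used_b.add(id_b)
    return matches


# Toy data: two tracks seen by camera A, three by camera B.
cam_a = {
    1: mean_embedding([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    2: [0.0, 1.0, 0.0],
}
cam_b = {
    10: [0.0, 0.9, 0.1],
    11: [0.95, 0.05, 0.0],
    12: [0.0, 0.0, 1.0],
}

matches = match_tracklets(cam_a, cam_b)
print(matches)  # {1: 11, 2: 10}; track 12 stays unmatched
```

The threshold is where the brittleness I am asking about shows up: set it too low and two people in similar clothing get merged, set it too high and the same person under different lighting is split into two identities.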

My questions are:

  1. How mature is this field today in real-world deployments?
  2. Is consistent identity tracking across multiple non-overlapping cameras actually reliable, or still very brittle?
  3. What are the main failure points in practice (lighting, clothing similarity, occlusion, etc.)?
  4. Are there any solid open-source end-to-end systems worth studying?
  5. At what point does this stop being a “CV engineering problem” and become an open research problem again?

I’m not expecting movie-level perfect tracking — just trying to understand how close we are to a robust real-world system and what the real limitations are today.

u/Hamza-bkd09 — 7 days ago

I’d like to understand how companies actually apply data science in real-world scenarios, especially in industrial contexts like the food sector. I already have a solid foundation in AI, so feel free to go beyond the basics and dive into concrete use cases, architectures, challenges, and trade-offs. If possible, I’d also appreciate insights drawn from real-world experience or industry practice.

u/Hamza-bkd09 — 26 days ago