Testing an agentic workflow for setting up and labeling a medical video dataset
I’ve been experimenting with an agentic annotation workflow for computer vision datasets and wanted to share a quick demo for feedback.
In this example, the workflow starts with a dataset of fetal head images and videos. Instead of manually creating the project, defining the label, setting up the task, and attaching the uploaded media, the user gives the system a natural-language instruction. The agent then handles those project setup steps and prepares the annotation task.
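To make the setup steps concrete, here is a minimal Python sketch of what the agent's output could look like once the instruction has been interpreted. This is not the tool's actual API: `SetupPlan`, `Workspace`, `execute_plan`, and the file names are all hypothetical, the backend is just an in-memory dict, and the natural-language parsing itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class SetupPlan:
    """Structured plan an agent might derive from the user's instruction."""
    project_name: str
    label: str
    task_name: str
    media_paths: list[str]


@dataclass
class Workspace:
    """In-memory stand-in for an annotation backend (hypothetical)."""
    projects: dict = field(default_factory=dict)

    def create_project(self, name: str) -> None:
        self.projects[name] = {"labels": [], "tasks": {}}

    def define_label(self, project: str, label: str) -> None:
        self.projects[project]["labels"].append(label)

    def create_task(self, project: str, task: str) -> None:
        self.projects[project]["tasks"][task] = []

    def attach_media(self, project: str, task: str, paths: list[str]) -> None:
        self.projects[project]["tasks"][task].extend(paths)


def execute_plan(ws: Workspace, plan: SetupPlan) -> list[str]:
    """Run the four setup steps in order and return a log of what was done."""
    ws.create_project(plan.project_name)
    ws.define_label(plan.project_name, plan.label)
    ws.create_task(plan.project_name, plan.task_name)
    ws.attach_media(plan.project_name, plan.task_name, plan.media_paths)
    return ["create_project", "define_label", "create_task", "attach_media"]


plan = SetupPlan(
    project_name="fetal-head-demo",
    label="fetal_head",
    task_name="segment-fetal-heads",
    media_paths=["clip_001.mp4", "clip_002.mp4"],  # made-up file names
)
ws = Workspace()
steps = execute_plan(ws, plan)
```

Keeping the plan as an explicit object means a human could inspect or edit it before anything is created, which is one possible place to keep a person in control.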
After the task is created, the workflow moves into the annotation interface, where an AI-assisted tool is used to segment fetal heads across multiple frames.
The part I’m most interested in is whether this kind of agentic setup can reduce the overhead around dataset preparation, especially for workflows where teams repeatedly need to:
- create projects
- define labels
- upload or organize media
- configure annotation tasks
- apply AI-assisted segmentation across frames
- review and correct generated masks
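For the last two items, one way to keep review human-controlled is to have the pipeline only flag frames and leave the correction to a person. The sketch below is an assumption about how that could work, not a description of the demo: masks are represented as sets of `(row, col)` pixels, and a frame is flagged when its overlap (IoU) with the previous frame's mask drops below a threshold. The function names and the 0.5 threshold are illustrative.

```python
def iou(a: set, b: set) -> float:
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return len(a & b) / len(a | b)


def frames_needing_review(masks: list[set], threshold: float = 0.5) -> list[int]:
    """Return indices of frames whose mask changed sharply from the previous frame."""
    flagged = []
    for i in range(1, len(masks)):
        if iou(masks[i - 1], masks[i]) < threshold:
            flagged.append(i)
    return flagged


# Three toy frames: the third mask jumps to a different location.
masks = [
    {(0, 0), (0, 1)},
    {(0, 0), (0, 1), (1, 1)},
    {(5, 5)},
]
flagged = frames_needing_review(masks)  # [2]
```

A check like this only prioritizes where a reviewer looks first. It says nothing about whether a stable mask is actually correct.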
This is still early, but the goal is to reduce repetitive setup work before the actual annotation and review process begins.
Would love feedback from people working on medical imaging, video annotation or CV dataset pipelines:
What parts of annotation setup do you think are worth automating, and what parts should always stay manual or human-controlled?
Check it out: https://www.perceptronai.org/