u/Few-Bullfrog3807

AI-Assisted Oncology Variant Reconciliation Platform — Seeking Technical & Clinical Feedback

Hi everyone,

I’m organizing a small team project for an AI/healthcare innovation competition focused on oncology molecular data interoperability and reconciliation.

Our proposed project is:

OncoReconcile AI

An AI-assisted platform designed to standardize and reconcile oncology genomic information across:

  • VCF files
  • molecular pathology PDF reports
  • vendor-specific biomarker formats
  • structured clinical/genomic data

The goal is to transform fragmented molecular oncology data into explainable, standardized, and interoperable outputs that could support:

  • molecular tumor board workflows
  • cohort generation
  • downstream analytics
  • clinical research
  • interoperability pipelines

Current Technical Direction

We are exploring a hybrid architecture combining:

  • HGNC gene normalization
  • HGVS variant normalization
  • ontology-grounded mappings
  • biomedical NLP / entity extraction
  • LLM-assisted reconciliation
  • explainable confidence scoring
  • human-in-the-loop review workflows
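
To make the normalization and confidence-scoring ideas above more concrete, here is a minimal sketch of what we have in mind. Everything in it is an assumption for illustration: the tiny alias table stands in for the full HGNC dataset, the protein-change helper only handles simple missense notation (a real pipeline would rely on a proper HGVS library), and the 0.5/0.5 score weights are arbitrary placeholders, not a validated scoring model.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative alias table only; a real system would load the HGNC dataset.
HGNC_ALIASES = {"ERBB1": "EGFR", "HER1": "EGFR", "KRAS2": "KRAS", "HER2": "ERBB2"}

# One-letter to three-letter amino acid codes.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
AA3_VALUES = set(AA3.values())


def normalize_gene(symbol: str) -> str:
    """Map a reported symbol to its approved symbol (hypothetical alias table)."""
    s = symbol.strip().upper()
    return HGNC_ALIASES.get(s, s)


def normalize_protein_change(text: str) -> Optional[str]:
    """Turn 'L858R' / 'p.L858R' / 'p.Leu858Arg' into 'p.Leu858Arg'.

    Simple missense only; returns None for anything it cannot parse.
    """
    t = text.strip()
    if t.startswith("p."):
        t = t[2:]
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", t)
    if m and m[1] in AA3 and m[3] in AA3:
        return f"p.{AA3[m[1]]}{m[2]}{AA3[m[3]]}"
    m = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", t)
    if m and m[1] in AA3_VALUES and m[3] in AA3_VALUES:
        return f"p.{m[1]}{m[2]}{m[3]}"
    return None


@dataclass
class Reconciliation:
    gene: str
    protein_change: Optional[str]
    confidence: float
    reasons: List[str] = field(default_factory=list)


def reconcile(a: dict, b: dict) -> Reconciliation:
    """Compare two reported findings and explain the resulting score."""
    score, reasons = 0.0, []
    ga, gb = normalize_gene(a["gene"]), normalize_gene(b["gene"])
    if ga == gb:
        score += 0.5  # arbitrary placeholder weight
        reasons.append(f"gene match after alias normalization: {ga}")
    else:
        reasons.append(f"gene mismatch: {ga} vs {gb}")
    pa = normalize_protein_change(a["change"])
    pb = normalize_protein_change(b["change"])
    if pa is not None and pa == pb:
        score += 0.5  # arbitrary placeholder weight
        reasons.append(f"protein change match: {pa}")
    else:
        reasons.append("protein change mismatch or unparseable: route to human review")
    return Reconciliation(ga, pa, score, reasons)


if __name__ == "__main__":
    result = reconcile({"gene": "ERBB1", "change": "L858R"},
                       {"gene": "EGFR", "change": "p.Leu858Arg"})
    print(result.confidence, result.reasons)
```

The point of returning `reasons` alongside the score is that a reviewer can see why two records were judged equivalent, which is what we mean by explainable confidence scoring.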

Potential standards/tools under evaluation include:

  • HL7 FHIR / mCODE
  • ClinVar / ClinGen
  • HGVS
  • BioBERT / scispaCy
  • RAG-based architectures
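
As a rough idea of the standards-aligned output we are considering, the sketch below builds a FHIR-style variant Observation as a plain Python dict. This is an assumption-heavy illustration: `variant_observation` is a hypothetical helper, the resource omits fields a real profile would require (subject, specimen, and so on), and the LOINC codes and code system URIs reflect our current reading of the HL7 genomics reporting guidance, so they should be verified against the published mCODE and Genomics Reporting profiles before use.

```python
import json


def _loinc(code: str, display: str) -> dict:
    """Build a CodeableConcept with a single LOINC coding."""
    return {"coding": [{"system": "http://loinc.org", "code": code, "display": display}]}


def variant_observation(gene_symbol: str, hgnc_id: str, c_hgvs: str, p_hgvs: str) -> dict:
    """Hypothetical helper: minimal, unvalidated variant Observation."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": _loinc("69548-6", "Genetic variant assessment"),
        "component": [
            {
                "code": _loinc("48018-6", "Gene studied [ID]"),
                "valueCodeableConcept": {"coding": [{
                    "system": "http://www.genenames.org/geneId",
                    "code": hgnc_id,
                    "display": gene_symbol,
                }]},
            },
            {
                "code": _loinc("48004-6", "DNA change (c.HGVS)"),
                "valueCodeableConcept": {"coding": [{
                    "system": "http://varnomen.hgvs.org",
                    "code": c_hgvs,
                }]},
            },
            {
                "code": _loinc("48005-3", "Amino acid change (pHGVS)"),
                "valueCodeableConcept": {"coding": [{
                    "system": "http://varnomen.hgvs.org",
                    "code": p_hgvs,
                }]},
            },
        ],
    }


if __name__ == "__main__":
    obs = variant_observation("EGFR", "HGNC:3236", "c.2573T>G", "p.Leu858Arg")
    print(json.dumps(obs, indent=2))
```

In practice we would expect to validate any generated resource against the relevant profiles with a FHIR validator rather than trust a hand-built dict like this one.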

Current MVP Scope

To keep the project realistic for a small team on a limited timeline, we plan to focus on:

  • NSCLC initially
  • a limited hotspot gene set (EGFR, KRAS, ALK, BRAF, etc.)
  • 2–3 molecular vendor formats
  • PDF + VCF reconciliation workflows
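
To illustrate the PDF + VCF reconciliation workflow in the MVP scope, here is a small sketch that compares calls from a VCF record with findings mentioned in report text. Several things here are assumptions: the text is taken as already extracted from the PDF, the `GENE` and `HGVSP` INFO keys are hypothetical (real annotators typically emit their own structured annotation fields), the regex only covers the four example genes and simple missense shorthand, and the sample record is illustrative rather than real patient data.

```python
import re

# Matches e.g. "EGFR L858R" or "BRAF p.V600E" for the four example genes only.
PDF_PATTERN = re.compile(r"\b(EGFR|KRAS|ALK|BRAF)\s+(?:p\.)?([A-Z]\d+[A-Z])\b")


def parse_vcf_line(line: str) -> dict:
    """Parse the eight fixed VCF columns plus two hypothetical INFO keys."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = cols[:8]
    info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    return {
        "chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
        "gene": info_map.get("GENE"),     # hypothetical annotation key
        "change": info_map.get("HGVSP"),  # hypothetical annotation key
    }


def extract_pdf_findings(text: str) -> set:
    """Return (gene, change) pairs mentioned in already-extracted report text."""
    return set(PDF_PATTERN.findall(text))


def compare(vcf_lines, pdf_text: str) -> dict:
    """Split findings into agreed, VCF-only, and PDF-only sets for review."""
    vcf_calls = set()
    for line in vcf_lines:
        if line.startswith("#"):  # skip header lines
            continue
        rec = parse_vcf_line(line)
        if rec["gene"] and rec["change"]:
            change = rec["change"]
            if change.startswith("p."):
                change = change[2:]
            vcf_calls.add((rec["gene"], change))
    pdf_calls = extract_pdf_findings(pdf_text)
    return {
        "agreed": sorted(vcf_calls & pdf_calls),
        "vcf_only": sorted(vcf_calls - pdf_calls),
        "pdf_only": sorted(pdf_calls - vcf_calls),
    }


if __name__ == "__main__":
    vcf = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr7\t55191822\t.\tT\tG\t.\tPASS\tGENE=EGFR;HGVSP=p.L858R",
    ]
    report = "Findings: EGFR L858R (exon 21); BRAF V600E."
    print(compare(vcf, report))
```

Anything landing in `vcf_only` or `pdf_only` is exactly the kind of discrepancy we would route to human review rather than resolve automatically. Note that this naive regex does not handle negation ("not detected"), fusions, or indels, which is part of why we are asking about real-world pain points.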

Feedback We Are Looking For

We would greatly appreciate feedback from people working in:

  • oncology informatics
  • molecular pathology
  • bioinformatics
  • clinical genomics
  • healthcare interoperability
  • biomedical NLP
  • precision medicine platforms

We are especially interested in:

  1. Common real-world reconciliation pain points
  2. Vendor-specific genomic reporting inconsistencies
  3. Explainability and validation expectations
  4. Existing open-source tools/frameworks we should evaluate
  5. Clinical workflow considerations we may overlook
  6. FHIR/mCODE/genomics interoperability best practices
  7. Public datasets suitable for realistic MVP development

We are intentionally positioning this as:

  • AI-assisted,
  • explainable,
  • standards-aligned,
  • human-reviewed,

rather than fully autonomous interpretation.

Thanks in advance for any guidance, references, or suggestions.

reddit.com
u/Few-Bullfrog3807 — 3 days ago