
Built a browser-based molecule viewer with a voice layer on top of Mol*. An OpenFold3 prediction comes back from NVIDIA BioNeMo as mmCIF + confidence scores, gets rendered with Mol*, and a Web Speech API layer accepts commands like "walk through chain A", "focus glutamate 4", or "start guided tour".
No install, no login. Works in Chromium-based browsers such as Chrome and Edge. iOS Safari is click-only (Web Speech isn't supported there).
Stack
- Prediction: OpenFold3 on NVIDIA AI Enterprise / BioNeMo. Demo ships two pre-computed complexes (a zinc-finger–DNA binder and λ Cro bound to operator DNA).
- Rendering: Mol* 4.9.0 via CDN. Cartoon + residue-index coloring for proteins, default nucleic-acid representation for DNA, black canvas.
- Voice: Web Speech API, with `SpeechRecognition` for commands and `SpeechSynthesis` for narration.
- Residue parsing: normalizes number words (`three → 3`), spelled letters (`dee gee three → DG3`), full amino-acid names, and common Chrome mishears (`glue → GLU`, `tire → TYR`, `isle → ILE`, `trip → TRP`). Falls back to Levenshtein with a length-sensitive threshold.
- Camera: a custom `focusFacing()`. Mol*'s default `camera.focus(center, radius)` dollies along the view vector, which drops the camera inside the structure when focusing on a back-facing residue. `focusFacing()` orbits to the outside of the structure centroid first, then focuses.
- Tour orchestration: narration + camera moves + auto-spin between steps + mic pause during TTS (otherwise you get a feedback loop).
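A rough sketch of the geometry behind the camera fix, for anyone curious. This is not the actual `focusFacing()` code and it makes no Mol* calls: `facingPose`, `Vec3`, `CameraPose`, the `margin` factor, and the fallback direction are all hypothetical names and values chosen for illustration. The idea is to put the camera on the ray from the structure centroid through the residue, outside the bounding sphere, so the residue is always on the near side.

```typescript
// Sketch only: plain vector math, no Mol* API. All names are hypothetical.
type Vec3 = [number, number, number];

interface CameraPose {
  position: Vec3; // where the camera sits
  target: Vec3;   // what it looks at
}

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const add = (a: Vec3, b: Vec3): Vec3 => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
const scale = (a: Vec3, s: number): Vec3 => [a[0] * s, a[1] * s, a[2] * s];
const norm = (a: Vec3): number => Math.sqrt(a[0] * a[0] + a[1] * a[1] + a[2] * a[2]);

function facingPose(
  centroid: Vec3,
  residueCenter: Vec3,
  structureRadius: number,
  margin = 1.5,                  // assumed: how far outside the bounding sphere to stand
  fallbackDir: Vec3 = [0, 0, 1], // assumed: direction to use for a residue at the centroid
): CameraPose {
  const outward = sub(residueCenter, centroid);
  const dist = norm(outward);
  // A residue sitting on the centroid has no outward direction, so fall back.
  const dir: Vec3 = dist > 1e-6
    ? [outward[0] / dist, outward[1] / dist, outward[2] / dist]
    : fallbackDir;
  // Stand on the (margin-scaled) bounding sphere, on the residue's side.
  const position = add(centroid, scale(dir, structureRadius * margin));
  return { position, target: residueCenter };
}
```

With the centroid at the origin, a residue at (0, 0, -5), and a bounding radius of 20, this places the camera at (0, 0, -30), looking back in at the residue instead of dollying through the structure to reach it.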
Features that were annoying to build
- Chrome transcribes 3-letter codes as homophones. Lysine → lice/like. Tryptophan → trip. Threonine → thor. Had to hand-curate an alias table.
- `playsInline` + muted autoplay differences across iOS Safari vs Chromium.
- Coordinating the SpeechRecognition state machine with Mol*'s render loop during tours: the mic has to stop before TTS starts, restart after, and the start button also has to act as a stop-tour button.
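For the homophone problem in particular, here is a minimal sketch of how an alias lookup with a Levenshtein fallback could fit together. The tables are tiny illustrative subsets built from the examples in this post, not the real alias table, and every name here (`parseResidue`, `fuzzyLookup`, `lookup`, the table names) is hypothetical. The threshold of 1 edit for words up to 4 letters and 2 edits otherwise is an assumed stand-in for "length-sensitive".

```typescript
// Sketch only. Tables are small illustrative subsets, not the real alias table.
interface ResidueRef {
  name: string;  // residue name, e.g. "GLU" or "DG"
  index: number; // residue number
}

const NUMBER_WORDS: Record<string, number> = { one: 1, two: 2, three: 3, four: 4, five: 5 };
// Spelled letters as the recognizer might transcribe them.
const LETTER_WORDS: Record<string, string> = { dee: "D", gee: "G" };
// Full names and known mishears, both mapped to 3-letter codes.
const NAME_ALIASES: Record<string, string> = {
  glutamate: "GLU", tyrosine: "TYR", isoleucine: "ILE",
  tryptophan: "TRP", lysine: "LYS", threonine: "THR",
  glue: "GLU", tire: "TYR", isle: "ILE", trip: "TRP",
  lice: "LYS", like: "LYS", thor: "THR",
};

// Own-property lookup so words like "constructor" never hit Object.prototype.
function lookup<T>(table: Record<string, T>, key: string): T | undefined {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

// Classic two-row edit distance (insert, delete, substitute all cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed threshold: short words tolerate 1 edit, longer ones 2.
function fuzzyLookup(word: string): string | undefined {
  const maxDist = word.length <= 4 ? 1 : 2;
  let best: string | undefined;
  let bestDist = maxDist + 1;
  for (const alias of Object.keys(NAME_ALIASES)) {
    const d = levenshtein(word, alias);
    if (d < bestDist) {
      best = NAME_ALIASES[alias];
      bestDist = d;
    }
  }
  return best;
}

function parseResidue(phrase: string): ResidueRef | null {
  const tokens = phrase.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length < 2) return null;

  // The last token is the residue number, as digits or as a number word.
  const last = tokens[tokens.length - 1];
  const index = /^\d+$/.test(last) ? parseInt(last, 10) : lookup(NUMBER_WORDS, last);
  if (index === undefined) return null;

  // "dee gee" style input: every remaining token is a spelled letter.
  const nameTokens = tokens.slice(0, -1);
  const letters = nameTokens.map((t) => lookup(LETTER_WORDS, t));
  if (letters.every((l) => l !== undefined)) {
    return { name: letters.join(""), index };
  }

  // Exact alias first, fuzzy match only as a fallback.
  const word = nameTokens.join("");
  const code = lookup(NAME_ALIASES, word) ?? fuzzyLookup(word);
  return code === undefined ? null : { name: code, index };
}
```

Here `parseResidue("dee gee three")` yields `{ name: "DG", index: 3 }`, and both `"glutamate 4"` and `"glue four"` resolve to `GLU` 4. Trying the exact table before the fuzzy pass keeps the hand-curated mishears authoritative, so the edit-distance fallback only has to catch near-misses of known words.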
What it doesn't do (yet)
- No prediction job submission — hardcoded to the two pre-computed outputs.
- No MSA handling.
- pLDDT per-residue comes back in the JSON but isn't painted on the surface yet. Trivial to add via Mol*'s `plddt-confidence` theme, just haven't.
- No ligands. OpenFold3 supports them; I haven't added a non-polymer representation.
- No export: neither a PNG snapshot nor a downloadable mmCIF is wired up.
Try it
https://sheldonbarnes.com/tools/ai-voice-guided-molecule-viewer
Click the icon in the bottom-right to activate the mic. Say "what can I do" for a 30-second narrated capability demo. If voice isn't your thing, the sidebar has a residue picker and per-chain walkthrough buttons.
Looking for feedback
- Voice command grammar — what commands would actually be useful in a real bioinformatics workflow vs demo territory?
- Is pLDDT painted on the surface worth the cognitive load for non-specialists, or does it overwhelm the initial read?
- Export needs — does anyone here actually want to render → download, or is the goal always to link out to the underlying structure?
- Teaching — anyone using anything like this in an undergrad biochem or structural-bio course? Interested in what the gaps look like there.