u/Comfortable-Edge-915 — reddlx

https://preview.redd.it/orpvoyzmg6wg1.jpg?width=796&format=pjpg&auto=webp&s=6a80bcb82b1aa4c10ca3e59a242fcbf220d486ab

Built a browser-based molecule viewer with a voice layer on top of Mol*. An OpenFold3 prediction comes back from NVIDIA BioNeMo as mmCIF + confidence scores, gets rendered with Mol*, and a Web Speech API layer accepts commands like "walk through chain A", "focus glutamate 4", or "start guided tour".
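A minimal sketch of how a transcript like "focus glutamate 4" could be turned into a residue reference, written in TypeScript. The table contents, the function names, and the exact threshold rule (one edit for words of up to four letters, two for longer ones) are illustrative assumptions, not the demo's actual code.

```typescript
// Hypothetical, heavily abbreviated tables; a real alias table would be much larger.
const NUMBER_WORDS = new Map<string, string>([
  ["one", "1"], ["two", "2"], ["three", "3"], ["four", "4"], ["five", "5"],
  ["six", "6"], ["seven", "7"], ["eight", "8"], ["nine", "9"], ["ten", "10"],
]);

// Spelled-out letters, used for nucleotide names such as DG.
const LETTER_WORDS = new Map<string, string>([
  ["dee", "D"], ["ay", "A"], ["see", "C"], ["gee", "G"], ["tee", "T"],
]);

// A few full names plus the mishears mentioned in the post.
const RESIDUE_ALIASES = new Map<string, string>([
  ["glutamate", "GLU"], ["lysine", "LYS"], ["tryptophan", "TRP"],
  ["threonine", "THR"], ["tyrosine", "TYR"], ["isoleucine", "ILE"],
  ["glue", "GLU"], ["tire", "TYR"], ["isle", "ILE"], ["trip", "TRP"],
  ["lice", "LYS"], ["like", "LYS"], ["thor", "THR"],
]);

// Classic two-row dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Exact alias lookup first, then a fuzzy match within the assumed threshold.
function resolveResidueName(word: string): string | null {
  const exact = RESIDUE_ALIASES.get(word);
  if (exact !== undefined) return exact;
  const maxDistance = word.length <= 4 ? 1 : 2; // assumed length-sensitive rule
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  RESIDUE_ALIASES.forEach((code, alias) => {
    const d = levenshtein(word, alias);
    if (d < bestDistance) {
      bestDistance = d;
      best = code;
    }
  });
  return best;
}

interface ResidueRef { name: string; seq: number; }

// Finds the first number in the transcript and resolves the word(s) just before it.
function parseResidue(transcript: string): ResidueRef | null {
  const tokens = transcript.toLowerCase().split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0)
    .map((t) => NUMBER_WORDS.get(t) ?? t);
  const numIdx = tokens.findIndex((t) => /^\d+$/.test(t));
  if (numIdx < 1) return null;
  const seq = Number(tokens[numIdx]);
  // Spelled letters directly before the number ("dee gee three") form the name.
  let letters = "";
  for (let i = numIdx - 1; i >= 0; i--) {
    const letter = LETTER_WORDS.get(tokens[i]);
    if (letter === undefined) break;
    letters = letter + letters;
  }
  if (letters.length > 0) return { name: letters, seq };
  const name = resolveResidueName(tokens[numIdx - 1]);
  return name === null ? null : { name, seq };
}
```

With these tables, "focus glue four" and "focus glutamat 4" both resolve to GLU 4, "dee gee three" resolves to DG 3, and a transcript with no number in it (such as "start guided tour") returns null so it can be routed to a different command handler.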

No install, no login. Works in Chromium-based browsers, including Edge. iOS Safari is click-only (Web Speech isn't supported there).

Stack

Prediction: OpenFold3 on NVIDIA AI Enterprise / BioNeMo. Demo ships two pre-computed complexes (a zinc-finger–DNA binder and λ Cro bound to operator DNA).
Rendering: Mol* 4.9.0 via CDN. Cartoon + residue-index coloring for proteins, default nucleic-acid representation for DNA, black canvas.
Voice: Web Speech API — SpeechRecognition for commands, SpeechSynthesis for narration.
Residue parsing: normalizes number words (three → 3), spelled letters (dee gee three → DG3), full amino-acid names, and common Chrome mishears (glue → GLU, tire → TYR, isle → ILE, trip → TRP). Anything still unmatched falls back to Levenshtein-distance matching with a length-sensitive threshold.
Camera: a custom focusFacing(). Mol*'s default camera.focus(center, radius) dollies along the current view vector, which drops the camera inside the structure when the target residue is on the back side. focusFacing() first orbits the camera around the structure centroid until the residue faces the viewer, then focuses.
Tour orchestration: narration + camera moves + auto-spin between steps + mic pause during TTS (otherwise you get a feedback loop).
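The geometry behind the focusFacing() idea above can be sketched in a few lines of vector math. This is one plausible reading of the description, not the demo's actual code: put the camera on the ray from the structure centroid through the residue, some distance beyond the residue, and look back at it. The names facingPose and CameraPose are hypothetical, and handing the pose to the viewer is left out because it depends on the camera API in use.

```typescript
type Vec3 = [number, number, number];

function sub(a: Vec3, b: Vec3): Vec3 {
  return [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
}

function add(a: Vec3, b: Vec3): Vec3 {
  return [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
}

function scale(a: Vec3, s: number): Vec3 {
  return [a[0] * s, a[1] * s, a[2] * s];
}

function norm(a: Vec3): number {
  return Math.sqrt(a[0] * a[0] + a[1] * a[1] + a[2] * a[2]);
}

interface CameraPose {
  position: Vec3; // where the camera sits
  target: Vec3;   // what it looks at
}

// Places the camera outside the structure, on the residue's side of the centroid.
function facingPose(centroid: Vec3, residue: Vec3, distance: number): CameraPose {
  const outward = sub(residue, centroid);
  const len = norm(outward);
  // Degenerate case: the residue sits on the centroid, so any direction works; pick +z.
  const dir: Vec3 = len > 1e-6 ? scale(outward, 1 / len) : [0, 0, 1];
  return { position: add(residue, scale(dir, distance)), target: residue };
}
```

For a roughly globular structure this keeps the residue between the camera and the bulk of the molecule, which is the property a plain dolly along the old view vector does not guarantee. Deep pockets or strongly concave shapes would need something smarter, such as checking for occluding atoms along the view ray.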

Features that were annoying to build

Chrome transcribes residue names and 3-letter codes as near-homophones: lysine → lice/like, tryptophan → trip, threonine → thor. I had to hand-curate an alias table.
playsInline and muted-autoplay behavior differs between iOS Safari and Chromium.
Coordinating the SpeechRecognition state machine with Mol*'s render loop during tours — the mic has to stop before TTS starts, restart after, and the start button also has to act as a stop-tour button.
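The mic/TTS hand-off in the last item can be modeled as a small event-driven state machine. The sketch below is an assumption about how that might be structured, not the demo's code: all browser work is hidden behind a hypothetical TourEffects interface, which in a real page would wrap SpeechRecognition start/stop and a SpeechSynthesisUtterance whose "end" event calls narrationEnded().

```typescript
// Hypothetical side-effect hooks; in the browser these would wrap the Web Speech API.
interface TourEffects {
  startMic(): void;
  stopMic(): void;
  speak(text: string): void; // the caller must invoke narrationEnded() when speech finishes
  cancelSpeech(): void;
}

type TourState = "idle" | "speaking" | "between-steps";

class TourStateMachine {
  state: TourState = "idle";
  private index = 0;
  private readonly steps: string[];
  private readonly fx: TourEffects;

  constructor(steps: string[], fx: TourEffects) {
    this.steps = steps;
    this.fx = fx;
  }

  // The single button: starts a tour when idle, stops it otherwise.
  pressButton(): void {
    if (this.state === "idle") {
      if (this.steps.length === 0) return;
      this.index = 0;
      this.speakCurrent();
    } else {
      this.stopTour();
    }
  }

  // Called when TTS finishes the current narration.
  narrationEnded(): void {
    if (this.state !== "speaking") return; // ignore late events after a stop
    this.fx.startMic(); // safe to listen again once the page is silent
    this.index++;
    this.state = this.index >= this.steps.length ? "idle" : "between-steps";
  }

  // Called when the pause between steps is over (timer, camera move done, ...).
  nextStep(): void {
    if (this.state !== "between-steps") return;
    this.speakCurrent();
  }

  private speakCurrent(): void {
    this.fx.stopMic(); // never listen while the page is talking
    this.state = "speaking";
    this.fx.speak(this.steps[this.index]);
  }

  private stopTour(): void {
    const wasSpeaking = this.state === "speaking";
    this.state = "idle";
    if (wasSpeaking) {
      this.fx.cancelSpeech();
      this.fx.startMic(); // the mic was paused for narration, so hand it back
    }
  }
}
```

Two details matter in practice. The mic is only restarted from a state where it was previously stopped, which avoids calling start() on a recognizer that is already running. And narrationEnded() is ignored outside the speaking state, so an "end" or "error" event fired by a cancelled utterance cannot restart a tour the user just stopped.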

What it doesn't do (yet)

No prediction job submission — hardcoded to the two pre-computed outputs.
No MSA handling.
pLDDT per-residue comes back in the JSON but isn't painted on the surface yet. Trivial to add via Mol*'s plddt-confidence theme — just haven't.
No ligands. OpenFold3 supports them; I haven't added a non-polymer representation.
No export: neither PNG snapshots nor mmCIF downloads are wired up.
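For the pLDDT item above, the banding itself is simple enough to sketch without touching the renderer. The code below assumes scores on a 0 to 100 scale and uses the four confidence bands and colors familiar from the AlphaFold Database; whether OpenFold3's JSON uses that scale, and how Mol*'s own theme treats the exact boundary values, are assumptions here. The function names are hypothetical.

```typescript
interface ConfidenceBand {
  label: string;
  color: string; // hex color, AlphaFold Database style
}

// Maps one pLDDT score (assumed 0..100) to a confidence band.
function plddtBand(score: number): ConfidenceBand {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError("pLDDT must be in [0, 100], got " + score);
  }
  // Boundary values fall into the lower band; that choice is an assumption.
  if (score > 90) return { label: "very high", color: "#0053D6" };
  if (score > 70) return { label: "confident", color: "#65CBF3" };
  if (score > 50) return { label: "low", color: "#FFDB13" };
  return { label: "very low", color: "#FF7D45" };
}

// Per-residue colors for a whole chain, in residue order.
function plddtColors(scores: number[]): string[] {
  return scores.map((s) => plddtBand(s).color);
}
```

A lookup like this is also handy outside the viewer, for example to color the sidebar residue picker or to let the narration mention that a residue sits in a low-confidence region.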

Try it
https://sheldonbarnes.com/tools/ai-voice-guided-molecule-viewer

Click the icon in the bottom-right to activate the mic. Say "what can I do" for a 30-second narrated capability demo. If voice isn't your thing, the sidebar has a residue picker and per-chain walkthrough buttons.

Looking for feedback

Voice command grammar — what commands would actually be useful in a real bioinformatics workflow vs demo territory?
Is pLDDT painted on the surface worth the cognitive load for non-specialists, or does it overwhelm the initial read?
Export needs — does anyone here actually want to render → download, or is the goal always to link out to the underlying structure?
Teaching — anyone using anything like this in an undergrad biochem or structural-bio course? Interested in what the gaps look like there.