u/Expensive_Spend_9187

▲ 2 r/molecularbiology+1 crossposts

I have been working on a drug resistance prediction tool after seeing how manual the process still is when you work through the major catalogues. It takes a gene, a mutation, and a drug, and returns a resistance call with a confidence score, a rough fitness cost estimate, and a population-genetics view of how the mutation would behave under selection. I have expanded the classifiers to cover beta-lactamases, AMR, oncology kinases, TB, HIV, and other targets.
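To make the "behave under selection" part concrete, here is a minimal sketch using the textbook deterministic haploid selection recursion, p' = p·w_mut / (p·w_mut + (1 − p)·w_wt). This is only an illustration and not necessarily the model the tool uses; the function name and the fitness values are hypothetical.

```python
def frequency_trajectory(p0, w_mutant, w_wildtype, generations):
    """Mutant allele frequency per generation under deterministic haploid selection.

    p0         -- starting frequency of the mutant (0..1)
    w_mutant   -- relative fitness of the mutant (hypothetical value)
    w_wildtype -- relative fitness of the wild type (hypothetical value)
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be between 0 and 1")
    if w_mutant <= 0 or w_wildtype <= 0:
        raise ValueError("fitness values must be positive")

    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Mean fitness of the population at the current frequency.
        mean_fitness = p * w_mutant + (1.0 - p) * w_wildtype
        # Standard one-generation update for a haploid population.
        p = p * w_mutant / mean_fitness
        freqs.append(p)
    return freqs


# Assumed numbers for illustration only: a 5% fitness cost without drug,
# and a 50% advantage while the drug is present.
off_drug = frequency_trajectory(0.01, 0.95, 1.0, 50)
on_drug = frequency_trajectory(0.01, 1.5, 1.0, 50)
```

With these assumed values the mutant declines slowly without drug and approaches fixation within 50 generations under drug pressure, which is the kind of contrast a fitness cost estimate is meant to capture.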

I have benchmarked it on CRyPTIC for tuberculosis and got AUROC around 0.81 across roughly 400 mutations spanning rpoB, gyrA, gyrB, parC, and a few others. An independent test on Hauser's Abl1 dataset came in at AUROC 0.72. HIV through Stanford HIVdb came in much weaker at AUROC 0.46, which is below the 0.5 expected from random ranking. I traced that to mixture-format parsing and to single-drug training assumptions that do not hold for polypharmacy. Coverage right now is a handful of TB and gram-negative targets plus some kinase oncology ones I added for stress testing.

I would also like to know whether it would be more sensible to focus on a specific area such as AMR/TB, where I have gotten the best results, or to cover more ground.
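For readers less familiar with the metric, here is a minimal sketch of what the AUROC figures above measure: the probability that a randomly chosen resistant mutation gets a higher score than a randomly chosen susceptible one. The function and the toy data are hypothetical and are not the benchmark code or results.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC; labels are 1 = resistant, 0 = susceptible."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one resistant and one susceptible example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # resistant example ranked above susceptible one
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
value = auroc(labels, scores)                       # 3 of 4 pairs ranked correctly
flipped = auroc(labels, [1.0 - s for s in scores])  # inverted ranking gives 1 - value
```

A value of 0.5 corresponds to random ranking, so a score below 0.5 means the ranking is slightly inverted. That pattern often points to a label or parsing problem rather than a model with no signal.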

I am posting because I want honest feedback. Specifically, I would like to know whether the existing catalogue tools (CARD, ResFinder, AMRFinderPlus, WHO TB catalogue) already cover these needs, or whether a tool that returns a confidence-scored prediction with fitness context would fit somewhere. I am happy to give access to anyone who wants to try it.

u/Expensive_Spend_9187 — 21 days ago

I have been working on a drug resistance prediction tool after discovering that much of this process is still done manually with catalogues. I have built something that automates it.

The user gives it a gene, a mutation, and a drug, and it returns a resistance call along with a fitness cost estimate and a sense of how the mutation would behave under selection pressure. I have benchmarked it on CRyPTIC for tuberculosis and got AUROC around 0.81 across roughly 400 mutations. Coverage right now spans some TB and gram-negative targets and a few oncology ones I added for stress testing.

I am posting because I want to know if this would actually be useful in anyone's workflow, or if the existing catalogue tools are already covering it well enough. I am happy to give access to anyone who wants to try it.

u/Expensive_Spend_9187 — 21 days ago