r/biostatistics

Final interview was just shy of two weeks ago, no result

Hi all,

Had a final interview for a senior-level position just shy of two weeks ago now. The employer told me the Tuesday thereafter (so last Tuesday) that I was the first final interview, that they’d be wrapping up the final interviews by the end of that week, and that they’d be debriefing to make final decisions about offers early the next week (i.e., this week). They asked where I was in any other interview processes, and I told them I’m about halfway through some others. They thanked me, told me to update them if that changes, and said they’d have “more details shortly.”

Yesterday morning, I emailed them to ask for an update and informed them I have been moved to final round interviews with another employer. Nothing, no response. If I don’t hear anything back by EOD today I’m just assuming they went with someone else.

Am I cooked? Or are they genuinely still getting this stuff worked through the corporate powers that be? It’s not a particularly massive company.

reddit.com
u/Kooky-Shock-8021 — 15 hours ago
▲ 19 r/biostatistics+2 crossposts

Which is better for cell proliferation data, count regression or t-tests? I had to know

In biology you often count things: cells of type A out of total cells of type B, mutant flies out of total flies, etc. The most common move in papers is to compute a ratio per animal and run a t-test on the ratios. This throws away how many cells you actually counted ("5/100" and "50/1000" become the same) and feeds data strictly bounded on [0, 1] to a t-test. The principled alternative is count regression with offset(log(N)): model the raw count directly, bring the total in as an exposure term, and respect the non-Gaussian nature of count data. This week I decided to test this assumption in practice.
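To make the "5/100 vs 50/1000" point concrete, here is a minimal sketch that is not from the original post. It assumes a simple binomial model and uses the Wald standard error of a proportion; the function name `ratio_and_se` is made up for illustration. The two ratios are identical, but the precision behind them is not.

```python
import math


def ratio_and_se(count, total):
    """Return the observed ratio and its binomial (Wald) standard error."""
    p = count / total
    se = math.sqrt(p * (1 - p) / total)
    return p, se


# Same ratio, very different amounts of information behind it.
for count, total in [(5, 100), (50, 1000)]:
    p, se = ratio_and_se(count, total)
    print(f"{count}/{total}: ratio={p:.3f}, approx. SE={se:.4f}")
```

A t-test on per-animal ratios only ever sees the first number of each pair, which is the information loss described above.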

Setup. Four methods across two pipelines:

  • Animal-level: Welch's t-test on ratios vs CMP GLM (glmmTMB(..., family = compois()))
  • Field-level: LMM with (1 | EmbryoID) vs CMP GLMM with the same RE

Three metrics: Type-I error, size-adjusted power (Lloyd correction), median 95% CI width.

The interesting bit. Instead of running ~10k sims at one design, I sampled 300 designs over a 6-dim space with Latin hypercube (log-uniform on multiplicative knobs, linear on CV, discrete on n_animals), ran 200-500 sims per design × method, then fit GP emulators (hetGP, Matérn 5/2 + ARD) on the point estimates. (I try to run and hide but come back to GAMs one way or another :)). LOOCV verified they generalize. Sobol decomposition tells me which design knobs drive each method's response; Monte Carlo marginalization over nuisance knobs gives clean 2D heatmaps of power and CI width on (n_animals, CV).
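The design-sampling step can be sketched roughly as follows. This is an illustration only, not the code from the post: it uses three knobs instead of six, and the knob names and ranges (`effect_ratio`, `cv`, `n_animals`) are hypothetical placeholders.

```python
import math
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata, then shuffle.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def log_uniform(u, low, high):
    """Map u in [0, 1) to a log-uniform value between low and high."""
    return math.exp(math.log(low) + u * (math.log(high) - math.log(low)))


def linear(u, low, high):
    """Map u in [0, 1) to a uniform value between low and high."""
    return low + u * (high - low)


def discrete(u, choices):
    """Map u in [0, 1) to one of the listed choices with equal probability."""
    return choices[min(int(u * len(choices)), len(choices) - 1)]


rng = random.Random(0)
unit = latin_hypercube(300, 3, rng)
designs = [
    {
        "effect_ratio": log_uniform(u[0], 1.1, 3.0),      # hypothetical multiplicative knob
        "cv": linear(u[1], 0.1, 1.0),                     # hypothetical CV range
        "n_animals": discrete(u[2], list(range(3, 21))),  # hypothetical discrete knob
    }
    for u in unit
]
print(designs[0])
```

Each design would then get its own batch of simulations per method, and the emulator is fit to the resulting point estimates.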

Findings.

  • Both methods hit 80% power at essentially the same (n_animals, CV) spot. Below that threshold, in the underpowered regime where most real experiments live, count regression beats the ratio approach.
  • CMP GLMM produces narrower CIs than LMM at essentially 100% of designs (median ~12% narrower). CMP GLM beats Welch at ~97% (~7% narrower).
  • Adding random effects shifts the 80% power contour to the left: fewer animals for the same power.
  • Sobol shows all four methods have nearly identical sensitivity profiles. The precision advantage isn't about one method responding to a knob the others ignore; it's about how efficiently each one extracts information from the same drivers.

Practical takeaway. Default to glmmTMB(Y ~ Group + offset(log(N)) + (1 | EmbryoID), family = compois()). The CMP advantage is real and lives in the small-n regime. If you have huge n, all four agree.
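For readers unfamiliar with offsets, the recommended formula corresponds to the mean model below, assuming a log link. The notation is illustrative and not from the original post: i indexes embryos, j indexes fields within an embryo, and u_i is the embryo-level random intercept.

```latex
\log \mathbb{E}[Y_{ij} \mid u_i]
  = \beta_0 + \beta_1\,\mathrm{Group}_i + \log N_{ij} + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)

\Longleftrightarrow\quad
\frac{\mathbb{E}[Y_{ij} \mid u_i]}{N_{ij}}
  = \exp\!\left(\beta_0 + \beta_1\,\mathrm{Group}_i + u_i\right)
```

The offset enters with its coefficient fixed at 1, so the model is for the expected ratio Y/N while the likelihood still sees the raw count.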

Full reproducible post with code:

u/rrytas — 1 day ago

Where to find Biostatisticians?

Hello all,

I am a master's student working on my thesis. My project involves a three-arm study comparing the safety and efficacy of 3 drugs, and I need to calculate a sample size.

I am trying to find a biostatistician to help me find the sample size. (I can pay)

Are there any places where I can find reliable biostatisticians?

reddit.com
u/ExtraMediumFromage — 3 days ago

Advice on entering the Biostatistics field

Hello! I am interested in pursuing Biostatistics but am not sure how to go about starting in the field. I have a Bachelor’s degree in Psychology and work experience as an office administrator. For the past year, I’ve been tasked with making sheets, reports, and graphs for all sorts of stats like demographics, heat-related cases, drugs, etc.

I was thinking about getting a graduate certificate in Biostatistics / Epidemiology. Is this the right way to go about it? Are there better alternatives? Any programs you recommend?

reddit.com
u/Corporate-Gorilla — 4 days ago

Side Hustles

Hey all,

As the title suggests, I wanted to hear from others about any side hustles that use skills a biostatistician typically has and that also pay well or at least make you more fulfilled. I work at an R1 institution, so I know if I look around I might find something, but I was wondering if there are other gigs I should look into. I'm fine with working on weekends or after my usual full-time hours end. I'm really considering being a barista on weekends as a fun thing to do, but will definitely choose finances over individual aspirations. Thanks!

reddit.com
u/LavishnessJolly1681 — 7 days ago

Big pharma to small biotech - good or bad career move for statistician?

I’ve been working as a biostatistician in big pharma for four years, and it’s the only industry setting I’ve known since I went from grad school into this company. I’m interviewing for a similar role with a small biotech that is fairly well funded. I’d be supporting a registrational clinical trial, which I have not done in my current job so that is a step up.

I’m wondering how a move like this is viewed. If I joined a smaller biotech/startup and later on decided I wanted to return to big pharma, would that be difficult? Does making that transition raise concerns on a CV?

I’m also thinking longer term about the impact of AI and industry trends. Part of my concern is that biostatistics roles in large pharma may become more limited over time, which could make it harder to even find opportunities in the future.

reddit.com
u/Curious-Code-4370 — 7 days ago
▲ 896 r/biostatistics+4 crossposts

The emerging cancer treatment that’s exciting scientists: ‘We’ve just scratched the surface on what’s possible’ | Cancer | The Guardian

theguardian.com
u/okietarheel — 12 days ago
▲ 19 r/biostatistics+1 crossposts

Python vs. R for Academic Biostatistics

As you have observed, the exponential growth of AI and large language models (LLMs) will affect the field of biostatistics. Considering this an investment in the future, which side would you take in the Python vs. R debate, a common comparison in biostatistics, by 2026? Could you explain your reasoning?

reddit.com
u/howruhow — 9 days ago

Got a screening invite to a job I was previously rejected from. What do?

Hi all,

I applied to a Sr. Outcomes Data Scientist position at a mid-sized biotech last December. Interviewed in Jan, made it to the takehome technical and dropped after that in Feb.

The job has since been reposted several times (this is the third repost since my initial attempt), evidently without any luck finding a new candidate. Nearly six months after I initially applied, I decided to give it another shot with a different Workday account (it’s the exact same job; I could see it under the inactive tab of my Workday account).

It is with a different recruiter (different person contacted me for the screener) and, I assume, the same hiring manager. I intend on disclosing pretty much immediately at the screen. What’s the etiquette here? What’s the chance I won’t just immediately be dropped again?

reddit.com
u/Kooky-Shock-8021 — 8 days ago
▲ 2 r/biostatistics+1 crossposts

Do independent biostatistician consultants/freelancers ever hire experienced technical support?

I’ve spent 5 years in the clinical trial industry. I’m thinking about pivoting toward a flexible, international freelance model and am curious about the demand for high-level sub-contracting, because I'm new to this world and would prefer not to fly solo immediately.

Specifically, I’m looking to support independent consultants who have the client base but lack the bandwidth for the actual implementation and modeling.

What I can provide:

  • Technical Independence: 5 years of CRO-like experience (the company I work with isn't a "typical" CRO, but our team does what a CRO does). I can handle modeling (MMRM, GLM, nonlinear mixed models, etc.) and regulatory-stakes work without needing technical hand-holding. With an M.S. in Statistics, I also have knowledge of advanced statistical methodologies.
  • Niche Expertise: Deep background in cardiac safety/ICH E14.
  • Mutual Benefit: I’m looking to learn from a veteran while providing them the capacity to take on more projects.

My Questions:

  1. For those running your own consultancy: Do you ever use experienced sub-contractors to scale, or do you prefer to stay strictly solo?
  2. Is there a specific platform or network where this "second-in-command" model is common in our industry?

Any feedback or insights are greatly appreciated!!

reddit.com
u/Joysien — 7 days ago

MPhil or MSc

I am 24 and hold a BSc in midwifery, and I'm thinking of going into biostatistics. I am genuinely confused as to which one is more sought after, with employment rates at an all-time low. I am trying to do anything to make life better for myself, so any advice or opinion will be warmly welcomed.

reddit.com
u/Deep_Journalist_960 — 7 days ago

Opinion on a result of a clinical study ($CMPX)

Compass Therapeutics ($CMPX) released the results of the Phase 2/3 trial of their drug against biliary tract cancer.

Looking at the results, I believe they are quite positive (I have a formal education in statistics though).

I'm holding back further details to avoid biasing your judgment.

What puzzles me is that financial markets decided the results were so bad that the drug would be denied approval for commercialization by the FDA.

I was wondering if I was missing anything?

investors.compasstherapeutics.com
u/neo2551 — 11 days ago

Academic vs. pharma trial work as a biostatistician: what is different?

I have worked in academia as a staff biostatistician for the past couple of years and will be going back to school for a PhD. In my current role, I have mostly worked on NIH-funded trials (not drug or device related).

One thing I have found challenging is that, during recruitment and follow-up, I often do not feel like I have much to contribute. Even when a trial funds a substantial portion of my effort over years, my work during that stage is mostly data cleaning, SAP development, and occasional support tasks. For trials where I am required to stay blinded, I may not even be involved much in providing data updates to the study team, and there is sometimes limited time for dry runs before the final analysis.

I do work with senior biostatisticians, and they are the ones who actually give me tasks, but looking back, many of those tasks did not directly contribute much to the final analyses. I have also been involved a little in grant submissions (power calculations), but most of the design work is still led by faculty supervisors.

That said, the work-life balance has been good, and I still really like biostatistics. After PhD training, I am seriously considering opportunities in pharma, so I am curious how trial work differs in industry, especially for FDA-regulated trials.

For people who have experience with both academic trials and pharma/industry trials:

  • In early-career industry roles, how many trials are you typically working on at the same time?
  • What do biostatisticians usually do before data lock?
  • How much time is usually allocated to dry runs and final analyses?
  • Do biostatisticians work on multi-year trials from beginning to end, or do they just join at certain important time points?

Apologies if these questions are naive. I am just trying to get a better sense of what trial work looks like in pharma, and I hope the discussion may also be helpful to others who are curious about the academia-to-industry path.

reddit.com
u/These-Interview312 — 11 days ago

Accepted into BU Biostats PhD with funding!

DM me if you are too. I want to meet my class. This is a dream come true!

What’s the best thing I can do to prepare coming out of a math major where I don’t do much stats? Thank you!

reddit.com
u/Huge-Cry-1560 — 12 days ago

Subjectivity in biostats?

Is it just me or is there actually a lot more subjectivity than I expected in this field which makes me feel less wary about AI? For example using good statistical judgment, making sure to not fish for significance, limitations of the specific application, defining success in a trial… these are all strategic and situational right? AI couldn’t necessarily be trusted with these judgement calls even if it can build models?

reddit.com
u/Ok_Occasion_906 — 11 days ago
▲ 3 r/biostatistics+1 crossposts

Am I using the Right Approach?

I am hoping someone can give me some guidance! I struggle a lot with statistics. The idea of stats sounds fun because you have to look at your data and decide which approach statistically is best and gives your data the most love, but that's what also makes it so stressful! I would love to get some insight.

I am trying to look at the relationship between bacterial loads of a blood-borne pathogen of hosts and their parasites (there is only one order of parasite but multiple species, and there are multiple species of hosts). However, some hosts have 1 parasite while others have four or more (with varying bacterial statuses from zero to way higher). I have 144 hosts and 270 parasites.

I originally wanted to do a T-test or ANOVA, but I don't believe bacterial loads are independent data, since the presence or absence of the pathogen can be dependent on vector competence and other factors.

I am thinking now that I need to do a nonparametric test of some sort, since my data are neither normally distributed nor independent.

I have been interested in looking at host sex and age to see if those could also be factors in this relationship, but I am concerned that it would shrink my already small dataset. I was really excited about doing species-related relationship stats, but my parasite/host dataset is too small, and I had some difficulties getting pathogen species results.

Can someone give me recommendations on good statistics resources, specifically anything for disease ecology? Thank you!

reddit.com
u/Embarrassed-Oil-1312 — 12 days ago
▲ 3 r/biostatistics+1 crossposts

Using cross-validation for lambda selection vs model validation in LASSO: are they the same thing?

So basically I've just done a maths degree and am now doing a master's in statistics. So I am fairly new to validation and stuff.

I'm working on a LASSO logistic regression model and am confused about whether the cross-validation used to optimise lambda is sufficient for model validation, or whether a separate validation step is needed afterward.

Suppose for simplicity we use k=2 fold cross-validation to select lambda, and get λ₁ = 0.9 and λ₂ = 0.8 from each fold. We conclude our optimal lambda is λ = 0.85 (the mean), and fit our final LASSO model using this value.

Now we want to assess whether this model is any good. My instinct is that using the same cross-validation process that selected lambda to also assess model performance feels circular: the very procedure used to tune the model is being used to evaluate it.

In other prediction modelling I've learnt, the workflow seems cleaner and more separated out: fit your model, declare it fixed, then bootstrap it purely as a validation tool, specifically by estimating optimism as the difference between AUC on the bootstrap sample and AUC on the original data, averaged over B iterations. The bootstrap doesn't touch the model itself.

So it strikes me that for LASSO there should be a separate validation step after lambda selection. As I see it, the options are:

  1. K-fold cross-validation applied to the final model at λ = 0.85, purely to assess performance across k held-out sets
  2. Nested cross-validation — lambda optimisation in an inner loop, performance assessment in an outer loop
  3. Bootstrap with the full lambda-optimisation process nested inside each iteration so that each bootstrap sample runs its own cv.glmnet to select its own lambda, fits on the bootstrap sample, then predicts back on the original data to estimate optimism
  4. A simpler bootstrap that just assesses how the fixed λ = 0.85 model performs across 1000 resampled datasets, without re-running lambda selection each time

Option 3 seems most rigorous to me but I want to understand whether option 4 is defensible, and whether options 1 or 2 are commonly used in practice for this problem.
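Option 3 can be sketched in a few lines. This is a toy stand-in, not a LASSO logistic model: it uses one-predictor ridge regression with MSE instead of AUC so that it runs on the standard library alone, and every function name here is hypothetical. The structural point is that lambda selection is repeated inside each bootstrap iteration. Note also that `select_lambda` picks the grid value with the lowest mean held-out error across folds, which is how cv.glmnet chooses lambda.min, rather than averaging per-fold optima.

```python
import random


def fit_ridge(xs, ys, lam):
    """One-predictor ridge without intercept: beta = sum(xy) / (sum(xx) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(beta, xs, ys):
    """Mean squared error of the prediction beta * x."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_lambda(xs, ys, grid, k, rng):
    """Pick the grid value with the lowest mean held-out MSE across k folds."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    best_lam, best_err = None, float("inf")
    for lam in grid:
        errs = []
        for fold in folds:
            held = set(fold)
            train = [i for i in idx if i not in held]
            beta = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
            errs.append(mse(beta, [xs[i] for i in fold], [ys[i] for i in fold]))
        err = sum(errs) / k
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam


def tune_and_fit(xs, ys, grid, k, rng):
    """The full modelling procedure: choose lambda by CV, then fit on all the data given."""
    return fit_ridge(xs, ys, select_lambda(xs, ys, grid, k, rng))


def optimism_corrected_mse(xs, ys, grid, k, n_boot, rng):
    """Return (apparent MSE, optimism-corrected MSE), re-tuning in every bootstrap sample."""
    beta_full = tune_and_fit(xs, ys, grid, k, rng)
    apparent = mse(beta_full, xs, ys)
    n = len(xs)
    optimism = 0.0
    for _ in range(n_boot):
        draw = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in draw]
        by = [ys[i] for i in draw]
        beta_b = tune_and_fit(bx, by, grid, k, rng)  # lambda is re-selected here
        # Error on the original data minus error on the bootstrap sample it was fit to.
        optimism += mse(beta_b, xs, ys) - mse(beta_b, bx, by)
    return apparent, apparent + optimism / n_boot


rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(40)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]
grid = [0.01, 0.1, 1.0, 10.0]
apparent, corrected = optimism_corrected_mse(xs, ys, grid, 5, 200, rng)
print(f"apparent MSE={apparent:.3f}, optimism-corrected MSE={corrected:.3f}")
```

Option 4 would correspond to replacing `tune_and_fit` inside the loop with `fit_ridge` at a fixed lambda, which leaves the variability from lambda selection out of the optimism estimate.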

reddit.com
u/Daimbarboy — 13 days ago

What makes a good grad school application coming out of undergrad?

I'm a current undergraduate student in a statistics program in the U.S. hoping to apply next year, and I was curious what's worked for others for Master's and PhD applications. I know research experience is generally good, and of course a decent GPA, but given that applications can be fairly competitive, I figured it doesn't hurt to see what more can be done before application time.

reddit.com
u/dumb_trans_girl — 13 days ago
▲ 13 r/biostatistics+3 crossposts

Biologist deciding between a Master’s in Bioinformatics or Biostatistics: which field currently offers better opportunities, flexibility, and long-term growth?

Hi everyone. I’m a biologist, and I wanted to ask for some advice because I’m currently at a very important point in my career and I feel really torn between two paths that genuinely fascinate me: bioinformatics and biostatistics.

During my undergraduate thesis, I had the opportunity to move from a very general biology background — mostly oriented toward environmental sciences — into using bioinformatics and biostatistics tools for the first time. My thesis focused on a metagenomic analysis related to microbial communities and mercury, and honestly, that experience completely changed the way I saw my professional future.

Before that, most of my academic background and the opportunities I saw as a junior biologist were strongly connected to environmental consulting, fieldwork, biodiversity characterization, monitoring, etc. And while I truly respect that field and I do have experience in it, I always felt more drawn toward the intersection between science, technology, and data.

What surprised me the most is that when I first got into bioinformatics and biostatistics, I realized how challenging both fields really are. I became very aware of my weaknesses, both in mathematics and computational skills. But instead of discouraging me, it had the opposite effect. It made me think: “I really want to learn this properly.” Since then, I’ve had a strong desire to specialize in one of these areas.

After working for about a year, I finally managed to save enough money to pursue a master’s degree. However, now I’m facing the big dilemma: choosing between a Master’s in Bioinformatics or a Master’s in Biostatistics.

I’ve been reading a lot about automation, AI, the current job market, entry-level saturation, industry demand, research opportunities, and future projections. And honestly, the more I read, the harder it becomes to decide.

I know a master’s degree alone won’t magically solve everything, and that experience, internships, projects, and practical skills matter a lot.

So I would really love to hear from people who work or study in either of these areas:

What has your professional experience been like?

How difficult was it to enter the job market?

What would you prioritize today: bioinformatics or biostatistics?

reddit.com
u/Lisanya18 — 14 days ago