r/AskStatistics

▲ 1 r/AskStatistics

EFA before LPA?

Hello. I am working with a large set of biological variables (40+). Can I run exploratory factor analysis to obtain a smaller number of latent factors, then use these in a latent profile analysis? I am trying to detect patterns across these many variables.

u/Haunting_Ad_52 — 2 hours ago

▲ 28 r/AskStatistics+8 crossposts

A general video on causality for non-specialists - feedback welcome

I made a NeuralCipher video introducing causality for a broader AI/science audience.

The goal is not to present a technical tutorial on causal inference, but to make the conceptual distinction clear: association, intervention, counterfactuals, explanation, and why causal claims require more than predictive success.

I tried to avoid the shallow version of “correlation is not causation” and instead explain why causal reasoning changes the kind of question we are asking.

Disclosure: I made this. I would especially appreciate corrections from people working directly in causal inference.

▶️ https://www.youtube.com/watch?v=dzgwW2n19bE

See more at neuralcipher.net

What is the most common misconception about causality that you see outside the field?

u/NeuralCipher_NC — 23 hours ago

▲ 1 r/AskStatistics

[Question] Inter- and intrarater reliability and reproducibility: ICC/Cohen's kappa, t-test/ANOVA, or something else?

Would you look for significant differences or for reliability/reproducibility when 3 raters measured radiographic lesions and classified them with 3 different scores (Mirels score, Harrington classification, and Lodwick classification)?

My colleague suggested using ANOVA and t-tests with p-values for everything, to just look for statistically significant differences.

Edit: I mostly used ICC for a uniform analysis. Kappa was only used for localisations (nominal).
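
For the nominal localisation ratings, Cohen's kappa for one pair of raters could be sketched roughly as below. The labels in `rater_a` and `rater_b` are made-up placeholders, not data from the post, and with 3 raters this would have to be repeated for each pair (or replaced by a multi-rater statistic such as Fleiss' kappa).

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items (nominal scale)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical localisation labels for six lesions (illustration only).
rater_a = ["femur", "femur", "humerus", "tibia", "femur", "humerus"]
rater_b = ["femur", "humerus", "humerus", "tibia", "femur", "femur"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is a different question from whether the raters' means differ significantly.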

u/nLucky-13 — 18 hours ago

▲ 0 r/AskStatistics

Raffle statistics

If you entered a raffle with 115 spots and could choose your numbers, and at $100 a pick you had a budget of 4 picks, would it be best to pick at random or to break the range into 4 quarters and pick one number from each: 1-30, 30-60, 60-90, 90-115?
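
A quick way to compare the two strategies is to compute the exact win probability for each set of picks. This sketch assumes a single winning number drawn uniformly at random from the 115 spots, which the post does not state; the specific picks are arbitrary examples.

```python
from fractions import Fraction
import random

SPOTS = 115  # number of raffle spots, from the post


def win_probability(picks, spots=SPOTS):
    """Chance of holding the winning number, ASSUMING one uniform draw."""
    distinct = {p for p in picks if 1 <= p <= spots}
    return Fraction(len(distinct), spots)


spread_picks = [15, 45, 75, 100]   # one number from each rough quarter
clustered_picks = [1, 2, 3, 4]     # four neighbouring numbers
random_picks = random.Random(0).sample(range(1, SPOTS + 1), 4)

for name, picks in [("spread", spread_picks),
                    ("clustered", clustered_picks),
                    ("random", random_picks)]:
    print(name, win_probability(picks))
```

Under that assumption only the number of distinct picks matters, not where they sit in the range.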

u/Just_Veterinarian264 — 20 hours ago

▲ 34 r/AskStatistics+3 crossposts

Hawkes process MLE calibration diverges (β → bound) on tick data with millisecond-tied timestamps. Is timestamp jittering the standard fix?

u/tulipteaaa__ — 21 hours ago

▲ 0 r/AskStatistics

How can I test my mathematical potential?

I studied stats only until middle school, then chose Arts in high school. Later, I completed both my Bachelor’s and Master’s degrees in Business. I did have some mathematics, accounting, and statistics-related subjects during my degrees, but I never had the same statistical foundation as someone who studied statistics or took Science.

Now I want to transition from a business and management background into tech. I know the amount of mathematics required depends on the specific field, but I am interested in technical areas where mathematics, statistics, logical reasoning, and problem-solving can matter.

Throughout most of my life, I was an average student. I want to be honest about that. A large part of it was because I was careless, inconsistent, and simply not interested in studying at the time. At the same time, I have also seen that when I genuinely put in serious effort, I can sometimes perform extremely well and even score near the top. Because of that, I do not know whether my past academic record accurately reflects my actual ability or potential.

Right now, I would consider my current level close to zero because I lost touch with statistics a long time ago. I have forgotten even many basic concepts, so I already know that if I took a test today, my performance would probably be poor. That is not really what I am trying to measure.

What I want to understand is my statistics ability, competence, or learning potential, whichever is the correct term. Even though my current level is very low, I want to practice seriously for a few weeks and then test myself to see where I stand, how quickly I can pick things up, how far I may be able to go, and whether I could realistically commit to statistics in the long term.

I understand that a few weeks cannot prove my ultimate potential or predict my entire future. But I want to run the best short-term experiment I realistically can. I want to observe whether I can relearn concepts, understand statistics, solve unfamiliar problems, improve with practice, retain what I learn, and apply concepts in new situations rather than simply memorising procedures.

Another reason I am asking is that some people seem to show statistical ability, strong interest, or talent from a very young age. They may have always been “good at stats.” Unfortunately, I was not one of those people. I did not grow up seeing myself as stats gifted, and I only started developing a genuine interest much later in life. So I am trying to understand what that means for someone like me.

Can a person with my background, who was not particularly good at stats from a young age and currently has a very weak foundation, still develop very high stats competence over time? Could someone like me eventually become genuinely advanced or even expert in stats, or are there meaningful limits that a short-term experiment might help reveal?

If I have only a few weeks to test myself seriously, what exactly should I do? What diagnostic tests, problem sets, exercises, or progressively difficult topics should I attempt? What should I measure: my rate of improvement, how much help I need, my ability to solve unfamiliar problems, abstraction, retention, transfer of learning, persistence, or something else?

I would especially appreciate advice from people with strong backgrounds in statistics.

If you had only a few weeks to evaluate someone with my background as objectively as possible, what exact process would you recommend?

My current level is close to zero because I lost touch with it years ago, but I do not want to confuse my current level with my potential ability. I was never someone who showed obvious stats talent from a young age, but I have developed a genuine interest later in life. I want to practice seriously for a few weeks and test how quickly I learn, improve, reason, retain, and solve unfamiliar problems. How can I use those few weeks to get the best possible indication of whether I can realistically pursue stats long term and potentially become highly competent or even expert?

u/Maximum-Page3433 — 1 day ago

▲ 3 r/AskStatistics

What are the benefits of data aggregation?

I analysed a dataset exploring doctors’ well-being and its association with patient care outcomes across hospitals. Patients usually saw more than one doctor during data collection, so doctor-level data and patient-level outcome data were not linked at the individual level. Because I have data from multiple hospitals, I aggregated both doctor well-being and patient care outcomes to the hospital level.

I explained this in the methods section, but received a comment asking me to discuss the benefits of aggregated data. I'm unclear on this. I've mainly seen aggregation as a necessary evil and as a loss of information.

What are the benefits of aggregation?

u/ImaginationBig4641 — 1 day ago

▲ 3 r/AskStatistics

How can I rewrite person-years as something that doesn't feel like jargon?

I came across some research where scientists were assessing the effectiveness of a vaccine versus a placebo a few weeks after participants received multiple doses of either. I've tried a couple of ways to rephrase it. It says the incidence of first infections after the vaccination phase was around 45 infections per 100 person-years in the vaccine group, and 49 infections per 100 person-years in the placebo group. Similarly, the all-infection rate after the vaccination/placebo phase (which includes multiple infections) was 60 per 100 person-years in the vaccine group, and around 66 per 100 person-years in the placebo group.

What sort of ways could this be framed to ensure that younger people can understand this?

Could this be written as 45 infections among 100 people every year? And 60/66 infections, including repeat infections, among 100 people per year? The rate of first infections in people who received two doses of the vaccine was 45%? And 60/66% for all infections among both groups? Please help.
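
To see why a rate per 100 person-years is not quite the same thing as a percentage of people, here is a small sketch. The follow-up figures (90 infections over 200 person-years) are invented for illustration, and the conversion from rate to one-year risk assumes a constant infection rate over time, which the study may not satisfy.

```python
import math


def rate_per_100_person_years(events, person_years):
    """Incidence rate: events divided by total follow-up time, scaled to 100."""
    return 100 * events / person_years


def one_year_risk(rate_per_100):
    """Share of people with at least one event in a year, ASSUMING a constant rate."""
    return 1 - math.exp(-rate_per_100 / 100)


# Hypothetical follow-up: 400 people each watched for half a year = 200 person-years.
rate = rate_per_100_person_years(events=90, person_years=200)
print(rate)                            # 45.0 per 100 person-years
print(round(one_year_risk(rate), 3))   # about 0.362, not 0.45
```

The all-infection rates (60 and 66) count repeat infections in the same person, so they describe infections per unit of follow-up time rather than a share of people infected.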

u/OCDtoomuch — 2 days ago

▲ 7 r/AskStatistics

What paths should I choose after a statistics degree if I want field work and not office life?

I am a statistics undergraduate student and I am planning to do higher studies. I am trying to understand what paths I should focus on after I graduate.

I do not want a career where I stay in a room all day and look at a screen. I want something more active. I like nature, agriculture, travel, and being outside. I want to work in real places, see real problems, and try to solve them.

So my question is simple. After graduating with a statistics degree, what paths should I focus on if I want this kind of life? What fields should I choose for higher studies that can lead to field work and not only office work?

I am not looking only to combine statistics. I am open to moving into another field if needed. I just want to use my degree as a base and move into something more meaningful and active.

I also want to know what I should start learning now before going to higher studies.

If anyone has taken a similar path or knows good directions, please share your advice.

u/Physical-Writer-3435 — 2 days ago

▲ 3 r/AskStatistics

What are good majors that complement statistics?

hi everyone, currently I am thinking about switching majors. I took a stats class and really enjoyed it, and since I am not fully liking my current major I hope to switch. But I heard that stats alone is not "good enough", so I was thinking maybe double majoring in data science, but I also heard that it has become harder to get a data entry job. can anyone provide any guidance? thank you!!!

u/ContributionAny2743 — 2 days ago

▲ 9 r/AskStatistics+2 crossposts

Systematic Review and Meta Analysis?

Hi everyone,
Quick question for people who do systematic reviews and meta-analyses.
Is there an all-in-one platform that covers the entire workflow, so you don’t have to switch between multiple tools?
If yes, which one? If not, what would you consider a fair monthly subscription price for a platform that does?

u/Happy_Culture1209 — 3 days ago

▲ 3 r/AskStatistics+1 crossposts

Reco research and statistics book

I'm a medical postgrad from the PH. Please recommend local or international books that are easy to understand and not intimidating. I love research, and I lost my research and stats books from secondary school. Thanks in advance! 😊

u/Logical_Crab2661 — 2 days ago

▲ 0 r/AskStatistics

I think I made a mistake on my thesis

I did a t-test for gender and for married/single with a continuous numeric outcome, but for age group and education I did Kruskal-Wallis. It's been months and I don't remember how I did it. Is it wrong? Should it have been t-test and ANOVA for both, or is my way somehow acceptable?

u/Embarrassed-Run2760 — 2 days ago

▲ 3 r/AskStatistics

Quadratic linear term

I’m having a hard time interpreting the result of the quadratic term in my linear regression model.

My exposure is a continuous variable ranging from 3-9 and my outcome is also continuous, ranging from 0-4.

After I added a quadratic term I got a significant result with a very small increase in R2 (0.004). Also, the curvature starts at 8.9, so it's very close to the max of 9 on my exposure.
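
One way to check where the curve bends is to compute the turning point of the fitted quadratic, which sits at -b1 / (2 * b2). This sketch reads "the curvature starts at 8.9" as that turning point, which is an interpretation; the coefficients `b1` and `b2` are hypothetical values chosen to reproduce 8.9, not the poster's estimates.

```python
def turning_point(b1, b2):
    """Exposure value where y = b0 + b1*x + b2*x**2 stops rising (or falling)."""
    if b2 == 0:
        raise ValueError("no turning point without a quadratic term")
    return -b1 / (2 * b2)


def share_of_range_beyond(vertex, low, high):
    """Fraction of the observed exposure range lying past the turning point."""
    if vertex >= high:
        return 0.0
    return (high - max(vertex, low)) / (high - low)


# Hypothetical coefficients (illustration only); exposure range 3-9 is from the post.
b1, b2 = 1.78, -0.1
vertex = turning_point(b1, b2)
print(round(vertex, 2))                                # 8.9
print(round(share_of_range_beyond(vertex, 3, 9), 3))   # 0.017
```

With the turning point this close to the maximum, almost all of the observed range lies on one side of the bend.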

Does this invalidate my linear model? Or is it okay to use it?

I also ran an ordinal logistic regression model where I collapsed my outcome into categories and got a similar coefficient to the linear model, just to double-check.

Statisticians of reddit where are you :)

u/Ammo991 — 3 days ago

▲ 0 r/AskStatistics

Help and advice in developing an advanced stats course.

Hello everyone. I have been assigned to write a proposal for a course on advanced statistics, and I was hoping you might have some pointers on what it should and should not cover.

My objective is to keep it from getting too tough, so that it is accessible and easy to comprehend for people from all walks of life, while still covering enough to make learners more employable or more practically skilled than they were before.

u/harshdce — 3 days ago

▲ 6 r/AskStatistics

Confusion regarding usage of p-value correction tests

Hi everyone, I am asking this question as I am currently confused about the usage of p-value correction tests in hypothesis testing, such as FDR and Bonferroni correction tests, especially in research papers. My apologies in advance if it seems to be an unconventional question, it just seems like no one has questioned it before.

Based on my understanding, these tests should be used when there are multiple hypothesis tests carried out simultaneously. So to say, if one has a matrix plot of features - for example: height, width and weight of 2 populations, and pairwise comparison tests are used to test for significant differences in each metric across the 2 populations, a p-value correction would usually be used in a research paper to reduce the possibility of Type 1 errors.

However, what if the aforementioned matrix plot was separated into different charts in different sections of a research paper? Does a p-value correction still need to be used here? If yes, by this logic, wouldn’t that mean that p-value correction would have to be performed for all statistical tests of the same type in the entire paper? Wouldn’t performing a p-value correction for so many comparisons pose a risk of over-correction as well?
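
For reference, the two corrections mentioned above can be sketched in a few lines. The p-values are hypothetical stand-ins for the three comparisons (height, width, weight), and the functions are a plain-Python illustration rather than any particular library's implementation.

```python
def bonferroni(pvals):
    """Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for height, width, weight (illustration only).
raw = [0.01, 0.04, 0.03]
print(bonferroni(raw))           # roughly [0.03, 0.12, 0.09]
print(benjamini_hochberg(raw))   # roughly [0.03, 0.04, 0.04]
```

Both adjustments depend on the number of tests `m`, so the size of the family of tests that are corrected together changes the result.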

Thank you in advance for the advice, and please feel free to correct me if I was wrong in my understanding.

u/jadexiaohui — 4 days ago

▲ 2 r/AskStatistics

Standard deviation of two dependent dice rolls

I have two dice, a d4 and a d6.

I first roll the d4. If I roll more than 2, I roll the d6 and note the result as my score. If I roll 2 or less, my score is 0.

I know that I can manually calculate the standard deviation by just writing out the score for all 24 possible results (12 * 0, 2 * [1..6]) and putting that into the standard deviation formula. That comes out to 2.126.

What I'm actually trying to do however is find a generalised formula that calculates the standard deviation of the score when rolling 1dx if 1dy is greater than z.

Going back to my d4 and d6, I know that the standard deviation of a d6 is 1.708, and I have a 50% chance that I roll the d6, or a 50% chance that I score 0. Is there a formula that I can use to get from those values to the 2.126?

To make it even more complicated, the next step is that if I roll a 4 on the d4, I actually roll 2d6. 50% chance to score 0, 25% chance to score 1d6, 25% chance to score 2d6. That gives me a standard deviation of 3.257. Is there a formula I could use to calculate this as well?
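
Both cases can be written as a mixture: with probability w_i the score comes from a component with mean m_i and variance v_i, and the overall variance is sum(w_i * (v_i + m_i^2)) - (sum(w_i * m_i))^2. A minimal sketch under that framing, using the standard fair-die mean (s + 1) / 2 and variance (s^2 - 1) / 12; the function names are made up for illustration.

```python
import math


def die_mean(sides):
    return (sides + 1) / 2


def die_var(sides):
    return (sides**2 - 1) / 12


def mixture_sd(components):
    """SD of a mixture given (weight, mean, variance) triples with weights summing to 1."""
    mean = sum(w * m for w, m, v in components)
    second_moment = sum(w * (v + m * m) for w, m, v in components)
    return math.sqrt(second_moment - mean**2)


def gated_sd(x, y, z):
    """SD of 'score 1dx if 1dy is greater than z, otherwise score 0'."""
    p = (y - z) / y  # chance that the dy roll exceeds z
    return mixture_sd([(1 - p, 0.0, 0.0), (p, die_mean(x), die_var(x))])


# d4 gate, d6 score: 50% zero, 50% 1d6.
print(round(gated_sd(6, 4, 2), 3))   # 2.126

# 50% zero, 25% 1d6, 25% 2d6 (sum of two independent d6: mean and variance double).
three_way = mixture_sd([
    (0.50, 0.0, 0.0),
    (0.25, die_mean(6), die_var(6)),
    (0.25, 2 * die_mean(6), 2 * die_var(6)),
])
print(round(three_way, 3))           # 3.257
```

For the simple gate this reduces to variance = p * v + p * (1 - p) * m^2, which with p = 0.5, v = 35/12 and m = 3.5 gives 4.5208 and an SD of 2.126.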

u/the_twig_131 — 3 days ago

▲ 1 r/AskStatistics

failed math subjects as math (stats) major, should i keep going?

i apologize if this is not the correct subreddit, just thought id get some thoughts from people in the field ish
hi everyone, im a math (stats) major and i had just finished my first year in uni (out of three). i found out that i had failed probability and real analysis. idk what to do. i feel so lost, im not even really sad i mainly just feel empty. i chose a math major bc i thought i was good w numbers, but since getting in uni my mental state has been in a decline and prob reached its lowest this semester. since probability and real analysis are 2nd year subjects, i really felt the gap between 1st year subjects (had to do early due to course progression stuff). moreover, im an international student, so failing just means i have to pay a crap ton more money. im not struggling financially, all my expenses are supported by my parents, and because of that, i feel really bad since i wasted so much money alrd by failing.
im honestly dreading retaking real analysis, i know i couldve done better on both these subjects but i just felt so terrible about everything that i neglected my uni studies. my fault.
i just feel like maybe this major isnt for me, and that i dont have what it takes. im scared :/

u/No_Wonder8449 — 4 days ago

▲ 3 r/AskStatistics

Alternatives for one-way ANOVA with failed independence (multiple group membership)

Participants	Football	Baseball	Tennis	Result
1	Yes	No	No	0
2	No	Yes	No	1
3	No	No	Yes	-1
4	Yes	No	Yes	3
5	No	Yes	Yes	-2

Here I have a list of participants (1-5) who did a survey and produced "results". Group membership is my independent variable, and the results column is my dependent. If there was no group overlap I would simply use an ANOVA and be done with it, but because I have participants in multiple groups (4 and 5) I fail the independence assumption.

I could create new "combo" categories for the cases in which there is multiple group membership and only count those participants in those new categories, but I was wondering if something else could be used instead.
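
The "combo" recoding described above could be sketched like this, using the five rows from the table. The label format ("Football+Tennis") is an arbitrary choice for illustration.

```python
SPORTS = ("Football", "Baseball", "Tennis")

# (participant, Football, Baseball, Tennis, result), copied from the table above.
ROWS = [
    (1, "Yes", "No", "No", 0),
    (2, "No", "Yes", "No", 1),
    (3, "No", "No", "Yes", -1),
    (4, "Yes", "No", "Yes", 3),
    (5, "No", "Yes", "Yes", -2),
]


def combo_label(flags):
    """Join the sports a participant belongs to into one mutually exclusive category."""
    names = [sport for sport, flag in zip(SPORTS, flags) if flag == "Yes"]
    return "+".join(names) or "None"


# Collect results per combo category so each participant appears in exactly one group.
groups = {}
for participant, *flags, result in ROWS:
    groups.setdefault(combo_label(flags), []).append(result)

for label, results in groups.items():
    print(label, results)
```

Each participant now falls into exactly one category, though with real data every combo category would need enough participants for a group comparison to be meaningful.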

What is the right stat to use here? Running in JASP, but can use SPSS too.

u/adankishmeme — 5 days ago

▲ 2 r/AskStatistics

Is there a name for when probability factors have less weight due to sheer sample size?

Like let’s say your local gym has 30 parking spots and has at least 300 members and staff combined. While you don’t know the percentage of those who drive in, you decide to go during the time of day with the least amount of traffic, however enough people also have that idea that the parking lot remains full regardless. This happens nearly every time you go. Is there a term in statistics for this kind of phenomenon? Question inspired by several real world experiences lol.

Apologies if I’m using incorrect terminology, I never took a statistics class in school/college.

u/uncouth_youth — 4 days ago