u/Sad_Treat_5285

r/AskStatistics

How do I get a sensible output for a regression in R with many categorical variables?

Hello everyone!

I hope this is the right thread; if not, I'm very sorry.

I am running a regression in R using lm() that contains quite a few categorical variables, all wrapped in factor(). The problem is that summary() prints an estimate for each combination of categorical variables, so the output has over 300 lines. I've been using drop1() with an F-test to get around this, but I've been wondering whether ANOVA would be a better choice. Another issue with drop1() is that I can't use robust standard errors, because drop1() doesn't work with lm_robust or lm2.
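
To make the setup concrete, here is a minimal sketch in base R on simulated data. The data frame `dat` and the variables `region`, `sector`, `x`, and `y` are made up for illustration and are not my real model. It shows the coefficient table from summary(), the per-term F-tests from drop1(), and the same test for a single term done as a nested-model comparison with anova().

```r
# Hypothetical data: a numeric outcome, two categorical predictors, one numeric predictor.
set.seed(1)
n <- 200
dat <- data.frame(
  region = factor(sample(c("north", "south", "east", "west"), n, replace = TRUE)),
  sector = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x      = rnorm(n)
)
dat$y <- 1 + 0.5 * dat$x + rnorm(n)

fit <- lm(y ~ region + sector + x, data = dat)

# summary() reports one row per non-reference factor level (plus intercept and x),
# which is what makes the output so long with many factors.
coef_table <- summary(fit)$coefficients

# drop1() reports one F-test per term instead of one row per level.
term_tests <- drop1(fit, test = "F")

# The same test for `region`, written as a comparison of two nested models.
fit_no_region <- update(fit, . ~ . - region)
nested_test <- anova(fit_no_region, fit)

print(term_tests)
print(nested_test)
```

For a single term, the drop1() F-test and the anova() comparison of the two nested fits give the same F statistic, so in this simple case they are the same test.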

My supervisor can't help me (they only know Stata), which is why I'm asking here.

Any help is much appreciated!

u/Sad_Treat_5285 — 9 days ago