Predictor Model Comparison: What is the best approach?
Hi all,
I'm working on a prognostic landmark analysis in a cardiac disease cohort (N=194, which is high for this condition) and am looking for advice on the best analytical strategy given our sample size constraints.
Study setup:
- Landmark design: predictors measured at baseline and follow-up visit, outcomes counted after follow-up visit
- 4 binary predictors (worsened vs not): Predictors 1, 2, and 3, plus a novel clinical marker.
- Primary outcome: composite of CV hospitalization or death (77 events)
- Secondary outcomes: first CV hospitalization (73), composite HF hospitalization/death (23), all-cause death (18)
The hypothesis: Three of the four predictors are FDA-validated endpoints used in clinical trials. Our novel predictor has shown prognostic value in prior univariate and multivariable Cox analyses, but has never been compared head-to-head with the validated ones. We hypothesize that its prognostic magnitude is similar to theirs.
What we've done so far:
- Univariate Cox for each predictor × each outcome
- Ordinal domain score (0–4 worsened domains) as a single parsimonious predictor
- C-statistic comparison across nested models (with vs without novel predictor)
- LRT for incremental value of novel predictor above the two functional measures
- Pairwise models (novel predictor + each comparator)
- Andersen-Gill for recurrent hospitalizations (131 total events)
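For reference, the LRT step in the list above boils down to the following arithmetic. This is only a minimal sketch: the two partial log-likelihood values are made-up placeholders (not our results), and the function name `lrt_one_df` is hypothetical. It assumes a single binary predictor is added, so the statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math

def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for adding ONE parameter to a nested Cox model.

    Inputs are the maximized partial log-likelihoods of the reduced and
    full models. Returns (statistic, asymptotic p-value).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("full model cannot fit worse than the nested reduced model")
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Placeholder log-likelihoods, purely illustrative (NOT from our data).
stat, p_value = lrt_one_df(loglik_reduced=-350.0, loglik_full=-347.5)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```

The p-value here relies on the asymptotic chi-square approximation, which is exactly the part I'm unsure about with a predictor present in only 32 patients.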
The problem: With only 77 primary events (73 for first CV hospitalization) and 4 binary predictors, we're at ~18–19 EPV, which is adequate for univariate and domain score analyses but underpowered for full multivariable Cox. The novel predictor appears in only 32/194 patients (16.5%), limiting statistical power further.
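For transparency, here is the events-per-variable arithmetic for every outcome, as a quick sketch. It simply divides the event counts listed above by the 4 candidate predictors; the variable names are only illustrative.

```python
# Event counts are taken from the study setup above.
n_patients = 194
n_predictors = 4
events = {
    "CV hospitalization or death (primary)": 77,
    "first CV hospitalization": 73,
    "HF hospitalization or death": 23,
    "all-cause death": 18,
}

# Events per variable (EPV) if all 4 predictors enter one model.
epv = {name: count / n_predictors for name, count in events.items()}
for name, value in epv.items():
    print(f"{name}: {value:.2f} EPV")
# CV hospitalization or death (primary): 19.25 EPV
# first CV hospitalization: 18.25 EPV
# HF hospitalization or death: 5.75 EPV
# all-cause death: 4.50 EPV

# Share of patients in whom the novel predictor is present.
novel_prevalence = 32 / n_patients
print(f"novel predictor: {novel_prevalence:.1%} of patients")
# novel predictor: 16.5% of patients
```

So the two hospitalization-driven outcomes sit near 18–19 EPV, while the HF composite and all-cause death fall to roughly 5 EPV with all four predictors in the model.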
Specific questions:
- Are there methods beyond the C-statistic and LRT that are better suited to comparing prognostic markers in this underpowered setting?
- Is NRI (Net Reclassification Index) appropriate here given binary predictors and a time-to-event outcome?
- Would a permutation-based approach or bootstrapped C-statistic comparison be more appropriate than asymptotic LRT?
- Any recommendations for a different approach to analyzing these 4 predictors?
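To make the bootstrap question concrete, here is a rough sketch of what I have in mind: Harrell's C for each of two binary markers, with a paired percentile bootstrap for the difference. Everything in it is an assumption for illustration: the data are simulated (not our cohort), the hazard ratios and censoring are arbitrary, and the function names `harrell_c` and `bootstrap_c_difference` are hypothetical. Each marker is used directly as its own risk score, which only makes sense for single-marker comparisons; nested multivariable models would need fitted linear predictors instead.

```python
import random

def harrell_c(times, events, scores):
    """Harrell's C for right-censored data.

    A pair (i, j) is usable when i had an observed event and j was still
    under follow-up after that time. Tied scores count as 0.5.
    """
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a usable pair
        for j in range(n):
            if times[j] > times[i]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable if usable else float("nan")

def bootstrap_c_difference(times, events, score_a, score_b, n_boot=200, seed=0):
    """Percentile 95% interval for C(score_a) - C(score_b).

    Patients are resampled with replacement, and both C-statistics are
    computed on the same resample so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        a = [score_a[k] for k in idx]
        b = [score_b[k] for k in idx]
        diffs.append(harrell_c(t, e, a) - harrell_c(t, e, b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

# Simulated cohort, purely illustrative (NOT our data).
rng = random.Random(42)
n = 194
novel = [1 if rng.random() < 0.165 else 0 for _ in range(n)]
comparator = [1 if rng.random() < 0.30 else 0 for _ in range(n)]
times, events = [], []
for a, b in zip(novel, comparator):
    hazard = 0.10 * (2.0 ** a) * (2.0 ** b)  # assumed hazard ratio of 2 per marker
    t_event = rng.expovariate(hazard)
    t_censor = rng.uniform(1.0, 10.0)  # assumed administrative censoring
    times.append(min(t_event, t_censor))
    events.append(t_event <= t_censor)

c_novel = harrell_c(times, events, novel)
c_comp = harrell_c(times, events, comparator)
lo, hi = bootstrap_c_difference(times, events, novel, comparator)
print(f"C novel = {c_novel:.3f}, C comparator = {c_comp:.3f}")
print(f"95% bootstrap interval for the difference: ({lo:.3f}, {hi:.3f})")
```

One thing I notice with binary markers is that most pairs are tied on the score, so each C sits close to 0.5 and the low-prevalence marker is penalized by construction. That is part of why I'm asking whether C-based comparisons are the right tool here.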
Thanks in advance, folks!