u/Legitimate-Muffin917

▲ 3 r/biostatistics+1 crossposts

Difficulty with Waters HDMS preprocessing in a metabolomics pipeline

I am performing untargeted metabolomics analysis on a public dataset generated using a Waters SYNAPT-G2 HDMS (Q-TOF with ion mobility) coupled with ACQUITY UPLC. The raw data is in .raw format, and I need to convert it to .mzML for downstream processing with XCMS in R.

Because the raw files are very large and contain ion mobility data, I am using msconvert. However, I am having trouble deciding on the correct conversion strategy.

The dataset details mention:

  • Waters SYNAPT-G2 HDMS
  • Ion mobility enabled acquisition
  • Untargeted metabolomics workflow

I tested 3 conversion combinations:

  1. Only centroiding → mzML generated successfully, but downstream peak detection gives almost no usable peaks.
  2. Only combineIonMobilitySpectra → mzML looks usable and peaks are detected, but spectra are still largely profile-mode / insufficiently centroided.
  3. Both centroiding + combineIonMobilitySpectra → the mzML files become unusable for downstream processing (e.g., m/z ordering errors, MSnbase errors).
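
For anyone who wants to reproduce this, here is a minimal sketch of how the three combinations could be written as msconvert command lines. It only builds and prints the commands; it does not run msconvert. The file name `sample01.raw`, the output directory `mzML_out`, and the helper `build_msconvert_cmd` are hypothetical, and the `peakPicking vendor msLevel=1-` filter string is an assumption about how centroiding is requested, not necessarily the exact setting behind the results above.

```python
import shlex


def build_msconvert_cmd(raw_path, out_dir, centroid=False, combine_im=False):
    """Return an msconvert argument list for one Waters .raw input.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    cmd = ["msconvert", raw_path, "--mzML", "-o", out_dir]
    if combine_im:
        # Ask msconvert to merge the drift-time bins of each scan.
        cmd.append("--combineIonMobilitySpectra")
    if centroid:
        # Assumed centroiding setting: vendor peak picking on all MS levels.
        cmd += ["--filter", "peakPicking vendor msLevel=1-"]
    return cmd


RAW = "sample01.raw"  # hypothetical input name
OUT = "mzML_out"      # hypothetical output directory

combos = {
    "centroid_only": build_msconvert_cmd(RAW, OUT, centroid=True),
    "combine_only": build_msconvert_cmd(RAW, OUT, combine_im=True),
    "both": build_msconvert_cmd(RAW, OUT, centroid=True, combine_im=True),
}

for name, cmd in combos.items():
    # shlex.join quotes the filter string so the line can be pasted into a shell.
    print(name, "->", shlex.join(cmd))
```

Keeping the flags in one helper makes it easier to state exactly which combination produced which mzML file when comparing results.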

At this point, using combineIonMobilitySpectra seems to be the only workable option, but I am doubtful whether collapsing ion mobility spectra at conversion is the correct approach biologically and computationally.

Has anyone processed Waters SYNAPT HDMS metabolomics data successfully for XCMS/MSnbase workflows?

  • Is combineIonMobilitySpectra generally recommended here?
  • Should centroiding instead be done later inside R?
  • Are there better msconvert filters/settings for Waters HDMS ion mobility data?
  • How do people usually handle IM dimensions when the downstream tools do not fully support them?
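
To make the last question concrete, here is a minimal conceptual sketch of what collapsing the ion mobility dimension amounts to: the drift-time bins that belong to one retention-time scan are merged into a single peak list. This is a toy illustration under my own assumptions, not msconvert's actual implementation; `merge_drift_bins`, `is_mz_sorted`, and the example peaks are hypothetical. It also shows why a merged peak list has to be re-sorted by m/z, which might be related to the ordering errors mentioned above.

```python
from collections import defaultdict


def merge_drift_bins(drift_spectra):
    """Collapse drift-time bins of one scan into a single spectrum.

    drift_spectra: list of spectra, each a list of (mz, intensity) tuples.
    Intensities at identical m/z values are summed; the result is sorted by m/z.
    """
    summed = defaultdict(float)
    for spectrum in drift_spectra:
        for mz, intensity in spectrum:
            summed[mz] += intensity
    return sorted(summed.items())


def is_mz_sorted(spectrum):
    """Return True if m/z values are in non-decreasing order."""
    mzs = [mz for mz, _ in spectrum]
    return all(a <= b for a, b in zip(mzs, mzs[1:]))


# Two toy drift bins from the same retention-time scan.
bins = [
    [(100.0, 5.0), (200.0, 1.0)],
    [(100.0, 3.0), (150.0, 2.0)],
]

# Naive concatenation keeps each bin's order, so the m/z values are not sorted overall.
naive = [peak for spectrum in bins for peak in spectrum]
merged = merge_drift_bins(bins)

print("naive sorted: ", is_mz_sorted(naive))
print("merged sorted:", is_mz_sorted(merged))
```

A real implementation would group peaks within an m/z tolerance rather than by exact equality; exact matching is used here only to keep the sketch short.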

Any guidance from people experienced with Waters HDMS preprocessing would help a lot.

u/Legitimate-Muffin917 — 2 days ago