Targeted proteomics
(but not that kind!)
If shotgun proteomics (DDA, DIA) is listening to the whole song, then what I’m doing here is hovering over a few notes and asking:
“Were you even played? And roughly how loud?”
Not re-identifying peptides.
Not doing PRM.
Not replacing Skyline.
Just opening the RAW file and counting stuff.
Where this came from
In my earlier post Chopping Proteins to Peptides I started with a very naive question:
What peptides even exist, theoretically, in our proteome?
So I:
chopped protein sequences into overlapping peptides (10–30 aa)
computed monoisotopic masses
deduplicated sequences
verified mass correctness with known isobaric peptides
That exercise made one thing painfully obvious:
wildly different peptides can have exactly the same mass
For example:
LSLAQEDLISNR (12 aa)
GSLLLGGLDAEASR (14 aa)
Same monoisotopic mass. Completely unrelated proteins.
From an MS¹ point of view, they collapse onto the same ion.
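To make the collision concrete, here is a minimal Python sketch that recomputes the doubly charged m/z of both peptides from standard monoisotopic residue masses. The `mz` helper and the small residue table are written just for this illustration; they are not taken from the earlier post's code.

```python
# Monoisotopic residue masses (Da) for the amino acids in the two peptides.
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "E": 129.042593, "R": 156.101111,
}
WATER = 18.010565   # one H2O per peptide (free N- and C-terminus)
PROTON = 1.007276   # mass added per positive charge


def mz(sequence, charge):
    """m/z of the protonated peptide at the given charge state."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


for peptide in ("LSLAQEDLISNR", "GSLLLGGLDAEASR"):
    print(peptide, len(peptide), round(mz(peptide, 2), 4))
# both peptides print 679.8673
```

The two sequences differ only in that A + Q + N in one is swapped for 3 G + 2 A in the other, which has the same elemental composition, so no mass resolution can separate them at the MS¹ level.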
This post is a continuation of that line of thinking.
The next naive question
Given a list of theoretical peptide masses:
Do I actually see these ions in my RAW file?
And if yes, how much signal do they carry?
That’s it.
No peak picking.
No fragment scoring.
No chromatogram integration.
Just counting intensities already recorded by the instrument.
About “targeted” (important caveat)
Let me be very clear:
This is not proper targeted proteomics in the sense of:
PRM
SRM
Skyline workflows
interference-aware quantification
If you want that, Skyline exists — and it’s excellent.
This is much more primitive:
no deconvolution
no fragment analysis
no sequence confirmation
no confidence model
Think of it as:
“What did the instrument actually collect for these m/z values?”
Nothing more.
How the RAW data is read
Unlike timsTOF-pro data (which was converted to mzML first; see Convert timsTOF-pro data to mzML), the Thermo RAW files here are read directly using the vendor-provided DLLs from: http://planetorbitrap.com/rawfilereader
My repo: https://github.com/animesh/RawRead is just an example implementation showing how to:
open RAW files
iterate scans
extract scan metadata
parse scan titles
Support for Bruker / timsTOF is a different problem: the API calls are just too slow at the moment 🤪 but I’m working on it 🤞
Minimal targets: just m/z
The minimal input for this workflow is a CSV (targets.csv) with m/z values.
For the two isobaric peptides mentioned earlier (charge 2):
Compound                  Mass [m/z]
GSLLLGGLDAEASR            679.867348
GSLLLGGLDAEASR (heavy)    684.871482
LSLAQEDLISNR              679.867348
LSLAQEDLISNR (heavy)      684.871482
Yes, I added (heavy) because that’s what a good PRM person does 😉
Optional columns are the charge state of the ion, the collision energy (NCE) used on the instrument, … and of course the Retention Time (RT) window in which we expect the ion to fly into the instrument:
Start [min]
End [min]
More details here:👉 https://github.com/animesh/RawRead/tree/count-ions#:~:text=Minimal%20CSV%20requirements
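As a rough illustration of what such a target list looks like to a program, here is a small Python sketch that parses it. The header names are the ones shown in this post; the comma-separated layout, the empty RT cells, and the `read_targets` helper are my assumptions for the example and may not match exactly what countIons.cs expects (the README linked above is the reference).

```python
import csv
import io

# Hypothetical file contents: header names as shown in the post, RT cells left empty.
TARGETS_CSV = """Compound,Mass [m/z],Start [min],End [min]
GSLLLGGLDAEASR,679.867348,,
GSLLLGGLDAEASR (heavy),684.871482,,
LSLAQEDLISNR,679.867348,,
LSLAQEDLISNR (heavy),684.871482,,
"""


def read_targets(handle):
    """Parse targets; only the m/z column is required, RT bounds are optional."""
    targets = []
    for row in csv.DictReader(handle):
        start = (row.get("Start [min]") or "").strip()
        end = (row.get("End [min]") or "").strip()
        targets.append({
            "name": row["Compound"],
            "mz": float(row["Mass [m/z]"]),
            "rt_start": float(start) if start else None,
            "rt_end": float(end) if end else None,
        })
    return targets


targets = read_targets(io.StringIO(TARGETS_CSV))
for t in targets:
    print(t["name"], t["mz"], t["rt_start"], t["rt_end"])
```

Leaving the RT bounds as `None` when the cells are empty keeps "no window given" distinct from "window starts at 0 min".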
Finally, here is the code, countIons.cs (https://raw.githubusercontent.com/animesh/RawRead/refs/heads/count-ions/countIons.cs), a tiny helper that does three things:
Reads a Thermo RAW file
Writes a compact per-scan TSV
Accumulates signal for target m/z values
Build & run
mcs countIons.cs /reference:ThermoFisher.CommonCore.RawFileReader.dll \
/reference:ThermoFisher.CommonCore.Data.dll \
/reference:ThermoFisher.CommonCore.MassPrecisionEstimator.dll \
/reference:MathNet.Numerics.dll \
/reference:System.Numerics.dll \
-out:countIons.exe
mono countIons.exe file.raw targets.csv
Mass tolerance (0.0001) and RT slack (0.01 min) are hard-coded for now.
What it actually matches
Observed m/z is parsed from the scan title
BasePeakMass is not used
Absolute mass tolerance ≤ 0.0001
Optional RT windows are applied if provided
Because of this:
two peptides with the same monoisotopic mass
cannot be disambiguated unless non-overlapping RT windows are provided!
Here if a scan matches multiple targets, its intensity is counted for each —
and this is explicitly reported, not hidden.
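The matching rule can be sketched in a few lines of Python. This is not the code from countIons.cs, just my reading of the rules above: the dictionary field names, the made-up scans and RT windows, and the choice to widen each RT window by the slack on both sides are all assumptions for illustration.

```python
from collections import defaultdict

MZ_TOL = 0.0001  # absolute m/z tolerance quoted in the post
RT_SLACK = 0.01  # minutes; ASSUMED to widen each RT window on both sides


def matches(scan, target):
    """True if the scan's m/z (and RT, when a window is given) fits the target."""
    if abs(scan["mz"] - target["mz"]) > MZ_TOL:
        return False
    start, end = target.get("rt_start"), target.get("rt_end")
    if start is not None and scan["rt"] < start - RT_SLACK:
        return False
    if end is not None and scan["rt"] > end + RT_SLACK:
        return False
    return True


def accumulate(scans, targets):
    """Sum intensity per target; a scan hitting several targets counts for each."""
    totals = defaultdict(float)
    duplicated, unmatched = [], []
    for scan in scans:
        hits = [t["name"] for t in targets if matches(scan, t)]
        for name in hits:
            totals[name] += scan["intensity"]
        if len(hits) > 1:
            duplicated.append((scan["scan"], hits))
        elif not hits:
            unmatched.append(scan["scan"])
    return dict(totals), duplicated, unmatched


# Made-up scans and RT windows, for illustration only.
scans = [
    {"scan": 1, "rt": 10.5, "mz": 679.8674, "intensity": 1000.0},
    {"scan": 2, "rt": 20.5, "mz": 679.8673, "intensity": 500.0},
    {"scan": 3, "rt": 15.0, "mz": 500.0000, "intensity": 200.0},
]
no_rt = [{"name": "GSLLLGGLDAEASR", "mz": 679.867348},
         {"name": "LSLAQEDLISNR", "mz": 679.867348}]
with_rt = [
    {"name": "GSLLLGGLDAEASR", "mz": 679.867348, "rt_start": 10.0, "rt_end": 11.0},
    {"name": "LSLAQEDLISNR", "mz": 679.867348, "rt_start": 20.0, "rt_end": 21.0},
]

totals_a, dup_a, un_a = accumulate(scans, no_rt)    # scans 1 and 2 hit both targets
totals_b, dup_b, un_b = accumulate(scans, with_rt)  # RT windows split them apart
print(totals_a, dup_a, un_a)
print(totals_b, dup_b, un_b)
```

Without RT windows both isobaric targets receive the full signal from scans 1 and 2 and both scans land in the duplicated list; with non-overlapping windows each scan is credited to a single target.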
Outputs (boring but honest)
<raw>.cI.tsv
Compact per-scan table (scan, RT, TIC, title, etc.)
<raw>.<csv>.accumulation.csv
Accumulated signal per target
<raw>.<csv>.duplicated_scans.tsv
Scans matched to multiple targets
<raw>.<csv>.unmatched_scans.tsv
Scans matched to none
Bottom line
This is not about being correct.
It’s about being explicit.
Before running fancy tools, I want to know:
what ions exist
where they show up
how crowded mass space really is
Sometimes the most useful thing is to just open the RAW file and look.
That’s all this does — and that’s exactly the point.
A final note on acquisition methods
How the RAW file was acquired matters.
Targeting on Orbitraps has subtle but important pitfalls, nicely explained here:
👉 https://proteomicsnews.blogspot.com/2015/03/targeting-on-q-exactive-which-method-to.html
Missing signal does not automatically mean missing peptide.

