Quick overview of data
17022 peptides (2250 proteins), from 144 samples + 16 controls = 160 channels over 16 sets (10-plex, i.e. 9 samples + 1 control per set).
Output from PD2.2 is not scaled or normalized.
Some notes immediately:
What about raw abundances?
Median abundance = 68.4 (this output is not scaled!)
Are any channels enriched for LOD values?
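One way to answer this is to count, per TMT channel, what fraction of abundance ratios sit exactly at the LOD value. The snippet below is only a minimal sketch: the function name, the dict-of-lists input layout and the toy numbers are assumptions for illustration, not the actual PD2.2 export format.

```python
import math

LOD = 0.01  # ratio value that these notes report for LOD entries


def lod_fraction_by_channel(ratios):
    """Return the fraction of LOD ratios per channel.

    `ratios` is a hypothetical layout: dict mapping channel name -> list of
    abundance ratios, with None for missing values.
    """
    fractions = {}
    for channel, values in ratios.items():
        n_total = len(values)
        n_lod = sum(1 for v in values if v is not None and math.isclose(v, LOD))
        fractions[channel] = n_lod / n_total if n_total else float("nan")
    return fractions


# Toy data, invented purely to show the call
example = {
    "126": [0.01, 0.01, 1.2, 0.9],
    "130N": [1.1, 0.95, 1.0, 0.01],
}
fractions = lod_fraction_by_channel(example)
for channel, frac in sorted(fractions.items()):
    print(channel, frac)
```

A channel whose fraction is clearly above the others would be a candidate for a labelling problem rather than a biological effect.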
Which data are LOD?
Showing a sample of 20 peptides
When I re-sample a couple of times, it's clear that peptides with LOD abundance ratios have either NA or a low signal as sample abundance.
Input from Johan Gobom: TMT is not temp sensitive but water sensitive -> condensation reaction w/water -> no primary amine bindings -> BAD!
I've previously shown that detection depends only on the set: if a peptide is detected in one channel of a set, it is detected in all other channels of that set as well. But it may not be quantified in all of them. E.g. TMT does not bind for some reason in ch 126/127N/127C -> there is no corresponding peak in the spectra -> the peptide is still detected but quantified as NA -> the NA is divided by the control channel (131), which was quantified, and the result is defined as LOD (0.01).
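The chain above can be written down as a small rule. This is only a sketch of the behaviour as I describe it in these notes, not the actual PD2.2 implementation; the function name is made up, and returning no ratio when the control itself is missing is my own assumption. The low-signal route to LOD is not modelled here.

```python
LOD_RATIO = 0.01  # value assigned to LOD ratios, per the notes above


def abundance_ratio(sample_abundance, control_abundance):
    """Sample/control ratio with the LOD rule described in the notes.

    None stands for an abundance quantified as NA.
    """
    if control_abundance is None:
        # Assumption: without a quantified control there is no ratio at all
        return None
    if sample_abundance is None:
        # Detected in the set but not quantified in this channel -> LOD
        return LOD_RATIO
    return sample_abundance / control_abundance


# Hypothetical abundances for one peptide in one set (control = ch 131)
print(abundance_ratio(100.0, 50.0))  # ordinary ratio
print(abundance_ratio(None, 50.0))   # NA sample, quantified control -> LOD
print(abundance_ratio(10.0, None))   # NA control -> no ratio
```

Under this rule an LOD ratio carries no quantitative information about the sample; it only records that the channel was not quantified while the control was.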
So, how to deal with this? Can I just exclude these proteins? Or is there a general skew in the data?
Some channels look weird! E.g. distribution of abundance ratios of F6/127N shown below: It's centered around -2.5! How can that be?
Below are all the ratio distributions displayed by TMT channel
So, 130C/N seem like proper data, and to some extent also 128-129. But 126 and 127N have skewed distributions! The 95% CI varies somewhat between channels, with 130N having the tightest CI.
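To put numbers on the per-channel comparison, each channel can be summarised by its centre and spread. The sketch below makes several assumptions that the notes do not state: that ratios are compared on a log2 scale, that LOD ratios are dropped first, and that the "95% CI" is read as the empirical 2.5-97.5 percentile range. The function name and the toy values are invented.

```python
import math
import statistics


def summarize_channel(ratios, lod=0.01):
    """Median and empirical 95% range of log2 ratios for one channel.

    Missing values (None), non-positive values and ratios equal to `lod`
    are excluded before summarising.
    """
    logs = [
        math.log2(r)
        for r in ratios
        if r is not None and r > 0 and not math.isclose(r, lod)
    ]
    # n=40 gives cut points every 2.5%; the first and last are 2.5% and 97.5%
    cuts = statistics.quantiles(logs, n=40, method="inclusive")
    return {
        "n": len(logs),
        "median_log2": statistics.median(logs),
        "lower_2.5": cuts[0],
        "upper_97.5": cuts[-1],
    }


# Toy ratios for a single channel, invented for illustration
summary = summarize_channel([0.25, 0.5, 1.0, 2.0, 4.0, 0.01, None])
print(summary)
```

A channel behaving like 130N would show a median near 0 and a narrow range; a channel like 127N would show a clearly shifted median.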