Appendix I

Statistical rationale of the pooling scheme for correlation analysis

We denote the expression levels of a single gene in n subjects as Xi (i = 1…n)
for mRNA and Yi (i = 1…n) for protein, which are independent and identically
distributed random variables across subjects with mean […]. Then we have [1]:

The correlation coefficient for X and Y can be
calculated as:
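
For reference, the standard (Pearson) population correlation coefficient can be written as below. The symbols \rho, \mu and \sigma are notation assumed here for illustration and may differ from the original typesetting.

```latex
\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}
          = \frac{E\left[(X - \mu_X)(Y - \mu_Y)\right]}
                 {\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}
```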

Then the correlation coefficient for X’ and Y’ can
be deduced as follows:
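
One way to carry out this deduction is sketched here under stated assumptions: each pool is the equal-weight average of k subjects, the same k subjects contribute to the mRNA pool X' and the protein pool Y', and different subjects are independent. The symbols k, S, \sigma and \rho are notation introduced for this sketch only.

```latex
\begin{align*}
X' &= \frac{1}{k}\sum_{i \in S} X_i, \qquad
Y' = \frac{1}{k}\sum_{i \in S} Y_i, \qquad |S| = k, \\
\operatorname{Var}(X') &= \frac{\sigma_X^2}{k}, \qquad
\operatorname{Var}(Y') = \frac{\sigma_Y^2}{k}, \\
\operatorname{Cov}(X', Y')
  &= \frac{1}{k^2}\sum_{i \in S}\sum_{l \in S}\operatorname{Cov}(X_i, Y_l)
   = \frac{1}{k^2}\sum_{i \in S}\operatorname{Cov}(X_i, Y_i)
   = \frac{\operatorname{Cov}(X, Y)}{k}, \\
\rho_{X'Y'}
  &= \frac{\operatorname{Cov}(X', Y')}
          {\sqrt{\operatorname{Var}(X')\,\operatorname{Var}(Y')}}
   = \frac{\operatorname{Cov}(X, Y)/k}{\sigma_X \sigma_Y / k}
   = \rho_{XY}.
\end{align*}
```

Under these assumptions the factor 1/k cancels, so the pooled and individual-level correlation coefficients coincide.
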
As […], theoretically, for a large sample size, a random and equal pooling
scheme will not affect the detection of mRNA–protein expression correlation.
However, if the pooled samples are not well matched, the correlation
coefficient will decrease. For example, if […]. As 21 subjects out of 30 have
complete corresponding mRNA and protein data, we expect that the correlation […]
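
The effect of matching on the pooled correlation can be checked numerically. The sketch below is an illustration under assumed conditions, not the authors' procedure: expression values are standard normal, each pool is the equal-weight average of k subjects, and only m of the k subjects in a pool contribute matched mRNA and protein values, the remaining protein values being treated as independent of the mRNA values. The function names and parameters (`pearson`, `pooled_correlation`, `rho`, `k`, `m`, `n_pools`) are hypothetical. Under this model the pooled correlation is expected to be close to (m/k) times the individual-level correlation.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_correlation(rho, k, m, n_pools=20000, seed=0):
    """Simulate the correlation between pooled mRNA and pooled protein values.

    Assumed model (illustrative only): each pool averages k subjects; the first
    m subjects are matched, with individual-level correlation rho; the other
    k - m protein values are independent of the mRNA values.
    """
    rng = random.Random(seed)
    x_pools, y_pools = [], []
    for _ in range(n_pools):
        x_sum = y_sum = 0.0
        for j in range(k):
            x = rng.gauss(0.0, 1.0)
            if j < m:
                # Matched subject: protein value correlated with mRNA value.
                y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            else:
                # Unmatched: protein value is independent of the mRNA value.
                y = rng.gauss(0.0, 1.0)
            x_sum += x
            y_sum += y
        x_pools.append(x_sum / k)
        y_pools.append(y_sum / k)
    return pearson(x_pools, y_pools)


if __name__ == "__main__":
    rho = 0.6
    print("fully matched pools:", pooled_correlation(rho, k=30, m=30))
    print("21 of 30 matched:   ", pooled_correlation(rho, k=30, m=21))
```

The values k = 30 and m = 21 echo the example in the text, but whether this matching model corresponds to the actual study design is an assumption.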

Reference

[1] Peng X, Wood CL, Blalock EM, Chen KC, Landfield PW, Stromberg AJ.
Statistical implications of pooling RNA samples for microarray experiments.
BMC Bioinformatics 2003, 4:26.