ROSALIND selects the optimal subset of housekeeping probes for normalization using the geNorm algorithm, as implemented in the Bioconductor package NormqPCR. The list of specific housekeeping genes selected for normalization can be found within the Variance of Mean QC plot (see here for more detail).
Normalization of the raw data is performed in four steps:
1. For each sample, calculate the geometric mean of the counts of the selected housekeeping genes.
2. Calculate the geometric mean across all sample-specific geometric means from step 1. This is the global geometric mean.
3. For each sample, divide the global geometric mean by the sample-specific geometric mean. This generates a normalization factor for each sample.
4. For each sample, multiply the raw count of each gene by that sample's normalization factor.
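
The four steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not ROSALIND's actual implementation: the function names, the nested-dictionary layout (`{sample: {gene: count}}`), and the assumption that all housekeeping counts are strictly positive are hypothetical choices made for the example. Selecting the housekeeping genes with geNorm is not shown; the list is taken as given.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize(raw_counts, housekeeping_genes):
    """Housekeeping normalization of raw counts (illustrative sketch).

    raw_counts: {sample: {gene: raw count}}  (assumed layout)
    housekeeping_genes: genes already selected for normalization
    Returns (normalized_counts, normalization_factors).
    """
    # Step 1: per-sample geometric mean of the housekeeping genes.
    sample_means = {
        sample: geometric_mean([counts[g] for g in housekeeping_genes])
        for sample, counts in raw_counts.items()
    }

    # Step 2: global geometric mean across the per-sample means.
    global_mean = geometric_mean(list(sample_means.values()))

    # Step 3: normalization factor = global mean / sample-specific mean.
    factors = {s: global_mean / m for s, m in sample_means.items()}

    # Step 4: scale every gene's raw count by its sample's factor.
    normalized = {
        sample: {gene: count * factors[sample] for gene, count in counts.items()}
        for sample, counts in raw_counts.items()
    }
    return normalized, factors
```

A sample whose housekeeping genes read lower than the global mean receives a factor above 1, and a sample that reads higher receives a factor below 1. After scaling, the housekeeping geometric mean is the same in every sample.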