The Rosalind data analysis pipeline for RCC file input (i.e., raw data) was validated to reproduce nSolver Advanced Analysis (nSolver AA) results with the following settings:
Analysis Set-up
- Data input: RCC files
- The covariate setting in nSolver AA corresponds to the confounder setting in Rosalind
- Analysis Type: Custom Analysis (vs Quick Analysis)
- Check “Omit Low Count Data” with the default settings: Auto checked, Threshold Count Value of 20, and Observation Frequency of 0.5 (i.e., 50%)
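As a rough illustration of what a low-count filter with these defaults does, here is a minimal Python sketch. It is not the nSolver AA or Rosalind implementation. The function name, the input layout, and the exact rule (a probe is kept when its count reaches the threshold in at least the observation-frequency fraction of samples) are assumptions made for the example, and the counts are made up.

```python
def omit_low_count_probes(counts, threshold=20, observation_frequency=0.5):
    """Return only the probes that pass the low-count filter.

    counts: dict mapping probe name -> list of raw counts, one per sample.
    ASSUMED rule (not taken from either tool): keep a probe when its count
    is >= threshold in at least `observation_frequency` of the samples.
    """
    kept = {}
    for probe, values in counts.items():
        at_or_above = sum(1 for v in values if v >= threshold)
        if at_or_above / len(values) >= observation_frequency:
            kept[probe] = values
    return kept


# Hypothetical raw counts for three probes across four samples.
example_counts = {
    "GENE_A": [5, 12, 30, 8],     # 1 of 4 samples reaches 20 -> omitted
    "GENE_B": [150, 90, 22, 18],  # 3 of 4 samples reach 20 -> kept
    "GENE_C": [25, 10, 40, 3],    # 2 of 4 samples reach 20 -> kept (boundary)
}
kept_probes = omit_low_count_probes(example_counts)
print(sorted(kept_probes))
```

How the boundary case (exactly 50% of samples) is handled is part of the assumption above and may differ in either tool.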
Normalization: default, with the geNorm algorithm
Differential Expression
- Differential Expression with the Fast analysis setting
- P-value Adjustment: Benjamini-Hochberg
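To show why the p-value adjustment setting matters when comparing results, here is a minimal Python sketch of the standard Benjamini-Hochberg procedure. It is a textbook version written for illustration, not code from nSolver AA or Rosalind. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adj_p])
```

In this made-up example, three genes have raw p-values below 0.05 but only one remains below 0.05 after adjustment, so two analyses that differ only in this setting can report different gene lists.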
Cell Type Profiling
- Column Specifying the Cell Types' Characteristic Probes: Use Default (Cell Type)
- Creating Signatures: Dynamically select a subset
- P-value Threshold for Reporting Cell Type Abundance: Custom, p-value = 0.05
- Show results for: Raw Cell Type Abundance (checked), Relative Cell Type Abundance; Cell Type Contrasts: Use Defaults selected
Common reasons for differences between the two analyses are:
- Uploading normalized data rather than raw RCC files
- Running the normalization on different sets of RCC files in nSolver AA and in Rosalind, which gives different results even when the comparison itself is done on the same samples
- P-value adjustment not set to Benjamini-Hochberg
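The point about normalizing on different sets of RCC files can be illustrated with a small Python sketch. This is deliberately not the geNorm algorithm and not either tool's implementation. It assumes a simplified housekeeping-style scaling in which each sample's factor is the average housekeeping geometric mean across all samples in the run, divided by that sample's own geometric mean. All names and counts are hypothetical.

```python
import math


def geomean(values):
    """Geometric mean of a list of positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scaling_factors(housekeeping_counts):
    """Per-sample scaling factors under the ASSUMED simplified scheme.

    housekeeping_counts: dict mapping sample name -> list of
    housekeeping probe counts for that sample.
    """
    per_sample = {s: geomean(v) for s, v in housekeeping_counts.items()}
    # The reference depends on every sample included in the run.
    reference = sum(per_sample.values()) / len(per_sample)
    return {s: reference / g for s, g in per_sample.items()}


# The same two samples, normalized alone and together with a third sample.
set_a = {"S1": [100, 400], "S2": [200, 800]}
set_b = {"S1": [100, 400], "S2": [200, 800], "S3": [400, 1600]}

factors_a = scaling_factors(set_a)
factors_b = scaling_factors(set_b)
print(round(factors_a["S1"], 3), round(factors_b["S1"], 3))
```

Sample S1 has identical raw counts in both runs, yet its scaling factor changes once S3 is included, so its normalized values (and any downstream comparison) change as well.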