GEPIA generates box plots with jitter for comparing expression across several cancer types (for best visual quality, we recommend 1-4 cancer types).
Parameters
- Gene: Input a gene/isoform or gene signature of interest.
- Datasets Selection/Dataset: Select cancer types of interest in the "Datasets Selection" field and click "add" to build a dataset list in the "Dataset" field. Cancer types can also be entered manually as a comma-separated list (e.g. ACC,BRCA,BLCA). The x-axis of the plot follows the order of the dataset list. Users can also restrict a cancer type to a subtype (for example, the Breast Luminal A subtype) by ticking the "Subtype Filter" check-box, and then select one of three modes:
  - "Stacked": combine all samples from the selected subtypes before differential expression analysis against normal samples.
  - "Separated": perform differential expression analysis against normal samples separately for each selected subtype.
  - "Comparing Across Subtypes": perform differential expression analysis among the selected subtypes.
- Tumor Color: Set the box color of the tumor dataset.
- Normal Color: Set the box color of the normal dataset.
- Log Scale: Choose whether to use linear or log2(TPM + 1) transformed expression data for plotting.
- Jitter Size: Set the size of the jitter across the box.
Differential thresholds:
- |log2FC| Cutoff: Set a custom fold-change threshold.
- p-value Cutoff: Set a custom p-value threshold.
- Matched Normal data: Select "TCGA normal + GTEx normal" or "Only TCGA normal" for differential analysis and plotting.
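The dataset list and the three subtype modes described above can be pictured with a short Python sketch. It is only an illustration of the grouping logic as the text describes it, not GEPIA's actual implementation: the function names `parse_dataset_list` and `build_comparisons`, the data layout, and the expression values are all hypothetical.

```python
from typing import Dict, List, Tuple

# A comparison is a label plus the named groups of expression values to compare.
Comparison = Tuple[str, Dict[str, List[float]]]


def parse_dataset_list(text: str) -> List[str]:
    """Split a comma-separated list such as "ACC,BRCA,BLCA", keeping input order.

    The order is preserved because the x-axis follows the dataset order.
    """
    return [name.strip() for name in text.split(",") if name.strip()]


def build_comparisons(
    subtypes: Dict[str, List[float]],
    normal: List[float],
    mode: str,
) -> List[Comparison]:
    """Group samples for analysis according to the chosen subtype mode (hypothetical helper)."""
    if mode == "Stacked":
        # Pool every selected subtype into one tumor group, then compare with normal.
        pooled = [v for values in subtypes.values() for v in values]
        return [("Stacked", {"Tumor": pooled, "Normal": list(normal)})]
    if mode == "Separated":
        # One tumor-vs-normal comparison per selected subtype.
        return [
            (name, {"Tumor": list(values), "Normal": list(normal)})
            for name, values in subtypes.items()
        ]
    if mode == "Comparing Across Subtypes":
        # A single comparison among the subtypes themselves; normal samples are not used.
        groups = {name: list(values) for name, values in subtypes.items()}
        return [("Across subtypes", groups)]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Made-up values, for illustration only.
    subtypes = {"Luminal A": [3.1, 4.0], "Subtype B": [5.2]}
    normal = [1.0, 2.0]
    print(parse_dataset_list("ACC,BRCA,BLCA"))
    for mode in ("Stacked", "Separated", "Comparing Across Subtypes"):
        print(mode, build_comparisons(subtypes, normal, mode))
```

In this sketch, "Stacked" yields one comparison, "Separated" yields one comparison per subtype, and "Comparing Across Subtypes" yields one comparison whose groups are the subtypes.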
The differential analysis here is based on the selected datasets ("TCGA tumors vs TCGA normal + GTEx normal" or "TCGA tumors vs TCGA normal"). The method for differential analysis is one-way ANOVA, using disease state (Tumor or Normal) as the variable for calculating differential expression:
Gene expression ~ disease state
The expression data are first log2(TPM + 1) transformed for differential analysis, and the log2FC is defined as median(Tumor) - median(Normal).
Genes with higher |log2FC| values and lower q values than the pre-set thresholds are considered differentially expressed genes.
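As a rough illustration of the calculation described above, the following Python sketch applies the log2(TPM + 1) transform, computes log2FC as the difference of medians, and computes the one-way ANOVA F statistic for the Tumor and Normal groups. It is a minimal sketch using only the standard library, not GEPIA's code: the function names, input values, and cutoffs are hypothetical. Converting the F statistic to a p-value or q value requires the F distribution and a multiple-testing correction, which are left out here; a statistics library (for example `scipy.stats.f_oneway`) could supply the p-value.

```python
import math
from statistics import mean, median
from typing import List, Sequence


def log2_tpm(tpm: Sequence[float]) -> List[float]:
    """Apply the log2(TPM + 1) transform to each expression value."""
    return [math.log2(x + 1.0) for x in tpm]


def log2_fold_change(tumor_tpm: Sequence[float], normal_tpm: Sequence[float]) -> float:
    """log2FC = median(Tumor) - median(Normal), both on the log2(TPM + 1) scale."""
    return median(log2_tpm(tumor_tpm)) - median(log2_tpm(normal_tpm))


def anova_f(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic: between-group over within-group mean square.

    Raises ZeroDivisionError if there is no within-group variation.
    """
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def is_differential(log2fc: float, q_value: float,
                    fc_cutoff: float, q_cutoff: float) -> bool:
    """True when |log2FC| exceeds its cutoff and the q value is below its cutoff."""
    return abs(log2fc) > fc_cutoff and q_value < q_cutoff


if __name__ == "__main__":
    # Made-up TPM values, chosen so the transformed values are whole numbers.
    tumor = [7, 15, 31]   # log2(TPM + 1) -> 3, 4, 5
    normal = [1, 3, 7]    # log2(TPM + 1) -> 1, 2, 3
    print(log2_fold_change(tumor, normal))                 # 2.0
    print(anova_f([log2_tpm(tumor), log2_tpm(normal)]))    # 6.0
    # The q value and both cutoffs below are placeholders, not GEPIA defaults.
    print(is_differential(2.0, q_value=0.001, fc_cutoff=1.0, q_cutoff=0.01))  # True
```

With only two groups (Tumor and Normal), the one-way ANOVA F statistic equals the square of the equal-variance two-sample t statistic, so the model "Gene expression ~ disease state" reduces to a two-group comparison in this case.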