Box Plots

    GEPIA generates box plots with jitter for comparing expression across several cancer types. For best visual quality, we recommend selecting 1-4 cancer types.

Parameters

  • Gene: Input a gene/isoform or gene signature of interest.
  • Dataset Selection/Dataset: Select cancer types of interest in the "Dataset Selection" field and click "add" to build a dataset list in the "Dataset" field. Cancer types can also be entered manually as a comma-separated list (e.g. ACC,BRCA,BLCA). The x-axis of the plot follows the order of the dataset list. Users can also choose a cancer subtype (for example, the Breast Luminal A subtype) by ticking the "Subtype Filter" check-box, and then select "Stacked", "Separated" or "Comparing Across Subtypes". "Stacked" combines all samples from the selected subtypes before differential expression analysis against normal samples. "Separated" performs differential expression analysis against normal samples separately for each selected subtype. "Comparing Across Subtypes" performs differential expression analysis among the selected subtypes.
  • Tumor Color: Set the box color for tumor datasets.
  • Normal Color: Set the box color for normal datasets.
  • Log Scale: Choose whether to plot linear or log2(TPM + 1)-transformed expression data.
  • Jitter Size: Set the size of the jitter points drawn over each box.

  • Differential thresholds:

    • |log2FC| Cutoff: Set a custom fold-change threshold.
    • p-value Cutoff: Set a custom p-value threshold.

  • Matched Normal data: Select "TCGA normal + GTEx normal" or "Only TCGA normal" for differential analysis and plotting.
  • The differential analysis is based on the selected datasets ("TCGA tumors vs TCGA normal + GTEx normal" or "TCGA tumors vs TCGA normal"). The method is one-way ANOVA, using disease state (Tumor or Normal) as the variable for calculating differential expression:

    Gene expression ~ disease state

    The expression data are first log2(TPM + 1) transformed for differential analysis, and the log2FC is defined as median(Tumor) - median(Normal).

    Genes with higher |log2FC| values and lower q values than the pre-set thresholds are considered differentially expressed genes.
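
    The calculation described above can be illustrated with a minimal Python sketch. It is not GEPIA's own code: the function names, the TPM values and the cutoff of 1.0 are hypothetical, chosen only to make the arithmetic easy to follow. The sketch applies the log2(TPM + 1) transform, takes log2FC as the difference of medians, and computes the one-way ANOVA F statistic for the model "Gene expression ~ disease state". Converting F to a p-value (and adjusting it to a q value) requires the F distribution and is left out here.

```python
import math
from statistics import mean, median


def log2_tpm(values):
    """Apply the log2(TPM + 1) transform to a list of TPM values."""
    return [math.log2(v + 1.0) for v in values]


def log2_fold_change(tumor_tpm, normal_tpm):
    """log2FC = median(Tumor) - median(Normal) on log2(TPM + 1) data."""
    return median(log2_tpm(tumor_tpm)) - median(log2_tpm(normal_tpm))


def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation explained by group membership (here: Tumor vs Normal).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation inside each group.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical TPM values for one gene (illustration only).
tumor_tpm = [7.0, 15.0, 31.0, 63.0]
normal_tpm = [1.0, 3.0, 3.0, 7.0]

log2fc = log2_fold_change(tumor_tpm, normal_tpm)
f_stat = anova_f_statistic([log2_tpm(tumor_tpm), log2_tpm(normal_tpm)])

log2fc_cutoff = 1.0  # hypothetical cutoff, set by the user in practice
passes_fold_change = abs(log2fc) > log2fc_cutoff

print(f"log2FC = {log2fc:.2f}, F = {f_stat:.3f}, passes |log2FC| cutoff: {passes_fold_change}")
```

    With only two groups (Tumor and Normal), the F statistic has 1 and N - 2 degrees of freedom, where N is the total number of samples. A gene would be reported as differentially expressed only if it also falls below the significance threshold.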
