A simple use-case comparing OmicsBox with R chunks for Time Course Expression Analysis
The Blast2GO feature “Time Course Expression Analysis” performs time-course expression analysis of count data from RNA-seq experiments. It is based on the Bioconductor package ‘maSigPro’ and detects genomic features with significant temporal expression changes, as well as significant differences between experimental groups, by applying a two-step regression strategy. This use case walks through the basic analysis workflow and compares the results obtained with R/Bioconductor and with Blast2GO.
DataSet
This use case explains the analysis of a real dataset that describes the transcriptional response of immunocompromised Arabidopsis thaliana lines to the barley powdery mildew fungus Blumeria graminis (GSE43163). The experimental design of this study has 48 samples: plants were challenged with either the Bgh (B. graminis) isolate K1 or the Bgh isolate A6, and three independent biological replicates per condition were harvested at 6, 12, 18 and 24 hours.
Analysis Workflow
- Loading Data
The read counts for the 48 samples were stored in one tab-delimited file. In this case, raw counts without any type of normalization were provided.
maSigPro:
    counts = read.delim("count_table.txt", check.names=F, stringsAsFactors=F, row.names=1)
Blast2GO: [screenshot]
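Before moving on, it can help to confirm that the loaded table really contains raw counts. The following is a minimal base-R sketch, assuming a tab-delimited layout with gene identifiers in the first column. The toy table, the gene names, and the helper `is_raw_counts` are hypothetical and exist only for illustration.

```r
# Toy count table standing in for count_table.txt (hypothetical values).
toy <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  s1 = c(10L, 0L, 250L),
  s2 = c(12L, 1L, 300L),
  s3 = c(8L, 0L, 275L)
)
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Same call pattern as in the tutorial: the first column becomes the row names.
counts <- read.delim(path, check.names = FALSE, stringsAsFactors = FALSE,
                     row.names = 1)

# Hypothetical helper: raw counts are non-negative whole numbers without NAs.
is_raw_counts <- function(x) {
  m <- as.matrix(x)
  is.numeric(m) && !anyNA(m) && all(m >= 0) && all(m == round(m))
}

print(dim(counts))
print(is_raw_counts(counts))
```

A table that has already been normalized would typically contain fractional values and fail this check.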
- Filtering and Normalization
Genes with low counts should be removed, since there is no point in testing genes for differential expression if they are not expressed. The filter used here excludes genes that are not expressed in at least one experimental condition. To make the samples comparable and to remove possible biases, the TMM normalization method is then applied.
maSigPro:
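    ## Filtering
    library(edgeR)
    dge = DGEList(counts=counts)
    # The argument of cpm() was lost in the original line; dge is assumed here.
    # Keep genes with CPM >= 5.5 in at least 24 samples.
    keep = rowSums(cpm(dge) >= 5.5) >= 24
    dge = dge[keep, , keep.lib.sizes=F]
    filteredData = as.data.frame(dge$counts)
    ## Normalization
    library(NOISeq)
    normalizedData = tmm(filteredData, long=1000, lc=0)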
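To make the filtering step easier to follow, here is a minimal base-R sketch of a counts-per-million (CPM) filter that does not rely on edgeR. The toy matrix, the `bulk` row that stands in for the rest of the transcriptome, and the values of `cpm_threshold` and `min_samples` are assumptions chosen for illustration, not the settings of the study.

```r
# Toy raw counts: 5 features x 6 samples (hypothetical values).
counts <- matrix(
  c(100, 120,  90, 110, 105,  95,   # geneA: expressed in every sample
      0,   1,   0,   0,   2,   0,   # geneB: essentially silent
     50,  60,  55,   0,   0,   1,   # geneC: expressed in the first three samples
      1,   0,   2,   1,   0,   1,   # geneD: essentially silent
    1e6, 1e6, 1e6, 1e6, 1e6, 1e6),  # bulk: rest of the transcriptome
  nrow = 5, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD", "bulk"),
                  paste0("s", 1:6))
)

# Counts per million: scale each column by its library size.
lib_size <- colSums(counts)
cpm_mat <- sweep(counts, 2, lib_size, "/") * 1e6

# Keep features that reach the CPM threshold in at least min_samples samples.
cpm_threshold <- 5.5  # hypothetical threshold
min_samples <- 3      # hypothetical size of one experimental condition
keep <- rowSums(cpm_mat >= cpm_threshold) >= min_samples
filtered <- counts[keep, , drop = FALSE]
print(rownames(filtered))
```

Setting `min_samples` to the size of the smallest condition keeps a gene such as geneC, which is expressed in only one group, and those are often the genes of most interest in a comparison between groups.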
- Experimental Design
The experimental design of this case is a replicated time course with four time points and two series (Bgh isolate A6 or K1).
maSigPro:
    edesign = read.delim("design.txt", check.names=F, stringsAsFactors=F, row.names=1)
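The contents of the design file are not shown above. As a rough guide, maSigPro expects one row per sample, with a time column, a replicate column, and one 0/1 indicator column per experimental group. The sketch below builds such a table in base R for a simplified layout with 24 samples (two isolates, four time points, three replicates). The column names and sample names are hypothetical, and the real study has 48 samples.

```r
time_points <- c(6, 12, 18, 24)
isolates <- c("A6", "K1")
n_rep <- 3

# One row per sample: every isolate x time point combination, n_rep times.
grid <- expand.grid(rep = seq_len(n_rep), Time = time_points,
                    isolate = isolates, stringsAsFactors = FALSE)
condition <- paste(grid$isolate, grid$Time)

edesign <- data.frame(
  Time = grid$Time,
  # Samples that share isolate and time point get the same replicate index.
  Replicate = as.integer(factor(condition, levels = unique(condition))),
  # One indicator column per group; the control (A6) is listed first.
  A6 = as.integer(grid$isolate == "A6"),
  K1 = as.integer(grid$isolate == "K1"),
  row.names = paste(grid$isolate, paste0(grid$Time, "h"), grid$rep, sep = "_")
)
print(head(edesign, 4))
```

Each sample belongs to exactly one group, so the indicator columns sum to 1 in every row.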
- Time Course Expression Analysis
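The software package maSigPro follows a two-step regression strategy to find genes that show significant expression changes over time and between experimental groups. The first step fits a global regression model to each gene and selects the genes whose model is significant; the second step applies stepwise regression to identify the variables that explain the differences. Plants that were challenged with the Bgh isolate A6 were treated as the control condition.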
maSigPro:
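    library(maSigPro)
    # Quadratic regression model built from the experimental design
    design = make.design.matrix(edesign=edesign, degree=2)
    # Step 1: select genes with a significant global model (FDR 0.05)
    fit = p.vector(data=normalizedData, design=design, Q=0.05, counts=T, min.obs=7)
    # Step 2: stepwise regression to find the significant variables per gene
    tstep = T.fit(fit)
    # Significant genes with R-squared above 0.5, grouped in different ways
    sigs.all = get.siggenes(tstep=tstep, rsq=0.5, vars="all")
    sigs.group = get.siggenes(tstep=tstep, rsq=0.5, vars="groups")
    sigs.each = get.siggenes(tstep=tstep, rsq=0.5, vars="each")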
Blast2GO: [screenshot]
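The idea behind the two-step strategy can be illustrated for a single gene with base R. This is only a conceptual sketch on simulated data. It uses an ordinary linear model and AIC-based backward selection via `step()`, whereas maSigPro fits generalized linear models for count data and uses its own stepwise procedure, so the numbers are not comparable with the output of the package.

```r
set.seed(1)

# Toy design: 4 time points x 3 replicates x 2 groups (control and treated).
d <- expand.grid(rep = 1:3, Time = c(6, 12, 18, 24), treated = c(0, 1))

# Simulated expression for one gene: flat in control, rising in treated.
d$y <- 5 + 0.4 * d$Time * d$treated + rnorm(nrow(d), sd = 0.5)

# Step 1: global test of the full quadratic model against a flat model.
full <- lm(y ~ Time + I(Time^2) + treated + Time:treated + I(Time^2):treated,
           data = d)
null <- lm(y ~ 1, data = d)
p_global <- anova(null, full)[["Pr(>F)"]][2]

# Across many genes, p-values like p_global would be corrected for multiple
# testing, for example with p.adjust(p_values, method = "BH").

# Step 2: stepwise selection of the terms that matter for this gene.
selected <- step(full, direction = "backward", trace = 0)
print(p_global)
print(names(coef(selected)))
```

A gene is a candidate for differences between groups when terms involving the group indicator (here `treated`) survive the second step.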
Results
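Once the analysis has finished, the result tables need to be interpreted to reach biological conclusions. In maSigPro, tstep$sol summarizes the stepwise regression for each selected gene (p-value of the model, R-squared, and p-values of the selected coefficients), and sigs.group$summary lists the significant genes found for each experimental group.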
maSigPro:
    tstep$sol
    sigs.group$summary
Blast2GO: [screenshots]
Statistics
Several statistics charts can be generated to give a global view of the results.
- Venn Diagram
Diagram that shows all possible logical relations between a finite collection of different feature sets.
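The numbers that appear in the regions of a two-set Venn diagram can be computed directly from the lists of significant features. A minimal base-R sketch follows, where the two gene lists are hypothetical stand-ins for the significant genes of two comparisons:

```r
# Hypothetical sets of significant gene identifiers for two comparisons.
sig_time  <- c("g1", "g2", "g3", "g4")
sig_group <- c("g3", "g4", "g5")

# One count per region of the two-set Venn diagram.
venn_counts <- c(
  only_time  = length(setdiff(sig_time, sig_group)),
  both       = length(intersect(sig_time, sig_group)),
  only_group = length(setdiff(sig_group, sig_time))
)
print(venn_counts)
```

The three counts add up to the size of the union of both sets, which is a quick consistency check on any Venn diagram.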
- Expression Profile by Gene
Graph of gene expression profiles over time for a particular feature.
maSigPro:
    gene <- normalizedData[rownames(normalizedData)=="bgh02759", ]
    PlotGroups(gene, edesign=edesign)
- Experiment-wide Expression Profiles
Plot that shows the expression levels across samples for each cluster of genes.
maSigPro:
    see.genes(sigs.all$sig.genes, min.obs=7, cluster.method="hclust", cluster.data=1, k=9)
Blast2GO: [screenshot]
- Summary Expression Profiles
Plot that shows the median expression level of each cluster of genes over time. In maSigPro, the same see.genes call generates both this plot and the experiment-wide profiles.
maSigPro:
    see.genes(sigs.all$sig.genes, min.obs=7, cluster.method="hclust", cluster.data=1, k=9)
Blast2GO: [screenshot]
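What a summary profile contains can be sketched in base R: cluster the gene profiles, then take the median expression of each cluster at every time point. The example below uses simulated data with two obvious clusters. The matrix, the gene names, and the choice of k are hypothetical, and the clustering settings are not meant to reproduce what see.genes does internally.

```r
set.seed(42)

# Toy expression matrix: 6 genes x 4 time points (hypothetical values).
# Genes 1-3 rise over time, genes 4-6 fall.
time_points <- c(6, 12, 18, 24)
rising  <- t(replicate(3, c(1, 2, 3, 4) + rnorm(4, sd = 0.1)))
falling <- t(replicate(3, c(4, 3, 2, 1) + rnorm(4, sd = 0.1)))
expr <- rbind(rising, falling)
dimnames(expr) <- list(paste0("gene", 1:6), paste0(time_points, "h"))

# Hierarchical clustering of the gene profiles, cut into k clusters.
k <- 2
clusters <- cutree(hclust(dist(expr)), k = k)

# Median expression per cluster at each time point.
median_profiles <- t(sapply(split(as.data.frame(expr), clusters),
                            function(g) apply(g, 2, median)))
print(median_profiles)
```

Each row of `median_profiles` corresponds to one line in a summary plot: one cluster followed across the time points.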
Conclusions
As shown in this use case, the maSigPro package is a powerful tool for the statistical analysis of RNA-seq data from time-course experiments. The Blast2GO feature “Time Course Expression Analysis” builds on the full statistical functionality of maSigPro and offers a simple way to perform this type of analysis without programming skills. Furthermore, users can combine it with other Blast2GO features to complete the analysis and gain a better understanding of the biological problem under study.
References:
Nueda MJ, Tarazona S and Conesa A (2014). “Next maSigPro: updating maSigPro bioconductor package for RNA-seq time series.” Bioinformatics, 30, p. 2598–2602.