Odds and Ends
1 Use R to retrieve genes associated with a GO term
1.1 Method 1 - use biomaRt
Given a GO term ID such as GO:0046328, how do we get all the genes mapped to that term? With biomaRt, we request the attributes we want and use a filter to restrict the results to genes annotated with that term.
library(biomaRt)
# make a connection to ensembl and select data source
ensembl <- useMart("ensembl",dataset="dmelanogaster_gene_ensembl")
# select attributes we want to retrieve, and filter for genes annotated to GO:0046328
genes <- getBM(attributes = c('go_id', 'ensembl_gene_id', 'description'),
               filters = 'go', values = 'GO:0046328', mart = ensembl)
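Filter and attribute names can differ between Ensembl releases, so it may help to confirm that the 'go' filter exists before running the query. The following is a minimal sketch, assuming the same mart connection as above and network access to Ensembl. The `^go` pattern is only an illustrative guess at how the relevant names begin.

```r
library(biomaRt)

# Same connection as in the example above (requires network access)
ensembl <- useMart("ensembl", dataset = "dmelanogaster_gene_ensembl")

# listFilters() and listAttributes() return data frames describing
# what the selected dataset offers
filters <- listFilters(ensembl)
attrs <- listAttributes(ensembl)

# Show the entries whose name starts with "go"
filters[grepl("^go", filters$name), ]
attrs[grepl("^go", attrs$name), ]
```

If the 'go' filter is not listed for your release, the printed table should show which GO-related filter names are available instead.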
1.2 Method 2 - use the annotation objects
Use the annotation objects to select gene IDs using GO terms as the keys (this example looks up a different term, GO:0038023):
library(org.Dm.eg.db)
select(org.Dm.eg.db, keys="GO:0038023", keytype="GO", columns=c("GO","ENSEMBL","SYMBOL","GENENAME"))
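A GO lookup like this can return the same gene more than once, for example when a gene is annotated to the term with several evidence codes. The sketch below, which assumes the org.Dm.eg.db package is installed, reduces the result to a unique gene list. It also shows the GOALL key type, which additionally includes genes annotated to child terms of the key. The variable names `hits` and `hits_all` are illustrative.

```r
library(org.Dm.eg.db)

# Direct annotations only: genes annotated to exactly this term
hits <- select(org.Dm.eg.db, keys = "GO:0038023", keytype = "GO",
               columns = c("ENSEMBL", "SYMBOL"))
unique(hits[, c("ENSEMBL", "SYMBOL")])

# GOALL also includes genes annotated to child terms of the key
hits_all <- select(org.Dm.eg.db, keys = "GO:0038023", keytype = "GOALL",
                   columns = c("ENSEMBL", "SYMBOL"))
unique(hits_all[, c("ENSEMBL", "SYMBOL")])
```

Whether GO or GOALL is the better choice depends on whether genes annotated only to more specific terms should count as members of the set.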
2 Get gene annotation for a previous version of a genome or Ensembl
The above method retrieves the latest version of the Ensembl annotation by default. Sometimes you want to access a previous version. Projects often unfold over years, and you may need annotation for an older set of gene IDs. For instance, the current version of Ensembl annotation for mouse is 98, but maybe you need something from Ensembl 94 to maintain compatibility with current projects and older data sets.
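# connect to the archived Ensembl release 94 instead of the current release
ensembl <- useEnsembl(biomart = "ensembl", dataset = "dmelanogaster_gene_ensembl", version = 94)
# no filters are given, so this retrieves the listed attributes for every gene
dmel <- getBM(attributes = c("ensembl_gene_id", "external_gene_name", "chromosome_name",
                             "start_position", "end_position", "transcript_biotype", "description"),
              mart = ensembl)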
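Not every past release stays online indefinitely, so it can be worth checking which archives are available before requesting a specific version. A minimal sketch, assuming network access; release 94 is used only because it is the version discussed above, and it may no longer be listed:

```r
library(biomaRt)

# listEnsemblArchives() returns a data frame of the archived releases
# that can currently be reached, including a version column
archives <- listEnsemblArchives()

# Show the row for release 94, if it is still available
archives[archives$version == "94", ]
```

An empty result here suggests that the requested release has been retired and a different archived version is needed.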