Odds and Ends

1 Use R to retrieve genes associated with a GO term

1.1 Method 1 - use biomaRt

Given a GO term ID such as GO:0046328, how do we get all the genes mapped to that term? With biomaRt, we choose the attributes we want to retrieve and use the filter mechanism to restrict the query to genes annotated with that term.

library(biomaRt)

# make a connection to ensembl and select data source
ensembl <- useMart("ensembl",dataset="dmelanogaster_gene_ensembl")

# select attributes we want to retrieve, and filter for genes annotated to GO:0046328
genes <- getBM(attributes = c('go_id', 'ensembl_gene_id', 'description'),
               filters = 'go', values = 'GO:0046328', mart = ensembl)
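
One thing worth checking in the result: the `go` filter selects genes, so when `go_id` is among the attributes the returned table may also contain rows for other GO terms annotated to those same genes. The following is a minimal sketch of narrowing such a table down to one row per gene for the requested term. The helper name `genes_for_term` and the toy data frame (with made-up identifiers) are hypothetical and only illustrate the shape of a getBM() result.

```r
# Hypothetical helper: keep only rows for the requested GO term and
# return one row per gene. `bm` is a data frame shaped like the getBM()
# result above (columns go_id, ensembl_gene_id, description).
genes_for_term <- function(bm, term) {
  hits <- bm[bm$go_id %in% term, , drop = FALSE]
  hits[!duplicated(hits$ensembl_gene_id), , drop = FALSE]
}

# Toy input with made-up identifiers, only to show the shape of the data
toy <- data.frame(
  go_id = c("GO:0046328", "GO:0005515", "GO:0046328", "GO:0046328"),
  ensembl_gene_id = c("geneA", "geneA", "geneB", "geneB"),
  description = c("example A", "example A", "example B", "example B"),
  stringsAsFactors = FALSE
)

result <- genes_for_term(toy, "GO:0046328")
print(result)
```

With a real query, the same call would be `genes_for_term(genes, "GO:0046328")`.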

1.2 Method 2 - use the annotation objects

Use a Bioconductor annotation object, here org.Dm.eg.db, to select gene IDs with GO terms as the keys:

library(org.Dm.eg.db)

select(org.Dm.eg.db, keys="GO:0038023", keytype="GO", columns=c("GO","ENSEMBL","SYMBOL","GENENAME"))
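
As a rough sketch of how to explore this further: `keytypes()` and `columns()` list what the annotation object can be queried by and what it can return. In the org.*.eg.db packages the `GO` key type covers direct annotations only, while `GOALL` also covers genes annotated to descendant terms. Whether descendant terms are wanted is an assumption that depends on the analysis. Calling `AnnotationDbi::select` with an explicit namespace avoids clashes with a `select` function from other attached packages.

```r
library(org.Dm.eg.db)

# Key types and columns supported by this annotation object
kt <- keytypes(org.Dm.eg.db)
cols <- columns(org.Dm.eg.db)

# Direct annotations only (same kind of query as above, explicit namespace)
direct <- AnnotationDbi::select(org.Dm.eg.db, keys = "GO:0038023",
                                keytype = "GO",
                                columns = c("ENSEMBL", "SYMBOL"))

# Assumption: genes annotated to child terms are wanted as well
with_children <- AnnotationDbi::select(org.Dm.eg.db, keys = "GO:0038023",
                                       keytype = "GOALL",
                                       columns = c("ENSEMBL", "SYMBOL"))

# Compare the number of distinct genes found by each query
length(unique(direct$ENSEMBL))
length(unique(with_children$ENSEMBL))
```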

2 Get gene annotation for a previous version of a genome or Ensembl

The biomaRt method above retrieves the latest version of the Ensembl annotation by default. Sometimes you want to access a previous version. Projects often unfold over years, and you may need annotation for an older set of gene IDs. For instance, the current version of the Ensembl annotation for mouse is 98, but you may need Ensembl 94 to stay compatible with ongoing projects and older data sets. The example below requests release 94 for Drosophila melanogaster.

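library(biomaRt)
# connect to the archived Ensembl release 94
ensembl <- useEnsembl(biomart="ensembl", dataset="dmelanogaster_gene_ensembl", version=94)
# no filters are given, so the query covers every gene in the dataset
dmel <- getBM(attributes=c("ensembl_gene_id", "external_gene_name", "chromosome_name",
              "start_position", "end_position", "transcript_biotype", "description"), mart=ensembl)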
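
Once an archived table such as `dmel` is in hand, a quick sanity check is to confirm that the gene IDs from the older project are actually present in it. The sketch below does this with base R only. The helper name `missing_ids` and the toy inputs (with made-up identifiers) are hypothetical stand-ins for a real annotation table and a real ID list. biomaRt also provides `listEnsemblArchives()` for looking up which releases are available.

```r
# Hypothetical helper: report which of a project's gene ids are absent
# from an annotation table such as `dmel` above.
missing_ids <- function(ids, annotation, id_col = "ensembl_gene_id") {
  setdiff(unique(ids), annotation[[id_col]])
}

# Toy stand-ins with made-up identifiers
toy_annotation <- data.frame(
  ensembl_gene_id = c("geneA", "geneB", "geneC"),
  external_gene_name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
project_ids <- c("geneA", "geneC", "geneX", "geneX")

not_found <- missing_ids(project_ids, toy_annotation)
print(not_found)
```

An empty result suggests the chosen release covers the project's IDs; a long list suggests that a different release may be the better match.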

Date: 2013-08-30 Fri 00:00

Author: Christopher Seidel

Created: 2020-10-06 Tue 00:38
