makeTranscriptDbFromBiomart {GenomicFeatures} | R Documentation |
The makeTranscriptDbFromBiomart function allows the user to make a TranscriptDb object from transcript annotations available on a BioMart database.
getChromInfoFromBiomart(biomart="ensembl",
                        dataset="hsapiens_gene_ensembl",
                        id_prefix="ensembl_",
                        host="www.biomart.org",
                        port=80)

makeTranscriptDbFromBiomart(biomart="ensembl",
                            dataset="hsapiens_gene_ensembl",
                            transcript_ids=NULL,
                            circ_seqs=DEFAULT_CIRC_SEQS,
                            filters="",
                            id_prefix="ensembl_",
                            host="www.biomart.org",
                            port=80,
                            miRBaseBuild=NA)
biomart |
which BioMart database to use.
Get the list of all available BioMart databases with the
|
dataset |
which dataset from BioMart. For example:
|
transcript_ids |
optionally, only retrieve transcript annotation data for the specified set of transcript ids. If this is used, then the meta information displayed for the resulting TranscriptDb object will say 'Full dataset: no'. Otherwise it will say 'Full dataset: yes'. |
circ_seqs |
a character vector listing the chromosomes that should be marked as circular. |
filters |
Additional filters to use in the BioMart query. Must be
a named list. An example is |
host |
The host URL of the BioMart. Defaults to www.biomart.org. |
port |
The port to use in the HTTP communication with the host. |
id_prefix |
Specifies the prefix used in BioMart attributes. For
example, some BioMarts may have an attribute specified as
|
miRBaseBuild |
Specify the string for the appropriate build
information from mirbase.db to use for microRNAs. This can be
learned by calling |
makeTranscriptDbFromBiomart is a convenience function that feeds data from a BioMart database to the lower level makeTranscriptDb function. See ?makeTranscriptDbFromUCSC for a similar function that feeds data from the UCSC source.
BioMart databases that are known to have compatible transcript annotations are:
the most recent ensembl: ENSEMBL GENES (SANGER UK)
the most recent bacterial_mart: ENSEMBL BACTERIA (EBI UK)
the most recent fungal_mart: ENSEMBL FUNGAL (EBI UK)
the most recent metazoa_mart: ENSEMBL METAZOA (EBI UK)
the most recent plant_mart: ENSEMBL PLANT (EBI UK)
the most recent protist_mart: ENSEMBL PROTISTS (EBI UK)
the most recent ensembl_expressionmart: EURATMART (EBI UK)
Not all annotations will have CDS information.
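Because CDS information may be missing, it can be worth checking a freshly built object before relying on coding-region queries. The following is a minimal sketch, not part of the original page: it assumes network access to the default BioMart host and uses the cds() extractor from GenomicFeatures, which this page does not itself mention. The two transcript ids are borrowed from the examples, and has_cds is a hypothetical variable name.

```r
library("GenomicFeatures")

## Build a small TranscriptDb from two transcript ids (taken from the
## examples on this page) so that the download stays short.
transcript_ids <- c("ENST00000268655", "ENST00000313243")
txdb <- makeTranscriptDbFromBiomart(transcript_ids=transcript_ids)

## cds() returns the CDS ranges stored in the object; an empty result
## means that no CDS information was retrieved for these transcripts.
cds_ranges <- cds(txdb)
has_cds <- length(cds_ranges) > 0L

if (has_cds) {
    message("CDS ranges found: ", length(cds_ranges))
} else {
    message("No CDS information is available in this TranscriptDb.")
}
```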
A TranscriptDb object.
M. Carlson and H. Pages
listMarts, useMart, listDatasets, DEFAULT_CIRC_SEQS, makeTranscriptDbFromUCSC, makeTranscriptDbFromGFF, makeTranscriptDb, supportedMiRBaseBuildValues
## Discover which datasets are available in the "ensembl" BioMart
## database:
library("biomaRt")
library("GenomicFeatures")
head(listDatasets(useMart("ensembl")))

## Retrieving an incomplete transcript dataset for Human from the
## "ensembl" BioMart database:
transcript_ids <- c(
    "ENST00000268655",
    "ENST00000313243",
    "ENST00000341724",
    "ENST00000400839",
    "ENST00000435657",
    "ENST00000478783"
)
txdb <- makeTranscriptDbFromBiomart(transcript_ids=transcript_ids)
txdb  # note that these annotations match the GRCh37 genome assembly

## Now what if we want to use another mirror?  We might make use of the
## new host argument.  But wait!  If we use biomaRt, we can see that
## this host has named the mart differently!
listMarts(host="uswest.ensembl.org")

## Therefore we must also change the name passed into the "biomart"
## argument thusly:
try(
    txdb <- makeTranscriptDbFromBiomart(biomart="ENSEMBL_MART_ENSEMBL",
                                        transcript_ids=transcript_ids,
                                        host="uswest.ensembl.org")
)
txdb
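The examples above do not exercise getChromInfoFromBiomart or the filters argument. The sketch below is a hedged illustration of both and is not taken from the original page. It assumes network access to the default host, and it assumes that the chosen dataset exposes a BioMart filter named chromosome_name (true of Ensembl gene datasets as far as we know, but not stated on this page). The variable names chrominfo and txdb_chr21 are hypothetical.

```r
library("GenomicFeatures")

## Chromosome information for the default human dataset.  The call is
## wrapped in try() so that a connection failure does not stop the script.
chrominfo <- try(
    getChromInfoFromBiomart(biomart="ensembl",
                            dataset="hsapiens_gene_ensembl")
)
if (!inherits(chrominfo, "try-error"))
    print(head(chrominfo))

## 'filters' must be a named list: each name is a BioMart filter and each
## element holds the value(s) to match.  The filter name used here is an
## assumption about the dataset, not something documented on this page.
filters <- list(chromosome_name="21")
stopifnot(is.list(filters), !is.null(names(filters)))

## Restrict the transcript query to the filtered subset.
txdb_chr21 <- try(
    makeTranscriptDbFromBiomart(biomart="ensembl",
                                dataset="hsapiens_gene_ensembl",
                                filters=filters)
)
```

Checking the shape of filters before the call gives an early, local error instead of a failure part-way through a slow remote query.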