The Dfam REST API lets programs and scripts access information from the current release of the Dfam database. It provides the core functionality used by the dfam.org website and is offered for use in community-developed applications and workflows.
For more information on getting started with the Dfam API, see the documentation and examples at https://dfam.org/help/api
Robert Hubley, Arian Smit, Travis Wheeler, Jeb Rosen, Anthony Gray
Copyright 2016-2023 Institute for Systems Biology
Retrieve a list of families in Dfam, optionally filtered and sorted.
format | string Desired output format. Supported formats include "summary", "full", "embl", "fasta", and "hmm". Defaults to "summary". |
sort | string A string containing sort columns, for example "name:asc,length:desc". Sorting by any of "accession", "name", "length", "type", "subtype", "date_created", and "date_modified" is supported. If unspecified, "accession:asc" will be used. |
name | string Search term for any part of the family name. Takes precedence over "name_prefix" if both are specified. |
name_prefix | string Search term for a prefix of the family name. |
name_accession | string Search term for any part of the family name or accession. |
classification | string Search term for family classification. Sub-classifications are included. A full classification lineage is expected for this search term, such as "root;Interspersed_Repeat". |
clade | string Search term for family clade. Can be either an NCBI Taxonomy ID or scientific name. If the scientific name is ambiguous (e.g. "Drosophila"), the taxonomy ID should be used instead. |
clade_relatives | string Relatives of the requested clade to include: 'ancestors', 'descendants', or 'both'. |
type | string Search term for TE type, as understood by RepeatMasker. |
subtype | string Search term for TE subtype, as understood by RepeatMasker. |
updated_after | string <date> Filter by "updated on or after" date. |
updated_before | string <date> Filter by "updated on or before" date. |
desc | string Search term for family description. |
keywords | string Keywords to search in text fields (currently name, title, description, accession, author). |
include_raw | boolean Whether to include raw ("DR") families in the results. Default is false. |
start | integer Index of first record to return. Commonly used along with |
limit | integer Maximum number of records to return. |
download | boolean If true, adds headers to trigger a browser download. |
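The parameters above combine into an ordinary query string. The following is a minimal sketch, not an official client: the base URL `https://dfam.org/api` and the `/families` path are assumptions (this section does not state them), and `build_sort` and `families_url` are hypothetical helper names. Encoding booleans as "true"/"false" is also an assumption.

```python
from urllib.parse import urlencode

# Assumption: the base URL and the "/families" path are not stated in this section.
BASE_URL = "https://dfam.org/api"

# Sortable columns, as listed in the "sort" parameter description.
SORT_COLUMNS = {"accession", "name", "length", "type", "subtype",
                "date_created", "date_modified"}


def build_sort(columns):
    """Turn [("name", "asc"), ("length", "desc")] into "name:asc,length:desc"."""
    parts = []
    for column, direction in columns:
        if column not in SORT_COLUMNS:
            raise ValueError(f"unsupported sort column: {column}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"unsupported sort direction: {direction}")
        parts.append(f"{column}:{direction}")
    return ",".join(parts)


def families_url(**params):
    """Build a family-list URL, skipping parameters left as None."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Assumption: booleans are sent as the strings "true" / "false".
            value = "true" if value else "false"
        query[key] = value
    return f"{BASE_URL}/families?{urlencode(query)}"


url = families_url(
    format="summary",
    clade="9606",  # NCBI Taxonomy ID for Homo sapiens
    clade_relatives="ancestors",
    sort=build_sort([("name", "asc"), ("length", "desc")]),
    limit=20,
)
print(url)
```

Validating the sort columns locally gives an early, readable error rather than relying on whatever the server returns for an unknown column.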
{
  "total_count": 0,
  "results": [
    {
      "submitter": "submitter",
      "aliases": [
        { "database": "database", "alias": "alias" },
        { "database": "database", "alias": "alias" }
      ],
      "description": "description",
      "consensus_sequence": "consensus_sequence",
      "accession": "accession",
      "refineable": true,
      "title": "title",
      "source_assembly": { "hyperlink": "hyperlink", "label": "label" },
      "curation_state_name": "curation_state_name",
      "search_stages": [ { "name": "name" }, { "name": "name" } ],
      "features": [
        {
          "model_start_pos": 5,
          "description": "description",
          "attributes": [
            { "attribute": "attribute", "value": "value" },
            { "attribute": "attribute", "value": "value" }
          ],
          "label": "label",
          "type": "type",
          "model_end_pos": 9
        },
        {
          "model_start_pos": 5,
          "description": "description",
          "attributes": [
            { "attribute": "attribute", "value": "value" },
            { "attribute": "attribute", "value": "value" }
          ],
          "label": "label",
          "type": "type",
          "model_end_pos": 9
        }
      ],
      "citations": [
        { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" },
        { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" }
      ],
      "curation_state_description": "curation_state_description",
      "disabled": true,
      "target_site_cons": "target_site_cons",
      "source_method": "source_method",
      "author": "author",
      "date_created": "date_created",
      "length": 1,
      "source_method_description": "source_method_description",
      "coding_seqs": [
        {
          "cds_end": 3,
          "frameshifts": 1,
          "product": "product",
          "exon_count": 2,
          "protein_type": "protein_type",
          "stop_codons": 1,
          "left_unaligned": 7,
          "description": "description",
          "align_data": "align_data",
          "exon_starts": [ 4, 4 ],
          "reverse": true,
          "gaps": 1,
          "right_unaligned": 1,
          "cds_start": 9,
          "translation": "translation",
          "external_reference": "external_reference",
          "percent_identity": 6.84685269835264,
          "exon_ends": [ 7, 7 ],
          "classification_id": 4
        },
        {
          "cds_end": 3,
          "frameshifts": 1,
          "product": "product",
          "exon_count": 2,
          "protein_type": "protein_type",
          "stop_codons": 1,
          "left_unaligned": 7,
          "description": "description",
          "align_data": "align_data",
          "exon_starts": [ 4, 4 ],
          "reverse": true,
          "gaps": 1,
          "right_unaligned": 1,
          "cds_start": 9,
          "translation": "translation",
          "external_reference": "external_reference",
          "percent_identity": 6.84685269835264,
          "exon_ends": [ 7, 7 ],
          "classification_id": 4
        }
      ],
      "buffer_stages": [
        { "name": "name", "start": 5, "end": 2 },
        { "name": "name", "start": 5, "end": 2 }
      ],
      "classification": "classification",
      "version": 6,
      "repeat_type_name": "repeat_type_name",
      "hmm_general_threshold": 5.962133916683182,
      "date_modified": "date_modified",
      "clades": [ "clades", "clades" ],
      "model_mask": "model_mask",
      "name": "name",
      "repeat_subtype_name": "repeat_subtype_name"
    },
    {
      "submitter": "submitter",
      "aliases": [
        { "database": "database", "alias": "alias" },
        { "database": "database", "alias": "alias" }
      ],
      "description": "description",
      "consensus_sequence": "consensus_sequence",
      "accession": "accession",
      "refineable": true,
      "title": "title",
      "source_assembly": { "hyperlink": "hyperlink", "label": "label" },
      "curation_state_name": "curation_state_name",
      "search_stages": [ { "name": "name" }, { "name": "name" } ],
      "features": [
        {
          "model_start_pos": 5,
          "description": "description",
          "attributes": [
            { "attribute": "attribute", "value": "value" },
            { "attribute": "attribute", "value": "value" }
          ],
          "label": "label",
          "type": "type",
          "model_end_pos": 9
        },
        {
          "model_start_pos": 5,
          "description": "description",
          "attributes": [
            { "attribute": "attribute", "value": "value" },
            { "attribute": "attribute", "value": "value" }
          ],
          "label": "label",
          "type": "type",
          "model_end_pos": 9
        }
      ],
      "citations": [
        { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" },
        { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" }
      ],
      "curation_state_description": "curation_state_description",
      "disabled": true,
      "target_site_cons": "target_site_cons",
      "source_method": "source_method",
      "author": "author",
      "date_created": "date_created",
      "length": 1,
      "source_method_description": "source_method_description",
      "coding_seqs": [
        {
          "cds_end": 3,
          "frameshifts": 1,
          "product": "product",
          "exon_count": 2,
          "protein_type": "protein_type",
          "stop_codons": 1,
          "left_unaligned": 7,
          "description": "description",
          "align_data": "align_data",
          "exon_starts": [ 4, 4 ],
          "reverse": true,
          "gaps": 1,
          "right_unaligned": 1,
          "cds_start": 9,
          "translation": "translation",
          "external_reference": "external_reference",
          "percent_identity": 6.84685269835264,
          "exon_ends": [ 7, 7 ],
          "classification_id": 4
        },
        {
          "cds_end": 3,
          "frameshifts": 1,
          "product": "product",
          "exon_count": 2,
          "protein_type": "protein_type",
          "stop_codons": 1,
          "left_unaligned": 7,
          "description": "description",
          "align_data": "align_data",
          "exon_starts": [ 4, 4 ],
          "reverse": true,
          "gaps": 1,
          "right_unaligned": 1,
          "cds_start": 9,
          "translation": "translation",
          "external_reference": "external_reference",
          "percent_identity": 6.84685269835264,
          "exon_ends": [ 7, 7 ],
          "classification_id": 4
        }
      ],
      "buffer_stages": [
        { "name": "name", "start": 5, "end": 2 },
        { "name": "name", "start": 5, "end": 2 }
      ],
      "classification": "classification",
      "version": 6,
      "repeat_type_name": "repeat_type_name",
      "hmm_general_threshold": 5.962133916683182,
      "date_modified": "date_modified",
      "clades": [ "clades", "clades" ],
      "model_mask": "model_mask",
      "name": "name",
      "repeat_subtype_name": "repeat_subtype_name"
    }
  ]
}
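Because the response carries "total_count" alongside "results", a client can walk the full result set with the "start" and "limit" parameters. The sketch below assumes that "start" is a zero-based index (this section does not say) and leaves the HTTP call to a caller-supplied function; `iter_families`, `fetch_page`, and the stand-in records are hypothetical names used only for illustration.

```python
def iter_families(fetch_page, page_size=50):
    """Yield every family record, requesting one page at a time.

    fetch_page(start, limit) is a caller-supplied function that performs the
    request and returns the decoded JSON response: a dict with "total_count"
    and "results".
    """
    start = 0  # Assumption: record indexes are zero-based.
    while True:
        page = fetch_page(start, page_size)
        results = page["results"]
        yield from results
        start += len(results)
        # Stop on an empty page or once every record has been seen.
        if not results or start >= page["total_count"]:
            break


# Usage with an in-memory stand-in for the API (made-up accessions):
records = [{"accession": f"FAM{i}"} for i in range(5)]


def fake_fetch(start, limit):
    return {"total_count": len(records), "results": records[start:start + limit]}


print([r["accession"] for r in iter_families(fake_fetch, page_size=2)])
```

Checking for an empty page as well as "total_count" keeps the loop from spinning if the count changes between requests.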
Retrieve full details of an individual Dfam family.
id required | string The Dfam family accession. |
{
  "submitter": "submitter",
  "aliases": [
    { "database": "database", "alias": "alias" },
    { "database": "database", "alias": "alias" }
  ],
  "description": "description",
  "consensus_sequence": "consensus_sequence",
  "accession": "accession",
  "refineable": true,
  "title": "title",
  "source_assembly": { "hyperlink": "hyperlink", "label": "label" },
  "curation_state_name": "curation_state_name",
  "search_stages": [ { "name": "name" }, { "name": "name" } ],
  "features": [
    {
      "model_start_pos": 5,
      "description": "description",
      "attributes": [
        { "attribute": "attribute", "value": "value" },
        { "attribute": "attribute", "value": "value" }
      ],
      "label": "label",
      "type": "type",
      "model_end_pos": 9
    },
    {
      "model_start_pos": 5,
      "description": "description",
      "attributes": [
        { "attribute": "attribute", "value": "value" },
        { "attribute": "attribute", "value": "value" }
      ],
      "label": "label",
      "type": "type",
      "model_end_pos": 9
    }
  ],
  "citations": [
    { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" },
    { "journal": "journal", "pmid": 7, "title": "title", "authors": "authors", "pubdate": "pubdate" }
  ],
  "curation_state_description": "curation_state_description",
  "disabled": true,
  "target_site_cons": "target_site_cons",
  "source_method": "source_method",
  "author": "author",
  "date_created": "date_created",
  "length": 1,
  "source_method_description": "source_method_description",
  "coding_seqs": [
    {
      "cds_end": 3,
      "frameshifts": 1,
      "product": "product",
      "exon_count": 2,
      "protein_type": "protein_type",
      "stop_codons": 1,
      "left_unaligned": 7,
      "description": "description",
      "align_data": "align_data",
      "exon_starts": [ 4, 4 ],
      "reverse": true,
      "gaps": 1,
      "right_unaligned": 1,
      "cds_start": 9,
      "translation": "translation",
      "external_reference": "external_reference",
      "percent_identity": 6.84685269835264,
      "exon_ends": [ 7, 7 ],
      "classification_id": 4
    },
    {
      "cds_end": 3,
      "frameshifts": 1,
      "product": "product",
      "exon_count": 2,
      "protein_type": "protein_type",
      "stop_codons": 1,
      "left_unaligned": 7,
      "description": "description",
      "align_data": "align_data",
      "exon_starts": [ 4, 4 ],
      "reverse": true,
      "gaps": 1,
      "right_unaligned": 1,
      "cds_start": 9,
      "translation": "translation",
      "external_reference": "external_reference",
      "percent_identity": 6.84685269835264,
      "exon_ends": [ 7, 7 ],
      "classification_id": 4
    }
  ],
  "buffer_stages": [
    { "name": "name", "start": 5, "end": 2 },
    { "name": "name", "start": 5, "end": 2 }
  ],
  "classification": "classification",
  "version": 6,
  "repeat_type_name": "repeat_type_name",
  "hmm_general_threshold": 5.962133916683182,
  "date_modified": "date_modified",
  "clades": [ "clades", "clades" ],
  "model_mask": "model_mask",
  "name": "name",
  "repeat_subtype_name": "repeat_subtype_name"
}
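Once a family record like the one above has been decoded, its fields can be used directly. As an illustration only, the sketch below turns the "accession", "name", and "consensus_sequence" properties into a FASTA record; `family_to_fasta` is a hypothetical helper, the header layout is a choice made here rather than Dfam's own FASTA output, and the sample record is made up.

```python
def family_to_fasta(family, width=60):
    """Format a decoded family record as a FASTA string.

    Uses only the "accession", "name", and "consensus_sequence" properties.
    The header layout is an illustrative choice, not Dfam's own format.
    """
    sequence = family["consensus_sequence"]
    lines = [f">{family['accession']} {family['name']}"]
    # Wrap the sequence at a fixed line width.
    for i in range(0, len(sequence), width):
        lines.append(sequence[i:i + width])
    return "\n".join(lines) + "\n"


# Made-up record with an 80-base sequence:
example = {
    "accession": "EXAMPLE1",
    "name": "ExampleFamily",
    "consensus_sequence": "ACGT" * 20,
}
print(family_to_fasta(example), end="")
```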
Retrieve an individual Dfam family's annotated HMM.
id required | string The Dfam family accession. |
format required | string The desired output format: "hmm", "logo", or "image". |
download | boolean If true, adds headers to trigger a browser download. |
Retrieve an individual Dfam family's annotated consensus sequence. If only the raw sequence is needed, use the consensus_sequence property from the /families/{id} endpoint instead.
id required | string The Dfam family accession. |
format required | string The desired output format. "embl" and "fasta" are the currently supported formats. |
download | boolean If true, adds headers to trigger a browser download. |
Retrieve an individual Dfam family's seed alignment data.
id required | string The Dfam family accession. |
format required | string The format to return, one of 'stockholm' or 'alignment_summary'. |
download | boolean If true, adds headers to trigger a browser download. |
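The HMM, consensus sequence, and seed alignment endpoints each require a "format" value from a fixed set, so a client can reject bad combinations before sending a request. This is a sketch under stated assumptions: the base URL and the path segments `hmm`, `sequence`, and `seed` are not given in this section and are assumed here, and `family_resource_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

# Assumption: the base URL is not stated in this section.
BASE_URL = "https://dfam.org/api"

# Allowed "format" values, as listed in the parameter descriptions above.
# Assumption: the path segment names used as keys here are guesses.
FORMATS = {
    "hmm": {"hmm", "logo", "image"},
    "sequence": {"embl", "fasta"},
    "seed": {"stockholm", "alignment_summary"},
}


def family_resource_url(accession, resource, fmt, download=False):
    """Build a URL for one family's HMM, sequence, or seed alignment."""
    allowed = FORMATS.get(resource)
    if allowed is None:
        raise ValueError(f"unknown resource: {resource}")
    if fmt not in allowed:
        raise ValueError(f"format {fmt!r} is not valid for {resource}")
    query = {"format": fmt}
    if download:
        query["download"] = "true"
    return f"{BASE_URL}/families/{quote(accession, safe='')}/{resource}?{urlencode(query)}"


print(family_resource_url("ACCESSION", "seed", "stockholm"))
```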
Retrieve an individual Dfam family's relationship information.
id required | string The Dfam family accession. |
include | string Which families to include. "all" searches all of Dfam, and "related" searches only families that are found in ancestor or descendant clades of the one this family belongs to. Default is "all". |
include_raw | boolean Whether to include matches to raw ("DR") families. Default is false. |
[
  {
    "coverage": "coverage",
    "target_start": 6,
    "strand": "strand",
    "identity": "identity",
    "cigar": "cigar",
    "target_end": 5,
    "model_end": 1,
    "evalue": "evalue",
    "auto_overlap": {
      "model": { "length": 5, "id": "id_9", "accession": "accession" },
      "target": { "length": 2, "id": "id_10", "accession": "accession" }
    },
    "model_start": 0
  }
]
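A relationship response is a flat array, so filtering and ranking it is a client-side task. The sketch below keeps matches under an E-value cutoff and orders them from most to least significant. It assumes the "evalue" strings parse as floating-point numbers (for example "1e-30"), which the placeholder sample does not confirm; `best_relationships` and the sample values are hypothetical.

```python
def best_relationships(matches, max_evalue=1e-5):
    """Return (target accession, strand, E-value) tuples, best match first.

    Assumption: each "evalue" string can be parsed by float().
    """
    kept = [m for m in matches if float(m["evalue"]) <= max_evalue]
    kept.sort(key=lambda m: float(m["evalue"]))
    return [
        (m["auto_overlap"]["target"]["accession"], m["strand"], float(m["evalue"]))
        for m in kept
    ]


# Made-up matches carrying only the fields this sketch reads:
sample = [
    {"evalue": "0.5", "strand": "+", "auto_overlap": {"target": {"accession": "A"}}},
    {"evalue": "1e-30", "strand": "+", "auto_overlap": {"target": {"accession": "B"}}},
    {"evalue": "2e-10", "strand": "-", "auto_overlap": {"target": {"accession": "C"}}},
]
print(best_relationships(sample))
```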
Retrieve a list of genome assemblies with annotations for a Dfam family.
id required | string The Dfam family accession. |
[
  {
    "hmm_hit_ga": 0.8008281904610115,
    "name": "name",
    "hmm_fdr": 1.4658129805029452,
    "id": "id_11",
    "hmm_hit_tc": 6.027456183070403
  }
]
Retrieve a family's annotation statistics for all assemblies it is annotated in.
id required | string The Dfam family accession. |
[
  {
    "hmm_hit_ga": 0.8008281904610115,
    "stats": {
      "hmm": {
        "divergence": 5.962133916683182,
        "gathering_nonredundant": 5,
        "trusted_nonredundant": 7,
        "trusted_all": 9,
        "gathering_all": 2
      }
    },
    "name": "name",
    "hmm_fdr": 1.4658129805029452,
    "id": "id_12",
    "hmm_hit_tc": 6.027456183070403
  }
]
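The per-assembly statistics above lend themselves to simple summaries. As a sketch, the function below ranks assemblies by the "gathering_nonredundant" count found under "stats" and "hmm"; `rank_assemblies` and the sample identifiers and numbers are hypothetical.

```python
def rank_assemblies(stats_list):
    """Return (assembly id, name, gathering_nonredundant) rows, largest first."""
    rows = [
        (entry["id"], entry["name"], entry["stats"]["hmm"]["gathering_nonredundant"])
        for entry in stats_list
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up entries carrying only the fields this sketch reads:
sample_stats = [
    {"id": "asm_a", "name": "Assembly A", "stats": {"hmm": {"gathering_nonredundant": 120}}},
    {"id": "asm_b", "name": "Assembly B", "stats": {"hmm": {"gathering_nonredundant": 4500}}},
]
for assembly_id, name, count in rank_assemblies(sample_stats):
    print(f"{assembly_id}\t{name}\t{count}")
```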
Retrieve a family's coverage data associated with a given assembly.
id required | string The Dfam family accession. |
assembly_id required | string The assembly identifier, as shown in /families/{id}/assemblies. |
model required | string Model type, "cons" or "hmm". |
{
  "all": "all",
  "false": "false",
  "nrph": "nrph"
}
Retrieve a family's conservation data associated with a given assembly.
id required | string The Dfam family accession. |
assembly_id required | string The assembly identifier, as shown in /families/{id}/assemblies. |
model required | string Model type, "cons" or "hmm". |
[
  {
    "max_insert": 0,
    "num_seqs": 6,
    "threshold": "threshold",
    "graph": "graph"
  }
]
Retrieve a family's annotations associated with a given assembly.
id required | string The Dfam family accession. |
assembly_id required | string The assembly identifier, as shown in /families/{id}/assemblies. |
nrph required | boolean "true" to include only non-redundant profile hits. |
download | boolean If true, adds headers to trigger a browser download. |
Retrieve a family's annotation statistics associated with a given assembly.
id required | string The Dfam family accession. |
assembly_id required | string The assembly identifier, as shown in /families/{id}/assemblies. |
{
  "hmm": {
    "divergence": 5.962133916683182,
    "gathering_nonredundant": 5,
    "trusted_nonredundant": 7,
    "trusted_all": 9,
    "gathering_all": 2
  }
}
Retrieve a family's karyotype data associated with a given assembly.
id required | string The Dfam family accession. |
assembly_id required | string The assembly identifier, as shown in /families/{id}/assemblies. |
{ }
Retrieve the entire TE classification hierarchy used by Dfam.
name | string Classification name to search for. If given, the results will be returned as an array instead of the default hierarchical format. |
{
  "hyperlink": "hyperlink",
  "repbase_equiv": "repbase_equiv",
  "aliases": "aliases",
  "tooltip": "tooltip",
  "count": "count",
  "description": "description",
  "piegu_equiv": "piegu_equiv",
  "full_name": "full_name",
  "repeatmasker_subtype": "repeatmasker_subtype",
  "children": [ null, null ],
  "repeatmasker_type": "repeatmasker_type",
  "curcio_derbyshire_equiv": "curcio_derbyshire_equiv",
  "name": "name",
  "wicker_equiv": "wicker_equiv"
}
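In the default hierarchical format, the classification can be traversed recursively. The sketch below assumes that each entry in "children" is a node with the same shape as its parent (the sample above shows only null placeholders, which are skipped); `iter_classes` and the miniature tree are hypothetical.

```python
def iter_classes(node, depth=0):
    """Yield (depth, full_name) for a classification node and its descendants.

    Assumption: "children" holds nested nodes shaped like their parent.
    Missing, empty, or null children are skipped.
    """
    yield depth, node["full_name"]
    for child in node.get("children") or []:
        if child is not None:
            yield from iter_classes(child, depth + 1)


# Made-up miniature hierarchy:
tree = {
    "full_name": "root",
    "children": [
        {"full_name": "root;Interspersed_Repeat", "children": [None]},
        {"full_name": "root;Other"},
    ],
}
for depth, full_name in iter_classes(tree):
    print("  " * depth + full_name)
```

The "full_name" values collected this way are the kind of full lineage that the "classification" search term of the family list expects.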
Query Dfam's copy of the NCBI taxonomy database.
name required | string Search string for taxonomy name. |
annotated | boolean Whether only taxa with annotated assemblies should be returned. |
limit | integer Only return up to a maximum number of matching taxa. |
{
  "taxa": [
    { "name": "name", "id": 2 },
    { "name": "name", "id": 3 }
  ]
}
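A taxonomy lookup is a natural first step before filtering families by clade, since an ambiguous scientific name should be replaced by its taxonomy ID. The sketch below resolves a name against a decoded response and refuses to guess when several taxa share it. `resolve_taxon` is a hypothetical helper, the case-insensitive exact match is a choice made here, and the names and IDs in the sample are made up.

```python
def resolve_taxon(taxa_response, name):
    """Return the taxonomy ID whose name matches exactly (ignoring case).

    Raises LookupError when there is no match or more than one match.
    """
    wanted = name.lower()
    matches = [t for t in taxa_response["taxa"] if t["name"].lower() == wanted]
    if not matches:
        raise LookupError(f"no taxon named {name!r}")
    if len(matches) > 1:
        ids = ", ".join(str(t["id"]) for t in matches)
        raise LookupError(f"{name!r} is ambiguous; candidate IDs: {ids}")
    return matches[0]["id"]


# Made-up response; the IDs are placeholders, not real taxonomy IDs.
response = {"taxa": [
    {"name": "Examplus", "id": 1},
    {"name": "Examplus minor", "id": 2},
    {"name": "Duplicata", "id": 3},
    {"name": "Duplicata", "id": 4},
]}
print(resolve_taxon(response, "examplus"))
```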
Retrieve annotations for a given genome assembly in a given range.
assembly required | string Genome assembly to search. A list of assemblies is available at |
chrom required | string Chromosome to search. Assembly dependent, but normally in the "chrN" format. |
start required | integer Start of the sequence range (one-based). |
end required | integer End of the sequence range (one-based, fully closed). |
family | string An optional family to restrict results to. |
nrph | boolean |
{
  "hits": [
    {
      "bit_score": 1.4658129805029452,
      "ali_start": 2,
      "ali_end": 7,
      "query": "query",
      "e_value": "e_value",
      "model_end": 5,
      "accession": "accession",
      "type": "type",
      "seq_end": 3,
      "sequence": "sequence",
      "strand": "strand",
      "seq_start": 9,
      "model_start": 5
    },
    {
      "bit_score": 1.4658129805029452,
      "ali_start": 2,
      "ali_end": 7,
      "query": "query",
      "e_value": "e_value",
      "model_end": 5,
      "accession": "accession",
      "type": "type",
      "seq_end": 3,
      "sequence": "sequence",
      "strand": "strand",
      "seq_start": 9,
      "model_start": 5
    }
  ],
  "tandem_repeats": [
    { "repeat_length": 7, "sequence": "sequence", "start": 2, "end": 4, "type": "type" },
    { "repeat_length": 7, "sequence": "sequence", "start": 2, "end": 4, "type": "type" }
  ],
  "offset": 0,
  "query": "query",
  "length": 6
}
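The range parameters are one-based and fully closed, which differs from the zero-based, half-open convention used by Python slices and by formats such as BED. The sketch below shows the arithmetic; both function names are hypothetical.

```python
def range_length(start, end):
    """Number of bases in a one-based, fully closed range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def to_zero_based_half_open(start, end):
    """Convert a one-based, fully closed range to zero-based, half-open."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


# Bases 3 through 6 of a toy sequence:
sequence = "AACCGGTT"
lo, hi = to_zero_based_half_open(3, 6)
print(range_length(3, 6), sequence[lo:hi])
```

Only the start coordinate shifts: in a half-open range the end is exclusive, so the closed one-based end and the half-open zero-based end are the same number.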
Query the alignment of a family to an assembly. This API is meant for use only on dfam.org.
assembly required | string Genome assembly to align to. A list of assemblies is available at |
chrom required | string Chromosome to align to. |
start required | integer Start of the sequence range (one-based). |
end required | integer End of the sequence range (one-based, fully closed). |
family required | string The family to align against. |
{
  "pp": { "string": "string" },
  "match": { "string": "string" },
  "hmm": { "string": "string", "start": 0, "end": 6, "id": "id1a" },
  "seq": { "string": "string", "start": 1, "end": 5, "id": "id_2" }
}
Submit a sequence search request.
sequence required | string Sequence data to search, in FASTA format. |
organism required | string Source organism of the sequence, used for determining search thresholds. Uses the organism-specific E-value
cutoff required | string Type of cutoff to use, either 'curated' or 'evalue'. |
evalue | string E-value cutoff to use in the search. Only effective if |
{
  "id": "id_3"
}
Retrieve the results of a sequence search.
id required | string A result set ID matching the ID returned by a previous search submission. |
{
  "hits": [
    {
      "bit_score": 1.4658129805029452,
      "ali_start": 2,
      "ali_end": 7,
      "query": "query",
      "e_value": "e_value",
      "model_end": 5,
      "accession": "accession",
      "type": "type",
      "seq_end": 3,
      "sequence": "sequence",
      "strand": "strand",
      "seq_start": 9,
      "model_start": 5
    },
    {
      "bit_score": 1.4658129805029452,
      "ali_start": 2,
      "ali_end": 7,
      "query": "query",
      "e_value": "e_value",
      "model_end": 5,
      "accession": "accession",
      "type": "type",
      "seq_end": 3,
      "sequence": "sequence",
      "strand": "strand",
      "seq_start": 9,
      "model_start": 5
    }
  ],
  "tandem_repeats": [
    { "repeat_length": 7, "sequence": "sequence", "start": 2, "end": 4, "type": "type" },
    { "repeat_length": 7, "sequence": "sequence", "start": 2, "end": 4, "type": "type" }
  ],
  "offset": 0,
  "query": "query",
  "length": 6
}
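A sequence search is a two-step exchange: the submission returns an "id", and that ID is then used to retrieve the results. The sketch below wires the two steps together with a simple polling loop. It assumes results may not be available immediately, which this section does not state, and represents that case by the caller-supplied `fetch_results` returning None. `run_search`, `submit`, and `fetch_results` are hypothetical names; the HTTP details are left to the caller.

```python
import time


def run_search(submit, fetch_results, sequence, organism, cutoff="curated",
               evalue=None, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Submit a search, then poll until results are available.

    submit(form) sends the request and returns the decoded response
    containing "id". fetch_results(search_id) returns the decoded results,
    or None while they are not ready (an assumption of this sketch).
    """
    if cutoff not in ("curated", "evalue"):
        raise ValueError("cutoff must be 'curated' or 'evalue'")
    form = {"sequence": sequence, "organism": organism, "cutoff": cutoff}
    if evalue is not None:
        form["evalue"] = str(evalue)
    search_id = submit(form)["id"]
    for _ in range(max_polls):
        results = fetch_results(search_id)
        if results is not None:
            return results
        sleep(poll_seconds)
    raise TimeoutError(f"search {search_id} did not finish in time")


# Usage with in-memory stand-ins: results become ready on the third poll.
polls = {"count": 0}


def fake_submit(form):
    return {"id": "search-1"}


def fake_fetch_results(search_id):
    polls["count"] += 1
    return {"hits": [], "tandem_repeats": []} if polls["count"] >= 3 else None


fasta = ">query1\nACGTACGT\n"
print(run_search(fake_submit, fake_fetch_results, fasta, "Homo sapiens",
                 sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.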
Retrieve an alignment from a sequence search result.
id required | string The result set ID returned by the API at submission time. |
sequence required | string The name of the input sequence to align. |
start required | integer Start of the sequence range. |
end required | integer End of the sequence range. |
family required | string The family to align against. |
{
  "pp": { "string": "string" },
  "match": { "string": "string" },
  "hmm": { "string": "string", "start": 0, "end": 6, "id": "id1a" },
  "seq": { "string": "string", "start": 1, "end": 5, "id": "id_2" }
}
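An alignment response carries four parallel strings: the model ("hmm"), the match line, the sequence ("seq"), and the posterior probabilities ("pp"). The sketch below stacks them into fixed-width blocks for display. It assumes the four strings have equal length, which the placeholder sample does not confirm; `format_alignment` and the toy alignment are hypothetical.

```python
def format_alignment(aln, width=60):
    """Render an alignment response as stacked, fixed-width text blocks.

    Assumption: the "hmm", "match", "seq", and "pp" strings are equally long.
    """
    rows = [
        (aln["hmm"]["id"], aln["hmm"]["string"]),
        ("", aln["match"]["string"]),
        (aln["seq"]["id"], aln["seq"]["string"]),
        ("PP", aln["pp"]["string"]),
    ]
    # Right-align labels so the alignment columns line up.
    label_width = max(len(label) for label, _ in rows)
    blocks = []
    for i in range(0, len(rows[0][1]), width):
        blocks.append("\n".join(
            f"{label:>{label_width}} {text[i:i + width]}" for label, text in rows
        ))
    return "\n\n".join(blocks)


# Made-up four-column alignment, rendered two columns per block:
toy = {
    "hmm": {"string": "ACGT", "start": 1, "end": 4, "id": "m"},
    "match": {"string": "|| |"},
    "seq": {"string": "ACTT", "start": 10, "end": 13, "id": "s1"},
    "pp": {"string": "9998"},
}
print(format_alignment(toy, width=2))
```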