Dfam REST API (0.4.0)

The Dfam REST API provides a means for programs and scripts to access information from the current release of the Dfam database. It provides the core functionality used by the dfam.org website and is offered for use in community-developed applications and workflows.

For more information on getting started with the Dfam API, see the documentation and examples at https://dfam.org/help/api

Known Consumers

https://dfam.org/

Authors

Robert Hubley, Arian Smit, Travis Wheeler, Jeb Rosen, Anthony Gray

Copyright 2016-2023 Institute for Systems Biology

version

API specification

Return the version of the API.

Responses

Response samples

Content type
application/json
{
  "major": "major",
  "minor": "minor",
  "bugfix": "bugfix"
}

blog

Access to blog posts

Retrieve a list of recent Dfam blog posts. This API is intended for use only at dfam.org.

Responses

Response samples

Content type
application/json
[
  { }
]

families

Access to Dfam families

Retrieve a list of families in Dfam, optionally filtered and sorted.

query Parameters
format
string

Desired output format. Supported formats include "summary", "full", "embl", "fasta", and "hmm". Defaults to "summary".

sort
string

A string containing sort columns, for example "name:asc,length:desc". Sorting by any of "accession", "name", "length", "type", "subtype", "date_created", and "date_modified" is supported. If unspecified, "accession:asc" is used.

name
string

Search term for any part of the family name. Takes precedence over "name_prefix" if both are specified.

name_prefix
string

Search term for a prefix of the family name.

name_accession
string

Search term for any part of the family name or accession.

classification
string

Search term for family classification. Sub-classifications are included. A full classification lineage is expected for this search term, such as "root;Interspersed_Repeat".

clade
string

Search term for family clade. Can be either an NCBI Taxonomy ID or scientific name. If the scientific name is ambiguous (e.g. "Drosophila"), the taxonomy ID should be used instead.

clade_relatives
string

Relatives of the requested clade to include: "ancestors", "descendants", or "both".

type
string

Search term for TE type, as understood by RepeatMasker.

subtype
string

Search term for TE subtype, as understood by RepeatMasker.

updated_after
string <date>

Filter by "updated on or after" date.

updated_before
string <date>

Filter by "updated on or before" date.

desc
string

Search term for family description.

keywords
string

Keywords to search in text fields (currently name, title, description, accession, author).

include_raw
boolean

Whether to include raw ("DR") families in the results. Default is false.

start
integer

Index of first record to return. Commonly used along with limit to implement paging.

limit
integer

Maximum number of records to return.

download
boolean

If true, adds headers to trigger a browser download.

Responses

Response samples

Content type
{
  "total_count": 0,
  "results": [ ]
}
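
The filter, sort, and paging parameters above can be combined into a single query string. The following is a minimal sketch, not an official client: the base URL `https://dfam.org/api`, the `/families` path, the zero-based `start` index, and the lowercase `true`/`false` encoding of booleans are all assumptions, and the helper names (`sort_param`, `families_url`, `page_starts`) are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"

# Sortable columns, as listed in the description of "sort".
SORT_COLUMNS = {
    "accession", "name", "length", "type",
    "subtype", "date_created", "date_modified",
}


def sort_param(*columns):
    """Format (column, direction) pairs as e.g. "name:asc,length:desc"."""
    parts = []
    for column, direction in columns:
        if column not in SORT_COLUMNS:
            raise ValueError(f"cannot sort by {column!r}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")
        parts.append(f"{column}:{direction}")
    return ",".join(parts)


def families_url(**params):
    """Build a family listing URL, skipping parameters left as None."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Assumption: booleans are sent as lowercase "true"/"false".
        query[key] = str(value).lower() if isinstance(value, bool) else value
    suffix = f"?{urlencode(query)}" if query else ""
    return f"{BASE_URL}/families{suffix}"


def page_starts(total_count, limit):
    """Yield the "start" value of each page (assumed to be zero-based)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    yield from range(0, total_count, limit)


# Example values only; "Alu" is just an illustrative name prefix.
url = families_url(
    name_prefix="Alu",
    sort=sort_param(("name", "asc"), ("length", "desc")),
    start=0,
    limit=20,
)
print(url)
print(list(page_starts(45, 20)))
```

A caller would typically read `total_count` from the first response and then request one page per value produced by `page_starts`.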

Retrieve full details of an individual Dfam family.

path Parameters
id
required
string

The Dfam family accession.

Responses

Response samples

Content type
application/json
{
  "submitter": "submitter",
  "aliases": [ ],
  "description": "description",
  "consensus_sequence": "consensus_sequence",
  "accession": "accession",
  "refineable": true,
  "title": "title",
  "source_assembly": { },
  "curation_state_name": "curation_state_name",
  "search_stages": [ ],
  "features": [ ],
  "citations": [ ],
  "curation_state_description": "curation_state_description",
  "disabled": true,
  "target_site_cons": "target_site_cons",
  "source_method": "source_method",
  "author": "author",
  "date_created": "date_created",
  "length": 1,
  "source_method_description": "source_method_description",
  "coding_seqs": [ ],
  "buffer_stages": [ ],
  "classification": "classification",
  "version": 6,
  "repeat_type_name": "repeat_type_name",
  "hmm_general_threshold": 5.962133916683182,
  "date_modified": "date_modified",
  "clades": [ ],
  "model_mask": "model_mask",
  "name": "name",
  "repeat_subtype_name": "repeat_subtype_name"
}
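
Because the family detail response carries the raw consensus in its `consensus_sequence` property, a client can turn a decoded response into a FASTA record locally. This is a rough sketch under assumptions: the base URL is not given in the text above, the accession `DF0000001` and the family name are placeholders, and `family_url` and `consensus_to_fasta` are hypothetical helper names.

```python
from urllib.parse import quote

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"


def family_url(accession):
    """Build the /families/{id} URL, percent-encoding the accession."""
    return f"{BASE_URL}/families/{quote(accession, safe='')}"


def consensus_to_fasta(family, width=60):
    """Render a decoded family detail object as a FASTA record."""
    sequence = family["consensus_sequence"]
    header = f">{family['accession']} {family['name']}"
    # Split the sequence into fixed-width lines.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header, *lines]) + "\n"


# Placeholder data shaped like the response sample; not a real family.
example = {
    "accession": "DF0000001",
    "name": "ExampleFamily",
    "consensus_sequence": "ACGT" * 20,
}
print(family_url(example["accession"]))
print(consensus_to_fasta(example), end="")
```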

Retrieve an individual Dfam family's annotated HMM.

path Parameters
id
required
string

The Dfam family accession.

query Parameters
format
required
string

The desired output format: "hmm", "logo", or "image".

download
boolean

If true, adds headers to trigger a browser download.

Responses

Response samples

Content type
No sample

Retrieve an individual Dfam family's annotated consensus sequence.

If only the raw sequence is needed, use the consensus_sequence property from the /families/{id} endpoint instead.

path Parameters
id
required
string

The Dfam family accession.

query Parameters
format
required
string

The desired output format. "embl" and "fasta" are the currently supported formats.

download
boolean

If true, adds headers to trigger a browser download.

Responses

Retrieve an individual Dfam family's seed alignment data.

path Parameters
id
required
string

The Dfam family accession.

query Parameters
format
required
string

The format to return, one of 'stockholm' or 'alignment_summary'.

download
boolean

If true, adds headers to trigger a browser download.
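
The HMM, consensus sequence, and seed alignment endpoints each require a `format` value from a fixed set, so a client can check the value before sending a request. The sketch below is illustrative only: the format names come from the parameter descriptions above, but the base URL and the path segments `hmm`, `sequence`, and `seed` are assumptions, and `family_resource_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"

# Allowed formats per sub-resource. The format names are taken from the
# parameter descriptions; the sub-resource path names are assumptions.
FORMATS = {
    "hmm": {"hmm", "logo", "image"},
    "sequence": {"embl", "fasta"},
    "seed": {"stockholm", "alignment_summary"},
}


def family_resource_url(accession, resource, fmt, download=False):
    """Build a URL for one family sub-resource after validating the format."""
    allowed = FORMATS.get(resource)
    if allowed is None:
        raise ValueError(f"unknown resource: {resource!r}")
    if fmt not in allowed:
        raise ValueError(f"{resource!r} supports {sorted(allowed)}, got {fmt!r}")
    query = {"format": fmt}
    if download:
        # Assumption: the boolean is sent as lowercase "true".
        query["download"] = "true"
    path = f"/families/{quote(accession, safe='')}/{resource}"
    return f"{BASE_URL}{path}?{urlencode(query)}"


# "DF0000001" is a placeholder accession.
print(family_resource_url("DF0000001", "seed", "stockholm"))
```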

Responses

Retrieve an individual Dfam family's relationship information.

path Parameters
id
required
string

The Dfam family accession.

query Parameters
include
string

Which families to include. "all" searches all of Dfam, and "related" searches only families that are found in ancestor or descendant clades of the one this family belongs to. Default is "all".

include_raw
boolean

Whether to include matches to raw ("DR") families. Default is false.

Responses

Response samples

Content type
application/json
[
  { }
]

assemblies

Access to stored genome assembly details

Retrieve a list of all genome assemblies in Dfam that have annotations.

Responses

Response samples

Content type
application/json
[
  { }
]

familyAssemblies

Access to species-specific family characteristics

Retrieve a list of genome assemblies with annotations for a Dfam family.

path Parameters
id
required
string

The Dfam family accession.

Responses

Response samples

Content type
application/json
[
  { }
]

Retrieve a family's annotation statistics for all assemblies it is annotated in.

path Parameters
id
required
string

The Dfam family accession.

Responses

Response samples

Content type
application/json
[
  { }
]

Retrieve a family's coverage data associated with a given assembly.

path Parameters
id
required
string

The Dfam family accession.

assembly_id
required
string

The assembly identifier, as shown in /families/{id}/assemblies.

query Parameters
model
required
string

Model type, "cons" or "hmm".

Responses

Response samples

Content type
application/json
{
  "all": "all",
  "false": "false",
  "nrph": "nrph"
}

Retrieve a family's conservation data associated with a given assembly.

path Parameters
id
required
string

The Dfam family accession.

assembly_id
required
string

The assembly identifier, as shown in /families/{id}/assemblies.

query Parameters
model
required
string

Model type, "cons" or "hmm".

Responses

Response samples

Content type
application/json
[
  { }
]

Retrieve a family's annotations associated with a given assembly.

path Parameters
id
required
string

The Dfam family accession.

assembly_id
required
string

The assembly identifier, as shown in /families/{id}/assemblies.

query Parameters
nrph
required
boolean

"true" to include only non-redundant profile hits.

download
boolean

If true, adds headers to trigger a browser download.

Responses

Retrieve a family's annotation statistics associated with a given assembly.

path Parameters
id
required
string

The Dfam family accession.

assembly_id
required
string

The assembly identifier, as shown in /families/{id}/assemblies.

Responses

Response samples

Content type
application/json
{
  "hmm": { }
}

Retrieve a family's karyotype data associated with a given assembly.

path Parameters
id
required
string

The Dfam family accession.

assembly_id
required
string

The assembly identifier, as shown in /families/{id}/assemblies.

Responses

Response samples

Content type
application/json
{ }

classification

Access to Dfam TE classifications

Retrieve the entire TE classification hierarchy used by Dfam.

query Parameters
name
string

Classification name to search for. If given, the results will be returned as an array instead of the default hierarchical format.

Responses

Response samples

Content type
application/json
{
  "hyperlink": "hyperlink",
  "repbase_equiv": "repbase_equiv",
  "aliases": "aliases",
  "tooltip": "tooltip",
  "count": "count",
  "description": "description",
  "piegu_equiv": "piegu_equiv",
  "full_name": "full_name",
  "repeatmasker_subtype": "repeatmasker_subtype",
  "children": [ ],
  "repeatmasker_type": "repeatmasker_type",
  "curcio_derbyshire_equiv": "curcio_derbyshire_equiv",
  "name": "name",
  "wicker_equiv": "wicker_equiv"
}
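
In its default hierarchical form, the classification response is a tree that can be walked recursively. The sketch below assumes that each entry in `children` is a node with the same fields as its parent and that leaf nodes may have an empty or missing `children` list; `iter_classes` is a hypothetical helper and the small tree is made-up data shaped like the sample.

```python
def iter_classes(node, depth=0):
    """Yield (depth, node) for a classification node and all its descendants."""
    yield depth, node
    # Assumption: children are nodes of the same shape; leaves may omit the key.
    for child in node.get("children", []):
        yield from iter_classes(child, depth + 1)


# Illustrative data only, using the lineage format mentioned for the
# "classification" search parameter.
tree = {
    "name": "root",
    "full_name": "root",
    "children": [
        {
            "name": "Interspersed_Repeat",
            "full_name": "root;Interspersed_Repeat",
            "children": [],
        },
    ],
}

for depth, node in iter_classes(tree):
    print("  " * depth + node["name"])
```

The `full_name` of each visited node is the kind of full lineage string that the `classification` filter on the family listing expects.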

taxa

Access to Dfam taxonomy

Query Dfam's copy of the NCBI taxonomy database.

query Parameters
name
required
string

Search string for taxonomy name.

annotated
boolean

Whether only taxa with annotated assemblies should be returned.

limit
integer

Only return up to a maximum number of matching taxa.

Responses

Response samples

Content type
application/json
{
  "taxa": [ ]
}
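
A taxonomy search can be used to turn a scientific name into the taxonomy ID that the `clade` filter accepts, which avoids the ambiguous-name case. The following is a sketch under several assumptions: the base URL and `/taxa` path are not stated above, each entry of `taxa` is assumed to carry `id` and `name` fields like the single-taxon response, and `taxa_url` and `resolve_clade` are hypothetical helpers.

```python
from urllib.parse import urlencode

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"


def taxa_url(name, annotated=None, limit=None):
    """Build a taxonomy search URL; optional parameters are omitted when None."""
    query = {"name": name}
    if annotated is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        query["annotated"] = "true" if annotated else "false"
    if limit is not None:
        query["limit"] = limit
    return f"{BASE_URL}/taxa?{urlencode(query)}"


def resolve_clade(name, response):
    """Return the ID of the single taxon whose name equals `name` exactly."""
    matches = [taxon["id"] for taxon in response["taxa"] if taxon["name"] == name]
    if len(matches) != 1:
        raise LookupError(
            f"{len(matches)} taxa named {name!r}; pass a taxonomy ID instead"
        )
    return matches[0]


# Made-up response in the assumed shape (the IDs are NCBI Taxonomy IDs).
response = {"taxa": [{"id": 9605, "name": "Homo"}, {"id": 9606, "name": "Homo sapiens"}]}
print(taxa_url("Homo sapiens", annotated=True, limit=5))
print(resolve_clade("Homo sapiens", response))
```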

Retrieve statistics on Dfam's coverage of species.

Responses

Response samples

Content type
application/json
{
  "count": 0.8008281904610115
}

Retrieve the name of a single taxon by its identifier.

path Parameters
id
required
integer

The identifier of the taxonomy node to retrieve.

Responses

Response samples

Content type
application/json
{
  "name": "name",
  "id": 1
}

annotations

Access to stored genome annotations

Retrieve annotations for a given genome assembly in a given range.

query Parameters
assembly
required
string

Genome assembly to search. A list of assemblies is available at /assemblies.

chrom
required
string

Chromosome to search. Assembly dependent, but normally in the "chrN" format.

start
required
integer

Start of the sequence range (one-based).

end
required
integer

End of the sequence range (one-based, fully closed).

family
string

An optional family to restrict results to.

nrph
boolean

"true" to exclude redundant profile hits.

Responses

Response samples

Content type
application/json
{
  "hits": [ ],
  "tandem_repeats": [ ],
  "offset": 0,
  "query": "query",
  "length": 6
}
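
Since `start` and `end` are one-based and fully closed, coordinates from zero-based, half-open sources (such as BED files) need converting before they are used in a query. The sketch below shows that conversion and a query builder; the base URL and `/annotations` path are assumptions, the assembly and chromosome values are examples only, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"


def bed_to_one_based(start0, end0):
    """Convert a zero-based, half-open interval to one-based, fully closed."""
    return start0 + 1, end0


def range_length(start, end):
    """Number of positions in a one-based, fully closed range."""
    return end - start + 1


def annotations_url(assembly, chrom, start, end, family=None, nrph=None):
    """Build an annotation query for a one-based, fully closed range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    query = {"assembly": assembly, "chrom": chrom, "start": start, "end": end}
    if family is not None:
        query["family"] = family
    if nrph is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        query["nrph"] = "true" if nrph else "false"
    return f"{BASE_URL}/annotations?{urlencode(query)}"


# Example values only.
start, end = bed_to_one_based(0, 100)
print(annotations_url("hg38", "chr3", start, end, nrph=True))
print(range_length(start, end))
```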

alignment

Access to family-assembly alignment data

Query the alignment of a family to an assembly. This API is meant for use only on dfam.org.

query Parameters
assembly
required
string

Genome assembly to align to. A list of assemblies is available at /assemblies.

chrom
required
string

Chromosome to align to.

start
required
integer

Start of the sequence range (one-based).

end
required
integer

End of the sequence range (one-based, fully closed).

family
required
string

The family to align against.

Responses

Response samples

Content type
application/json
{
  "pp": { },
  "match": { },
  "hmm": { },
  "seq": { }
}

searches

Access to search submission system

Submit a sequence search request.

Request Body schema: application/x-www-form-urlencoded
required
sequence
required
string

Sequence data to search, in FASTA format.

organism
required
string

Source organism of the sequence, used for determining search thresholds. Uses the organism-specific E-value.

cutoff
required
string

Type of cutoff to use, either 'curated' or 'evalue'.

evalue
string

E-value cutoff to use in the search. Only effective if cutoff is set to 'evalue'. Used when organism is not specified.

Responses

Response samples

Content type
application/json
{
  "id": "id_3"
}
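
A search submission is sent as an `application/x-www-form-urlencoded` body. The sketch below prepares such a request with the Python standard library without sending it; the base URL and `/searches` path are assumptions, the sequence and organism are example values, and `search_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the API root; the text above does not give a base URL.
BASE_URL = "https://dfam.org/api"


def search_request(sequence, organism, cutoff="curated", evalue=None):
    """Prepare (but do not send) a form-encoded search submission."""
    if cutoff not in ("curated", "evalue"):
        raise ValueError("cutoff must be 'curated' or 'evalue'")
    fields = {"sequence": sequence, "organism": organism, "cutoff": cutoff}
    # The E-value is described as effective only with the 'evalue' cutoff,
    # so it is left out of the body otherwise.
    if cutoff == "evalue" and evalue is not None:
        fields["evalue"] = str(evalue)
    body = urlencode(fields).encode("ascii")
    return Request(
        f"{BASE_URL}/searches",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Example values only.
request = search_request(">query1\nACGTACGTACGT\n", "Homo sapiens", cutoff="evalue", evalue="0.001")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing the prepared request to `urllib.request.urlopen` would submit it; the `id` field of the JSON response is the value expected by the result retrieval endpoints.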

Retrieve the results of a sequence search.

path Parameters
id
required
string

A result set ID matching the ID returned by a previous search submission.

Responses

Response samples

Content type
application/json
{
  "hits": [ ],
  "tandem_repeats": [ ],
  "offset": 0,
  "query": "query",
  "length": 6
}

Retrieve an alignment from a sequence search result.

path Parameters
id
required
string

The result set ID returned by the API at submission time.

query Parameters
sequence
required
string

The name of the input sequence to align.

start
required
integer

Start of the sequence range.

end
required
integer

End of the sequence range.

family
required
string

The family to align against.

Responses

Response samples

Content type
application/json
{
  "pp": { },
  "match": { },
  "hmm": { },
  "seq": { }
}