Manuals

This page contains instructions for using and interpreting data from the Canadian Marine DNA Library. Refer to the CaMDL_Manual for more in-depth information about the CaMDL project, including who is involved and where the specimens currently in the database came from.

The West Coast Ocean Biomolecular Observing Network (WCOBON) is a UN Decade of the Ocean endorsed project within OBON. WCOBON has just released the Introduction to Developing DNA Reference Barcode Sequences 2025. This guide provides a framework for generating and disseminating voucher-based DNA reference barcode sequences. Please refer to this guide when starting any work to generate reference DNA sequences to submit to CaMDL or other repositories intended to support eDNA metabarcoding applications.

An Introduction to Developing Reference Barcode Sequences

Download

Canadian Marine DNA Library Manual

Download

How to use the Canadian Marine DNA Library

Building a custom reference library:

  1. Navigate to the page “Search the CaMDL database”
  2. Fill in the search criteria to meet the needs of your study. The only required fields are “Genetic Data” (i.e. the gene you want to download) and “Specimen Collection Information” (i.e. the Province and/or State in which you are working, plus any adjacent ones).
  3. Search results are displayed on a new page. Results can be further refined using any of the search criteria on the left-hand side.
  4. Click on the records to keep for the custom library (or click all at the top).
  5. Download the custom reference dataset using the two buttons at the bottom. The fasta file contains the sequences in FASTA format and can easily be converted to whatever format a given application needs (e.g. in silico PCR, taxonomic assignment of OTUs or ESVs). The metadata file is a csv formatted document with all metadata associated with each record downloaded. The metadata is standardized, and several terms are required for all database entries. See the ‘Resources’ page for a template of the database fields and metadata explanations.
  6. Note that CaMDL records contain complete or near-complete (90% or greater) single genes from the mitochondrial genome, not complete circularized mitogenomes. Building a custom reference library requires access to individual genes of interest, which would be harder to extract if only complete mitogenomes were stored. Each gene of interest is also accessioned in GenBank as a single gene. Where available, complete circularized mitogenomes have also been submitted to GenBank, and their accession numbers are included in the genetic record metadata. See the metadata terms (Appendix 1 in ‘CaMDL_Manual.pdf’) to learn which term holds the mitogenome accession number and which holds the single-gene accession number.

Downloading and interpreting search results:

When constructing a custom reference library, always download both the Genetic Data (FASTA format) and the Metadata (csv) for the records selected from the search results. These files work together to ensure the genetic data is properly linked with its description.

 

Genetic Data (FASTA format): In FASTA format the line before the nucleotide sequence, called the FASTA definition line, must begin with a greater-than symbol (“>”), followed by a unique sequence identifier. For sequences downloaded from CaMDL, the sequence identifier is always the occurrenceID, which is a globally unique identifier for each sequence in CaMDL. The same occurrenceID identifies the corresponding record in the associated metadata csv file. This is the only relational element between the Genetic Data and Metadata files, as sequences alone aren’t guaranteed to be unique.
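The link between the two downloads can be sketched in Python as follows. This is a minimal illustration, not part of the CaMDL tooling: it assumes the metadata csv has a column named occurrenceID and that the identifier on each definition line contains no whitespace. The function names (read_fasta, link_records) and the sample records are hypothetical.

```python
import csv
import io


def read_fasta(handle):
    """Return {identifier: sequence} read from an open FASTA file."""
    chunks = {}
    current_id = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("definition line has no identifier")
            # Assumption: the occurrenceID is the first token after ">".
            current_id = tokens[0]
            if current_id in chunks:
                raise ValueError(f"duplicate identifier: {current_id}")
            chunks[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before a definition line")
        else:
            # Sequences may be wrapped over several lines; collect the pieces.
            chunks[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def link_records(fasta_handle, metadata_handle, id_column="occurrenceID"):
    """Pair each sequence with its metadata row and report unmatched IDs."""
    sequences = read_fasta(fasta_handle)
    metadata = {row[id_column]: row for row in csv.DictReader(metadata_handle)}
    linked = {
        seq_id: (sequences[seq_id], metadata[seq_id])
        for seq_id in sequences
        if seq_id in metadata
    }
    fasta_only = sorted(set(sequences) - set(metadata))
    metadata_only = sorted(set(metadata) - set(sequences))
    return linked, fasta_only, metadata_only


# Made-up records standing in for the two downloaded files.
example_fasta = io.StringIO(">ID-0001\nACGTAC\nGGTA\n>ID-0002\nTTGACA\n")
example_csv = io.StringIO(
    "occurrenceID,scientificName\n"
    "ID-0001,Species one\n"
    "ID-0003,Species three\n"
)
linked, fasta_only, metadata_only = link_records(example_fasta, example_csv)
print(len(linked), fasta_only, metadata_only)
```

With real downloads, pass open file handles in place of the in-memory examples. Any identifiers reported in fasta_only or metadata_only suggest that the two files came from different selections.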

The FASTA formatted Genetic Data file is ready to be re-formatted to suit the needs of taxonomic assignment. Follow instructions here to format FASTA files into a BLAST database for use with BLAST+ software.
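As one possible illustration (not a substitute for the linked instructions), the sketch below calls the BLAST+ program makeblastdb from Python. It assumes BLAST+ is installed and on the PATH; the file and database names are hypothetical. The options used (-in, -dbtype, -out, -title) are standard makeblastdb options.

```python
import shutil
import subprocess


def makeblastdb_command(fasta_path, db_name, title=None):
    """Build the argument list for a nucleotide BLAST database."""
    command = ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]
    if title is not None:
        command += ["-title", title]
    return command


def build_blast_db(fasta_path, db_name, title=None):
    """Run makeblastdb, failing early if BLAST+ is not installed."""
    if shutil.which("makeblastdb") is None:
        raise RuntimeError("makeblastdb not found; install BLAST+ and add it to PATH")
    # check=True raises CalledProcessError if makeblastdb exits with an error.
    subprocess.run(makeblastdb_command(fasta_path, db_name, title), check=True)


# Hypothetical file names; print the command rather than running it here.
print(" ".join(makeblastdb_command("camdl_custom.fasta", "camdl_custom_db")))
```

Calling build_blast_db("camdl_custom.fasta", "camdl_custom_db") would then create the database files next to the given output name.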

 

Metadata (csv): Each gene record returned as a search result has associated metadata. All metadata terms are aligned with the Darwin Core (DwC) data standard terminology used by the Ocean Biodiversity Information System (OBIS) and with the “Minimum Information about any (X) Sequence” (MIxS) specification generated by the Genomic Standards Consortium (Yilmaz et al., 2011).

More information about the Darwin Core Archive biodiversity informatics data standard can be found on the GBIF website (Darwin Core Archives – How-to Guide :: GBIF IPT User Manual).

Metadata terms can be found in Appendix 1 of ‘CaMDL_Manual.pdf’ and as an Excel spreadsheet in the ‘Resources’ section of the website. Terms are classified as required, recommended, or optional. The spreadsheet includes the term name (which is also the name used in any downloaded metadata file), the definition, and an example of the text.
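A downloaded metadata file can be checked against the required terms with a short script. The sketch below assumes you copy the required term names from the spreadsheet yourself; the names used in the example (occurrenceID, scientificName, basisOfRecord) are placeholders for illustration, not the actual CaMDL list, and check_required_terms is a hypothetical helper.

```python
import csv
import io


def check_required_terms(metadata_handle, required_terms):
    """Return (missing_columns, incomplete_rows) for a metadata csv.

    incomplete_rows maps a 1-based data row number to the required
    terms that are blank in that row.
    """
    reader = csv.DictReader(metadata_handle)
    columns = reader.fieldnames or []
    missing_columns = [term for term in required_terms if term not in columns]
    present = [term for term in required_terms if term in columns]
    incomplete_rows = {}
    for row_number, row in enumerate(reader, start=1):
        # Short rows yield None for absent cells, so treat None as blank.
        blank = [term for term in present if not (row[term] or "").strip()]
        if blank:
            incomplete_rows[row_number] = blank
    return missing_columns, incomplete_rows


# Made-up metadata standing in for a downloaded csv file.
example_csv = io.StringIO(
    "occurrenceID,scientificName\n"
    "ID-0001,Species one\n"
    "ID-0002,\n"
)
placeholder_terms = ["occurrenceID", "scientificName", "basisOfRecord"]
missing_columns, incomplete_rows = check_required_terms(example_csv, placeholder_terms)
print(missing_columns, incomplete_rows)
```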