Create a user-defined analysis workflow for use with the Metagenomics research application
The metagenomics analysis workflow provides access to 2 preinstalled reference databases for mapping: the curated MicroSEQ™ ID database and the curated GreenGenes database. You can customize the analysis workflow to make any of the following changes:
-
Upload user-defined reference files, then map samples to any combination of user-defined and preinstalled reference databases for metagenomics research.
If multiple reference databases are selected, data are first mapped against the first selected reference in the list. Reads that did not map to the first database are then mapped against the next selected database, and so on down the list, until the sample has been mapped against every selected database.
-
Add primer sequences that were used to prepare the samples for metagenomics analysis workflows.
Note: You can create an analysis workflow that does not include any primer information; however, it is recommended that you always add primer information to a metagenomics analysis workflow. When primer information is absent, no trimming is performed on your reads. A warning message appears during analysis review when primer information is missing.
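The sequential mapping described above, in which each database only receives the reads left unmapped by the databases before it, can be illustrated with a minimal Python sketch. This is not the software's implementation: `map_in_order`, `toy_maps_to`, and the toy read names are hypothetical, and the real mapping step is replaced by a simple lookup.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mapper: returns True when a read maps to the named reference.
Mapper = Callable[[str, str], bool]


def map_in_order(
    reads: List[str],
    selected_references: List[str],
    maps_to: Mapper,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Map reads against each reference in list order.

    Only reads left unmapped by earlier references are tried against later ones.
    Returns the reads assigned to each reference and the reads that never mapped.
    """
    assigned: Dict[str, List[str]] = {ref: [] for ref in selected_references}
    remaining = list(reads)
    for ref in selected_references:
        still_unmapped = []
        for read in remaining:
            if maps_to(read, ref):
                assigned[ref].append(read)
            else:
                still_unmapped.append(read)
        remaining = still_unmapped
    return assigned, remaining


# Toy stand-in for a real aligner (assumption): a lookup of which reads hit which database.
toy_hits = {
    "MicroSEQ ID": {"read1", "read3"},
    "GreenGenes": {"read1", "read2"},
}


def toy_maps_to(read: str, ref: str) -> bool:
    return read in toy_hits[ref]


assigned, unmapped = map_in_order(
    ["read1", "read2", "read3", "read4"],
    ["MicroSEQ ID", "GreenGenes"],
    toy_maps_to,
)
```

In this toy run, `read1` could map to both databases but is assigned only to the first one in the list, which is why the order of the selected references matters.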
- In the Workflows tab, click Overview.
-
In the Workflows table, select the metagenomics analysis workflow that you want to copy, then click .
The analysis workflow is copied and the Edit workflow bar opens to the Research Application step. Ensure that Metagenomics is selected in the Research Application section, then review the selection in Sample Groups, then click Next.
- (Optional) To upload a user-defined reference database in FASTA format, in the Reference step, click Upload, then browse to, select, then upload the user-defined reference database file.
-
Select one or more reference databases in the Available References list, then click the arrow to add the reference to the Selected References list. Click Next when you have added all references that you want to use for the analysis.
The order of the Selected References determines the order in which the sample is mapped against reference databases.
-
In the Primers step, select a primer option, then click Next.
Option | Description
Use No Primers | Select if no primers were used in library preparation.
Use Custom Primers | Select to provide your own primers. If primer sequences are provided, the sequences are trimmed from the reads before mapping occurs in the software. For more information about the file format of the primers, see Custom primer sequences for Metagenomics analysis workflows.
Use Default Primers | Select to use proprietary primers that are included by default.
-
If you select Use Custom Primers, do one of the following to enter custom primer sequences:
-
Enter individual primer sequences directly into the Paste FASTA Sequences text box. For example:
>MyFavoriteV5_forward
ACTCGGTCCARACTGAGACT
>MyFavoriteV5_Rev
TTACCGRGGCGTATGCGG
>MyFavoriteV8_Fwd
CCARAACTCGGTCTGSGACT
>MyFavoriteV8_r
RGGCGTATGCSTACCGGG
-
The names of forward primers must end in _f*. Reverse primer names must end in _r*. Apart from this suffix, primers in a pair must have identical names so that the software can match the primers during the analysis.
-
Upload a FASTA file that contains primers. For more information about the file format of the primers, see Custom primer sequences for Metagenomics analysis workflows.
-
- Click Next.
- In the Parameters step, make any desired changes to the Metagenomics parameters, then click Next.
- Enter a name for the analysis workflow, and an (optional) description, then click Save Workflow.
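The Primers step states that provided primer sequences are trimmed from the reads before mapping. How the software performs the trimming is not described here, so the following Python sketch only illustrates the idea under stated assumptions: the primer is expected at the start of the read, no mismatches are tolerated, and degenerate bases such as R and S are interpreted with the standard IUPAC nucleotide codes. The function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer: str) -> re.Pattern:
    """Translate a primer that may contain degenerate bases into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def trim_leading_primer(read: str, primer: str) -> str:
    """Remove the primer from the start of the read if it matches exactly.

    Assumption: exact prefix match only; reads without the primer are returned unchanged.
    """
    match = primer_to_regex(primer).match(read.upper())
    return read[match.end():] if match else read


# Forward primer taken from the example in the Primers step; the read is made up.
primer = "ACTCGGTCCARACTGAGACT"
read = "ACTCGGTCCAGACTGAGACT" + "GGGTTTAAA"
trimmed = trim_leading_primer(read, primer)
```

Here the degenerate base R in the primer matches the G in the read, so the first 20 bases are removed and `trimmed` holds the remaining `GGGTTTAAA`.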
Custom primer sequences for Metagenomics analysis workflows
You can upload a set of primer sequences that were used to prepare your samples for metagenomics analysis workflows. Primer sequences can be uploaded from a FASTA file or entered individually into the software. If primer sequences are provided, the sequences are trimmed from the reads before mapping occurs in the software. The names of forward primers must end in _f* and reverse primer names must end in _r*. Primers in a pair must otherwise have identical names so that the software can match the primers during the analysis.
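To check a primer set against the naming rule before uploading it, a short script along the following lines may help. It is a sketch, not the software's matching logic: it assumes the direction suffix is the last `_f...` or `_r...` part of the name, and that the match is case-insensitive (suggested by the `_Rev` and `_Fwd` names in the example, but not stated explicitly). `pair_primers` is a hypothetical helper.

```python
import re
from typing import Dict

# Assumption: the suffix is the final "_f..." or "_r..." segment, matched case-insensitively.
SUFFIX = re.compile(r"^(?P<base>.+)_(?P<dir>[fr])[^_]*$", re.IGNORECASE)


def pair_primers(primers: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group primers by the shared part of their names.

    Returns {base_name: {"forward": seq, "reverse": seq}}; raises ValueError
    for a name that ends in neither _f* nor _r*.
    """
    pairs: Dict[str, Dict[str, str]] = {}
    for name, sequence in primers.items():
        match = SUFFIX.match(name)
        if match is None:
            raise ValueError(f"Primer name must end in _f* or _r*: {name!r}")
        direction = "forward" if match.group("dir").lower() == "f" else "reverse"
        pairs.setdefault(match.group("base"), {})[direction] = sequence
    return pairs


# Primer names and sequences from the example in the procedure.
example = {
    "MyFavoriteV5_forward": "ACTCGGTCCARACTGAGACT",
    "MyFavoriteV5_Rev": "TTACCGRGGCGTATGCGG",
    "MyFavoriteV8_Fwd": "CCARAACTCGGTCTGSGACT",
    "MyFavoriteV8_r": "RGGCGTATGCSTACCGGG",
}
pairs = pair_primers(example)

# Base names that are missing either the forward or the reverse primer.
unpaired = [base for base, found in pairs.items() if len(found) != 2]
```

For the example set, the primers group under `MyFavoriteV5` and `MyFavoriteV8`, and `unpaired` is empty.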
IMPORTANT! The header for each custom reference sequence must include at least the following information, where sequence is the base-pair sequence:
>mg|Genus|Species|
sequence
The header can include the following additional information, if available:
>mg|Genus|Species|Subspecies/Strain|Accession#|Kingdom|Phylum|Class|Order|Family|PubMed#|LibraryID#|
sequence
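The header layout above can be checked before upload with a small parser. The following Python sketch assumes the fields are positional, in the order shown, and that the header ends with a trailing `|`; how the software treats skipped optional fields is not specified here, so empty optional fields are simply kept as empty strings. `parse_reference_header` and the placeholder values in the example are hypothetical.

```python
from typing import Dict, List

# Field order taken from the extended header format above.
FIELDS: List[str] = [
    "Genus", "Species", "Subspecies/Strain", "Accession#", "Kingdom",
    "Phylum", "Class", "Order", "Family", "PubMed#", "LibraryID#",
]


def parse_reference_header(header: str) -> Dict[str, str]:
    """Split a '>mg|Genus|Species|...' header into named fields.

    Raises ValueError if the prefix, Genus, or Species is missing, or if
    there are more fields than the format defines.
    """
    header = header.strip()
    if not header.startswith(">mg|"):
        raise ValueError("Header must start with '>mg|'")
    parts = header[len(">mg|"):].split("|")
    # A trailing '|' produces one empty final element; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("Header must include at least Genus and Species")
    if len(parts) > len(FIELDS):
        raise ValueError("Header has more fields than the format defines")
    return dict(zip(FIELDS, parts))


# Minimal header, and a longer one with placeholder (made-up) strain and accession values.
minimal = parse_reference_header(">mg|Escherichia|coli|")
extended = parse_reference_header(">mg|Escherichia|coli|StrainX|ACC0001|Bacteria|")
```

`minimal` contains only `Genus` and `Species`, while `extended` also carries `Subspecies/Strain`, `Accession#`, and `Kingdom`.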
