Bulk Immunomics
Our OP² immune repertoire sequencing pipeline is a bioinformatics workflow that characterizes the complementarity-determining regions (CDRs) of B-cell receptors (BCRs) or T-cell receptors (TCRs).
It provides insights into the quality of your data and an overview of clonotype content, somatic hypermutation rates, amino acid properties, and gene usage.
The workflow processes raw data from FASTQ input files, aligns the sequences to the germline database, and then performs immune profiling of the samples.
The results are delivered in two reports, and the data is provided in the standardized AIRR format for downstream analyses.
The pre-processing workflow processes the raw sequence data up to the point where the sequences are aligned against the IMGT germline reference.
The post-processing workflow provides a set of analyses and metrics that describe the basic characteristics of the immune repertoire.
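Because the data is delivered in the AIRR format, it can be read with standard tooling. The following is a minimal sketch, not part of the pipeline itself, of how an AIRR rearrangement TSV might be loaded for a downstream analysis. The column names (`sequence_id`, `productive`, `v_call`, `j_call`, `junction_aa`) come from the AIRR Community Rearrangement schema; the example rows and the helper names `read_productive` and `gene_of` are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row excerpt of an AIRR rearrangement TSV.
# Real files contain many more columns and rows.
EXAMPLE_TSV = (
    "sequence_id\tproductive\tv_call\tj_call\tjunction_aa\n"
    "seq1\tT\tIGHV3-23*01\tIGHJ4*02\tCAKDGYSSGWYFDYW\n"
    "seq2\tT\tIGHV1-69*01,IGHV1-69*06\tIGHJ6*02\tCARGGYYYGMDVW\n"
    "seq3\tF\tIGHV3-23*01\tIGHJ4*02\t\n"
)


def read_productive(handle):
    """Yield rows marked productive (AIRR TSV encodes booleans as T/F)."""
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["productive"] == "T":
            yield row


def gene_of(call):
    """Reduce an allele call such as 'IGHV3-23*01' to its gene name.

    Assumption: when several comma-separated calls are listed,
    the first one is used.
    """
    return call.split(",")[0].split("*")[0]


rows = list(read_productive(io.StringIO(EXAMPLE_TSV)))
v_usage = Counter(gene_of(r["v_call"]) for r in rows)
print(v_usage.most_common())
```

To read a delivered file instead of the inline example, pass an open file handle (`open(path, newline="")`) to `read_productive`.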
Pre-processing workflow
1. Input
   Paired-end raw FASTQ files (compressed)
   V-region and C-region primer sequences in FASTA format
2. Sequence QC
   Low-quality reads are discarded
3. Primer Masking
   The PCR primers are identified and masked
4. Generation of UMI consensus sequences
   If UMIs were used, consensus sequences are generated
5. Duplicate Removal and Filtering
   Duplicate nucleotide sequences are collapsed, and only unique sequences supported by at least 2 contributing sequences are retained
6. Alignment
   Assembled sequences are aligned against the IMGT germline reference database
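The duplicate removal and filtering step (step 5 above) can be illustrated with a short sketch. It assumes that two reads count as duplicates only when their nucleotide strings are identical after upper-casing; how the pipeline treats ambiguous bases or UMI information at this step is not described here, so the function `collapse_and_filter` and its `min_count` parameter are illustrative names, not the pipeline's interface.

```python
from collections import Counter


def collapse_and_filter(sequences, min_count=2):
    """Collapse identical nucleotide sequences and filter by support.

    Returns a dict mapping each retained unique sequence to the number
    of input sequences that contributed to it. Sequences supported by
    fewer than min_count inputs are dropped.
    """
    # Assumption: duplicates are exact, case-insensitive string matches.
    counts = Counter(seq.upper() for seq in sequences)
    return {seq: n for seq, n in counts.items() if n >= min_count}


reads = ["ACGTAC", "ACGTAC", "acgtac", "TTGACA", "GGCATG", "GGCATG"]
print(collapse_and_filter(reads))  # {'ACGTAC': 3, 'GGCATG': 2}
```

The singleton `TTGACA` is discarded because only one read supports it, which mirrors the "at least 2 contributing sequences" rule.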
Post-processing workflow
1. Input
   AIRR TSV file with the germline gene annotations
2. CDRH3 Overlap
   Visualizes the overlap between files at the CDRH3 amino acid level
3. Clonotyping
   Assigns clonotypes and visualizes the clonotype composition and the clone/clonotype accumulation curve
4. Somatic Hypermutation
   Reports somatic hypermutation rates, computed against the germline gene annotations
5. Amino Acid Physicochemical Properties
   Provides an overview of potential protein interaction metrics, focusing on 9 properties of the CDRH3
6. V Gene Usage
   Identifies and visualizes the most frequently used V genes in the data
7. Diversity Analysis
   Infers the clonal abundance distribution to assess the diversity of each sample
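To make the overlap, clonotyping, and diversity steps above more concrete, here is a simplified sketch under stated assumptions. The clonotype definition (same V gene, same J gene, identical junction amino acid sequence), the use of the AIRR `junction_aa` field as a stand-in for the CDRH3, the Jaccard index as the overlap measure, and Hill numbers as the diversity measure are all illustrative choices; the pipeline's own definitions and statistics may differ. All function names and example rows are hypothetical.

```python
import math
from collections import Counter


def clonotype_key(row):
    """Assumed clonotype definition: V gene, J gene, junction amino acids."""
    v = row["v_call"].split(",")[0].split("*")[0]
    j = row["j_call"].split(",")[0].split("*")[0]
    return (v, j, row["junction_aa"])


def hill_diversity(counts, q):
    """Hill number of order q for a collection of clonotype counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # The limit q -> 1 is the exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1 / (1 - q))


def jaccard(a, b):
    """Jaccard index between two collections of amino acid sequences."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


sample_a = [
    {"v_call": "IGHV3-23*01", "j_call": "IGHJ4*02", "junction_aa": "CAKDGYW"},
    {"v_call": "IGHV3-23*04", "j_call": "IGHJ4*02", "junction_aa": "CAKDGYW"},
    {"v_call": "IGHV1-69*01", "j_call": "IGHJ6*02", "junction_aa": "CARGGMDVW"},
    {"v_call": "IGHV4-34*01", "j_call": "IGHJ5*02", "junction_aa": "CARWFDPW"},
]
sample_b = [
    {"v_call": "IGHV3-23*01", "j_call": "IGHJ4*02", "junction_aa": "CAKDGYW"},
    {"v_call": "IGHV3-7*01", "j_call": "IGHJ3*02", "junction_aa": "CTRDLLW"},
]

clone_sizes = Counter(clonotype_key(r) for r in sample_a)
sizes = list(clone_sizes.values())
overlap = jaccard(
    (r["junction_aa"] for r in sample_a),
    (r["junction_aa"] for r in sample_b),
)

print("clonotypes:", len(clone_sizes))
print("richness (q=0):", hill_diversity(sizes, 0))
print("overlap:", overlap)
```

Here the two `IGHV3-23` reads differ only at the allele level, so they fall into the same clonotype under the assumed definition. Orders q=0, q=1, and q=2 of the Hill number correspond to richness, the exponential of the Shannon entropy, and the inverse Simpson index, respectively.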