Title

Benchmarking bioinformatic tools for amplicon-based high-throughput sequencing of norovirus

ORCID

https://orcid.org/0000-0002-1883-0489

Department

Biological Sciences

Year of Study

4

Full-time or Part-time Study

Full-time

Level

Postgraduate

Presentation Type

Oral Presentation

Supervisor

Prof. Paul Cotter

Supervisor

Dr Sinead Keaveney

Supervisor

Dr Helen O'Shea

Abstract

To survey noroviruses in the environment using High Throughput Sequencing (HTS), it is essential that our methods, both wet-lab and computational, are fit for purpose. In this work, we evaluated pipelines and classifiers for the genotypic characterisation of the norovirus VP1 region using simulated sequencing data.

The denoising-based pipelines DADA2, Deblur and USEARCH-UNOISE3 were included alongside the clustering-based pipelines VSEARCH and FROGS. The NoroNet and CaliciNet classifiers were compared to the QIIME2 feature-classifier with standard and custom databases. Pipeline outputs were compared to the expected sequences and composition using several measures: Bray-Curtis distance, weighted and unweighted UniFrac, and a confusion matrix.
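As a minimal illustration of one of the measures named above, the Bray-Curtis dissimilarity between an expected and an observed composition can be sketched in Python as follows. The function name, genotype labels and read counts are hypothetical and chosen only for illustration; they are not data from this study, and this is not necessarily how the measure was computed in the work itself.

```python
def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity between two count profiles.

    Each profile is a dict mapping a label (e.g. a genotype) to a
    non-negative read count. Returns 0.0 for identical profiles and
    1.0 for profiles that share no labels.
    """
    labels = set(expected) | set(observed)
    # Sum of absolute count differences over every label seen in either profile.
    diff = sum(abs(expected.get(k, 0) - observed.get(k, 0)) for k in labels)
    total = sum(expected.values()) + sum(observed.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    return diff / total


# Hypothetical toy profiles (illustrative counts only, not study data).
expected = {"GII.4": 60, "GII.17": 40}
observed = {"GII.4": 50, "GII.17": 30, "GI.3": 20}
score = bray_curtis(expected, observed)
```

With these toy counts, the observed profile contains a label (`GI.3`) that is absent from the expected profile, which is the kind of false positive that raises the dissimilarity above zero.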

Contrary to the expected performance of clustering versus denoising methods, clustering approaches produced data more closely reflecting the expected composition on all measures, both similarity/dissimilarity distances and phylogenetic measures. VSEARCH performed best in terms of similarity to the expected composition. However, FROGS produced sequences and compositions distinctly different from those of all other pipelines. The impact of reduced depth of coverage on performance was assessed for VSEARCH, and no differences were found in composition, phylogenetic similarity or taxonomic assignment. Classification was more strongly affected by the database than by the classification method. The QIIME2 feature-classifier provided 99% agreement with the NoroNet typing tool to capsid designation level. Disagreement increased with the inclusion of capsid variant designation.
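The distinction between agreement at capsid designation level and at capsid variant level can be sketched as below. This is only an assumed way of computing such agreement: the function name, the `(capsid, variant)` tuple representation and the example calls are hypothetical, and do not reflect the output formats of QIIME2 or the NoroNet typing tool.

```python
def agreement(calls_a, calls_b, use_variant=False):
    """Fraction of sequences given the same label by two classifiers.

    Each call is a (capsid, variant) tuple; variant may be None.
    With use_variant=False only the capsid designation is compared.
    """
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")

    def key(call):
        capsid, variant = call
        return (capsid, variant) if use_variant else capsid

    matches = sum(key(a) == key(b) for a, b in zip(calls_a, calls_b))
    return matches / len(calls_a)


# Hypothetical calls for four sequences (illustrative only, not study data).
tool_a = [("GII.4", "Sydney"), ("GII.4", "Sydney"), ("GII.17", None), ("GI.3", None)]
tool_b = [("GII.4", "Sydney"), ("GII.4", "New Orleans"), ("GII.17", None), ("GI.3", None)]

capsid_level = agreement(tool_a, tool_b)
variant_level = agreement(tool_a, tool_b, use_variant=True)
```

In this toy case the two tools agree on every capsid designation but differ on one variant call, so agreement is lower once the variant is included in the comparison.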

VSEARCH provides a robust option for analysing viral amplicons. Pipeline choice mattered: it was associated with false positives (DADA2) and sub-standard classification (FROGS). The QIIME2 feature-classifier is a viable alternative to external classification; however, maintenance of the input database is essential.

Keywords:

norovirus, High Throughput Sequencing, in-silico, sensitivity

Start Date

14-6-2022 9:30 AM

End Date

14-6-2022 9:45 AM

Comments

Oral Session 1 - Technological Advancements in Food and Health Research
