**TITLE**\
Next-generation sequencing technologies in the One Health context: a
scoping review of current practices on zoonotic diseases
**AUTHORS**\
Stefano Catalano<sup>1,\*</sup>, Francesca Battelli<sup>2</sup>, Zoumana I
Traore<sup>2</sup>, Jayna Raghwani<sup>3,4</sup>, Christina L
Faust<sup>1</sup>, Claire J Standley<sup>2,5</sup>\
\
**AFFILIATIONS**\
1 School of Biodiversity, One Health and Veterinary Medicine, University
of Glasgow, Glasgow, UK\
2 Center for Global Health Science and Security, Georgetown University,
Washington, DC, USA\
3 Department of Pathobiology and Population Sciences, Royal Veterinary
College, University of London, London, UK\
4 Department of Biology, University of Oxford, Oxford, UK\
5 Heidelberg Institute of Global Health, Heidelberg University,
Heidelberg, Germany\
\*Corresponding author. Email: stefano.catalano@glasgow.ac.uk
**FILES**\
This dataset includes three .csv files, all of which can be opened with
Microsoft Excel.
"NGSscoping-included-Enlighten.csv" lists the articles retained after
full-text screening and their extracted data. "NGSscoping-excluding-Enlighten.csv"
lists all the articles that were excluded during full-text screening and
the reason for each exclusion. "NGSscoping-legend-Enlighten.csv"
includes the description of each data extraction column and a list of
abbreviations.
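As a minimal sketch of how these files could be explored outside Excel, the Python snippet below counts the values in one column of a CSV file. The function name `tally_column` and the column name `reason` are hypothetical; the real column names are described in the legend file.

```python
import csv
from collections import Counter
from typing import Iterable


def tally_column(lines: Iterable[str], column: str) -> Counter:
    """Count how often each value occurs in one column of a CSV source."""
    reader = csv.DictReader(lines)
    return Counter(row[column].strip() for row in reader)


# In-memory demonstration rows; "reason" is a hypothetical column name.
sample = [
    "title,reason",
    "Study A,No human samples",
    "Study B,Review",
    "Study C,Review",
]
counts = tally_column(sample, "reason")
print(counts.most_common())
```

For the real data, an open file handle can be passed in place of `sample`, for example `open("NGSscoping-excluding-Enlighten.csv", newline="", encoding="utf-8")`, together with the relevant column name taken from the legend file.\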
**METHODS**\
**Search strategy**\
We followed published guidance on conducting (Munn *et al*, 2018; Foo *et
al*, 2021) and reporting (Tricco *et al*, 2018) evidence synthesis.
First, we searched the PROSPERO database to determine whether our research
questions had already been addressed by a registered review (Moher
*et al*, 2014). Then, we searched the following search engines, selected
for their broad, multidisciplinary coverage and their classification as
principal resources (Gusenbauer & Haddaway, 2020): PubMed and Web of
Science (Web of Science Core Collection selected within the platform).
We intentionally did not use the Medical Subject Headings (MeSH)
database in PubMed, so that non-indexed articles would also be captured
by our search.\
To ensure that the search string could capture all relevant
articles, two authors examined the full text of several studies on
zoonotic transmission dynamics (i.e., Yadav *et al*, 2019; Durrant *et
al*, 2020; Kim *et al*, 2020; Medkour *et al*, 2020; Gee *et al*, 2021;
Zhang *et al*, 2022). This exercise identified the
terminology used in our search string and inclusion criteria,
which were then discussed and finalised with the other authors
(Supplementary Table 1). The final search string applied by our study
was the following: (ecolog\* OR evolution\* OR epidemiolog\*) AND
("transmission" OR "surveillance") AND (zoono\* OR "disease" OR
infect\*) AND ("molecular" OR genetic\* OR genom\* OR metagenom\*) AND
(phylogen\* OR phylodynamic\* OR phylogeograph\*) AND ("reads" OR
librar\* OR align\* OR polymorph\* OR "next generation") NOT (Sanger OR
microsatellite\*).\
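The Boolean structure of the search string above can be sketched in Python as OR-blocks joined by AND, followed by a NOT block. This is only an illustration of how the string is composed; the helper names `or_block` and `build_query` are hypothetical, and the snippet does not query PubMed or Web of Science.

```python
# Term groups copied from the search string; "*" is the truncation wildcard.
AND_GROUPS = [
    ["ecolog*", "evolution*", "epidemiolog*"],
    ['"transmission"', '"surveillance"'],
    ["zoono*", '"disease"', "infect*"],
    ['"molecular"', "genetic*", "genom*", "metagenom*"],
    ["phylogen*", "phylodynamic*", "phylogeograph*"],
    ['"reads"', "librar*", "align*", "polymorph*", '"next generation"'],
]
NOT_GROUP = ["Sanger", "microsatellite*"]


def or_block(terms):
    """Join alternative terms into one parenthesised OR block."""
    return "(" + " OR ".join(terms) + ")"


def build_query(and_groups, not_group):
    """Combine OR blocks with AND, then append the NOT block."""
    required = " AND ".join(or_block(group) for group in and_groups)
    return required + " NOT " + or_block(not_group)


QUERY = build_query(AND_GROUPS, NOT_GROUP)
print(QUERY)
```

Keeping each concept as its own list makes it easy to see which terms are alternatives and which concepts must co-occur.\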
The search was completed within one day on September
27<sup>th</sup>, 2022. Records were exported to
EndNote X9.3.3 (Clarivate™, Philadelphia, USA), combined into a single
library and de-duplicated by the software, followed by a visual check of
the record list sorted by digital object identifier (DOI). This scoping
review followed the Preferred Reporting Items for Systematic reviews and
Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines
(Supplementary Table 2 adapted from Tricco *et al*, 2018).\
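The DOI-based check described above can be illustrated with a short Python sketch. It is not EndNote's de-duplication algorithm, which is not described here; it only shows one way to flag records that share a DOI. The record structure (dictionaries with a `doi` key) and the example DOIs are hypothetical.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi.strip()


def deduplicate(records):
    """Keep the first record per DOI; records without a DOI are all kept."""
    seen = set()
    unique = []
    for record in records:
        key = normalise_doi(record.get("doi") or "")
        if key and key in seen:
            continue  # same DOI already seen: treat as a duplicate
        if key:
            seen.add(key)
        unique.append(record)
    return unique


# Hypothetical records exported from two databases.
records = [
    {"title": "Study A", "doi": "10.1000/abc"},
    {"title": "Study A (duplicate)", "doi": "https://doi.org/10.1000/ABC"},
    {"title": "Study B", "doi": ""},
    {"title": "Study C", "doi": "10.1000/xyz"},
]
unique_records = deduplicate(records)
print([r["title"] for r in unique_records])
```

Records without a DOI are deliberately retained in this sketch, since they would need a manual check rather than automatic removal.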
\
**Screening and data extraction**\
Screening of articles and extraction of data were performed in Rayyan
(Ouzzani *et al*, 2016) using a three-stage process. At each stage,
articles were retained only if they complied with the following
inclusion criteria: I) the infectious agent(s) is classified as zoonotic
or deemed a potentially emerging zoonotic disease by the article's
abstract and/or co-authors of this scoping review (Supplementary Table 3
adapted from Rees *et al*, 2021); II) the record includes sampling
activities, and/or handling of specimens for nucleic acid extraction
and/or library preparation, of human hosts in addition to the animal
and/or environment One Health domains (in other words, the record
includes genomic data produced directly by the study from the human
domain in addition to the animal and/or environment One Health domains)
(Table 1 adapted from Cavalerie *et al*, 2021); III) NGS data are
applied to evolutionary models of transmission dynamics; and IV)
the article's publication date falls between January
1<sup>st</sup> 2005 and the present (this criterion was
based on the commercial release of the first high-throughput sequencing
platforms (Goodwin *et al*, 2016; Kulski, 2016)).\
The following studies were excluded from our scoping review: I)
scientific work focusing on SARS-CoV-2; II) articles that do not
incorporate original NGS data (in other words, we excluded studies that
only collated genomic data deposited in private or publicly accessible
databases); III) methodologies exclusively based on Sanger sequencing
and/or amplified/restriction fragment length polymorphism (i.e., AFLP and
RFLP); IV) literature reviews, perspective articles, and commentaries;
and V) grey literature and literature whose full text is not available
in English.\
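The inclusion and exclusion criteria listed above can be expressed as a simple screening function. The Python sketch below is one possible encoding: the field names are hypothetical, and the order of the checks is an assumption that may differ from the decision tree used by the reviewers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    # Hypothetical field names; one flag per criterion in the text.
    zoonotic_agent: bool
    human_samples: bool
    animal_or_environment_samples: bool
    ngs_in_transmission_models: bool
    published: date
    sars_cov_2_focus: bool = False
    original_ngs_data: bool = True
    sanger_or_flp_only: bool = False
    review_or_commentary: bool = False
    grey_or_non_english: bool = False


def screen(record: Record) -> str:
    """Return "include" or "exclude: <reason>" for one record."""
    exclusions = [
        (record.sars_cov_2_focus, "SARS-CoV-2 focus"),
        (not record.original_ngs_data, "no original NGS data"),
        (record.sanger_or_flp_only, "Sanger/AFLP/RFLP only"),
        (record.review_or_commentary, "review, perspective or commentary"),
        (record.grey_or_non_english, "grey literature or not in English"),
    ]
    for flagged, reason in exclusions:
        if flagged:
            return "exclude: " + reason

    sampled_domains = record.human_samples and record.animal_or_environment_samples
    inclusions = [
        (record.zoonotic_agent, "agent not zoonotic"),
        (sampled_domains, "human plus animal/environment sampling not met"),
        (record.ngs_in_transmission_models, "NGS data not used in transmission models"),
        (record.published >= date(2005, 1, 1), "published before 2005"),
    ]
    for met, reason in inclusions:
        if not met:
            return "exclude: " + reason
    return "include"


eligible = Record(True, True, True, True, date(2019, 5, 1))
covid = Record(True, True, True, True, date(2021, 3, 1), sars_cov_2_focus=True)
print(screen(eligible))
print(screen(covid))
```

Returning the first failed criterion as a reason mirrors the way a reason for exclusion is recorded for each excluded article.\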
In the first stage, two reviewers independently screened titles and
abstracts using the inclusion rubric; 100 randomly selected records were
initially screened to ensure an agreement rate of at least 80% between
reviewers before proceeding with title/abstract review of all records.
In the second stage, two reviewers independently screened the full text
of each retained article; 10 randomly selected records were first
screened to ensure an agreement rate of at least 80% between reviewers
before proceeding with full-text review of included records. At each
stage, the reviewers followed a decision tree, which was defined by the
inclusion and exclusion criteria listed above (Supplementary Figure 1).
Conflicting classifications were resolved through
discussion between reviewers; if needed, the full paper was retrieved
and re-screened to resolve the disagreement. Finally, one reviewer
extracted the data included in Table 2 for each record that complied
with the inclusion criteria.\
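The 80% agreement threshold used in the pilot rounds above can be illustrated as follows. This Python sketch assumes that "agreement rate" means the proportion of records on which both reviewers made the same decision; the function name and the example decisions are hypothetical.

```python
def percent_agreement(decisions_a, decisions_b):
    """Proportion of records on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both reviewers must screen the same non-empty set")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)


# Hypothetical pilot of 10 records; the reviewers disagree on one of them.
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 7
rate = percent_agreement(reviewer_1, reviewer_2)
proceed = rate >= 0.80
print(rate, proceed)
```

In this example the reviewers agree on 9 of 10 records, so the rate is 0.9 and screening would proceed to the full record set.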
\
**Data analysis**\
Geographic localities where sampling was carried out were aggregated based
on income status (i.e., low/lower-middle income countries and
upper-middle/high income countries) as reported by the Organisation for
Economic Co-operation and Development in 2022.\
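The aggregation into two income strata can be sketched as a lookup in Python. The four input labels are an assumed spelling of the income groups, and no country-level classification is reproduced here.

```python
from collections import Counter

# Map each income group to the two strata named in the text.
STRATUM = {
    "low": "low/lower-middle",
    "lower-middle": "low/lower-middle",
    "upper-middle": "upper-middle/high",
    "high": "upper-middle/high",
}


def aggregate(income_groups):
    """Count sampling localities per aggregated income stratum."""
    return Counter(STRATUM[group.strip().lower()] for group in income_groups)


# Hypothetical income groups for five sampling localities.
localities = ["low", "Lower-middle", "high", "upper-middle", "high"]
strata = aggregate(localities)
print(strata)
```

An unrecognised label raises a `KeyError`, which makes misspelt or missing income groups visible instead of silently dropping them.\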
The biological agents included in our review were categorised based on
hazard group definitions by the Health and Safety Executive (HSE, 2023) and
current legislation in the UK (i.e., The Specified Animal Pathogens
(Scotland) Order 2009 under the Animal Health Act 1981). These
categories reflect infectiousness, available vaccines or treatments, and
laboratory containment levels appropriate to work with the listed
pathogens. Data were collated based on the reproducibility and
accessibility of analytical methodologies. Software was categorised as
licensed or open source. Additionally, details of open-source code and
publicly available data were collected.\
We used generalised linear mixed models (GLMMs) to describe sample
size (log-transformed with Poisson family links) as a function of
pathogen type, biocontainment level, sample origin, geographic income
stratification,
and year as predictors. Data were visualised in R version 4.3.2 (R Core
Team, 2023).
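
One possible way to write such a model is given below as a sketch. It assumes that the "Poisson family links" wording refers to a Poisson response with a log link, and it introduces a random intercept $u_j$ for a grouping factor $j$ that is not specified in the text. Each categorical predictor is shown with a single coefficient for brevity, although in practice it would carry one coefficient per non-reference level.

```latex
\begin{align}
y_{ij} \mid u_j &\sim \mathrm{Poisson}(\mu_{ij}) \\
\log \mu_{ij} &= \beta_0
  + \beta_1\,\mathrm{pathogen}_{ij}
  + \beta_2\,\mathrm{biocontainment}_{ij}
  + \beta_3\,\mathrm{origin}_{ij}
  + \beta_4\,\mathrm{income}_{ij}
  + \beta_5\,\mathrm{year}_{ij}
  + u_j \\
u_j &\sim \mathcal{N}(0, \sigma_u^2)
\end{align}
```

Here $y_{ij}$ is the sample size reported for observation $i$ in group $j$, and $\mu_{ij}$ is its expected value.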