Senior Biomedical Research Services
Division of Microbiology
Regulatory Science, FDA
GenomeTrakr: A Pathogen Database to Build a Global Genomic Network for Pathogen Traceback and Outbreak Detection
M. W. Allard, C. Wang, G. Kastanis, C. Pirone, Tim Muruvanda, E. Strain, R. Timme, J. Payne, Y. Luo, Narjol Gonzalez-Escalona, Magaly Toro Ibaceta, A. Ottesen, D. Melka, P. Evans, S. M. Musser and E. W. Brown. Food and Drug Administration, College Park, MD, USA.
Introduction: This study demonstrates how, with appropriate data-quality selection, desktop whole genome sequencing (WGS) data can be used in a combined analysis for source tracking of pathogens. Multiple data analysis pipelines are tested for combining bacterial draft genomes into phylogenetic clusters that provide leads in outbreak investigations of foodborne pathogens.
Purpose: This study outlines how these tools will be implemented to create a pathogen detection network in which state and federal public health agencies share data publicly, building a transparent reference database with data deposited into the Sequence Read Archive (SRA) at NCBI.
Methods: Herein we describe the components of the next-generation sequencing (NGS) pathogen network that integrates state public health laboratories (AK, AZ, FL, HI, MD, MN, NM, NY, NY_Ag, TX, VA, and WA) as well as federal laboratories. Details of the successes and failures concerning communication, coordination, data acquisition, assembly, storage, and analysis will be provided. Several case studies will be reported for this pilot study.
Results: The hardware and software implemented allow us to compare and cluster complete genomes of thousands of taxa at a time, and the software outputs daily phylogenetic trees for source tracking of food, clinical, and environmental isolates. Herein, we report enhanced molecular epidemiological insights gained by comparative analysis of Salmonella, E. coli, and Listeria genomes previously deemed indistinguishable by conventional subtyping methodologies.
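The comparing and clustering of genomes described above can be illustrated with a minimal sketch. It is not the GenomeTrakr pipeline: it assumes the isolates have already been reduced to equal-length aligned sequences, counts pairwise SNP differences, and groups isolates by single-linkage clustering under a distance threshold. The function names, the toy isolate names and sequences, and the threshold of 2 SNPs are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count aligned positions where both sequences have an unambiguous base and differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")  # gaps and ambiguity codes (e.g. '-', 'N') are ignored
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in valid and y in valid and x != y
    )


def cluster_isolates(alignment, max_snps):
    """Single-linkage clustering: isolates within max_snps of each other share a cluster."""
    parent = {name: name for name in alignment}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(alignment, 2):
        if snp_distance(alignment[a], alignment[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in alignment:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy alignment; real inputs would be SNP matrices from many genomes.
alignment = {
    "clinical_01": "ACGTACGTAC",
    "food_01": "ACGTACGTAT",
    "env_01": "TTGTACCTGA",
}

if __name__ == "__main__":
    for group in cluster_isolates(alignment, max_snps=2):
        print(group)
```

In this toy input the clinical and food isolates differ at a single position and fall into one cluster, while the environmental isolate stands alone. A production analysis would build a phylogenetic tree rather than rely on a fixed threshold, and any threshold would need to be validated per organism.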
Significance: These results demonstrate an important role for WGS tools within a regulatory environment and highlight the additional insights that comparison to a reference database provides to epidemiological investigations. See URL for the GenomeTrakr network, where we have released >12,000 unpublished draft genomes into the SRA database for food safety.