Ann Lab Med 2019; 39(6): 530-536
Published online November 1, 2019 https://doi.org/10.3343/alm.2019.39.6.530
Copyright © Korean Society for Laboratory Medicine.
Sung-Min Ha, M.Sc.1,2* , Chang Ki Kim, M.D., Ph.D.3,4* , Juhye Roh, M.D.5 , Jung-Hyun Byun, M.D.4,5,6 , Seung-Jo Yang, Ph.D.1 , Seon-Bin Choi, M.Sc.1 , Jongsik Chun, Ph.D.1,2 , and Dongeun Yong, M.D., Ph.D.4,5,7
1ChunLab, Inc., Seoul, Korea; 2School of Biological Sciences & Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Korea; 3Seoul Clinical Laboratories, Yongin, Korea; 4Research Institute of Bacterial Resistance, Yonsei University College of Medicine, Seoul, Korea; 5Department of Laboratory Medicine, Yonsei University College of Medicine, Seoul, Korea; 6Department of Laboratory Medicine, Gyeongsang National University Hospital, and Gyeongsang National University College of Medicine, Jinju, Korea; 7Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, Korea
Correspondence to: Dongeun Yong, M.D., Ph.D.
Department of Laboratory Medicine and Research Institute of Bacterial Resistance, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea
Tel: +82-2-2228-2802 Fax: +82-2-364-1583 E-mail: email@example.com
*These authors contributed equally to this work.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Next-generation sequencing is increasingly used for taxonomic identification of pathogenic bacterial isolates. We evaluated the performance of a newly introduced whole genome-based bacterial identification system, TrueBac ID (ChunLab Inc., Seoul, Korea), using clinical isolates that were not identified by three matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) systems and 16S rRNA gene sequencing.
Thirty-six bacterial isolates were selected from a university-affiliated hospital and a commercial clinical laboratory. Species identification was performed using three MALDI-TOF MS systems: Bruker Biotyper MS (Bruker Daltonics, Billerica, MA, USA), VITEK MS (bioMérieux, Marcy l'Étoile, France), and ASTA MicroIDSys (ASTA Inc., Suwon, Korea). Whole genome sequencing was conducted using the Illumina MiSeq system (Illumina, San Diego, CA, USA), and genome-based identification was performed using the TrueBac ID cloud system (www.truebacid.com).
TrueBac ID assigned 94% (34/36) of the isolates to known (N=25) or novel (N=4) species, genomospecies (N=3), or species group (N=2). The remaining two were identified at the genus level.
TrueBac ID successfully identified the majority of isolates that MALDI-TOF MS failed to identify. Genome-based identification can be a useful tool in clinical laboratories, with its superior accuracy and database-driven operations.
Keywords: Next generation sequencing, Genome-based identification, TrueBac ID, Performance
Primary and nosocomial bacterial infections are significant causes of morbidity and mortality worldwide. Identification of bacterial isolates at the species level is the first and crucial step in routine clinical laboratories, as it provides essential guidance regarding treatment. Although conventional biochemical testing is still used, whole-cell matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been widely adopted for the routine identification of pathogenic bacteria. MALDI-TOF MS can rapidly identify isolates by comparing the proteomic profiles of highly conserved and abundant proteins with an already compiled profile database of reference strains. Therefore, the accuracy of MALDI-TOF MS identification is heavily dependent on the software and spectral database/libraries. MALDI-TOF MS is particularly useful for identifying frequently isolated pathogenic species because of better coverage of the spectral database. However, its ability to identify infrequently isolated species is questionable.
16S ribosomal RNA (16S rRNA) gene sequencing has been primarily used for cases in which routine, conventional methods fail to identify the isolates. The 16S rRNA gene is a phylogenetic marker that is present in all bacteria and has played an essential role in the development of bacterial phylogeny and classification [4,5]. Recently, a 16S rRNA gene sequence similarity cut-off of 98.7% was proposed as a boundary for bacterial species. However, a 16S rRNA gene sequence similarity of ≥98.7% does not guarantee that the test isolate is a member of the species, because almost identical 16S rRNA gene sequences have been found in different species [6,7].
Unlike 16S rRNA gene sequencing, whole genome sequencing (WGS) provides a clear-cut criterion for bacterial classification. Furthermore, bacterial species are now defined by the relatedness of genome sequences. A category of algorithms, named the overall genome-related index, has been devised to calculate the genomic similarity for taxonomic purposes, and a general guideline for bacterial identification using WGS data has been published. WGS has considerable potential in clinical diagnostics as it could provide accurate identification of the species, with resolution down to the strain level. As the cost of WGS is continuously decreasing, its use as a routine test has been validated in large hospitals [10,11].
If the genome sequences of the type strains representing all the known bacterial species are available, any isolate could be identified with high confidence. However, until a few years ago, the availability of such data was not satisfactory [4,12]. The utility of genome-based methods for clinical diagnostics requires re-evaluation in light of the recent expansion in the bacterial genome sequence database. This is the first study to evaluate TrueBac ID (ChunLab Inc., Seoul, Korea), which is the first commercial whole genome-based bacterial identification system. Its database contains highly curated and taxonomically validated genome data of type and reference strains. We evaluated the performance of TrueBac ID in identifying clinical bacterial isolates that could not be identified using commercial MALDI-TOF MS systems and 16S rRNA gene sequencing.
In this retrospective study, a total of 36 clinical isolates that were either unidentified or identified with low confidence by three commercial MALDI-TOF MS systems were collected from two institutes in Korea. Fifteen isolates were chosen from Severance Hospital, Seoul, Korea, and 21 isolates were from Seoul Clinical Laboratories, Yongin, Korea. The isolates were recovered from clinical specimens (blood, pus, sputum, tracheal aspirate, urine, and wounds) from April 2017 to January 2018. Since this study focuses on the identification of the isolates, approval from the Institutional Review Board was not required, and the demographic data of the patients were not included.
Initially, we used Bruker Biotyper (Bruker Daltonics, Billerica, MA, USA) for species identification at both institutes. 16S rRNA gene PCR and sequencing were carried out for the isolates whose MALDI-TOF MS results showed no possible identification (score value: <1.70) or low confidence identification (score value: 1.70–1.99). For isolates showing uncertain identification results, we employed two additional MALDI-TOF identification systems: the VITEK MS system (bioMérieux, Marcy l'Étoile, France) and ASTA MicroIDSys (ASTA Inc., Suwon, Korea). A colony grown on sheep blood agar was smeared and dried on the target plate of each instrument. Matrix solution (α-cyano-4-hydroxycinnamic acid) and 70% formic acid (Sigma-Aldrich, St. Louis, MO, USA) provided by the manufacturer were overlaid on the spot, and the peptide profile was acquired using Microflex with Biotyper Software 3.1 (Bruker), VITEK MS V3.0 (bioMérieux), and ASTA MicroIDSys 3.0.4 (ASTA Inc.). Mass spectra were analyzed according to the manufacturers' instructions.
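The Biotyper score triage described above can be sketched in a few lines of Python. This is only an illustration of the two cut-offs stated in the text (<1.70 and 1.70–1.99); the function names and the label for scores of 2.00 or higher are assumptions made for the example, not part of the study protocol or of any vendor software.

```python
def classify_biotyper_score(score: float) -> str:
    """Map a Biotyper score to the categories used in the text.

    Cut-offs <1.70 and 1.70-1.99 come from the text; the label for
    scores >= 2.00 is an assumption made for this sketch.
    """
    if score < 1.70:
        return "no possible identification"
    if score < 2.00:  # scores are reported to two decimals, i.e. 1.70-1.99
        return "low confidence identification"
    return "acceptable identification (assumed label)"


def needs_16s_followup(score: float) -> bool:
    """Isolates in the two lower categories were sent for 16S rRNA sequencing."""
    return score < 2.00


if __name__ == "__main__":
    for s in (1.52, 1.70, 1.99, 2.31):
        print(s, classify_biotyper_score(s), needs_16s_followup(s))
```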
Genomic DNA was extracted from the isolates using the FastDNA SPIN Kit for Soil (MP Biomedicals, Santa Ana, CA, USA), and 550-bp long fragments were generated using the M220 Focused-ultrasonicator (Covaris Ltd, Brighton, UK). The sequencing library was constructed using the TruSeq DNA Library LT kit (Illumina, San Diego, CA, USA), according to the manufacturer's protocols. WGS was performed on an Illumina MiSeq system (Illumina) with a 300 bp paired-end reads sequencing kit (MiSeq Reagent Kit v3; Illumina).
The raw data from the MiSeq instrument in the FASTQ format were directly uploaded to the TrueBac ID cloud system (www.truebacid.com) and analyzed with the TrueBac ID-Genome system. The current version of the system uses Trimmomatic for filtering low-quality reads. The genome assembly was then carried out using the SPAdes software, as well as proprietary software specifically designed for the assembly of the 16S rRNA gene from the raw data.
The main section of the TrueBac ID-Genome system consists of (1) the proprietary reference database, named the TrueBac database, which is curated to hold up-to-date nomenclature, 16S rRNA gene, and genome sequences of type/reference strains, and (2) the optimized bioinformatics pipeline that provides the identification of a query genome sequence using the average nucleotide identity (ANI) [4,8,15]. We used TrueBac database version 2018-08, which contains 10,439 genomes representing 10,152 species and 287 subspecies (7,702 with valid names, 261 with invalid names, 138 with Candidatus names, and 2,338 genomospecies). Genomospecies is defined as a hitherto unknown species that is supported by its genome sequences [17,18,19]. The database also contains 18,476 16S rRNA gene sequences representing each species/subspecies.
The algorithmic identification scheme using WGS was slightly modified from that of Yoon et al. First, the most phylogenetically closely related pool of taxa was identified using a search of three genes: 16S rRNA,
The algorithmic cut-off for species-level identification was set at 95% ANI [8,15]. If the closely related taxa in a 16S rRNA gene comparison did not have the corresponding genome sequences in the database, the species assignment was made when the 16S rRNA gene sequence similarity to the best hit taxon was ≥99% with >0.8% separation between species. Using these criteria, a genome sequence could be assigned to a species held in the TrueBac database, identified to the genus level (e.g.
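For readers who want to reproduce the open-source part of this workflow locally, the following Python sketch builds (but does not run) a read-trimming command and an assembly command. It is not the TrueBac ID pipeline: the trimming thresholds, file names, and output layout are assumptions chosen for illustration, and the proprietary 16S rRNA gene assembler mentioned above is not represented.

```python
from pathlib import Path
from typing import List


def build_commands(r1: str, r2: str, workdir: str) -> List[List[str]]:
    """Return argument lists for read trimming followed by genome assembly.

    Assumptions: 'trimmomatic' and 'spades.py' are on PATH, and the
    SLIDINGWINDOW/MINLEN settings are illustrative values, not the
    parameters used by TrueBac ID.
    """
    work = Path(workdir)
    p1, u1 = str(work / "R1.paired.fq.gz"), str(work / "R1.unpaired.fq.gz")
    p2, u2 = str(work / "R2.paired.fq.gz"), str(work / "R2.unpaired.fq.gz")

    # Paired-end mode takes two inputs, then paired/unpaired outputs per mate.
    trim_cmd = ["trimmomatic", "PE", r1, r2, p1, u1, p2, u2,
                "SLIDINGWINDOW:4:20", "MINLEN:50"]

    # Assemble only the reads that survived trimming as intact pairs.
    spades_cmd = ["spades.py", "-1", p1, "-2", p2, "-o", str(work / "assembly")]
    return [trim_cmd, spades_cmd]


if __name__ == "__main__":
    for cmd in build_commands("isolate_R1.fastq.gz", "isolate_R2.fastq.gz", "work"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed; keeping command construction separate from execution makes the steps easy to inspect first.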
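The database composition quoted above can be cross-checked with simple arithmetic. The short Python sketch below only restates the counts given in the text and confirms that both breakdowns sum to the reported 10,439 genomes; the variable names are illustrative.

```python
# Counts reported for TrueBac database version 2018-08.
GENOMES_TOTAL = 10_439

by_rank = {"species": 10_152, "subspecies": 287}
by_name_status = {
    "valid names": 7_702,
    "invalid names": 261,
    "Candidatus names": 138,
    "genomospecies": 2_338,
}

# Both breakdowns should account for every genome in the database.
assert sum(by_rank.values()) == GENOMES_TOTAL
assert sum(by_name_status.values()) == GENOMES_TOTAL

# Share of entries that are genomospecies, i.e. not yet formally named.
genomospecies_pct = round(100 * by_name_status["genomospecies"] / GENOMES_TOTAL, 1)
print(f"genomospecies share: {genomospecies_pct}%")
```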
In some cases, two or more named species that belong to the same species by genomic criteria had not yet been formally reclassified. For isolates assigned to these species, the TrueBac ID system reported the final decision as a “species group” instead of an individual species.
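The species-level decision rules described above (a 95% ANI cut-off, with a 16S rRNA fallback of ≥99% similarity and >0.8% separation when no genome is available) can be written as a small function. This is a minimal sketch under stated assumptions: the function name, argument names, and return labels are hypothetical, treating the ANI cut-off as inclusive (≥95%) is an assumption, and the gene-based candidate search, genus-level calls, and species-group handling of the real system are not modeled.

```python
from typing import Optional

ANI_SPECIES_CUTOFF = 95.0      # percent ANI, from the text
SSU_SIMILARITY_CUTOFF = 99.0   # percent 16S rRNA similarity, from the text
SSU_MIN_SEPARATION = 0.8       # percentage points between best and next species


def assign_species(best_ani: Optional[float],
                   best_16s: float,
                   second_16s: float) -> str:
    """Return how (or whether) an isolate is assigned to its best-hit species.

    best_ani is None when the closest taxa have no genome in the database,
    in which case the 16S rRNA fallback rule is applied.
    """
    if best_ani is not None:
        # Assumption: the cut-off is inclusive.
        if best_ani >= ANI_SPECIES_CUTOFF:
            return "species (ANI)"
        return "not assigned to species"

    separation = best_16s - second_16s
    if best_16s >= SSU_SIMILARITY_CUTOFF and separation > SSU_MIN_SEPARATION:
        return "species (16S rRNA)"
    return "not assigned to species"


if __name__ == "__main__":
    print(assign_species(96.2, 99.8, 99.7))   # genome available, above cut-off
    print(assign_species(None, 99.5, 98.0))   # no genome, 16S rule satisfied
    print(assign_species(None, 99.5, 99.2))   # no genome, hits too close together
```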
Of the 36 isolates, TrueBac ID successfully identified 25 isolates as known species (Table 1). Four isolates were new species that had not been previously recognized. Three genomospecies, labeled CP015506_s, BBQM_s, and JHEL_s, were assigned. Detailed taxonomic information on these genomospecies is available at www.ezbiocloud.net. Two isolates were identified as “species group” (
Of the 34 isolates that were conclusively identified at the species level, 26 were assigned to known species using the ANI calculation against the type strain genomes in the database, yielding a true or definitive identification. The remaining eight isolates were identified by 16S rRNA gene similarity, according to the CLSI guidelines.
Of the 25 isolates identified as known species by TrueBac ID, all three MALDI-TOF systems failed to identify 17 isolates. The MALDI Biotyper System identified nine (eight matched with TrueBac ID and one mismatched), the VITEK MS identified seven (three matched with TrueBac ID and four mismatched), and the ASTA system identified seven (four matched with TrueBac ID and three mismatched). The detailed identification results with the genome assembly and gene sequences we reported are available at https://www.truebacid.com/genome/demo/clinical/korea.
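The per-system counts above can be turned into agreement rates with a short calculation. The Python sketch below uses only the matched and mismatched counts reported in the text; the derived percentages and the variable names are illustrative and do not appear in the original results.

```python
KNOWN_SPECIES_ISOLATES = 25

# (matched with TrueBac ID, mismatched with TrueBac ID), as reported above.
maldi_results = {
    "Bruker Biotyper": (8, 1),
    "VITEK MS": (3, 4),
    "ASTA MicroIDSys": (4, 3),
}


def summarize(results, total):
    """Compute identified counts and agreement rates for each system."""
    summary = {}
    for system, (matched, mismatched) in results.items():
        identified = matched + mismatched
        summary[system] = {
            "identified": identified,
            # Agreement among the isolates the system gave a name to.
            "concordant_pct_of_identified": round(100 * matched / identified, 1),
            # Agreement relative to all isolates TrueBac ID called known species.
            "concordant_pct_of_total": round(100 * matched / total, 1),
        }
    return summary


summary = summarize(maldi_results, KNOWN_SPECIES_ISOLATES)
for system, row in summary.items():
    print(system, row)
```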
Overall, TrueBac ID performed well for isolates that MALDI-TOF MS systems and 16S rRNA gene sequencing failed to identify. The ability to identify rare species is largely influenced by database coverage. The TrueBac ID system contains >10,000 species, whereas commercially available MALDI-TOF MS systems contain only ~2,500 species.
Because of advances in DNA sequencing technologies and the introduction of genomics into bacterial taxonomy, numerous species have been newly described. On average, approximately 100 new species were described every month in 2017 (data from www.ezbiocloud.net). The TrueBac ID system reference database is updated every month, enabling detection of recently described species. For example, isolate SCL P33 was identified as “
One of the major benefits of whole genome-based identification is that it can provide a scientifically sound decision for the recognition of novel species. We confirmed four novel species and three genomospecies based on 16S rRNA gene or genomic evidence. Isolates YUMC P721, YUMC P647, and YUMC B11605 were identified as genomospecies JHEL_s, CP015506_s, and BBQM_s, respectively. These species were never officially proposed and only tentatively named by the EzBioCloud database, so they can be considered novel species. As genomospecies represent previously isolated species, the use of this concept can provide further insights into species ecology. For example, the genomospecies CP015506_s included in this study is a species of the genus
Isolate SCL B79 showed high ANI values to
Overall, TrueBac ID identified >90% of the isolates at the species level. Moreover, it demonstrated the ability to recognize new species with high confidence. This is a significant advantage of genome-based identification over other methods, including MALDI-TOF MS and biochemical tests. In addition to its superior accuracy, WGS is not influenced by media and growth conditions, in contrast to phenotype-based methods, including MALDI-TOF MS.
Although 16S rRNA gene sequencing has been widely used as the gold standard for bacterial identification, this method is not feasible for some clinically important species with highly similar 16S rRNA gene sequences. We demonstrated that WGS exhibited sufficient taxonomic coverage to be employed as a scientifically sound gold standard when any new diagnostic method or commercial system is evaluated.
This study has some limitations. We collected only those isolates that were not properly identified by MALDI-TOF MS, and the proportion of such isolates would be low in most laboratories. In addition, the clinical significance of the isolates was not clearly defined; we assume that not all the isolates are true pathogens. Lastly, we did not examine whether accurate identification of the isolates improves patient care.
In conclusion, TrueBac ID successfully identified the majority of clinical bacterial isolates that were not identified by commercially available MALDI-TOF MS systems or 16S rRNA gene sequencing. TrueBac ID was more useful than other conventional diagnostic methods in recognizing new species. As the coverage of type strain-genome sequence database continues to grow and the cost of DNA sequencing continues to decrease, genome-based identification can be a useful tool for diagnostic laboratories, with its superior accuracy and database-driven operations.