Ann Lab Med 2022; 42(2): 213-248
Published online March 1, 2022 https://doi.org/10.3343/alm.2022.42.2.213
Copyright © Korean Society for Laboratory Medicine.
Young-Gon Kim, M.D., Kiwook Jung, M.D., Seunghwan Kim, M.D., Man Jin Kim, M.D., Jee-Soo Lee, M.D., Sung-Sup Park, M.D., Ph.D., and Moon-Woo Seong, M.D., Ph.D.
Department of Laboratory Medicine, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
Correspondence to: Moon-Woo Seong, M.D., Ph.D.
Department of Laboratory Medicine, Seoul National University Hospital, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea
Tel: +82-2-2072-4180
Fax: +82-2-747-0359
E-mail: mwseong@snu.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: Sequence-based identification is one of the most effective methods for species-level identification of nontuberculous mycobacteria (NTM). However, it is time-consuming because of the bioinformatics processes involved, including sequence trimming, consensus sequence generation, and public database searches. We developed SnackNTM (https://github.com/Young-gonKim/SnackNTM), simple and fully automated software that enables species-level identification of NTM from trace files.
Methods: The software was developed in the Java programming language. The SnackNTM diagnostic algorithm utilized 16S rRNA gene sequences, according to the Clinical & Laboratory Standards Institute guidelines, and an rpoB gene region was adjunctively utilized to narrow down the species. The software performance was validated using trace files of 234 clinical cases, comprising 217 consecutive cases and 17 additionally selected cases of unique species.
Results: SnackNTM could analyze multiple cases at once, and all the bioinformatics processes required for sequence-based NTM identification were automatically performed with a single mouse click. SnackNTM successfully identified 95.9% (208/217) of consecutive clinical cases, and the results showed 99.0% (206/208) agreement with manual classification results. SnackNTM successfully identified all 17 cases of unique species. In a processing time comparison test, the analysis and reporting of 30 cases, which took 150 minutes manually, took only 40 minutes with SnackNTM.
Conclusions: SnackNTM is expected to reduce the workload for NTM identification, especially in clinical laboratories that process large numbers of cases.
Keywords: Nontuberculous mycobacteria, NTM identification, SnackNTM, 16S rRNA gene, rpoB
The incidence of nontuberculous mycobacterial infections has increased worldwide, posing several challenges for clinicians and clinical laboratories [1, 2]. As the clinical relevance of infection varies widely with the species of nontuberculous mycobacteria (NTM), species-level identification is fundamental for proper patient management [3-5]. The differential susceptibility of different species to various antibiotics renders correct species identification even more important [6, 7]. However, species identification is not straightforward as there are more than 190 validly published species in the genus
Sequence-based species identification involves several bioinformatics processes, including the reviewing and trimming of trace files, the generation of consensus sequences from forward and reverse traces, and searching public databases using tools such as the Basic Local Alignment Search Tool (BLAST) (https://blast.ncbi.nlm.nih.gov/Blast.cgi). BLAST results usually include multiple species and should be filtered according to predefined criteria. These processes require time and effort, especially when multiple genome regions from multiple cases are to be evaluated. Moreover, sequences submitted to public databases are not necessarily validated for sequence quality or for conformity with the current mycobacterial taxonomic classification [10-12].
We developed software for automated batch processing of Sanger sequence data for NTM identification and validated it using clinical data. To the best of our knowledge, SnackNTM is the first software for automated NTM identification based on Sanger sequencing.
This study was approved by the Institutional Review Board of Seoul National University Hospital (SNUH), Seoul, Korea. This study comprised two parts: software development and validation, both of which were performed at SNUH. Software was developed between June 2019 and December 2019. Validation was performed as a retrospective study utilizing sequencing trace files stored in the SNUH repository between July 2020 and August 2020.
Java Development Kit 8 (Oracle, CA, USA) and its component JavaFX were used to develop a graphical user interface-based software. BioJava Legacy (https://github.com/biojava/biojava-legacy) was used to read sequence trace files (.ab1 files). Basic functions required for the handling of sequence files, including reading trace files, automatic trimming, and the alignment algorithm, were adopted from our previous work [13].
The workflow of SnackNTM is summarized in Fig. 1. SnackNTM reads multiple trace files at once and allocates them according to the case and target region. The maximum number of cases successfully processed during the validation was 217, with 868 sequence trace files. After the trace files are trimmed, the input sequences are iteratively aligned with locally stored reference sequences. The percent identity score is then calculated for every alignment, and the species are sorted according to their scores. Species that satisfy the identification criteria described in the next subsection are output as the identification results.
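The scoring and sorting step of this workflow can be illustrated with a minimal Python sketch. It is not the SnackNTM implementation (which is written in Java): it assumes the alignment has already been computed and is supplied as two gapped strings of equal length, and it assumes percent identity is defined as matching columns divided by all alignment columns. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent identity over all alignment columns ('-' marks a gap).

    Assumed definition: matching non-gap columns / total columns * 100.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for q, r in zip(aligned_query, aligned_ref) if q == r and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


def rank_species(alignments: dict) -> list:
    """Score every (query, reference) alignment and sort species by score.

    alignments maps a species name to a tuple (aligned_query, aligned_ref).
    Returns a list of (species, percent_identity), highest score first.
    """
    scored = [
        (species, percent_identity(query, ref))
        for species, (query, ref) in alignments.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data only; real 16S rRNA inputs are roughly 500 bp long.
    toy = {
        "species_A": ("ACGTACGT", "ACGTACGA"),
        "species_B": ("ACGTACGT", "ACGTACGT"),
    }
    for species, score in rank_species(toy):
        print(f"{species}\t{score:.2f}")
```

Other conventions (for example, excluding gap columns from the denominator) would give slightly different scores, so the definition should be fixed before any threshold is applied.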
The 5’ end of the 16S rRNA gene (approximately 500 bp) and part of
The identification process used in SnackNTM is shown in Fig. 2. The interpretative criteria used at SNUH were adopted. According to the CLSI guidelines MM18-A, a threshold of 99.0% sequence identity for the 16S rRNA gene was used [11]. However, species with 100% identity were preferentially considered. When identification based on the 16S rRNA gene failed to narrow down to a single species,
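The 16S rRNA gene part of these criteria can be sketched in Python as follows. This is an illustration under stated assumptions, not the SnackNTM code: it models only the 99.0% threshold and the preference for 100% identity, treats every species at or above the threshold as a candidate when no perfect match exists, and does not model the second target region. The function name and return shape are hypothetical.

```python
IDENTITY_THRESHOLD = 99.0  # 16S rRNA gene threshold stated in the text


def select_16s_candidates(ranked):
    """Apply the 16S rRNA gene criteria to (species, percent_identity) pairs.

    Returns (candidates, most_closely):
      - species with 100% identity are preferred when any exist;
      - otherwise, species at or above the threshold are kept and the
        result is flagged as "most closely" (assumed behavior);
      - an empty list means no species passed the threshold.
    """
    perfect = [species for species, score in ranked if score >= 100.0]
    if perfect:
        return perfect, False
    passing = [species for species, score in ranked if score >= IDENTITY_THRESHOLD]
    return passing, bool(passing)


if __name__ == "__main__":
    candidates, most_closely = select_16s_candidates(
        [("species_A", 99.82), ("species_B", 99.47), ("species_C", 98.90)]
    )
    print(candidates, most_closely)
```

When more than one candidate remains, a further discriminating step is needed; that step is deliberately left out of this sketch.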
Trace files of 217 consecutive cases, which were produced from NTM identification tests performed between July 2019 and August 2019, were reanalyzed using both SnackNTM and a manual identification process, in which sequences were trimmed using the default trimming function of Sequencher (Gene Codes, Ann Arbor, MI, USA). The consensus sequences of forward and reverse traces thus generated were fed to locally installed BLAST to search the locally stored reference sequences. The same sets of reference sequences were used in SnackNTM and in manual identification. The results from the two methods were compared, and the percentage of agreement was calculated.
Because the consecutive case data contained a limited number of species (11 species from 217 cases), we performed additional tests using selected cases with unique species to include more species in the validation. Unique species data comprised 11 cases selected from the consecutive case data described above and 17 additional cases selected from the SNUH repository. Case selection was based on reported identification results obtained at SNUH between May 2017 and October 2019. Among cases with the same reported species, the case with the earliest test date was selected. The 28 cases with unique species data finally included were analyzed using SnackNTM and EzBioCloud (https://www.ezbiocloud.net/identify), which utilizes 16S rRNA gene sequences for bacterial identification. The results were compared, and the percentage of agreement was calculated.
Thirty cases were selected from the consecutive case data and two researchers, KJ and SK, analyzed the selected cases using SnackNTM and manual identification. The time required for both analyses by both researchers was measured, and the mean processing times for the analyses were compared.
A simple windowed application was developed for NTM identification. Detailed instructions on how to use SnackNTM are provided in a video clip available at https://github.com/Young-gonKim/SnackNTM (accessed on February 24, 2021). The status of SnackNTM after completion of the initial analysis is shown in Fig. 3. In this example, 120 trace files generated from 30 cases were selected at once and automatically allocated to the proper positions according to file naming rules that use the case ID, target region, and read direction (forward vs. reverse) contained in the file names. Clicking the “Run (All samples)” button initiates the analysis, which takes a few seconds per case. As trace files occasionally contain base-calling errors, especially at the trace ends, the traces should be reviewed after the initial automatic analysis. When an error is found, it can be fixed using the “Edit base” function; when the conclusion contains unexpected results, such as multiple species or no species, the “Edit trimming” function can be used to increase or decrease the range of sequences included in the analysis. The identification results can be exported either in tab-separated values format or as an Excel file using the “Export” function in the data tab.
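The allocation of trace files by file name can be sketched as below. The actual SnackNTM naming rules are not given in the text, so the pattern `<caseID>_<region>_<F|R>.ab1` used here is purely hypothetical, as are the function name and the example file names; only the idea of parsing case ID, target region, and read direction from the file name comes from the description above.

```python
import re
from collections import defaultdict

# Hypothetical naming rule: <caseID>_<region>_<F|R>.ab1
TRACE_NAME = re.compile(
    r"^(?P<case>[^_]+)_(?P<region>[^_]+)_(?P<direction>[FR])\.ab1$"
)


def allocate_trace_files(file_names):
    """Group trace files by case ID, then by (target region, read direction).

    Returns (slots, unmatched), where slots[case][(region, direction)] is a
    file name and unmatched lists names that do not follow the assumed rule.
    """
    slots = defaultdict(dict)
    unmatched = []
    for name in file_names:
        match = TRACE_NAME.match(name)
        if match is None:
            unmatched.append(name)
            continue
        key = (match["region"], match["direction"])
        slots[match["case"]][key] = name
    return dict(slots), unmatched


if __name__ == "__main__":
    files = [
        "NTM001_16S_F.ab1",
        "NTM001_16S_R.ab1",
        "NTM001_rpoB_F.ab1",
        "NTM001_rpoB_R.ab1",
        "readme.txt",
    ]
    slots, unmatched = allocate_trace_files(files)
    print(slots)
    print(unmatched)
```

With two target regions and two read directions, each case occupies four slots, which matches the 120 files for 30 cases in the example above.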
The SnackNTM and manual identification results of all cases are listed in Table 1, and a summary of the species identified by SnackNTM is presented in Table 2. Among 217 cases, 208 (95.9%) were successfully identified to a single species. The most frequently identified species were
Table 1. Identification results for consecutive case data (217 cases)
Case ID | SnackNTM, 16S rRNA gene: Species | SnackNTM, 16S rRNA gene: % Identity | SnackNTM, rpoB: Species | SnackNTM, rpoB: % Identity | Conclusion of SnackNTM | Conclusion of manual identification | Agreement* |
---|---|---|---|---|---|---|---|
NTM001 | 100.00 | 99.38 | O | ||||
NTM002 | 100.00 | O | |||||
NTM003 | 99.82 | 100.00 | O | ||||
99.82 | (most closely)† | (most closely) | |||||
99.47 | |||||||
NTM004 | 99.62 | O | |||||
(most closely) | (most closely) | ||||||
NTM005 | 100.00 | O | |||||
NTM006 | 99.82 | 99.69 | O | ||||
99.81 | 99.54 | (most closely) | (most closely) | ||||
99.62 | 99.38 | ||||||
99.08 | |||||||
99.08 | |||||||
99.03 | |||||||
NTM007 | 99.82 | 99.85 | O | ||||
99.81 | 99.69 | (most closely) | (most closely) | ||||
99.63 | 99.54 | ||||||
99.10 | |||||||
99.10 | |||||||
99.05 | |||||||
NTM008 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM009 | 100.00 | 99.64 | O | ||||
100.00 | 99.46 | ||||||
99.46 | |||||||
NTM010 | 100.00 | O | |||||
100.00 | |||||||
NTM011 | 100.00 | O | |||||
100.00 | |||||||
NTM012 | 100.00 | 100.00 | O | ||||
100.00 | 99.85 | ||||||
99.69 | |||||||
NTM013 | 100.00 | O | |||||
NTM014 | 99.82 | 99.69 | O | ||||
99.81 | 99.54 | (most closely) | (most closely) | ||||
99.62 | 99.38 | ||||||
99.09 | |||||||
99.03 | |||||||
NTM015 | 100.00 | O | |||||
NTM016 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM017 | 100.00 | O | |||||
NTM018 | 100.00 | 99.54 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM019 | 100.00 | O | |||||
NTM020 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM021 | 100.00 | O | |||||
NTM022 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM023 | 99.82 | 99.85 | O | ||||
99.81 | 99.69 | (most closely) | (most closely) | ||||
99.62 | 99.54 | ||||||
99.09 | |||||||
99.03 | |||||||
NTM024 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM025 | 100.00 | O | |||||
NTM026 | 100.00 | 99.54 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM027 | 100.00 | 99.69 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM028 | 99.81 | Not Evaluable | |||||
99.81 | |||||||
99.64 | |||||||
99.47 | |||||||
99.44 | |||||||
99.29 | |||||||
99.29 | |||||||
99.25 | |||||||
99.25 | |||||||
NTM029 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM030 | 100.00 | 99.68 | O | ||||
100.00 | 99.52 | ||||||
99.52 | |||||||
NTM031 | 100.00 | 99.82 | O | ||||
100.00 | 99.64 | ||||||
99.64 | |||||||
NTM032 | 100.00 | 99.38 | Not Evaluable | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM033 | 100.00 | 99.54 | O | ||||
NTM034 | 100.00 | O | |||||
NTM035 | 100.00 | 99.84 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM036 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM037 | 100.00 | Not Evaluable | |||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM038 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM039 | 100.00 | 99.69 | X | ||||
NTM040 | 100.00 | 99.54 | O | ||||
100.00 | 99.38 | ||||||
NTM041 | 100.00 | 99.69 | O | ||||
100.00 | 99.53 | ||||||
99.38 | |||||||
NTM042 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM043 | 100.00 | 99.67 | O | ||||
100.00 | 99.5 | ||||||
99.34 | |||||||
NTM044 | 100.00 | 99.38 | O | ||||
100.00 | |||||||
NTM045 | 100.00 | 99.84 | O | ||||
100.00 | 99.69 | ||||||
99.53 | |||||||
NTM046 | 100.00 | O | |||||
NTM047 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM048 | 100.00 | O | |||||
NTM049 | 100.00 | 99.53 | O | ||||
NTM050 | 100.00 | 99.53 | O | ||||
100.00 | 99.38 | ||||||
NTM051 | 100.00 | 99.52 | O | ||||
100.00 | 99.36 | ||||||
NTM052 | 100.00 | 99.54 | O | ||||
NTM053 | 100.00 | O | |||||
NTM054 | 100.00 | O | |||||
NTM055 | 100.00 | O | |||||
NTM056 | 100.00 | O | |||||
NTM057 | 100.00 | O | |||||
100.00 | |||||||
NTM058 | 100.00 | 99.69 | O | ||||
100.00 | 99.53 | ||||||
99.37 | |||||||
NTM059 | 100.00 | O | |||||
NTM060 | 100.00 | 99.85 | O | ||||
99.54 | |||||||
NTM061 | 100.00 | 99.68 | O | ||||
100.00 | 99.53 | ||||||
99.37 | |||||||
NTM062 | 100.00 | O | |||||
NTM063 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM064 | 100.00 | O | |||||
NTM065 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM066 | 100.00 | O | |||||
NTM067 | 100.00 | 99.84 | O | ||||
100.00 | 99.69 | ||||||
99.53 | |||||||
NTM068 | 100.00 | 99.53 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM069 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM070 | 100.00 | O | |||||
NTM071 | 100.00 | 99.37 | O | ||||
NTM072 | 100.00 | O | |||||
NTM073 | 100.00 | 99.84 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM074 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM075 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM076 | 100.00 | 99.84 | O | ||||
100.00 | |||||||
NTM077 | 100.00 | O | |||||
100.00 | |||||||
NTM078 | 100.00 | O | |||||
NTM079 | 100.00 | O | |||||
NTM080 | 100.00 | O | |||||
NTM081 | 100.00 | O | |||||
NTM082 | 100.00 | O | |||||
NTM083 | 100.00 | O | |||||
NTM084 | 100.00 | O | |||||
100.00 | |||||||
NTM085 | 100.00 | Not Evaluable | |||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM086 | 99.81 | 100.00 | Not Evaluable | ||||
99.62 | 99.67 | ||||||
99.44 | |||||||
99.44 | |||||||
99.29 | |||||||
99.29 | |||||||
NTM087 | 100.00 | O | |||||
NTM088 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM089 | 100.00 | O | |||||
NTM090 | 100.00 | 99.54 | O | ||||
100.00 | 99.39 | ||||||
NTM091 | 100.00 | 100.00 | O | ||||
NTM092 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM093 | 100.00 | Not Evaluable | |||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM094 | 100.00 | 99.68 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM095 | 100.00 | Not Evaluable | |||||
100.00 | |||||||
NTM096 | 100.00 | 99.68 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM097 | 100.00 | 100.00 | O | ||||
99.69 | |||||||
NTM098 | 100.00 | O | |||||
NTM099 | 100.00 | O | |||||
100.00 | |||||||
NTM100 | 100.00 | 99.38 | O | ||||
NTM101 | 100.00 | 99.68 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM102 | 100.00 | 99.84 | O | ||||
100.00 | 99.67 | ||||||
99.67 | |||||||
NTM103 | 100.00 | 99.68 | O | ||||
100.00 | 99.52 | ||||||
99.37 | |||||||
NTM104 | 100.00 | 99.84 | O | ||||
100.00 | 99.68 | ||||||
99.52 | |||||||
NTM105 | 100.00 | O | |||||
NTM106 | 100.00 | 99.67 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM107 | 100.00 | 99.84 | O | ||||
100.00 | 99.68 | ||||||
99.52 | |||||||
NTM108 | 100.00 | O | |||||
NTM109 | 100.00 | 99.52 | O | ||||
NTM110 | 100.00 | 99.69 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM111 | 100.00 | 99.52 | O | ||||
NTM112 | 99.81 | O | |||||
(most closely) | (most closely) | ||||||
NTM113 | 100.00 | O | |||||
100.00 | |||||||
NTM114 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM115 | 100.00 | 99.62 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM116 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM117 | 100.00 | 99.8 | O | ||||
100.00 | 99.6 | ||||||
99.6 | |||||||
NTM118 | 100.00 | 99.38 | O | ||||
100.00 | |||||||
NTM119 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM120 | 100.00 | O | |||||
NTM121 | 100.00 | O | |||||
100.00 | |||||||
NTM122 | 100.00 | O | |||||
NTM123 | 100.00 | 99.83 | O | ||||
100.00 | 99.66 | ||||||
99.48 | |||||||
NTM124 | 99.45 | 99.84 | O | ||||
99.45 | 99.68 | (most closely) | (most closely) | ||||
99.42 | 99.68 | ||||||
99.42 | |||||||
99.25 | |||||||
99.08 | |||||||
99.08 | |||||||
99.03 | |||||||
NTM125 | 100.00 | 100.00 | O | ||||
100.00 | 99.84 | ||||||
99.84 | |||||||
NTM126 | 100.00 | 99.66 | O | ||||
100.00 | 99.48 | ||||||
99.48 | |||||||
NTM127 | 100.00 | 100.00 | O | ||||
99.70 | |||||||
NTM128 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM129 | 99.81 | 99.69 | O | ||||
99.80 | 99.54 | (most closely) | (most closely) | ||||
99.61 | 99.38 | ||||||
99.05 | |||||||
NTM130 | 100.00 | 99.53 | O | ||||
NTM131 | 100.00 | O | |||||
NTM132 | 100.00 | O | |||||
NTM133 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM134 | 100.00 | 99.68 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM135 | 100.00 | 99.54 | O | ||||
NTM136 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM137 | 100.00 | 100.00 | X | ||||
NTM138 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM139 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM140 | 100.00 | 99.84 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM141 | 100.00 | 99.65 | O | ||||
100.00 | 99.48 | ||||||
99.31 | |||||||
NTM142 | 100.00 | O | |||||
NTM143 | 100.00 | 99.69 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM144 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM145 | 100.00 | 99.83 | O | ||||
100.00 | 99.66 | ||||||
99.50 | |||||||
NTM146 | 100.00 | 99.37 | O | ||||
99.37 | |||||||
NTM147 | 100.00 | 99.69 | O | ||||
100.00 | 99.38 | ||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM148 | 100.00 | 99.39 | O | ||||
NTM149 | 100.00 | 100.00 | O | ||||
99.69 | |||||||
NTM150 | 100.00 | O | |||||
100.00 | |||||||
NTM151 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM152 | 100.00 | O | |||||
NTM153 | 100.00 | 100.00 | O | ||||
100.00 | 99.84 | ||||||
99.69 | |||||||
NTM154 | 100.00 | 99.68 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM155 | 100.00 | 99.84 | O | ||||
100.00 | 99.68 | ||||||
99.52 | |||||||
NTM156 | 100.00 | 99.54 | O | ||||
NTM157 | 100.00 | O | |||||
100.00 | |||||||
NTM158 | 100.00 | 99.54 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM159 | 100.00 | 99.68 | O | ||||
100.00 | 99.52 | ||||||
99.37 | |||||||
NTM160 | 100.00 | O | |||||
100.00 | |||||||
NTM161 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM162 | 100.00 | 99.85 | O | ||||
99.54 | |||||||
NTM163 | 100.00 | O | |||||
NTM164 | 100.00 | O | |||||
NTM165 | 100.00 | O | |||||
NTM166 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM167 | 100.00 | O | |||||
NTM168 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM169 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM170 | 100.00 | 99.69 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM171 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM172 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM173 | 100.00 | 99.85 | O | ||||
99.85 | |||||||
NTM174 | 100.00 | 100.00 | O | ||||
100.00 | 99.85 | ||||||
99.69 | |||||||
NTM175 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM176 | 100.00 | 99.54 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM177 | 100.00 | 99.37 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM178 | 100.00 | 99.38 | Not Evaluable | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM179 | 100.00 | O | |||||
NTM180 | 100.00 | 99.84 | O | ||||
100.00 | 99.68 | ||||||
99.52 | |||||||
NTM181 | 100.00 | 100.00 | O | ||||
100.00 | 99.85 | ||||||
99.69 | |||||||
NTM182 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM183 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM184 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM185 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM186 | 100.00 | O | |||||
NTM187 | 100.00 | 99.69 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM188 | 100.00 | 99.85 | O | ||||
99.85 | |||||||
NTM189 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM190 | 100.00 | 99.51 | O | ||||
NTM191 | 100.00 | O | |||||
NTM192 | 100.00 | Not Evaluable | |||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM193 | 100.00 | O | |||||
NTM194 | 100.00 | 99.53 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM195 | 100.00 | O | |||||
NTM196 | 100.00 | 99.67 | O | ||||
100.00 | 99.51 | ||||||
99.51 | |||||||
NTM197 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.39 | |||||||
NTM198 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM199 | 100.00 | 100.00 | O | ||||
100.00 | |||||||
NTM200 | 100.00 | O | |||||
NTM201 | 100.00 | O | |||||
NTM202 | 100.00 | O | |||||
NTM203 | 100.00 | 99.85 | O | ||||
100.00 | 99.69 | ||||||
99.54 | |||||||
NTM204 | 100.00 | 99.56 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM205 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM206 | 100.00 | 99.84 | O | ||||
100.00 | 99.67 | ||||||
99.67 | |||||||
NTM207 | 100.00 | 99.83 | O | ||||
100.00 | 99.67 | ||||||
99.67 | |||||||
NTM208 | 100.00 | 99.83 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM209 | 100.00 | O | |||||
NTM210 | 100.00 | 99.67 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM211 | 100.00 | O | |||||
NTM212 | 100.00 | 99.69 | O | ||||
100.00 | 99.54 | ||||||
99.38 | |||||||
NTM213 | 100.00 | O | |||||
NTM214 | 100.00 | 100.00 | O | ||||
100.00 | 99.84 | ||||||
99.69 | |||||||
NTM215 | 100.00 | 99.84 | O | ||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
100.00 | |||||||
NTM216 | 100.00 | 99.84 | O | ||||
100.00 | 99.67 | ||||||
99.67 | |||||||
NTM217 | 100.00 | 100.00 | O | ||||
99.69 |
*O: agreement between SnackNTM and manual identification, X: discrepancy between SnackNTM and manual identification. †According to the identification criteria described in Fig. 2, an identification result that did not show 100% identity for the 16S rRNA sequence was designated as “most closely.”
Table 2. Distribution of species identified from consecutive case data using SnackNTM
Species | Count (%) |
---|---|
80 (36.9) | |
74 (34.1) | |
27 (12.4) | |
16 (7.4) | |
2 (0.9) | |
2 (0.9) | |
2 (0.9) | |
1 (0.5) | |
1 (0.5) | |
1 (0.5) | |
1 (0.5) | |
1 (0.5) | |
9 (4.1) | |
Total | 217 (100.0) |
*Two cases that were identified as
For case NTM095, SnackNTM yielded unidentifiable results because the identification process resulted in two species,
In another unidentifiable case, NTM086, the identification results were
Among the 208 cases that were identified as a single species by both SnackNTM and manual identification, 206 cases (99.0%) showed agreement. The discrepancy in the remaining two cases (NTM039 and NTM137) was caused by a difference in the trimming procedure between SnackNTM and Sequencher. In both cases, a longer portion of the trace was trimmed out in manual identification, leading to the utilization of shorter input sequences.
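The reported percentages follow directly from the case counts. The short Python check below recomputes them from the numbers given in the text (217 consecutive cases, 208 identified to a single species, 206 concordant with manual identification); the helper name is hypothetical.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


CONSECUTIVE_CASES = 217
SINGLE_SPECIES = 208   # identified to a single species
CONCORDANT = 206       # same result as manual identification

identified_rate = percent(SINGLE_SPECIES, CONSECUTIVE_CASES)
unidentified_rate = percent(CONSECUTIVE_CASES - SINGLE_SPECIES, CONSECUTIVE_CASES)
agreement_rate = percent(CONCORDANT, SINGLE_SPECIES)

print(identified_rate, unidentified_rate, agreement_rate)
```

Note that the agreement rate uses the 208 single-species cases as its denominator, not all 217 cases.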
The results of identification using unique species data are described in Table 3. The SnackNTM results agreed with the manual identification results in all 28 cases. The EzBioCloud results agreed with the SnackNTM results in 23 cases (82.1%). Discrepancies between SnackNTM and EzBioCloud results were due to the utilization of
Table 3. Identification results of unique species data using SnackNTM and EzBioCloud*
No. case | Case ID | Manual identification (16S rRNA gene+rpoB) | SnackNTM (16S rRNA gene+rpoB) | EzBioCloud (16S rRNA gene) | Cause of discrepancy |
---|---|---|---|---|---|
1 | NTM001 | ||||
2 | NTM003 | ||||
3 | NTM004 | ||||
4 | NTM008 | ||||
5 | NTM009 | ||||
6 | NTM018 | ||||
7 | NTM055 | ||||
8 | NTM147 | ||||
9 | NTM160 | ||||
10 | NTM162 | ||||
11 | NTM199 | ||||
12 | NTM218 | ||||
13 | NTM219 | ||||
14 | NTM220 | ||||
15 | NTM221 | ||||
16 | NTM222 | ||||
17 | NTM223 | ||||
18 | NTM224 | ||||
19 | NTM225 | ||||
20 | NTM226 | ||||
21 | NTM227 | ||||
22 | NTM228 | ||||
23 | NTM229 | ||||
24 | NTM230 | ||||
25 | NTM231 | Different set of reference sequences | |||
26 | NTM232 | ||||
27 | NTM233 | ||||
28 | NTM234 |
SnackNTM analysis and manual identification of the 30 cases selected for processing time comparison took 40 minutes and 150 minutes on average, respectively. The processing time included the time required for writing reports of the 30 cases, which was approximately 30 minutes for both methods. Of the remaining 10 minutes for SnackNTM, the running time was approximately 2 minutes, and the rest was mostly spent on reviewing and, when required, modifying the results. For manual identification, the first part, using Sequencher to review, trim, and align the sequences to produce the input for BLAST, and the second part, comprising running BLAST and interpreting the results, took approximately 60 minutes each.
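The time breakdown can be checked with simple arithmetic. The sketch below uses only the approximate figures stated above (in minutes, for 30 cases); the variable names are hypothetical, and the derived review time and relative reduction are calculations from those approximate figures rather than separately reported measurements.

```python
REPORT_WRITING = 30       # minutes, same for both methods
SNACKNTM_TOTAL = 40       # minutes, mean total with SnackNTM
SNACKNTM_RUN = 2          # minutes, software running time
MANUAL_SEQUENCHER = 60    # minutes, review/trim/align in Sequencher
MANUAL_BLAST = 60         # minutes, running BLAST and interpreting results

# Time left for reviewing and modifying SnackNTM results.
snackntm_review = SNACKNTM_TOTAL - REPORT_WRITING - SNACKNTM_RUN

# Manual total reconstructed from its parts.
manual_total = MANUAL_SEQUENCHER + MANUAL_BLAST + REPORT_WRITING

minutes_saved = manual_total - SNACKNTM_TOTAL
relative_reduction = round(100.0 * minutes_saved / manual_total, 1)

print(snackntm_review, manual_total, minutes_saved, relative_reduction)
```

Because report writing is identical for both methods, the difference comes entirely from the analysis step: about 10 minutes with SnackNTM against about 120 minutes manually.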
We developed and validated SnackNTM, simple and fully automated software for Sanger sequence-based NTM identification. Except for the review of the chromatograms generated by the sequencer, all bioinformatics processes involved in sequence-based NTM identification are automatically performed with a single mouse click. The benefit of using SnackNTM is greatest when multiple cases are to be evaluated, as SnackNTM can read multiple trace files at once and sort them according to predefined file naming rules. SnackNTM is free to use and can be adopted immediately by other laboratories if the primer pairs described herein are used for sequencing. Laboratories can also target different genome regions if they obtain the appropriate reference sequences.
Searching online databases such as GenBank is considered the standard method for sequence-based identification. However, not all sequences in GenBank are rigorously validated [10–12]. An exhaustive GenBank search usually yields numerous results, including low-quality data, and search filtering options such as “representative genomes only” result in rare species missing from the results. In contrast, SnackNTM contains 16S rRNA gene sequences of all validly published
In two cases (NTM039 and NTM137) that showed discrepant SnackNTM and manual identification results, the discrepancy was due to the use of different trimming criteria. In manual identification, shorter fragments of 16S rRNA gene traces were utilized because larger portions of the sequences were trimmed, resulting in 100% identity to multiple species, including
Nine cases (4.1% of 217 consecutive cases) were unidentifiable by SnackNTM because the results indicated multiple species. Paradoxically, according to our definition of an unidentifiable result, the more species included in the reference sequence database, the higher the possibility of unidentifiable results. However, in these cases, the unidentifiable result does not necessarily imply identification failure. Additional information, such as the growth rate, photochromogenicity, and incidence in the region, could be utilized to narrow down the species. SnackNTM utilizes a comprehensive set of reference sequences comprising >200 species, and with only two target regions, some cases are bound to remain unidentifiable. Attempts have been made to utilize other target regions, such as
In the unique species data, five cases (17.9%) yielded different results in SnackNTM and EzBioCloud. Some level of discrepancy was expected because of the difference in target regions, i.e., the 16S rRNA gene and
This study had several limitations. First, Sanger sequencing of the two target regions we used, i.e., the 16S rRNA gene and
In conclusion, SnackNTM is efficient software for automating Sanger sequence-based NTM species identification. As there is no single authorized and validated database of reference sequences of all published mycobacteria, the laboratory database of SNUH was incorporated into SnackNTM. SnackNTM is free to use, and if required, target regions and reference sequences can be optimized for individual users. SnackNTM is expected to reduce the workload required for Sanger sequence-based NTM identification.
None.
Kim YG developed the software and drafted the manuscript. Jung K and Kim S performed the experiments and analyzed the data. Kim MJ and Lee JS interpreted the data and contributed to manuscript revision. Park SS and Seong MW supervised the study and performed the final manuscript revision.
No potential conflicts of interest relevant to this article were reported.
None declared.