StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees
Title
StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees
Author
Märt Roosaare, Mihkel Vaher, Lauris Kaplinski, Märt Möls, Reidar Andreson, Maarja Lepamets, Triinu Kõressaar, Paul Naaber, Siiri Kõljalg, Maido Remm
Description
Background: Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees.
Results: A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1–2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade, whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain.
Conclusion: StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker’s web interface and pre-computed guide trees are available at http://bioinfo.ut.ee/strainseeker. Source code is stored at GitHub: https://github.com/bioinfo-ut/StrainSeeker.
Date
2017
Subject
k-mer, Clade, strain identification, Species identification, diagnostics
Identifier
DOI: 10.7717/peerj.3353
Source
PeerJ
Publisher
PeerJ Inc.
Coverage
Medicine
Language
EN
Collection
Citation
Märt Roosaare, Mihkel Vaher, Lauris Kaplinski, Märt Möls, Reidar Andreson, Maarja Lepamets, Triinu Kõressaar, Paul Naaber, Siiri Kõljalg, Maido Remm, “StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees,” SOCICT Open, accessed April 19, 2026, https://socictopen.socict.org/items/show/620.