The folded k-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies.
Title
The folded k-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies.
Author
Abdulkadir Elmas, Xiao-Dong Wang, Jacqueline M. Dresch
Description
Understanding the molecular machinery involved in transcriptional regulation is central to improving our knowledge of an organism's development, disease, and evolution. The building blocks of this complex molecular machinery are an organism's genomic DNA sequence and transcription factor proteins. Despite the vast amount of sequence data now available for many model organisms, predicting where transcription factors bind, often referred to as 'motif detection', is still incredibly challenging. In this study, we develop a novel bioinformatic approach to binding site prediction. We do this by extending pre-existing SVM approaches in an unbiased way to include all possible gapped k-mers, representing different combinations of complex nucleotide dependencies within binding sites. We show the advantages of this new approach over existing SVM approaches through a rigorous set of cross-validation experiments. We also demonstrate the effectiveness of our new approach by reporting on its improved performance on a set of 127 genomic regions known to regulate gene expression along the anterior-posterior axis in early Drosophila embryos.
Date
2017
Identifier
DOI: 10.1371/journal.pone.0185570
Source
PLoS ONE
Publisher
Public Library of Science (PLoS)
Coverage
Science, Medicine
Language
EN
Collection
Citation
Abdulkadir Elmas, Xiao-Dong Wang, Jacqueline M. Dresch, “The folded k-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies,” SOCICT Open, accessed May 30, 2026, http://socictopen.socict.org/items/show/324.