A dictionary based informational genome analysis

Title

A dictionary based informational genome analysis

Authors

Castellini Alberto, Franco Giuditta, Manca Vincenzo

Description

Abstract

Background: In the post-genomic era, several computational genomics methods are emerging to explain how information is structured within whole genomes. The literature of the last five years describes several alignment-free methods, which have arisen as alternative metrics for the dissimilarity of biological sequences. Among these, recent approaches are based on the empirical frequencies of DNA k-mers in whole genomes.

Results: Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, so that a systemic view replaces local sequence analysis. A software prototype applying the methodology outlined here carried out computations on genomic data: we computed informational indexes and built genomic dictionaries of different sizes, along with their frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these indexes in a database, and index analysis and visualization. The methodology was validated by investigating the genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of keen interest in biology (for example, to identify over-represented functional sequences such as promoters), was discussed and suggested a method for defining synthetic genetic networks.

Conclusions: We introduced a dictionary-based methodology and an efficient motif-finding software application for comparative genomics. This approach could be extended along many lines of investigation, for example by exporting it to other contexts of computational genomics as a basis for discriminating genomic pathologies.
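The notion of a genomic dictionary used in the abstract can be illustrated with a minimal Python sketch. It collects the k-mers (words of length k) of a sequence with their empirical frequencies, computes the Shannon entropy of that distribution as one possible informational index, and lists the repeated words. The function names, the toy sequence, and the choice of entropy as the index are assumptions made for illustration; the record does not specify which indexes the authors' software computes.

```python
from collections import Counter
from math import log2


def kmer_dictionary(genome: str, k: int) -> Counter:
    """Count every length-k word (factor) occurring in the sequence."""
    genome = genome.upper()
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def empirical_entropy(counts: Counter) -> float:
    """Shannon entropy (in bits) of the empirical k-mer distribution.

    Illustrative index only; not necessarily one used by the authors.
    """
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def repeats(counts: Counter) -> dict:
    """Words that occur more than once in the sequence."""
    return {word: c for word, c in counts.items() if c > 1}


# Hypothetical toy sequence, not real genomic data.
toy_sequence = "ACGTACGTAC"
dictionary = kmer_dictionary(toy_sequence, k=2)
entropy = empirical_entropy(dictionary)
repeated = repeats(dictionary)

print(len(dictionary), "distinct 2-mers")
print(f"entropy: {entropy:.3f} bits")
print("repeats:", repeated)
```

For whole genomes and larger k, an in-memory `Counter` may become impractical; specialized k-mer counting tools are a more realistic choice at that scale.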

Date

2012

Subject

Comparative genomics, Computational genomics, Genome clustering, Information theory, Sequence analysis

Identifier

DOI: 10.1186/1471-2164-13-485

Source

BMC Genomics

Publisher

BMC

Coverage

Genetics, Biotechnology

Language

EN

Files

https://socictopen.socict.org/files/to_import/pdfs/article 2081.pdf

Collection

Citation

Castellini Alberto, Franco Giuditta, Manca Vincenzo, “A dictionary based informational genome analysis,” SOCICT Open, accessed April 18, 2026, https://socictopen.socict.org/items/show/2027.
