<?xml version="1.0" encoding="UTF-8"?>
<item xmlns="http://omeka.org/schemas/omeka-xml/v5" itemId="702" public="1" featured="0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://omeka.org/schemas/omeka-xml/v5 http://omeka.org/schemas/omeka-xml/v5/omeka-xml-5-0.xsd" uri="http://socictopen.socict.org/items/show/702?output=omeka-xml" accessDate="2026-06-10T20:58:14+00:00">
  <fileContainer>
    <file fileId="702">
      <src>http://socictopen.socict.org/files/original/54548cf6aa77e56df576f8a94bffbac0.pdf</src>
      <authentication>89227f9831102e2a85da84e16a9a4134</authentication>
    </file>
  </fileContainer>
  <collection collectionId="1">
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="1">
                <text>Coronavirus</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="41">
            <name>Description</name>
            <description>An account of the resource</description>
            <elementTextContainer>
              <elementText elementTextId="2">
                <text>Dominio científico: Coronavirus</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
  </collection>
  <itemType itemTypeId="1">
    <name>Text</name>
    <description>A resource consisting primarily of words for reading. Examples include books, letters, dissertations, poems, newspapers, articles, archives of mailing lists. Note that facsimiles or images of texts are still of the genre Text.</description>
  </itemType>
  <elementSetContainer>
    <elementSet elementSetId="1">
      <name>Dublin Core</name>
      <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
      <elementContainer>
        <element elementId="50">
          <name>Title</name>
          <description>A name given to the resource</description>
          <elementTextContainer>
            <elementText elementTextId="6577">
              <text>Optimization of &lt;it&gt;de novo&lt;/it&gt; transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="39">
          <name>Creator</name>
          <description>An entity primarily responsible for making the resource</description>
          <elementTextContainer>
            <elementText elementTextId="6578">
              <text>Haznedaroglu Berat Z, Reeves Darryl, Rismani-Yazdi Hamid, Peccia Jordan</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="41">
          <name>Description</name>
          <description>An account of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="6579">
              <text>Abstract Background The k-mer hash length is a key factor affecting the output of de novo transcriptome assembly packages using de Bruijn graph algorithms. Assemblies constructed with varying single k-mer choices might result in the loss of unique contiguous sequences (contigs) and relevant biological information. A common solution to this problem is the clustering of single k-mer assemblies. Even though annotation is one of the primary goals of a transcriptome assembly, the success of assembly strategies does not consider the impact of k-mer selection on the annotation output. This study provides an in-depth k-mer selection analysis that is focused on the degree of functional annotation achieved for a non-model organism where no reference genome information is available. Individual k-mers and clustered assemblies (CA) were considered using three representative software packages. Pair-wise comparison analyses (between individual k-mers and CAs) were produced to reveal missing Kyoto Encyclopedia of Genes and Genomes (KEGG) ortholog identifiers (KOIs), and to determine a strategy that maximizes the recovery of biological information in a de novo transcriptome assembly. Results Analyses of single k-mer assemblies resulted in the generation of various quantities of contigs and functional annotations within the selection window of k-mers (k-19 to k-63). For each k-mer in this window, generated assemblies contained certain unique contigs and KOIs that were not present in the other k-mer assemblies. Producing a non-redundant CA of k-mers 19 to 63 resulted in a more complete functional annotation than any single k-mer assembly. However, a fraction of unique annotations remained (~0.19 to 0.27% of total KOIs) in the assemblies of individual k-mers (k-19 to k-63) that were not present in the non-redundant CA. A workflow to recover these unique annotations is presented. Conclusions This study demonstrated that different k-mer choices result in various quantities of unique contigs per single k-mer assembly which affects biological information that is retrievable from the transcriptome. This undesirable effect can be minimized, but not eliminated, with clustering of multi-k assemblies with redundancy removal. The complete extraction of biological information in de novo transcriptomics studies requires both the production of a CA and efforts to identify unique contigs that are present in individual k-mer assemblies but not in the CA.</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="40">
          <name>Date</name>
          <description>A point or period of time associated with an event in the lifecycle of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="6580">
              <text>2012</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="43">
          <name>Identifier</name>
          <description>An unambiguous reference to the resource within a given context</description>
          <elementTextContainer>
            <elementText elementTextId="6581">
              <text>DOI: 10.1186/1471-2105-13-170</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="48">
          <name>Source</name>
          <description>A related resource from which the described resource is derived</description>
          <elementTextContainer>
            <elementText elementTextId="6582">
              <text>BMC Bioinformatics</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="45">
          <name>Publisher</name>
          <description>An entity responsible for making the resource available</description>
          <elementTextContainer>
            <elementText elementTextId="6583">
              <text>BMC</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="38">
          <name>Coverage</name>
          <description>The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant</description>
          <elementTextContainer>
            <elementText elementTextId="6584">
              <text>Biology (General), Computer applications to medicine. Medical informatics</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="44">
          <name>Language</name>
          <description>A language of the resource</description>
          <elementTextContainer>
            <elementText elementTextId="6585">
              <text>EN</text>
            </elementText>
          </elementTextContainer>
        </element>
      </elementContainer>
    </elementSet>
  </elementSetContainer>
</item>
