<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Annotating a sequence not yet published]]></title>
	<link>https://bioinformaticsonline.com/answers/view/10479/annotating-a-sequence-not-yet-published?</link>
	<atom:link href="https://bioinformaticsonline.com/answers/view/10479/annotating-a-sequence-not-yet-published?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/answers/view/10479/annotating-a-sequence-not-yet-published</guid>
	<pubDate>Wed, 07 May 2014 15:26:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/answers/view/10479/annotating-a-sequence-not-yet-published</link>
	<title><![CDATA[Annotating a sequence not yet published]]></title>
	<description><![CDATA[<p>I have a 198,991 bp sequence. I ran it through Maker, Glimmer, and GeneMarkS, and all three gave me a different number of predicted genes. The genes discussed here are the ones predicted by GeneMarkS.<br /><br />When I use BLASTX to find similar genes, the hits are sometimes highly dissimilar to the query. What does that indicate?<br /><br />For example, for this gene:<br /><br />&gt;gene_id_13<br />ATGTTCGCGGGCGTAGTGCGCAGTTTCGTGCAGTGGAGACGAGTCGATGACACCGCCGTC<br />AGAGACGGTGAATGGCAAGACGAACGCGGCCGGATTGACTGGTTCGAATGA<br /><br />Running BLASTX on gene 13 returns hits with E-values of 4.3 and above and only 43% identity.</p><table id="dscTable" width="639" cellspacing="0" cellpadding="0">
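<tbody>
<tr><th>&nbsp;</th><th>Description</th><th>Max score</th><th>Total score</th><th>Query cover</th><th>E value</th><th>Ident</th><th>Accession</th></tr>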
<tr id="dtr_515988156">
<td>&nbsp;</td>
<td>
<div><a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi#alnHdr_515988156" title="Go to alignment for stage II sporulation protein D [Bacillus amyloliquefaciens]">stage II sporulation protein D [Bacillus amyloliquefaciens]</a></div>
</td>
<td>33.1</td>
<td>33.1</td>
<td>81%</td>
<td>4.3</td>
<td>43%</td>
<td><a href="http://www.ncbi.nlm.nih.gov/protein/515988156?report=genbank&amp;log$=prottop&amp;blast_rank=1&amp;RID=PN3UYJW5014" target="lnkPN3UYJW5014" title="Show report for WP_017418739.1">WP_017418739.1</a></td>
</tr>
</tbody>
</table><p>&gt;gene_id_14<br />ATGCTCGGACTGATGAAGGCCTGCAAAAAGCTCGGCCTGTCGTTCTGGCAGTATCTCTGT<br />GATCGCATCGGTGTCGATGGCCAGGCCATTCCGCCGCTGGCCGCCCTTGTCGGGGCAAAA<br />GCCTAA</p><p>With this gene, I got the following result:</p><table width="640" cellspacing="0" cellpadding="0">
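<tbody>
<tr><th>&nbsp;</th><th>Description</th><th>Max score</th><th>Total score</th><th>Query cover</th><th>E value</th><th>Ident</th><th>Accession</th></tr>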
<tr id="dtr_583962256">
<td>&nbsp;</td>
<td>
<div><a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi#alnHdr_583962256" title="Go to alignment for putative protein y4jO [Rhizobium sp. LPU83]">putative protein y4jO [Rhizobium sp. LPU83]</a></div>
</td>
<td>76.6</td>
<td>76.6</td>
<td>97%</td>
<td>5e-16</td>
<td>80%</td>
<td><a href="http://www.ncbi.nlm.nih.gov/protein/583962256?report=genbank&amp;log$=prottop&amp;blast_rank=1&amp;RID=PN50601N015" target="lnkPN50601N015" title="Show report for CDM61676.1">CDM61676.1</a></td>
</tr>
</tbody>
</table><p>Now, if I combine genes 13 and 14, I get:</p><table width="638" cellspacing="0" cellpadding="0">
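<tbody>
<tr><th>&nbsp;</th><th>Description</th><th>Max score</th><th>Total score</th><th>Query cover</th><th>E value</th><th>Ident</th><th>Accession</th></tr>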
<tr>
<td>&nbsp;</td>
<td>
<div><a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi#alnHdr_583962256" title="Go to alignment for putative protein y4jO [Rhizobium sp. LPU83]">putative protein y4jO [Rhizobium sp. LPU83]</a></div>
</td>
<td>78.2</td>
<td>78.2</td>
<td>92%</td>
<td>4e-16</td>
<td>57%</td>
<td><a href="http://www.ncbi.nlm.nih.gov/protein/583962256?report=genbank&amp;log$=prottop&amp;blast_rank=1&amp;RID=PN4YGYVB014" target="lnkPN4YGYVB014" title="Show report for CDM61676.1">CDM61676.1</a></td>
</tr>
</tbody>
</table><p>What does that suggest?</p><p>Also, how do I determine the correct number of genes when all three tools predict different numbers?</p><p>Thanks</p>]]></description>
	<dc:creator>ruchira</dc:creator>
</item>

</channel>
</rss>