<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/31371?offset=120</link>
	<atom:link href="https://bioinformaticsonline.com/related/31371?offset=120" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/30104/structural-variation-the-hidden-genomic-treasure</guid>
	<pubDate>Sat, 10 Dec 2016 16:19:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/30104/structural-variation-the-hidden-genomic-treasure</link>
	<title><![CDATA[Structural variation: the hidden genomic treasure]]></title>
	<description><![CDATA[<p>Genome re-sequencing projects have revealed substantial amounts of genetic variation between individuals extending beyond single nucleotide polymorphisms (SNPs) and short indels. Structural Variations (SVs) and Copy Number Variations (CNVs) are a major source of genomic variation. However, compared to SNPs, accurate detection, genotyping and understanding of CNVs lag behind because SV/CNV detection and analysis pose much greater analytical challenges. In our lab we analyse SVs/CNVs using high-throughput sequencing and different analytical approaches.&nbsp;The most‐studied structural variants are copy number variations (CNVs), which can be generated by several different mechanisms including non‐allelic homologous recombination, non‐homologous end‐joining and deoxyribonucleic acid (DNA) replication‐related fork stalling and template switching. CNVs are closely related to segmental duplications (SDs): SDs can stimulate the formation of CNVs and themselves started out as CNVs, but became fixed in a species. Structural variation can be neutral but has also influenced our phenotypic evolution, for example our susceptibility to disease and our ability to digest certain types of food. Our understanding of the extent of structural variation is increasing rapidly, but it will be much more difficult to understand its phenotypic consequences.&nbsp;</p><p><img src="http://www.nature.com/nmeth/journal/v9/n2/images/nmeth.1858-F3.jpg" alt="image" width="946" height="603" style="border: 0px;"></p><p>Structural variants (SVs) such as deletions, insertions, duplications, inversions and translocations litter genomes and are often associated with gene expression changes and severe phenotypes (e.g. genetic diseases in humans). Recent studies on the functional aspects of different types of SVs have unveiled several cases of adaptive evolution. 
For example, inversions have been associated with ecological adaptations and may facilitate speciation. Due to their prevalent nature, SVs arguably have a large impact on genome evolution and should not be neglected when studying the genetics of adaptation and speciation.&nbsp;SVs were classically defined as chromosomal rearrangements larger than 1 kb, but owing to the higher resolution of new detection methods, smaller variants (between 50 and 1000 base pairs) can now be accurately assessed. Besides various methods of detection in next-generation sequencing data (paired-end mapping, split reads, and depth of coverage), array-based approaches have proven to be particularly useful for detecting copy number variations (CNVs). These technologies have enabled researchers to catalog a wide spectrum of SVs in many organisms and infer the effects of selection shaping their evolutionary trajectories.</p><p><strong>Structural variation sequencing signatures (Source: Nature Reviews Genetics)</strong></p><p><img src="http://www.nature.com/nrg/journal/v12/n5/images/nrg2958-f2.jpg" alt="image" width="800" height="824" style="border: 0px;"></p><p>Related tools, databases and publications are listed below. If you know of any interesting papers, please let us know in the comment section.</p><p><br /><strong>Key concepts</strong></p><p>Structural variation includes balanced variants such as inversions and translocations, and unbalanced ones such as duplications and deletions (copy number variations or CNVs).</p><p>Structural variants can arise by several mechanisms, including nonallelic homologous recombination (NAHR), nonhomologous end‐joining (NHEJ) and DNA replication‐based fork stalling and template switching (FoSTeS).</p><p>CNV is closely linked to segmental duplication, but is not exactly the same. 
Segmental duplications can stimulate CNV formation by NAHR, and themselves arise from CNVs that have become fixed.</p><p>Segmental duplications did not appear uniformly during the evolution of the Great Ape species, but rather during a burst of activity around the time of the divergence of gorilla from the human/chimpanzee ancestor.</p><p>Duplicated genes play a critical role in the evolution of a genome as they act as &lsquo;spare parts&rsquo; that can evolve to perform new or more specialized functions.</p><p>Effects of structural variation on gene expression can be identified but only a few examples of the consequences for species biology have been documented.</p><p><strong style="font-size: 12.8px;">Tools</strong></p><p><a href="http://sv.gersteinlab.org/cnvnator">CNVnator</a>, a tool for CNV discovery and genotyping from depth of read mapping. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21293372">2011a</a>, <a href="http://www.ncbi.nlm.nih.gov/pubmed/21324876">2011b</a></p><p><a href="http://sv.gersteinlab.org/age">AGE</a>, a tool that implements an algorithm for optimal alignment of sequences with SVs. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21233167">2011</a></p><p><a href="http://sv.gersteinlab.org/breakseq">BreakSeq</a>, a pipeline for annotation, classification and analysis of SVs at single nucleotide resolution. <a href="http://www.ncbi.nlm.nih.gov/pubmed/20037582">2010</a></p><p><a href="http://sv.gersteinlab.org/pemer">PEMer</a>, a computational and simulation framework for discovering SVs by paired-end read mapping. <a href="http://www.ncbi.nlm.nih.gov/pubmed/19236709">2009</a>, <a href="http://www.ncbi.nlm.nih.gov/pubmed/17901297">2007</a></p><p>GASV https://code.google.com/archive/p/gasv/</p><p>PAIROSCOPE http://pairoscope.sourceforge.net/</p><p>SVDetect&nbsp;http://svdetect.sourceforge.net/Site/Home.html</p><p>BreakPtr, discovery of unbalanced structural variants (copy-number variants) with tiling microarrays&nbsp;<a 
href="http://tiling.mbb.yale.edu/BreakPtr/" target="_top">Link</a>&nbsp;</p><p>R Package&nbsp;https://www.bioconductor.org/help/course-materials/2010/EMBL2010/Practical-4-StructuralVariants.pdf<br /><br />BreakSeq, structural variant genotyping using split reads&nbsp;<a href="http://sv.gersteinlab.org/breakseq/" target="_top">Link</a>&nbsp;<br /><br />CopySeq, genotyping of unbalanced structural variants (copy-number variants) using read-depth&nbsp;<a href="http://www.korbel.embl.de/CopySeq/" target="_top">Link</a>&nbsp;<br /><br />DELLY2, integrated structural variant discovery, genotyping and visualization in deep sequencing data&nbsp;<a href="https://github.com/dellytools/delly" target="_top">Link</a>&nbsp;<br /><br />PEMer, structural variant discovery in 454 sequencing data by paired-end mapping&nbsp;<a href="http://www.korbel.embl.de/PEMer/" target="_top">Link</a>&nbsp;<br /><br />TIGER, transduction inference in germline genomes using short read data&nbsp;<a href="https://github.com/jelena-tica/TIGER" target="_top">Link</a>&nbsp;</p><p>MANTA&nbsp;https://github.com/Illumina/manta</p><p>SV-Bay&nbsp;https://github.com/InstitutCurie/SV-Bay</p><p>BreakDancer&nbsp;http://breakdancer.sourceforge.net/</p><p>Variation Hunter&nbsp;http://compbio.cs.sfu.ca/software-variation-hunter</p><p>Lumpy&nbsp;https://github.com/arq5x/lumpy-sv</p><p>ForestSV&nbsp;http://sebatlab.ucsd.edu/index.php/software-data&nbsp;</p><p>PBSuites for long reads&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong>Visualization</strong></p><p>The SV visualization tool:&nbsp;<a href="http://genomesavant.com/savant/">http://genomesavant.com/savant/</a></p><p>InGAP-SV (<a href="http://ingap.sourceforge.net/">http://ingap.sourceforge.net/</a>) is a nice tool for both detection and visualisation of several kinds of structural variation (large insertions, translocations, deletions, inversions, ...)&nbsp;</p><p>Tools table: 
http://www.nature.com/nbt/journal/v29/n8/fig_tab/nbt.1904_T2.html</p><p>Variation Viewer https://www.ncbi.nlm.nih.gov/variation/view/</p><p><strong style="font-size: 12.8px;">Papers</strong></p><p>http://www.nature.com/nmeth/journal/v9/n2/full/nmeth.1858.html</p><p>http://journal.frontiersin.org/researchtopic/1412/structural-variations-in-genomes-ecological-and-evolutionary-implications</p><p>http://www.mi.fu-berlin.de/wiki/pub/ABI/GenomicsLecture10Materials/structural-variation.pdf</p><p>http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1479-3</p><p>https://www.ncbi.nlm.nih.gov/dbvar/content/overview/</p><p>http://www.nature.com/subjects/structural-variation</p><p>https://eichlerlab.gs.washington.edu/news/NatMeth_Feb2012.pdf</p><p>https://www.ncbi.nlm.nih.gov/pubmed/19477992 ***</p><p>https://www.ncbi.nlm.nih.gov/pubmed/22452995</p><p>http://biorxiv.org/content/early/2016/09/06/073833</p><p>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4479793/</p><p>http://www.nature.com/articles/srep18501</p><p>http://www.genetics.org/content/202/1/351</p><p>http://www.cs.cmu.edu/~sssykim/teaching/s13/slides/Lecture_SVI.pdf</p><p>https://www.omicsonline.org/open-access/structural-variation-detection-from-next-generation-sequencing-2469-9853-S1-007.php?aid=69055</p><p>http://schatzlab.cshl.edu/presentations/2016/2016.01.12.PAG.Structural%20Variations.pdf</p><p>&nbsp;</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</guid>
	<pubDate>Mon, 19 Dec 2016 14:20:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</link>
	<title><![CDATA[pyScaf]]></title>
	<description><![CDATA[<p>pyScaf orders contigs from genome assemblies utilising several types of information:</p>
<ul>
<li>paired-end (PE) and/or mate-pair libraries (<a href="https://github.com/lpryszcz/pyScaf#ngs-based-scaffolding">NGS-based mode</a>)</li>
<li>long reads (<a href="https://github.com/lpryszcz/pyScaf#scaffolding-based-on-long-reads">long read-based mode</a>)</li>
<li>synteny to the genome of some related species (<a href="https://github.com/lpryszcz/pyScaf#reference-based-scaffolding">reference-based mode</a>)</li>
</ul>
<p>Scaffolding&nbsp;</p>
<p>In reference-based mode, pyScaf uses synteny to the genome of a closely related species to order contigs and to estimate the distances between adjacent contigs.</p>
<p>Contigs are aligned globally (end-to-end) onto reference chromosomes, ignoring:</p>
<ul>
<li>matches not satisfying cut-offs (<code>--identity</code>&nbsp;and&nbsp;<code>--overlap</code>)</li>
<li>suboptimal matches (only best match of each query to reference is kept)</li>
<li>matches that overlap another match on the reference (these are removed).</li>
</ul>
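<p>The filtering steps above can be sketched in Python as follows. This is only an illustrative sketch, not pyScaf's actual code: the <code>Hit</code> record, the <code>filter_hits</code> function, the default cut-off values (chosen arbitrarily) and the definition of overlap as the aligned fraction of the query are all assumptions made for this example.</p>
<pre>
```python
from collections import namedtuple

# Hypothetical alignment record; the field names are assumptions, not pyScaf's.
Hit = namedtuple("Hit", "query qlen ref rstart rend identity score")


def filter_hits(hits, min_identity=0.5, min_overlap=0.5):
    """Return the hits that survive the three filters, sorted along the reference."""
    # 1. Drop hits that fail the identity or overlap cut-offs.
    passing = [
        h for h in hits
        if h.identity >= min_identity
        and (h.rend - h.rstart) / h.qlen >= min_overlap
    ]
    # 2. Keep only the best-scoring hit of each query.
    best = {}
    for h in passing:
        if h.query not in best or h.score > best[h.query].score:
            best[h.query] = h
    # 3. Visit hits from best to worst score and skip any hit that overlaps
    #    an already accepted hit on the same reference sequence.
    accepted = []
    for h in sorted(best.values(), key=lambda x: x.score, reverse=True):
        clash = any(
            a.ref == h.ref and min(a.rend, h.rend) > max(a.rstart, h.rstart)
            for a in accepted
        )
        if not clash:
            accepted.append(h)
    return sorted(accepted, key=lambda h: (h.ref, h.rstart))
```
</pre>
<p>Sorting the surviving hits by reference name and start position gives a candidate contig order for each reference chromosome. Resolving overlaps in favour of the higher-scoring hit is one possible choice, and the real tool may resolve them differently.</p>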
<p>In preliminary tests, pyScaf performed very well on simulated heterozygous genomes based on&nbsp;<em>C. parapsilosis</em>&nbsp;(13 Mb; CANPA) and&nbsp;<em>A. thaliana</em>&nbsp;(119 Mb; ARATH) chromosomes, correctly reconstructing all chromosomes in every run for CANPA and in nearly every run for ARATH (<a href="https://www.dropbox.com/sh/bb7lwggo40xrwtc/AAAZ7pByVQQQ-WhUXZVeJaZVa/pyScaf?dl=0">Figures in dropbox</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=2036953672">CANPA table</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=1920757821">ARATH table</a>).<br>Runs took ~0.5 min for CANPA on&nbsp;<code>4 CPUs</code>&nbsp;and ~2 min for ARATH on&nbsp;<code>16 CPUs</code>.</p>
<p><span>Important remarks:</span></p>
<ul>
<li>Reduce your assembly first (with fasta2homozygous.py), as any redundancy will likely break the synteny.</li>
<li>pyScaf works better with contigs than with scaffolds, as scaffolds are often affected by mis-assemblies (no&nbsp;<em>de novo assembler</em>&nbsp;/ scaffolder is perfect...), which break synteny.</li>
<li>pyScaf works very well if the divergence between the reference genome and the assembled contigs is below 20% at the nucleotide level.</li>
<li>pyScaf deals with large rearrangements, i.e. deletions, insertions, inversions and translocations.&nbsp;<span>Note, however, that this is an experimental implementation!</span></li>
<li>Consider closing gaps after scaffolding.</li>
</ul><p>Address of the bookmark: <a href="https://github.com/lpryszcz/pyScaf" rel="nofollow">https://github.com/lpryszcz/pyScaf</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31012/genomecomp</guid>
	<pubDate>Fri, 17 Feb 2017 08:38:32 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31012/genomecomp</link>
	<title><![CDATA[GenomeComp]]></title>
	<description><![CDATA[<p>GenomeComp is a tool for summarizing, parsing and visualizing genome-wide sequence comparison results derived from voluminous BLAST textual output, so as to locate rearrangements, insertions or deletions of genome segments between species or strains.<br><br>It can easily be used to compare, parse and visualize large genomic sequences, especially those of closely related genomes such as different species or strains. In addition, it can show other sequence features, such as the distribution of repeat sequences within a whole-genome DNA sequence, by comparing the genome to itself.<br><br>It is a stand-alone graphical user interface (GUI) program written in Perl/Tk that runs on Linux, Unix, Mac OS X (tested on version 10.2.4 only) and Microsoft Windows.</p><p>Address of the bookmark: <a href="http://www.mgc.ac.cn/GenomeComp/" rel="nofollow">http://www.mgc.ac.cn/GenomeComp/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31568/pacbio-long-reads-compatible-software-and-tools</guid>
	<pubDate>Wed, 15 Mar 2017 14:19:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31568/pacbio-long-reads-compatible-software-and-tools</link>
	<title><![CDATA[PacBio Long Reads Compatible Software and Tools]]></title>
	<description><![CDATA[<p>The following software packages are known to be compatible with PacBio&reg; data, in addition to PacBio's own SMRT&reg; Analysis suite. All packages are believed to be open source or freely available for non-commercial use. See the individual project sites for up-to-date license information. A separate page lists&nbsp;<a href="http://pacb.com/community/partner_program/current_partners/">commercial software</a>.</p>
<p>Know of any other open source software for PacBio data?&nbsp;<a href="mailto:devnet@pacificbiosciences.com">Email us</a>.</p>
<p>Software categories:</p>
<ul>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#denovo">De novo assembly</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#svdetection">Structural Variations Detection</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#aligners">Reference-based alignment</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#variants">Consensus and variant calling</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#RNA">RNA analysis</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#basemods">Epigenetic base modifications and methylation</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#barcoding">Barcoding</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#browsers">Genome Browsers</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#qc">Run QC</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#frameworks">Frameworks and APIs</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software" rel="nofollow">https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software</a></p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42166/software-for-genome-assembly</guid>
	<pubDate>Sun, 30 Aug 2020 09:51:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42166/software-for-genome-assembly</link>
	<title><![CDATA[Software for genome assembly !]]></title>
	<description><![CDATA[<p>Bioinformatics tools and software for genome assembly, with their websites:</p><p>1 Falcon&nbsp;https://github.com/PacificBiosciences/pb-assembly</p><p>2 Canu assembler http://canu.readthedocs.io/en/latest/index.html</p><p>3 Miniasm assembler https://github.com/lh3/miniasm</p><p>4 PBJelly scaffolding tool https://sourceforge.net/projects/pb-jelly/</p><p>5 ARCS scaffolding tool https://github.com/bcgsc/arcs</p><p>6 Redundans reduction and scaffolding tool https://github.com/Gabaldonlab/redundans</p><p>7 Arrow error correction https://github.com/PacificBiosciences/GenomicConsensus</p><p>8 PILON error correction https://github.com/broadinstitute/pilon/wiki</p><p>9 BUSCO single copy gene markers http://busco.ezlab.org/</p><p>10 Bandage graph assembly viewer https://rrwick.github.io/Bandage/</p><p>11 Gepard dot-plot tool http://cube.univie.ac.at/gepard</p><p>12 MUMmer aligner and plotter http://mummer.sourceforge.net/</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</guid>
	<pubDate>Tue, 01 Feb 2022 23:42:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</link>
	<title><![CDATA[odgi: optimized dynamic genome/graph implementation]]></title>
	<description><![CDATA[<p dir="auto"><code>odgi</code>&nbsp;provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.</p>
<p dir="auto">Careful encoding of graph entities allows&nbsp;<code>odgi</code>&nbsp;to efficiently compute and transform&nbsp;<a href="https://pangenome.github.io/">pangenomes</a>&nbsp;with minimal overheads.&nbsp;<code>odgi</code>&nbsp;implements a dynamic data structure that leverages multi-core CPUs and can be updated on the fly.</p>
<p dir="auto">The edges and path steps are recorded as deltas between the current node id and the target node id, where the node id corresponds to the rank in the global array of nodes. Graphs built from biological data sets tend to have local partial order and, when sorted, the deltas are small. This allows them to be compressed with a variable length integer representation, resulting in a small in-memory footprint at the cost of packing and unpacking.</p>
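<p dir="auto">The idea of storing small deltas as variable-length integers can be illustrated with a short Python sketch. It is not odgi's actual on-disk or in-memory format: the function names are hypothetical, and the zigzag mapping used to handle negative deltas is an assumption made for this example.</p>
<pre>
```python
def zigzag(n):
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(z):
    """Inverse of zigzag()."""
    return z // 2 if z % 2 == 0 else -((z + 1) // 2)


def encode_varint(value):
    """Encode a non-negative integer using 7 payload bits per byte."""
    out = bytearray()
    while value >= 128:
        out.append(value % 128 + 128)  # high bit set: more bytes follow
        value //= 128
    out.append(value)                  # high bit clear: last byte
    return bytes(out)


def encode_deltas(node_id, target_ids):
    """Store each target as a zigzag varint of (target - node_id)."""
    return b"".join(encode_varint(zigzag(t - node_id)) for t in target_ids)


def decode_deltas(node_id, data):
    """Recover the target ids written by encode_deltas()."""
    targets, value, factor = [], 0, 1
    for byte in data:
        value += (byte % 128) * factor
        if byte >= 128:                # continuation byte
            factor *= 128
        else:                          # last byte of this integer
            targets.append(node_id + unzigzag(value))
            value, factor = 0, 1
    return targets
```
</pre>
<p dir="auto">With this particular encoding, any delta between -64 and 63 occupies a single byte, while larger jumps in the graph cost additional bytes. That is the trade-off described above: a small footprint in well-sorted regions, paid for with packing and unpacking work.</p>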
<p dir="auto">The RAM and computational savings are substantial. In partially ordered regions of the graph, most deltas will require only a single byte.</p><p>Address of the bookmark: <a href="https://github.com/pangenome/odgi" rel="nofollow">https://github.com/pangenome/odgi</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</guid>
	<pubDate>Mon, 24 Jul 2023 07:04:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</link>
	<title><![CDATA[Bioinformatics tools for genome assembly !]]></title>
	<description><![CDATA[<p>There are numerous genome assembly tools available, each with its strengths and weaknesses. Here is a list of some widely used genome assembly tools as of September 2021:</p><ol>
<li>
<p><span>SPAdes:</span> An assembler specifically designed for single-cell and multi-cell bacterial genomes, as well as small eukaryotic genomes.</p>
</li>
<li>
<p><span>ABySS:</span> A parallelized assembler for large genomes that uses de Bruijn graphs.</p>
</li>
<li>
<p><span>Velvet:</span> Another de Bruijn graph-based assembler optimized for short-read sequencing data.</p>
</li>
<li>
<p><span>SOAPdenovo:</span> A de Bruijn graph-based assembler designed for short reads, widely used for assembling large and complex genomes.</p>
</li>
<li>
<p><span>MaSuRCA:</span> A hybrid assembler that combines data from multiple sequencing technologies, such as Illumina and PacBio.</p>
</li>
<li>
<p><span>Canu:</span> A long-read assembler optimized for PacBio and Oxford Nanopore sequencing data.</p>
</li>
<li>
<p><span>Flye:</span> A long-read assembler suitable for bacterial and small eukaryotic genomes.</p>
</li>
<li>
<p><span>SMARTdenovo:</span> An assembler designed for long reads, particularly suited for PacBio data.</p>
</li>
<li>
<p><span>hybridSPAdes:</span> A hybrid mode of SPAdes that combines short reads with long reads, such as those from PacBio or Nanopore.</p>
</li>
<li>
<p><span>Minia:</span> An assembler optimized for low memory consumption, suitable for small and medium-sized genomes.</p>
</li>
<li>
<p><span>Unicycler:</span> A hybrid assembler that combines short and long reads for circular bacterial genome assembly.</p>
</li>
<li>
<p><span>wtdbg2:</span> A fuzzy de Bruijn graph assembler for long reads, efficient for very large genomes.</p>
</li>
<li>
<p><span>Shasta:</span> A fast long-read assembler designed primarily for Oxford Nanopore data.</p>
</li>
<li>
<p><span>Sparc:</span> A consensus (polishing) tool for assemblies built from noisy long reads, such as PacBio or Nanopore data.</p>
</li>
<li>
<p><span>CANA:</span> An assembler for metagenomic data, particularly for complex and diverse microbial communities.</p>
</li>
<li>
<p><span>Ra:</span> A de novo assembler for raw long reads, based on the Overlap-Layout-Consensus approach.</p>
</li>
</ol><p>Please note that the field of bioinformatics is constantly evolving, and new assembly tools may have emerged since this list was compiled. Additionally, the performance of these tools can vary depending on the characteristics of the sequencing data and the genome being assembled. When selecting an assembly tool, consider the specific requirements of your project, the available data types, and the computational resources at your disposal. Always refer to the respective tool's documentation and publications for the most up-to-date information and recommendations.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26303/maker</guid>
	<pubDate>Sun, 07 Feb 2016 15:59:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26303/maker</link>
	<title><![CDATA[MAKER]]></title>
	<description><![CDATA[<p>MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab initio gene predictions and automatically synthesizes these data into gene annotations with evidence-based quality values.</p>
<p>More at http://www.yandell-lab.org/software/maker.html</p><p>Address of the bookmark: <a href="http://www.yandell-lab.org/software/maker.html" rel="nofollow">http://www.yandell-lab.org/software/maker.html</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30124/understanding-greedy-algorithms</guid>
	<pubDate>Mon, 12 Dec 2016 04:37:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30124/understanding-greedy-algorithms</link>
	<title><![CDATA[Understanding Greedy Algorithms]]></title>
	<description><![CDATA[<p>Learning greedy algorithms, for biologists.&nbsp;</p>
<p>https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/</p>
<p>These pages are also useful on the same topic:</p>
<p>http://learninglover.com/examples.php?id=59</p>
<p>http://www.cs.rpi.edu/~magdon/ps/conference/super_biokdd.pdf</p>
<p>https://ocw.mit.edu/courses/biology/7-91j-foundations-of-computational-and-systems-biology-spring-2014/lecture-slides/MIT7_91JS14_Lecture6.pdf</p>
<p>http://schatzlab.cshl.edu/teaching/AssemblyClass/01.%20Assembly%20Intro.pdf</p>
<p>http://lsl.sinica.edu.tw/Services/Class/files/20150612449.pdf</p>
<p>http://www.cs.jhu.edu/~langmea/resources/lecture_notes/assembly_scs.pdf</p>
<p>https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-43.pdf</p><p>Address of the bookmark: <a href="https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/" rel="nofollow">https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30625/pandaseq</guid>
	<pubDate>Mon, 23 Jan 2017 04:54:32 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30625/pandaseq</link>
	<title><![CDATA[PANDASEQ]]></title>
	<description><![CDATA[<p>PANDASEQ assembles paired-end Illumina reads into sequences, trying to correct for errors and uncalled bases. The assembler reads two files in FASTQ format with quality information. If amplification primers were used (e.g., to isolate a variable region of the 16S gene, or the constant regions around zinc finger binding residues), they can be removed from the sequence during assembly. In the final sequence, any uncalled bases in the overlapping region are corrected using the complementary strand. When mismatches occur in the overlapping region, the base with the better quality score is chosen.<br>The algorithm is as follows:<br><br>1. Find the positions where the forward and reverse primers match best above the threshold and discard the ends of the sequence, including the primer.<br>2. Pick an overlap that maximises the probability of the forward and reverse reads having come from a single piece of DNA.<br>3. Identify read ends masked with the quality score B or #, as done by CASAVA, and adjust the probabilities in this region.<br>4. Construct an assembled sequence between the primers and calculate the quality.<br>5. Check for various constraints, including quality, length, uncalled bases, and user-supplied modules.</p>
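<p>Steps 2 and 4 above, together with the rules for uncalled bases and mismatches, can be illustrated with a simplified Python sketch. It is not PANDASEQ's implementation: the function names are hypothetical, the overlap is chosen with a plain "matches minus mismatches" score instead of the probabilistic scoring described above, and primer removal, CASAVA masking and the final constraint checks are left out.</p>
<pre>
```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T and N."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=5):
    """Merge a forward read with its reverse mate.

    Quality values are lists of integers, one per base. Returns the merged
    sequence, or None if no overlap of at least min_overlap bases has a
    positive score.
    """
    rev = reverse_complement(rev)
    rev_qual = rev_qual[::-1]
    best_len, best_score = None, 0
    for length in range(min_overlap, min(len(fwd), len(rev)) + 1):
        tail, head = fwd[-length:], rev[:length]
        # Simple score: matches minus mismatches; N never counts as a mismatch.
        score = sum(
            0 if "N" in (a, b) else (1 if a == b else -1)
            for a, b in zip(tail, head)
        )
        if score > best_score:
            best_len, best_score = length, score
    if best_len is None:
        return None
    offset = len(fwd) - best_len
    merged = list(fwd[:offset])
    for i in range(best_len):
        a, qa = fwd[offset + i], fwd_qual[offset + i]
        b, qb = rev[i], rev_qual[i]
        if a == "N":
            merged.append(b)   # uncalled base filled in from the mate
        elif b == "N" or qa >= qb:
            merged.append(a)
        else:
            merged.append(b)   # mismatch: the higher-quality base wins
    merged.extend(rev[best_len:])
    return "".join(merged)
```
</pre>
<p>For example, a forward read covering the first 14 bases of a 20-base fragment and a reverse read covering its last 12 bases share a 6-base overlap, and merging them reconstructs the full fragment.</p>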
<p>http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html</p><p>Address of the bookmark: <a href="http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html" rel="nofollow">http://neufeldserver.uwaterloo.ca/~apmasell/pandaseq_man1.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>

</channel>
</rss>