<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/28269?</link>
	<atom:link href="https://bioinformaticsonline.com/related/28269?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</guid>
	<pubDate>Sat, 01 Oct 2016 14:45:02 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</link>
	<title><![CDATA[Nemo – A stochastic, individual-based, genetically explicit simulation platform]]></title>
	<description><![CDATA[<ul>
<li>
<p>A&nbsp;<strong>recombination map</strong>&nbsp;has been added for all multi-locus traits. The map positions (chromosomal) for neutral markers (e.g. SNPs) and loci under selection (QTLs, deleterious mutations, DMIs) can now be specified explicitly, or set at random. The map can hold an unlimited number of loci of different types jointly, at any recombination scale (cM or lower). The effects of linkage can thus be finely explored.</p>
</li>
<li>
<p>A new trait coding for (Bateson-)<strong>Dobzhansky-Muller incompatibility loci</strong>. Multiple haploid or diploid pairs of incompatible loci can be spread throughout the genome and affect individual fitness.</p>
</li>
<li>
<p><strong>Multi-type selection</strong>:&nbsp;<a href="http://nemo2.sourceforge.net/classIndividual.html" title="This class contains traits along with other individual information (sex, pedigree, etc. ).">Individual</a>&nbsp;fitness can be jointly determined by different types of loci under selection, such as QTLs coding for quantitative traits under spatially variable selection, universally deleterious mutations, and Dobzhansky-Muller incompatibility loci.</p>
</li>
<li>
<p><strong>An unlimited number of quantitative traits</strong>&nbsp;under different forms of selection can be modelled, based on universally pleiotropic loci with several bi- or multi-allelic models.</p>
</li>
<li>
<p><strong>Spatial and temporal variation of selection</strong>&nbsp;on quantitative traits is possible, modelling shifts of environmental conditions over time.</p>
</li>
<li>
<p>The dispersal matrix describing the movement of individuals among sub-populations can be replaced by a connectivity matrix and a reduced dispersal matrix describing migration only among the connected sub-populations. This offers a substantial gain in computing time and system memory when simulating very large grids.</p>
</li>
<li>
<p>Input parameters' arguments may be specified in separate files. This is particularly convenient when specifying large matrices.</p>
</li>
<li>
<p>Many adjustments have been made for refined control of the input of parameters and data output. See updates in the manual.</p>
</li>
</ul><p>Address of the bookmark: <a href="http://nemo2.sourceforge.net/index.html" rel="nofollow">http://nemo2.sourceforge.net/index.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/30104/structural-variation-the-hidden-genomic-treasure</guid>
	<pubDate>Sat, 10 Dec 2016 16:19:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/30104/structural-variation-the-hidden-genomic-treasure</link>
	<title><![CDATA[Structural variation: the hidden genomic treasure]]></title>
	<description><![CDATA[<p>Genome re-sequencing projects have revealed substantial amounts of genetic variation between individuals extending beyond single nucleotide polymorphisms (SNPs) and short indels. Structural Variations (SVs) and Copy Number Variations (CNVs) are a major source of genomic variation. However, compared to SNPs, accurate detection, genotyping and understanding of CNVs are lagging behind because SV/CNV detection and analysis pose much greater analytical challenges. In our lab we analyse SVs/CNVs using high-throughput sequencing and different analytical approaches.&nbsp;The most‐studied structural variants are copy number variations (CNVs), which can be generated by several different mechanisms, including non‐allelic homologous recombination, non‐homologous end‐joining and deoxyribonucleic acid (DNA) replication‐related fork stalling and template switching. CNVs are closely related to segmental duplications (SDs): SDs can stimulate the formation of CNVs and themselves started out as CNVs, but became fixed in a species. Structural variation can be neutral but has also influenced our phenotypic evolution, for example our susceptibility to disease and our ability to digest certain types of food. Our understanding of the extent of structural variation is increasing rapidly, but it will be much more difficult to understand its phenotypic consequences.&nbsp;</p><p><img src="http://www.nature.com/nmeth/journal/v9/n2/images/nmeth.1858-F3.jpg" alt="image" width="946" height="603" style="border: 0px; border: 0px;"></p><p>Structural variants (SVs) such as deletions, insertions, duplications, inversions and translocations litter genomes and are often associated with gene expression changes and severe phenotypes (e.g. genetic diseases in humans). Recent studies on the functional aspects of different types of SVs have unveiled several cases of adaptive evolution. 
For example, inversions have been associated with ecological adaptations and may facilitate speciation. Due to their prevalent nature, SVs arguably have a large impact on genome evolution and should not be neglected when studying the genetics of adaptation and speciation.&nbsp;SVs were classically defined as chromosomal rearrangements larger than 1kb, but owing to the higher resolution of new detection methods, smaller variants (between 50 and 1000 base pairs) can now be accurately assessed. Besides various methods of detection in next-generation sequencing data (paired end mapping, split reads, and depth of coverage), array-based approaches have proven to be particularly useful for detecting copy number variations (CNVs). These technologies have enabled researchers to catalog a wide spectrum of SVs in many organisms and infer the effects of selection shaping their evolutionary trajectories.</p><p><strong>Structural variation sequencing signatures (Source: Nature Reviews Genetics)</strong></p><p><img src="http://www.nature.com/nrg/journal/v12/n5/images/nrg2958-f2.jpg" alt="image" width="800" height="824" style="border: 0px; border: 0px;"></p><p>Related tools, databases and publications are listed below. If you know of any interesting papers, please let us know in the comments section:</p><p><br /><strong>Key concepts</strong></p><p>Structural variation includes balanced variants such as inversions and translocations, and unbalanced ones such as duplications and deletions (copy number variations or CNVs).</p><p>Structural variants can arise by several mechanisms, including nonallelic homologous recombination (NAHR), nonhomologous end‐joining (NHEJ) and DNA replication‐based fork stalling and template switching (FoSTeS).</p><p>CNV is closely linked to segmental duplication, but is not exactly the same. 
Segmental duplications can stimulate CNV formation by NAHR, and themselves arise from CNVs that have become fixed.</p><p>Segmental duplications did not appear uniformly during the evolution of the Great Ape species, but rather during a burst of activity around the time of the divergence of gorilla from the human/chimpanzee ancestor.</p><p>Duplicated genes play a critical role in the evolution of a genome as they act as &lsquo;spare parts&rsquo; that can evolve to perform new or more specialized functions.</p><p>Effects of structural variation on gene expression can be identified but only a few examples of the consequences for species biology have been documented.</p><p><strong style="font-size: 12.8px;">Tools</strong></p><p><a href="http://sv.gersteinlab.org/cnvnator">CNVnator</a>, a tool for CNV discovery and genotyping from depth of read mapping. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21293372">2011a</a>, <a href="http://www.ncbi.nlm.nih.gov/pubmed/21324876">2011b</a></p><p><a href="http://sv.gersteinlab.org/age">AGE</a>, a tool that implements an algorithm for optimal alignment of sequences with SVs. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21233167">2011</a></p><p><a href="http://sv.gersteinlab.org/breakseq">BreakSeq</a>, a pipeline for annotation, classification and analysis of SVs at single nucleotide resolution. <a href="http://www.ncbi.nlm.nih.gov/pubmed/20037582">2010</a></p><p><a href="http://sv.gersteinlab.org/pemer">PEMer</a>, a computational and simulation framework for discovering SVs by paired-end read mapping. <a href="http://www.ncbi.nlm.nih.gov/pubmed/19236709">2009</a>, <a href="http://www.ncbi.nlm.nih.gov/pubmed/17901297">2007</a></p><p>GASV https://code.google.com/archive/p/gasv/</p><p>PAIROSCOPE http://pairoscope.sourceforge.net/</p><p>SVDetect&nbsp;http://svdetect.sourceforge.net/Site/Home.html</p><p>BreakPtr, discovery of unbalanced structural variants (copy-number variants) with tiling microarrays&nbsp;<a 
href="http://tiling.mbb.yale.edu/BreakPtr/" target="_top">Link</a>&nbsp;</p><p>R Package&nbsp;https://www.bioconductor.org/help/course-materials/2010/EMBL2010/Practical-4-StructuralVariants.pdf<br /><br />BreakSeq, structural variant genotyping using split reads&nbsp;<a href="http://sv.gersteinlab.org/breakseq/" target="_top">Link</a>&nbsp;<br /><br />CopySeq, genotyping of unbalanced structural variants (copy-number variants) using read-depth&nbsp;<a href="http://www.korbel.embl.de/CopySeq/" target="_top">Link</a>&nbsp;<br /><br />DELLY2, integrated structural variant discovery, genotyping and visualization in deep sequencing data&nbsp;<a href="https://github.com/dellytools/delly" target="_top">Link</a>&nbsp;<br /><br />PEMer, structural variant discovery in 454 sequencing data by paired-end mapping&nbsp;<a href="http://www.korbel.embl.de/PEMer/" target="_top">Link</a>&nbsp;<br /><br />TIGER, transduction inference in germline genomes using short read data&nbsp;<a href="https://github.com/jelena-tica/TIGER" target="_top">Link</a>&nbsp;</p><p>MANTA&nbsp;https://github.com/Illumina/manta</p><p>SV-Bay&nbsp;https://github.com/InstitutCurie/SV-Bay</p><p>BreakDancer&nbsp;http://breakdancer.sourceforge.net/</p><p>Variation Hunter&nbsp;http://compbio.cs.sfu.ca/software-variation-hunter</p><p>Lumpy&nbsp;https://github.com/arq5x/lumpy-sv</p><p>ForestSV&nbsp;http://sebatlab.ucsd.edu/index.php/software-data&nbsp;</p><p>PBSuite for long reads&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong>Visualization</strong></p><p>The SV visualization tool:&nbsp;<a href="http://genomesavant.com/savant/">http://genomesavant.com/savant/</a></p><p>InGAP-SV (<a href="http://ingap.sourceforge.net/">http://ingap.sourceforge.net/</a>), a nice tool for both detection and visualisation of several kinds of structural variation (large insertions, translocations, deletions, inversions, ...)&nbsp;</p><p>Tools table: 
http://www.nature.com/nbt/journal/v29/n8/fig_tab/nbt.1904_T2.html</p><p>Variation Viewer https://www.ncbi.nlm.nih.gov/variation/view/</p><p><strong style="font-size: 12.8px;">Papers</strong></p><p>http://www.nature.com/nmeth/journal/v9/n2/full/nmeth.1858.html</p><p>http://journal.frontiersin.org/researchtopic/1412/structural-variations-in-genomes-ecological-and-evolutionary-implications</p><p>http://www.mi.fu-berlin.de/wiki/pub/ABI/GenomicsLecture10Materials/structural-variation.pdf</p><p>http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1479-3</p><p>https://www.ncbi.nlm.nih.gov/dbvar/content/overview/</p><p>http://www.nature.com/subjects/structural-variation</p><p>https://eichlerlab.gs.washington.edu/news/NatMeth_Feb2012.pdf</p><p>https://www.ncbi.nlm.nih.gov/pubmed/19477992 ***</p><p>https://www.ncbi.nlm.nih.gov/pubmed/22452995</p><p>http://biorxiv.org/content/early/2016/09/06/073833</p><p>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4479793/</p><p>http://www.nature.com/articles/srep18501</p><p>http://www.genetics.org/content/202/1/351</p><p>http://www.cs.cmu.edu/~sssykim/teaching/s13/slides/Lecture_SVI.pdf</p><p>https://www.omicsonline.org/open-access/structural-variation-detection-from-next-generation-sequencing-2469-9853-S1-007.php?aid=69055</p><p>http://schatzlab.cshl.edu/presentations/2016/2016.01.12.PAG.Structural%20Variations.pdf</p><p>&nbsp;</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31014/sockeye</guid>
	<pubDate>Fri, 17 Feb 2017 08:51:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31014/sockeye</link>
	<title><![CDATA[sockeye]]></title>
	<description><![CDATA[<p>The Sockeye&nbsp;software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. Users can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization.</p><p>Address of the bookmark: <a href="http://www.bcgsc.ca/platform/bioinfo/software/sockeye/releases/1.3" rel="nofollow">http://www.bcgsc.ca/platform/bioinfo/software/sockeye/releases/1.3</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</guid>
	<pubDate>Fri, 08 Jun 2018 09:31:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</link>
	<title><![CDATA[Jvarkit : Java utilities for Bioinformatics]]></title>
	<description><![CDATA[A collection of Java toolkits for bioinformatics work:

Jvarkit : Java utilities for Bioinformatics<p>Address of the bookmark: <a href="http://lindenb.github.io/jvarkit/" rel="nofollow">http://lindenb.github.io/jvarkit/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31018/j-circos</guid>
	<pubDate>Fri, 17 Feb 2017 09:06:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31018/j-circos</link>
	<title><![CDATA[J-Circos]]></title>
	<description><![CDATA[<p>J-Circos is an interactive visualization tool that can plot Circos figures, dynamically add data to the figure, and provide information for specific data points through mouse-hover display and zoom in/out functions. J-Circos is written in Java, so it can be used on most operating systems (Windows, MacOS, Linux). Users can input data into J-Circos using flat data formats, as well as from the GUI. J-Circos will enable biologists to better study more complex chromosomal interactions and fusion transcripts that are otherwise difficult to visualize from next-generation sequencing data.</p><p>Address of the bookmark: <a href="http://www.australianprostatecentre.org/research/software/jcircos" rel="nofollow">http://www.australianprostatecentre.org/research/software/jcircos</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31568/pacbio-long-reads-compatible-software-and-tools</guid>
	<pubDate>Wed, 15 Mar 2017 14:19:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31568/pacbio-long-reads-compatible-software-and-tools</link>
	<title><![CDATA[PacBio Long Reads Compatible Software and Tools]]></title>
	<description><![CDATA[<p>The following software packages are known to be compatible with PacBio&reg; data, in addition to PacBio's own SMRT&reg; Analysis suite. All packages are believed to be open source or freely available for non-commercial use. See the individual project sites for up-to-date license information. A separate page lists&nbsp;<a href="http://pacb.com/community/partner_program/current_partners/">commercial software</a>.</p>
<p>Know of any other open source software for PacBio data?&nbsp;<a href="mailto:devnet@pacificbiosciences.com">Email us</a>.</p>
<p>Software categories:</p>
<ul>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#denovo">De novo assembly</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#svdetection">Structural Variations Detection</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#aligners">Reference-based alignment</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#variants">Consensus and variant calling</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#RNA">RNA analysis</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#basemods">Epigenetic base modifications and methylation</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#barcoding">Barcoding</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#browsers">Genome Browsers</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#qc">Run QC</a></li>
<li><a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software#frameworks">Frameworks and APIs</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software" rel="nofollow">https://github.com/PacificBiosciences/DevNet/wiki/Compatible-Software</a></p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/19555/a-3d-map-of-the-human-genome</guid>
	<pubDate>Fri, 12 Dec 2014 22:27:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/19555/a-3d-map-of-the-human-genome</link>
	<title><![CDATA[A 3D Map of the Human Genome]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/dES-ozV65u4" frameborder="0" allowfullscreen></iframe>Suhas Rao and Miriam Huntley (of the Aiden Lab) describe a 3D map of the human genome at kilobase resolution, revealing the principles of chromatin looping. Guest Origami Folding: Sarah Nyquist.

Suhas S.P. Rao*, Miriam H. Huntley*, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov, James T. Robinson, Adrian L. Sanborn, Ido Machol, Arina D. Omer, Eric S. Lander, Erez Lieberman Aiden. (2014). A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29284/genebreak-a-tool-to-systematically-identify-genes-recurrently-affected-by-the-genomic-location-of-chromosomal-cna-associated-breaks-by-a-genome-wide-approach</guid>
	<pubDate>Sat, 01 Oct 2016 15:15:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29284/genebreak-a-tool-to-systematically-identify-genes-recurrently-affected-by-the-genomic-location-of-chromosomal-cna-associated-breaks-by-a-genome-wide-approach</link>
	<title><![CDATA[GeneBreak: a tool to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach]]></title>
	<description><![CDATA[<p>Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. ‘GeneBreak’ was developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, ‘GeneBreak’ collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, ‘GeneBreak’, is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).</p>
<p> </p><p>Address of the bookmark: <a href="http://www.bioconductor.org/packages/release/bioc/html/GeneBreak.html" rel="nofollow">http://www.bioconductor.org/packages/release/bioc/html/GeneBreak.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30234/last</guid>
	<pubDate>Mon, 19 Dec 2016 14:07:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30234/last</link>
	<title><![CDATA[LAST]]></title>
	<description><![CDATA[<p>LAST can:</p>
<ul>
<li>Handle&nbsp;<strong>big</strong>&nbsp;sequence data, e.g:
<ul>
<li>Compare two vertebrate genomes</li>
<li>Align billions of DNA reads to a genome</li>
</ul>
</li>
<li>Indicate the&nbsp;<a href="http://lastweb.cbrc.jp/about.html">reliability</a>&nbsp;of each aligned column.</li>
<li>Use sequence quality data&nbsp;<a href="http://nar.oxfordjournals.org/content/38/7/e100.abstract">properly</a>.</li>
<li>Compare DNA to proteins, with frameshifts.</li>
<li>Compare PSSMs to sequences</li>
<li>Calculate the likelihood of chance similarities between random sequences.</li>
<li>Do split and spliced alignment.</li>
<li><a href="http://last.cbrc.jp/doc/last-train.html">Train</a>&nbsp;alignment parameters for unusual kinds of sequence (e.g. nanopore).</li>
</ul><p>Address of the bookmark: <a href="http://last.cbrc.jp/" rel="nofollow">http://last.cbrc.jp/</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</guid>
	<pubDate>Mon, 19 Dec 2016 14:20:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30236/pyscaf</link>
	<title><![CDATA[pyScaf]]></title>
	<description><![CDATA[<p>pyScaf orders contigs from genome assemblies utilising several types of information:</p>
<ul>
<li>paired-end (PE) and/or mate-pair libraries (<a href="https://github.com/lpryszcz/pyScaf#ngs-based-scaffolding">NGS-based mode</a>)</li>
<li>long reads (<a href="https://github.com/lpryszcz/pyScaf#scaffolding-based-on-long-reads">NGS-based mode</a>)</li>
<li>synteny to the genome of some related species (<a href="https://github.com/lpryszcz/pyScaf#reference-based-scaffolding">reference-based mode</a>)</li>
</ul>
<p>Scaffolding&nbsp;</p>
<p>In reference-based mode, pyScaf uses synteny to the genome of a closely related species to order contigs and estimate distances between adjacent contigs.</p>
<p>Contigs are aligned globally (end-to-end) onto reference chromosomes, ignoring:</p>
<ul>
<li>matches not satisfying cut-offs (<code>--identity</code>&nbsp;and&nbsp;<code>--overlap</code>)</li>
<li>suboptimal matches (only best match of each query to reference is kept)</li>
<li>and removing overlapping matches on reference.</li>
</ul>
<p>In preliminary tests, pyScaf performed superbly on simulated heterozygous genomes based on&nbsp;<em>C. parapsilosis</em>&nbsp;(13 Mb; CANPA) and&nbsp;<em>A. thaliana</em>&nbsp;(119 Mb; ARATH) chromosomes, correctly reconstructing all chromosomes always for CANPA and nearly always for ARATH (<a href="https://www.dropbox.com/sh/bb7lwggo40xrwtc/AAAZ7pByVQQQ-WhUXZVeJaZVa/pyScaf?dl=0">Figures in dropbox</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=2036953672">CANPA table</a>,&nbsp;<a href="https://docs.google.com/spreadsheets/d/1InBExy-qKDLj-upd8tlPItVSKc4mLepZjZxB31ii9OY/edit#gid=1920757821">ARATH table</a>).<br>Runs took ~0.5 min for CANPA on&nbsp;<code>4 CPUs</code>&nbsp;and ~2 min for ARATH on&nbsp;<code>16 CPUs</code>.</p>
<p><span>Important remarks:</span></p>
<ul>
<li>Reduce your assembly beforehand (fasta2homozygous.py), as any redundancy will likely break the synteny.</li>
<li>pyScaf works better with contigs than scaffolds, as scaffolds are often affected by mis-assemblies (no&nbsp;<em>de novo assembler</em>&nbsp;/ scaffolder is perfect...), which breaks synteny.</li>
<li>pyScaf works very well if divergence between reference genome and assembled contigs is below 20% at nucleotide level.</li>
<li>pyScaf deals with large rearrangements, i.e. deletions, insertions, inversions, translocations.&nbsp;<span>Note, however, that this is an experimental implementation!</span></li>
<li>Consider closing gaps after scaffolding.</li>
</ul><p>Address of the bookmark: <a href="https://github.com/lpryszcz/pyScaf" rel="nofollow">https://github.com/lpryszcz/pyScaf</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>

</channel>
</rss>