<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43770?offset=80</link>
	<atom:link href="https://bioinformaticsonline.com/related/43770?offset=80" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44371/steps-to-find-all-the-repeats-in-the-genome</guid>
	<pubDate>Thu, 31 Aug 2023 02:43:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44371/steps-to-find-all-the-repeats-in-the-genome</link>
	<title><![CDATA[Steps to find all the repeats in the genome!]]></title>
	<description><![CDATA[<div><p>To find short repeats with a motif length of 2 to 9 bp in a genome, you can run RepeatMasker and then filter its output by motif length with a Perl script<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>. RepeatMasker itself has no option for restricting results to a range of repeat lengths, so the length filter is applied when the output is parsed. Here's a step-by-step guide:</p></div><div><ol>
<li>Install RepeatMasker: First, you need to install RepeatMasker on your system. You can download it from the RepeatMasker website<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>.</li>
</ol></div><div><ol>
<li>Prepare the genome sequence: Make sure you have the genome sequence in a FASTA file format. Let's assume the file is named "genome.fasta".</li>
</ol><p>Run RepeatMasker on the genome:</p><blockquote><p>./RepeatMasker -pa &lt;number_of_processors&gt; -norna -no_is -div &lt;divergence_value&gt; -lib RepeatMaskerLib.embl -gff -xsmall -small -poly -species &lt;species_name&gt; -dir &lt;output_directory&gt; genome.fasta</p></blockquote><div><p>The <code>-nolow</code> option is deliberately left out, because it makes RepeatMasker skip low-complexity DNA and simple repeats, the class in which short motifs are reported. To keep only motifs of 2 to 9 bp, filter the simple repeats in the <code>.out</code> file, which RepeatMasker names by motif (for example <code>(CA)n</code>). Replace the following placeholders with appropriate values:</p><ul>
<li><code>&lt;number_of_processors&gt;</code>: The number of processors/threads you want to use for parallel processing.</li>
<li><code>&lt;divergence_value&gt;</code>: The maximum percent divergence from the consensus sequence; only repeats less diverged than this value are masked<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>.</li>
<li><code>&lt;species_name&gt;</code>: The name of the species you are analyzing.</li>
<li><code>&lt;output_directory&gt;</code>: The directory where you want the output files to be saved.</li>
</ul></div><div><ol>
<li>Analyze the output: RepeatMasker will generate several output files, including a .out file. You can parse this file to extract the information you need. There is a Perl tool called "one_code_to_find_them_all.pl" that can help you parse RepeatMasker output files<a href="https://mobilednajournal.biomedcentral.com/articles/10.1186/1759-8753-5-13" target="_blank">[0]</a>. You can download it from the source provided.</li>
</ol></div><div><ol>
<li>Use the provided Perl script: Once you have the "one_code_to_find_them_all.pl" script, you can run it to conveniently parse the RepeatMasker output files. Here's an example of how to use it:</li>
</ol><blockquote><p>perl one_code_to_find_them_all.pl --rm &lt;RepeatMasker_out_file&gt; --length &lt;length_file&gt;</p></blockquote></div><p>&nbsp;</p></div><div><div><p>Replace&nbsp;<code>&lt;RepeatMasker_out_file&gt;</code>&nbsp;with the path to your RepeatMasker .out file, and&nbsp;<code>&lt;length_file&gt;</code>&nbsp;with the path to a file containing the lengths of the reference elements.</p></div><div><p>This script will generate several output files, including .log.txt and .copynumber.csv, which contain quantitative information about the identified repeat elements.</p></div><div><p>Remember to adjust the parameters and options according to your specific needs and the characteristics of your genome.</p></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44637/tools-to-access-the-quality-of-your-assembled-genome</guid>
	<pubDate>Thu, 08 Aug 2024 23:31:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44637/tools-to-access-the-quality-of-your-assembled-genome</link>
	<title><![CDATA[Tools to assess the quality of your assembled genome!]]></title>
	<description><![CDATA[<ul dir="auto">
<li><a href="https://github.com/linsalrob/fasta_validator">FASTA VALIDATOR</a>&nbsp;+&nbsp;<a href="https://github.com/shenwei356/seqkit">SEQKIT RMDUP</a>: FASTA validation</li>
<li><a href="https://genometools.org/tools/gt_gff3validator.html">GENOMETOOLS GT GFF3VALIDATOR</a>: GFF3 validation</li>
<li><a href="https://github.com/PlantandFoodResearch/assemblathon2-analysis/blob/a93cba25d847434f7eadc04e63b58c567c46a56d/assemblathon_stats.pl">ASSEMBLATHON STATS</a>: Assembly statistics</li>
<li><a href="https://genometools.org/tools/gt_stat.html">GENOMETOOLS GT STAT</a>: Annotation statistics</li>
<li><a href="https://github.com/ncbi/fcs">NCBI FCS ADAPTOR</a>: Adaptor contamination pass/fail</li>
<li><a href="https://github.com/ncbi/fcs">NCBI FCS GX</a>: Foreign organism contamination pass/fail</li>
<li><a href="https://gitlab.com/ezlab/busco">BUSCO</a>: Gene-space completeness estimation</li>
<li><a href="https://github.com/tolkit/telomeric-identifier">TIDK</a>: Telomere repeat identification</li>
<li><a href="https://github.com/oushujun/LTR_retriever/blob/master/LAI">LAI</a>: Continuity of repetitive sequences</li>
<li><a href="https://github.com/DerrickWood/kraken2">KRAKEN2</a>: Taxonomy classification</li>
<li><a href="https://github.com/igvteam/juicebox.js">HIC CONTACT MAP</a>: Alignment and visualisation of HiC data</li>
<li><a href="https://github.com/mummer4/mummer">MUMMER</a>&nbsp;&rarr;&nbsp;<a href="http://circos.ca/documentation/">CIRCOS</a>&nbsp;+&nbsp;<a href="https://plotly.com/">DOTPLOT</a>&nbsp;&amp;&nbsp;<a href="https://github.com/lh3/minimap2">MINIMAP2</a>&nbsp;&rarr;&nbsp;<a href="https://github.com/schneebergerlab/plotsr">PLOTSR</a>: Synteny analysis</li>
<li><a href="https://github.com/marbl/merqury">MERQURY</a>: K-mer completeness, consensus quality and phasing assessment</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide will walk you through the steps of genome assembly using state-of-the-art tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together the short or long DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads, well suited to assembling small genomes and to polishing long-read assemblies.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use <strong>FastQC</strong> to assess the quality of your reads and <strong>MultiQC</strong> to combine the resulting reports into a single summary:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc . </code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq \<br />ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
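<p>For reference, the N50 statistic mentioned above can be reproduced in a few lines. The following is an illustrative Python sketch, not QUAST's own code: the function names <code>contig_lengths</code> and <code>n50</code> are made up for this example, and the input is assumed to be a plain, uncompressed FASTA file.</p>

```python
def contig_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            lengths.append(0)           # header line: start a new record
        elif line and lengths:
            lengths[-1] += len(line)    # sequence line of the current record
    return lengths


def n50(lengths):
    """Length of the contig at which the running sum of the contigs,
    sorted longest first, reaches half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")
```

<p>With an open file handle, <code>n50(contig_lengths(handle))</code> gives the value for <code>contigs.fasta</code>; comparing it with the QUAST report is a quick sanity check.</p>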
</li>
<li>
<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by identifying conserved genes; choose the lineage dataset (<code>-l</code>) that matches your organism:</p>
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>Maker</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44775/genomic-architecture-surrounding-the-fusion-site-of-human-chromosome-2</guid>
	<pubDate>Tue, 04 Mar 2025 12:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44775/genomic-architecture-surrounding-the-fusion-site-of-human-chromosome-2</link>
	<title><![CDATA[Genomic architecture surrounding the fusion site of human chromosome 2]]></title>
	<description><![CDATA[<p>The article <strong>"Genomic Structure and Evolution of the Ancestral Chromosome Fusion Site in 2q13&ndash;2q14.1 and Paralogous Regions on Other Human Chromosomes (https://pmc.ncbi.nlm.nih.gov/articles/PMC187548/)"</strong> explores the genomic architecture surrounding the fusion site of human chromosome 2. This fusion event is a key evolutionary marker distinguishing humans from other great apes, as humans have 46 chromosomes while chimpanzees, gorillas, and orangutans possess 48. The fusion occurred through an end-to-end joining of two ancestral chromosomes, which remain separate in nonhuman primates.</p><h3><strong>Key Findings:</strong></h3><ol>
<li>
<p><strong>Chromosomal Fusion and Its Molecular Signature:</strong></p>
<ul>
<li>The fusion site is located at <strong>2q13&ndash;2q14.1</strong> and is characterized by <strong>degenerate telomeric sequences</strong> appearing interstitially, indicating the historical head-to-head joining of ancestral chromosomes.</li>
<li>Despite being a signature of a past fusion event, these telomeric repeats are no longer functional and have undergone sequence degradation over time.</li>
</ul>
</li>
<li>
<p><strong>Extensive Duplications in the Surrounding Genomic Region:</strong></p>
<ul>
<li>The study identifies <strong>large-scale segmental duplications</strong> flanking the fusion site, with several of these regions duplicated and scattered across multiple chromosomes.</li>
<li>These duplications are predominantly located in <strong>subtelomeric and pericentromeric regions</strong>, suggesting their role in genomic instability and chromosomal evolution.</li>
</ul>
</li>
<li>
<p><strong>Paralogous Regions and Their Evolutionary Relationships:</strong></p>
<ul>
<li>A <strong>168-kilobase (kb) segment</strong> near the fusion site has <strong>98%&ndash;99% sequence identity</strong> with three regions on <strong>chromosome 9 (9pter, 9p11.2, and 9q13)</strong>.</li>
<li>Another <strong>67-kb region distal to the fusion site</strong> shows a high degree of homology to sequences in <strong>chromosome 22qter</strong>.</li>
<li>Additionally, a <strong>100-kb segment</strong> exhibits <strong>96% sequence identity</strong> with a region in <strong>chromosome 2q11.2</strong>.</li>
</ul>
</li>
<li>
<p><strong>Comparative Genomics and Evolutionary Implications:</strong></p>
<ul>
<li>By comparing the duplicated sequences and their arrangement in primates, the researchers traced the order of duplication events leading to their present distribution.</li>
<li>The presence of specific repetitive elements within these duplicated segments serves as <strong>evolutionary markers</strong> that help infer their historical rearrangements.</li>
<li>Some of these <strong>duplicated regions are associated with chromosomal inversion breakpoints</strong>, potentially contributing to evolutionary changes in primates.</li>
<li>Recurrent <strong>structural rearrangements</strong> in these regions have been linked to human chromosomal disorders.</li>
</ul>
</li>
</ol><h3><strong>Conclusions and Implications:</strong></h3><ul>
<li>The findings provide valuable insights into <strong>the structural evolution of human chromosome 2</strong>, which played a crucial role in human speciation.</li>
<li>Understanding these <strong>segmental duplications</strong> and their evolutionary trajectories sheds light on <strong>genomic instability</strong>, which may contribute to <strong>human genetic diseases</strong>.</li>
<li>The study highlights how large-scale chromosomal rearrangements, such as fusion and duplication, have influenced the <strong>evolutionary divergence of humans</strong> from other primates.</li>
</ul><p>This research advances our understanding of <strong>human genome evolution</strong> and offers a foundation for studying the effects of <strong>structural variants in genetic disorders</strong>.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36905/d-genies-a-tool-for-dotplot-large-genomes-in-an-interactive-efficient-and-simple-way</guid>
	<pubDate>Mon, 11 Jun 2018 09:41:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36905/d-genies-a-tool-for-dotplot-large-genomes-in-an-interactive-efficient-and-simple-way</link>
	<title><![CDATA[D-GENIES: A tool for Dotplot large Genomes in an Interactive, Efficient and Simple way]]></title>
	<description><![CDATA[D-GENIES – for Dotplot large Genomes in an Interactive, Efficient and Simple way – is an online tool designed to compare two genomes. It supports large genomes, and you can interact with the dot plot to improve the visualisation.

We use minimap2 to align the two genomes. Then, the PAF file is parsed and plotted as an interactive plot written with the d3.js library.
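
As a rough illustration of that parsing step (this is not D-GENIES' own code), the Python sketch below turns the twelve mandatory PAF columns into line segments that a dot plot could draw. The names Segment and paf_to_segments are hypothetical, and putting the target on the x axis and the query on the y axis is an assumption made for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One alignment, drawn as a line from (x0, y0) to (x1, y1)."""
    query: str
    target: str
    x0: int  # target start
    y0: int  # query coordinate at the target start
    x1: int  # target end
    y1: int  # query coordinate at the target end
    identity: float  # matching bases / alignment block length


def paf_to_segments(lines):
    """Convert PAF records into dot-plot segments (target on x, query on y)."""
    segments = []
    for line in lines:
        core = line.rstrip("\n").split("\t")[:12]
        if len(core) != 12:
            continue  # skip blank or truncated lines
        (qname, _qlen, qstart, qend, strand,
         tname, _tlen, tstart, tend, nmatch, alnlen, _mapq) = core
        qs, qe, ts, te = int(qstart), int(qend), int(tstart), int(tend)
        if strand == "-":
            qs, qe = qe, qs  # reverse-strand hits run down the query axis
        identity = int(nmatch) / int(alnlen) if int(alnlen) else 0.0
        segments.append(Segment(qname, tname, ts, qs, te, qe, identity))
    return segments
```

Each segment carries its identity, so a plotting layer could colour lines by similarity; optional PAF tags after the twelfth column are ignored here.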

D-GENIES can also display dot plots from other aligners if you upload their PAF or MAF alignment file.

<p>Address of the bookmark: <a href="http://dgenies.toulouse.inra.fr/" rel="nofollow">http://dgenies.toulouse.inra.fr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40583/trelliscope-flexibly-visualize-large-complex-data-in-great-detail-from-within-the-r-statistical-programming-environment</guid>
	<pubDate>Tue, 21 Jan 2020 04:22:49 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40583/trelliscope-flexibly-visualize-large-complex-data-in-great-detail-from-within-the-r-statistical-programming-environment</link>
	<title><![CDATA[Trelliscope: flexibly visualize large, complex data in great detail from within the R statistical programming environment.]]></title>
	<description><![CDATA[<p>Trelliscope provides a way to flexibly visualize large, complex data in great detail from within the R statistical programming environment. Trelliscope is a component in the<span>&nbsp;</span><a href="http://deltarho.org/docs-trelliscope/deltarho.org">DeltaRho</a><span>&nbsp;</span>environment.</p>
<p>For those familiar with<span>&nbsp;</span><a href="http://cm.bell-labs.com/cm/ms/departments/sia/project/trellis/">Trellis Display</a>,<span>&nbsp;</span><a href="http://docs.ggplot2.org/0.9.3.1/facet_wrap.html">faceting in ggplot</a>, or the notion of<span>&nbsp;</span><a href="http://en.wikipedia.org/wiki/Small_multiple">small multiples</a>, Trelliscope provides a scalable way to break a set of data into pieces, apply a plot method to each piece, and then arrange those plots in a grid and interactively sort, filter, and query panels of the display based on metrics of interest. With Trelliscope, we are able to create multipanel displays on data with a very large number of subsets and view them in an interactive and meaningful way.</p><p>Address of the bookmark: <a href="http://deltarho.org/docs-trelliscope/#introduction" rel="nofollow">http://deltarho.org/docs-trelliscope/#introduction</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36974/many-to-many-pairwise-alignments-of-two-sequence-sets</guid>
	<pubDate>Tue, 19 Jun 2018 08:34:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36974/many-to-many-pairwise-alignments-of-two-sequence-sets</link>
	<title><![CDATA[Many-to-many pairwise alignments of two sequence sets]]></title>
	<description><![CDATA[needleall reads a set of input sequences and compares them all to one or more sequences, writing their optimal global sequence alignments to file. It uses the Needleman-Wunsch alignment algorithm to find the optimum alignment (including gaps) of two sequences along their entire length. The algorithm uses a dynamic programming method to ensure the alignment is optimum, by exploring all possible alignments and choosing the best. A scoring matrix is read that contains values for every possible residue or nucleotide match. Needleall finds the alignment with the maximum possible score where the score of an alignment is equal to the sum of the matches taken from the scoring matrix, minus penalties arising from opening and extending gaps in the aligned sequences. The substitution matrix and gap opening and extension penalties are user-specified.<p>Address of the bookmark: <a href="http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/needleall.html" rel="nofollow">http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/needleall.html</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38892/wtdbg2-a-fuzzy-bruijn-graph-approach-to-long-noisy-reads-assembly</guid>
	<pubDate>Mon, 04 Feb 2019 04:53:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38892/wtdbg2-a-fuzzy-bruijn-graph-approach-to-long-noisy-reads-assembly</link>
	<title><![CDATA[wtdbg2: A fuzzy Bruijn graph approach to long noisy reads assembly]]></title>
	<description><![CDATA[<p><span>Wtdbg2 is a&nbsp;</span><em>de novo</em><span>&nbsp;sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). It assembles raw reads without error correction and then builds the consensus from intermediate assembly output.&nbsp;</span></p>
<pre>./wtdbg2 -x rs -g 4.6m -t 16 -i reads.fa.gz -fo prefix
./wtpoa-cns -t 16 -i prefix.ctg.lay.gz -fo prefix.ctg.fa</pre><p>Address of the bookmark: <a href="https://github.com/ruanjue/wtdbg2" rel="nofollow">https://github.com/ruanjue/wtdbg2</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34579/moss-a-system-for-detecting-software-similarity</guid>
	<pubDate>Sat, 09 Dec 2017 08:59:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34579/moss-a-system-for-detecting-software-similarity</link>
	<title><![CDATA[MOSS: A System for Detecting Software Similarity]]></title>
	<description><![CDATA[<p><span>Moss (for a Measure Of Software Similarity) is an automatic system for determining the similarity of programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind Moss is a significant improvement over other cheating detection algorithms (at least, over those known to us).</span></p>
<p><span><span>Moss can currently analyze code written in the following languages:</span></span></p>
<p>C, C++, Java, C#, Python, Visual Basic, Javascript, FORTRAN, ML, Haskell, Lisp, Scheme, Pascal, Modula2, Ada, Perl, TCL, Matlab, VHDL, Verilog, Spice, MIPS assembly, a8086 assembly, HCL2.</p><p>Address of the bookmark: <a href="https://theory.stanford.edu/~aiken/moss/" rel="nofollow">https://theory.stanford.edu/~aiken/moss/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38749/clipcrop-a-tool-for-detecting-structural-variations-with-single-base-resolution-using-soft-clipping-information</guid>
	<pubDate>Sun, 20 Jan 2019 06:34:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38749/clipcrop-a-tool-for-detecting-structural-variations-with-single-base-resolution-using-soft-clipping-information</link>
	<title><![CDATA[ClipCrop: a tool for detecting structural variations with single-base resolution using soft-clipping information]]></title>
	<description><![CDATA[<p><span>ClipCrop is a tool for detecting structural variations (SVs) with single-base resolution using soft-clipping information. A soft-clipped sequence is an unmatched fragment in a partially mapped read. To compare the performance of ClipCrop with that of other SV-detecting tools, the authors generated simulated data covering various SV lengths, read lengths, and depths of coverage of short reads, with insertions, deletions, tandem duplications, inversions and single nucleotide alterations in a human chromosome.&nbsp;</span></p><p>Address of the bookmark: <a href="https://github.com/shinout/clipcrop" rel="nofollow">https://github.com/shinout/clipcrop</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>

</channel>
</rss>