<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34702?offset=140</link>
	<atom:link href="https://bioinformaticsonline.com/related/34702?offset=140" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44229/common-steps-for-reads-mapping</guid>
	<pubDate>Thu, 09 Mar 2023 02:48:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44229/common-steps-for-reads-mapping</link>
	<title><![CDATA[Common steps for reads mapping !]]></title>
	<description><![CDATA[<div><div><div><div><div><div><div><div><div><div><p>Mapping reads to a reference genome is an essential step in many types of genomic analysis, such as variant calling and gene expression analysis. Here are some general steps to follow for mapping reads to a genome:</p><ol>
<li>
<p>Choose a read mapper: There are many read mappers available, such as BWA, Bowtie, and HISAT2. Choose a mapper that is appropriate for your type of data and research question.</p>
</li>
<li>
<p>Index the reference genome: Before mapping reads, the reference genome needs to be indexed. This involves creating an index of the genome sequence that allows the mapper to quickly find matches to the reads. Most mappers have their own indexing tools.</p>
</li>
<li>
<p>Prepare the read data: The reads should be in a format that is compatible with the mapper. Most mappers accept FASTQ or BAM files. Depending on the quality of the data, it may need to be filtered or trimmed before mapping.</p>
</li>
<li>
<p>Run the mapper: The mapper is run with the command-line interface or using a graphical user interface. The specific command depends on the mapper being used, but typically involves specifying the input data, reference genome, and output file format.</p>
</li>
<li>
<p>Evaluate the mapping results: After the mapping is complete, the results should be evaluated. This includes assessing the quality of the mapping, such as the mapping rate, the number of mapped reads, and the mapping quality score.</p>
</li>
<li>
<p>Post-processing: Depending on the analysis being performed, post-processing of the mapped reads may be necessary. This can include filtering reads based on quality, removing duplicate reads, and calling variants.</p>
</li>
</ol><p>Overall, mapping reads to a reference genome is a complex process that requires careful consideration of the type of data, the research question, and the specific mapper being used.</p></div></div></div></div></div></div></div></div></div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17176/arvados</guid>
	<pubDate>Sat, 20 Sep 2014 16:54:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17176/arvados</link>
	<title><![CDATA[Arvados]]></title>
	<description><![CDATA[<p>Arvados is a free and open&nbsp;source bioinformatics&nbsp;platform for genomic and&nbsp;biomedical data. User can&nbsp;Store | Organize | Compute | Share the data for free.&nbsp;</p>
<p><img src="https://arvados.org/images/dax.png" width="400" height="535" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://arvados.org/" rel="nofollow">https://arvados.org/</a></p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</guid>
	<pubDate>Fri, 13 Dec 2024 11:29:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</link>
	<title><![CDATA[A Beginner&#039;s Guide to Using Kraken for Taxonomic Classification]]></title>
	<description><![CDATA[<div>Kraken is a popular bioinformatics tool designed for fast and accurate taxonomic classification of metagenomic sequences. Its efficiency and precision make it a go-to resource for analyzing microbial communities, including bacteria, viruses, archaea, and fungi. Whether you're new to bioinformatics or experienced in the field, Kraken is an indispensable tool for taxonomic analysis.</div><div><div><div><div dir="auto"><div><div><p>In this blog, we&rsquo;ll walk through the basics of Kraken, from installation to running an analysis, and highlight its key features and applications.</p><h4><strong>What is Kraken?</strong></h4><p>Kraken is a sequence classification tool that assigns taxonomic labels to DNA sequences using exact k-mer matching. It uses a reference database of genomes, dividing sequences into k-mers and identifying matches in a computationally efficient way.</p><h4><strong>Key Features of Kraken</strong></h4><ul>
<li><strong>Speed</strong>: Kraken processes data much faster than alignment-based methods.</li>
<li><strong>Accuracy</strong>: It uses a precise k-mer matching algorithm for high-resolution taxonomic assignments.</li>
<li><strong>Scalability</strong>: It can handle large metagenomic datasets.</li>
<li><strong>Custom Databases</strong>: You can build and use custom databases tailored to your research needs.</li>
</ul><h4><strong>Installing Kraken</strong></h4><ol>
<li>
<p><strong>System Requirements</strong></p>
<ul>
<li>A Unix-based operating system (Linux/macOS).</li>
<li>Sufficient computational resources for database building (RAM and disk space).</li>
</ul>
</li>
<li>
<p><strong>Installation Steps</strong></p>
<ul>
<li>Clone the Kraken repository from GitHub:
<div>
<div>&nbsp;</div>
<div dir="ltr"><code>git <span style="font-size: 12.8px; font-weight: normal;">clone</span> https://github.com/DerrickWood/kraken.git <span style="font-size: 12.8px; font-weight: normal;">cd</span> kraken </code></div>
</div>
</li>
<li>Compile the Kraken binaries:
<div>
<div>&nbsp;</div>
<div dir="ltr"><code>make </code></div>
</div>
</li>
<li>Add Kraken to your PATH for easy access:
<div>
<div>&nbsp;</div>
<div dir="ltr"><code><span style="font-size: 12.8px; font-weight: normal;">export</span> PATH=<span style="font-size: 12.8px; font-weight: normal;">$PATH</span>:/path/to/kraken </code></div>
</div>
</li>
</ul>
</li>
</ol><h4><strong>Preparing a Database</strong></h4><p>Kraken requires a database of reference genomes. You can use a pre-built database or create a custom one.</p><ol>
<li>
<p><strong>Downloading a Pre-built Database</strong><br />Kraken offers pre-built databases, such as the <em>MiniKraken</em> database, which is lightweight and suitable for smaller datasets. Download it using:</p>
<div>
<div dir="ltr"><code>kraken-build --download-library minikraken </code></div>
</div>
</li>
<li>
<p><strong>Building a Custom Database</strong><br />To include specific genomes, download FASTA files and build the database:</p>
<div>
<div dir="ltr"><code>kraken-build --download-library bacteria --threads 4 --db my_database kraken-build --build --db my_database </code></div>
</div>
<p>This process may take considerable time and resources, depending on the size of the database.</p>
</li>
</ol><h4><strong>Running Kraken</strong></h4><p>Once the database is ready, you can classify sequences.</p><ol>
<li>
<p><strong>Basic Usage</strong><br />Use the following command to classify sequences:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --threads 4 --fastq-input input_sequences.fastq --output kraken_output.txt </code></div>
</div>
<p>Key options:</p>
<ul>
<li><code>--db</code>: Specifies the database.</li>
<li><code>--threads</code>: Number of threads for parallel processing.</li>
<li><code>--fastq-input</code>: Indicates input file format (FASTQ/FASTA).</li>
</ul>
</li>
<li>
<p><strong>Interpreting Results</strong><br />Kraken generates an output file with columns for sequence IDs, taxonomic classifications, and the confidence score.</p>
</li>
</ol><h4><strong>Visualizing Kraken Results</strong></h4><p>Kraken results can be visualized using tools like <strong>Krona</strong> or converted to human-readable reports using <code>kraken-report</code>.</p><ol>
<li>
<p><strong>Generate a Report</strong></p>
<div>
<div dir="ltr"><code>kraken-report --db my_database kraken_output.txt &gt; kraken_report.txt </code></div>
</div>
</li>
<li>
<p><strong>Krona Visualization</strong><br />Install Krona and convert Kraken output for visualization:</p>
<div>
<div dir="ltr"><code>cut -f2,3 kraken_output.txt | ktImportTaxonomy -o krona_output.html </code></div>
</div>
<p>Open the HTML file in your browser to interactively explore the taxonomic classifications.</p>
</li>
</ol><h4><strong>Advanced Usage</strong></h4><ol>
<li>
<p><strong>Confidence Thresholds</strong><br />Adjust the confidence threshold for classification using the <code>--confidence</code> option. Higher values reduce false positives but may miss some true positives:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --confidence 0.1 --fastq-input input.fastq </code></div>
</div>
</li>
<li>
<p><strong>Paired-End Reads</strong><br />For paired-end sequencing data, use:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --paired reads_1.fastq reads_2.fastq </code></div>
</div>
</li>
<li>
<p><strong>Customizing K-mers</strong><br />Kraken allows you to set custom k-mer lengths during database building for specific applications.</p>
</li>
</ol><h4><strong>Applications of Kraken</strong></h4><ul>
<li><strong>Microbial Ecology</strong>: Characterizing microbial communities in soil, water, and the human microbiome.</li>
<li><strong>Pathogen Detection</strong>: Identifying pathogens in clinical samples.</li>
<li><strong>Fungal Research</strong>: Analyzing fungal diversity in metagenomic datasets.</li>
<li><strong>Environmental Monitoring</strong>: Tracking microbial populations in diverse habitats.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Kraken is a versatile and efficient tool for taxonomic classification in metagenomics. Its speed, accuracy, and flexibility make it a favorite among bioinformaticians. By following this guide, you can set up and use Kraken to unlock insights into microbial and fungal communities, paving the way for discoveries in ecology, medicine, and biotechnology.</p></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/35923/basic-command-line-to-run-blast</guid>
	<pubDate>Wed, 14 Mar 2018 05:10:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/35923/basic-command-line-to-run-blast</link>
	<title><![CDATA[Basic command-line to run BLAST]]></title>
	<description><![CDATA[<p>&nbsp;</p><p>The goal of this tutorial is to run you through a demonstration of the command line, which you may not have seen or used much before.</p><p>All of the commands below can copy/pasted.</p><div id="install-software"><h2>Install software<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#install-software" title="Permalink to this headline"></a></h2><p>Copy and paste the following commands</p><div><div><pre>sudo apt-get update &amp;&amp; sudo apt-get -y install python ncbi-blast+
</pre></div></div><p>This updates the software list and installs the Python programming language and NCBI BLAST+.</p></div><div id="get-data"><h2>Get Data<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#get-data" title="Permalink to this headline"></a></h2><p>Grab some data to play with. Grab some cow and human RefSeq proteins:</p><div><div><pre>wget ftp://ftp.ncbi.nih.gov/refseq/B_taurus/mRNA_Prot/cow.1.protein.faa.gz
wget ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/human.1.protein.faa.gz
</pre></div></div><p>This is only the first part of the human and cow protein files - there are 24 files total for human.</p><p>The database files are both gzipped, so lets unzip them</p><div><div><pre>gunzip *gz
ls
</pre></div></div><p>Take a look at the head of each file:</p><div><div><pre>head cow.1.protein.faa
head human.1.protein.faa
</pre></div></div><p>These are protein sequences in FASTA format. FASTA format is something many of you have probably seen in one form or another &ndash; it&rsquo;s pretty ubiquitous. It&rsquo;s just a text file, containing records; each record starts with a line beginning with a &lsquo;&gt;&rsquo;, and then contains one or more lines of sequence text.</p><p>Note that the files are in fasta format, even though they end if &rdquo;.faa&rdquo; instead of the usual &rdquo;.fasta&rdquo;. This NCBI&rsquo;s way of denoting that this is a fasta file with amino acids instead of nucleotides.</p><p>How many sequences are in each one?</p><div><div><pre>grep -c '^&gt;' cow.1.protein.faa
grep -c '^&gt;' human.1.protein.faa
</pre></div></div><p>This grep command uses the c flag, which reports a count of lines with match to the pattern. In this case, the pattern is a regular expression, meaning match only lines that begin with a &gt;.</p><p>This is a bit too big, lets take a smaller set for practice. Lets take the first two sequences of the cow proteins, which we can see are on the first 6 lines</p><div><div><pre>head -6 cow.1.protein.faa &gt; cow.small.faa
</pre></div></div></div><div id="blast"><h2>BLAST<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#blast" title="Permalink to this headline"></a></h2><p>Now we can blast these two cow sequences against the set of human sequences. First, we need to tell blast about our database. BLAST needs to do some pre-work on the database file prior to searching. This helps to make the software work a lot faster. Because you installed your own version of the sotware, you need to tell the shell where the software is located. Use the full path and the makeblastdb command:</p><div><div><pre>makeblastdb -in human.1.protein.faa -dbtype prot
ls
</pre></div></div><p>Note that this makes a lot of extra files, with the same name as the database plus new extensions (.pin, .psq, etc). To make blast work, these files, called index files, must be in the same directory as the fasta file.</p><p><br /> blastp [-h] [-help] [-import_search_strategy filename]<br /> [-export_search_strategy filename] [-task task_name] [-db database_name]<br /> [-dbsize num_letters] [-gilist filename] [-seqidlist filename]<br /> [-negative_gilist filename] [-negative_seqidlist filename]<br /> [-entrez_query entrez_query] [-db_soft_mask filtering_algorithm]<br /> [-db_hard_mask filtering_algorithm] [-subject subject_input_file]<br /> [-subject_loc range] [-query input_file] [-out output_file]<br /> [-evalue evalue] [-word_size int_value] [-gapopen open_penalty]<br /> [-gapextend extend_penalty] [-qcov_hsp_perc float_value]<br /> [-max_hsps int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]<br /> [-xdrop_gap_final float_value] [-searchsp int_value]<br /> [-sum_stats bool_value] [-seg SEG_options] [-soft_masking soft_masking]<br /> [-matrix matrix_name] [-threshold float_value] [-culling_limit int_value]<br /> [-best_hit_overhang float_value] [-best_hit_score_edge float_value]<br /> [-window_size int_value] [-lcase_masking] [-query_loc range]<br /> [-parse_deflines] [-outfmt format] [-show_gis]<br /> [-num_descriptions int_value] [-num_alignments int_value]<br /> [-line_length line_length] [-html] [-max_target_seqs num_sequences]<br /> [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]<br /> [-use_sw_tback] [-version]</p><p>Now we can run the blast job. We will use blastp, which is appropriate for protein to protein comparisons.</p><div><div><pre>blastp -query cow.small.faa -db human.1.protein.faa
</pre></div></div><p>This gives us a lot of information on the terminal screen. But this is difficult to save and use later - Blast also gives the option of saving the text to a file.</p><div><div><pre>    blastp -query cow.small.faa -db human.1.protein.faa -out cow_vs_human_blast_results.txt
ls
</pre></div></div><p>Take a look at the results using less. Note that there can be more than one match between the query and the same subject. These are referred to as high-scoring segment pairs (HSPs).</p><div><div><pre>less cow_vs_human_blast_results.txt
</pre></div></div><p>So how do you know about all the options, such as the flag to create an output file? Lets also take a look at the help pages. Unfortunately there are no man pages (those are usually reserved for shell commands, but some software authors will provide them as well), but there is a text help output</p><div><div><pre>blastp -help
</pre></div></div><p>To scroll through slowly</p><div><div><pre>blastp -help | less
</pre></div></div><p>To quit the less screen, press the q key.</p><p>Parameters of interest include the -evalue (Default is 10?!?) and the -outfmt</p><p>Lets filter for more statistically significant matches with a different output format:</p><div><div><pre>blastp \
-query cow.small.faa \
-db human.1.protein.faa \
-out cow_vs_human_blast_results.tab \
-evalue 1e-5 \
-outfmt 7
</pre></div></div><p>I broke the long single command into many lines with by &ldquo;escaping&rdquo; the newline. That forward slash tells the command line &ldquo;Wait, I&rsquo;m not done yet!&rdquo;. So it waits for the next line of the command before executing.</p><p>Check out the results with less.</p><p>Lets try a medium sized data set next</p><div><div><pre>head -199 cow.1.protein.faa &gt; cow.medium.faa
</pre></div></div><p>What size is this db?</p><div><div><pre>grep -c '^&gt;' cow.medium.faa
</pre></div></div><p>Lets run the blast again, but this time lets return only the best hit for each query.</p><div><div><pre>blastp \
-query cow.medium.faa \
-db human.1.protein.faa \
-out cow_vs_human_blast_results.tab \
-evalue 1e-5 \
-outfmt 6 \
-max_target_seqs 1
</pre></div></div></div><div id="summary"><h2>Summary<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#summary" title="Permalink to this headline"></a></h2><p>Review:</p><ul>
<li>command line programs such as blast use flags to get information about how and what to do</li>
<li>blast options can be found by typing&nbsp;<cite>blastp -help</cite></li>
<li>break a command up over many lines by using&nbsp;<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#id1">`</a>` to &ldquo;escape&rdquo; the new line</li>
</ul><p>&nbsp;</p><p>Blastn</p><p>blastn [-h] [-help] [-import_search_strategy filename]<br /> [-export_search_strategy filename] [-task task_name] [-db database_name]<br /> [-dbsize num_letters] [-gilist filename] [-seqidlist filename]<br /> [-negative_gilist filename] [-negative_seqidlist filename]<br /> [-entrez_query entrez_query] [-db_soft_mask filtering_algorithm]<br /> [-db_hard_mask filtering_algorithm] [-subject subject_input_file]<br /> [-subject_loc range] [-query input_file] [-out output_file]<br /> [-evalue evalue] [-word_size int_value] [-gapopen open_penalty]<br /> [-gapextend extend_penalty] [-perc_identity float_value]<br /> [-qcov_hsp_perc float_value] [-max_hsps int_value]<br /> [-xdrop_ungap float_value] [-xdrop_gap float_value]<br /> [-xdrop_gap_final float_value] [-searchsp int_value]<br /> [-sum_stats bool_value] [-penalty penalty] [-reward reward] [-no_greedy]<br /> [-min_raw_gapped_score int_value] [-template_type type]<br /> [-template_length int_value] [-dust DUST_options]<br /> [-filtering_db filtering_database]<br /> [-window_masker_taxid window_masker_taxid]<br /> [-window_masker_db window_masker_db] [-soft_masking soft_masking]<br /> [-ungapped] [-culling_limit int_value] [-best_hit_overhang float_value]<br /> [-best_hit_score_edge float_value] [-window_size int_value]<br /> [-off_diagonal_range int_value] [-use_index boolean] [-index_name string]<br /> [-lcase_masking] [-query_loc range] [-strand strand] [-parse_deflines]<br /> [-outfmt format] [-show_gis] [-num_descriptions int_value]<br /> [-num_alignments int_value] [-line_length line_length] [-html]<br /> [-max_target_seqs num_sequences] [-num_threads int_value] [-remote]<br /> [-version]</p><p>DESCRIPTION<br /> Nucleotide-Nucleotide BLAST 2.7.0+</p></div>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27328/platanus</guid>
	<pubDate>Fri, 13 May 2016 05:12:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27328/platanus</link>
	<title><![CDATA[Platanus]]></title>
	<description><![CDATA[<p>Platanus is a novel <em>de novo</em> sequence assembler that can reconstruct genomic sequences of<br> highly heterozygous diploids from massively parallel shotgun sequencing data.</p>
<p>The latest version is <a href="http://platanus.bio.titech.ac.jp/platanus/?page_id=14">1.2.4</a>.</p>
<p>To cite Platanus, please use the following:</p>
<p>Kajitani R, Toshimoto K, Noguchi H, Toyoda A, Ogura Y, Okuno M, Yabana M, Harada M, Nagayasu E, Maruyama H, Kohara Y, Fujiyama A, Hayashi T, Itoh T, &ldquo;Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads&rdquo;.&nbsp;Genome Res. 2014 Aug;24(8):1384-95. doi: 10.1101/gr.170720.113. [<a href="http://www.ncbi.nlm.nih.gov/pubmed/24755901">abstract</a> |<a href="http://genome.cshlp.org/content/24/8/1384.long"> full text</a>]</p><p>Address of the bookmark: <a href="http://platanus.bio.titech.ac.jp/" rel="nofollow">http://platanus.bio.titech.ac.jp/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29144/fermi</guid>
	<pubDate>Fri, 09 Sep 2016 05:37:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29144/fermi</link>
	<title><![CDATA[FERMI]]></title>
	<description><![CDATA[<p><span>Fermi is a de novo assembler with a particular focus on assembling Illumina&nbsp;</span><span>short sequence reads from a mammal-sized genome. In addition to the role of a&nbsp;</span><span>typical assembler, fermi also aims to preserve heterozygotes which are often&nbsp;</span><span>collapsed by other assemblers. Its ultimate goal is to find a minimal set of</span><br><span>unitigs to represent all the information in raw reads.</span><br><br><span>Fermi follows the overlap-layout-consensus paradigm and uses the FM-DNA-index&nbsp;</span><span>(FMD-index) as the key data structure. It is inspired by the string graph&nbsp;</span><span>assembler (Simpson and Durbin, 2010 and 2012) and has a similar workflow.</span><br><br><span>As a typical de novo assembler, fermi tends to produce contigs with slightly&nbsp;</span><span>longer N50. However, the major weakness of fermi is the high misassembly rate.&nbsp;</span><span>Although fermi provides a tool to fix misassemblies by using paired-end reads&nbsp;</span><span>to achieve an accuracy comparable to other assemblers, this is not a favorable&nbsp;</span><span>solution.</span><br><br><span>Fermi is designed to be used on a multi-core Linux machine with large shared&nbsp;</span><span>memory. The easiest way to run fermi is to use the run-fermi.pl script. It&nbsp;</span><span>generates a Makefile. The actual assembly is done by invoking make. Premature&nbsp;</span><span>assembly processes can be resumed. Here is an example:</span><br><br><span>run-fermi.pl -dAPe ./fermi -p NA12878 -t16 -f18 reads*.fq.gz &gt; NA12878.mak</span><br><span>make -f NA12878.mak -j16</span></p><p>Address of the bookmark: <a href="https://github.com/lh3/fermi" rel="nofollow">https://github.com/lh3/fermi</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36723/hapsembler-an-assembler-for-highly-polymorphic-genomes</guid>
	<pubDate>Tue, 22 May 2018 04:09:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36723/hapsembler-an-assembler-for-highly-polymorphic-genomes</link>
	<title><![CDATA[Hapsembler: An Assembler for Highly Polymorphic Genomes]]></title>
	<description><![CDATA[Hapsembler is a haplotype-specific genome assembly toolkit that is designed for genomes that are rich in SNPs and other types of polymorphism. Hapsembler can be used to assemble reads from a variety of platforms including Illumina and Roche/454. 

http://compbio.cs.toronto.edu/hapsembler/<p>Address of the bookmark: <a href="http://compbio.cs.toronto.edu/hapsembler/" rel="nofollow">http://compbio.cs.toronto.edu/hapsembler/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</guid>
	<pubDate>Mon, 11 Jun 2018 03:08:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</link>
	<title><![CDATA[PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++.]]></title>
	<description><![CDATA[We are pleased to release PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++. Its name describes the strategy that it implements for genome assembly: PRICE uses paired-read information to iteratively increase the size of existing contigs. Initially, those contigs can be individual reads from a subset of the paired-read dataset, non-paired reads from sequencing technologies that provide non-paired data, or contigs that were output from a prior run of PRICE or any other assembler.

http://derisilab.ucsf.edu/software/price/<p>Address of the bookmark: <a href="http://derisilab.ucsf.edu/software/price/" rel="nofollow">http://derisilab.ucsf.edu/software/price/</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</guid>
	<pubDate>Tue, 14 Jan 2020 06:47:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</link>
	<title><![CDATA[Shasta long read assembler]]></title>
	<description><![CDATA[<p>The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by&nbsp;<a href="https://nanoporetech.com/">Oxford Nanopore</a>&nbsp;flow cells.</p>
<p>Computational methods used by the Shasta assembler include:</p>
<ul>
<li>Using a&nbsp;<a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length</a>&nbsp;representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer repeat counts, which are the most common type of errors in Oxford Nanopore reads.</li>
<li>Using in some phases of the computation a representation of the read sequence based on&nbsp;<em>markers</em>, a fixed subset of short k-mers (k &asymp; 10).</li>
</ul>
<p>More at&nbsp;<a href="https://chanzuckerberg.github.io/shasta/index.html">https://chanzuckerberg.github.io/shasta/index.html</a></p><p>Address of the bookmark: <a href="https://github.com/chanzuckerberg/shasta" rel="nofollow">https://github.com/chanzuckerberg/shasta</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/poll/view/23590/will-minion-nanopore-sequencing-increase-the-number-of-next-generation-sequencing-projects</guid>
	<pubDate>Tue, 04 Aug 2015 05:14:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/poll/view/23590/will-minion-nanopore-sequencing-increase-the-number-of-next-generation-sequencing-projects</link>
	<title><![CDATA[Will MinION Nanopore sequencing increase the number of Next Generation Sequencing projects?]]></title>
	<description><![CDATA[<p>Will MinION Nanopore sequencing increase the number of Next Generation Sequencing projects?</p>]]></description>
	<dc:creator>Strand</dc:creator>
</item>

</channel>
</rss>