<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Abhimanyu Singh's pages]]></title>
	<link>https://bioinformaticsonline.com/pages/owner/abhimanyu?</link>
	<atom:link href="https://bioinformaticsonline.com/pages/owner/abhimanyu?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using SPAdes]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we only have Illumina reads, we can assemble them with SPAdes.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use paired-end Illumina reads:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run SPAdes:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
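# Optional check, added as a hedged sketch (not part of the original tutorial).
# Assumptions: the run above has finished, the output directory name is unchanged,
# and contig headers contain "NODE_" (the usual SPAdes naming).
contigs=spades_assembly_all_illumina/contigs.fasta
if [ -s "$contigs" ]; then grep -c "NODE_" "$contigs"; else echo "contigs not found: $contigs"; fi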
</pre></div><ul>
<li><code>-1</code>&nbsp;is the input file of forward reads</li>
<li><code>-2</code>&nbsp;is the input file of reverse reads</li>
<li><code>--careful</code>&nbsp;tries to reduce the number of mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;computes the coverage threshold automatically (rather than the default setting, &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs with infoseq (part of the EMBOSS suite):</p><div><pre>infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35559/computational-resources-for-te-discovery-and-te-detection</guid>
	<pubDate>Mon, 12 Feb 2018 10:29:18 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35559/computational-resources-for-te-discovery-and-te-detection</link>
	<title><![CDATA[Computational resources for TE discovery and TE detection]]></title>
	<description><![CDATA[<p><span>The contribution of transposable elements (TEs) to genome structure and evolution, as well as their impact on genome sequencing, assembly, annotation and alignment, has generated increasing interest in developing new methods for their computational analysis.</span></p><p><span>The following is a list of resources, with their locations, for TE discovery and TE detection:</span></p><p>BLASTER suite http://urgi.versailles.inra.fr/development/blaster/ </p><p>Censor http://www.girinst.org/censor/download.php </p><p>find_ltr http://darwin.informatics.indiana.edu/cgi-bin/evolution/ltr.pl </p><p>FINDMITE http://jaketu.biochem.vt.edu/dl_software.htm </p><p>HMMER http://hmmer.janelia.org/ </p><p>LTR_FINDER http://tlife.fudan.edu.cn/ltr_finder/ </p><p>LTR_STRUC http://www.genetics.uga.edu/retrolab/data/LTR_Struc.html </p><p>LTR_MINER http://genomebiology.com/2004/5/10/R79/suppl/s7 </p><p>LTR_par http://www.eecs.wsu.edu/~ananth/software.htm </p><p>MAK http://wesslercluster.plantbio.uga.edu/mak06.html </p><p>MaskerAid http://blast.wustl.edu/maskeraid/ </p><p>mer-engine http://mer-engine.cshl.edu/mer-home.php </p><p>mreps http://bioinfo.lifl.fr/mreps/ </p><p>PILER http://www.drive5.com/piler/ </p><p>PLOTREP http://repeats.abc.hu/cgi-bin/plotrep.pl </p><p>RepBase http://www.girinst.org/ </p><p>RepeatFinder http://cbcb.umd.edu/software/RepeatFinder/ </p><p>RepeatGluer http://nbcr.sdsc.edu/euler/intro_tmp.htm </p><p>RepeatMasker http://www.repeatmasker.org/ </p><p>RepeatRunner http://www.yandell-lab.org/repeat_runner/index.html </p><p>RepeatScout http://repeatscout.bioprojects.org/ </p><p>repeat-match http://mummer.sourceforge.net/ </p><p>REPuter http://www.genomes.de/ </p><p>RetroMap http://www.burchsite.com/bioi/RetroMapHome.html </p><p>SMaRTFinder http://bioinf.dimi.uniud.it/software/software/smartfinder </p><p>Tandem Repeats Finder http://tandem.bu.edu/trf/trf.html </p><p>Transposon Cluster Finder 
http://www.mssm.edu/labs/warbup01/paper/files.html </p><p>TE nest http://www.plantgdb.org/prj/TE_nest/TE_nest.html </p><p>TRANSPO http://alggen.lsi.upc.es/recerca/search/transpo/transpo.html </p><p>TSDfinder http://www.ncbi.nlm.nih.gov/CBBresearch/Landsman/TSDfinder/ </p><p>Tu Lab TE tools http://jaketu.biochem.vt.edu/dl_software.htm </p><p>WU-BLAST http://blast.wustl.edu</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</guid>
	<pubDate>Tue, 07 Nov 2017 05:33:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</link>
	<title><![CDATA[Alignment-free sequence comparison tools available for next-generation sequencing data analysis]]></title>
	<description><![CDATA[<div><p><span>kallisto</span></p></div><div><p>Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)</p><p>Software (C++)</p><p><a href="https://pachterlab.github.io/kallisto/">https://pachterlab.github.io/kallisto/</a></p><p>Sailfish</p><p>Estimation of isoform abundances from reference sequences and RNA-seq data (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="http://www.cs.cmu.edu/~ckingsf/software/sailfish/">http://www.cs.cmu.edu/~ckingsf/software/sailfish/</a></p><p>Salmon</p><p>Quantification of the expression of transcripts using RNA-seq data (uses&nbsp;<em>k</em>-mers)</p><p><a href="https://combine-lab.github.io/salmon/">https://combine-lab.github.io/salmon/</a></p><p>RNA-Skim</p><p>RNA-seq quantification at transcript-level (partitions the transcriptome into disjoint transcript clusters; uses&nbsp;<em>sig</em>-mers, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C++)</p><p><a href="http://www.csbio.unc.edu/rs/">http://www.csbio.unc.edu/rs/</a></p><p>Variant calling</p><p>ChimeRScope</p><p>Fusion transcript prediction using gene&nbsp;<em>k</em>-mers profiles of the RNA-seq paired-end reads</p><p>Software (Java)</p><p><a href="https://github.com/ChimeRScope/ChimeRScope/wiki">https://github.com/ChimeRScope/ChimeRScope/wiki</a></p><p>FastGT</p><p>Genotyping of known SNV/SNP variants directly from raw NGS sequence reads by counting unique&nbsp;<em>k</em>-mers</p><p>Software (C)</p><p><a href="https://github.com/bioinfo-ut/GenomeTester4/">https://github.com/bioinfo-ut/GenomeTester4/</a></p><p>Phy-Mer</p><p>Reference-independent mitochondrial haplogroup classifier from NGS data (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/danielnavarrogomez/phy-mer">https://github.com/danielnavarrogomez/phy-mer</a></p><p>LAVA</p><p>Genotyping of known SNPs (dbSNP and Affymetrix's Genome-Wide Human SNP Array) from raw NGS 
reads (<em>k</em>-mer based)</p><p>Software (C)</p><p><a href="http://lava.csail.mit.edu/">http://lava.csail.mit.edu/</a></p><p>MICADo</p><p>Detection of mutations in targeted third-generation NGS data (can distinguish patient-specific mutations; algorithm uses&nbsp;<em>k</em>-mers and is based on colored de Bruijn graphs)</p><p>Software (Python)</p><p><a href="http://github.com/cbib/MICADo">http://github.com/cbib/MICADo</a></p><p>General mapper</p><p>Minimap</p><p>Lightweight and fast read mapper and read overlap detector (uses the concept of &ldquo;minimizers&rdquo;, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C)</p><p><a href="https://github.com/lh3/minimap">https://github.com/lh3/minimap</a></p><p>Assembly</p><p>De novo genome assembly</p><p>MHAP</p><p>Produces highly continuous assemblies (fully resolved chromosome arms) from third-generation long and noisy reads (10 kbp) using the MinHash dimensionality-reduction technique</p><p>Software (Java)</p><p><a href="https://github.com/marbl/MHAP">https://github.com/marbl/MHAP</a></p><p>Miniasm</p><p>Assembler of long noisy reads (SMRT, ONT) using the Overlap-Layout-Consensus (OLC) approach without the necessity of an error correction stage (uses minimap)</p><p>Software (C)</p><p><a href="https://github.com/lh3/miniasm">https://github.com/lh3/miniasm</a></p><p>LINKS</p><p>Scaffolding of genome assemblies with error-containing long sequences (e.g., ONT or PacBio reads, draft genomes)</p><p>Software (Perl)</p><p><a href="https://github.com/warrenlr/LINKS/">https://github.com/warrenlr/LINKS/</a></p><p>Read clustering</p><p>afcluster</p><p>Clustering of reads from different genes and different species based on&nbsp;<em>k</em>-mer counts</p><p>Software (C++)</p><p><a href="https://github.com/luscinius/afcluster">https://github.com/luscinius/afcluster</a></p><p>QCluster</p><p>Clustering of reads with alignment-free measures (<em>k</em>-mer based) and quality values</p><p>Software (C++)</p><p><a 
href="http://www.dei.unipd.it/~ciompin/main/qcluster.html">http://www.dei.unipd.it/~ciompin/main/qcluster.html</a></p><p>Read error correction</p><p>Lighter</p><p>Correction of sequencing errors in raw whole-genome sequencing reads (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="https://github.com/mourisl/Lighter">https://github.com/mourisl/Lighter</a></p><p>QuorUM</p><p>Error corrector for Illumina reads using <em>k</em>-mers</p><p>Software (C++)</p><p><a href="https://github.com/gmarcais/Quorum">https://github.com/gmarcais/Quorum</a></p><p>Trowel</p><p>Software (C++)</p><p><a href="https://sourceforge.net/projects/trowel-ec/">https://sourceforge.net/projects/trowel-ec/</a></p><p>Metagenomics</p><p>Assembly-free phylogenomics</p><p>AAF</p><p>Phylogeny reconstruction directly from unassembled raw sequence data from whole-genome sequencing projects; provides bootstrap support to assess uncertainty in the tree topology (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/fanhuan/AAF">https://github.com/fanhuan/AAF</a></p><p>kSNP v3</p><p>Reference-free SNP identification and estimation of phylogenetic trees using SNPs (based on&nbsp;<em>k</em>-mer analysis)</p><p>Software (C)</p><p><a href="https://sourceforge.net/projects/ksnp/files/">https://sourceforge.net/projects/ksnp/files/</a></p><p>NGS-MC</p><p>Phylogeny of species based on NGS reads using the alignment-free sequence dissimilarity measures d2* and d2S under different Markov chain models (using&nbsp;<em>k</em>-words)</p><p>R package</p><p><a href="http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html">http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html</a></p><p>Species identification/taxonomic profiling</p><p>CLARK</p><p>Taxonomic classification of metagenomic reads to known bacterial genomes using&nbsp;<em>k</em>-mer search and LCA assignment</p><p>Software (C++)</p><p><a href="http://clark.cs.ucr.edu/">http://clark.cs.ucr.edu/</a></p><p>FOCUS</p><p>Reports 
organisms present in metagenomic samples and profiles their abundances (uses a composition-based approach and non-negative least squares for prediction)</p><p>Web service, Software (Python)</p><p><a href="http://edwards.sdsu.edu/FOCUS/">http://edwards.sdsu.edu/FOCUS/</a></p><p>GSM</p><p>Estimation of abundances of microbial genomes in metagenomic samples (<em>k</em>-mer based)</p><p>Software (Go)</p><p><a href="https://github.com/pdtrang/GSM">https://github.com/pdtrang/GSM</a></p><p>Mash</p><p>Species identification using assembled or unassembled Illumina, PacBio, and ONT data (based on the MinHash dimensionality-reduction technique)</p><p>Software (C++)</p><p><a href="https://github.com/marbl/mash">https://github.com/marbl/mash</a></p><p>Kraken</p><p>Taxonomic assignment in metagenome analysis by exact&nbsp;<em>k</em>-mer search; LCA assignment of short reads based on a comprehensive sequence database</p><p>Software (C++)</p><p><a href="https://ccb.jhu.edu/software/kraken/">https://ccb.jhu.edu/software/kraken/</a></p><p>LMAT</p><p>Assignment of taxonomic labels to reads by&nbsp;<em>k</em>-mer searches in a precomputed database</p><p>Software (C++/Python)</p><p><a href="https://sourceforge.net/projects/lmat/">https://sourceforge.net/projects/lmat/</a></p><p>stringMLST</p><p><em>k</em>-mer-based tool for MLST directly from genome sequencing reads</p><p>Software (Python)</p><p><a href="http://jordan.biology.gatech.edu/page/software/stringMLST">http://jordan.biology.gatech.edu/page/software/stringMLST</a></p><p>Taxonomer</p><p><em>k</em>-mer-based ultrafast metagenomics tool for assigning taxonomy to sequencing reads from clinical and environmental samples</p><p>Web service</p><p><a href="http://taxonomer.iobio.io/">http://taxonomer.iobio.io/</a></p><p>Other</p><p>d2-tools</p><p>Word-based (<em>k</em>-tuple) comparison (pairwise dissimilarity matrix using the d2S measure) of metatranscriptomic samples from NGS reads</p><p>Software (Python/R)</p><p><a 
href="https://code.google.com/p/d2-tools/">https://code.google.com/p/d2-tools/</a></p><p>VirHostMatcher</p><p>Prediction of hosts from metagenomic viral sequences based on ONF using various distance measures (e.g., d2)</p><p>Software (C++)</p><p><a href="https://github.com/jessieren/VirHostMatcher">https://github.com/jessieren/VirHostMatcher</a></p><p>MetaFast</p><p>Statistics calculation of metagenome sequences and the distances between them based on assembly using de Bruijn graphs and Bray&ndash;Curtis dissimilarity measure</p><p>Software (Java)</p><p><a href="https://github.com/ctlab/metafast">https://github.com/ctlab/metafast</a></p></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>