<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37241?offset=180</link>
	<atom:link href="https://bioinformaticsonline.com/related/37241?offset=180" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27973/wgsim</guid>
	<pubDate>Thu, 23 Jun 2016 07:26:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27973/wgsim</link>
	<title><![CDATA[WgSim]]></title>
	<description><![CDATA[<p>Reads simulator</p>
<p>Wgsim is a small tool for simulating sequence reads from a reference genome. It is able to simulate diploid genomes with SNPs and insertion/deletion (INDEL) polymorphisms, and simulate reads with uniform substitution sequencing errors. It does not generate INDEL sequencing errors, but this can be partly compensated by simulating INDEL polymorphisms.<br><br>Wgsim outputs the simulated polymorphisms, and writes the true read coordinates as well as the number of polymorphisms and sequencing errors in read names. One can evaluate the accuracy of a mapper or a SNP caller with wgsim_eval.pl that comes with the package.<br><br></p><p>Address of the bookmark: <a href="https://github.com/lh3/wgsim" rel="nofollow">https://github.com/lh3/wgsim</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30144/bima-v3-an-aligner-customized-for-mate-pair-library-sequencing</guid>
	<pubDate>Wed, 14 Dec 2016 15:20:00 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30144/bima-v3-an-aligner-customized-for-mate-pair-library-sequencing</link>
	<title><![CDATA[BIMA V3: an aligner customized for mate pair library sequencing]]></title>
	<description><![CDATA[<p>Summary: Mate pair library sequencing is an effective and economical method for detecting genomic structural variants and chromosomal abnormalities. Unfortunately, the mapping and alignment of mate pair read pairs to a reference genome is a challenging and time-consuming process for most NGS alignment programs. Large insert sizes, introduction of library preparation protocol artifacts (biotin junction reads, paired-end read contamination, chimeras, etc.), and presence of structural variant breakpoints within reads increase mapping and alignment complexity. We describe an algorithm that is up to 20 times faster and 25% more accurate than popular NGS alignment programs when processing mate pair sequencing. <br>Availability: http://bioinformaticstools.mayo.edu/research/bima/ <br>Contact: vasmatzis.george@mayo.edu</p><p>Address of the bookmark: <a href="http://bioinformatics.oxfordjournals.org/content/early/2014/02/12/bioinformatics.btu078.full.pdf" rel="nofollow">http://bioinformatics.oxfordjournals.org/content/early/2014/02/12/bioinformatics.btu078.full.pdf</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31156/splitbam-splits-a-bam-by-chromosomes</guid>
	<pubDate>Tue, 28 Feb 2017 09:01:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31156/splitbam-splits-a-bam-by-chromosomes</link>
	<title><![CDATA[splitbam: splits a BAM by chromosomes]]></title>
	<description><![CDATA[<p><strong>splitbam</strong>&nbsp;splits a BAM by chromosomes.</p>
<p>Using the reference sequence dictionary (<code>*.dict</code>), it can also create an empty BAM file for any chromosome with no SAM record. A pair of 'mock' SAM records can be added to those empty BAMs to keep some tools (like samtools) from crashing.</p>
<h1>Usage</h1>
<p><code>java -jar splitbam.jar -p OUT/__CHROM__/__CHROM__.bam -R ref.fasta (bam|sam|stdin)</code></p>
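<p>The <code>-p</code> value in the usage line above contains the placeholder <code>__CHROM__</code>. As a rough illustration of how such a pattern could map to one output path per chromosome, here is a minimal Python sketch. It is not jvarkit code: the helper <code>expand_pattern</code> is hypothetical, and it assumes that unmapped reads go to a file named after the <code>-u</code> value.</p>

```python
# Hypothetical helper, not part of jvarkit: shows how an output pattern
# containing __CHROM__ could expand to one BAM path per chromosome.
CHROM_TOKEN = "__CHROM__"

def expand_pattern(pattern, chromosomes, unmapped_name="Unmapped"):
    """Return {name: output path} for each chromosome plus the unmapped bin."""
    # Same constraints the -p option states: token present, .bam suffix.
    if CHROM_TOKEN not in pattern or not pattern.endswith(".bam"):
        raise ValueError("pattern must contain __CHROM__ and end with .bam")
    # Assumption: unmapped reads are written under the -u name.
    names = list(chromosomes) + [unmapped_name]
    return {name: pattern.replace(CHROM_TOKEN, name) for name in names}

paths = expand_pattern("OUT/__CHROM__/__CHROM__.bam", ["chr1", "chr2"])
for name, path in paths.items():
    print(name, path)
```

<p>In the real tool the chromosome names would come from the reference dictionary rather than from a hard-coded list.</p>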
<h1>Options</h1>
<ul>
<li>-h help; This screen.</li>
<li>-R (indexed reference file) REQUIRED.</li>
<li>-u (unmapped chromosome name): default:Unmapped</li>
<li>-e | --empty : generate EMPTY bams for chromosome having no read mapped</li>
<li>-m | --mock : if option '-e', add a mock pair of sam records to the empty bam</li>
<li>-p (output file/bam pattern) REQUIRED. MUST contain&nbsp;<strong><code>__CHROM__</code></strong>&nbsp;and end with .bam</li>
<li>-s assume input is sorted.</li>
<li>-x | --index create index.</li>
<li>-t | --tmp (dir) tmp file directory</li>
<li>-G (file) chrom-group file (see below)</li>
</ul><p>Address of the bookmark: <a href="https://code.google.com/archive/p/jvarkit/wikis/SplitBam.wiki" rel="nofollow">https://code.google.com/archive/p/jvarkit/wikis/SplitBam.wiki</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</guid>
	<pubDate>Mon, 15 May 2017 05:04:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</link>
	<title><![CDATA[CABOG: Celera Assembler with Best Overlap Graph]]></title>
	<description><![CDATA[<p>CABOG (Celera Assembler with Best Overlap Graph) is scientific software for&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/24/24/2818.abstract">DNA research</a>. CABOG has been a critical component of many genome sequencing projects. CABOG operates on small genomes such as bacterial as well as large genomes such as mammalian. CABOG is an extension of the Celera Assembler software that was originally developed at&nbsp;<a href="http://www.celera.com/">Celera</a>&nbsp;for the 2001 publication of the first draft human genome sequence. The software was released to the public domain in 2004. Its open source&nbsp;<a href="http://wgs-assembler.sf.net/">repository</a>&nbsp;on Source Forge is an internet resource for scientists around the world.&nbsp;</p>
<p>CABOG is one of many software programs called genome assemblers. These programs exist to overcome the fundamental limitation of all sequencing machines, namely, that they read out very few DNA letters at a time. These programs reconstruct genomes that are billions of letters long from the hundreds of letters per read that modern sequencers provide. What these programs do is often described as a scaled up version of a family solving a jigsaw puzzle.</p>
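<p>To make the jigsaw analogy concrete, the sketch below joins two reads where the end of one matches the start of the other. This is a toy illustration only, not CABOG's algorithm: the function names, the <code>min_len</code> threshold and the example reads are all assumptions, and real assemblers must also handle sequencing errors, repeats and reverse complements.</p>

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    longest = min(len(left), len(right))
    # Try the longest candidate first and stop at the first exact match.
    for k in range(longest, min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0

def merge(left, right, min_len=3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    k = overlap(left, right, min_len)
    return left + right[k:] if k else None

# Two made-up reads that share the four letters CGGA.
print(overlap("ATTCGGA", "CGGATTA"))
print(merge("ATTCGGA", "CGGATTA"))
```

<p>An assembler repeats this kind of comparison across millions of reads and then has to decide which overlaps to trust, which is where most of the difficulty lies.</p>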
<p>The CABOG software was the first to accomplish many scientific goals. It was the first to assemble the genome of a multicellular organism (<em>Drosophila melanogaster</em>, 2000). It was the first to assemble both parental haplotypes of one human genome (J. Craig Venter, 2007). It was the first to assemble environmental sequence from the oceans (Sargasso Sea in 2004 and Global Ocean Sampling in 2007). It was first to combine reads from first-generation Sanger sequencing machines and second-generation pyrosequencing machines (Marine microbes, 2006). Today, CABOG is one of the leading assembly programs for data sets that include paired end data from the Roche 454 line of sequencing machines.</p><p>Address of the bookmark: <a href="http://www.jcvi.org/cms/research/projects/cabog/overview/" rel="nofollow">http://www.jcvi.org/cms/research/projects/cabog/overview/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>This protocol describes a de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to reconstruct the original chromosomes from which the DNA originated. A de novo assembly assumes no prior knowledge of the length, structure or composition of the source DNA sequence. In a genome sequencing experiment, the DNA of the target organism is broken into millions of small fragments that are read on a sequencing machine. Depending on the sequencing platform used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Illumina-style short-read sequencing usually produces reads of 36 to 150 bp. These reads can be either &ldquo;single-end&rdquo;, as described above, or &ldquo;paired-end&rdquo;.</p><p><strong>Why genome assembly?</strong><br />Knowing the DNA sequence of an organism is useful both in basic research into how and why it lives and in applied work. Because DNA is central to all living things, a DNA sequence can be useful in virtually any biological research. In medicine, for example, it may be used to classify and diagnose genetic disorders and eventually to improve their treatment. Similarly, the study of pathogens can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be saved as plain sequence text in a FASTA file or together with their quality attributes in a FASTQ file.&nbsp;FASTQ is the most common read file format because it is what the Illumina sequencing pipeline produces, so the rest of this protocol assumes FASTQ input.</p><p><strong>The protocol in a nutshell:</strong> <br />Get the read file(s) from the sequencing machine(s). <br />Look at the reads to get an idea of what you have and what their quality is like. 
<br />If required, clean up and quality-trim the raw data. <br />Choose an adequate parameter set for the assembly. <br />Assemble the data into contigs/scaffolds. <br />Examine the assembly output and assess its quality.</p><p><strong>Read Quality Control:</strong><br />Check the quality with FastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This step trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />The aim of this part of the protocol is to assemble the quality-trimmed reads into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A wide range of short-read assemblers is available, each with its own strengths and weaknesses. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAPdenovo<br />MIRA<br />ALLPATHS</p><p>The next step is to assess the draft contigs and decide how to use them in the rest of the study.&nbsp;A few things to note about the contigs you just created:&nbsp;They are draft contigs. 
They may contain mis-assemblies.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />GapFiller - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi-automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequently used commands for the analysis are listed at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</guid>
	<pubDate>Thu, 02 Jan 2025 20:11:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44758/the-ifs-and-buts-of-ngs-quality-control-and-trimming</link>
	<title><![CDATA[The "Ifs" and "Buts" of NGS Quality Control and Trimming]]></title>
	<description><![CDATA[<p>Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.</p><h3><strong>The "Ifs" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Ensures Data Integrity</strong><br />If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.</p>
</li>
<li>
<p><strong>Removes Contaminants</strong><br />If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.</p>
</li>
<li>
<p><strong>Improves Mapping and Assembly</strong><br />If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.</p>
</li>
<li>
<p><strong>Reduces Computational Load</strong><br />If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.</p>
</li>
<li>
<p><strong>Prepares for Standardized Analyses</strong><br />If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.</p>
</li>
</ol><h3><strong>The "Buts" of NGS QC and Trimming</strong></h3><ol>
<li>
<p><strong>Risk of Over-Trimming</strong><br />But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.</p>
</li>
<li>
<p><strong>Bias Introduction</strong><br />But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.</p>
</li>
<li>
<p><strong>Loss of Context in Paired-End Reads</strong><br />But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.</p>
</li>
<li>
<p><strong>Time and Resource Intensive</strong><br />But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.</p>
</li>
<li>
<p><strong>Variable Standards</strong><br />But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.</p>
</li>
</ol><h3><strong>Balancing the "Ifs" and "Buts"</strong></h3><p>To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:</p><ul>
<li>
<p><strong>Use QC Tools Wisely:</strong> Start with tools like <strong>FastQC</strong> to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.</p>
</li>
<li>
<p><strong>Choose Reliable Trimming Tools:</strong> Tools like <strong>Trimmomatic</strong>, <strong>Cutadapt</strong>, and <strong>BBduk</strong> offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.</p>
</li>
<li>
<p><strong>Set Reasonable Parameters:</strong> Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.</p>
</li>
<li>
<p><strong>Test Downstream Effects:</strong> Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.</p>
</li>
<li>
<p><strong>Document Your Workflow:</strong> Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.</p>
</li>
</ul><h3><strong>Conclusion</strong></h3><p>NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/30867/perl-special-vars-quick-reference</guid>
	<pubDate>Tue, 07 Feb 2017 05:08:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/30867/perl-special-vars-quick-reference</link>
	<title><![CDATA[Perl Special Vars Quick Reference]]></title>
	<description><![CDATA[<table>
<tbody>
<tr>
<td><tt>$_</tt></td>
<td>The default or implicit variable.</td>
</tr>
<tr>
<td><tt>@_</tt></td>
<td>Subroutine parameters.</td>
</tr>
<tr>
<td><tt>$a</tt><br /><tt>$b</tt></td>
<td><a href="http://perldoc.perl.org/functions/sort.html">sort</a>&nbsp;comparison routine variables.</td>
</tr>
<tr>
<td><tt>@ARGV</tt></td>
<td>The command-line args.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Regular Expressions</span></td>
</tr>
<tr>
<td><tt>$&lt;digit&gt;</tt></td>
<td>Regexp parenthetical capture holders.</td>
</tr>
<tr>
<td><tt>$&amp;</tt></td>
<td>Last successful match (degrades performance).</td>
</tr>
<tr>
<td><tt>${^MATCH}</tt></td>
<td>Similar to&nbsp;<tt>$&amp;</tt>&nbsp;without performance penalty. Requires /p modifier.</td>
</tr>
<tr>
<td><tt>$`</tt></td>
<td>Prematch for last successful match string (degrades performance).</td>
</tr>
<tr>
<td><tt>${^PREMATCH}</tt></td>
<td>Similar to&nbsp;<tt>$`</tt>&nbsp;without performance penalty. Requires&nbsp;<tt>/p</tt>&nbsp;modifier.</td>
</tr>
<tr>
<td><tt>$'</tt></td>
<td>Postmatch for last successful match string (degrades performance).</td>
</tr>
<tr>
<td><tt>${^POSTMATCH}</tt></td>
<td>Similar to&nbsp;<tt>$'</tt>&nbsp;without performance penalty. Requires&nbsp;<tt>/p</tt>&nbsp;modifier.</td>
</tr>
<tr>
<td><tt>$+</tt></td>
<td>Last paren match.</td>
</tr>
<tr>
<td><tt>$^N</tt></td>
<td>Last closed paren match (last submatch).</td>
</tr>
<tr>
<td><tt>@+</tt></td>
<td>Offsets of ends of successful submatches in scope.</td>
</tr>
<tr>
<td><tt>@-</tt></td>
<td>Offsets of starts of successful submatches in scope.</td>
</tr>
<tr>
<td><tt>%+</tt></td>
<td>Like&nbsp;<tt>@+</tt>, but for named submatches.</td>
</tr>
<tr>
<td><tt>%-</tt></td>
<td>Like&nbsp;<tt>@-</tt>, but for named submatches.</td>
</tr>
<tr>
<td><tt>$^R</tt></td>
<td>Last regexp (?{code}) result.</td>
</tr>
<tr>
<td><tt>${^RE_DEBUG_FLAGS}</tt></td>
<td>Current value of regexp debugging flags. See&nbsp;<tt>use re 'debug';</tt></td>
</tr>
<tr>
<td><tt>${^RE_TRIE_MAXBUF}</tt></td>
<td>Control memory allocations for RE optimizations for large alternations.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Encoding</span></td>
</tr>
<tr>
<td><tt>${^ENCODING}</tt></td>
<td>The object reference to the Encode object, used to convert the source code to Unicode.</td>
</tr>
<tr>
<td><tt>${^OPEN}</tt></td>
<td>Internal use: \0 separated Input / Output layer information.</td>
</tr>
<tr>
<td><tt>${^UNICODE}</tt></td>
<td>Read-only Unicode settings.</td>
</tr>
<tr>
<td><tt>${^UTF8CACHE}</tt></td>
<td>State of the internal UTF-8 offset caching code.</td>
</tr>
<tr>
<td><tt>${^UTF8LOCALE}</tt></td>
<td>Indicates whether UTF8 locale was detected at startup.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">IO and Separators</span></td>
</tr>
<tr>
<td><tt>$.</tt></td>
<td>Current line number (or record number) of most recent filehandle.</td>
</tr>
<tr>
<td><tt>$/</tt></td>
<td>Input record separator.</td>
</tr>
<tr>
<td><tt>$|</tt></td>
<td>Output autoflush. 1=autoflush, 0=default. Applies to currently selected handle.</td>
</tr>
<tr>
<td><tt>$,</tt></td>
<td>Output field separator (lists)</td>
</tr>
<tr>
<td><tt>$\</tt></td>
<td>Output record separator.</td>
</tr>
<tr>
<td><tt>$"</tt></td>
<td>Output list separator. (interpolated lists)</td>
</tr>
<tr>
<td><tt>$;</tt></td>
<td>Subscript separator. (Use a real multidimensional array instead.)</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Formats</span></td>
</tr>
<tr>
<td><tt>$%</tt></td>
<td>Page number for currently selected output channel.</td>
</tr>
<tr>
<td><tt>$=</tt></td>
<td>Current page length.</td>
</tr>
<tr>
<td><tt>$-</tt></td>
<td>Number of lines left on page.</td>
</tr>
<tr>
<td><tt>$~</tt></td>
<td>Format name.</td>
</tr>
<tr>
<td><tt>$^</tt></td>
<td>Name of top-of-page format.</td>
</tr>
<tr>
<td><tt>$:</tt></td>
<td>Format line break characters</td>
</tr>
<tr>
<td><tt>$^L</tt></td>
<td>Form feed (default "\f").</td>
</tr>
<tr>
<td><tt>$^A</tt></td>
<td>Format Accumulator</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Status Reporting</span></td>
</tr>
<tr>
<td><tt>$?</tt></td>
<td>Child error. Status code of most recent system call or pipe.</td>
</tr>
<tr>
<td><tt>$!</tt></td>
<td>Operating System Error. (What just went 'bang'?)</td>
</tr>
<tr>
<td><tt>%!</tt></td>
<td>Error number hash</td>
</tr>
<tr>
<td><tt>$^E</tt></td>
<td>Extended Operating System Error (Extra error explanation).</td>
</tr>
<tr>
<td><tt>$@</tt></td>
<td>Eval error.</td>
</tr>
<tr>
<td><tt>${^CHILD_ERROR_NATIVE}</tt></td>
<td>Native status returned by the last pipe close, backtick (`` ) command, successful call to wait() or waitpid(), or from the system() operator.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">ID's and Process Information</span></td>
</tr>
<tr>
<td><tt>$$</tt></td>
<td>Process ID</td>
</tr>
<tr>
<td><tt>$&lt;</tt></td>
<td>Real user id of process.</td>
</tr>
<tr>
<td><tt>$&gt;</tt></td>
<td>Effective user id of process.</td>
</tr>
<tr>
<td><tt>$(</tt></td>
<td>Real group id of process.</td>
</tr>
<tr>
<td><tt>$)</tt></td>
<td>Effective group id of process.</td>
</tr>
<tr>
<td><tt>$0</tt></td>
<td>Program name.</td>
</tr>
<tr>
<td><tt>$^O</tt></td>
<td>Operating System name.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Perl Status Info</span></td>
</tr>
<tr>
<td><tt>$]</tt></td>
<td>Old: Version and patch number of perl interpreter. Deprecated.</td>
</tr>
<tr>
<td><tt>$^C</tt></td>
<td>Current value of flag associated with&nbsp;<strong>-c</strong>&nbsp;switch.</td>
</tr>
<tr>
<td><tt>$^D</tt></td>
<td>Current value of debugging flags</td>
</tr>
<tr>
<td><tt>$^F</tt></td>
<td>Maximum system file descriptor.</td>
</tr>
<tr>
<td><tt>$^I</tt></td>
<td>Value of the&nbsp;<strong>-i</strong>&nbsp;(inplace edit) switch.</td>
</tr>
<tr>
<td><tt>$^M</tt></td>
<td>Emergency Memory pool.</td>
</tr>
<tr>
<td><tt>$^P</tt></td>
<td>Internal variable for debugging support.</td>
</tr>
<tr>
<td><tt>$^S</tt></td>
<td>Exceptions being caught. (eval)</td>
</tr>
<tr>
<td><tt>$^T</tt></td>
<td>Base time of program start.</td>
</tr>
<tr>
<td><tt>$^V</tt></td>
<td>Perl version.</td>
</tr>
<tr>
<td><tt>$^W</tt></td>
<td>Status of -w switch</td>
</tr>
<tr>
<td><tt>${^WARNING_BITS}</tt></td>
<td>Current set of warning checks enabled by&nbsp;<tt>use warnings;</tt></td>
</tr>
<tr>
<td><tt>$^X</tt></td>
<td>Perl executable name.</td>
</tr>
<tr>
<td><tt>${^GLOBAL_PHASE}</tt></td>
<td>Current phase of the Perl interpreter.</td>
</tr>
<tr>
<td><tt>$^H</tt></td>
<td>Internal use only: Hook into Lexical Scoping.</td>
</tr>
<tr>
<td><tt>%^H</tt></td>
<td>Internal use only: Useful to implement scoped pragmas.</td>
</tr>
<tr>
<td><tt>${^TAINT}</tt></td>
<td>Taint mode read-only flag.</td>
</tr>
<tr>
<td><tt>${^WIN32_SLOPPY_STAT}</tt></td>
<td>If true on Windows&nbsp;<tt>stat()</tt>&nbsp;won't try to open the file.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Command Line Args</span></td>
</tr>
<tr>
<td><tt>ARGV</tt></td>
<td>Filehandle iterates over files from command line (see also&nbsp;<tt>&lt;&gt;</tt>).</td>
</tr>
<tr>
<td><tt>$ARGV</tt></td>
<td>Name of current file when reading &lt;&gt;</td>
</tr>
<tr>
<td><tt>@ARGV</tt></td>
<td>List of command line args.</td>
</tr>
<tr>
<td><tt>ARGVOUT</tt></td>
<td>Output filehandle for -i switch</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Miscellaneous</span></td>
</tr>
<tr>
<td><tt>@F</tt></td>
<td>Autosplit (-a mode) recipient.</td>
</tr>
<tr>
<td><tt>@INC</tt></td>
<td>List of library paths.</td>
</tr>
<tr>
<td><tt>%INC</tt></td>
<td>Keys are filenames, values are paths to modules included via&nbsp;<tt>use, require,&nbsp;</tt>or&nbsp;<tt>do</tt>.</td>
</tr>
<tr>
<td><tt>%ENV</tt></td>
<td>Hash containing current environment variables</td>
</tr>
<tr>
<td><tt>%SIG</tt></td>
<td>Signal handlers.</td>
</tr>
<tr>
<td><tt>$[</tt></td>
<td>Array and substr first element (Deprecated!).</td>
</tr>
</tbody>
</table><p>&nbsp;</p><p>See&nbsp;<a href="http://perldoc.perl.org/perlvar.html">perlvar</a>&nbsp;for detailed descriptions of each of these (and a few more) special variables.</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</guid>
	<pubDate>Wed, 15 Sep 2021 21:15:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</link>
	<title><![CDATA[Reference Sequence Resource!]]></title>
	<description><![CDATA[<p><span>The ENCODE project uses reference genomes from&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/genome/browse/reference/">NCBI</a><span>&nbsp;or&nbsp;</span><a href="http://hgdownload.cse.ucsc.edu/downloads.html">UCSC</a><span>&nbsp;to provide a consistent framework for mapping high-throughput sequencing data.&nbsp;In general, ENCODE data are mapped consistently to 2 human (GRCh38, hg19) and 2 mouse (mm9/mm10) genomes for historical comparability.&nbsp;</span><em>Drosophila melanogaster</em><span>&nbsp;experiments are mapped to either dm3 or dm6 and&nbsp;</span><em>Caenorhabditis elegans&nbsp;</em><span>experiments are mapped to ce10 or ce11.</span></p><p>Address of the bookmark: <a href="https://www.encodeproject.org/data-standards/reference-sequences/" rel="nofollow">https://www.encodeproject.org/data-standards/reference-sequences/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37414/arc-pipeline-which-facilitates-iterative-reference-guided-de-novo-assemblies</guid>
	<pubDate>Thu, 26 Jul 2018 09:20:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37414/arc-pipeline-which-facilitates-iterative-reference-guided-de-novo-assemblies</link>
	<title><![CDATA[ARC: pipeline which facilitates iterative, reference guided de novo assemblies]]></title>
	<description><![CDATA[<p>ARC is a pipeline which facilitates iterative, reference guided&nbsp;<em>de novo</em>&nbsp;assemblies with the intent of:</p>
<ol>
<li>Reducing time in analysis and increasing accuracy of results by only considering those reads which should assemble together.</li>
<li>Reducing/removing reference bias as compared to mapping based approaches.</li>
</ol>
<p><span>The software is designed to work in situations where a whole-genome assembly is not the objective, but rather when the researcher wishes to assemble discrete 'targets' contained within next-generation shotgun sequence data. ARC decomplexifies the traditionally difficult problem of assembly by breaking the reads into small, manageable subsets which can then be assembled quickly and efficiently in parallel. Applications include those in which the researcher wishes to&nbsp;</span><em>de novo</em><span>&nbsp;assemble specific content and a set of semi-similar reference targets is available to initialize the assembly process.</span></p>
<p>https://ibest.github.io/ARC/</p><p>Address of the bookmark: <a href="https://ibest.github.io/ARC/" rel="nofollow">https://ibest.github.io/ARC/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39104/hipstr-haplotype-inference-and-phasing-for-short-tandem-repeats</guid>
	<pubDate>Thu, 07 Mar 2019 21:13:06 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39104/hipstr-haplotype-inference-and-phasing-for-short-tandem-repeats</link>
	<title><![CDATA[HipSTR: Haplotype inference and phasing for Short Tandem Repeats]]></title>
	<description><![CDATA[<p><span>HipSTR</span>&nbsp;was specifically developed to deal with the alignment and PCR stutter errors that complicate short tandem repeat (STR) genotyping, in the hopes of obtaining more robust STR genotypes. In particular, it accomplishes this by:</p>
<ol>
<li>Learning locus-specific PCR stutter models using an&nbsp;<a href="http://en.wikipedia.org/wiki/Expectation-maximization_algorithm">EM algorithm</a></li>
<li>Mining candidate STR alleles from population-scale sequencing data</li>
<li>Employing a specialized hidden Markov model to align reads to candidate alleles while accounting for STR artifacts</li>
<li>Utilizing phased SNP haplotypes to genotype and phase STRs</li>
</ol><p>Address of the bookmark: <a href="https://github.com/tfwillems/HipSTR" rel="nofollow">https://github.com/tfwillems/HipSTR</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>

</channel>
</rss>