<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36905?offset=240</link>
	<atom:link href="https://bioinformaticsonline.com/related/36905?offset=240" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43770/chromeister-an-ultra-fast-heuristic-approach-to-detect-conserved-signals-in-extremely-large-pairwise-genome-comparisons</guid>
	<pubDate>Thu, 03 Feb 2022 04:01:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43770/chromeister-an-ultra-fast-heuristic-approach-to-detect-conserved-signals-in-extremely-large-pairwise-genome-comparisons</link>
	<title><![CDATA[chromeister: An ultra fast, heuristic approach to detect conserved signals in extremely large pairwise genome comparisons.]]></title>
	<description><![CDATA[<p>chromeister: An ultra fast, heuristic approach to detect conserved signals in extremely large pairwise genome comparisons.</p>
<p dir="auto">USAGE:</p>
<ul dir="auto">
<li>-query: sequence A in fasta format</li>
<li>-db: sequence B in fasta format</li>
<li>-out: output matrix</li>
<li>-kmer Integer: k&gt;1 (default 32). Use 32 for chromosomes and genomes, and 16 for small bacteria</li>
<li>-diffuse Integer: z&gt;0 (default 4). Use 4 for everything; for large plant genomes you can try 1</li>
<li>-dimension Integer: d&gt;0 (default 1000). Size of the output matrix and plot. Use 1000 for everything except full genomes, where 2000 is recommended</li>
</ul><p>Address of the bookmark: <a href="https://github.com/estebanpw/chromeister" rel="nofollow">https://github.com/estebanpw/chromeister</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/27459/tools-for-searching-repeats-and-palindromic-sequences</guid>
	<pubDate>Sat, 21 May 2016 22:32:25 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/27459/tools-for-searching-repeats-and-palindromic-sequences</link>
	<title><![CDATA[Tools for Searching Repeats And Palindromic Sequences]]></title>
	<description><![CDATA[<p>What are genomic interspersed repeats?</p><p>In the mid-1960s, scientists discovered that many genomes contain stretches of highly repetitive DNA sequences (see Reassociation Kinetics Experiments and the C-Value Paradox). These sequences were later characterized and placed into five categories:</p><p><strong>Simple Repeats</strong> - Duplications of simple sets of DNA bases (typically 1-5bp) such as A, CA, CGG, etc.<br /><strong>Tandem Repeats</strong> - Typically found at the centromeres and telomeres of chromosomes, these are duplications of more complex 100-200 base sequences.<br /><strong>Segmental Duplications</strong> - Large blocks of 10-300 kilobases that have been copied to another region of the genome.<br /><strong>Interspersed Repeats</strong><br />Processed Pseudogenes, Retrotranscripts, SINEs - Non-functional copies of RNA genes which have been reintegrated into the genome with the assistance of a reverse transcriptase.<br />DNA Transposons<br />Retrovirus Retrotransposons<br />Non-Retrovirus Retrotransposons (LINEs)</p><p>Currently, up to 50% of the human genome is repetitive, and as detection methods improve this number is expected to increase.</p><p>In genetics, by contrast, the term palindrome refers to a sequence of nucleotides along a DNA (deoxyribonucleic acid) or RNA (ribonucleic acid) strand that contains the same series of nitrogenous bases regardless of the direction from which the strand is analyzed. Akin to a language palindrome&mdash;wherein a word or phrase is spelled the same left-to-right as right-to-left (e.g., the word RADAR or the phrase "able was I ere I saw Elba")&mdash;with genetic palindromes it does not matter whether the nucleic acid strand is read starting from the 3' (three prime) end or the 5' (five prime) end of the strand.</p><p>Recent research on palindromes centers on understanding palindrome formation during gene amplification. 
Other studies have attempted to relate palindrome formation to molecular mechanisms involved in double-strand breaks and in the formation of inverted repeats. Assisted by high-speed computers, other groups of scientists link palindrome formation to the conservation of genetic information.</p><p>Related to the direction of transcription by RNA polymerase, DNA strands have upstream and downstream termini defined by differing chemical groups at each end. The ends of each strand of DNA or RNA are termed the 5' (phosphate group bound to the 5' carbon) and 3' (hydroxyl group bound to the 3' carbon) ends to indicate a polarity within the molecule. Using the letters A, T, C, G to represent the nitrogenous bases adenine, thymine, cytosine, and guanine found in DNA, and the letters A, U, C, G to represent the nitrogenous bases adenine, uracil, cytosine, and guanine found in RNA (note that uracil in RNA replaces the thymine found in DNA), geneticists usually represent DNA by a series of base codes (e.g., 5' AATCGGATTGCA 3'). The base codes are usually arranged from the 5' end to the 3' end.</p><p>Because of specific base pairing in DNA (i.e., adenine (A) always bonds with thymine (T) and cytosine (C) always bonds with guanine (G)), the complementary strand to the sequence 5' AATCGGATTGCA 3' would be 3' TTAGCCTAACGT 5'.</p><p>With palindromes, the sequences on the complementary strands read the same in either direction. For example, a sequence of 5' GAATTC 3' on one strand would be complemented by a 3' CTTAAG 5' strand. In either case, when either strand is read from the 5' end the sequence is GAATTC. Another example of a palindrome would be the sequence 5' CGAAGC 3' that, when reversed, still reads CGAAGC.</p><p>Palindromes are important sequences within nucleic acids. 
Often they are the site of binding for specific enzymes (e.g., restriction endonucleases) designed to cut the DNA strands at specific locations (i.e., at palindromes).</p><p>Palindromes may arise from breakage and chromosomal inversions that form inverted repeats that complement each other. When a palindrome results from an inversion, it is often referred to as an inverted repeat. For example, the sequence 5' CGAAGC 3', if inverted (reversed 180&deg;), still reads CGAAGC.</p><p>The <a href="http://emboss.open-bio.org/">European Molecular Biology Open Software Suite (EMBOSS)</a> includes some basic tools for finding tandem repeats and inverted repeats (see <a href="http://emboss.open-bio.org/html/use/apbs06.html#GroupsAppsTableNucleicrepeatsR6">B.6.22. Applications in group Nucleic:repeats</a>). Many online services provide the EMBOSS tools, for example:</p><ul>
<li>Wageningen Bioinformatics Webportal <a href="http://emboss.bioinformatics.nl/">EMBOSS explorer</a></li>
<li><a href="http://mobyle.pasteur.fr/">Mobyle@Pasteur</a></li>
<li><a href="http://wsembnet.vital-it.ch/">Soaplab2 Web Services at Vital-IT</a></li>
</ul><p>For more sophisticated repeat finding you will want to look at tools using <a href="http://www.girinst.org/repbase/">Repbase</a> for example:</p><ul>
<li>CENSOR
<ul>
<li><a href="http://www.girinst.org/censor/">CENSOR@GIRI</a></li>
<li><a href="http://www.ebi.ac.uk/Tools/so/censor/">CENSOR@EMBL-EBI</a></li>
</ul>
</li>
<li><a href="http://www.repeatmasker.org/">RepeatMasker</a></li>
<li><a href="http://mummer.sourceforge.net/">MUMmer</a>&nbsp;(scan_for_match)</li>
<li><a href="http://emboss.bioinformatics.nl/cgi-bin/emboss/palindrome">Emboss Palindrome</a></li>
</ul><p>Other nucleotide repeat finding methods found by a couple of web searches:</p><ul>
<li><a href="http://tandem.bu.edu/trf/trf.html">Tandem Repeats Finder</a></li>
<li><a href="http://selab.janelia.org/recon.html">RECON</a></li>
<li><a href="http://www.yandell-lab.org/software/repeatrunner.html">RepeatRunner</a></li>
<li><a href="http://bibiserv.techfak.uni-bielefeld.de/reputer/">REPuter</a></li>
<li><a href="http://210.212.215.200/IMEX/index.html">Imperfect Microsatellite Extractor (IMEx)</a></li>
<li><a href="http://www.imtech.res.in/raghava/srf/">Spectral Repeat Finder (SRF)</a></li>
<li><a href="http://zlab.bu.edu/repfind/form.html">REPFIND</a></li>
<li><a href="http://crispr.u-psud.fr/Server/CRISPRfinder.php">CRISPRfinder</a></li>
<li><a href="http://grail.lsd.ornl.gov/grailexp/">GrailEXP</a></li>
<li><a href="http://alggen.lsi.upc.edu/recerca/search/frame-search.html">CONREPP</a></li>
<li><a href="http://www.biophp.org/minitools/find_palindromes/demo.php"><span>find_palindromes</span></a></li>
<li><a href="http://insilico.ehu.eus/palindromes/"><span>Palindrome</span></a></li>
<li><a href="http://bioinfo.cs.technion.ac.il/projects/Engel-Freund/new.html">Palindrome Search</a></li>
</ul>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39821/gvolante-completeness-assessment-of-genometranscriptome-sequences</guid>
	<pubDate>Tue, 06 Aug 2019 21:37:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39821/gvolante-completeness-assessment-of-genometranscriptome-sequences</link>
	<title><![CDATA[gVolante: Completeness Assessment of Genome/Transcriptome Sequences]]></title>
	<description><![CDATA[<p><strong>gVolante</strong><span>&nbsp;provides an online interface for completeness assessment of user&rsquo;s original or publicly available sequence datasets as well as for browsing results of completeness assessment performed on publicly available genome and transcriptome assemblies.</span></p>
<p><img src="https://gvolante.riken.jp/images/assessment.png" width="937" height="545" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://gvolante.riken.jp/" rel="nofollow">https://gvolante.riken.jp/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36758/pbalign-maps-pacbio-reads-to-reference-sequences-and-saves-alignments-to-a-bam-file</guid>
	<pubDate>Thu, 24 May 2018 10:06:52 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36758/pbalign-maps-pacbio-reads-to-reference-sequences-and-saves-alignments-to-a-bam-file</link>
	<title><![CDATA[pbalign: maps PacBio reads to reference sequences and saves alignments to a BAM file]]></title>
	<description><![CDATA[pbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specified filtering criteria, and converts the output to either the SAM format or PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file will be compatible with Quiver if the --forQuiver option is specified.<p>Address of the bookmark: <a href="https://github.com/PacificBiosciences/pbalign" rel="nofollow">https://github.com/PacificBiosciences/pbalign</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</guid>
	<pubDate>Fri, 26 Oct 2018 00:41:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</link>
	<title><![CDATA[COSINE: non-seeding method for mapping long noisy sequences]]></title>
	<description><![CDATA[<p><span>Third-generation sequencing (TGS) technologies are highly promising, but the long and noisy reads they produce are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors.</span></p><p>Address of the bookmark: <a href="https://github.com/SUwonglab/COSINE" rel="nofollow">https://github.com/SUwonglab/COSINE</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/42296/igblast-117-is-now-available-with-improved-identification-of-productive-v-gene-sequences</guid>
	<pubDate>Sun, 01 Nov 2020 16:52:58 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/42296/igblast-117-is-now-available-with-improved-identification-of-productive-v-gene-sequences</link>
	<title><![CDATA[IgBLAST 1.17 is now available with improved identification of productive V gene sequences]]></title>
	<description><![CDATA[<p>A new release of&nbsp;<a href="https://go.usa.gov/x7WMc" target="_blank">IgBLAST</a>&nbsp;(1.17), the popular package for classifying and analyzing immunoglobulin and T cell receptor sequences, is now available on the&nbsp;<a href="https://go.usa.gov/x7WMc" target="_blank">web</a>&nbsp;and from the&nbsp;<a href="https://ftp.ncbi.nih.gov/blast/executables/igblast/release/LATEST" target="_blank">FTP site</a>. The updated package is better at identifying productive V gene sequences. We added a new field, &ldquo;V frame shift&rdquo;, to the IgBLAST output to indicate whether the V gene translation frame contains a frame shift. We have also updated the definition of a productive V(D)J sequence to exclude those with internal frame shifts.</p><p>See the&nbsp;<a href="https://ncbi.github.io/igblast/" target="_blank">new IgBLAST manual</a>&nbsp;on the NCBI GitHub site for more information on setting up and running IgBLAST.</p><p>If you have any questions or concerns, please email us at&nbsp;<a href="mailto:blast-help@ncbi.nlm.nih.gov" target="_blank">blast-help@ncbi.nlm.nih.gov</a>.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44479/doubletrouble-identify-duplicated-genes-from-whole-genome-protein-sequences-and-classify</guid>
	<pubDate>Tue, 05 Mar 2024 00:23:49 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44479/doubletrouble-identify-duplicated-genes-from-whole-genome-protein-sequences-and-classify</link>
	<title><![CDATA[doubletrouble: identify duplicated genes from whole-genome protein sequences and classify]]></title>
	<description><![CDATA[<p><span>doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are: i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD); and v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.</span></p><p>Address of the bookmark: <a href="https://bioconductor.org/packages/release/bioc/html/doubletrouble.html" rel="nofollow">https://bioconductor.org/packages/release/bioc/html/doubletrouble.html</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39272/ansible-simple-agentless-it-automation-that-anyone-can-use</guid>
	<pubDate>Wed, 17 Apr 2019 21:41:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39272/ansible-simple-agentless-it-automation-that-anyone-can-use</link>
	<title><![CDATA[ansible: Simple, agentless IT automation that anyone can use]]></title>
	<description><![CDATA[<p>Ansible is a universal language, unraveling the mystery of how work gets done. Turn tough tasks into repeatable playbooks. Roll out enterprise-wide protocols with the push of a button. Give your team the tools to automate, solve, and share.</p><p>Address of the bookmark: <a href="https://www.ansible.com/" rel="nofollow">https://www.ansible.com/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/14191/scalpel</guid>
	<pubDate>Wed, 20 Aug 2014 02:07:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/14191/scalpel</link>
	<title><![CDATA[Scalpel]]></title>
	<description><![CDATA[<p>A team from Cold Spring Harbor Laboratory has released an algorithm, called Scalpel, for finding insertions and deletions in next generation sequencing data sets. Scalpel, which is open source and <a href="http://scalpel.sourceforge.net/" title="available for download">available for download</a> on SourceForge,&nbsp;<span>outperformed the popular tools GATK HaplotypeCaller and SOAPindel in test runs on both simulated and real whole human exomes.</span></p><p>Like other indel callers, Scalpel works by performing <em>de novo</em>&nbsp;assembly of regions of interest, so that misalignment to the reference genome cannot obscure the presence of an insertion or deletion. Scalpel's innovation is to repeatedly check its assembly before comparing to the reference genome, to account for simple sequence repeats that are a regular source of error in indel calling. When Scalpel assembles an exon, it collects reads that map to that exon (including partial matches), splits them into k-mers, and creates a de Bruijn graph to span the exon; however, if it detects repeats in the map, it iteratively increases the size of the k-mers by one base until the repeats are eliminated. This ensures that the final assembly of the exon is highly accurate while minimizing compute time.</p><p>The Cold Spring Harbor team's validation of Scalpel, <a href="http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3069.html" title="published over the weekend in Nature Methods">published over the weekend in <em>Nature Methods</em></a>, compares Scalpel's performance on a live whole exome against HaplotypeCaller and SOAPindel. The donor is an individual with serious neurological disorders, which may be linked to a high incidence of indels. One thousand indels from this individual's exome, called by one or more of the informatics pipelines, were selected for focused resequencing. 
This resequencing revealed a 77% true positive rate for Scalpel calls, dramatically better than the rates for either of the competing tools; Scalpel performed especially well with indels longer than five base pairs, a traditional weak point for indel callers.</p><p>Finally, the authors demonstrate Scalpel's use on a large set of genetic data from nearly 600 families who donated samples to the Simons Simplex Collection, a project of the Simons Foundation Autism Research Initiative. Scalpel found a very high enrichment for indels in children affected by autism, compared with their unaffected siblings, a pattern that persisted even after excluding common variants.</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27113/picard</guid>
	<pubDate>Fri, 29 Apr 2016 08:21:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27113/picard</link>
	<title><![CDATA[Picard]]></title>
	<description><![CDATA[<p>Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the <a href="http://samtools.github.io/hts-specs/">Hts-specs</a> repository. See especially the <a href="http://samtools.github.io/hts-specs/SAMv1.pdf">SAM specification</a> and the <a href="http://samtools.github.io/hts-specs/VCFv4.3.pdf">VCF specification</a>.</p>
<p>Note that the information on this page is targeted at end-users. For developers, the source code, building instructions and implementation/development resources are available on <a href="https://github.com/broadinstitute/picard">GitHub</a>.</p>
<p>The Picard toolkit is open-source under the <a href="https://tldrlegal.com/license/mit-license">MIT license</a> and free for all uses.</p>
<p>Enjoy!</p><p>Address of the bookmark: <a href="http://broadinstitute.github.io/picard/" rel="nofollow">http://broadinstitute.github.io/picard/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>