<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36583?offset=170</link>
	<atom:link href="https://bioinformaticsonline.com/related/36583?offset=170" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/42023/encode3-a-collection-of-research-articles-and-related-content-describing-the-encyclopedia-of-dna-elements-its-datasets-and-tools</guid>
	<pubDate>Sat, 08 Aug 2020 08:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/42023/encode3-a-collection-of-research-articles-and-related-content-describing-the-encyclopedia-of-dna-elements-its-datasets-and-tools</link>
	<title><![CDATA[ENCODE3: A collection of research articles and related content describing the Encyclopedia of DNA Elements, its datasets and tools.]]></title>
	<description><![CDATA[<p>How cells, tissues and organisms interpret the information encoded in the genome has vital implications for our understanding of development, health and disease. Launched in 2003, the ENCyclopedia Of DNA Elements (ENCODE) project has the aim of mapping the functional elements in the human genome (later expanded to include model organisms).</p><p>During the first phase of ENCODE, published in 2007, microarray-based technologies were used to detect regions associated with transcription factors, certain histone modifications and open chromatin within a pre-specified 1% of the human genome.</p><p>ENCODE&rsquo;s second phase saw a switch to sequencing-based technologies, the addition of new assay types and the analysis of functional elements genome-wide, described in a collection of research articles in 2012.</p><p><span>The&nbsp;</span><a href="https://www.nature.com/articles/s41586-020-2493-4">Encyclopedia paper of ENCODE 3</a><span>, published in&nbsp;</span><em>Nature</em><span>, gives an overview of the various assays that were performed in human and mouse cell lines and tissues and describes a Registry of human and mouse candidate&nbsp;</span><em>cis</em><span>-regulatory elements (cCREs).</span></p><p>More at&nbsp;<a href="https://www.nature.com/immersive/d42859-020-00027-2/index.html">https://www.nature.com/immersive/d42859-020-00027-2/index.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools !]]></title>
	<description><![CDATA[<p>There are two main approaches to detecting ancient WGD: collinearity analysis and analysis of the Ks distribution. Ks is the average number of synonymous substitutions per synonymous site; its counterpart, Ka, is the average number of non-synonymous substitutions per non-synonymous site.</p><p>Several write-ups of WGD analysis pipelines are already available. Searching for the keyword "wgd pipeline" turned up the following:</p><p><strong>GenoDup: https://github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https://github.com/yongzhiyang2012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https://github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This article introduces the usage of wgd.</p><p>At the time of writing, wgd cannot be installed directly with Bioconda, so installation takes a little effort because it depends on several other programs. 
wgd depends on the following software:</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>The good news is that most of these dependencies can be installed with Bioconda:</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>Here mpi=1.0=mpich is selected because i-ADHoRe depends on MPICH. If OpenMPI is installed instead, the following error appears: error while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, installing wgd itself is much simpler. Either clone the repository and install it locally (first three commands), or install it directly from GitHub (last command); both install the same package:</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .<br />pip install git+https://github.com/arzwa/wgd.git</p></blockquote><p>For i-ADHoRe, you need to register at http://bioinformatics.psb.ugent.be/webtools/i-adhore/licensing/ and agree to the license to download i-ADHoRe-3.0.</p><p>Since my miniconda3 is installed under ~/opt/, the installation prefix is ~/opt/miniconda3/envs/wgd/</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4<br />make install</p></blockquote><p>Take the sugarcane genome Saccharum spontaneum L as an example. 
The genome is 8-ploid with 32 chromosomes (2n = 4x8 = 32)</p><p><strong>Download the CDS and GFF annotation files for the tutorial</strong></p><blockquote><p><strong>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz</strong><br /><strong>gunzip *.gz</strong></p></blockquote><p>First run <code>conda activate wgd</code> to start the analysis environment, then begin the analysis.</p><p>Step 1: Use <code>wgd mcl</code> to identify homologous genes in the genome</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2: Use <code>wgd ksd</code> to build the Ks distribution</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3: If the genome assembly is of good quality, <code>wgd syn</code> can be used for collinearity analysis. It identifies the collinear blocks in the genome and their corresponding anchor points</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>For further reading: wgd has 9 sub-commands</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: All-vs-all BLASTP comparison followed by MCL clustering.</span></li>
<li><span><span>mix: Mixture modeling of the Ks distribution.</span></span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: Call I-ADHoRe 3.0 to run collinearity analysis using GFF files</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: Standard Ks analysis workflow for the whole paranome; calls mcl, ksd and syn</span></li>
<li><span>wf2: Standard Ks analysis workflow for one-vs-one orthologs; calls mcl and ksd</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</guid>
	<pubDate>Sat, 11 Sep 2021 00:28:14 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</link>
	<title><![CDATA[RagTag: a collection of software tools for scaffolding and improving modern genome assemblies]]></title>
	<description><![CDATA[<p>RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. Tasks include:</p>
<ul>
<li>Homology-based misassembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/correct">correction</a></li>
<li>Homology-based assembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/scaffold">scaffolding</a>&nbsp;and&nbsp;<a href="https://github.com/malonge/RagTag/wiki/patch">patching</a></li>
<li>Scaffold&nbsp;<a href="https://github.com/malonge/RagTag/wiki/merge">merging</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/malonge/RagTag" rel="nofollow">https://github.com/malonge/RagTag</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</guid>
	<pubDate>Mon, 24 Jul 2023 07:04:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</link>
	<title><![CDATA[Bioinformatics tools for genome assembly !]]></title>
	<description><![CDATA[<p>There are numerous genome assembly tools available, each with its strengths and weaknesses. Here is a list of some widely used genome assembly tools, current as of September 2021:</p><ol>
<li>
<p><span>SPAdes:</span> An assembler specifically designed for single-cell and multi-cell bacterial genomes, as well as small eukaryotic genomes.</p>
</li>
<li>
<p><span>ABySS:</span> A parallelized assembler for large genomes that uses de Bruijn graphs.</p>
</li>
<li>
<p><span>Velvet:</span> Another de Bruijn graph-based assembler optimized for short-read sequencing data.</p>
</li>
<li>
<p><span>SOAPdenovo:</span> A de Bruijn graph-based assembler designed for short reads, widely used for assembling large and complex genomes.</p>
</li>
<li>
<p><span>MaSuRCA:</span> A hybrid assembler that combines data from multiple sequencing technologies, such as Illumina and PacBio.</p>
</li>
<li>
<p><span>Canu:</span> A long-read assembler optimized for PacBio and Oxford Nanopore sequencing data.</p>
</li>
<li>
<p><span>Flye:</span> A long-read assembler suitable for bacterial and small eukaryotic genomes.</p>
</li>
<li>
<p><span>SMARTdenovo:</span> An assembler designed for long reads, particularly suited for PacBio data.</p>
</li>
<li>
<p><span>SPAdes Long Read (SPAdesLR):</span> An extension of SPAdes for long-read data, such as those from PacBio or Nanopore.</p>
</li>
<li>
<p><span>Minia:</span> An assembler optimized for low memory consumption, suitable for small and medium-sized genomes.</p>
</li>
<li>
<p><span>Unicycler:</span> A hybrid assembler that combines short and long reads for circular bacterial genome assembly.</p>
</li>
<li>
<p><span>wtdbg2:</span> A long-read assembler based on a fuzzy Bruijn graph, efficient for very large genomes.</p>
</li>
<li>
<p><span>Shasta:</span> A long-read assembler that uses the Overlap-Layout-Consensus approach, suitable for PacBio and Nanopore data.</p>
</li>
<li>
<p><span>Sparc:</span> A sparsity-based consensus algorithm for noisy long reads, used to build consensus sequences for long-read assemblies rather than to assemble reads itself.</p>
</li>
<li>
<p><span>CANA:</span> An assembler for metagenomic data, particularly for complex and diverse microbial communities.</p>
</li>
<li>
<p><span>Ra:</span> An Overlap-Layout-Consensus assembler for raw, uncorrected long reads from PacBio and Oxford Nanopore.</p>
</li>
</ol><p>Bioinformatics is constantly evolving, and newer assembly tools may have emerged since this list was compiled. The performance of these tools also varies with the characteristics of the sequencing data and the genome being assembled. When selecting an assembly tool, consider the specific requirements of your project, the available data types, and the computational resources at your disposal. Always refer to each tool's documentation and publications for the most up-to-date information and recommendations.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</guid>
	<pubDate>Fri, 04 Oct 2024 02:45:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</link>
	<title><![CDATA[Libraries or management tools for high throughput sequencing data]]></title>
	<description><![CDATA[<ul>
<li><a href="http://gatb.inria.fr/"><span>GATB</span></a>&nbsp;Library.&nbsp;The&nbsp;<span>Genome Analysis Toolbox with de-Bruijn graph.&nbsp;</span>A large part of the tools developed by the GenScale team are based on this library.<br />These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large volumes of read data from any kind of organism, such as bacteria, plants and animals, and even complex samples (<em>e.g.</em>&nbsp;metagenomes). The full list of tools is available here:&nbsp;<a href="https://gatb.inria.fr/software/">https://gatb.inria.fr/software/</a>.</li>
<li><a href="https://github.com/morispi/LRez"><span>LRez</span></a>: C++ Library and toolkit for the barcode-based management and indexation of linked-read datasets.</li>
</ul><h2>Variant calling and/or genotyping</h2><ul>
<li><a href="https://gatb.inria.fr/software/discosnp/" title="DiscoSNP">DiscoSNP++ and&nbsp;discoSnpRAD</a>: Reference-free small variant discovery (SNPs and indels)</li>
<li><a href="https://gatb.inria.fr/software/mind-the-gap/" title="MindTheGap">MindTheGap</a>: Detection and assembly of large insertion variants</li>
<li><a href="https://gatb.inria.fr/software/takeabreak/" title="TakeABreak">TakeABreak</a>:&nbsp;reference-free inversion discovery tool</li>
<li><a href="https://github.com/llecompte/SVJedi">SVJedi</a>: Structural Variant genotyper with long read data</li>
<li><a href="https://github.com/SandraLouise/SVJedi-graph">SVJedi-graph</a>: Structural Variant genotyper with long read data using a variation graph</li>
</ul><h2>Sequence assembly</h2><ul>
<li><a href="https://github.com/cguyomar/MinYS">MinYS</a>: reference-guided genome assembly in metagenomics data</li>
<li><a href="https://github.com/anne-gcd/MTG-Link">MTG-link</a>: local assembly tool for linked-read data</li>
<li><a href="https://gatb.inria.fr/software/minia/" title="Minia">Minia</a>: De novo short read assembler</li>
<li><a href="https://gatb.inria.fr/de-novo-genome-assembly/">de-novo pipeline</a>:&nbsp;<em>de-novo</em>&nbsp;assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes</li>
<li><a href="https://gatb.inria.fr/software/mapsembler/" title="Mapsembler2">Mapsembler2</a>: Targeted assembly (not maintained)</li>
</ul><h2>Managing k-mers &amp; indexation</h2><ul>
<li><a href="https://github.com/lrobidou/findere">findere</a>:&nbsp;simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
<ul>
<li><a href="https://github.com/lrobidou/fimpera">fimpera</a>&nbsp;extends findere adding the abundance information.</li>
</ul>
</li>
<li><a href="https://github.com/tlemane/kmtricks">kmtricks</a>:&nbsp;modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.</li>
<li><a href="https://github.com/tlemane/kmindex">kmindex&nbsp;</a>is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.</li>
<li><a href="https://github.com/pierrepeterlongo/back_to_sequences">back to sequences</a>: Find sequences (reads, unitigs, genes) related to a set of kmers in large datasets, in a matter of seconds.</li>
<li><a href="https://github.com/vicLeva/bqf">Backpack Quotient Filter</a>:&nbsp;k-mer indexing data structure with abundance</li>
<li><a href="http://github.com/GATB/rconnector">short read connector</a>:&nbsp;Detect similar reads in potentially large read sets</li>
<li><a href="https://gatb.inria.fr/software/dsk/" title="DSK">DSK</a>:&nbsp;Count k-mers in sequences</li>
</ul><h2>Pangenome graph manipulation</h2><ul>
<li><a href="https://github.com/Tharos-ux/pancat">Pancat</a>: Pangenome Comparison and Analysis Toolkit</li>
<li><a href="https://pypi.org/project/gfagraphs/">GFAGraphs</a>: a Python library to handle pangenome graph files in GFA format.</li>
</ul><h2>Comparative metagenomics with k-mers</h2><ul>
<li><a href="https://github.com/GATB/simka">Simka and SimkaMin</a>:&nbsp;Comparative metagenomics for large-scale datasets</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/compreads-metagenomic-data-analysis/">Compareads &amp; Commet</a>:&nbsp;comparison of metagenomic datasets</li>
</ul><h2>Species and bacterial strains identification</h2><ul>
<li><a href="https://github.com/gsiekaniec/ORI">ORI</a>: software using long nanopore reads to identify bacteria present in a sample at the strain level</li>
<li><a href="https://github.com/kevsilva/StrainFLAIR">StrainFLAIR</a>:&nbsp;STRAIN-level proFiLing using vArIation gRaph</li>
</ul><h2>General-purpose sequencing data manipulation</h2><ul>
<li><a href="https://team.inria.fr/genscale/ngs-software/gassst/">GASSST</a>:&nbsp;long read mapper</li>
<li><a href="https://gatb.inria.fr/software/leon/" title="Leon">Leon</a>: short read compressor (now included in GATB-core)</li>
<li><a href="https://gatb.inria.fr/software/bloocoo/" title="Bloocoo">Bloocoo</a>:&nbsp;short read corrector</li>
<li><a href="https://github.com/GATB/bcalm">BCALM</a>:&nbsp;Construct compacted de Bruijn graphs (unitigs)</li>
</ul><h2>Protein Structure</h2><ul>
<li><a href="https://team.inria.fr/genscale/protein-structure/a-purva-contact-map-overlap-solver/">A_Purva</a>:&nbsp;Contact Map Overlap solver</li>
<li><a href="https://team.inria.fr/genscale/protein-structure/md-jeep-distance-geomtry-solver/">MD-Jeep</a>:&nbsp;Distance Geometry solver</li>
<li><a href="https://team.inria.fr/genscale/csa-comparative-structural-alignment/">CSA</a>:&nbsp;Comparative Structural Alignment</li>
</ul><h2>Workflow</h2><ul>
<li><a href="https://team.inria.fr/genscale/workflows/slicee/">SLICEE</a>:&nbsp;parallel execution of bioinformatics workflows</li>
</ul><h2>Comparative Genomics</h2><ul>
<li><a href="https://team.inria.fr/genscale/comparative-genomics/cassis/">CASSIS</a>:&nbsp;detection of rearrangement breakpoints</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/">PLAST</a>:&nbsp;intensive bank-to-bank sequence comparison</li>
<li><a href="https://github.com/stephanierobin/DrjBreakpointFinder">DRJBreakpointFinder</a>: detection and precise localization of excision sites in proviral segments</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40546/clincnv-detection-of-copy-number-changes-in-germlinetriosomatic-contexts-in-ngs-data</guid>
	<pubDate>Thu, 16 Jan 2020 23:16:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40546/clincnv-detection-of-copy-number-changes-in-germlinetriosomatic-contexts-in-ngs-data</link>
	<title><![CDATA[ClinCNV: Detection of copy number changes in Germline/Trio/Somatic contexts in NGS data]]></title>
	<description><![CDATA[<p><span>ClinCNV detects CNVs in germline and somatic contexts in NGS data (targeted and whole-genome). We work in cohorts, so it makes sense to try&nbsp;</span><code>ClinCNV</code><span>&nbsp;if you have more than 10 samples (the recommended number is 40, since we estimate variances from the data). By "cohort" we mean samples sequenced with the same enrichment kit at approximately the same depth (i.e. 1x WGS and 30x WGS are better analysed in separate runs of ClinCNV). It is also better if your samples were sequenced at the same sequencing facility.</span></p><p>Address of the bookmark: <a href="https://github.com/imgag/ClinCNV" rel="nofollow">https://github.com/imgag/ClinCNV</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</guid>
	<pubDate>Fri, 08 Dec 2017 16:48:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</link>
	<title><![CDATA[kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome]]></title>
	<description><![CDATA[<p><span>Sept. 20, 2017: Version 3.1 released. Major upgrade. Version 3.1 fixes the problems with SNP annotation that arose when NCBI discontinued use of GI numbers. Please read carefully the Preface (page 3) and the File of annotated genomes section (pages 9-10) in the version 3.1 User Guide. Thanks to Tom Slezak for revising the get_genbank_file3 script and to Tod Stuber (USDA) for testing version 3.1 even though he doesn't need the annotation feature. All users are encouraged to upgrade to version 3.1.&nbsp;<br></span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/ksnp/files/" rel="nofollow">https://sourceforge.net/projects/ksnp/files/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37241/remilo-reference-assisted-misassembly-detection-algorithm-using-short-and-long-reads</guid>
	<pubDate>Fri, 06 Jul 2018 04:27:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37241/remilo-reference-assisted-misassembly-detection-algorithm-using-short-and-long-reads</link>
	<title><![CDATA[ReMILO: reference assisted misassembly detection algorithm using short and long reads.]]></title>
	<description><![CDATA[<p>ReMILO is a reference-assisted misassembly detection algorithm that uses both short reads and PacBio SMRT long reads. ReMILO aligns the initial short reads to both the contigs and the reference genome, and then constructs a novel data structure called the red-black multipositional de Bruijn graph to detect misassemblies. In addition, ReMILO aligns the contigs to long reads and finds their differences from the long reads to detect more misassemblies.</p><p>Address of the bookmark: <a href="https://github.com/songc001/remilo" rel="nofollow">https://github.com/songc001/remilo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</guid>
	<pubDate>Sat, 20 Sep 2025 09:34:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</link>
	<title><![CDATA[HiTE: a fast and accurate dynamic boundary adjustment approach for full-length Transposable Elements detection and annotation in Genome Assemblies]]></title>
	<description><![CDATA[<p dir="auto"><code>HiTE</code>&nbsp;is a Python software that uses a dynamic boundary adjustment approach to detect and annotate full-length Transposable Elements in Genome Assemblies. In comparison to other tools, HiTE demonstrates superior performance in detecting a greater number of full-length TEs.</p>
<div dir="auto">
<h2 dir="auto">panHiTE</h2>
</div>
<p dir="auto">We have developed panHiTE, a comprehensive and accurate pipeline for TE detection in large-scale population genomes. It has been successfully applied to hundreds of plant population genomes, demonstrating its effectiveness and scalability.</p>
<p dir="auto">For detailed instructions, please refer to the&nbsp;<a href="https://github.com/CSU-KangHu/HiTE/wiki/panHiTE-tutorial">panHiTE tutorial</a>.</p><p>Address of the bookmark: <a href="https://github.com/CSU-KangHu/HiTE" rel="nofollow">https://github.com/CSU-KangHu/HiTE</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/1161/genomics-for-bioinformatician</guid>
	<pubDate>Sat, 20 Jul 2013 07:03:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/1161/genomics-for-bioinformatician</link>
	<title><![CDATA[Genomics for Bioinformatician]]></title>
	<description><![CDATA[<p>Genomics is the study of the genomes of organisms. The field includes intensive efforts to determine the entire DNA sequence of organisms and fine-scale genetic mapping efforts. The field also includes studies of intragenomic phenomena such as heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome. In contrast, the investigation of the roles and functions of single genes is a primary focus of molecular biology or genetics and is a common topic of modern medical and biological research. Research of single genes does not fall into the definition of genomics unless the aim of this genetic, pathway, and functional information analysis is to elucidate its effect on, place in, and response to the entire genome's networks.<br /><br />Genomics was established by Fred Sanger when he first sequenced the complete genomes of a virus and a mitochondrion. His group established techniques of sequencing, genome mapping, data storage, and bioinformatic analyses in the 1970-1980s. A major branch of genomics is still concerned with sequencing the genomes of various organisms, but the knowledge of full genomes has created the possibility for the field of functional genomics, mainly concerned with patterns of gene expression during various conditions. The most important tools here are microarrays and bioinformatics. Study of the full set of proteins in a cell type or tissue, and the changes during various conditions, is called proteomics. A related concept is materiomics, which is defined as the study of the material properties of biological materials (e.g. hierarchical protein structures and materials, mineralized biological tissues, etc.) and their effect on the macroscopic function and failure in their biological context, linking processes, structure and properties at multiple scales through a materials science approach. The actual term 'genomics' is thought to have been coined by Dr. 
Tom Roderick, a geneticist at the Jackson Laboratory (Bar Harbor, ME), over beer at a meeting held in Maryland in 1986 on the mapping of the human genome.<br /><br />The NHGRI's vision for the future of genomics, the outcome of almost two years of intense discussions with hundreds of scientists and members of the public, has three major areas of focus: Genomics to Biology, Genomics to Health, and Genomics to Society.<br /><br /><strong><em>Genomics to Biology:</em></strong><br />The human genome sequence provides foundational information that now will allow development of a comprehensive catalog of all of the genome's components, determination of the function of all human genes, and deciphering of how genes and proteins work together in pathways and networks.<br /><br /><strong><em>Genomics to Health:</em></strong><br />Completion of the human genome sequence offers a unique opportunity to understand the role of genetic factors in health and disease, and to apply that understanding rapidly to prevention, diagnosis, and treatment. This opportunity will be realized through such genomics-based approaches as identification of genes and pathways and determining how they interact with environmental factors in health and disease, more precise prediction of disease susceptibility and drug response, early detection of illness, and development of entirely new therapeutic approaches.<br /><br /><strong><em>Genomics to Society:</em></strong><br />Just as the Human Genome Project (HGP) has spawned new areas of research in basic biology and in health, it has created new opportunities in exploring the ethical, legal, and social implications (ELSI) of such work. These include defining policy options regarding the use of genomic information in both medical and non-medical settings and analysis of the impact of genomics on such concepts as race, ethnicity, kinship, individual and group identity, health, disease, and "normality" for traits and behaviors.<br /><br />This vision for the future of genomics is not just about the NHGRI. 
It encompasses the whole field of genomics, including the work of all the other Institutes and Centers at the NIH and of a number of other federal agencies. All of the NIH Institutes are already taking full advantage of the sequence and will apply its data to the better understanding of both rare and common diseases, almost all of which have a genetic component. A recent example of the way that the HGP and the knowledge and new technologies it has spawned are already facilitating science is the extremely rapid sequencing by groups in Canada and at the Centers for Disease Control and Prevention (CDC) in Atlanta of the genome of the virus that causes Severe Acute Respiratory Syndrome (SARS). The sequencing of the SARS virus genome provides insight into this new and deadly disease at a speed never before possible in science. In turn, this should lead to the rapid development of diagnostic tests and, in time, vaccines and effective treatments.<br /><br /><strong>Links to additional material available online</strong></p><p><a href="http://pevsnerlab.kennedykrieger.org/bioinformatics/bioinf10_genomes.htm">Genomes and genomics:</a></p><p><a href="http://www.123genomics.com/learning.html">Bioinformatics and Genomics:</a></p><p><a href="http://www.ebi.ac.uk/pdbe/docs/roadshow_tutorial/strgenomics/tutorial.html">Structural genomics tutorial:</a></p><p><a href="http://www.hgu.mrc.ac.uk/Users/Philippe.Gautier/tutorial/index.html">Comparative Genomics Tutorial:</a></p><p><a href="http://www.scfbio-iitd.res.in/tutorial/genomics.html">GENOME TUTORIAL:</a></p><p><a href="http://genomebiology.com/content/pdf/gb-2001-3-1-reviews2001.pdf">Tools and resources for identifying protein families, domains and motifs</a></p><p><a href="http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/tools.shtml">Bioinformatics Tools</a><a href="http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/tools.shtml">&nbsp;<br />Tips, Tutorials, and Terminology for Using 
Selected Resources in Genome Database Guide:</a></p><p><a href="http://www.doe-mbi.ucla.edu/Reprints/R31%20Strong%20A%20Web-based%20Comparative%20Genomics%20tutorial%20Microbiology%20Eduction%202004.pdf">A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes:</a></p><p><a href="http://www.genome.gov/27530225">Free Online Tutorials Teach Anyone How to Use Genome Databases:</a></p><p><a href="http://mkweb.bcgsc.ca/circos/?tutorials">Circos to create concise, explanatory, unique and print-ready visualizations of your data:</a></p><p><a href="http://www.igd.cornell.edu/Comparative%20Genomics/Comparative%20Genomics%20Proj.html">Genomics and Comparative Genomics</a><a href="http://www.igd.cornell.edu/Comparative%20Genomics/Comparative%20Genomics%20Proj.html">&nbsp;Learning Module:</a></p><p><a href="http://psb.stanford.edu/psb10/conference-materials/tutorials/compgen-notes.pdf">Computational Challenges in Comparative Genomics</a></p><p><a href="http://psb.stanford.edu/psb10/conference-materials/tutorials/compgen-notes.pdf">A Tutorial:</a></p><p><a href="http://gramene.agrinome.org/tutorials/modules_tutorial.pdf">A Comparative Genomics Resource for Grains</a>:</p><p><a href="http://www.plantcell.org/cgi/content/full/21/12/3718">PLAZA: A Comparative Genomics Resource to Study Gene and Genome Evolution in Plants:</a></p><p><a href="http://en.wikipedia.org/wiki/VISTA_(comparative_genomics)">VISTA</a><a href="http://en.wikipedia.org/wiki/VISTA_(comparative_genomics)">:</a></p><p>Software for Genomics</p><ol>
<li><strong>Artemis</strong>&nbsp;Artemis is a free genome viewer and annotation tool that allows visualization of sequence features and the results of analyses within the context of the sequence, and its six-frame translation.</li>
<li><strong>Chromas&nbsp;</strong>Displays and prints chromatogram files from ABI automated DNA sequencers, and Staden SCF files which the analysis programs for ALF, Li-Cor and Visible Genetics OpenGene sequencers can create.</li>
<li><strong>Glimmer</strong>&nbsp;A system for finding genes in microbial DNA, especially the genomes of bacteria and archaea. Glimmer (Gene Locator and Interpolated Markov Modeler) uses interpolated Markov models (IMMs) to identify the coding regions and distinguish them from noncoding DN</li>
<li><strong>GlimmerHMM</strong>&nbsp;A fast and accurate gene finder based on a GHMM architecture, developed specifically for eukaryotes. It incorporates splice site models adapted from the GeneSplicer program and uses interpolated Markov models for evaluating the coding regions.</li>
<li><strong>GlimmerM</strong>&nbsp;A gene finder derived from Glimmer, but developed specifically for eukaryotes. It is based on a dynamic programming algorithm that considers all combinations of possible exons for inclusion in a gene model and chooses the best of these combinations. The d</li>
<li><strong>MUMmer</strong>&nbsp;MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form.</li>
<li><strong>pDRAW</strong>&nbsp;pDRAW32 is being developed as a free time hobby project. It is far from finished, but as it has reached a point where it could be helpful for many labs, it is now available to the scientific community.</li>
<li><strong>Sequin</strong>&nbsp;Sequin is a stand-alone software tool developed by the NCBI for submitting and updating entries to the GenBank, EMBL, or DDBJ sequence databases. It is capable of handling simple submissions that contain a single short mRNA sequence, and complex submissio</li>
<li><strong>Staden&nbsp;</strong>The Staden Package consists of a series of tools for DNA sequence preparation (pregap4), assembly (gap4), editing (gap4) and DNA/protein sequence analysis (spin).</li>
</ol><p>For more software @&nbsp;<a href="http://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools">http://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>

</channel>
</rss>