<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/42645?offset=90</link>
	<atom:link href="https://bioinformaticsonline.com/related/42645?offset=90" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43856/puffaligner-a-fast-efficient-and-accurate-aligner-based-on-the-pufferfish-index</guid>
	<pubDate>Thu, 21 Apr 2022 05:41:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43856/puffaligner-a-fast-efficient-and-accurate-aligner-based-on-the-pufferfish-index</link>
	<title><![CDATA[PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index]]></title>
	<description><![CDATA[<p><span>PuffAligner is a fast, accurate and versatile aligner built on top of the Pufferfish index. It produces highly sensitive alignments, similar to those of Bowtie2, but much more quickly. While exhibiting similar speed to the ultrafast STAR aligner, PuffAligner requires considerably less memory to construct its index and align reads. PuffAligner strikes a desirable balance with respect to the time, space and accuracy tradeoffs made by different alignment tools and provides a promising foundation on which to test new alignment ideas over large collections of sequences.</span></p><p>Address of the bookmark: <a href="https://github.com/COMBINE-lab/pufferfish/tree/cigar-strings" rel="nofollow">https://github.com/COMBINE-lab/pufferfish/tree/cigar-strings</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</guid>
	<pubDate>Sat, 20 Sep 2025 09:34:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</link>
	<title><![CDATA[HiTE: a fast and accurate dynamic boundary adjustment approach for full-length Transposable Elements detection and annotation in Genome Assemblies]]></title>
	<description><![CDATA[<p dir="auto"><code>HiTE</code>&nbsp;is Python software that uses a dynamic boundary adjustment approach to detect and annotate full-length Transposable Elements in Genome Assemblies. Compared with other tools, HiTE detects a greater number of full-length TEs.</p>
<div dir="auto">
<h2 dir="auto">panHiTE</h2>
</div>
<p dir="auto">We have developed panHiTE, a comprehensive and accurate pipeline for TE detection in large-scale population genomes. It has been successfully applied to hundreds of plant population genomes, demonstrating its effectiveness and scalability.</p>
<p dir="auto">For detailed instructions, please refer to the&nbsp;<a href="https://github.com/CSU-KangHu/HiTE/wiki/panHiTE-tutorial">panHiTE tutorial</a>.</p><p>Address of the bookmark: <a href="https://github.com/CSU-KangHu/HiTE" rel="nofollow">https://github.com/CSU-KangHu/HiTE</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</guid>
	<pubDate>Sun, 06 Apr 2014 23:56:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</link>
	<title><![CDATA[Find certain files/documents in Linux OS]]></title>
	<description><![CDATA[<p>As bioinformaticians, we routinely handle large datasets and can get lost among huge numbers of files and folders. Tracking down a missing file calls for a powerful search command. The Linux find command is one of the most important and frequently used commands on Linux systems. It searches for and lists files and directories that match the conditions you specify. You can search by permissions, user, group, file type, date, size and other criteria.<br /><br />In this article we share our day-to-day experience with the Linux find command in the form of examples. We show 35 of the most commonly used find commands, divided into five parts from basic to advanced usage.</p><p><strong>Part I &ndash; Basic Find Commands for Finding Files with Names</strong><br />1. Find Files Using Name in Current Directory<br /><br />Find all the files named gene.txt in the current working directory.<br /><br /># find . -name gene.txt<br /><br />./gene.txt<br /><br />2. Find Files Under Home Directory<br /><br />Find all the files named gene.txt under the /home directory.<br /><br /># find /home -name gene.txt<br /><br />/home/gene.txt<br /><br />3. Find Files Using Name and Ignoring Case<br /><br />Find all the files named gene.txt in the /home directory, regardless of upper or lower case.<br /><br /># find /home -iname gene.txt<br /><br />/home/gene.txt<br />/home/Gene.txt<br /><br />4. Find Directories Using Name<br /><br />Find all directories named Gene under the / directory.<br /><br /># find / -type d -name Gene<br /><br />/Gene<br /><br />5. Find fasta Files Using Name<br /><br />Find all regular files named gene.fasta in the current working directory.<br /><br /># find . -type f -name gene.fasta<br /><br />./gene.fasta<br /><br />6. 
Find all fasta Files in Directory<br /><br />Find all fasta files in a directory.<br /><br /># find . -type f -name "*.fasta"<br /><br />./gene.fasta<br />./cancer.fasta<br />./allgene.fasta<br /><br /><strong>Part II &ndash; Find Files Based on their Permissions</strong><br />7. Find Files With 777 Permissions<br /><br />Find all the files whose permissions are 777.<br /><br /># find . -type f -perm 0777 -print<br /><br />8. Find Files Without 777 Permissions<br /><br />Find all the files whose permissions are not 777.<br /><br /># find / -type f ! -perm 777<br /><br />9. Find SGID Files with 644 Permissions<br /><br />Find all the SGID bit files whose permissions are set to 644.<br /><br /># find / -perm 2644<br /><br />10. Find Sticky Bit Files with 551 Permissions<br /><br />Find all the files with the Sticky Bit set whose permissions are 551.<br /><br /># find / -perm 1551<br /><br />11. Find SUID Files<br /><br />Find all files with the SUID bit set.<br /><br /># find / -perm /u=s<br /><br />12. Find SGID Files<br /><br />Find all files with the SGID bit set.<br /><br /># find / -perm /g+s<br /><br />13. Find Read Only Files<br /><br />Find files whose owner has read permission. Note that /u=r matches any file with the owner read bit set, whether or not it is also writable.<br /><br /># find / -perm /u=r<br /><br />14. Find Executable Files<br /><br />Find all files that are executable by the owner, the group or others.<br /><br /># find / -perm /a=x<br /><br />15. Find Files with 777 Permissions and Chmod to 644<br /><br />Find all files with 777 permissions and use the chmod command to set their permissions to 644.<br /><br /># find / -type f -perm 0777 -print -exec chmod 644 {} \;<br /><br />16. Find Directories with 777 Permissions and Chmod to 755<br /><br />Find all directories with 777 permissions and use the chmod command to set their permissions to 755.<br /><br /># find / -type d -perm 777 -print -exec chmod 755 {} \;<br /><br />17. Find and remove single File<br /><br />To find a single file called gene.txt and remove it.<br /><br /># find . -type f -name "gene.txt" -exec rm -f {} \;<br /><br />18. 
Find and remove Multiple Files<br /><br />To find and remove multiple files, such as all .fa or all .gb files, use:<br /><br /># find . -type f -name "*.fa" -exec rm -f {} \;<br /><br />OR<br /><br /># find . -type f -name "*.gb" -exec rm -f {} \;<br /><br />19. Find all Empty Files<br /><br />To find all empty files under a certain path.<br /><br /># find /tmp -type f -empty<br /><br />20. Find all Empty Directories<br /><br />To find all empty directories under a certain path.<br /><br /># find /tmp -type d -empty<br /><br />21. Find all Hidden Files<br /><br />To find all hidden files, use the command below.<br /><br /># find /tmp -type f -name ".*"<br /><br /><strong>Part III &ndash; Search Files Based On Owners and Groups</strong><br />22. Find Single File Based on User<br /><br />To find all files called gene.txt under the / (root) directory that are owned by root.<br /><br /># find / -user root -name gene.txt<br /><br />23. Find all Files Based on User<br /><br />To find all files that belong to user rahul under the /home directory.<br /><br /># find /home -user rahul<br /><br />24. Find all Files Based on Group<br /><br />To find all files that belong to group developer under the /home directory.<br /><br /># find /home -group developer<br /><br />25. Find Particular Files of User<br /><br />To find all .txt files of user rahul under the /home directory.<br /><br /># find /home -user rahul -iname "*.txt"<br /><br /><strong>Part IV &ndash; Find Files and Directories Based on Date and Time</strong><br />26. Find Last 50 Days Modified Files<br /><br />To find all the files that were last modified 50 days ago.<br /><br /># find / -mtime 50<br /><br />27. Find Last 50 Days Accessed Files<br /><br />To find all the files that were last accessed 50 days ago.<br /><br /># find / -atime 50<br /><br />28. Find Last 50-100 Days Modified Files<br /><br />To find all the files that were modified more than 50 days ago and less than 100 days ago.<br /><br /># find / -mtime +50 -mtime -100<br /><br />29. 
Find Changed Files in Last 1 Hour<br /><br />To find all the files that were changed in the last hour.<br /><br /># find / -cmin -60<br /><br />30. Find Modified Files in Last 1 Hour<br /><br />To find all the files that were modified in the last hour.<br /><br /># find / -mmin -60<br /><br />31. Find Accessed Files in Last 1 Hour<br /><br />To find all the files that were accessed in the last hour.<br /><br /># find / -amin -60<br /><br /><strong>Part V &ndash; Find Files and Directories Based on Size</strong><br />32. Find 50MB Files<br /><br />To find all 50MB files, use:<br /><br /># find / -size 50M<br /><br />33. Find Size between 50MB &ndash; 100MB<br /><br />To find all the files that are larger than 50MB and smaller than 100MB.<br /><br /># find / -size +50M -size -100M<br /><br />34. Find and Delete 100MB Files<br /><br />To find all files larger than 100MB and delete them using one single command.<br /><br /># find / -size +100M -exec rm -rf {} \;<br /><br />35. Find Specific Files and Delete<br /><br />Find all .gb files larger than 10MB and delete them using one single command.<br /><br /># find / -type f -name "*.gb" -size +10M -exec rm {} \;</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40703/%CF%80-cyc-a-reference-free-snp-discovery-application-using-parallel-graph-search</guid>
	<pubDate>Tue, 28 Jan 2020 03:34:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40703/%CF%80-cyc-a-reference-free-snp-discovery-application-using-parallel-graph-search</link>
	<title><![CDATA[Π-cyc: A Reference-free SNP Discovery Application using Parallel Graph Search]]></title>
	<description><![CDATA[<p>Reference-free SNP search for comparative population genomics: multiple samples are run simultaneously. Note: experimental phase; compiles and runs only with OpenMPI-1.8.8 and the Intel compiler.</p>
<p><span>Cycle enumeration (also known as bubble detection) as part of de novo coloured de Bruijn graph assembly can be impractical for large, error-prone genomes, for which the assembly process produces an excessive number of false-positive cycles.&nbsp; Our solution is to search the graph in multicore shared-memory parallel mode using graph decomposition, then apply a filtering method to generate good-quality SNPs.</span></p>
<p><a href="https://arxiv.org/abs/1809.06700">https://arxiv.org/abs/1809.06700</a></p>
<p><a href="https://github.com/redayounsi/2KP2P">https://github.com/redayounsi/2KP2P</a></p>
<blockquote>
<p>/2kp2omp/bin/main_2kp2_K63_C2 -i fastq_files.txt -o fungus_bub.fasta -r stat_fungus.txt -c cov_fungus_hash.txt -k 63 -h 20 -b 100 -g 600 -l 100 -f 16 -t 5.0 -x 1 -v 0 -p 1 -y 1 -u 1</p>
</blockquote><p>Address of the bookmark: <a href="https://github.com/redayounsi/2KP2P" rel="nofollow">https://github.com/redayounsi/2KP2P</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35131/giggle-a-search-engine-for-large-scale-integrated-genome-analysis</guid>
	<pubDate>Wed, 10 Jan 2018 03:10:45 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35131/giggle-a-search-engine-for-large-scale-integrated-genome-analysis</link>
	<title><![CDATA[GIGGLE: a search engine for large-scale integrated genome analysis]]></title>
	<description><![CDATA[<p><span>GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (</span><a href="https://github.com/ryanlayer/giggle">https://github.com/ryanlayer/giggle</a><span>) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.</span></p>
<p>https://www.nature.com/articles/nmeth.4556</p><p>Address of the bookmark: <a href="https://github.com/ryanlayer/giggle" rel="nofollow">https://github.com/ryanlayer/giggle</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44301/carrot2-clustering-engine</guid>
	<pubDate>Fri, 07 Apr 2023 13:11:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44301/carrot2-clustering-engine</link>
	<title><![CDATA[Carrot2 clustering engine]]></title>
	<description><![CDATA[
<p>This is the demo application of the&nbsp;<a href="http://project.carrot2.org/" target="_blank">Carrot<sup>2</sup>&nbsp;clustering engine</a>. It uses Carrot<sup>2</sup>'s algorithms to organize search results into thematic folders.</p>
<h3>User interfaces</h3>
<ul>
<li><span><a href="https://search.carrot2.org/#/search/:source">Web Search Clustering</a></span>&nbsp;organizes search results from public search engines into clusters; offers treemap- and pie-chart visualizations of the clusters.</li>
<li><span><a href="https://search.carrot2.org/#/workbench">Clustering Workbench</a></span>&nbsp;clusters content from local files in JSON or Excel format, Solr or Elasticsearch; allows tuning of clustering parameters and exporting results as Excel or JSON.</li>
</ul>
<h3>Search engines</h3>
<ul>
<li><span>Web</span>:&nbsp;<span>web search results provided by&nbsp;<a href="https://etools.ch/" target="_blank">etools.ch</a>. Extensive use may require special arrangements with the&nbsp;<a href="mailto:sschmid@comcepta.com" target="_blank">owner</a>&nbsp;of the etools.ch service.</span></li>
<li><span>PubMed</span>:&nbsp;<span>abstracts of medical papers from the PubMed database provided by NCBI.</span></li>
<li><span>Local file</span>:&nbsp;<span>content read from a local file in Carrot2 XML, JSON, CSV or Excel format.</span></li>
<li><span>Solr</span>:&nbsp;<span>queries an Apache Solr instance.</span></li>
<li><span>Elasticsearch</span>:&nbsp;<span>queries an Elasticsearch instance.</span></li>
</ul>
<h3>Clustering algorithms</h3>
<ul>
<li><span>Lingo</span>:&nbsp;<span>creates well-described flat clusters. Does not scale beyond a few thousand search results. Available as part of the open source&nbsp;<a href="http://project.carrot2.org/" target="_blank">Carrot<sup>2</sup>&nbsp;framework</a>.</span></li>
<li><span>STC</span>:&nbsp;<span>the classic search results clustering algorithm. Produces flat clusters with adequate descriptions; very fast. Available as part of the open source&nbsp;<a href="http://project.carrot2.org/" target="_blank">Carrot<sup>2</sup>&nbsp;framework</a>.</span></li>
<li><span>k-means</span>:&nbsp;<span>baseline clustering algorithm; produces bag-of-words style cluster descriptions. Available as part of the open source&nbsp;<a href="http://project.carrot2.org/" target="_blank">Carrot<sup>2</sup>&nbsp;framework</a>.</span></li>
</ul><p>Address of the bookmark: <a href="https://search.carrot2.org/#/search/web" rel="nofollow">https://search.carrot2.org/#/search/web</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42619/metaeuk-sensitive-high-throughput-gene-discovery-and-annotation-for-large-scale-eukaryotic-metagenomics</guid>
	<pubDate>Wed, 13 Jan 2021 19:29:32 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42619/metaeuk-sensitive-high-throughput-gene-discovery-and-annotation-for-large-scale-eukaryotic-metagenomics</link>
	<title><![CDATA[MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics]]></title>
	<description><![CDATA[<p><span>MetaEuk is a modular toolkit designed for large-scale gene discovery and annotation in eukaryotic metagenomic contigs. MetaEuk combines the fast and sensitive homology search capabilities of&nbsp;</span><a href="https://github.com/soedinglab/MMseqs2">MMseqs2</a><span>&nbsp;with a dynamic programming procedure to recover optimal exon sets. It reduces redundancies in multiple discoveries of the same gene and resolves conflicting gene predictions on the same strand. MetaEuk is GPL-licensed open source software that is implemented in C++ and available for Linux and macOS. The software is designed to run on multiple cores.</span></p><p>Address of the bookmark: <a href="https://github.com/soedinglab/metaeuk" rel="nofollow">https://github.com/soedinglab/metaeuk</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</guid>
	<pubDate>Sun, 28 Dec 2014 12:51:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</link>
	<title><![CDATA[seqloc 0.6]]></title>
	<description><![CDATA[<p>The <code>Bio.SeqLoc</code> modules in <code>seqloc</code> are designed to represent positions and locations (ranges of positions) on sequences, particularly nucleotide sequences. My original motivation for writing these packages was handling the locations of genes in eukaryotic genomes.</p>
<p>Handle sequence locations for bioinformatics http://www.ingolia-lab.org/seqloc-tutorial.html</p><p>Address of the bookmark: <a href="http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6" rel="nofollow">http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6</a></p>]]></description>
	<dc:creator>Gudiya Pal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27331/andi</guid>
	<pubDate>Fri, 13 May 2016 05:16:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27331/andi</link>
	<title><![CDATA[Andi]]></title>
	<description><![CDATA[<p>This is the <code>andi</code> program for estimating the evolutionary distance between closely related genomes. These distances can be used to rapidly infer phylogenies for big sets of genomes. Because <code>andi</code> does not compute full alignments, it is so efficient that it scales even up to thousands of bacterial genomes.</p>
<p>This readme covers all necessary instructions for the impatient to get <code>andi</code> up and running. For extensive instructions please consult the <a href="https://github.com/EvolBioInf/andi/blob/master/andi-manual.pdf">manual</a>.</p>
<p>More at https://github.com/evolbioinf/andi/</p><p>Address of the bookmark: <a href="http://bioinformatics.oxfordjournals.org/content/early/2015/01/13/bioinformatics.btu815.full" rel="nofollow">http://bioinformatics.oxfordjournals.org/content/early/2015/01/13/bioinformatics.btu815.full</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27971/samtools-primer</guid>
	<pubDate>Thu, 23 Jun 2016 07:18:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27971/samtools-primer</link>
	<title><![CDATA[Samtools Primer !!]]></title>
	<description><![CDATA[<p>SAMtools: Primer / Tutorial by Ethan Cerami, Ph.D.<br><br>keywords: samtools, next-gen, next-generation, sequencing, bowtie, sam, bam, primer, tutorial, how-to, introduction<br>Revisions<br><br>&nbsp;&nbsp;&nbsp; 1.0: May 30, 2013: First public release on biobits.org.<br>&nbsp;&nbsp;&nbsp; 1.1: July 24, 2013: Updated with Disqus Comments / Feedback section.<br>&nbsp;&nbsp;&nbsp; 1.2: December 19, 2014: Multiple updates, including:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Updated to use samtools 1.1 and bcftools 1.2.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Updated usage for bcftools.<br><br>About<br><br>SAMtools is a popular open-source tool used in next-generation sequence analysis. This primer provides an introduction to SAMtools, and is geared towards those new to next-generation sequence analysis. The primer is also designed to be self-contained and hands-on, meaning that you only need to install SAMtools, and no other tools, and sample data sets are provided. Terms in bold are also explained in the glossary at the end of the document.</p><p>Address of the bookmark: <a href="http://biobits.org/samtools_primer.html" rel="nofollow">http://biobits.org/samtools_primer.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>