<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: All]]></title>
	<link>https://bioinformaticsonline.com/snippets?offset=110</link>
	<atom:link href="https://bioinformaticsonline.com/snippets?offset=110" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43558/perl-onliner-to-check-the-ids-in-two-files</guid>
	<pubDate>Thu, 21 Oct 2021 07:21:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43558/perl-onliner-to-check-the-ids-in-two-files</link>
	<title><![CDATA[Perl one-liner to check the IDs in two files !]]></title>
	<description><![CDATA[<code>perl -lane &#039;BEGIN{open(A,&quot;ids2.txt&quot;); while(&lt;A&gt;){chomp; $k{$_}++}} if (defined($k{$F[0]})) {print &quot;$_\t$F[0]\t1&quot;} else {print &quot;$_\tNA\t0&quot;}; &#039; ids1.txt &gt; aaa.xls</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43557/onliner-to-convert-multi-line-fasta-to-single-line-fasta</guid>
	<pubDate>Wed, 20 Oct 2021 05:00:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43557/onliner-to-convert-multi-line-fasta-to-single-line-fasta</link>
	<title><![CDATA[One-liner to convert multi-line FASTA to single-line FASTA !]]></title>
	<description><![CDATA[<code>#One-liner to convert multi-line FASTA to single-line FASTA
awk &#039;/^&gt;/ {printf(&quot;\n%s\n&quot;,$0);next; } { printf(&quot;%s&quot;,$0);}  END {printf(&quot;\n&quot;);}&#039; &lt; file.fa &gt; fileres.fa

#Then delete the first empty line
tail -n +2 fileres.fa &gt; fileout.fa</code>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43556/simulate-the-reads</guid>
	<pubDate>Wed, 20 Oct 2021 04:52:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43556/simulate-the-reads</link>
	<title><![CDATA[Simulate the reads !]]></title>
	<description><![CDATA[<code># make reference for randomreads.sh
# randomreads.sh part of BBTools/BBMap https://sourceforge.net/projects/bbmap/
/genetics/elbers/bbmap-38.86/randomreads.sh build=1 \
seed=1 \
ref=GCA_003401745.1_ASM340174v1_genomic.fna_upper.diploid.fasta.gz \
illuminanames=t addslash=t \
pacbio=t pbmin=0.13 pbmax=0.17 \
reads=100 paired=f \
gaussianlength=t \
minlength=1000 midlength=20000 maxlength=100000 \
out=/dev/null




# make 60x haploid coverage for Illumina reads
/genetics/elbers/bbmap-38.86/randomreads.sh build=1 \
ref=GCA_003401745.1_ASM340174v1_genomic.fna_upper.diploid.fasta.gz \
illuminanames=t addslash=t \
coverage=30 paired=t maxinsert=550 mininsert=450 \
out1=illumina1.fastq.gz out2=illumina2.fastq.gz &gt; random_reads_illumina.log 2&gt;&amp;1




# interleave the paired-end reads
# reformat.sh part of BBTools/BBMap https://sourceforge.net/projects/bbmap/
/genetics/elbers/bbmap-38.86/reformat.sh \
in=illumina1.fastq.gz in2=illumina2.fastq.gz out=illumina.int.fastq 2&gt;/dev/null




# use KmerGenie 1.7051 to estimate the k-mer size that produces the longest N50
# http://kmergenie.bx.psu.edu/
mkdir -p /genetics/elbers/test/fly2/kmergenie-illumina-raw-reads

cd /genetics/elbers/test/fly2/kmergenie-illumina-raw-reads
/genetics/elbers/kmergenie-1.7051/kmergenie ../illumina.int.fastq \
&gt; kmergenie-illumina-raw-reads.log 2&gt;&amp;1
rm ../illumina.int.fastq

k=`grep &quot;^best k:&quot; \
kmergenie-illumina-raw-reads.log | grep -Po &quot;\d+&quot;` 
echo &quot;best k=${k}&quot;




# make 30x haploid coverage for PacBio CLR reads
# error rate from 13 - 15 % minimum 1000bp midlength 20000bp maximum 30000bp
cd /genetics/elbers/test/fly2

/genetics/elbers/bbmap-38.86/randomreads.sh build=1 \
ow=t seed=1 \
ref=GCA_003401745.1_ASM340174v1_genomic.fna_upper.diploid.fasta.gz \
illuminanames=t addslash=t \
pacbio=t pbmin=0.13 pbmax=0.15 \
coverage=15 paired=f \
gaussianlength=t \
minlength=1000 midlength=20000 maxlength=30000 \
out=pacbio.fastq.gz &gt; random_reads_pacbio.log 2&gt;&amp;1



# make 30x haploid coverage for PacBio HiFi reads
# error rate from 0.1 - 1 % minimum 9000bp midlength 10000bp maximum 12000bp
/genetics/elbers/bbmap-38.86/randomreads.sh build=1 \
ow=t seed=1 \
ref=GCA_003401745.1_ASM340174v1_genomic.fna_upper.diploid.fasta.gz \
illuminanames=t addslash=t \
pacbio=t pbmin=0.001 pbmax=0.01 \
coverage=15 paired=f \
gaussianlength=t \
minlength=9000 midlength=10000 maxlength=12000 \
out=hifi.fastq.gz &gt; random_reads_pacbio_hifi.log 2&gt;&amp;1</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43438/downloading-mmseqs-databases</guid>
	<pubDate>Wed, 06 Oct 2021 06:25:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43438/downloading-mmseqs-databases</link>
	<title><![CDATA[Downloading mmseqs databases !]]></title>
	<description><![CDATA[<code># mmseqs databases
Usage: mmseqs databases &lt;name&gt; &lt;o:sequenceDB&gt; &lt;tmpDir&gt; [options]

  Name                	Type      	Taxonomy	Url
- UniRef100           	Aminoacid 	     yes	https://www.uniprot.org/help/uniref
- UniRef90            	Aminoacid 	     yes	https://www.uniprot.org/help/uniref
- UniRef50            	Aminoacid 	     yes	https://www.uniprot.org/help/uniref
- UniProtKB           	Aminoacid 	     yes	https://www.uniprot.org/help/uniprotkb
- UniProtKB/TrEMBL    	Aminoacid 	     yes	https://www.uniprot.org/help/uniprotkb
- UniProtKB/Swiss-Prot	Aminoacid 	     yes	https://uniprot.org
- NR                  	Aminoacid 	       -	https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA
- NT                  	Nucleotide	       -	https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA
- PDB                 	Aminoacid 	       -	https://www.rcsb.org
- PDB70               	Profile   	       -	https://github.com/soedinglab/hh-suite
- Pfam-A.full         	Profile   	       -	https://pfam.xfam.org
- Pfam-A.seed         	Profile   	       -	https://pfam.xfam.org
- Pfam-B              	Profile   	       -	https://xfam.wordpress.com/2020/06/30/a-new-pfam-b-is-released
- eggNOG              	Profile   	       -	http://eggnog5.embl.de
- dbCAN2              	Profile   	       -	http://bcb.unl.edu/dbCAN2
- Resfinder           	Nucleotide	       -	https://cge.cbs.dtu.dk/services/ResFinder
- Kalamari            	Nucleotide	     yes	https://github.com/lskatz/Kalamari

#For example, run the following to download and set up the Swiss-Prot database at the output path outpath/swissprot:

mmseqs databases UniProtKB/Swiss-Prot outpath/swissprot tmp

#In this case, since Swiss-Prot has the value yes in the Taxonomy column above, the databases command downloads and prepares all files needed to use it as a valid seqTaxDB.

#More information: https://github.com/soedinglab/mmseqs2/wiki#downloading-databases</code>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43435/install-hhsuite-using-conda</guid>
	<pubDate>Wed, 06 Oct 2021 05:06:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43435/install-hhsuite-using-conda</link>
	<title><![CDATA[Install hhsuite using conda !]]></title>
	<description><![CDATA[<code>(base) [abhi@hn1 bin]$ conda install -c conda-forge -c bioconda hhsuite
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/abhi/anaconda3

  added / updated specs:
    - hhsuite


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    _openmp_mutex-4.5          |            1_gnu          22 KB
    conda-4.10.3               |   py38h578d9bd_2         3.0 MB  conda-forge
    hhsuite-3.3.0              |py38pl526h6ed170a_1        26.6 MB  bioconda
    libgomp-9.3.0              |      h5101ec6_17         311 KB
    ------------------------------------------------------------
                                           Total:        30.0 MB

The following NEW packages will be INSTALLED:

  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-4.5-1_gnu
  hhsuite            bioconda/linux-64::hhsuite-3.3.0-py38pl526h6ed170a_1
  libgomp            pkgs/main/linux-64::libgomp-9.3.0-h5101ec6_17

The following packages will be UPDATED:

  conda                               4.10.3-py38h578d9bd_1 --&gt; 4.10.3-py38h578d9bd_2


Proceed ([y]/n)? y


Downloading and Extracting Packages
_openmp_mutex-4.5    | 22 KB     | ######################################################################################################## | 100%
conda-4.10.3         | 3.0 MB    | ######################################################################################################## | 100%
hhsuite-3.3.0        | 26.6 MB   | ######################################################################################################## | 100%
libgomp-9.3.0        | 311 KB    | ######################################################################################################## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done</code>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43429/download-desire-version-of-blast-software</guid>
	<pubDate>Wed, 06 Oct 2021 02:55:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43429/download-desire-version-of-blast-software</link>
	<title><![CDATA[Download a desired version of BLAST software !]]></title>
	<description><![CDATA[<code>#Create a directory, change into it, and download the desired release with wget
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.6.0/ncbi-blast-2.6.0+-x64-linux.tar.gz

#unpacking blast
tar -zxvf ncbi-blast-2.6.0+-x64-linux.tar.gz

#Slurm template

#!/bin/bash
#SBATCH --partition=longjobs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --time=1:00:00
#SBATCH --job-name=blast
#SBATCH -o result_%N_%j.out
#SBATCH -e result_%N_%j.err

export SBATCH_EXPORT=NONE
export OMP_NUM_THREADS=???

module load ncbi-blast/2.6.0_x86_64</code>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43426/command-line-to-download-blast-database-protein</guid>
	<pubDate>Tue, 05 Oct 2021 00:06:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43426/command-line-to-download-blast-database-protein</link>
	<title><![CDATA[Command line to download blast database / protein]]></title>
	<description><![CDATA[<code>#Download all volumes of the nr protein database and extract them together

#Database location - NCBI where all databases are available
ftp://ftp.ncbi.nlm.nih.gov/blast/db/
https://ftp.ncbi.nlm.nih.gov/blast/db/

# Database detail / description 
nr.*tar.gz | Non-redundant protein sequences from GenPept, Swissprot, PIR, PRF, PDB, and NCBI RefSeq

#First run this to download
wget &#039;ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.*.tar.gz&#039;

#Concatenate the archives and extract them in one pass (-i makes tar ignore the zero blocks between archives)
cat nr.*.tar.gz | tar -zxvi -f - -C .</code>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43413/get-the-linux-system-information</guid>
	<pubDate>Thu, 30 Sep 2021 06:37:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43413/get-the-linux-system-information</link>
	<title><![CDATA[Get the Linux system information !]]></title>
	<description><![CDATA[<code>#!/bin/bash

# while-menu-dialog: a menu driven system information program

DIALOG_CANCEL=1
DIALOG_ESC=255
HEIGHT=0
WIDTH=0

display_result() {
  dialog --title &quot;$1&quot; \
    --no-collapse \
    --msgbox &quot;$result&quot; 0 0
}

while true; do
  exec 3&gt;&amp;1
  selection=$(dialog \
    --backtitle &quot;System Information&quot; \
    --title &quot;Menu&quot; \
    --clear \
    --cancel-label &quot;Exit&quot; \
    --menu &quot;Please select:&quot; $HEIGHT $WIDTH 4 \
    &quot;1&quot; &quot;Display System Information&quot; \
    &quot;2&quot; &quot;Display Disk Space&quot; \
    &quot;3&quot; &quot;Display Home Space Utilization&quot; \
    2&gt;&amp;1 1&gt;&amp;3)
  exit_status=$?
  exec 3&gt;&amp;-
  case $exit_status in
    $DIALOG_CANCEL)
      clear
      echo &quot;Program terminated.&quot;
      exit
      ;;
    $DIALOG_ESC)
      clear
      echo &quot;Program aborted.&quot; &gt;&amp;2
      exit 1
      ;;
  esac
  case $selection in
    1 )
      result=$(echo &quot;Hostname: $HOSTNAME&quot;; uptime)
      display_result &quot;System Information&quot;
      ;;
    2 )
      result=$(df -h)
      display_result &quot;Disk Space&quot;
      ;;
    3 )
      if [[ $(id -u) -eq 0 ]]; then
        result=$(du -sh /home/* 2&gt; /dev/null)
        display_result &quot;Home Space Utilization (All Users)&quot;
      else
        result=$(du -sh &quot;$HOME&quot; 2&gt; /dev/null)
        display_result &quot;Home Space Utilization ($USER)&quot;
      fi
      ;;
  esac
done</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43412/bash-script-for-getopts</guid>
	<pubDate>Wed, 29 Sep 2021 04:53:14 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43412/bash-script-for-getopts</link>
	<title><![CDATA[Bash script for getopts]]></title>
	<description><![CDATA[<code>#A colon after an option letter means the option requires an argument (e.g., t: requires a value after -t, while h takes none).
while getopts &quot;ht:r:p:v&quot; OPTION
do
     case $OPTION in
         h)
             usage
             exit 1
             ;;
         t)
             TEST=$OPTARG
             ;;
         r)
             SERVER=$OPTARG
             ;;
         p)
             PASSWD=$OPTARG
             ;;
         v)
             VERBOSE=1
             ;;
         ?)
             usage
             exit
             ;;
     esac
done

if [[ -z $TEST ]] || [[ -z $SERVER ]] || [[ -z $PASSWD ]]
then
     usage
     exit 1
fi</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/43409/inreractive-scp-file-transfer</guid>
	<pubDate>Tue, 28 Sep 2021 08:14:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/43409/inreractive-scp-file-transfer</link>
	<title><![CDATA[Interactive SCP / File transfer !]]></title>
	<description><![CDATA[<code>#!/bin/bash
#next line prints the header of the script
echo &quot;Interactive Script to Copy File (files) / Directory using scp&quot;
#next line checks whether the entered value is empty and, if so, asks the user again for the Destination Server
while [ &quot;x$desthost&quot; = &quot;x&quot; ]; do
#next line prints what the user should enter and stores the entered value in the variable desthost
read -p &quot;Destination Server Name : &quot; desthost
#next line finishes the while loop
done
#next line checks whether the entered value is empty and, if so, asks the user again for the Destination Path
while [ &quot;x$destpath&quot; = &quot;x&quot; ]; do
#next line prints what the user should enter and stores the entered value in the variable destpath
read -p &quot;Destination Path : &quot; destpath
#next line finishes the while loop
done
#next line sets filename to a non-empty placeholder so that the loop below runs at least once
filename=&#039;null&#039;
#next line loops until the user enters an empty value, asking each time for the file(s) to copy
while ! [ x&quot;$filename&quot; = &quot;x&quot; ]; do
#next line prints what the user should enter and stores the entered value in the variable filename
read -p &quot;Path to source directory / file : &quot; filename
#next line checks whether the entered value is not empty and, if so, copies the file(s)
if ! [ x&quot;$filename&quot; = &quot;x&quot; ];
then
#next line prints a progress message
echo -n &quot;Copying $filename ... &quot;
#next line copies the entered file(s) or directory to the destination path on the destination server
scp -r &quot;$filename&quot; &quot;$desthost&quot;:&quot;$destpath&quot;
#end of if
fi
#next line finishes the while loop
done</code>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>