<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: All site blogs]]></title>
	<link>https://bioinformaticsonline.com/blog/all?offset=70</link>
	<atom:link href="https://bioinformaticsonline.com/blog/all?offset=70" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43911/slurm-commands</guid>
	<pubDate>Wed, 06 Jul 2022 07:40:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43911/slurm-commands</link>
	<title><![CDATA[SLURM Commands]]></title>
	<description><![CDATA[<h3>SLURM commands</h3><p>The following table shows SLURM commands on the SOE cluster.</p><table border="1">
<thead>
<tr><th>Command</th><th>Description</th></tr>
</thead>
<tbody>
<tr>
<td><strong>sbatch</strong></td>
<td>Submit batch scripts to the cluster</td>
</tr>
<tr>
<td><strong>scancel</strong></td>
<td>Cancel (or otherwise signal) jobs or job steps that are under the control of SLURM.</td>
</tr>
<tr>
<td><strong>sinfo</strong></td>
<td>View information about SLURM nodes and partitions.</td>
</tr>
<tr>
<td><strong>squeue</strong></td>
<td>View information about jobs located in the SLURM scheduling queue</td>
</tr>
<tr>
<td><strong>smap</strong></td>
<td>Graphically view information about SLURM jobs, partitions, and set configuration parameters.</td>
</tr>
<tr>
<td><strong>sqlog</strong></td>
<td>View information about running and finished jobs</td>
</tr>
<tr>
<td><strong>sacct</strong></td>
<td>View resource accounting information for finished and running jobs</td>
</tr>
<tr>
<td><strong>sstat</strong></td>
<td>View resource accounting information for running jobs</td>
</tr>
</tbody>
</table><p><span>For more information, run&nbsp;</span><strong>man</strong><span>&nbsp;on the commands above. See some examples below.</span><br /><br /><span style="font-size: large;"><strong>1. Info about the partitions and nodes</strong></span><span></span><br /><span>List all the partitions available to you and the nodes therein:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sinfo
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Nodes in state&nbsp;</span><tt>idle</tt><span>&nbsp;can accept new jobs.</span><br /><br /><span>Show a partition configuration, for example,&nbsp;</span><tt>SOE_main</tt><span>:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scontrol show partition=SOE_main
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Show current info about a specific node:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scontrol show node=&lt;nodename&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>You can also specify a group of nodes in the command above. For example, if your MPI job is running across soenode05,06,35,36, you can execute the command below to get the info on the nodes you are interested in:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scontrol show node=soenode[05-06,35-36]
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>A useful field in the output is CPULoad, which shows how heavily your application is using the CPUs on those nodes.</span><br /><br /><span style="font-size: large;"><strong>2. Submit scripts</strong></span><span></span><br /><span>The header of a submit script specifies the job name, partition (queue), time limit, memory allocation, number of nodes, number of cores, and the files that collect standard output and standard error at run time, for example:</span></p><div><table border="1">
<tbody>
<tr>
<td>
<pre>#!/bin/bash

#SBATCH --job-name=OMP_run     # job name, "OMP_run"
#SBATCH --partition=SOE_main   # partition (queue)
#SBATCH -t 0-2:00              # time limit: (D-HH:MM) 
#SBATCH --mem=32000            # memory per node in MB 
#SBATCH --nodes=1              # number of nodes
#SBATCH --ntasks-per-node=16   # tasks per node (one core each by default)
#SBATCH --output=slurm.out     # file to collect standard output
#SBATCH --error=slurm.err      # file to collect standard errors
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>If the submit script does not specify a time limit, SLURM assigns the default run time of 3 days, so the job is terminated after 72 hours. The maximum allowed run time is two weeks,&nbsp;</span><tt>14-0:00</tt><span>.</span><br /><span>If no memory limit is requested, SLURM assigns the default of 16 GB. The maximum allowed memory per node is 128 GB. To see how much RAM per node your job is using, run&nbsp;</span><tt>sacct</tt><span>&nbsp;or&nbsp;</span><tt>sstat</tt><span>&nbsp;to query MaxRSS for the job on the node; see the examples below.</span><br /><span>Depending on the type of application you need to run, the submit script may contain commands that create temporary space on a compute node;&nbsp;</span><a href="http://ecs.rutgers.edu/file_systems.html">see the discussion about using the file systems on the cluster.</a><span></span><br /><span>The script then sets the environment specific to the application and starts the application on one or more nodes; see the sbatch sample scripts in directory&nbsp;</span><tt>/usr/local/Samples</tt><span>&nbsp;on soemaster1.hpc.rutgers.edu.</span><br /><span>You can submit your job to the cluster with the&nbsp;</span><tt>sbatch</tt><span>&nbsp;command:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sbatch myscript.sh
</pre>
</td>
</tr>
</tbody>
</table></div><p><br /><span style="font-size: large;"><strong>3. Query job information</strong></span><span></span><br /><span>List all currently submitted jobs in running and pending states for a user:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>squeue -u &lt;username&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Command&nbsp;</span><tt>squeue</tt><span>&nbsp;can be run with format options to expose specific information, for example, when pending job #706 is scheduled to start running:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>squeue -j 706 --format="%S"
</pre>
</td>
</tr>
</tbody>
</table></div><div><table border="1">
<tbody>
<tr>
<td>
<pre>START_TIME
2015-04-30T09:54:32
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Adding more format options shows more information, for example:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>squeue -j 706 --format="%i %P %j %u %T %l %C %S"
</pre>
</td>
</tr>
</tbody>
</table></div><div><table border="1">
<tbody>
<tr>
<td>
<pre>JOBID PARTITION NAME      USER STATE   TIMELIMIT  CPUS START_TIME
706   SOE_main  Par_job_3 mike PENDING 3-00:00:00 64   2015-04-30T09:54:32
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To see when all jobs pending in the queue are scheduled to start:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>squeue --start 
</pre>
</td>
</tr>
</tbody>
</table></div><p><br /><span>List all running and completed jobs for a user:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sqlog -u &lt;username&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>or</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sqlog -j &lt;JobID&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>The following abbreviations are used for the job states:</span></p><pre>       CA   CANCELLED      Job was cancelled.

       CD   COMPLETED      Job completed normally.

       CG   COMPLETING     Job is in the process of completing.

       F    FAILED         Job terminated abnormally.

       NF   NODE_FAIL      Job terminated due to node failure.

       PD   PENDING        Job is pending allocation.

       R    RUNNING        Job currently has an allocation.

       S    SUSPENDED      Job is suspended.

       TO   TIMEOUT        Job terminated upon reaching its time limit.
</pre><p><span>You can specify the fields you would like to see in the output of&nbsp;</span><tt>sqlog</tt><span>:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sqlog --format=list
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>The command below, for example, provides Job ID, user name, exit state, start date-time, and end date-time for job #2831:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sqlog -j 2831 --format=jid,user,state,start,end
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>List status info for a currently running job:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sstat -j &lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Formatted output can be used to show only specific information, for example, the maximum resident memory (MaxRSS) used on a node:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sstat --format="JobID,MaxRSS" -j &lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To get statistics on completed jobs by jobID:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sacct --format="JobID,JobName,MaxRSS,Elapsed" -j &lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To view the same information for all jobs of a user:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sacct --format="JobID,JobName,MaxRSS,Elapsed" -u &lt;username&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To print a list of fields that can be specified with the&nbsp;</span><tt>--format</tt><span>&nbsp;option:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sacct --helpformat
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>For example, to get Job ID, Job name, Exit state, start date-time, and end date-time for job #2831:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sacct -j 2831 --format="JobID,JobName,State,Start,End"
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>Another useful command to gain information about a running job is&nbsp;</span><tt>scontrol</tt><span>:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scontrol show job=&lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><br /><span style="font-size: large;"><strong>4. Cancel a job</strong></span><span></span><br /><span>To cancel one job:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scancel &lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To cancel one job and delete the TMP directory created by the submit script on a node:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>sdel &lt;jobid&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To cancel all the jobs for a user:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scancel -u &lt;username&gt;
</pre>
</td>
</tr>
</tbody>
</table></div><p><span>To cancel one or more jobs by name:</span></p><div><table border="0" style="background-color: #D0D0D0;">
<tbody>
<tr>
<td>
<pre>scancel --name &lt;myJobName&gt;
</pre>
</td>
</tr>
</tbody>
</table></div>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43900/finding-a-mimicry-game-for-teaching-on-line-and-mentioned-general-resources</guid>
	<pubDate>Tue, 28 Jun 2022 07:32:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43900/finding-a-mimicry-game-for-teaching-on-line-and-mentioned-general-resources</link>
	<title><![CDATA[Finding a mimicry game for teaching on-line and mentioned general resources]]></title>
	<description><![CDATA[<pre>Mimicry and other resources
Mimicry games:
Great Heliconius game:
http://heliconius.org/evolving_butterflies/
(See also 
https://royalsocietypublishing.org/doi/10.1098/rspb.2020.0014)
Another one, a bit less friendly:
https://ccl.northwestern.edu/netlogo/models/Mimicry
Camouflage practical
https://alexis-catherine.github.io/publication/natural-selection-and-camouflage/
(NetLogo also has one: 
https://ccl.northwestern.edu/netlogo/models/BugHuntCamouflage)
Peppered moth game:
https://askabiologist.asu.edu/peppered-moths-game/play.html

General resources
The always popular Populus:
https://cbs.umn.edu/populus/overview
Drift &amp; Gene Flow 
https://cartwrig.ht/apps/genie/
(Cock van Oosterhout has a great ppt to lead students through this)
See also https://cartwrig.ht/apps/redlynx/
https://demonstrations.wolfram.com/ReplicatorMutatorDynamicsWithThreeStrategies/
NetLogo:
http://ccl.northwestern.edu/netlogo/models/index.cgi
Population Genetics:
https://www.radford.edu/~rsheehy/Gen_flash/popgen/
Evolution in general
https://evolution.berkeley.edu/evolibrary/home.php
Mitochondrial Eve:
https://projects.ncsu.edu/cals/gn/ex/mit-eve.html
Y chromosomes:
https://projects.ncsu.edu/cals/gn/ex/y-chrom.html
A professional online package from Michael Kasumovic:
https://arludo.com/
A compilation of resources:
https://planted.botany.org/index.php?P=Home
Finally, Donald Forsdyke has some great on-line videos explaining
evolutionary principles (occasionally in a fake Scottish accent):
http://post.queensu.ca/~forsdyke/videolectures.htm
</pre><p>&nbsp;</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43898/online-resources-on-must-read-papers-in-evolutionary-biology-for-a-literature-club</guid>
	<pubDate>Tue, 28 Jun 2022 07:29:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43898/online-resources-on-must-read-papers-in-evolutionary-biology-for-a-literature-club</link>
	<title><![CDATA[Online resources on must-read papers in evolutionary biology, for a literature club]]></title>
	<description><![CDATA[<pre>1.       *Nick Barton:*

- The textbook "Evolution" by Nick Barton, with resources for
  exploring the literature: Barton, N. H., Briggs, D. E. G., Eisen, J.
  A., Goldstein, D. B., &amp; Patel, N. H. (2007). Evolution. Cold Spring
  Harbor Laboratory Press.

- Papers from a course named "Classics in Evolutionary Biology":

Evolutionary Synthesis
1. Haldane, J. B. S. 1932. The causes of evolution. Longmans. New York.
   (esp. Ch. IV).
2. Fisher, R. A. 1930. The genetical theory of natural selection. Oxford
   University Press, Oxford. Selected Sections - Fundamental Theorem.

Genetic Variation
1a. Lewontin, R. C., and J. L. Hubby. 1966. A molecular approach to
the study of genic heterozygosity in natural populations. II. Amount
of variation and degree of heterozygosity in natural populations of
Drosophila pseudoobscura. Genetics. 54:595-609.

1b. Sachidanandam et al. 2001. A map of human genome sequence variation
containing 1.42 million single nucleotide polymorphisms. Nature 409: 928-933.

2. Wright S., Dobzhansky T., Hovanitz W. 1942 Genetics of natural
populations VII The allelism of lethals in the third chromosome of
Drosophila pseudoobscura. Genetics 27: 363-394.

Recombination and evolution
1. Hill, W. G., and A. Robertson. 1966. The effect of linkage on limits
to artificial selection. Genet. Res. 8:269-294.

2. Maynard Smith and Haigh. 1974. The hitch-hiking effect of a favourable
gene. Genet. Res. 23: 23-35.

Understanding sequence variation
1. Begun D. J., Aquadro C. F., 1992 Levels of naturally occurring DNA
polymorphism correlate with recombination rate in Drosophila melanogaster.
Nature 356: 519-520.

2. Green R. E., Reich D., P&auml;&auml;bo S., 2010 A draft sequence of the
Neandertal genome. Science 328: 710-722.

Quantitative Genetics:  variation in complex traits
1. Galton F., 1877 Typical laws of heredity. Nature 15: 492-495,
512-514, 532-533.

2. Turelli M., 1984 Heritable genetic variation via
mutation-selection balance: Lerch's Zeta meets the abdominal
bristle. Theor. Popul. Biol. 25: 138-193.

Quantitative Genetics:  finding the genes
1. Shrimpton A. E., Robertson A., 1988 The Isolation of polygenic factors
controlling bristle score in Drosophila melanogaster II Distribution of
third chromosome bristle effects within chromosome sections. Genetics
118: 445-459.

2. Boyle E. A., Li Y. I., Pritchard J. K., 2017 An expanded view of
complex traits: from polygenic to omnigenic. Cell 169: 1177-1186.

Neutral Evolution
1. Kimura, M. 1968. Evolutionary rate at the molecular level. Nature.
217:624-626.

2a. Kern A. D., Hahn M. W., 2018 The Neutral Theory in Light of Natural
Selection. Molecular Biology and Evolution 35: 1366-1371.

2b. Jensen J. D., Payseur B. A., Stephan W., Aquadro C. F., Lynch M.,
Charlesworth D., Charlesworth B., 2019 The importance of the Neutral Theory
in 1968 and 50 years on: a response to Kern and Hahn 2018. Evolution 73:
111-114.

2c. Ellegren &amp; Galtier. 2016. Determinants of genetic diversity. Nature
Reviews Genetics.

Mutation and Genetic Variability
1. Luria, S. E., and M. Delbr&uuml;ck. 1943. Mutations of Bacteria from Virus
Sensitivity to Virus Resistance. Genetics. 28(6):491-511.

2. Hill, W G. 1982. "Rates of Change in Quantitative Traits From Fixation
of New Mutations." Proceedings of the National Academy of Sciences (U.S.A.)
79: 142-45.

Testing for selection
1. McDonald &amp; Kreitman. 1991. Adaptive protein evolution at the Adh locus
in Drosophila. Nature.

2. Begun, et al. Mol. Biol. Evol. 16, 1816-1819 (1999).

3. Siddiq et al. 2016. Experimental test and refutation of a classic case
of molecular adaptation in Drosophila melanogaster.  Nature Ecology &amp;
Evolution.

The shifting balance
1. Wright, S. 1932. The roles of mutation, inbreeding, crossbreeding and
selection in evolution. Proceedings of the VI International Congress of
Genetics: 1. pp 356-366.

2. Coyne, J.A., N.H. Barton, and M. Turelli. 1997. A critique of Wright's
shifting balance theory of evolution.  Evolution 51: 643-671.

3. Barton. 2016. Sewall Wright on Evolution in Mendelian Populations and
the "Shifting Balance". Genetics.

Evolution of Sex
1.  Muller, H.J. 1964. The relation of recombination to mutational advance.
Mutation Res. 1(1):2-9

2. McDonald et al. 2016. Sex speeds adaptation by altering the dynamics of
molecular evolution. Nature.

Kin Selection, Cooperation, and Conflict
1. Hamilton, W. D. 1964. The genetical evolution of social behaviour I.
Journal of Theoretical Biology. 7:1-52.

2. Trivers, R. L. 1974 Parent-offspring conflict. American Zoologist.
14(1):249-264.

Sexual Selection
1. Zahavi, A. 1975. Mate selection - a selection for a handicap. J. Theor.
Biol. 53:205-214.

2. Kirkpatrick, M., and Ryan, M.J. 1991. The evolution of mating
preferences and the paradox of the lek. Nature. 350:33-38.

Fitness Landscapes
1. Dean, A. 1995. A Molecular Investigation of Genotype by Environment
Interactions. Genetics. 139:19-33.

2. Costanzo et al. 2010. The Genetic Landscape of a Cell. Science.

Speciation
1. Coyne, J. A., and H. A. Orr. 1989. Patterns of speciation in Drosophila.
Evolution. 43:362-381.

2. Corbett-Detig et al. 2013. Genetic incompatibilities are widespread
within species. Nature.

2.       *Marcos Antezana:*

Van Valen, L. 1975. Energy and Evolution. University of Chicago, Department
of Biology.

3.       *Remco Folkertsma:*

1. The work by Hopi Hoekstra on local adaptation and oldfield mice

2. Poelstra, J. W., Vijay, N., Bossu, C. M., Lantz, H., Ryll, B., M&uuml;ller,
I., ... &amp; Wolf, J. B. (2014). The genomic landscape underlying phenotypic
integrity in the face of gene flow in crows. Science, 344(6190), 1410-1414.

4.       *Joshka Kaufmann and Leslie Turner*

They offer us a link to 'papers every evolutionary biologist should read',
a list of papers collected by Leslie Turner.
https://static1.squarespace.com/static/53e8cb7ce4b02c4bc3aeeee4/t/5ab8fcb670a6ad55c67fcdf4/1522072758665/EvoBioClassicsRefList.pdf

5.       *Sarah Stockwell*

Matt Ridley collected classic papers in evolutionary biology and reprinted
parts of these papers in his book Evolution (see Matt Ridley. Evolution
(Oxford University Press, 2nd edition, 2004))</pre>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43896/list-of-comparative-genomics-resources</guid>
	<pubDate>Tue, 28 Jun 2022 04:08:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43896/list-of-comparative-genomics-resources</link>
	<title><![CDATA[List of comparative genomics resources !]]></title>
	<description><![CDATA[<div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1096638041"><span>3D-GENOMICS -- A Database to Compare Structural and Functional Annotations of Proteins between Sequenced Genomes</span></a></div><p>Compare structural and functional annotations of proteins between sequenced genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1100640374"><span>ARED Organism -- expansion of ARED reveals AU-rich element cluster variations between human and mouse</span></a></div><p>View AREs in the human transcriptome and study the comparative genomics of AREs in model organisms.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1234973128"><span>ATGC -- Alignable Tight Genomic Clusters Database</span></a></div><p>Find information about orthologous genes in prokaryotes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174596104"><span>AnimalQTLdb -- a livestock QTL database tool set for positional QTL information mining and beyond</span></a></div><p>Search for publicly available QTL data on livestock animal species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL20110518150135"><span>BGDB -- Bovine Genome Database</span></a></div><p>Find information about bovine genomics data.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1229012662"><span>COMPARE -- a multi-organism system for cross-species data comparison and transfer of information</span></a></div><p>A multi-organism web-based resource system designed to easily retrieve, correlate and interpret data across species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1218141952"><span>CONDOR -- COnserved Non-coDing Orthologous Regions</span></a></div><p>A database resource of developmentally associated conserved non-coding elements.</p></div><div><div><a 
href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1099057221"><span>CORG -- A database for COmparative Regulatory Genomics</span></a></div><p>Delineate conserved non-coding blocks from upstream regions of putative orthologous gene pairs from human, mouse (Mus musculus), rat, fugu, and zebrafish (Danio rerio).</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1203608896"><span>COXPRESdb -- a database of coexpressed gene networks in mammals</span></a></div><p>Find coexpressed gene lists and networks in human and mouse.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1097763045"><span>CVTree -- A Phylogenetic Tree Reconstruction Tool Based on Whole Genomes</span></a></div><p>Construct phylogenetic trees of microorganisms based on the oligopeptide content of their complete proteomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1232729680"><span>CleanEST -- the cleansed EST libraries database</span></a></div><p>A novel database server that classifies GenBank's dbEST (database of expressed gene sequences) libraries and removes contaminants.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1256926144"><span>CoCoa -- COefficient of COAncestry software</span></a></div><p>Find information about the ancestral relationship between genes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1227549154"><span>CoGemiR -- a comparative genomics microRNA database</span></a></div><p>Provides an overview of the genomic organization of microRNAs and extent of conservation during evolution in different metazoan species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1117678221"><span>Comparative Genometrics (CG) -- a database dedicated to biometric comparisons of whole genomes</span></a></div><p>Conduct comparative biometric analysis of chromosomes of different organisms.</p></div><div><div><a 
href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1151007916"><span>DoTS -- Database Of Transcribed Sequences</span></a></div><p>Search for Indices of gene and transcripts in human and mouse.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174510065"><span>DroSpeGe -- rapid access database for new Drosophila species genomes</span></a></div><p>Search and compare 12 new and old Drosophila genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1098208414"><span>ECR Browser -- A Tool for Visualizing and Accessing Data from Comparisons of Multiple Vertebrate Genomes</span></a></div><p>Access to whole genome alignments of human, mouse, rat and fish sequences.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209738459"><span>EPGD -- Eukaryotic Paralog Group Database</span></a></div><p>Find eukaryotic paralog/paralogon information.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1232726869"><span>EVOG -- evolutionary visualizer for overlapping genes</span></a></div><p>Analyze the evolutionary process of overlapping genes when comparing different species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1227633714"><span>GNAT -- Inter-species gene mention normalization (ISGN)</span></a></div><p>The first publicly available system reported to handle inter-species gene mention normalization.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1229438992"><span>GenColors -- annotation and comparative genomics of prokaryotes made easy</span></a></div><p>A web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1151086258"><span>GeneNest gene indices</span></a></div><p>Visualize gene indices of human, mouse, Arabidopsis, Zebrafish, Drosophila and Sheep.</p></div><div><div><a 
href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174489378"><span>GenomeTrafac -- a whole genome resource for the detection of transcription factor binding site clusters associated with conventional and microRNA encoding genes conserved between mouse and human gene orthologs</span></a></div><p>Use comparative genomics approach to characterize gene models and identify putative cis-regulatory regions of RefSeq Gene Orthologs.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL20110518150753"><span>IKMC -- International Knockout Mouse Consortium web portal</span></a></div><p>Find information about mutated mouse genes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209411604"><span>IMG/M -- Integrated Microbial Genomes/Metagenomes</span></a></div><p>A data management and analysis system for metagenomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1234976694"><span>ISED -- Influenza sequence and epitope database.</span></a></div><p>Search for influenza sequence, vaccine, and drug resistance information.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL20140710115515"><span>LAMHDI: The Search for Animal Models Starts Here</span></a></div><p>LAMHDI, the initiative to Link Animal Models to Human DIsease, is designed to accelerate the research process by providing biomedical researchers with a simple, comprehensive Web-based resource to find the best animal models for their research.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1228843803"><span>MANTIS -- a phylogenetic framework for multi-species genome comparisons</span></a></div><p>The missing link between multi-species full genome comparisons and functional analysis.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1099578148"><span>MBGD -- Microbial genome database for comparative analysis</span></a></div><p>Conduct comparative analysis 
of completely sequenced microbial genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1221077729"><span>MEGA -- Molecular Evolutionary Genetics Analysis</span></a></div><p>A biologist-centric software for evolutionary analysis of DNA and protein sequences.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174596756"><span>MamPol -- a database of nucleotide polymorphism in the Mammalia class</span></a></div><p>Conduct single nucleotide polymorphisms diversity measurements among homologous sequences from the Mammalia class.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1266437314"><span>MicrobesOnline -- Prokaryotic Genome Database</span></a></div><p>Find information about 1000s of microbial genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1208461006"><span>Narcisse -- a mirror view of conserved syntenies</span></a></div><p>A database dedicated to the study of genome conservation.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1219772764"><span>OMA -- the Orthologous MAtrix project</span></a></div><p>Explore orthologous relations across 352 complete genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209738741"><span>OPTIC -- orthologous and paralogous transcripts in clades</span></a></div><p>Browse complete genomes in several clades.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209573208"><span>OrthoDB -- the hierarchical catalog of eukaryotic orthologs</span></a></div><p>Find groups of orthologous genes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1221231200"><span>OrthoMaM -- orthologous mammalian markers</span></a></div><p>A database of orthologous genomic markers for placental mammal phylogenetics.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1100009979"><span>PEDANT -- 
Protein Extraction, Description and ANalysis Tool</span></a></div><p>Conduct genome wide functional and structural analysis.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174489475"><span>PReMod -- a database of genome-wide mammalian cis-regulatory module predictions</span></a></div><p>Conduct genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1151083092"><span>PhenomicDB -- Comparison of phenotypes of orthologous genes in human and model organisms</span></a></div><p>Compare phenotypes of a given gene or gene set in different model organisms.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1190899370"><span>Phylemon -- A suite of web tools for molecular evolution, phylogenetics and phylogenomics</span></a></div><p>Phylemon is a web server that integrates a selected suite of more than 20 different tools from the most popular stand-alone programs of phylogenetic and evolutionary analysis.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1232555615"><span>PhyloPat -- the phylogenetic pattern database</span></a></div><p>Use this database to see where in the evolution some phylogenetic lineages were started, and over which species they were contained.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174510223"><span>Pristionchus.org -- a genome-centric database of the nematode satellite species Pristionchus pacificus</span></a></div><p>Search for genomic information on nematode satellite species Pristionchus pacificus.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1236367352"><span>ProtClustDB -- NCBI Protein Clusters Database</span></a></div><p>Find information about related protein sequences.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209410278"><span>ProtozoaDB -- 
database of protozoan genomes</span></a></div><p>Database hosting genomics and post-genomics data from multiple protozoans.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1232554690"><span>Pseudofam -- the pseudogene families database</span></a></div><p>A database of pseudogene families based on the protein families from the Pfam database.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL20110518151439"><span>RIDM - RIKEN Integrated Database of Mammals</span></a></div><p>Find genomic information about mammals.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1272562567"><span>RegPrecise -- Regulon Prediction Database</span></a></div><p>Find information about predicted regulons in prokaryotic transcription regulation.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1272477473"><span>SALAD -- Surveyed contained motif ALignment diagram and the Associating Dendrogram</span></a></div><p>Perform systematic comparison of proteome data among species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1229010765"><span>SGN -- SOL Genomics Network</span></a></div><p>A comparative map viewer dedicated to the biology of the Solanaceae family.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1256669040"><span>ShotgunFunctionalizeR -- R-package for functional comparison of metagenomes</span></a></div><p>Analyze data from functional analysis on fragmented microbial genetic material.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1256238439"><span>SnoopCGH -- Comparative Genomic Hybridization software</span></a></div><p>Visualize and explore comparative genomic hybridization data sets.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1174489598"><span>SwissRegulon -- a database of genome-wide annotations of regulatory sites</span></a></div><p>Search for 
genome-wide annotations of regulatory sites in yeast and prokaryotes genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1229013521"><span>TaxonGap -- a visualization tool for intra- and inter-species variation among individual biomarkers</span></a></div><p>Compare and select individual biomarkers.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1106063477"><span>The Adaptive Evolution Database (TAED) -- a phylogeny based tool for comparative genomics</span></a></div><p>Search for information on adaptive evolution in gene families of higher plants and chordate.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1216742716"><span>The CGView Server -- a comparative genomics tool for circular genomes</span></a></div><p>Generate graphical maps of circular genomes that show sequence features, base composition plots, analysis results and sequence similarity plots.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1099663588"><span>The ERGO -- Genome analysis and discovery system</span></a></div><p>Conduct a comprehensive analysis of genes and genomes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1177611772"><span>The Macaque Genome: Interactive Poster and Teaching Resource</span></a></div><p>An interactive online poster presentation on the Macaque genome, including high-quality images, video clips, and Web resources</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1103816940"><span>The TIGR Gene Indices -- clustering and assembling EST and known genes and integration with eukaryotic genomes</span></a></div><p>Search for annotated genetic information of expressed sequence tags (ESTs) in different eukaryotic organisms.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1043767169"><span>UniGene</span></a></div><p>Find mapping and expression information for a unigene cluster 
(ESTs and full-length mRNA sequences organized into clusters that each represent a unique known or putative gene)</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1216738072"><span>Uprobe -- universal overgo hybridization-based probe retrieval and design</span></a></div><p>A public online resource for identifying or designing 'universal' overgo-hybridization probes from conserved sequences that can be used to efficiently screen one or more genomic libraries from a designated group of species.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1098205291"><span>VISTA -- Computational Tools for Comparative Genomics</span></a></div><p>Comprehensive suite of programs and databases for comparative analysis of genomic sequences.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL20110518144404"><span>cBARBEL -- Catfish Breeder and Researcher Bioinformatics Entry Location</span></a></div><p>Find information about ictalurid catfish.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1209738040"><span>eggNOG -- evolutionary genealogy of genes: Non-supervised Orthologous Groups</span></a></div><p>Discover orthologous groups of genes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1234370319"><span>metaTIGER -- a metabolic gene evolution resource</span></a></div><p>Find metabolic networks and phylogenomic information on a taxonomically diverse range of eukaryotes.</p></div><div><div><a href="https://www.hsls.pitt.edu/obrc/index.php?page=URL1138901833"><span>xBASE -- a collection of online databases for bacterial comparative genomics</span></a></div><p>Conduct bacterial comparative genomics.</p></div>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43872/installing-elgg-on-ubuntu</guid>
	<pubDate>Wed, 25 May 2022 02:26:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43872/installing-elgg-on-ubuntu</link>
	<title><![CDATA[Installing ELGG on Ubuntu !]]></title>
	<description><![CDATA[<p>Elgg is an open-source and highly customizable framework used for building an online social environment. It provides a simple and powerful user interface that helps to manage and build your content through a web browser. Elgg offers a rich set of features including messaging, microblogging, file-sharing, RSS support, access control, groups, and many more.</p><p>In this tutorial, we will show you how to install and configure the Elgg social networking platform on Ubuntu 20.04.</p><h2>Prerequisites</h2><p>&bull; A fresh Ubuntu 20.04&nbsp;<a href="https://www.atlantic.net/vps-hosting/">VPS</a>&nbsp;on the Atlantic.net Cloud Platform<br />&bull; A valid domain name pointed to your server IP<br />&bull; A root password configured on your server</p><h2>Step 1 &ndash; Create Atlantic.Net Cloud Server</h2><p>First, log in to your&nbsp;<a href="https://cloud.atlantic.net/?page=userlogin" target="_blank">Atlantic.Net Cloud Server</a>. Create a new&nbsp;<a href="https://www.atlantic.net/vps-hosting/how-to-create-new-atlantic-net-cloud-server/">server</a>, choosing Ubuntu 20.04 as the operating system with at least 2GB RAM. Connect to your Cloud Server via SSH and log in using the credentials highlighted at the top of the page.</p><p>Once you are logged in to your Ubuntu 20.04 server, run the following command to refresh the package lists so that the latest available packages can be installed.</p><pre>apt-get update -y</pre><h2>Step 2 &ndash; Install Apache, MariaDB and PHP</h2><p>Elgg runs on the Apache web server, is written in PHP, and uses MySQL/MariaDB as a database backend, so you will need to install Apache, MariaDB, PHP, and the required PHP extensions on your server. You can install all of them with the following command:</p><pre>apt-get install apache2 mariadb-server php libapache2-mod-php php-common php-sqlite3 php-curl php-intl php-mbstring php-xmlrpc php-mysql php-gd php-xml php-cli php-zip unzip wget -y</pre><p>After installing all the packages, edit the php.ini file and change some recommended settings.</p><pre>nano /etc/php/7.4/apache2/php.ini</pre><p>Change the following values:</p><pre>max_execution_time = 300
memory_limit = 512M
upload_max_filesize = 100M
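; Note (an addition, not part of the original steps): PHP also limits the total
; request body with post_max_size (8M by default), so it likely needs to be at
; least as large as upload_max_filesize for 100M uploads to go through.
; The value below is an assumption chosen to match upload_max_filesize.
post_max_size = 100M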
date.timezone = Asia/Kolkata</pre><p>Save and close the file, then restart the Apache service to apply the configuration changes.</p><pre>systemctl restart apache2</pre><h2>Step 3 &ndash; Create a Database for Elgg</h2><p>Next, you will need to create a database and user for Elgg. First, log in to MySQL shell with the following command:</p><pre>mysql</pre><p>Once logged in, create a database and user with the following command:</p><pre>CREATE DATABASE elgg;
CREATE USER 'elgg'@'localhost' IDENTIFIED BY 'secure-password';</pre><p>Next, grant the elgg user all privileges on the elgg database with the following command:</p><pre>GRANT ALL ON elgg.* TO 'elgg'@'localhost' IDENTIFIED BY 'secure-password' WITH GRANT OPTION;</pre><p>Next, flush the privileges and exit from the MariaDB shell with the following command:</p><pre>FLUSH PRIVILEGES;
EXIT;</pre><p>At this point, the MariaDB database is created for Elgg.</p><h2>Step 4 &ndash; Install Elgg</h2><p>First, download the latest version of Elgg from its official website using the following command:</p><pre>wget https://elgg.org/download/elgg-3.3.13.zip</pre><p>Once the download is completed, unzip the downloaded file with the following command:</p><pre>unzip elgg-3.3.13.zip</pre><p>Next, move the extracted directory to the Apache root directory:</p><pre>mv elgg-3.3.13 /var/www/html/elgg</pre><p>Next, create a data directory and set proper ownership and permissions to the Elgg directory:</p><pre>mkdir /var/www/html/data
chown -R www-data:www-data /var/www/html/elgg
chown -R www-data:www-data /var/www/html/data
chmod -R 755 /var/www/html/elgg</pre><p>Once you are finished, you can proceed to the next step.</p><h2>Step 5 &ndash; Configure Apache for Elgg</h2><p>Next, you will need to configure Apache to serve Elgg. You can configure it by creating a new Apache virtual host configuration file:</p><pre>nano /etc/apache2/sites-available/elgg.conf</pre><p>Add the following lines:</p><pre>&lt;VirtualHost *:80&gt;
ServerAdmin admin@example.com
DocumentRoot /var/www/html/elgg/
ServerName elgg.example.com
&lt;Directory /var/www/html/elgg/&gt;
Options FollowSymLinks
AllowOverride All
&lt;/Directory&gt;
ErrorLog /var/log/apache2/elgg-error_log
CustomLog /var/log/apache2/elgg-access_log common
&lt;/VirtualHost&gt;</pre><p>Save and close the file, then enable the virtual host and Apache rewrite module with the following command:</p><pre>a2ensite elgg.conf
a2enmod rewrite</pre><p>Finally, restart the Apache service to apply the changes:</p><pre>systemctl restart apache2</pre><h2>Step 6 &ndash; Access Elgg Web Interface</h2><p>Now, open your web browser and access the Elgg web interface using the URL http://elgg.example.com. You should see the Elgg welcome screen.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43670/useful-bioinformatics-analysis-tools</guid>
	<pubDate>Thu, 23 Dec 2021 23:10:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43670/useful-bioinformatics-analysis-tools</link>
	<title><![CDATA[Useful Bioinformatics Analysis Tools !]]></title>
	<description><![CDATA[<h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=cometa&amp;subpage=about">CoMeta</a></h3><p><strong>Classifier of reads from metagenomic sequencing experiments.</strong></p><p><span>&bull;&nbsp;&nbsp;Kawulok, J., Deorowicz, S.,&nbsp;</span><em>CoMeta: Classification of Metagenomes Using k-mers</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0121453">PLOS ONE,&nbsp;</a><span>2015; 10(4):1&ndash;23,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=CoMSA&amp;subpage=about">CoMSA</a></h3><p><strong>Compressor of multiple sequence alignments of proteins.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Walczyszyn, J., Debudaj-Grabysz, A.,&nbsp;</span><em>CoMSA: compression of protein multiple sequence alignment files</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty619">Bioinformatics,&nbsp;</a><span>2019; 35(2):227&ndash;234,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=dsrc&amp;subpage=about">DSRC</a></h3><p><strong>Compressor of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Roguski, L., Deorowicz, S.,&nbsp;</span><em>DSRC 2: Industry-oriented compression of FASTQ files</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/content/30/15/2213">Bioinformatics,&nbsp;</a><span>2014; 30(15):2213&ndash;2215,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Compression of DNA sequences in FASTQ format</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/">Bioinformatics,&nbsp;</a><span>2011; 27(6):860&ndash;862,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=famsa&amp;subpage=about">FAMSA</a></h3><p><strong>Multiple sequence alignment designed for huge families of proteins (even containing hundreds of thousands of 
sequences).</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A.,&nbsp;</span><em>FAMSA: Fast and accurate multiple sequence alignment of huge protein families</em><span>,&nbsp;</span><a href="http://www.nature.com/articles/srep33964">Scientific Reports,&nbsp;</a><span>2016; 6(33964):</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=fastore&amp;subpage=about">FaStore</a></h3><p><strong>Compressor of FASTQ files.</strong></p><p><span>&bull;&nbsp;&nbsp;Roguski, L., Ochoa, I., Hernaez, M., Deorowicz, S.,&nbsp;</span><em>FaStore - a space-saving solution for raw sequencing data</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty205">Bioinformatics,&nbsp;</a><span>2018; 34(16):2748&ndash;2756,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=fqsqueezer&amp;subpage=about">FQSqueezer</a></h3><p><strong>Experimental high-end compressor of FASTQ files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S.,&nbsp;</span><em>FQSqueezer: k-mer-based compression of sequencing data</em><span>,&nbsp;</span><a href="https://www.nature.com/articles/s41598-020-57452-6">Scientific Reports,&nbsp;</a><span>2020; 10(578):</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gdc&amp;subpage=about">GDC</a></h3><p><strong>Compressor of collections of genome sequences.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A., Niemiec, M.,&nbsp;</span><em>GDC 2: Compression of large collections of genomes</em><span>,&nbsp;</span><a href="http://www.nature.com/srep/2015/150625/srep11565/full/srep11565.html">Scientific Reports,&nbsp;</a><span>2015; 5(11565):1&ndash;12,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Robust relative compression of genomes with random access</em><span>,&nbsp;</span><a 
href="http://sun.aei.polsl.pl/REFRESH/bioinformatics.oxfordjournals.org/content/27/21/2979.abstract">Bioinformatics,&nbsp;</a><span>2011; 27(21):2979&ndash;2986,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gtc&amp;subpage=about">GTC</a></h3><p><strong>Genotype databases compressor with support for fast queries.</strong></p><p><span>&bull;&nbsp;&nbsp;Danek, A., Deorowicz, S.,&nbsp;</span><em>GTC: how to maintain huge genotype collections in a compressed form</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty023">Bioinformatics,&nbsp;</a><span>2018; 34(11):1834&ndash;1840,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=gtshark&amp;subpage=about">GTShark</a></h3><p><strong>Genotypes compressor.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A.,&nbsp;</span><em>GTShark: Genotype compression in large projects</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btz508">Bioinformatics,&nbsp;</a><span>2019; 35(22):4791&ndash;4793,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=kmc&amp;subpage=about">KMC</a></h3><p><strong>Memory frugal&nbsp;<em>k</em>-mer counter.</strong></p><p><span>&bull;&nbsp;&nbsp;Kokot, M., Długosz, M., Deorowicz, S.,&nbsp;</span><em>KMC 3: counting and manipulating k -mer statistics</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btx304">Bioinformatics,&nbsp;</a><span>2017; 33(17):2759&ndash;2761,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Kokot, M., Grabowski, Sz., Debudaj-Grabysz, A.,&nbsp;</span><em>KMC 2: Fast and resource-frugal k-mer counting</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/btv022">Bioinformatics,&nbsp;</a><span>2015; 31(10):1569&ndash;1576,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Grabowski, Sz.,&nbsp;</span><em>Disk-based k-mer counting on a 
PC</em><span>,&nbsp;</span><a href="http://www.biomedcentral.com/1471-2105/14/160">BMC Bioinformatics,&nbsp;</a><span>2013; 14():Article no. 160,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=kmer-db&amp;subpage=about">Kmer-db</a></h3><p><strong>Tool for estimation of evolutionary distances in a collection of genomes.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Gudys, A., Dlugosz, M., Kokot, M., Danek, A.,&nbsp;</span><em>Kmer-db: instant evolutionary distance estimation</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty610">Bioinformatics,&nbsp;</a><span>2019; 35(1):133&ndash;136,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=mugi&amp;subpage=about">MuGI</a></h3><p><strong>Index allowing queries for a collection of multiple genome sequences.</strong></p><p><span>&bull;&nbsp;&nbsp;Danek, A., Deorowicz, S., Grabowski, Sz.,&nbsp;</span><em>Indexes of Large Genome Collections on a PC</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0109384">PLOS ONE,&nbsp;</a><span>2014; 9(10):e109384,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=orcom&amp;subpage=about">ORCOM</a></h3><p><strong>Experimental compressor of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Grabowski, Sz., Deorowicz, S., Roguski, L.,&nbsp;</span><em>Disk-based compression of data from genome sequencing</em><span>,&nbsp;</span><a href="http://bioinformatics.oxfordjournals.org/content/early/2014/12/22/bioinformatics.btu844.abstract">Bioinformatics,&nbsp;</a><span>2014; 31(9):1389&ndash;1395,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=pgsa&amp;subpage=about">PgSA</a></h3><p><strong>Index allowing queries for a collection of sequencing reads.</strong></p><p><span>&bull;&nbsp;&nbsp;Kowalski, T., Grabowski, Sz., Deorowicz, 
S.,&nbsp;</span><em>Indexing arbitrary-length k-mers in sequencing reads</em><span>,&nbsp;</span><a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0133198">PLOS ONE,&nbsp;</a><span>2015; 10(7):1&ndash;16,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=quickprobs&amp;subpage=about">QuickProbs</a></h3><p><strong>Multiple sequence alignment designed especially for GPU.</strong></p><p><span>&bull;&nbsp;&nbsp;Gudys, A., Deorowicz, S.,&nbsp;</span><em>QuickProbs 2: towards rapid construction of high-quality alignments of large protein families</em><span>,&nbsp;</span><a href="http://www.nature.com/articles/srep41553">Scientific Reports,&nbsp;</a><span>2017; 7(41553):</span><br /><span>&bull;&nbsp;&nbsp;Gudys, A., Deorowicz, S.,&nbsp;</span><em>QuickProbs &ndash; A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors</em><span>,&nbsp;</span><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0088901">PLOS ONE,&nbsp;</a><span>2014; 9(2):e88901,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=reckoner&amp;subpage=about">RECKONER</a></h3><p><strong>Read error corrector.</strong></p><p><span>&bull;&nbsp;&nbsp;Długosz, M., Deorowicz, S.,&nbsp;</span><em>RECKONER: read error corrector based on KMC</em><span>,&nbsp;</span><a href="https://academic.oup.com/bioinformatics/article-abstract/33/7/1086/2843893/RECKONER-read-error-corrector-based-on-KMC">Bioinformatics,&nbsp;</a><span>2017; 33(7):1086&ndash;1089,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=tgc&amp;subpage=about">TGC</a></h3><p><strong>Compressor of collections of genomes given in Variant Call Format (VCF) files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A., Grabowski, Sz.,&nbsp;</span><em>Genome compression: a novel approach for large collections</em><span>,&nbsp;</span><a 
href="http://bioinformatics.oxfordjournals.org/content/early/2013/08/29/bioinformatics.btt460">Bioinformatics,&nbsp;</a><span>2013; 29(20):2572&ndash;2578,</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=vcfshark&amp;subpage=about">VCFShark</a></h3><p><strong>Compressor of VCF files.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Danek, A.,&nbsp;</span><em>GTShark: Genotype compression in large projects</em><span>,&nbsp;</span><a href="https://www.biorxiv.org/content/10.1101/2020.12.18.423437v1">biorxiv.org,&nbsp;</a><span>2020</span></p><h3><a href="http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&amp;project=whisper&amp;subpage=about">Whisper</a></h3><p><strong>Experimental mapper of whole genome sequencing data.</strong></p><p><span>&bull;&nbsp;&nbsp;Deorowicz, S., Gudys, A.,&nbsp;</span><em>Whisper 2: indel-sensitive short read mapping</em><span>,&nbsp;</span><a href="https://doi.org/10.1101/2019.12.18.881292">bioRxiv.org,&nbsp;</a><span>2019</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz.,&nbsp;</span><em>Whisper: read sorting allows robust mapping of DNA sequencing data</em><span>,&nbsp;</span><a href="https://doi.org/10.1093/bioinformatics/bty927">Bioinformatics,&nbsp;</a><span>2019; 35(12):2043&ndash;2050,</span><br /><span>&bull;&nbsp;&nbsp;Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz.,&nbsp;</span><em>Robust mapping of whole genome sequencing data</em><span>,&nbsp;</span><a href="https://meetings.cshl.edu/abstracts.aspx?meet=GENOME&amp;year=17">Poster at The Biology of Genomes Conference,&nbsp;</a><span>2017;</span></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</guid>
	<pubDate>Fri, 10 Dec 2021 06:22:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</link>
	<title><![CDATA[Illumina based assembly pipeline steps !]]></title>
	<description><![CDATA[<h3 id="illumina">Illumina<a href="https://nf-co.re/viralrecon#illumina"><span></span></a></h3><ol>
<li>Merge re-sequenced FastQ files (<a href="http://www.linfo.org/cat.html"><code>cat</code></a>)</li>
<li>Read QC (<a href="https://www.bioinformatics.babraham.ac.uk/projects/fastqc/"><code>FastQC</code></a>)</li>
<li>Adapter trimming (<a href="https://github.com/OpenGene/fastp"><code>fastp</code></a>)</li>
<li>Removal of host reads (<a href="http://ccb.jhu.edu/software/kraken2/"><code>Kraken 2</code></a>; <em>optional</em>)</li>
<li>Variant calling<ol>
<li>Read alignment (<a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml"><code>Bowtie 2</code></a>)</li>
<li>Sort and index alignments (<a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Primer sequence removal (<a href="https://github.com/andersen-lab/ivar"><code>iVar</code></a>; <em>amplicon data only</em>)</li>
<li>Duplicate read marking (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>; <em>optional</em>)</li>
<li>Alignment-level QC (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>, <a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Genome-wide and amplicon coverage QC plots (<a href="https://github.com/brentp/mosdepth/"><code>mosdepth</code></a>)</li>
<li>Choice of multiple variant calling and consensus sequence generation routes (<a href="https://github.com/andersen-lab/ivar"><code>iVar variants and consensus</code></a>; <em>default for amplicon data</em> <em>||</em> <a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>, <a href="https://github.com/arq5x/bedtools2/"><code>BEDTools</code></a>; <em>default for metagenomics data</em>)
<ul>
<li>Variant annotation (<a href="http://snpeff.sourceforge.net/SnpEff.html"><code>SnpEff</code></a>, <a href="http://snpeff.sourceforge.net/SnpSift.html"><code>SnpSift</code></a>)</li>
<li>Consensus assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
<li>Lineage analysis (<a href="https://github.com/cov-lineages/pangolin"><code>Pangolin</code></a>)</li>
<li>Clade assignment, mutation calling and sequence quality checks (<a href="https://github.com/nextstrain/nextclade"><code>Nextclade</code></a>)</li>
<li>Individual variant screenshots with annotation tracks (<a href="https://asciigenome.readthedocs.io/en/latest/"><code>ASCIIGenome</code></a>)</li>
</ul>
</li>
<li>Intersect variants across callers (<a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>)</li>
</ol></li>
<li><em>De novo</em> assembly<ol>
<li>Primer trimming (<a href="https://cutadapt.readthedocs.io/en/stable/guide.html"><code>Cutadapt</code></a>; <em>amplicon data only</em>)</li>
<li>Choice of multiple assembly tools (<a href="http://cab.spbu.ru/software/spades/"><code>SPAdes</code></a> <em>||</em> <a href="https://github.com/rrwick/Unicycler"><code>Unicycler</code></a> <em>||</em> <a href="https://github.com/GATB/minia"><code>minia</code></a>)
<ul>
<li>Blast to reference genome (<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch"><code>blastn</code></a>)</li>
<li>Contiguate assembly (<a href="https://www.sanger.ac.uk/science/tools/pagit"><code>ABACAS</code></a>)</li>
<li>Assembly report (<a href="https://github.com/BU-ISCIII/plasmidID"><code>PlasmidID</code></a>)</li>
<li>Assembly assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
</ul>
</li>
</ol></li>
<li>Present QC and visualisation for raw read, alignment, assembly and variant calling results (<a href="http://multiqc.info/"><code>MultiQC</code></a>)</li>
</ol>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43607/classification-of-sars-cov2-variant</guid>
	<pubDate>Fri, 26 Nov 2021 12:53:12 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43607/classification-of-sars-cov2-variant</link>
	<title><![CDATA[Classification of SARS-CoV2 Variant !]]></title>
	<description><![CDATA[<p>The scientists established some guidelines for determining whether a variant is a legitimate branch of an existing lineage:</p><ul><li>The variant should be transmitted from its original location to another "geographically distinct population"&mdash;say, another country or a province of a large and populous country.</li><li>It should differ from its ancestor by at least one nucleotide.</li><li>At least 95% of its genetic code should have been sequenced at least five times from different samples.</li></ul>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43550/basic-structure-of-snakemake-pipeline-run</guid>
	<pubDate>Thu, 14 Oct 2021 07:01:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43550/basic-structure-of-snakemake-pipeline-run</link>
	<title><![CDATA[Basic Structure of Snakemake Pipeline Run !]]></title>
	<description><![CDATA[<div>/user/snakemake-demo$ ls</div><div>config.json data envs scripts slurm-240702.out Snakefile</div><ul>
<li>data = mock data for the snakefile to use</li>
<li>Snakefile = name of the snakemake &ldquo;formula&rdquo; file
<ul>
<li>Note: By default, snakemake looks for a file named&nbsp;<code>Snakefile</code>&nbsp;in the current working directory. To use a different file, pass its path with the&nbsp;<code>-s</code>&nbsp;option:
<ul>
<li><code>snakemake -s snakefile.py</code></li>
</ul>
</li>
</ul>
</li>
<li>envs = directory for storing the conda environments that the workflow will use.</li>
<li>scripts = directory for storing python scripts called by the snakemake formula.</li>
<li>config.json = json format file with extra parameters for our snakemake file to use.</li>
<li>cluster.json = json format file with specification for running on the HPC</li>
<li>samples.txt = file we will use later relating to the config.json file.</li>
</ul><p><span>Run the snakemake file as a dry run (the example workflow shown above).</span></p><ul>
<li>This will build a DAG of the jobs to be run without actually executing them.</li>
<li><code>snakemake --dry-run</code></li>
</ul><p><span>You can also execute only the rules of interest by naming them as targets.</span></p><ul>
<li><code>snakemake --dry-run all</code>&nbsp;VS.&nbsp;<code>snakemake --dry-run call</code>&nbsp;VS.&nbsp;<code>snakemake --dry-run bwa</code></li>
</ul><p><span>Run the snakemake file in order to produce an image of the DAG of jobs to be run.</span></p><ul>
<li><code>snakemake --dag | dot -Tsvg &gt; dag.svg</code></li>
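</ul><p>Run the Snakemake workflow (this time not as a dry run). Note that recent Snakemake versions also require a core count, such as&nbsp;<code>--cores 1</code>.</p><ol>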
<li><code>snakemake --use-conda</code></li>
</ol>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43424/rest-api</guid>
	<pubDate>Mon, 04 Oct 2021 12:46:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43424/rest-api</link>
	<title><![CDATA[REST API]]></title>
	<description><![CDATA[<h3 id="PSIBLASTHelpandDocumentation-RESTAPI">REST API</h3><p>The&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/pages/viewpage.action?pageId=68165098">Representational State Transfer (REST)</a>&nbsp;sample clients are provided for a number of programming languages. For details of how to use these clients,&nbsp;<a href="https://github.com/ebi-wp/webservice-clients">download</a>&nbsp;the client and run the program without any arguments.</p><div><table><colgroup><col><col><col></colgroup>
<thead>
<tr><th scope="col">
<div>Language</div>
</th><th scope="col">
<div>Download</div>
</th><th scope="col">
<div>Requirements</div>
</th></tr>
</thead>
<tbody>
<tr><th>Perl</th>
<td><a href="https://raw.githubusercontent.com/ebi-wp/webservice-clients/master/perl/psiblast.pl">psiblast.pl</a></td>
<td><a href="http://search.cpan.org/perldoc?LWP">LWP</a>&nbsp;and&nbsp;<a href="http://search.cpan.org/perldoc?XML::Simple">XML::Simple</a></td>
</tr>
<tr><th colspan="1">
<h4 id="PSIBLASTHelpandDocumentation-Python">Python</h4>
</th>
<td colspan="1">
<p><a href="https://raw.githubusercontent.com/ebi-wp/webservice-clients/master/python/psiblast.py">psiblast.py</a></p>
</td>
<td colspan="1"><a href="https://pypi.python.org/pypi/xmltramp2/3.0.10" title="https://pypi.python.org/pypi/xmltramp2/3.0.10">xmltramp2</a></td>
</tr>
</tbody>
</table></div><p>For details see&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Environment+setup+for+REST+Web+Services">Environment setup for REST Web Services</a>&nbsp;and&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Examples+for+Perl+REST+Web+Services+Clients">Examples for Perl REST Web Services Clients</a>&nbsp;pages.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>