<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/22920?offset=1330</link>
	<atom:link href="https://bioinformaticsonline.com/related/22920?offset=1330" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</guid>
	<pubDate>Mon, 10 Feb 2014 05:57:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</link>
	<title><![CDATA[List of generic simulation software/tools/resource with brief description and homepage !!!]]></title>
	<description><![CDATA[<p>List of generic simulation software, tools, and resources, each with a brief description and homepage</p><p><img src="http://www.evolution-of-life.com/fileadmin/images/carousel/genetic.PNG" alt="image" style="border: 0px;"></p><p>ALF <br />A Simulation Framework for Genome Evolution <br />http://www.cbrg.ethz.ch/alf<br /><br />Bayesian Serial SimCoal <br />Bayesian Serial SimCoal (BayeSSC) is a modification of SIMCOAL 1.0, a program written by Laurent Excoffier, John Novembre, and Stefan Schneider. <br />http://www.stanford.edu/group/hadlylab/ssc/index.html<br /><br />BEERS <br />BEERS was designed to benchmark RNA-Seq alignment algorithms and also algorithms that aim to reconstruct different isoforms and alternate splicing from RNA-Seq data <br />http://cbil.upenn.edu/beers/<br /><br />BOTTLENECK <br />Bottleneck is a program for detecting recent effective population size reductions from allele frequency data <br />http://www.ensam.inra.fr/urlb/bottleneck/bottleneck.html<br /><br />BottleSim <br />BottleSim is a computer simulation program for simulating the process of population bottlenecks <br />http://chkuo.name/software/bottlesim.html<br /><br />CASS <br />Protein Sequence Simulation <br />http://www.wyomingbioinformatics.org/liberlesgroup/cass/<br /><br />CDPOP <br />CDPOP is a landscape genetics tool for simulating the emergence of spatial genetic structure in populations resulting from specified landscape processes governing organism movement behavior. <br />http://cel.dbs.umt.edu/cdpop<br /><br />CoalFace <br />CoalFace is a simulation of the coalescent process with the visual display of gene genealogies. <br />http://web.up.ac.za/default.asp?ipkcategoryid=3283<br /><br />CoaSim <br />CoaSim is a tool for simulating the coalescent process with recombination and gene conversion under various demographic models. 
<br />http://users-birc.au.dk/mailund/coasim/index.html<br /><br />cosi <br />The cosi package is written in C and is available as a tar file. <br />http://www.broadinstitute.org/~sfs/cosi/<br /><br />CS-PSeq-Gen <br />A program to simulate the evolution of protein sequences under the constraints of the information of a particular reconstructed phylogeny <br />http://bioserv.rpbs.univ-paris-diderot.fr/software/cs-pseq-gen.html<br /><br />DAWG <br />An application designed to simulate the evolution of recombinant DNA sequences in continuous time <br />http://scit.us/projects/dawg<br /><br />Easypop <br />EASYPOP is an individual-based model intended to simulate datasets under a very broad range of conditions <br />http://www.unil.ch/dee/page36926_fr.html<br /><br />EggLib <br />EggLib is a C++/Python library and program package for evolutionary genetics and genomics. <br />http://egglib.sourceforge.net/<br /><br />EvolSimulator <br />A simulation test bed for hypotheses of genome evolution <br />http://acb.qfab.org/acb/evolsim/<br /><br />EvolveAGene <br />A realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions <br />http://bellinghamresearchinstitute.com/software/index.html<br /><br />fastsimcoal <br />A continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios <br />http://cmpg.unibe.ch/software/fastsimcoal/<br /><br />FastSLINK <br />Simulation of Marker and Phenotype Data in Pedigrees <br />http://watson.hgen.pitt.edu/<br /><br />FFPopSim <br />C++/Python library for population genetics. <br />http://webdav.tuebingen.mpg.de/ffpopsim/<br /><br />FLUX SIMULATOR <br />The Flux Simulator aims at providing a deterministic in silico reproduction of the experimental pipelines for RNA-Seq, employing a minimal set of parameters. 
<br />http://flux.sammeth.net/simulator.html<br /><br />ForSim <br />ForSim: A Forward Evolutionary Computer Simulation <br />http://www.anthro.psu.edu/weiss_lab/research.shtml<br /><br />ForwSim <br />The program given below is based on the algorithm described in Padhukasahasram et al. 2008 to simulate genetic drift in a standard Wright-Fisher process. <br />http://badri-populationgeneticsimulators.blogspot.com/<br /><br />FPG <br />Forward Population Genetic simulation <br />http://genfaculty.rutgers.edu/hey/software#fpg<br /><br />FREGENE <br />FREGENE is a C++ program that simulates sequence-like data over large genomic regions in large diploid populations. <br />http://www.ebi.ac.uk/projects/bargen/download/fregen/documentation_html.html<br /><br />GAMETES <br />Genetic Architecture Model Emulator for Testing and Evaluating Software: Simulates complex SNP models with pure, strict epistatic interactions with n-loci. <br />http://sourceforge.net/projects/gametes/?source=navbar<br /><br />GASP <br />Genometric Analysis Simulation Program. A software tool for testing and investigating methods in statistical genetics by generating samples of family data based on user specified models. <br />http://research.nhgri.nih.gov/gasp/<br /><br />GemSIM <br />Next generation sequencing read simulator <br />http://sourceforge.net/projects/gemsim/<br /><br />GeneArtisan <br />Simulation of Markers in Case-Control Study Designs <br />http://www.rannala.org/?page_id=241<br /><br />GENOME <br />A rapid coalescent-based whole genome simulator <br />http://www.sph.umich.edu/csg/liang/genome/<br /><br />GenomePop2 <br />GenomePop2 is a specialization of the program GenomePop just to manage SNPs under more flexible and useful settings. If you need models with more than 2 alleles please use the GenomePop program version. 
<br />http://webs.uvigo.es/acraaj/genomepop2.htm<br /><br />GenomeSimla <br />GenomeSIMLA is currently under development- however, we have a beta release that we are asking to be tested <br />http://chgr.mc.vanderbilt.edu/genomesimla/<br /><br />GENS2 <br />Simulates interactions among two genetic and one environmental factor and also allows for epistatic interactions. <br />https://sourceforge.net/projects/gensim/<br /><br />GWAsimulator <br />A rapid whole genome simulation program <br />http://biostat.mc.vanderbilt.edu/wiki/main/gwasimulator<br /><br />HAP-SAMPLE <br />An association simulator for candidate regions or genome scans <br />http://www.hapsample.org/<br /><br />HAPGEN <br />A simulator for the simulation of case control datasets at SNP markers <br />https://mathgen.stats.ox.ac.uk/genetics_software/hapgen/hapgen2.html<br /><br />HapSim <br />A simulation tool for generating haplotype data with pre-specified allele frequencies and LD coefficients <br />http://cran.r-project.org/web/packages/hapsim/index.html<br /><br />HAPSIMU <br />A program that simulates heterogeneous populations with various known and controllable structures under the continuous migration model or the discrete model <br />http://l.web.umkc.edu/liujian/<br /><br />IBDsim <br />IBDSim is a computer package for the simulation of genotypic data under general isolation by distance models. 
<br />http://raphael.leblois.free.fr/<br /><br />indel-Seq-Gen <br />A biological sequence simulation program that simulates highly divergent DNA sequences and protein superfamilies <br />http://bioinfolab.unl.edu/~cstrope/isg/<br /><br />Indelible <br />A powerful and flexible simulator of biological evolution <br />http://abacus.gene.ucl.ac.uk/software/indelible/<br /><br />invertFREGENE <br />InvertFREGENE is a forward-in-time simulator of inversions in population genetic data <br />http://www.ebi.ac.uk/projects/bargen/<br /><br />kernalPop <br />A spatially explicit population genetic simulation engine <br />http://cran.r-project.org/src/contrib/archive/kernelpop/<br /><br />MaCS <br />Markovian Coalescent Simulator <br />http://www-hsc.usc.edu/~garykche/<br /><br />Mason <br />A package for the simulation of nucleotide data. <br />http://www.seqan.de/projects/mason/<br /><br />mbs <br />modifying Hudson's ms software to generate samples of DNA sequences with a biallelic site under selection <br />http://www.sendou.soken.ac.jp/esb/innan/innanlab/software.html<br /><br />Mendel's Accountant <br />Mendel's Accountant (MENDEL) is an advanced numerical simulation program for modeling genetic change over time and was developed collaboratively by Sanford, Baumgardner, Brewer, Gibson and ReMine <br />http://mendelsaccount.sourceforge.net/<br /><br />MetaSim <br />A tool to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets <br />http://ab.inf.uni-tuebingen.de/software/metasim/<br /><br />mlcoalsim <br />Multilocus Coalescent Simulations <br />http://code.google.com/p/mlcoalsim-v1/<br /><br />ms <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. 
<br />http://home.uchicago.edu/~rhudson1/source/mksamples.html<br /><br />msHOT <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. <br />http://home.uchicago.edu/~rhudson1/<br /><br />msms <br />A coalescent simulation tool with selection. <br />http://www.mabs.at/ewing/msms/index.shtml<br /><br />MySSP <br />A program for the simulation of DNA sequence evolution across a phylogenetic tree <br />http://www.rosenberglab.net/software.php<br /><br />Nemo <br />A forward-time, individual-based, genetically explicit, and stochastic simulation program designed to study the evolution of genetic markers, life history traits, and phenotypic traits in a flexible (meta-)population framework. <br />http://nemo2.sourceforge.net/<br /><br />NetRecodon <br />Coalescent simulation of coding DNA sequences with recombination (inter and intracodon), migration and demography <br />http://code.google.com/p/netrecodon/<br /><br />PEDAGOG <br />Software for simulating eco-evolutionary population dynamics <br />https://bcrc.bio.umass.edu/pedigreesoftware/node/5<br /><br />phenosim <br />A tool to add phenotypes to simulated genotypes <br />http://evoplant.uni-hohenheim.de/doku.php?id=software:software<br /><br />PhyloSim <br />An R package for the Monte Carlo simulation of sequence evolution <br />http://bit.ly/rlsim-git<br /><br />pIRS <br />Profile-based Illumina pair-end reads simulator <br />https://code.google.com/p/pirs/<br /><br />ProteinEvolver <br />Simulation of protein evolution along phylogenies under structure-based substitution models <br />http://code.google.com/p/proteinevolver/<br /><br />QMSim <br />QTL and Marker Simulator <br />http://www.aps.uoguelph.ca/~msargol/qmsim/<br /><br />quantiNEMO <br />An individual-based program for the analysis of quantitative traits with explicit genetic architecture 
potentially under selection in a structured population <br />http://www2.unil.ch/popgen/softwares/quantinemo/<br /><br />RECOAL <br />Simulates new haplotype data from a reference population of haplotypes. <br />ftp://popgen.usc.edu/<br /><br />Recodon <br />Coalescent simulation of coding DNA sequences with recombination, migration and demography <br />http://code.google.com/p/recodon/<br /><br />rlsim <br />A package for simulating RNA-seq library preparation with parameter estimation <br />http://bit.ly/rlsim-git<br /><br />Rmetasim <br />Rmetasim is a front-end for the metasim engine that is implemented as a package that runs in the statistical computing environment R <br />http://linum.cofc.edu/software.html#metasim<br /><br />RNA Seq Simulator <br />RSS takes SAM alignment files from RNA-Seq data and simulates over dispersed, multiple replica, differential, non-stranded RNA-Seq datasets. <br />http://useq.sourceforge.net/cmdlnmenus.html#rnaseqsimulator<br /><br />Rose <br />Random model of sequence evolution <br />http://bibiserv.techfak.uni-bielefeld.de/rose/<br /><br />SelSim <br />SelSim is a program for Monte Carlo simulation of DNA polymorphism data for a recombining region within which a single bi-allelic site has experienced natural selection <br />http://www.well.ox.ac.uk/~spencer/selsim/<br /><br />Seq-Gen <br />An application for the Monte Carlo simulation of molecular sequence evolution along phylogenetic trees. <br />http://tree.bio.ed.ac.uk/software/seqgen/<br /><br />SEQPower <br />Statistical power analysis for sequence-based association studies <br />http://bioinformatics.org/spower/<br /><br />SeqSIMLA <br />SeqSIMLA can simulate sequence data with user-specified disease and quantitative trait models. Family or unrelated case-control data can be simulated. 
<br />http://seqsimla.sourceforge.net/<br /><br />Serial NetEvolve <br />A flexible utility for generating serially-sampled sequences along a tree or recombinant network <br />http://biorg.cis.fiu.edu/sne/<br /><br />SFS_CODE <br />SFS_CODE can perform forward population genetic simulations under a general Wright-Fisher model with arbitrary migration, demographic, selective, and mutational effects. <br />http://sfscode.sourceforge.net/sfs_code/index/index.html<br /><br />SIBSIM <br />Quantitative phenotype simulation in extended pedigrees <br />http://sourceforge.net/projects/sibsim/<br /><br />SIMCOAL2 <br />A coalescent program for the simulation of complex recombination patterns over large genomic regions under various demographic models <br />http://cmpg.unibe.ch/software/simcoal2/<br /><br />SimCopy <br />An R package simulating the evolution of copy number profiles along a tree. <br />http://bit.ly/simcopy<br /><br />SIMLA <br />SIMLA is a SIMuLAtion program that generates data sets of families for use in Linkage and Association studies. <br />http://www.chg.duke.edu/research/simla.html<br /><br />SimPed <br />A Simulation Program to Generate Haplotype and Genotype Data for Pedigree Structures <br />http://www.hgsc.bcm.tmc.edu/content/simped<br /><br />Simprot <br />A program to simulate protein evolution by substitution, insertion and deletion <br />http://www.uhnresearch.ca/labs/tillier/software.htm#3<br /><br />SimRare <br />Rare variant simulation and analysis tool <br />http://code.google.com/p/simrare/<br /><br />simuGWAS <br />A forward-time simulator that simulates realistic samples for genome-wide association studies. <br />http://simupop.sourceforge.net/cookbook/simucomplexdisease<br /><br />simuPOP <br />simuPOP is a general-purpose individual-based forward-time population genetics simulation environment. 
<br />http://simupop.sourceforge.net/<br /><br />SISSI <br />A software tool to generate data of related sequences along a given phylogeny, taking into account user defined system of neighbourhoods and instantaneous rate matrices. <br />http://www.cibiv.at/software/sissi/<br /><br />SNPsim <br />Coalescent simulation of hotspot recombination <br />http://code.google.com/p/phylosoftware/<br /><br />SPIP <br />SPIP simulates the transmission of genes from parents to offspring in a population having demographic structure defined by the user <br />http://swfsc.noaa.gov/textblock.aspx?division=fed&amp;id=3434<br /><br />Splatche <br />Spatial and Temporal Coalescences in Heterogeneous Environment <br />http://www.splatche.com/<br /><br />srv <br />Simulator of Rare Variants (srv) simulates the introduction and evolution of (rare) genetic variants. <br />http://simupop.sourceforge.net/cookbook/simurarevariants<br /><br />SUP <br />SLINK/FastSLINK utility program <br />http://mlemire.freeshell.org/software.html<br /><br />TreesimJ <br />A flexible, forward-time population genetic simulator <br />http://code.google.com/p/treesimj/<br /><br />Vortex <br />VORTEX is an individual-based simulation model for population viability analysis (PVA). <br />http://www.vortex9.org/vortex.html<br /><br />References:</p><p>Image www.evolution-of-life.com</p><p>www.cancer.gov</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</guid>
	<pubDate>Tue, 06 Feb 2018 14:54:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35534/awk-for-bioinformatician-and-computational-biologist</link>
	<title><![CDATA[Awk for Bioinformatician and computational biologist]]></title>
	<description><![CDATA[<p>Awk is a programming language that allows easy manipulation of structured data and is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions. The basic syntax is:</p><blockquote><p><br />awk '/pattern1/ {Actions}<br /> /pattern2/ {Actions}' file</p></blockquote><p><br />Awk works as follows:<br />Awk reads the input files one line at a time.<br />For each line, it tests the patterns in the given order and performs the corresponding action for every pattern that matches.<br />If no pattern matches, no action is performed.<br />In the above syntax, either the search pattern or the action is optional, but not both.<br />If the search pattern is not given, Awk performs the given actions for each line of the input.<br />If the action is not given, Awk prints every line that matches the given pattern, which is the default action.<br />Empty braces without any action do nothing; the default print is not performed.<br />Statements within Actions should be separated by semicolons.<br />Say you have data/test.tsv with the following contents:</p><p><br />$ cat data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />With no pattern, the print action applies to every line of the file.</p><p><br />$ awk '{print;}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />To print only the line that matches the pattern contig3:</p><p><br />$ awk '/contig3/' data/test.tsv<br />contig3 ACTTATATATATATA<br />Awk has a number of built-in variables. For each record (i.e., each line), it splits the record on whitespace by default and stores the fields in the $n variables. 
If the line has 5 words, they will be stored in $1, $2, $3, $4 and $5. $0 represents the whole line. NF is a built-in variable that holds the total number of fields in a record, so $NF is the last field.</p><p><br />$ awk '{print $1","$2;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p>$ awk '{print $1","$NF;}' data/test.tsv<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT</p><p><br />Awk has two important patterns, specified by the keywords BEGIN and END. The syntax is as follows:</p><blockquote><p>BEGIN { Actions before reading the file}<br />{Actions for every line in the file} <br />END { Actions after reading the file }</p></blockquote><p><br />For example,<br />$ awk 'BEGIN{print "Header,Sequence"}{print $1","$2;}END{print "-------"}' data/test.tsv<br />Header,Sequence<br />contig1,ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2,ACTTTATATATT<br />contig3,ACTTATATATATATA<br />contig4,ACTTATATATATATA<br />contig5,ACTTTATATATT<br />------- <br />We can also use a conditional (ternary) operator in a print statement, in the form print CONDITION ? PRINT_IF_TRUE_TEXT : PRINT_IF_FALSE_TEXT. For example, the code below identifies sequences with lengths &gt; 14:</p><p>$ awk '{print (length($2)&gt;14) ? 
$0"&gt;14" : $0"&lt;=14";}' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG&gt;14<br />contig2 ACTTTATATATT&lt;=14<br />contig3 ACTTATATATATATA&gt;14<br />contig4 ACTTATATATATATA&gt;14<br />contig5 ACTTTATATATT&lt;=14<br />We can also put 1 after the last block {} to print everything: 1 is a pattern that is always true, and a pattern with no action prints $0 by default, so it is shorthand for {print $0} (or simply {print}, since print without arguments prints $0). Before that final print we can change $0; for example, to assign the first field to $0 on the third line (NR==3), we can use:</p><p>$ awk 'NR==3{$0=$1}1' data/test.tsv<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT<br />You can have as many blocks as you want, and they are executed on each line in the order they appear. For example, to print $1 three times (here we use printf instead of print because printf does not append an end-of-line character):</p><p>$ awk '{printf "%s\t", $1}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1 contig1<br />contig2 contig2 contig2<br />contig3 contig3 contig3<br />contig4 contig4 contig4<br />contig5 contig5 contig5 <br />We can also skip the remaining blocks for a given line with the next keyword:</p><p>$ awk '{printf "%s\t", $1}NR==3{print "";next}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2<br />contig3 <br />contig4 contig4<br />contig5 contig5</p><p>$ awk 'NR==3{print "";next}{printf "%s\t", $1}{print $1}' data/test.tsv<br />contig1 contig1<br />contig2 contig2</p><p>contig4 contig4<br />contig5 contig5<br />You can also use getline to read another file in addition to the one being processed. In the statement below, the while loop in the BEGIN block reads each line of data/test.tsv into k until no lines remain:</p><p>$ awk 'BEGIN{while((getline k &lt;"data/test.tsv")&gt;0) print "BEGIN:"k}{print}' data/test.tsv<br />BEGIN:contig1 
ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />BEGIN:contig2 ACTTTATATATT<br />BEGIN:contig3 ACTTATATATATATA<br />BEGIN:contig4 ACTTATATATATATA<br />BEGIN:contig5 ACTTTATATATT<br />contig1 ACTGTCTGTCACTGTGTTGTGATGTTGTGTGTG<br />contig2 ACTTTATATATT<br />contig3 ACTTATATATATATA<br />contig4 ACTTATATATATATA<br />contig5 ACTTTATATATT <br />You can also store data in memory in an associative array with the syntax VARIABLE_NAME[KEY]=VALUE, and later iterate over its keys with for (INDEX in VARIABLE_NAME). Note that the iteration order is not guaranteed:</p><p>$ awk '{i[$1]=1}END{for (j in i) print j"&lt;="i[j]}' data/test.tsv<br />contig1&lt;=1<br />contig2&lt;=1<br />contig3&lt;=1<br />contig4&lt;=1<br />contig5&lt;=1</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/38029/biologist-versus-computational-biologist</guid>
	<pubDate>Mon, 29 Oct 2018 04:23:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/38029/biologist-versus-computational-biologist</link>
	<title><![CDATA[Biologist versus computational biologist !]]></title>
	<description><![CDATA[<p>This is how it works :)</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/38029" length="69305" type="image/png" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39471/bioinformatics-for-precision-oncology-online-training-program-summer-2019</guid>
	<pubDate>Wed, 05 Jun 2019 15:04:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39471/bioinformatics-for-precision-oncology-online-training-program-summer-2019</link>
	<title><![CDATA[Bioinformatics for Precision Oncology - Online Training Program, Summer 2019]]></title>
	<description><![CDATA[<p><img src="https://edu.t-bio.info/wp-content/uploads/2019/05/OncologyBioinformatics.jpeg" width="600" height="337.5" alt="image" style="border: 0px;"></p><p>The Bioinformatics for Precision Oncology online course provides an opportunity to learn about bioinformatics methods used in precision oncology research and practice. As a subset of precision medicine, precision oncology deals with molecular factors involved in the biological processes that lead to cancer and can help diagnose, treat or prevent this disease. Oncology is driven by data, often generated using Next Generation Sequencing (NGS), which helps us study genomic and transcriptomic sub-cellular processes. Learn more and register:&nbsp;https://edu.t-bio.info/bioinformatics-training-precision-oncology/</p>]]></description>
	<dc:creator>eliabrodsky</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/40204/iitm-tokyo-tech-joint-symposium</guid>
	<pubDate>Thu, 24 Oct 2019 10:30:25 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/40204/iitm-tokyo-tech-joint-symposium</link>
	<title><![CDATA[IITM-Tokyo Tech Joint Symposium]]></title>
	<description><![CDATA[<p>The IITM-Tokyo Tech Joint Symposium is a biannual international symposium held at the Indian Institute of Technology Madras (IITM), India, in collaboration with the Tokyo Institute of Technology (Tokyo Tech), Japan. During the symposium, experts in various domains of Bioinformatics from India and Japan gather under one roof to discuss and present their work. This gives researchers and students a unique opportunity to learn about the frontiers of the field and interact with eminent scientists in Bioinformatics. The 5th IITM - Tokyo Tech Joint Symposium, titled "Current trends in Bioinformatics: Big data analysis, machine learning and drug design", will be held on 6th - 7th March 2020 at IITM, Chennai, India.</p><p>The symposium will focus on the following areas.</p><p>Topics: Algorithms for biomolecular sequences / structures; Bioinformatics databases and tools; Protein function; Structure based drug design; Machine learning; Deep learning; Large scale data analysis; Big Data; NGS Analysis; Protein interactions/network; Molecular modelling/docking/screening; Biomolecular structure and function</p><p>More info: https://web.iitm.ac.in/bioinfo2/symposium2020/home</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</guid>
	<pubDate>Tue, 27 Oct 2020 19:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</link>
	<title><![CDATA[Frequent parameters for bioinformatics tools !]]></title>
	<description><![CDATA[<div><div>Third-party executable parameters and options.</div><div>&nbsp;</div><div>Trimmomatic</div><div>&nbsp;</div><div>&ldquo;ILLUMINACLIP:...:2:30:10&rdquo;</div><div>&ldquo;LEADING:15&rdquo;</div><div>&ldquo;TRAILING:15&rdquo;</div><div>&ldquo;SLIDINGWINDOW:4:20&rdquo;</div><div>&ldquo;MINLEN:20&rdquo;</div><div>&ldquo;TOPHRED33&rdquo;</div><div>&nbsp;</div><div>Filtlong</div><div>--min_length 500</div><div>--min_mean_q 85</div><div>--min_window_q 65</div><div>&nbsp;</div><div>FastQ Screen</div><div>--aligner bowtie2 (bwa for PacBio)</div><div>--subset 1000 (for PacBio)</div><div>&nbsp;</div><div>SPAdes</div><div>--careful</div><div>--disable-gzip-output</div><div>--cov-cutoff auto</div><div>--phred-offset 33</div><div>&nbsp;</div><div>HGAP</div><div>Pbalign.task_options.min_accuracy: 70</div><div>Pbalign.task_options.no_split_subreads: false</div><div>Genomic_consensus.task_options.min_confidence: 40</div><div>falcon_ns.task_options.HGAP_GenomeLength_str: 6000000</div><div>Pbcoretools.task_options.read_length: 0</div><div>Genomic_consensus.task_options.use_score: 0</div><div>Pbalign.task_options.min_length: 50</div><div>Pbalign.task_options.algorithm_options: --minMatch 12 --bestn 10 --minPctSimilarity 70.0</div><div>Pbalign.task_options.hit_policy: randombest</div><div>Pbcoretools.task_options.other_filters: rq &gt;= 0.7</div><div>Pbalign.task_options.concordant: false</div><div>Genomic_consensus.task_options.min_coverage: 5</div><div>falcon_ns.task_options.HGAP_SeedCoverage_str: 30</div><div>falcon_ns.task_options.HGAP_AggressiveAsm_bool: false</div><div>Genomic_consensus.task_options.algorithm: best</div><div>falcon_ns.task_options.HGAP_SeedLengthCutoff_str: -1</div><div>Genomic_consensus.task_options.diploid: false</div><div>&nbsp;</div><div>MeDuSa</div><div>-random 
100</div><div>&nbsp;</div><div>Prokka</div><div>--usegenus</div><div>--force</div><div>--addgenes</div><div>--rfam</div><div>--rawproduct</div><div>&nbsp;</div><div>cmsearch (taxonomy, 16S)</div><div>--rfam</div><div>--noali</div><div>&nbsp;</div><div>blastn (taxonomy, 16S)</div><div>-evalue 1E-10</div><div>&nbsp;</div><div>blastn (MLST)</div><div>-ungapped</div></div><div><div>-dust no</div><div>-evalue 1E-20</div><div>-word_size 32</div><div>-culling_limit 2</div><div>-perc_identity 95</div><div>&nbsp;</div><div>blastp (VF)</div><div>-culling_limit 2</div><div>&nbsp;</div><div>RGI (ABR)</div><div>--input_type contig</div><div>&nbsp;</div><div>bowtie2 (mapping)</div><div>--sensitive</div><div>&nbsp;</div><div>minimap2 (mapping)</div><div>-a</div><div>-x map-ont</div><div>&nbsp;</div><div>samtools mpileup (SNP&nbsp;detection)</div><div>-uRI</div><div>&nbsp;</div><div>bcftools call (SNP detection)</div><div>--variants-only</div><div>--skip-variants indels</div><div>--output-type v</div><div>--ploidy 1</div><div>-c</div><div>&nbsp;</div><div>SNPsift filter (SNP detection)</div><div>"( QUAL &gt;= 30 ) &amp; (( na FILTER ) | (FILTER = 'PASS')) &amp;</div><div>( DP &gt;= 20 ) &amp; ( MQ &gt;= 20 )"</div><div>&nbsp;</div><div>SNPeff ann (SNP detection)</div><div>-nodownload</div><div>-no-intron</div><div>-no-downstream</div><div>-no SPLICE_SITE_REGION</div><div>-upDownStreamLen 250</div><div>&nbsp;</div><div>bcftools consensus</div><div>(phylogenetic tree)</div><div>--haplotype 1</div><div>&nbsp;</div><div>fasttreemp</div><div>-nt</div><div>-boot 100</div><div>&nbsp;</div><div>roary</div><div>-e</div><div>-n</div><div>-cd 100</div><div>-g 100000</div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/43044/kanthida-lab</guid>
  <pubDate>Wed, 28 Apr 2021 02:27:22 -0500</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/43044/kanthida-lab</link>
  <title><![CDATA[Kanthida Lab !]]></title>
  <description><![CDATA[
<p>Research Interests: </p>

<p>Bioinformatics </p>

<p>High-throughput and high-dimensional data analysis</p>

<p>Microbiome data analysis (Main focus)</p>

<p>Next-generation and third-generation sequencing data analysis for genomics</p>

<p>Gene expression data analysis</p>

<p>Machine learning for biological data</p>

<p>Biomarker identification </p>

<p>Databases and web applications for biological data</p>

<p>More at <br />https://sites.google.com/mail.kmutt.ac.th/kanthida-k/home?authuser=0</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43323/biostarhandbook</guid>
	<pubDate>Fri, 27 Aug 2021 01:31:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43323/biostarhandbook</link>
	<title><![CDATA[biostarhandbook]]></title>
	<description><![CDATA[<p>A nice book collection for bioinformaticians; highly recommended.</p><p>Address of the bookmark: <a href="https://www.biostarhandbook.com/" rel="nofollow">https://www.biostarhandbook.com/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44734/data-visualization-in-bioinformatics-useful-and-eye-catching-plots-for-data-analysis</guid>
	<pubDate>Sat, 14 Dec 2024 12:41:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44734/data-visualization-in-bioinformatics-useful-and-eye-catching-plots-for-data-analysis</link>
	<title><![CDATA[Data Visualization in Bioinformatics: Useful and Eye-Catching Plots for Data Analysis]]></title>
	<description><![CDATA[<p>Data visualization is a cornerstone of bioinformatics, enabling researchers to interpret complex datasets effectively. With a plethora of data types&mdash;genomic sequences, expression profiles, protein interactions, and more&mdash;the right visualizations can make or break an analysis. This blog highlights some of the most useful and visually compelling plots for bioinformatics data analysis, along with tools to create them.</p><h4><strong>1. Heatmaps: Exploring Patterns in High-Dimensional Data</strong></h4><p>Heatmaps are a go-to visualization for representing high-dimensional datasets, such as gene expression or metabolomics data. They use color gradients to display data intensity, making patterns and clusters easily detectable.</p><ul>
<li>
<p><strong>Applications</strong>: Gene expression analysis, pathway enrichment, methylation studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Seaborn (Python), ComplexHeatmap (R), Morpheus (web-based).</p>
</li>
</ul><p><strong>Tip</strong>: Add dendrograms to visualize clustering of rows and columns for hierarchical relationships.</p><h4><strong>2. Volcano Plots: Highlighting Differential Features</strong></h4><p>Volcano plots are indispensable for identifying significantly differentially expressed genes or proteins. They plot the log2 fold change against &ndash;log10(p-value), making it easy to spot statistically significant changes.</p><ul>
<li>
<p><strong>Applications</strong>: RNA-seq, proteomics, and metabolomics.</p>
</li>
<li>
<p><strong>Tools</strong>: ggplot2 (R), EnhancedVolcano (R), Plotly (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use color to highlight significant features and label key genes or proteins.</p><h4><strong>3. PCA Plots: Reducing Complexity with Principal Component Analysis</strong></h4><p>Principal Component Analysis (PCA) plots project samples onto the first few principal components, reducing dimensionality and revealing trends or clusters in the data. They provide insight into sample variability and grouping.</p><ul>
<li>
<p><strong>Applications</strong>: Transcriptomics, metabolomics, microbiome studies.</p>
</li>
<li>
<p><strong>Tools</strong>: scikit-learn + Matplotlib (Python), prcomp (R), ClustVis (web-based).</p>
</li>
</ul><p><strong>Tip</strong>: Annotate clusters with metadata to enhance interpretability.</p><h4><strong>4. Manhattan Plots: Genome-Wide Association Studies</strong></h4><p>Manhattan plots show &ndash;log10(p-value) for each variant against its genomic position, making it easy to identify significant associations in genome-wide studies. They resemble city skylines, with the tallest peaks indicating loci of interest.</p><ul>
<li>
<p><strong>Applications</strong>: GWAS, QTL mapping.</p>
</li>
<li>
<p><strong>Tools</strong>: qqman (R), Matplotlib (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use alternating colors for chromosomes and highlight significant SNPs for clarity.</p><h4><strong>5. Circular Plots (Circos): Visualizing Genomic Relationships</strong></h4><p>Circular plots are ideal for visualizing relationships across the genome, such as structural variations, gene duplications, or synteny.</p><ul>
<li>
<p><strong>Applications</strong>: Comparative genomics, structural variation studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Circos (standalone), RCircos (R), pyCircos (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Keep the plot clean and avoid overcrowding to maintain readability.</p><h4><strong>6. Sankey Diagrams: Tracking Data Flows</strong></h4><p>Sankey diagrams visualize flows or relationships between categories, often used to track changes in gene expression or pathway enrichment across conditions.</p><ul>
<li>
<p><strong>Applications</strong>: Pathway analysis, gene set enrichment analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: Plotly (Python), networkD3 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Use gradients or distinct colors to highlight key transitions.</p><h4><strong>7. Network Graphs: Mapping Interactions</strong></h4><p>Network graphs represent relationships between entities, such as protein-protein interactions or gene regulatory networks. Nodes represent entities, and edges represent relationships.</p><ul>
<li>
<p><strong>Applications</strong>: Systems biology, interactomics.</p>
</li>
<li>
<p><strong>Tools</strong>: Cytoscape (standalone), igraph (R), NetworkX (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use edge thickness or node size to represent interaction strength or centrality.</p><h4><strong>8. Violin Plots: Visualizing Data Distribution</strong></h4><p>Violin plots combine a boxplot with a density plot, showing the distribution and variability of data.</p><ul>
<li>
<p><strong>Applications</strong>: Single-cell RNA-seq, quantitative trait analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: Seaborn (Python), ggplot2 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Split violins by groups for side-by-side comparisons.</p><h4><strong>9. Time-Series Plots: Monitoring Changes Over Time</strong></h4><p>Time-series plots display changes in variables across time points, useful for tracking gene expression dynamics or metabolic fluxes.</p><ul>
<li>
<p><strong>Applications</strong>: Time-course experiments, cell cycle studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Matplotlib (Python), ggplot2 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Smooth the data to highlight trends while avoiding overfitting.</p><h4><strong>10. Genome Tracks: Visualizing Genomic Features</strong></h4><p>Genome tracks display multiple layers of genomic data, such as gene annotations, sequencing coverage, and epigenetic marks.</p><ul>
<li>
<p><strong>Applications</strong>: ChIP-seq, ATAC-seq, whole-genome sequencing.</p>
</li>
<li>
<p><strong>Tools</strong>: IGV (standalone), pyGenomeTracks (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Stack related tracks for direct comparisons.</p><h4><strong>11. UpSet Plots: Visualizing Set Intersections</strong></h4><p>UpSet plots are a powerful alternative to Venn diagrams for visualizing intersections between multiple datasets.</p><ul>
<li>
<p><strong>Applications</strong>: Overlap analysis for gene sets, pathways, or variants.</p>
</li>
<li>
<p><strong>Tools</strong>: UpSetR (R), ComplexUpset (R), UpSetPlot (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use bar plots to represent the size of each intersection for added clarity.</p><h4><strong>12. Ridge Plots: Comparing Distributions</strong></h4><p>Ridge plots visualize the distributions of multiple datasets, stacked for easy comparison.</p><ul>
<li>
<p><strong>Applications</strong>: Transcriptomics, single-cell RNA-seq.</p>
</li>
<li>
<p><strong>Tools</strong>: ggridges (R), Matplotlib (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use transparency and consistent scaling for better readability.</p><h4><strong>13. Chord Diagrams: Visualizing Connections Between Groups</strong></h4><p>Chord diagrams illustrate relationships between categories, such as shared genes between pathways or overlaps in regulatory elements.</p><ul>
<li>
<p><strong>Applications</strong>: Pathway overlap, synteny, co-expression networks.</p>
</li>
<li>
<p><strong>Tools</strong>: circlize (R), HoloViews (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use distinct colors for each group to emphasize relationships.</p><h4><strong>14. Treemaps: Hierarchical Data Representation</strong></h4><p>Treemaps visualize hierarchical data as nested rectangles, with area proportional to data size.</p><ul>
<li>
<p><strong>Applications</strong>: Ontology enrichment, pathway analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: treemapify (R), Plotly (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use colors to represent additional variables, like significance or enrichment scores.</p><h4><strong>15. t-SNE/UMAP Plots: Dimensionality Reduction for Clustering</strong></h4><p>t-SNE and UMAP plots embed high-dimensional data in two dimensions. Both emphasize local neighborhood structure, and UMAP often retains more of the global structure, though distances between well-separated clusters should still be interpreted with caution.</p><ul>
<li>
<p><strong>Applications</strong>: Single-cell transcriptomics, clustering analyses.</p>
</li>
<li>
<p><strong>Tools</strong>: scikit-learn (t-SNE) and umap-learn (UMAP) in Python, Seurat (R).</p>
</li>
</ul><p><strong>Tip</strong>: Combine with metadata annotations for better cluster interpretation.</p><h4><strong>Bringing It All Together</strong></h4><p>The choice of visualization can significantly impact the insights gained from bioinformatics data. By selecting plots tailored to your data type and analysis goals, you can effectively communicate your findings and make your research more impactful. Whether you&rsquo;re a seasoned bioinformatician or a beginner, mastering these visualizations will elevate your analyses and presentations.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>

</channel>
</rss>