<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: All site blogs]]></title>
	<link>https://bioinformaticsonline.com/blog/all?offset=90</link>
	<atom:link href="https://bioinformaticsonline.com/blog/all?offset=90" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>In this protocol, we describe a de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to reconstruct the original chromosomes from which the DNA originated. De novo genome assembly assumes no prior knowledge of the source DNA sequence length, structure or composition. In a genome sequencing experiment, the DNA of the target organism is broken up into millions of small fragments and read on a sequencing machine. Depending on the sequencing technology used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Illumina short-read sequencing usually produces reads of 36 - 150 bp. These reads can be either &ldquo;single ended&rdquo; or &ldquo;paired end.&rdquo;</p><p><strong>Why genome assembly?</strong><br />Determining the DNA sequence of an organism is useful in basic research into why and how organisms live, as well as in applied fields. Because DNA is central to living things, knowledge of a DNA sequence can be useful in virtually any biological research. For example, in medicine it can be used to identify, diagnose and eventually develop treatments for genetic disorders. Similarly, the study of pathogens can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be stored as plain sequence text in a FASTA file, or together with their per-base quality scores in a FASTQ file.&nbsp;FASTQ is the most common read file format since this is what the Illumina sequencing pipeline produces, so it is the format assumed in the rest of this protocol.</p><p><strong>The protocol in a nutshell:</strong> <br />Obtain the sequence read file(s) from the sequencing machine(s). <br />Look at the reads to get an idea of what you have and what the quality is like. 
<br />If required, clean up and quality-trim the raw data. <br />Choose a suitable parameter set for the assembly. <br />Assemble the data into contigs/scaffolds. <br />Examine the assembly output and assess its quality.</p><p><strong>Read Quality Control:</strong><br />Check the quality with FastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This step trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />This part of the protocol describes how to assemble the quality-trimmed reads into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A wide range of short-read assemblers is available, each with its own strengths and weaknesses. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAP-denovo<br />MIRA<br />ALLPATHS</p><p>The next step is to assess the quality of the draft contigs and decide what to do with them in the remainder of the study.&nbsp;A few things to note about the contigs you just created:&nbsp;They are draft contigs. 
They may contain mis-assemblies.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />Gap filler - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequently used commands for the analysis are at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42329/10-ngs-services-companies-around-the-globe</guid>
	<pubDate>Sun, 22 Nov 2020 23:56:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42329/10-ngs-services-companies-around-the-globe</link>
	<title><![CDATA[10 NGS services companies around the globe !]]></title>
	<description><![CDATA[<p><strong>The global&nbsp;NGS services market&nbsp;is expected to reach USD 13.1 billion by 2025.&nbsp;</strong>Here are the&nbsp;<strong style="font-size: 12.8px;">top 10 NGS services companies to look for &ndash;</strong></p><p><strong>1.&nbsp;<a href="https://www.illumina.com/">Illumina, Inc. (U.S.)</a></strong></p><p>Illumina, Inc. was founded in 1998 and is headquartered in San Diego, U.S. Illumina, Inc. is one of the leading players in DNA sequencing and array-based technologies, serving customers in the research, clinical, and applied markets. The company offers products for applications in the life sciences, oncology, reproductive health, agriculture, and other emerging segments. The company serves government laboratories, genomic research centers, academic institutions as well as pharmaceutical, biotechnology, agrigenomics, commercial molecular diagnostics laboratories and consumer genomics companies. Illumina, Inc. has a geographic presence in North America, Europe, Latin America, Asia-Pacific, and others.</p><p><strong>2.&nbsp;<a href="https://www.qiagen.com/us/">QIAGEN N.V. (Netherlands)</a></strong></p><p>QIAGEN N.V. was incorporated in 1986 and is headquartered in Venlo, The Netherlands. The company is engaged in providing Sample to Insight solutions that transform biological samples into molecular insights. QIAGEN provides its workflow to customers in molecular diagnostics, assay technologies, bioservices and automation systems. The company&rsquo;s genome services are suitable for custom/tailored projects that allow access to genomic sequence information. The company markets its products in more than 100 countries across the Americas, Europe, Asia, Australia, and the Middle East &amp; Africa through its subsidiaries and channel partners.</p><p><strong>3.&nbsp;<a href="https://www.perkinelmer.com/">PerkinElmer, Inc. (U.S.)</a></strong></p><p>PerkinElmer, Inc. 
was founded in 1947 and is headquartered in Waltham, Massachusetts, U.S. PerkinElmer, Inc. offers products, services and solutions for the diagnostics, food, environmental, industrial, life sciences research and laboratory services markets. The company offers comprehensive genetic testing solutions that help to provide insight into the complex nature of rare and inherited diseases. Some of the subsidiaries of the company are Caliper Life Sciences, Improvision, Viacell Inc., ViaCord LLC, among many others. The company has facilities located in Europe (France, Germany, and Belgium), the U.S. and Asia (China, India, and Japan).</p><p><strong>4.&nbsp;<a href="https://www.eurofins.com/">Eurofins Scientific SE (Luxembourg)</a></strong></p><p>Eurofins Scientific SE was founded in 1987 and is headquartered in Luxembourg, Europe. The company offers a portfolio of over 130,000 analytical methods and performs more than 150 million assays each year to establish the safety, identity, composition, authenticity, origin, traceability, and purity of biological substances and products, as well as to carry out human diagnostic services. The company has a geographic presence across 39 countries in Europe, North and South America, and Asia-Pacific.</p><p><strong>5.&nbsp;<a href="https://www.gatc-biotech.com/en/index.html">GATC Biotech AG (Germany)</a></strong></p><p>GATC Biotech AG was founded in 1990 and is headquartered in Constance, Germany. The company provides DNA and RNA sequencing and bioservices solutions to academic and industrial customers. It also provides next generation sequencing services including genomes, targeted (re)-sequencing, human sample sequencing, transcriptomes, metagenomes, regulomes, pre-sequencing, NGS barcode labels, and next generation sequencing technologies; and bioservices, including bioservices tools, pipelines and workflows, compute resources, data analysis reports, and case studies. 
GATC Biotech AG operates as a subsidiary of Eurofins Scientific SE. It offers its products through distributors in Italy, Japan, Portugal, Spain, and the Czech Republic.</p><p><strong>6.<a href="https://www.macrogen.com/">&nbsp;Macrogen, Inc. (South Korea)</a></strong></p><p>Macrogen, Inc. was founded in 1997 and is headquartered in Seoul, South Korea. Macrogen, Inc. provides next generation sequencing services such as whole genome, de novo, exome, targeted, transcriptomics, metagenome, and epigenome sequencing. The company also provides a variety of services such as oligo synthesis, database construction, genome research, and bioservices analysis system consulting services. Macrogen, Inc. provides genome research services in Korea and internationally.</p><p><strong>7.&nbsp;<a href="https://www.genotypic.co.in/">Genotypic Technology Pvt. Ltd. (India)</a></strong></p><p>Genotypic Technology Pvt. Ltd. was incorporated in 1998 and is headquartered in Bangalore, India. Genotypic Technology is the first genomics service provider in India, providing microarray, Next Generation Sequencing (NGS), and bioservices solutions to domestic and international pharma and biotech companies and academia. The company provides services ranging from protocol optimization, probe design, array layouts, project design, and nucleic acid analysis to in-depth analysis. Genotypic Technology has a geographic presence in North America, Europe, Asia Pacific, Middle East &amp; Africa, and Latin America.</p><p><strong>8.&nbsp;<a href="https://www.genewiz.com/">GENEWIZ, Inc. (U.S.)</a></strong></p><p>GENEWIZ, Inc. was founded in 1999 and is headquartered in South Plainfield, New Jersey, U.S. The company is a leading provider of research services in the fields of Next Generation Sequencing, Sanger DNA sequencing, sequencing of bacteria and phage, gene synthesis, DNA cloning, genomics including mutation analysis, single nucleotide polymorphism, and bioservices. GENEWIZ, Inc. 
has a geographic presence in the U.S., China, Germany, France, Japan, and the U.K.</p><p><strong>9.&nbsp;<a href="https://www.genomics.cn/">Beijing Genomics Institute (China)</a></strong></p><p>Beijing Genomics Institute (BGI) is the world&rsquo;s largest genomics organization and non-profit research institution. It was founded in 1999 and is headquartered in Shenzhen, China. The company provides a wide range of commercial next generation sequencing services and genetic tests for medical institutions and for agricultural and environmental applications. The company operates across the globe through its subsidiaries, namely, BGI China (Mainland), BGI Asia Pacific, BGI Americas (North and South America) and BGI Europe (Europe and Africa).</p><p><strong>10.&nbsp;<a href="https://www.scigenom.com/">SciGenom Labs Pvt. Ltd (India)</a></strong></p><p>SciGenom Labs Pvt. Ltd was founded in 2010 and is headquartered in Cochin, India with offices in Chennai &amp; Hyderabad in India, and San Francisco in the U.S. It is a genomics R&amp;D services company that provides genomic sequencing and NGS services to life sciences and healthcare businesses globally as well as to academic and government institutions in India.</p><p>Popular mentions &ndash; MedGenome (India), DNA Link, Inc. (South Korea), Otogenetics Corporation (U.S.), Novogene Corporation (China), LGC Limited (U.K.), CD Genomics (U.S.), SeqLL, LLC (U.S.)</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42188/tools-and-method-for-haplotype-phasing</guid>
	<pubDate>Fri, 04 Sep 2020 20:41:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42188/tools-and-method-for-haplotype-phasing</link>
	<title><![CDATA[Tools and Method for Haplotype phasing !]]></title>
	<description><![CDATA[<div>Recent technological advances are producing huge amounts of genotype data, both from increasingly comprehensive and inexpensive genome-wide SNP microarrays and from ever more accessible whole-genome and whole-exome sequencing methods. The vast amount of information contained in these data, however, is best exploited through phased haplotypes, which identify the alleles that are co-located on the same chromosome. Since sequence and SNP array data normally take the form of unphased genotypes, one does not directly observe which of the two parental chromosomes, or haplotypes, a particular allele falls on. Fortunately, recent advances in both computational and laboratory methods promise improved determination of haplotype phase. The following tools are useful:</div><div>&nbsp;</div><p><strong>Arlequin:</strong>&nbsp;<a href="http://cmpg.unibe.ch/software/arlequin3/" target="_blank">http://cmpg.unibe.ch/software/arlequin3/</a></p><p><strong>BEAGLE:</strong>&nbsp;<a href="http://faculty.washington.edu/browning/beagle/beagle.html" target="_blank">http://faculty.washington.edu/browning/beagle/beagle.html</a></p><p><strong>fastPHASE:</strong>&nbsp;<a href="http://stephenslab.uchicago.edu/software.html" target="_blank">http://stephenslab.uchicago.edu/software.html</a></p><p><strong>GENEHUNTER:</strong>&nbsp;<a href="http://linkage.rockefeller.edu/soft/gh/" target="_blank">http://linkage.rockefeller.edu/soft/gh/</a></p><p><strong>The Genome Analysis Toolkit:</strong></p><p><a href="http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit" target="_blank">http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit</a></p><p><strong>IMPUTE2:</strong>&nbsp;<a href="https://mathgen.stats.ox.ac.uk/impute/impute_v2.html" target="_blank">https://mathgen.stats.ox.ac.uk/impute/impute_v2.html</a></p><p><strong>MACH:</strong>&nbsp;<a href="http://www.sph.umich.edu/csg/abecasis/MACH/" 
target="_blank">http://www.sph.umich.edu/csg/abecasis/MACH/</a></p><p><strong>MERLIN:</strong>&nbsp;<a href="http://www.sph.umich.edu/csg/abecasis/Merlin/" target="_blank">http://www.sph.umich.edu/csg/abecasis/Merlin/</a></p><p><strong>PHASE:</strong>&nbsp;<a href="http://stephenslab.uchicago.edu/software.html" target="_blank">http://stephenslab.uchicago.edu/software.html</a></p><p><strong>PL-EM:</strong>&nbsp;<a href="http://www.people.fas.harvard.edu/~junliu/plem/" target="_blank">http://www.people.fas.harvard.edu/~junliu/plem/</a></p><p><strong>&ldquo;Read-backed phasing&rdquo; algorithm</strong>:&nbsp;<a href="http://www.broadinstitute.org/gsa/wiki/index.php/Read-backed_phasing_algorithm" target="_blank">http://www.broadinstitute.org/gsa/wiki/index.php/Read-backed_phasing_algorithm</a></p><p><strong>SHAPE-IT:</strong>&nbsp;<a href="http://www.griv.org/shapeit/" target="_blank">http://www.griv.org/shapeit/</a></p>]]></description>
	<dc:creator>Manisha Mishra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42166/software-for-genome-assembly</guid>
	<pubDate>Sun, 30 Aug 2020 09:51:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42166/software-for-genome-assembly</link>
	<title><![CDATA[Software for genome assembly !]]></title>
	<description><![CDATA[<p>List of bioinformatics tools and software websites for genome assembly:</p><p>1 Falcon&nbsp;https://github.com/PacificBiosciences/pb-assembly</p><p>2 Canu assembler http://canu.readthedocs.io/en/latest/index.html</p><p>3 Miniasm assembler https://github.com/lh3/miniasm</p><p>4 PBJelly scaffolding tool https://sourceforge.net/projects/pb-jelly/</p><p>5 ARCS scaffolding tool https://github.com/bcgsc/arcs</p><p>6 Redundans reduction and scaffolding tool https://github.com/Gabaldonlab/redundans</p><p>7 Arrow error correction https://github.com/PacificBiosciences/GenomicConsensus</p><p>8 PILON error correction https://github.com/broadinstitute/pilon/wiki</p><p>9 BUSCO single copy gene markers http://busco.ezlab.org/</p><p>10 Bandage graph assembly viewer https://rrwick.github.io/Bandage/</p><p>11 Gepard dotter http://cube.univie.ac.at/gepard</p><p>12 MUMmer aligner and plotter http://mummer.sourceforge.net/</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42003/perl-one-liner-for-beginners</guid>
	<pubDate>Fri, 24 Jul 2020 05:58:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42003/perl-one-liner-for-beginners</link>
	<title><![CDATA[Perl one-liner for beginners !]]></title>
	<description><![CDATA[<p>I often use the following arguments to perl:</p><ul>
<li>-e Executes the code given on the command line instead of reading a script file</li>
<li>-n Wraps your code in a loop over the input lines, read through the diamond operator (files named on the command line, or stdin). Nothing is printed automatically</li>
<li>-p Like -n, but also prints $_ at the end of each iteration</li>
</ul><p>&nbsp;</p><ul>
<li>This counts the number of quotation marks in each line and prints it
<div>
<blockquote>
<div>perl -ne&nbsp;'$cnt = tr/"//;print "$cnt\n"'&nbsp;inputFileName.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Prepend a string, followed by a tab, to each line
<div>
<blockquote>
<div>perl -pe&nbsp;'s/(.*)/string\t$1/'&nbsp;inFile &gt; outFile</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Append a new line to each line
<div>
<blockquote>
<div>perl -pe&nbsp;'s/$/\n/'&nbsp;all.sent.classOnly &gt; all.sent.classOnly.sep</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Replace all occurrences of pattern1 (e.g. [0-9]) with pattern2, editing the file in place and keeping a backup as inputFile.bak
<div>
<blockquote>
<div>perl -p -i.bak -w -e&nbsp;'s/pattern1/pattern2/g'&nbsp;inputFile</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Go through a file and print only the lines (e.g. one word per line) that do not contain any uppercase letters.
<div>
<blockquote>
<div>perl -ne&nbsp;'print unless m/[A-Z]/'&nbsp;allWords.txt &gt; allWordsOnlyLowercase.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Go through file, split line at each space and print words one per line.
<div>
<blockquote>
<div>perl -lne&nbsp;'print join("\n", split(/ /, $_))'&nbsp;someText.txt &gt; wordsPerLine.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Delete every character that is not a letter, white space or line end (replace with nothing)
<div>
<blockquote>
<div>perl -pe&nbsp;'s/[^a-zA-Z\s]+//g'&nbsp;text_withSpecial.txt &gt; text_lettersOnly.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Convert all uppercase letters to lowercase
<div>
<blockquote>
<div>perl -pe&nbsp;'tr/A-Z/a-z/'&nbsp;textWithUpperCase.txt &gt; textwithoutuppercase.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>Print only the second column of the data when using tab as a separator
<div>
<blockquote>
<div>perl -lne&nbsp;'@F = split("\t", $_); print $F[1];'&nbsp;columnFileWithTabs.txt &gt; justSecondColumn.txt</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>
<div>One-Liner: Sort lines by their length
<blockquote>
<div>perl -e&nbsp;'print sort {length $a &lt;=&gt; length $b} &lt;&gt;'&nbsp;textFile</div>
</blockquote>
</div>
</li>
</ul><ul>
<li>One-Liner: Print second column, unless it contains a number
<blockquote>
<div>perl"&gt;perl -lane&nbsp;'print $F[1] unless $F[1] =~ m/[0-9]/'&nbsp;wordCounts.txt</div>
</blockquote>
</li>
</ul>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/41804/useful-links-to-therapy-disease-drug-and-drug-target-network-data</guid>
	<pubDate>Mon, 01 Jun 2020 11:47:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/41804/useful-links-to-therapy-disease-drug-and-drug-target-network-data</link>
	<title><![CDATA[Useful links to therapy, disease, drug and drug-target network data:]]></title>
	<description><![CDATA[<p>Useful links to therapy, disease, drug and drug-target network data:</p><p><strong>DrugBank:</strong></p><p>a bioinformatics-cheminformatics resource combining detailed drug data with comprehensive drug target information, with &gt;4900 drug entries (~3500 experimental) and &gt;1500 non-redundant protein entries http://www.drugbank.ca/</p><p><strong>Drug-Target Network:</strong></p><p>network data of 890 drugs and 394 target human proteins http://www.nature.com/nbt/journal/v25/n10/suppinfo/nbt1338_S1.html</p><p><strong>Drug-Therapy Network:</strong></p><p>three layers of drug-therapy networks according to the ATC classification http://www.biomedcentral.com/1471-2210/8/5/additional/</p><p><strong>FDA Orange Book:</strong></p><p>approved drug products with therapeutic equivalence evaluations http://www.fda.gov/cder/ob/</p><p><strong>IDdb:</strong></p><p>Thomson Investigational drugs database including information on 107000 patents, 25000 investigational drugs and 80000 chemical structures http://scientific.thomson.com/products/iddb/</p><p><strong>OMIM:</strong></p><p>a knowledgebase of human genes and genetic disorders http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim</p><p><strong>PDTD:</strong></p><p>3D drug target structure database with a target identification option http://www.dddc.ac.cn/pdtd/</p><p><strong>Predicted drug targets:</strong></p><p>a set of 1383 predicted drug targets http://www.biomedcentral.com/1471-2105/8/353/additional/</p><p><strong>Protein ligand network:</strong></p><p>a network of 4208 ligands and ~15000 binding sites http://pbil.kaist.ac.kr/~parkkw/Lnet/</p><p><strong>TDR Targets Database:</strong></p><p>identification and ranking of targets against neglected tropical diseases http://tdrtargets.org/</p><p><strong>Therapeutic Target Database:</strong></p><p>lists &gt;1500 therapeutic targets, disease conditions and corresponding drugs http://xin.cz3.nus.edu.sg/group/cjttd/ttd.asp</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/41496/new-machine-learning-packages-in-r</guid>
	<pubDate>Fri, 27 Mar 2020 12:11:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/41496/new-machine-learning-packages-in-r</link>
	<title><![CDATA[New Machine Learning Packages in R]]></title>
	<description><![CDATA[<h3 id="machine-learning">Machine Learning</h3><p><a href="https://cran.r-project.org/package=autokeras">autokeras</a>&nbsp;v1.0.1: Implements an interface to&nbsp;<a href="https://autokeras.com/">AutoKeras</a>, an open source software library for automated machine learning. See&nbsp;<a href="https://cran.r-project.org/web/packages/autokeras/readme/README.html">README</a>&nbsp;for an example.</p><p><a href="https://cran.r-project.org/package=MTPS">MTPS</a>&nbsp;v0.1.9: Implements functions to predict simultaneous multiple outcomes based on revised stacking algorithms as described in&nbsp;<a href="https://doi.org/10.1093/bioinformatics/btz531">Xing et al. (2019)</a>. See the&nbsp;<a href="https://cran.r-project.org/web/packages/MTPS/vignettes/Guide.html">vignette</a>&nbsp;to get started.</p><p><a href="https://cran.r-project.org/package=quanteda.textmodels">quanteda.textmodels</a>&nbsp;v0.9.1: Implements methods for scaling models and classifiers based on sparse matrix objects representing textual data. It includes implementations of the&nbsp;<a href="https://doi.org/10.1017/S0003055403000698">Laver et al. (2003)</a>&nbsp;wordscores model, the&nbsp;<a href="https://arxiv.org/abs/1710.08963">Perry &amp; Benoit (2017)</a>&nbsp;class affinity scaling model, and the&nbsp;<a href="https://doi.org/10.1111/j.1540-5907.2008.00338.x">Slapin &amp; Proksch (2008)</a>&nbsp;wordfish model. See the&nbsp;<a href="https://cran.r-project.org/web/packages/quanteda.textmodels/vignettes/textmodel_performance.html">vignette</a>&nbsp;to get started.</p><p><a href="https://cran.r-project.org/package=SeqDetect">SeqDetect</a>&nbsp;v1.0.7: Implements the automaton model found in&nbsp;<a href="https://ieeexplore.ieee.org/document/8910574">Krleža, Vrdoljak &amp; Brčić (2019)</a>&nbsp;to detect and process sequences. 
See the&nbsp;<a href="https://cran.r-project.org/web/packages/SeqDetect/vignettes/SequentialDetector.pdf">vignette</a>&nbsp;for examples and theory.</p><p><a href="https://cran.r-project.org/package=studyStrap">studyStrap</a>&nbsp;v1.0.0: Implements multi-study learning algorithms such as Merging, Study-Specific Ensembling (Trained-on-Observed-Studies Ensemble), the Study Strap, and the Covariate-Matched Study Strap, and offers over 20 similarity measures. See&nbsp;<a href="https://doi.org/10.1101/856385">Kishida, et al. (2019)</a>&nbsp;for background and the&nbsp;<a href="https://cran.r-project.org/web/packages/studyStrap/vignettes/vignette.html">vignette</a>&nbsp;for how to use the package.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/41455/coronavirus-covid-%E2%80%9019-testing-sites-in-india</guid>
	<pubDate>Mon, 16 Mar 2020 16:13:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/41455/coronavirus-covid-%E2%80%9019-testing-sites-in-india</link>
	<title><![CDATA[Coronavirus COVID ‐19 Testing Sites In India]]></title>
	<description><![CDATA[<p>COVID-19 is a new illness that can affect your lungs and airways. It's caused by a virus called coronavirus.</p><h2>Stay at home if you have coronavirus symptoms</h2><p>Stay at home if you have either:</p><ul>
<li>a high temperature &ndash; you feel hot to touch on your chest or back</li>
<li>a new, continuous cough &ndash; this means you've started coughing repeatedly</li>
</ul><h2>DO NOT TAKE</h2><p><em>Ibuprofen</em></p><p><em>https://amp.theguardian.com/world/2020/mar/14/anti-inflammatory-drugs-may-aggravate-coronavirus-infection</em></p><h2>How to avoid catching and spreading coronavirus (social distancing)</h2><p>Everyone should do what they can to stop coronavirus spreading.</p><p>It is particularly important for people who:</p><ul>
<li>are 70 or over</li>
<li>have a long-term condition</li>
<li>are pregnant</li>
<li>have a weakened immune system</li>
</ul><p><img src="https://www.hindustantimes.com/rf/image_size_960x540/HT/p2/2020/03/16/Pictures/_c0c377e0-6789-11ea-8a5c-cb364e4c5304.png" alt="image" width="960" height="543" style="border: 0px; border: 0px;"></p><p><strong>Below are the 52 Coronavirus COVID-19 Testing sites/locations in India.</strong></p><p>State:&nbsp;Andhra Pradesh&nbsp; &nbsp; &nbsp; &nbsp;</p><ol>
<li>Sri Venkateswara Institute of Medical Sciences, Tirupati</li>
<li>Andhra Medical College, Visakhapatnam, Andhra Pradesh</li>
<li>GMC, Anantapur, AP</li>
</ol><p>State:&nbsp;Andaman &amp; Nicobar islands</p><ol>
<li>Regional Medical Research Centre, Port Blair, Andaman, and Nicobar</li>
</ol><p>State:&nbsp;Assam</p><ol>
<li>Gauhati Medical College, Guwahati</li>
<li>Regional Medical Research Center, Dibrugarh</li>
</ol><p>State:&nbsp;Bihar</p><ol>
<li>Rajendra Memorial Research Institute of Medical Sciences, Patna</li>
</ol><p>State: Chandigarh</p><ol>
<li>Post Graduate Institute of Medical Education &amp; Research, Chandigarh</li>
</ol><p>State: Chhattisgarh</p><ol>
<li>All India Institute of Medical Sciences, Raipur</li>
</ol><p>Union Territory: Delhi-NCT&nbsp;</p><ol>
<li>All India Institute of Medical Sciences, Delhi</li>
<li>National Centre for Disease Control, Delhi</li>
</ol><p>State: Gujarat</p><ol>
<li>BJ Medical College, Ahmedabad</li>
<li>M.P.Shah Government Medical College, Jamnagar</li>
</ol><p>State: Haryana</p><ol>
<li>Pt. B.D. Sharma Post Graduate Inst. of Med. Sciences, Rohtak, Haryana</li>
<li>BPS Govt Medical College, Sonipat</li>
</ol><p>State: Himachal Pradesh</p><ol>
<li>Indira Gandhi Medical College, Shimla, Himachal Pradesh</li>
<li>Dr.Rajendra Prasad Govt. Med. College, Kangra, Tanda, HP</li>
</ol><p>Union Territory: Jammu and Kashmir</p><ol>
<li>Sher‐e‐Kashmir Institute of Medical Sciences, Srinagar</li>
<li>Government Medical College, Jammu</li>
</ol><p>State: Jharkhand</p><ol>
<li>MGM Medical College, Jamshedpur</li>
</ol><p>State: Karnataka</p><ol>
<li>Bangalore Medical College &amp; Research Institute, Bangalore</li>
<li>National Institute of Virology Field Unit Bangalore</li>
<li>Mysore Medical College &amp; Research Institute, Mysore</li>
<li>Hassan Inst. of Med. Sciences, Hassan, Karnataka</li>
<li>Shimoga Inst. of Med. Sciences, Shivamogga, Karnataka</li>
</ol><p>State: Kerala</p><ol>
<li>National Institute of Virology Field Unit, Kerala</li>
<li>Govt. Medical College, Thiruvananthapuram, Kerala</li>
<li>Govt. Medical College, Kozhikode, Kerala</li>
</ol><p>State: Madhya Pradesh</p><ol>
<li>All India Institute of Medical Sciences, Bhopal</li>
<li>National Institute of Research in Tribal Health (NIRTH), Jabalpur</li>
</ol><p>State: Meghalaya</p><ol>
<li>NEIGRI of Health and Medical Sciences, Shillong, Meghalaya</li>
</ol><p>State: Maharashtra</p><ol>
<li>Indira Gandhi Government Medical College, Nagpur</li>
<li>Kasturba Hospital for Infectious Diseases, Mumbai</li>
</ol><p>State: Manipur</p><ol>
<li>J N Inst. of Med. Sciences Hospital, Imphal‐East, Manipur</li>
</ol><p>State: Odisha</p><ol>
<li>Regional Medical Research Center, Bhubaneswar</li>
</ol><p>Union Territory: Puducherry</p><ol>
<li>Jawaharlal Institute of Postgraduate Medical Education &amp; Research, Puducherry</li>
</ol><p>State: Punjab</p><ol>
<li>Government Medical College, Patiala, Punjab</li>
<li>Government Medical College, Amritsar</li>
</ol><p>State: Rajasthan</p><ol>
<li>Sawai Man Singh, Jaipur</li>
<li>Dr. S.N Medical College, Jodhpur</li>
<li>Jhalawar Medical College, Jhalawar, Rajasthan</li>
<li>SP Med. College, Bikaner, Rajasthan</li>
</ol><p>State: Tamil Nadu</p><ol>
<li>King Institute of Preventive Medicine &amp; Research, Chennai</li>
<li>Government Medical College, Theni</li>
</ol><p>State: Tripura</p><ol>
<li>Government Medical College, Agartala</li>
</ol><p>State: Telangana</p><ol>
<li>Gandhi Medical College, Secunderabad</li>
</ol><p>State: Uttar Pradesh</p><ol>
<li>King George&rsquo;s Medical University, Lucknow</li>
<li>Institute of Medical Sciences, Banaras Hindu University, Varanasi</li>
<li>Jawaharlal Nehru Medical College, Aligarh</li>
</ol><p>State: Uttarakhand</p><ol>
<li>Government Medical College, Haldwani</li>
</ol><p>State: West Bengal</p><ol>
<li>National Institute of Cholera and Enteric Diseases, Kolkata</li>
<li>IPGMER, Kolkata</li>
</ol>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/40953/explore-taxdump-files</guid>
	<pubDate>Sat, 08 Feb 2020 04:44:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/40953/explore-taxdump-files</link>
	<title><![CDATA[Explore taxdump files !]]></title>
	<description><![CDATA[<pre>This is an extract of taxdump-readme.txt to be found at 
ftp://ftp.ncbi.nih.gov/pub/taxonomy/

The content of the archive
--------------------------

It may look like this:

delnodes.dmp
division.dmp
gencode.dmp
merged.dmp
names.dmp
nodes.dmp
readme.txt

The readme.txt file gives a brief description of the *.dmp files. These
files contain taxonomic information and are briefly described below. Each
file stores one record per line, and each record is terminated by "\t|\n"
(tab, vertical bar, and newline) characters. Each record consists of one 
or more fields delimited by "\t|\t" (tab, vertical bar, and tab) characters.
A brief description of field position and meaning for each file follows.
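
A minimal Python sketch of the record layout described above, assuming the
separators are exactly "\t|\t" between fields and "\t|\n" at the end of
each record. The function name parse_dmp_line and the sample line are
illustrative and are not part of the NCBI readme.

```python
def parse_dmp_line(line):
    """Split one taxdump record into a list of stripped field strings."""
    # Drop the trailing newline, then the "\t|" that terminates the record.
    record = line.rstrip("\n")
    if record.endswith("\t|"):
        record = record[:-2]
    # Fields are separated by tab, vertical bar, tab.
    return [field.strip() for field in record.split("\t|\t")]


# Sample record in the names.dmp layout (for illustration only).
example = "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
print(parse_dmp_line(example))
```

Since all *.dmp files share this layout, the same function should apply to
each of them; only the meaning of the field positions differs.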

nodes.dmp
---------

This file represents taxonomy nodes. The description for each node includes 
the following fields:

	tax_id					-- node id in GenBank taxonomy database
 	parent tax_id				-- parent node id in GenBank taxonomy database
 	rank					-- rank of this node (superkingdom, kingdom, ...) 
 	embl code				-- locus-name prefix; not unique
 	division id				-- see division.dmp file
 	inherited div flag  (1 or 0)		-- 1 if node inherits division from parent
 	genetic code id				-- see gencode.dmp file
 	inherited GC  flag  (1 or 0)		-- 1 if node inherits genetic code from parent
 	mitochondrial genetic code id		-- see gencode.dmp file
 	inherited MGC flag  (1 or 0)		-- 1 if node inherits mitochondrial gencode from parent
 	GenBank hidden flag (1 or 0)            -- 1 if name is suppressed in GenBank entry lineage
 	hidden subtree root flag (1 or 0)       -- 1 if this subtree has no sequence data yet
 	comments				-- free-text comments and citations
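
As a hedged illustration of how the first three nodes.dmp fields can be
used, the Python sketch below builds tax_id-to-parent and tax_id-to-rank
lookups and walks a lineage up to the root. The names split_record,
load_nodes and lineage are hypothetical, the toy records use made-up ids
and are trimmed to three fields, and the sketch assumes the root node
lists itself as its own parent.

```python
import io


def split_record(line):
    """Split one taxdump record into stripped fields."""
    record = line.rstrip("\n")
    if record.endswith("\t|"):
        record = record[:-2]
    return [field.strip() for field in record.split("\t|\t")]


def load_nodes(handle):
    """Return (parent, rank) dicts keyed by integer tax_id."""
    parent, rank = {}, {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = split_record(line)
        tax_id = int(fields[0])
        parent[tax_id] = int(fields[1])  # field 2: parent tax_id
        rank[tax_id] = fields[2]         # field 3: rank
    return parent, rank


def lineage(tax_id, parent):
    """List tax_ids from the given node up to the root (assumed self-parented)."""
    path = [tax_id]
    while parent[tax_id] != tax_id:
        tax_id = parent[tax_id]
        path.append(tax_id)
    return path


# Toy records with made-up ids, trimmed to the first three fields.
TOY_NODES = (
    "1\t|\t1\t|\tno rank\t|\n"
    "10\t|\t1\t|\tsuperkingdom\t|\n"
    "100\t|\t10\t|\tgenus\t|\n"
)
parent, rank = load_nodes(io.StringIO(TOY_NODES))
print(lineage(100, parent))
print([rank[t] for t in lineage(100, parent)])
```

With the real file, an open file handle for nodes.dmp can be passed in
place of the io.StringIO object.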

names.dmp
---------
Taxonomy names file has these fields:

	tax_id					-- the id of node associated with this name
	name_txt				-- name itself
	unique name				-- the unique variant of this name if name not unique
	name class				-- (synonym, common name, ...)
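
Because one tax_id can appear on several names.dmp lines (one per name
class), a lookup usually keeps the "scientific name" entry as the primary
label. The Python sketch below shows one way to do that; load_names and
split_record are hypothetical helper names, the two sample rows are for
illustration, and the sketch assumes each tax_id has at most one row of
class "scientific name".

```python
import io
from collections import defaultdict


def split_record(line):
    """Split one taxdump record into stripped fields."""
    record = line.rstrip("\n")
    if record.endswith("\t|"):
        record = record[:-2]
    return [field.strip() for field in record.split("\t|\t")]


def load_names(handle):
    """Return (scientific, others).

    scientific maps tax_id to its scientific name; others maps tax_id to
    a list of (name class, name_txt) pairs for every remaining name.
    """
    scientific = {}
    others = defaultdict(list)
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        tax_id, name_txt, unique_name, name_class = split_record(line)[:4]
        if name_class == "scientific name":
            scientific[int(tax_id)] = name_txt
        else:
            others[int(tax_id)].append((name_class, name_txt))
    return scientific, others


# Sample rows in the four-field names.dmp layout (illustration only).
TOY_NAMES = (
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n"
)
scientific, others = load_names(io.StringIO(TOY_NAMES))
print(scientific[9606])
print(others[9606])
```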

division.dmp
------------
Divisions file has these fields:
	division id				-- taxonomy database division id
	division cde				-- GenBank division code (three characters)
	division name				-- e.g. BCT, PLN, VRT, MAM, PRI...
	comments

gencode.dmp
-----------
Genetic codes file:

	genetic code id				-- GenBank genetic code id
	abbreviation				-- genetic code name abbreviation
	name					-- genetic code name
	cde					-- translation table for this genetic code
	starts					-- start codons for this genetic code

delnodes.dmp
------------
Deleted nodes (nodes that existed but were deleted) file field:

	tax_id					-- deleted node id

merged.dmp
----------
Merged nodes file fields:

	old_tax_id                              -- id of the node which has been merged
	new_tax_id                              -- id of the node which is the result of merging
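
One possible use of delnodes.dmp and merged.dmp together is to bring an
older tax_id up to date before looking it up in nodes.dmp. The Python
sketch below is only an illustration under that assumption: the function
names are hypothetical and the sample records use made-up ids.

```python
def split_record(line):
    """Split one taxdump record into stripped fields."""
    record = line.rstrip("\n")
    if record.endswith("\t|"):
        record = record[:-2]
    return [field.strip() for field in record.split("\t|\t")]


def load_deleted(lines):
    """Return the set of deleted tax_ids from delnodes.dmp records."""
    return {int(split_record(line)[0]) for line in lines if line.strip()}


def load_merged(lines):
    """Return a dict mapping old_tax_id to new_tax_id from merged.dmp records."""
    merged = {}
    for line in lines:
        if line.strip():
            old_id, new_id = split_record(line)[:2]
            merged[int(old_id)] = int(new_id)
    return merged


def resolve(tax_id, merged, deleted):
    """Return the current tax_id, or None if the node was deleted."""
    if tax_id in deleted:
        return None
    return merged.get(tax_id, tax_id)


# Made-up records for illustration.
deleted = load_deleted(["7\t|\n"])
merged = load_merged(["12\t|\t34\t|\n"])
print(resolve(12, merged, deleted))
print(resolve(7, merged, deleted))
print(resolve(99, merged, deleted))
```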

</pre>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/40768/linux-advantages</guid>
	<pubDate>Thu, 30 Jan 2020 06:27:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/40768/linux-advantages</link>
	<title><![CDATA[Linux advantages]]></title>
	<description><![CDATA[<p>https://www.forbes.com/sites/jasonevangelho/2018/07/30/ditching-windows-heres-how-ubuntu-updates-your-pc-and-why-its-better/#7aa6fa5f7c23</p><p>https://www.forbes.com/sites/jasonevangelho/2018/07/23/5-reasons-you-should-switch-from-windows-to-linux-right-now/#70c74923777b</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>