BOL: Related items

J-Circos

Shruti Paniwala — Fri, 17 Feb 2017 09:06:54 -0600

J-Circos is an interactive visualization tool that can plot Circos figures, dynamically add data to a figure, and show information for specific data points through mouse-hover display and zoom in/out functions. J-Circos is written in Java, so it can be used on most operating systems (Windows, macOS, Linux). Users can input data into J-Circos using flat data formats as well as from the GUI. J-Circos will enable biologists to better study complex chromosomal interactions and fusion transcripts that are otherwise difficult to visualize from next-generation sequencing data.

Address of the bookmark: http://www.australianprostatecentre.org/research/software/jcircos

Nanopolish: polish a genome assembly

Rahul Nayak — Thu, 26 Jul 2018 04:51:28 -0500

Software package for signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more (see Nanopolish modules, below).

Quickstart

http://nanopolish.readthedocs.io/en/latest/quickstart_consensus.html

Algorithms

http://simpsonlab.github.io/2017/06/30/nanopolish-v0.7.0/

Address of the bookmark: https://github.com/jts/nanopolish

Krona

Jit — Wed, 22 Mar 2017 04:47:35 -0500

Krona allows hierarchical data to be explored with zooming, multi-layered pie charts. Krona charts can be created using an Excel template or KronaTools, which includes support for several bioinformatics tools and raw data formats. The interactive charts are self-contained and can be viewed with any modern web browser (see Browser support).
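As a rough sketch of the KronaTools route, the shell function below turns a plain list of lineages into the tab-delimited text that ktImportText reads (a count followed by the hierarchy levels, top level first). The function name, the input file name and the semicolon-separated lineage format are assumptions made for illustration, not something Krona prescribes.

```shell
# Hypothetical helper: convert one lineage per line, e.g.
#   Bacteria;Firmicutes;Bacilli
# into Krona text input: count <TAB> level1 <TAB> level2 ...
lineages_to_krona() {
    # Skip empty lines, then count identical lineages.
    grep -v '^$' "$1" | sort | uniq -c |
    awk '{ count = $1
           sub(/^[ \t]*[0-9]+[ \t]+/, "")   # drop the count column added by uniq -c
           gsub(/;/, "\t")                    # one tab-separated column per level
           print count "\t" $0 }'
}
```

With KronaTools installed, something like `lineages_to_krona lineages.txt > krona_input.txt` followed by `ktImportText krona_input.txt -o chart.html` should produce a self-contained chart.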

Address of the bookmark: https://github.com/marbl/Krona/wiki

CANU genome assembly parameters !

Rahul Nayak — Mon, 07 Jan 2019 08:40:37 -0600

Choose appropriate parameters and run Canu. The assembly will take about an hour. Use two cores (parameter -maxThreads=2) and disable the grid option with useGrid=false, since we compute on a single Amazon server rather than on a cluster. For your own project, discuss these settings with a local computing guru. In the usage below, parameters in square brackets [] are optional and the symbol | stands for "or".

usage:   canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
               -maxThreads=2 \
               useGrid=false \
              [other-options] \
               read_file.fastq.gz

A default Canu run usually produces a high-quality assembly; an example of a command that was used for testing is shown below. However, there are still many parameters that can be tweaked. For example, depending on whether we want to assemble haplotypes separately or smash them together, we can alter the error correction process.

canu -p test_asmbl \
     -d asm_test3 \
     genomeSize=2m \
     -maxThreads=2 useGrid=false \
     -pacbio-raw ~/pacbio/dna/sample_reads.fastq.gz

The Canu documentation has a brilliant section on parameter tweaking.

The output directory will contain many files. The most interesting ones are:

*.correctedReads.fasta.gz : file containing the input sequences after correction, trimmed and split based on consensus evidence
*.trimmedReads.fastq : file containing the sequences after correction and final trimming
*.layout : file containing information about read inclusion in the final assembly
*.gfa : file containing the assembly graph produced by Canu
*.contigs.fasta : file containing everything that could be assembled and is part of the primary assembly

Basic assembly statistics can be read from the reports generated by the assembler, or calculated using standard UNIX command-line tools.
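As one possible way to do the latter, the sketch below computes a few basic statistics from an uncompressed contigs FASTA file using only awk and sort. The function name assembly_stats and the output labels are made up for this example, and N50 is taken here as the length of the contig at which the running sum of lengths, longest first, reaches half of the total.

```shell
# Hypothetical helper: print contig count, total length, longest contig
# and N50 for an uncompressed FASTA file given as the first argument.
assembly_stats() {
    # First awk: emit one sequence length per record (handles multi-line FASTA).
    awk '/^>/ { if (seen) print len; len = 0; seen = 1; next }
         { len += length($0) }
         END { if (seen) print len }' "$1" |
    sort -nr |
    # Second awk: lengths arrive longest first; N50 is the length at which
    # the running sum first reaches half of the total.
    awk '{ l[NR] = $1; total += $1 }
         END {
             run = 0
             for (i = 1; i <= NR; i++) {
                 run += l[i]
                 if (run * 2 >= total) { n50 = l[i]; break }
             }
             printf "contigs\t%d\ntotal_bp\t%.0f\nlongest_bp\t%.0f\nN50_bp\t%.0f\n", NR, total, l[1], n50
         }'
}
```

For the test run above, `assembly_stats asm_test3/test_asmbl.contigs.fasta` would be the natural call, assuming Canu wrote its contigs under the prefix and directory given on the command line. Gzipped files would need to be decompressed first.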

More at https://canu.readthedocs.io/en/latest/faq.html

SSPACE

Jit — Fri, 05 May 2017 05:42:15 -0500

SSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or mate-pair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool in Perl. The input consists of pre-assembled contig sequences (FASTA) and NGS paired-read data (Illumina/454/SOLiD, FASTA or FASTQ). The final scaffolds are provided in FASTA format.

Address of the bookmark: https://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE

Bioinformatics Services / CRO Services

RASA Life Sciences — Wed, 06 Nov 2019 00:33:11 -0600

RASA is set to provide premium technical and scientific services in the form of solutions, product development and training. We are also very proficient in providing high-quality research and development services in the life science informatics field, such as Next Generation Sequencing (NGS) data analysis, computational drug discovery, bioinformatics, chemo-informatics and BIO-IT.

RASA offers faster, better and cost-effective cutting-edge technology solutions to chemical and life science research and industry. We provide our customers with a seamless model of wide expertise and comprehensive platforms. Our Value is to take our customers

A Post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs

Abhimanyu Singh — Fri, 12 May 2017 10:50:29 -0500

PAGIT addresses the need for software to generate high quality draft genomes. It is based on a series of programs that we developed:

ABACAS, that is able to contiguate contigs from a de novo assembly against a closely related reference.

IMAGE, an iterative approach for closing gaps in assembled genomes using mate pair information. It is able to close gaps left open by the assembler in a draft genome, even when using the same data sets as used by the original assembler.

iCORN, that enables errors in the consensus sequence to be corrected by iteratively mapping reads to the current assembly. An improved version, especially for correcting Pacific Biosciences (PacBio) assemblies, can be found here.

RATT, a tool to transfer the annotation from a reference genome, or an earlier assembly, onto the latest assembly.

PAGIT bundles these programs and makes them more accessible to users.

Address of the bookmark: http://www.sanger.ac.uk/science/tools/pagit

LACHESIS: Genome Assembly with Hi-C-based Contact Probability Maps (LACHESIS)

Jit — Mon, 14 May 2018 04:26:30 -0500

LACHESIS is a method that exploits contact probability map data (e.g. from Hi-C) for chromosome-scale de novo genome assembly.

Further information about LACHESIS, including source code, documentation and a user's guide, is available at: http://shendurelab.github.io/LACHESIS.

The manuscript describing LACHESIS was published as: Burton JN#, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J#. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nature Biotechnology 2013 Dec;31(12):1119-25. doi: 10.1038/nbt.2727. PubMed PMID: 24185095.

Address of the bookmark: http://shendurelab.github.io/LACHESIS/

Ranbow: a haplotype assembler for polyploid genomes

Jit — Fri, 01 Jun 2018 07:21:54 -0500

Ranbow is a haplotype assembler for polyploid genomes. It has been developed for the haplotype assembly of the hexaploid sweet potato genome, which is highly heterozygous. Ranbow can also be applied to other polyploid genomes. After a first phasing, Ranbow utilizes the assembled haplotypes to improve the accuracy of variant calling results and to infer the evolutionary history of the organism's genome. Ranbow has three main modes of function:

ranbow hap: for haplotyping
ranbow eval: for evaluating the assembled haplotypes against gold-standard (long) reads
ranbow phylo: for phylogenetic analysis

Address of the bookmark: https://www.molgen.mpg.de/ranbow

Phased Human Genome Assembly !

Rahul Nayak — Mon, 08 Oct 2018 09:10:54 -0500

The new publicly available assembly (PacBio HG00733) has the fewest gaps of any human genome assembly, with more than half of the genome contained in gapless sequence at least 27 Mb long. The primary contig assembly is 2.89 Gb long and consists of 865 contigs that were assembled with PacBio data generated with the company’s Sequel® System. Using the FALCON-Unzip assembler, maternal and paternal haplotypes were resolved over more than 80% of the genome. Maternal and paternal haplotype blocks were then further phased using Hi-C technology and the FALCON-Phase method developed in collaboration with Phase Genomics. The genome was then de novo scaffolded using Phase Genomics’ Proximo Hi-C platform, resulting in the first chromosome-scale diploid assembly of a single individual accomplished with only two technologies. More specific details about the assembly are included on the PacBio blog.

The data are available under the following NCBI accession IDs: BioProject (PRJNA483067), assembly (RBJD00000000) and sequence data (SRP155659).

Additional Resources

Interactive map showcasing global initiatives underway to generate reference-quality human genome assemblies for diverse populations
BioReport Podcast on the value of ethnic-specific reference genomes
Nature Reviews Genetics paper from NHGRI: Prioritizing diversity in human genomics research
Article in The Journal of Precision Medicine: “Minority Report – Ethnic Diversity and the Real Promise for Precision Medicine”
Article in Bio-IT World: “Genomic Data Standards Are a Necessity”
NHGRI Project Award: High Quality Human and Non-Human Primate Genome Assemblies

More details are available on the PacBio website:

Blog post: Data Release: Highest-Quality, Most Contiguous Individual Human Genome Assembly to Date
Blog post: For Reference-Grade Human Genome Assemblies, SMRT Sequencing Yields Optimal Results
Webinar: Assembling High-Quality Human Reference Genomes for Global Populations
FALCON-Phase press release and article preprint
PacBio research focus webpage about Human Population Genetics

Ref: https://stockguru.com/2018/10/08/pacific-biosciences-releases-highest-quality-most-contiguous-individual-human-genome-assembly-to-date/