github.com - YASRA (Yet Another Short Read Assembler) performs comparative assembly of short reads using a reference genome, which can differ substantially from the genome being sequenced. Mapping reads to reference genomes makes use of LASTZ (Harris et al), a...
github.com - This code is designed to enable anyone to reproduce the Hs2-HiC and the AaegL4 genomes reported in: Dudchenko et al., De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science, 2017.
Unless otherwise...
http://www.htslib.org/ - Samtools is a suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories:
SamtoolsReading/writing/editing/indexing/viewing SAM/BAM/CRAM formatBCFtoolsReading/writing BCF2/VCF/gVCF files and...
github.com - ALPACA requires Celera Assembler 8.3 or later. It is recommended to build Celera Assembler from source. (Why? The pre-built binaries CA_8.3rc1 and CA8.3rc2 will work for any large data set.
Detail paper...
github.com - PERGA - Paired End Reads Guided Assembler
PERGA is a novel sequence reads guided de novo assembly approach which adopts greedy-like prediction strategy for assembling reads to contigs and scaffolds. Instead of using single-end reads to construct...
www.atgc-montpellier.fr - LoRDEC is a program to correct sequencing errors in long reads from 3rd generation sequencing with high error rate, and is especially intended for PacBio reads. It uses a hybrid strategy, meaning that it uses two sets of reads: the reference read...
denbi-metagenomics-workshop.readthedocs.io - Welcome to the one-day metagenomics assembly workshop. This tutorial will guide you through the typical steps of metagenome assembly and binning.
The Tutorial Data Set
FastQC Quality Control
Assembly
Velvet Assembly
MEGAHIT...
github.com - Here is the command to run the tool:
python finisherSC.py destinedFolder mummerPath
If you are running on server computer and would like to use multiple threads, then the following commands can generate 20 threads to run FinisherSC.
python...
en.wikipedia.org - FASTQ format is a text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. Both the sequence letter and quality score are each encoded with a...
We are two groups of scientists doing frontier research in quantitative biology and biomedicine. The Bienko group is interested in exploring the fundamental design principles controlling how DNA is packed in the eukaryotic nucleus and its relation...