Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data
AfterQC can simply go through all fastq files in a folder and then output three folders: good, bad and QC, which contain good...
Unicycler is an assembly pipeline for bacterial genomes. It can assemble Illumina-only read sets, where it functions as a SPAdes-optimiser. It can also assemble long-read-only sets (PacBio or Nanopore) where it runs...
Illumina sequencing data can provide high coverage of a genome with relatively short reads (most often 100 bp to 150 bp) at a low cost. Even with a low (advertised 1%) error rate, 100× coverage Illumina data on average has an error in some read...
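The arithmetic behind that statement can be checked with a short sketch. It assumes that errors are independent between reads and that every position is covered by exactly the stated number of reads. Both are simplifications, and the function names are illustrative only.

```python
def expected_errors_per_position(coverage: float, error_rate: float) -> float:
    """Expected number of erroneous base calls stacked on one genome position."""
    return coverage * error_rate


def prob_at_least_one_error(coverage: int, error_rate: float) -> float:
    """Probability that at least one covering read is wrong at a given position.

    Assumes independent errors and exactly `coverage` reads over the position.
    """
    return 1.0 - (1.0 - error_rate) ** coverage


if __name__ == "__main__":
    # 100x coverage at the advertised 1% per-base error rate
    print(expected_errors_per_position(100, 0.01))
    print(prob_at_least_one_error(100, 0.01))
```

Under these assumptions, the expected number of erroneous calls per position is about one, and roughly 63% of positions carry at least one erroneous read.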
One interesting approach that gave good results in other genome projects is to run several cycles of SSPACE / GapCloser / REAPR (to detect misassemblies and break them). Usually after 4-6 cycles this converges on an optimal assembly and the statistics...
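The control flow of such a cycle can be sketched as a loop that stops once a round no longer improves the assembly. This is only a sketch: `refine_once` and `score` are hypothetical callables standing in for one scaffold / gap-close / break round and for whichever assembly statistic is being tracked. No real command lines for SSPACE, GapCloser or REAPR are shown or implied.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")


def iterate_until_converged(
    assembly: A,
    refine_once: Callable[[A], A],   # hypothetical: one full refinement round
    score: Callable[[A], float],     # hypothetical: higher means a better assembly
    max_cycles: int = 6,
    min_gain: float = 0.0,
) -> Tuple[A, int]:
    """Repeat refinement until the score stops improving or max_cycles is reached.

    Returns the best assembly seen and the number of rounds that were accepted.
    """
    best = assembly
    best_score = score(best)
    accepted = 0
    for _ in range(max_cycles):
        candidate = refine_once(best)
        candidate_score = score(candidate)
        if candidate_score - best_score <= min_gain:
            break  # no further improvement: treat as converged
        best, best_score = candidate, candidate_score
        accepted += 1
    return best, accepted


if __name__ == "__main__":
    # Toy stand-in: the "assembly" is an integer that cannot improve past 3.
    print(iterate_until_converged(0, lambda x: min(x + 1, 3), lambda x: float(x)))
```

Keeping the previous best and discarding a round that does not improve the score means a bad round cannot degrade the result.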
kallisto
Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)
Software (C++)
https://pachterlab.github.io/kallisto/
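The idea of read compatibility behind pseudoalignment can be illustrated with a toy sketch: index every k-mer by the transcripts that contain it, then intersect those sets over the k-mers of a read. This is a deliberate simplification and not kallisto's implementation. The function names, the strict rule that an unseen k-mer makes the read incompatible, and the example sequences are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def build_kmer_index(transcripts: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of transcript names that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return dict(index)


def compatible_transcripts(read: str, index: Dict[str, Set[str]], k: int) -> Set[str]:
    """Intersect the transcript sets of all k-mers in the read.

    Toy rule (an assumption): a k-mer missing from the index makes the
    read compatible with no transcript.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            return set()
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()
    return compatible or set()


if __name__ == "__main__":
    # Made-up sequences sharing a common prefix
    transcripts = {"t1": "ACGTACGGA", "t2": "ACGTACCTT"}
    index = build_kmer_index(transcripts, k=4)
    print(compatible_transcripts("CGTACG", index, k=4))  # read reaching into the t1-specific part
    print(compatible_transcripts("ACGTAC", index, k=4))  # read inside the shared prefix
```

No base-level alignment is computed here: the result is only the set of transcripts a read could have come from, which is the information abundance estimation needs.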
Sailfish
Estimation of isoform abundances...
We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia...