Short-read assembly using SPAdes

If we had only Illumina reads, we could assemble them with the tool SPAdes.

You can try this here, or try it later on your own data.

Get data

We will use the same Illumina data as we used above:

  • illumina_R1.fastq.gz: the Illumina forward reads
  • illumina_R2.fastq.gz: the Illumina reverse reads
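
Before assembling, it can be worth confirming that both files hold the same number of reads, since a truncated download would break the pairing. The following is a minimal sketch, assuming standard four-line FASTQ records; the function names count_reads and check_pair are made up for this example.

```shell
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Compare the two files of a pair and report whether the counts agree.
check_pair() {
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    if [ "$r1" = "$r2" ]; then
        echo "OK: $r1 read pairs"
    else
        echo "MISMATCH: $1 has $r1 reads, $2 has $r2 reads"
        return 1
    fi
}

# Usage (file names from the list above):
#   check_pair illumina_R1.fastq.gz illumina_R2.fastq.gz
```

A count that is not a whole number suggests that a file was cut off in the middle of a record.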

Assemble

Run SPAdes:

spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
  • -1 is the input file of forward reads
  • -2 is the input file of reverse reads
  • --careful tries to reduce the number of mismatches and short indels
  • --cov-cutoff auto lets SPAdes compute the coverage cutoff itself (rather than the default setting, “off”)
  • -o is the output directory
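
An assembly can run for a long time, so it may help to catch a mistyped file name before starting. The sketch below prints the command shown above only if both input files exist. The helper name spades_cmd is hypothetical, and the sketch assumes file names without spaces.

```shell
# Hypothetical helper: print the SPAdes command for a read pair,
# or fail early if an input file is missing.
spades_cmd() {
    fwd=$1
    rev=$2
    out=$3
    for f in "$fwd" "$rev"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
    done
    # Same options as the command above; nothing is executed here.
    echo "spades.py -1 $fwd -2 $rev --careful --cov-cutoff auto -o $out"
}

# Usage: review the printed command, then run it yourself, e.g.
#   spades_cmd illumina_R1.fastq.gz illumina_R2.fastq.gz spades_assembly_all_illumina
```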

Results

Move into the output directory and look at the contigs:

infoseq contigs.fasta
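
infoseq lists the sequences one by one. For a one-line summary, the contig lengths can also be computed with standard tools. The sketch below reports the number of contigs, the total length, the longest contig and the N50, taken here as the length of the contig at which the running total of lengths, sorted longest first, reaches half of the assembly. The function name fasta_stats is made up for this example.

```shell
# Print contig count, total length, longest contig and N50 for a FASTA file.
fasta_stats() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }   # header: emit previous length
        { len += length($0) }                             # sequence line: add to length
        END { if (len > 0) print len }                    # emit the last record
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            if (NR == 0) { print "no sequences found"; exit }
            run = 0
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                if (run * 2 >= total) { n50 = lens[i]; break }
            }
            printf "contigs=%d total=%d longest=%d N50=%d\n", NR, total, lens[1], n50
        }
    '
}

# Usage, from inside the output directory:
#   fasta_stats contigs.fasta
```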

Comments

  • LEGE 651 days ago

    SPAdes (St. Petersburg genome assembler) is a popular tool for short-read assembly. It assembles Illumina and IonTorrent reads, and it can use PacBio and Oxford Nanopore long reads as additional input for hybrid assembly. Here is a brief overview of how to use SPAdes for short-read assembly:

    1. Install SPAdes: SPAdes can be downloaded from the official website (http://cab.spbu.ru/software/spades/). It runs on Linux and macOS; there is no native Windows version, so on Windows it has to be run inside a Linux environment (for example, a virtual machine).

    2. Prepare the input data: The input reads are usually in FASTQ format. For paired-end data, the forward and reverse reads are normally provided as two separate files.

    3. Run SPAdes: To run SPAdes, open the terminal and navigate to the directory where the input data is stored. Then, enter the following command:

    spades.py -1 <read1.fastq> -2 <read2.fastq> -o <output_dir>

    Replace <read1.fastq> and <read2.fastq> with the names of the input files containing the paired-end reads, and <output_dir> with the name of the output directory where the assembled contigs will be saved.

    4. Check the output: Once SPAdes has finished running, check the output directory for the assembled contigs. The contigs are saved in a file named contigs.fasta, a plain-text FASTA file that can be opened in a text editor or loaded into a genome browser.

    SPAdes also provides several options for tuning the assembly, such as setting the k-mer sizes (-k) or skipping the read error-correction step (--only-assembler). The SPAdes manual describes these options in detail.
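
As an illustration of the k-mer option: the SPAdes manual requires the values passed to -k to be odd, less than 128 and listed in ascending order. The sketch below checks a list against those three rules before it is used. The function name valid_kmers is hypothetical, and the k-mer values in the usage note are only an example, not a recommendation.

```shell
# Return success if a comma-separated k-mer list is odd, below 128 and ascending.
valid_kmers() {
    echo "$1" | awk -F, '
        {
            prev = 0
            if (NF == 0) bad = 1                      # empty list
            for (i = 1; i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) { bad = 1 }     # not a plain integer
                else if ($i % 2 == 0) { bad = 1 }     # even value
                else if ($i + 0 >= 128) { bad = 1 }   # too large
                else if ($i + 0 <= prev) { bad = 1 }  # not ascending
                prev = $i + 0
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Usage:
#   valid_kmers 21,33,55,77 && \
#       spades.py -k 21,33,55,77 -1 <read1.fastq> -2 <read2.fastq> -o <output_dir>
```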

  • BioStar 645 days ago

    Short-read assembly is the process of constructing a genome sequence from a large number of short sequencing reads. SPAdes (St. Petersburg genome assembler) is a popular software tool for short-read assembly. Here are the general steps for short-read assembly using SPAdes:

    1. Quality control of raw reads: Before running the assembly, check the quality of the raw reads and remove adapters and low-quality bases. FastQC can be used to inspect read quality, and Trimmomatic to trim the reads.

    2. Running SPAdes: After quality control, the next step is to run SPAdes. SPAdes itself is a command-line tool; graphical access is available only through third-party platforms such as Galaxy. The command for running SPAdes is usually as follows:

      spades.py -o output_directory -1 forward_reads.fastq -2 reverse_reads.fastq

      Here, "output_directory" is the name of the directory where the assembled genome will be saved, "forward_reads.fastq" and "reverse_reads.fastq" are the forward and reverse reads in FASTQ format.

    3. Evaluation of assembly: After running SPAdes, the next step is to evaluate the quality of the assembled genome. This can be done using tools such as QUAST or BUSCO. These tools assess the completeness and accuracy of the assembled genome by comparing it to reference genomes or sets of conserved genes.

    4. Iterative assembly: If the quality of the initial assembly is not satisfactory, rerun SPAdes with adjusted settings, for example different k-mer sizes (-k) or more stringent read trimming. Contigs from an earlier assembly can also be supplied as extra input with --trusted-contigs or --untrusted-contigs. Repeat until the assembly is satisfactory.

    These are the general steps for short-read assembly using SPAdes. However, the specific parameters used for SPAdes and the evaluation tools may vary depending on the specific research question and the characteristics of the sequencing data.
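
As a quick first look before running a full evaluation tool, the contig headers written by SPAdes can be inspected directly. They follow the pattern NODE_<number>_length_<length>_cov_<coverage>, where the coverage is a k-mer coverage rather than a read depth. The sketch below tabulates these values and lists contigs that fall under chosen thresholds. The function names and the thresholds in the usage note are hypothetical.

```shell
# Print name, length and coverage for each contig, read from SPAdes-style headers.
contig_table() {
    awk -F_ '/^>NODE_/ { printf "%s\t%s\t%s\n", substr($0, 2), $4, $6 }' "$1"
}

# List contigs shorter than min_len or with coverage below min_cov.
flag_weak_contigs() {
    contig_table "$1" |
        awk -F'\t' -v min_len="$2" -v min_cov="$3" \
            '$2 + 0 < min_len || $3 + 0 < min_cov'
}

# Usage (thresholds are illustrative only):
#   contig_table contigs.fasta
#   flag_weak_contigs contigs.fasta 500 2
```

Short, low-coverage contigs are often worth a closer look, but whether to discard them depends on the data set.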