On Jan 10, 2020, while news of the first fatality was barely trickling in, the 29,903 letters constituting the viral genome from an affected individual in Wuhan had already been sequenced (although a few corrections were made subsequently).
github.com - Collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.
https://github.com/tanghaibao/jcvi
More at https://github.com/tanghaibao/jcvi/wiki
github.com - Flanker, a Python package which performs alignment-free clustering of gene flanking sequences in a consistent format, allowing investigation of mobile genetic elements (MGEs) without prior knowledge of their structure. Flanker can be...
github.com - IVA (Iterative Virus Assembler) is designed specifically for read pairs sequenced at highly variable depth from RNA virus samples. We tested IVA on datasets from 140 sequenced samples from human immunodeficiency virus-1 or influenza-virus-infected...
github.com - RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. Tasks include:
Homology-based misassembly correction
Homology-based assembly scaffolding and patching
Scaffold merging
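As a rough illustration of how the tasks listed above map onto RagTag's command line, the sketch below only builds argument lists for the `correct`, `scaffold`, `patch` and `merge` subcommands of `ragtag.py`. The helper name `ragtag_command` and the file names `ref.fa` and `asm.fa` are assumptions made for this example, not part of RagTag; check `ragtag.py <task> --help` for the inputs each task expects.

```python
# Hypothetical helper: builds (but does not run) a RagTag command line.
RAGTAG_TASKS = {"correct", "scaffold", "patch", "merge"}


def ragtag_command(task, *inputs, out_dir="ragtag_output"):
    """Return the argument list for one RagTag task.

    `inputs` are the positional FASTA/AGP files the task expects,
    e.g. a reference and a query assembly for `scaffold`.
    """
    if task not in RAGTAG_TASKS:
        raise ValueError(f"unknown RagTag task: {task!r}")
    if not inputs:
        raise ValueError("at least one input file is required")
    return ["ragtag.py", task, "-o", str(out_dir), *map(str, inputs)]


if __name__ == "__main__":
    # File names below are placeholders for illustration only.
    print(" ".join(ragtag_command("scaffold", "ref.fa", "asm.fa")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once RagTag is installed.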
If we only had Illumina reads, we could also assemble these using the tool SPAdes.
You can try this here, or try it later on your own data.
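A minimal sketch of what such an Illumina-only run might look like when driven from Python. It uses the standard SPAdes options `-1`, `-2`, `-o` and `-t`; the function names, the output directory `spades_out` and the reverse-read file name `illumina_R2.fastq.gz` are assumptions made for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_spades_command(reads_1, reads_2, out_dir, threads=4):
    """Return the argument list for a paired-end SPAdes assembly."""
    return [
        "spades.py",
        "-1", str(reads_1),   # forward reads
        "-2", str(reads_2),   # reverse reads
        "-o", str(out_dir),   # output directory
        "-t", str(threads),   # number of threads
    ]


def run_spades(reads_1, reads_2, out_dir, threads=4):
    """Run SPAdes if it is on PATH and return the contigs path, else None."""
    if shutil.which("spades.py") is None:
        return None
    subprocess.run(
        build_spades_command(reads_1, reads_2, out_dir, threads), check=True
    )
    # SPAdes writes the assembled contigs to contigs.fasta in the output dir.
    return Path(out_dir) / "contigs.fasta"


if __name__ == "__main__":
    # The R2 file name and output directory are assumed for illustration.
    cmd = build_spades_command(
        "illumina_R1.fastq.gz", "illumina_R2.fastq.gz", "spades_out"
    )
    print(" ".join(cmd))
```

Calling `run_spades(...)` with the same arguments would start the assembly; it returns `None` when `spades.py` is not installed.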
Get data
We will use the same Illumina data as we used above:
illumina_R1.fastq.gz: the Illumina...