Structural variation: the hidden genomic treasure

Genome re-sequencing projects have revealed substantial genetic variation between individuals that extends beyond single nucleotide polymorphisms (SNPs) and short indels. Structural variations (SVs), including copy number variations (CNVs), are a major source of this variation. However, accurate detection, genotyping and interpretation of SVs and CNVs lag behind those of SNPs because the analysis is much more challenging. In our lab we analyse SVs/CNVs using high-throughput sequencing and a range of analytical approaches.

The most-studied structural variants are CNVs, which can be generated by several mechanisms, including non-allelic homologous recombination, non-homologous end-joining, and DNA replication-related fork stalling and template switching. CNVs are closely related to segmental duplications (SDs): SDs can stimulate the formation of CNVs, and SDs themselves started out as CNVs that became fixed in a species. Structural variation can be neutral, but it has also influenced our phenotypic evolution, for example our susceptibility to disease and our ability to digest certain types of food. Our understanding of the extent of structural variation is increasing rapidly, but understanding its phenotypic consequences will be much more difficult.


Structural variants (SVs) such as deletions, insertions, duplications, inversions and translocations litter genomes and are often associated with gene expression changes and severe phenotypes (e.g. genetic diseases in humans). Recent studies on the functional aspects of different types of SVs have unveiled several cases of adaptive evolution. For example, inversions have been associated with ecological adaptations and may facilitate speciation. Because they are so prevalent, SVs arguably have a large impact on genome evolution and should not be neglected when studying the genetics of adaptation and speciation. SVs were classically defined as chromosomal rearrangements larger than 1 kb, but thanks to the higher resolution of newer detection methods, smaller variants (between 50 and 1,000 base pairs) can now be accurately assessed. Besides the various detection methods for next-generation sequencing data (paired-end mapping, split reads and depth of coverage), array-based approaches have proven particularly useful for detecting copy number variations (CNVs). These technologies have enabled researchers to catalog a wide spectrum of SVs in many organisms and to infer how selection shapes their evolutionary trajectories.
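
To make the depth-of-coverage idea a little more concrete, here is a minimal sketch that bins per-base read depth into fixed windows and flags windows whose mean depth departs from the median window depth. The function name `call_depth_windows`, the window size and the 0.5x/1.5x ratio cutoffs are assumptions chosen for illustration. Real read-depth callers add steps such as GC correction and segmentation that are left out here.

```python
from statistics import mean, median


def call_depth_windows(depth, window=100, loss_ratio=0.5, gain_ratio=1.5):
    """Flag fixed-size windows whose mean depth departs from the median window depth.

    depth: list of per-base read depths for one chromosome (toy input).
    Returns (start, end, ratio, call) tuples for windows called as loss or gain.
    The thresholds are illustrative assumptions, not values from any tool.
    """
    # Mean depth of each full, non-overlapping window
    means = [mean(depth[i:i + window])
             for i in range(0, len(depth) - window + 1, window)]
    if not means:
        return []
    baseline = median(means)  # treated as the "normal copy number" depth
    if baseline == 0:
        return []
    calls = []
    for idx, window_mean in enumerate(means):
        ratio = window_mean / baseline
        if ratio <= loss_ratio:
            calls.append((idx * window, (idx + 1) * window, ratio, "loss"))
        elif ratio >= gain_ratio:
            calls.append((idx * window, (idx + 1) * window, ratio, "gain"))
    return calls


# Toy depth profile: 30x background, one half-depth and one double-depth block
toy_depth = [30] * 300 + [15] * 100 + [30] * 200 + [60] * 100 + [30] * 300
calls = call_depth_windows(toy_depth)
for start, end, ratio, call in calls:
    print(start, end, ratio, call)
```

Using the median as the baseline keeps a few aberrant windows from shifting the reference depth, which is why it is used here instead of the overall mean.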

Structural variation sequencing signatures (Source: Nature Reviews Genetics)
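
Paired-end mapping, one of the detection methods mentioned above, relies on read pairs whose orientation or spacing disagrees with what the sequencing library should produce. The sketch below labels a single pair with the signature it would suggest. It assumes an Illumina-style forward/reverse library. The `ReadPair` class, the `classify_pair` function and the default insert-size mean, standard deviation and cutoff are hypothetical and for illustration only; they do not reproduce the behaviour of any specific tool.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Minimal, hypothetical record for one mapped read pair."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, mean_insert=400, sd_insert=50, n_sd=3):
    """Return the SV signature suggested by one read pair.

    Assumes a forward/reverse library, so a concordant pair has the
    leftmost mate on '+' and the rightmost mate on '-'. The span is
    measured between mate start positions, which is a simplification.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    # Order the mates by genomic position
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    if left_strand == right_strand:
        return "inversion"            # both mates on the same strand
    if left_strand == "-" and right_strand == "+":
        return "tandem_duplication"   # mates point away from each other
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"             # mates map too far apart
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"            # mates map too close together
    return "concordant"


examples = [
    ReadPair("chr1", 1000, "+", "chr1", 1400, "-"),
    ReadPair("chr1", 1000, "+", "chr1", 5000, "-"),
    ReadPair("chr1", 1000, "+", "chr1", 3000, "+"),
]
labels = [classify_pair(p) for p in examples]
print(labels)
```

A single discordant pair is weak evidence, so in practice callers cluster many pairs that support the same breakpoints before reporting a variant.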


Related tools, databases and publications are listed below. If you know of any interesting papers, please let us know in the comment section.

Key concepts

Structural variation includes balanced variants such as inversions and translocations, and unbalanced ones such as duplications and deletions (copy number variations or CNVs).

Structural variants can arise by several mechanisms, including nonallelic homologous recombination (NAHR), nonhomologous end‐joining (NHEJ) and DNA replication‐based fork stalling and template switching (FoSTeS).

CNVs are closely linked to segmental duplications, but the two are not the same. Segmental duplications can stimulate CNV formation by NAHR, and themselves arise from CNVs that have become fixed.

Segmental duplications did not appear uniformly during the evolution of the Great Ape species, but rather during a burst of activity around the time of the divergence of gorilla from the human/chimpanzee ancestor.

Duplicated genes play a critical role in the evolution of a genome as they act as ‘spare parts’ that can evolve to perform new or more specialized functions.

Effects of structural variation on gene expression can be identified, but only a few examples of the consequences for species biology have been documented.
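
The balanced/unbalanced distinction from the first key concept can be written down as a small lookup. This is only an illustrative sketch: the function name `dosage_class` and the lowercase type labels are assumptions, and SV types that the text above does not assign to either group (for example insertions) are returned as "unclassified" rather than guessed.

```python
from collections import Counter

# Grouping taken from the key concepts above
BALANCED = {"inversion", "translocation"}
UNBALANCED = {"duplication", "deletion"}  # the copy number variations (CNVs)


def dosage_class(sv_type):
    """Map an SV type label to 'balanced', 'unbalanced (CNV)' or 'unclassified'."""
    label = sv_type.strip().lower()
    if label in BALANCED:
        return "balanced"
    if label in UNBALANCED:
        return "unbalanced (CNV)"
    return "unclassified"


# Toy call set, not real data
toy_calls = ["deletion", "Inversion", "duplication", "deletion", "insertion"]
summary = Counter(dosage_class(t) for t in toy_calls)
print(summary)
```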


CNVnator, a tool for CNV discovery and genotyping from depth of read mapping (2011a, 2011b)

AGE, a tool that implements an algorithm for optimal alignment of sequences with SVs (2011)

BreakSeq, a pipeline for annotation, classification and analysis of SVs at single-nucleotide resolution (2010)

PEMer, a computational and simulation framework for discovering SVs by paired-end read mapping (2009, 2007)

GASV https://code.google.com/archive/p/gasv/

PAIROSCOPE http://pairoscope.sourceforge.net/

SVDetect http://svdetect.sourceforge.net/Site/Home.html

BreakPtr, discovery of unbalanced structural variants (copy-number variants) with tiling microarrays

R Package https://www.bioconductor.org/help/course-materials/2010/EMBL2010/Practical-4-StructuralVariants.pdf

BreakSeq, structural variant genotyping using split reads

CopySeq, genotyping of unbalanced structural variants (copy-number variants) using read-depth

DELLY2, integrated structural variant discovery, genotyping and visualization in deep sequencing data

PEMer, structural variant discovery in 454 sequencing data by paired-end mapping

TIGER, transduction inference in germline genomes using short read data

MANTA https://github.com/Illumina/manta

SV-Bay https://github.com/InstitutCurie/SV-Bay

BreakDancer http://breakdancer.sourceforge.net/

Variation Hunter http://compbio.cs.sfu.ca/software-variation-hunter

Lumpy https://github.com/arq5x/lumpy-sv

ForestSV http://sebatlab.ucsd.edu/index.php/software-data 

PBSuite for long reads https://sourceforge.net/projects/pb-jelly/


Savant, a visualization tool for SVs: http://genomesavant.com/savant/

InGAP-SV (http://ingap.sourceforge.net/), a nice tool for both detection and visualisation of several kinds of structural variation (large insertions, translocations, deletions, inversions, ...)

Tools table: http://www.nature.com/nbt/journal/v29/n8/fig_tab/nbt.1904_T2.html

Variation Viewer https://www.ncbi.nlm.nih.gov/variation/view/









https://www.ncbi.nlm.nih.gov/pubmed/19477992