Question: What is paired end and mate pair sequencing? How does it work?

Rahul Nayak
3404 days ago

I have come across two confusing terms: paired end and mate pair sequencing. What is the difference between them, and how is each one done?

Answers

Hi Rahul,

Mate-pair is a specific type of library; paired-end is a type of sequencing. They are not two different methods; in fact, mate-pair libraries require paired-end sequencing. A mate-pair library lets the two reads of a pair sit much farther apart in the genome, which can be more informative than the standard paired-end protocol.

The downside is that it's a more complicated protocol, and the library can be contaminated with ordinary paired-end reads, which I imagine can make analysis more difficult.

Thanks

Hi Rahul,

The term 'paired ends' refers to the two ends of the same DNA molecule. You sequence one end, then turn the molecule around and sequence the other end. The two sequences you get are 'paired end reads'. Sometimes they're called 'mate pairs' (but with Illumina technology, I think what they call 'mate pair' and 'paired end' methodology is different).

"Paired end" or "mate pair" refers to how the library is made and then how it is sequenced. Both are methodologies that, in addition to the sequence information, give you information about the physical distance between the two reads in your genome.

For example, you shear up some genomic DNA, and cut a region out at ~500bp. Then you prepare your library, and sequence 35bp from each end of each molecule. Now you have three pieces of information:

--the tag 1 sequence
--the tag 2 sequence
--that they were 500bp ± (some) apart in your genome

This gives you the ability to map to a reference (or assemble de novo, for that matter) using that distance information. It helps dramatically in resolving larger structural rearrangements (insertions, deletions, inversions), and in assembling across repetitive regions.
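To make the 500bp example concrete, here is a minimal Python sketch of where the two tags come from. The function names, the random toy genome, and the choice to report the second tag as a reverse complement (as if read inward from the opposite strand) are my own assumptions for illustration, not any particular vendor's protocol.

```python
import random

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def paired_end_reads(genome, fragment_start, fragment_len=500, read_len=35):
    """Cut one fragment out of the genome and read both of its ends.

    Returns (tag 1, tag 2, fragment length). Tag 2 is read from the far
    end of the fragment, so it is reported as a reverse complement.
    """
    fragment = genome[fragment_start:fragment_start + fragment_len]
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])
    return read1, read2, len(fragment)

# Toy genome (hypothetical data), with one fragment starting at position 300.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(2000))
read1, read2, distance = paired_end_reads(genome, fragment_start=300)
```

The three return values correspond to the three pieces of information listed above: the two tag sequences and the approximate distance spanned by the pair.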

Structural rearrangements can be deduced when your read pairs map to a reference at a distance that is substantially different from the fragment size the library was constructed with (~500bp in the above example). Let's say you had two reads that mapped to your reference 1000bp apart...this suggests that your genome has a deletion, relative to the reference, between those two reads. The same logic applies to an insertion: if your reads mapped 100bp apart on the reference, this suggests that your genome has an insertion between them.
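The distance reasoning above can be sketched in a few lines of Python. The function name and the 100bp tolerance are assumptions chosen for illustration; real tools estimate the tolerance from the observed fragment size distribution of the library.

```python
def classify_pair(mapped_distance, expected=500, tolerance=100):
    """Compare a pair's distance on the reference with the library size.

    expected  -- fragment size the library was built with (bp)
    tolerance -- hypothetical allowed deviation before a pair is flagged
    """
    if mapped_distance > expected + tolerance:
        # Reads are farther apart on the reference than in the sample,
        # so the sample is missing sequence between them.
        return "possible deletion"
    if mapped_distance < expected - tolerance:
        # Reads are closer on the reference than in the sample,
        # so the sample carries extra sequence between them.
        return "possible insertion"
    return "concordant"

calls = [classify_pair(d) for d in (1000, 100, 520)]
```

With the numbers from the example, a pair mapping 1000bp apart is flagged as a possible deletion, a pair mapping 100bp apart as a possible insertion, and a pair mapping 520bp apart as concordant. A single discordant pair is weak evidence; in practice you would want many pairs supporting the same event.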

Mapping over repeats is similar...if one read cannot be placed uniquely because it falls in a very repetitive region (e.g. LINE, LTR, SINE), but the other read is unique, you can again use that distance information to map both reads. The first read most likely comes from the copy of the repeat that lies ~500bp away from your unique second read.
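Here is a rough Python sketch of that rescue step, assuming you already have the position of the uniquely mapped read and a list of candidate positions for the repetitive read. The function name, the tolerance, and the positions are all hypothetical.

```python
def place_repeat_read(anchor_pos, candidate_positions, expected=500, tolerance=100):
    """Pick the repeat copy whose distance to the unique read fits the library.

    anchor_pos          -- reference position of the uniquely mapped read
    candidate_positions -- every position where the repetitive read could map
    Returns the best candidate, or None if no candidate is within tolerance.
    """
    best = None
    best_error = None
    for pos in candidate_positions:
        # How far this candidate's distance is from the expected fragment size.
        error = abs(abs(pos - anchor_pos) - expected)
        if error <= tolerance and (best_error is None or error < best_error):
            best, best_error = pos, error
    return best

# The repeat occurs at three places; only one is ~500bp from the unique read.
chosen = place_repeat_read(10000, [2000, 10480, 50000])
```

Here `chosen` is 10480, the only candidate roughly 500bp from the anchor at 10000. If no copy of the repeat falls within the tolerance, the function returns None rather than guessing.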

Cheers