1000 Genomes data tutorial at ASHG
Structural variants presentation by
Jan Korbel
European Molecular Biology Laboratory (EMBL) Heidelberg Genome Biology Research...
Most FASTA files contain runs of N characters, which this Perl script replaces with random A, T, G, or C characters. It also prints the FASTA sequence name, the N count, the nucleotide count, and percentage details to the command prompt (standard output).
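To make the idea concrete, here is a minimal sketch of the N-replacement step. It is written in Python and is not the original Perl script. The function names `replace_n` and `process_fasta` are hypothetical, and it assumes the reported percentage is the share of N characters in each sequence, which the description above does not spell out.

```python
import random


def replace_n(seq, rng):
    """Replace every N/n in seq with a random base; return (new_seq, n_count)."""
    out = []
    n_count = 0
    for base in seq:
        if base in "Nn":
            n_count += 1
            out.append(rng.choice("ACGT"))
        else:
            out.append(base)
    return "".join(out), n_count


def process_fasta(lines, rng):
    """Yield (name, new_sequence, n_count, length) for each FASTA record."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                seq, n_count = replace_n("".join(chunks), rng)
                yield name, seq, n_count, len(seq)
            name = line[1:]
            chunks = []
        elif line:
            chunks.append(line)
    # Emit the last record in the file.
    if name is not None:
        seq, n_count = replace_n("".join(chunks), rng)
        yield name, seq, n_count, len(seq)


if __name__ == "__main__":
    # Tiny in-memory example; a real run would iterate over an open file.
    example = [">chr1", "ACNNGT", "NNAA", ">chr2", "ACGT"]
    for name, seq, n_count, length in process_fasta(example, random.Random()):
        # Assumption: percentage means N characters as a share of the sequence.
        percent = 100.0 * n_count / length if length else 0.0
        print(f"{name}\tN={n_count}\tlength={length}\tN%={percent:.2f}")
        print(seq)
```

Passing the random number generator in as an argument lets a seeded `random.Random` give reproducible output when that matters.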
This is one of my old scripts for detecting centromeric patterns in chromosomes. The user can also control the number of mismatches allowed from the command line.
To run:
perl centro.pl
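As an illustration of pattern matching with a mismatch limit, the sketch below scans a sequence with a sliding window and keeps every window whose Hamming distance to the pattern is within the limit. It is written in Python and is not the contents of centro.pl. The function names and the example pattern `GATCA` are hypothetical, since the actual centromeric pattern and the script's command-line options are not given here.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_approx(seq, pattern, max_mismatches=0):
    """Return (position, mismatches) for every window within the limit.

    Positions are 0-based. Only substitutions are counted; insertions and
    deletions are not considered in this sketch.
    """
    seq = seq.upper()
    pattern = pattern.upper()
    m = len(pattern)
    hits = []
    if m == 0:
        return hits
    for i in range(len(seq) - m + 1):
        d = hamming(seq[i:i + m], pattern)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


if __name__ == "__main__":
    # Hypothetical pattern and sequence, for demonstration only.
    for pos, mismatches in find_approx("TTGATCAATGATTAA", "GATCA", 1):
        print(f"match at {pos} with {mismatches} mismatch(es)")
```

This brute-force scan takes time proportional to the sequence length times the pattern length, which is usually acceptable for short patterns.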