Question: Replace N characters with random nucleotides in a FASTA file.

Agampreet Kaur
3853 days ago

During sequencing and assembly, FASTA files often end up containing runs of undetermined characters such as "NNNN". How can I replace those with random nucleotide characters (A, T, G, C)?

Answers
0

You can use an s/// regex like this:

my $str = "This is my string";
my $find = "string";
my $replace = "strings";
$find = quotemeta $find; # escape regex metacharacters, if present

# quotemeta lets you match strings that contain regex metacharacters,
# and the /g at the end of s/// replaces all occurrences.
$str =~ s/$find/$replace/g;

print $str;

Thanks

0

That's true. Using an s/// regex will remove the N's, but if you need to replace them with a randomly generated nucleotide sequence, use this script: http://bioinformaticsonline.com/file/view/5307/clean-the-fasta-file. It replaces the N's with random A, T, G, C characters.
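
For anyone who cannot open the link, here is a minimal sketch of the same idea in Perl. It is not the linked script, only an illustration. The sub name replace_n and the file names in the usage line are made up for this example, and it assumes that every N or n outside a header line should become an uppercase base chosen uniformly at random.

```perl
use strict;
use warnings;

# Candidate bases; each N is replaced by one of these, chosen uniformly.
my @bases = qw(A C G T);

# Hypothetical helper: return a copy of the sequence with every N or n
# replaced by a random base. The /e flag evaluates the replacement as
# Perl code, so a fresh base is drawn for each match.
sub replace_n {
    my ($seq) = @_;
    $seq =~ s/[Nn]/$bases[ int rand @bases ]/ge;
    return $seq;
}

# Process the FASTA files named on the command line, if any.
if (@ARGV) {
    while ( my $line = <> ) {
        # Header lines start with '>' and are printed unchanged.
        $line = replace_n($line) unless $line =~ /^>/;
        print $line;
    }
}
```

Run it as: perl replace_n.pl input.fasta > output.fasta. The output differs on every run because the bases are random, and the inserted bases are invented rather than real sequence data, so keep the original file.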