Perl script to count the number of Ns in a multifasta file

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Usage: perl countN.pl sequences.fasta
    # Expects each sequence on a single line (see the note below).
    my ($h, $n, $l);
    open(my $in, '<', $ARGV[0]) or die($!);
    while (<$in>) {
        chomp;
        next if /^$/;                  # skip blank lines
        if (/^>/) {
            $h = substr($_, 1);        # header without the leading ">"
        } else {
            $n = ($_ =~ tr/nN/nN/);    # count N and n without changing the line
            $l = length($_);
            # columns: header, length, N count, Ns per non-N base (NA if the line is all N)
            print $h, "\t", $l, "\t", $n, "\t", ($l > $n ? $n / ($l - $n) : 'NA'), "\n";
        }
    }
    close($in);
    __END__

Note: Convert each sequence to a single line first. The script prints one row per sequence line, so a wrapped FASTA file would produce several rows per record.

    perl -pe '/^>/ ? print "\n" : chomp' scaffolds_backup.fasta > out.fasta
    perl countN.pl scaff.fa

Result

Each row lists the header, the sequence length, the number of Ns, and the ratio of Ns to non-N bases.

    ➜ SSPACED_P2 perl countN.pl out.fasta
    scaffold1_size15575755 15575755 71824 0.00463263155647429
    scaffold2_size10632363 10632363 64900 0.00614149299600103
    scaffold3_size10490233 10490233 14872 0.0014197124089566
    scaffold4_size8615068 8615068 28079 0.00326994712582024
    scaffold5_size7253348 7253348 0 0
    scaffold6_size5859599 5859599 51144 0.00880509533085821
    scaffold7_size5312044 5312044 144 2.71089440689772e-05
    scaffold8_size4790259 4790259 167678 0.0362736748150005
    scaffold9_size2901530 2901530 4652 0.001605866729631
    scaffold10_size2523421 2523421 142217 0.0597248282801474
    scaffold11_size2396892 2396892 6066 0.00253719844104088
    scaffold12_size2309371 2309371 240122 0.11604306683246
    scaffold13_size2240539 2240539 48209 0.0219898464191066
    scaffold14_size2236206 2236206 132012 0.0627375612704912
    scaffold15_size2184218 2184218 29144 0.0135234335340689
    scaffold16_size2020193 2020193 11868 0.00590940211370172
    scaffold17_size1776796 1776796 132882 0.0808326956276302
    scaffold18_size1445477 1445477 50836 0.0364509576299564
    scaffold19_size1287487 1287487 102593 0.0865841163850944
    scaffold20_size1287226 1287226 23771 0.0188142830571726
    scaffold21_size1176376 1176376 18339 0.0158362815695872
    scaffold22_size1164766 1164766 82205 0.075935674756434
    scaffold23_size1108643 1108643 36256 0.0338086903328742
    scaffold24_size1010466 1010466 93216 0.10162551103843
    scaffold25_size1009266 1009266 51939 0.0542541890075178
    scaffold26_size990651 990651 75286 0.082246972519159
    scaffold27_size971184 971184 55725 0.0608711040035654
    scaffold28_size898523 898523 92917 0.115338018833028
    scaffold29_size889417 889417 33420 0.0390421929048817
    scaffold30_size864014 864014 80076 0.10214583296128
    scaffold31_size790135 790135 0 0
    scaffold32_size767454 767454 42932 0.0592556195671077
    scaffold33_size711456 711456 0 0
    scaffold34_size672622 672622 0 0
    scaffold35_size649832 649832 0 0
    scaffold36_size583465 583465 10815 0.0188858814284467
    scaffold37_size532901 532901 19453 0.0378869914772285
    scaffold38_size462550 462550 16583 0.0371843656593425
    scaffold39_size413010 413010 29289 0.0763288952129281
    scaffold40_size395146 395146 72775 0.225749214414448
    scaffold41_size385263 385263 0 0
    scaffold42_size378415 378415 0 0
    scaffold43_size374696 374696 0 0
    scaffold44_size369872 369872 0 0
    scaffold45_size357218 357218 0 0
    scaffold46_size308025 308025 25709 0.0910646226214596
    scaffold47_size306566 306566 16299 0.056151749940572
    scaffold48_size288610 288610 59285 0.258519568298267
    scaffold49_size269035 269035 0 0
    scaffold50_size267296 267296 0 0
    scaffold51_size239237 239237 0 0
    scaffold52_size229546 229546 81 0.000352995010132264
    scaffold53_size228017 228017 3283 0.0146083814643089
    scaffold54_size218690 218690 9299 0.0444097406287758
    scaffold55_size205658 205658 0 0
    scaffold56_size203300 203300 0 0
    scaffold57_size189523 189523 54248 0.401020144150804
    scaffold58_size164988 164988 0 0
    scaffold59_size158628 158628 0 0
    scaffold60_size158343 158343 29807 0.231896122487085
    scaffold61_size148599 148599 0 0
    scaffold62_size137911 137911 19923 0.168856154863206
    scaffold63_size129836 129836 0 0
    scaffold64_size127140 127140 2695 0.0216561533207441
    scaffold65_size121401 121401 0 0
    scaffold66_size119540 119540 7247 0.0645365249837479
    scaffold67_size114479 114479 11483 0.111489766592877
    scaffold68_size111283 111283 0 0
    scaffold69_size106201 106201 0 0
    scaffold70_size102444 102444 7907 0.0836392100447444
    scaffold71_size97059 97059 9770 0.11192704693604
    scaffold72_size94628 94628 14575 0.182066880691542
    scaffold73_size86984 86984 0 0
    scaffold74_size80782 80782 0 0
    scaffold75_size68265 68265 0 0
    scaffold76_size67964 67964 0 0
    scaffold77_size56495 56495 0 0
    scaffold78_size51859 51859 0 0
    scaffold79_size46742 46742 0 0
    scaffold80_size44222 44222 0 0
    scaffold81_size39139 39139 0 0
    scaffold82_size36951 36951 0 0
    scaffold83_size35745 35745 0 0
    scaffold84_size34131 34131 0 0
    scaffold85_size33753 33753 0 0
    scaffold86_size33604 33604 0 0
    scaffold87_size32653 32653 0 0
    scaffold88_size32281 32281 0 0
    scaffold89_size32217 32217 0 0
    scaffold90_size29934 29934 0 0
    scaffold91_size29230 29230 0 0
    scaffold92_size22007 22007 0 0
    scaffold93_size20584 20584 0 0
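If converting the file to one line per sequence is inconvenient, the same per-record tally can be sketched in shell with awk instead of Perl. This is only an illustration and not part of the original script. The function name `count_n_per_record` is hypothetical, and printing `NA` when a record contains no non-N bases is an assumption. The sketch accumulates length and N count across wrapped lines and prints one row when the next header (or the end of input) is reached.

```shell
# count_n_per_record [FILE...]   (hypothetical helper; reads stdin if no file is given)
# Prints: header <TAB> length <TAB> N count <TAB> N / (length - N)
# Sequences may be wrapped over several lines.
count_n_per_record() {
  awk '
    # Print the row for the record collected so far, if any.
    function report(    non_n, ratio) {
      if (!seen) return
      non_n = len - n
      # Same ratio as the Perl script; "NA" (an assumption) avoids division by zero.
      ratio = (non_n > 0) ? n / non_n : "NA"
      printf "%s\t%d\t%d\t%s\n", header, len, n, ratio
    }
    /^>/ { report(); header = substr($0, 2); len = 0; n = 0; seen = 1; next }
    /^$/ { next }                      # skip blank lines
    {
      len += length($0)
      n += gsub(/[nN]/, "&")           # gsub returns the match count; "&" keeps the text
    }
    END { report() }
  ' "$@"
}
```

It could then be called as `count_n_per_record scaffolds_backup.fasta`, without the intermediate out.fasta.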

Comments

  • Jit 1557 days ago

    If you want to count all 'N' in a multifasta file:

    (base) ➜  output_2_test git:(master) ✗ more scaffolds.fasta |  grep -Ho N * | uniq -c

         12 draft_summary.info:N

    1035948 scaffolds.fasta:N

         12 scaffolds_summary.info:N

        900 updatedGenome.fa:N
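Because `grep` is given `*` as its file argument, the command above ignores the piped input and reports every file in the directory. It also counts only uppercase `N` and includes any `N` that appears in a header line. A narrower count for a single file could look like the sketch below. The function name `total_n` is hypothetical, and treating lowercase `n` as an N is an assumption carried over from the Perl script.

```shell
# total_n [FILE...]   (hypothetical helper; reads stdin if no file is given)
# Prints the number of N/n characters in sequence lines, ignoring ">" header lines.
total_n() {
  cat "$@" |
    grep -v '^>' |     # drop FASTA header lines
    tr -cd 'Nn' |      # delete every character except N and n
    wc -c |            # count what is left
    tr -d ' '          # some wc implementations pad the number with spaces
}
```

For example, `total_n scaffolds.fasta` prints a single number for that file only.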