Count the number of Ns in a FASTA file with Perl
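#!/usr/bin/perl
use strict;
use warnings;
my ($h, $n, $l);
open(I, $ARGV[0]) or die($!);
while (<I>) {
    chomp;
    if (/^>/) {
        $h = substr($_, 1);          # header without the leading '>'
    } else {
        $n = ($_ =~ tr/nN/nN/);      # tr returns the number of n/N characters
        $l = length($_);
        # ratio of N to non-N bases; 'NA' when the line has no non-N bases
        print $h, "\t", $l, "\t", $n, "\t", ($l > $n ? $n / ($l - $n) : 'NA'), "\n";
    }
}
close(I);
2510 days ago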
Genetic Algorithms demonstration with word DNA in Perl
...# a good starting point my $dna_length = 512; # 4 "letters" in the DNA my $dna_byte_length = $dna_length / 8; # the DNA...foreach my $byte (1 .. $dna_byte_length) { # get one random by...# integers between 0 and 2^$dna_length my $old_dna = $individual-...
2416 days ago
Perl script to calculate Levenshtein distance
sub levenshtein_dist { my ($str1, $str2) = @_; my ($len1, $len2) = (length $str1, length $str2); if ($len1 == 0) { return $len2; } if ($len2 == 0) { return $len1; } my %mat; for (my $i = 0; $i...
2384 days ago
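The snippet above is cut off before its main loop. As a rough, self-contained sketch of the same idea (not the original author's code), the distance can be computed with a two-row dynamic-programming table. The function name `levenshtein_sketch` is hypothetical, and `min` comes from the core `List::Util` module.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Levenshtein distance using two rows of the dynamic-programming table.
# Hypothetical helper for illustration; not the truncated levenshtein_dist above.
sub levenshtein_sketch {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # Row 0: turning the empty prefix of $s into each prefix of $t costs j insertions.
    my @prev = (0 .. scalar @t);

    for my $i (1 .. @s) {
        my @cur = ($i);    # turning the first $i chars of $s into '' costs $i deletions
        for my $j (1 .. @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            $cur[$j] = min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

print levenshtein_sketch("kitten", "sitting"), "\n";    # prints 3
```

Keeping only two rows uses memory proportional to the length of the second string rather than the full matrix.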
Clump Finding Problem Solved with Perl
#Find patterns forming clumps in a string. #Given: A string Genome, and integers k, L, and t. #Return: All distinct k-mers forming (L, t)-clumps in Genome. use str...
2378 days ago
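The problem statement above can be sketched as follows, assuming an (L, t)-clump means a k-mer that occurs at least t times inside some window of length L. This is one possible approach, not the code from the truncated post: it records every start position of each k-mer, then checks whether any t consecutive occurrences fit in a window. The name `find_clumps` and the toy genome are hypothetical.

```perl
use strict;
use warnings;

# Return all distinct k-mers that occur at least $t times within
# some window of length $L in $genome (sorted alphabetically).
sub find_clumps {
    my ($genome, $k, $L, $t) = @_;

    # Record every start position of every k-mer.
    my %positions;
    for my $i (0 .. length($genome) - $k) {
        push @{ $positions{ substr($genome, $i, $k) } }, $i;
    }

    my @clumps;
    for my $kmer (sort keys %positions) {
        my @p = @{ $positions{$kmer} };
        # t consecutive occurrences fit in a window of length L when the
        # span from the first start to the end of the last one is <= L.
        for my $i (0 .. $#p - $t + 1) {
            if ($p[$i + $t - 1] + $k - $p[$i] <= $L) {
                push @clumps, $kmer;
                last;
            }
        }
    }
    return @clumps;
}

# Toy input: "AT" starts at 0, 2, 4 and "GG" at 6, 7, 8.
print join(" ", find_clumps("ATATATGGGG", 2, 6, 3)), "\n";    # prints "AT GG"
```

Because positions are stored in increasing order, only consecutive runs of t occurrences need checking, which avoids sliding a window over the whole genome for each k-mer.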
Insert a sequence at the desired location in a multi-FASTA file with Perl
...File::Copy; #ARGV[0] should be in following format --- Keep the coordinate sorted by name+location #GenomechrName locationStart AlienGene AlienLength # The coordinate should no...
2358 days ago
Create genome scaffolding with Perl
...ue on if there's a good likelihood that this will work ## i.e. trimLength * (1-%id) < threshold my...my $lPost = substr($lSeq, $lEnd); my $sPreTrim = substr($sPre, length($sPre)-$preTrim); my $...
2352 days ago
Perl script to remove FASTA sequences from a multi-FASTA file by length threshold
...chomp; next unless /\w/; s/>$//gs; my @chunk = split /\n/; my $header = shift @chunk; my $seqlen = length join "", @chunk; pri...
2295 days ago
Estimate Genome Size with Jellyfish and R
...uint32 Number of threads to be used in the run, e.g. 1, 2, 3, etc. #-C -both-strands Count both strands #-m -mer-len=uint32 Length of the k-mer #-s -siz...
2290 days ago
Perl script to find palindromic regions in DNA sequences
use strict; use warnings; my $pp = qr/(?: (\w) (?1) \g{-1} | \w? )/ix; my $filename = $ARGV[0]; open(my $fh, '...
2198 days ago
Perl script to check FASTQ read quality
...->[1] = 1 if (!defined($aux->[0])); return ($name, $seq) if ($c ne '+'); my $qual = ''; while () { chomp; $qual .= $_; if (length($qual) >= length($seq)) { $aux->[0] = unde...
2198 days ago