Unzip all the genome files and remove all FASTA headers except the first one
...cessing $f file..." if [[ $f =~ \.fna$ ]]; then awk ' /^>/ && FNR > 1 {next} {print $0} ' $f | sed '/^>/{s/ /_/g}' > $f.fa #then sed '1!{/^\>/d;}' $f > $f.fa els...
2525 days ago
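As a rough illustration of the header-stripping step in the snippet above, the two-stage `awk | sed` pipeline can be folded into a single `awk` call. This is only a sketch: the function name `keep_first_header` is hypothetical, and it assumes the input is already uncompressed. Unlike the `FNR > 1` test, it counts headers, so it also works if the first header is not on line 1.

```shell
# Hypothetical helper (not from the original post): collapse a multi-record
# FASTA file into a single record by keeping only the first header line.
keep_first_header() {
    # $1: path to an uncompressed FASTA file; the result is written to stdout
    awk '
        /^>/ {
            if (seen++) next      # skip every header after the first one
            gsub(/ /, "_")        # replace spaces in the header that is kept
        }
        { print }
    ' "$1"
}
```

A call such as `keep_first_header genome.fna > genome.fna.fa` would mirror the output naming used in the snippet; `genome.fna` is a placeholder file name.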
Read a tab-delimited file and search with Perl
use strict; use warnings; use Data::Dumper; use Text::CSV; use IO::Handle; my $file = "/home/urbe/Tools/Alienomics_v0.1/Alienomics/output/intermediate_files/rRNA/refGene.megablast"; open my $fh, "[0]\n"; warn Dumper $row; # To see the structure }
2519 days ago
Download GFF files from NCBI using a bash script/command
...lm.nih.gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.gff.gz|' > genom...lm.nih.gov/genomes/all/.+/)(GCA_.+)|\1\2/\2_genomic.gff.gz|' > genom...lm.nih.gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.gff.gz|' > genom...lm.nih.gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.gff.gz|' > genom...
2517 days ago
Extract FASTA sequences from a multi-FASTA file by header IDs
...my $fasta = shift @ARGV; my $out = shift @ARGV; my %select; open LIST, "$list" or die; while (<LIST>) { chomp; s/>//g; $select{$_} = 1; } close LIST; $/ = "\n>...
2512 days ago
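The same ID-based selection can be sketched in shell with `awk`. The function name `extract_by_id` is hypothetical, and two assumptions go beyond the snippet: an ID is taken to be the first whitespace-delimited token of a header line, and the ID list is assumed to be non-empty (the `NR == FNR` idiom misbehaves on an empty first file).

```shell
# Hypothetical helper (not from the original post): print only the FASTA
# records whose ID appears in a list file.
extract_by_id() {
    # $1: file with one ID per line (a leading ">" is tolerated)
    # $2: multi-FASTA file to filter; matching records go to stdout
    awk '
        NR == FNR {                       # first file: load the wanted IDs
            sub(/^>/, "")
            if ($1 != "") want[$1] = 1
            next
        }
        /^>/ {                            # header line: decide whether to keep
            id = substr($1, 2)
            keep = (id in want)
        }
        keep                              # print header and sequence lines
    ' "$1" "$2"
}
```

For example, `extract_by_id ids.txt all.fasta > subset.fasta`, where the file names are placeholders.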
Compress and decompress a sequence with Perl
use strict; use warnings; my @char; while () { @char = split //; } comp(\@char); #--------------------- my $com= "r0a3m4a4j0"; my @com = split //, $com; dcomp (\@com); #dcomp sub here sub dcomp { my ($com_ref)=@_; my @com=@$com_ref; my $car; for (my $aa=0; $aa...
2509 days ago
Count the number of Ns in a FASTA file with Perl
#!/usr/bin/perl my ($h, $n, $l); open(I,$ARGV[0]) or die($!); while(<I>){ chomp; if(/^>/){ $h=substr($_,1); }else{ $n=($_=~tr/nN/nN/); $l=length($_); # guard against division by zero when a line is empty or all N print $h,"\t",$l,"\t",$n,"\t",($l>$n ? $n/($l-$n) : "NA"),"\n"; } } close(I);
2460 days ago
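The Perl snippet above prints one row per sequence line, which gives several rows per record when sequences are wrapped. A possible per-record variant in shell is sketched below. The function name `count_n` is hypothetical, and the output is limited to header, total length and N count (the ratio column is left out), so it is not a drop-in replacement.

```shell
# Hypothetical helper (not from the original post): report, for each FASTA
# record, its header, total sequence length and number of N/n characters.
count_n() {
    # $1: path to an uncompressed FASTA file; tab-separated rows go to stdout
    awk '
        function report() {
            if (have) printf "%s\t%d\t%d\n", name, len, n
        }
        /^>/ {                       # new record: flush the previous one
            report()
            name = substr($0, 2); len = 0; n = 0; have = 1
            next
        }
        {
            len += length($0)        # measure before gsub edits the line
            n += gsub(/[nN]/, "")    # gsub returns the number of matches
        }
        END { report() }
    ' "$1"
}
```

Summing over wrapped lines means each record produces exactly one row, whatever the line width of the input.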