1779 days ago
Perl script to run in parallel!
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;
use Bio::SeqIO;
my ($sequence_data_ref) = parse_genome_files($ARGV[0]);
my %genome=%{$seq...
1743 days ago
Find and replace in a multi-FASTA or FASTA header with a Perl one-liner
You have a FASTA file and you want to replace every "|" with "_":
perl -i -p -e "s/\|/_/g" genome.fasta
-i = in-place editing
-p = loop over lines and print each line (after processing)
-e = command-line script
1672 days ago
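The Perl one-liner above (s/\|/_/g) rewrites every line of the file, not only the headers. That is harmless for nucleotide or protein sequences, which should not contain "|", but if you want the replacement restricted to header lines, a minimal Python sketch could look like the following. The function name `sanitize_headers` and its parameters are hypothetical, chosen only for illustration.

```python
def sanitize_headers(lines, old="|", new="_"):
    """Yield FASTA lines, replacing `old` with `new` in header lines only.

    `sanitize_headers` is an illustrative helper, not part of any library.
    """
    for line in lines:
        if line.startswith(">"):
            # Header line: apply the replacement.
            yield line.replace(old, new)
        else:
            # Sequence line: pass through unchanged.
            yield line


if __name__ == "__main__":
    example = [">sp|P12345|PROT_HUMAN\n", "MKTLV\n"]
    print("".join(sanitize_headers(example)), end="")
```

Unlike `perl -i`, this sketch does not edit the file in place; it only transforms an iterable of lines, so the caller decides where the result is written.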
Samtools commands for bioinformaticians!
## count mapped reads
samtools view -c -F 260 mapping_file.bam
### converting sam file into fasta
samtools fasta reads_mapped.sam > reads.fasta
### converting...
1659 days ago
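The value 260 passed to -F in the samtools command above is a bit mask: 4 (read unmapped) plus 256 (secondary alignment). With -F, samtools drops any alignment that has at least one of those bits set. The sketch below decodes a flag value and applies the same exclusion test. The bit meanings follow the SAM specification; the helper names `decode_flag` and `is_excluded` are hypothetical.

```python
# Flag bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of all flag bits set in `value` (illustrative helper)."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def is_excluded(read_flag, exclude_mask=260):
    """Mimic `-F exclude_mask`: drop a read if ANY bit of the mask is set."""
    return bool(read_flag & exclude_mask)


if __name__ == "__main__":
    print(decode_flag(260))
```

Note that supplementary alignments (bit 2048) are not covered by a mask of 260, so they are still counted.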
Python script to check sequence lengths in a multi-FASTA file
#!/usr/bin/python
from Bio import SeqIO
import sys

# Print the ID and length of every record in the FASTA file given as the first argument.
for seq_record in SeqIO.parse(sys.argv[1], "fasta"):
    output_line = '%s\t%i' % (seq_record.id, len(seq_record))
    print(output_line)
1272 days ago
One-liner to split a multi-FASTA into single-FASTA files!
# Split the multi-FASTA into single-FASTA files
# (the three-argument match() requires GNU awk)
awk '$0 ~ "^>" { match($1, /^>([^:]+)/, id); filename=id[1]} {print >> (filename ".fa")}' sequence.fasta
1457 days ago
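The awk one-liner above names each output file after the header text between ">" and the first ":" (or the first whitespace) and appends to it. Because the three-argument match() is a GNU awk extension, a portable alternative may be useful. The following is a minimal Python sketch of the same idea; the function name `split_fasta` and the `out_dir` parameter are assumptions made for illustration, and unlike the awk version it skips any lines that appear before the first header.

```python
from pathlib import Path


def split_fasta(lines, out_dir="."):
    """Write each FASTA record to <id>.fa inside `out_dir` (illustrative helper).

    The ID is the first whitespace-delimited header field, cut at the first ":".
    Files are opened in append mode, mirroring the `>>` in the awk version.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    handle = None
    try:
        for line in lines:
            if line.startswith(">"):
                fields = line[1:].split()
                if not fields:
                    raise ValueError("FASTA header without an identifier")
                record_id = fields[0].split(":")[0]
                if handle is not None:
                    handle.close()
                path = out_dir / f"{record_id}.fa"
                handle = path.open("a")
                paths.append(path)
            if handle is None:
                continue  # ignore anything before the first header
            handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return paths


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as fasta:
        split_fasta(fasta)
```

Identifiers containing characters that are not valid in file names (for example "/") would need extra handling that this sketch leaves out.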
1459 days ago
Get GC content across the entire CDS!
# Look at GC content across the entire CDS.
gffread -x - -g | \
seqtk comp - | \
awk -v OFS="\t" '{ print $1, "0", $2, ($4 + $5) / $2 }'
1449 days ago
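In the pipeline above, the awk step divides the C and G counts reported by seqtk comp (columns 4 and 5) by the sequence length (column 2). The same fraction can be computed directly from a sequence string, as in the sketch below. The function name `gc_fraction` is hypothetical, and folding lowercase bases to uppercase and returning 0.0 for an empty sequence are assumptions of this sketch rather than documented seqtk behaviour.

```python
def gc_fraction(seq):
    """Return (C + G) / length for a nucleotide string (illustrative helper).

    Assumptions: case is ignored, ambiguous bases such as N count towards the
    length but not towards GC, and an empty sequence yields 0.0.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("C") + seq.count("G")) / len(seq)


if __name__ == "__main__":
    print(gc_fraction("ATGCGCTA"))
```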
Reformat a multi-FASTA to a fixed sequence line width!
# awk one-liner to join each record's sequence onto one line, then re-wrap at 100 characters
awk '!/^>/ {printf "%s", $0; n = "\n"} /^>/ {print n $0; n = ""} END {printf "%s", n}' file.fasta | fold -w 100
1448 days ago
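The awk and fold pipeline above first joins each record's sequence into a single line and then wraps every line at 100 characters, which means a header longer than 100 characters would be wrapped as well. A minimal Python sketch of the same re-wrapping that leaves headers untouched is shown below; the function name `rewrap_fasta` and the `width` parameter are hypothetical.

```python
def rewrap_fasta(lines, width=100):
    """Yield FASTA lines with each record's sequence re-wrapped to `width`.

    Headers are passed through unchanged; yielded lines carry no newline.
    """
    chunks = []

    def wrapped():
        # Join the collected sequence pieces and cut them into fixed-width lines.
        seq = "".join(chunks)
        return [seq[i:i + width] for i in range(0, len(seq), width)]

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous record's sequence first.
            yield from wrapped()
            chunks.clear()
            yield line
        elif line:
            chunks.append(line)
    # Emit the sequence of the last record.
    yield from wrapped()


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as fasta:
        for out_line in rewrap_fasta(fasta):
            print(out_line)
```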
Bash script to handle multi-FASTA files
# Convert all lowercase residues to uppercase in a FASTA sequence file
$ awk 'BEGIN{FS=" "}{if(!/>/){print toupper($0)}else{print $1}}' input.fasta > output.fasta
...
1413 days ago