Extract IDs from a file with Perl
...####################################
# Open and read a file
sub read_fh {
    my $filename = shift @_;
    my $filehandle;
    if ($filename =~ /gz$/) {
        open $filehandl...
2599 days ago
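The excerpt above is cut off right after the test on the `gz` suffix, so the rest of `read_fh` is not shown. As a rough shell sketch of the same idea (branch on the file name, then stream either decompressed or plain content), something like the following could work. The function name `read_maybe_gz` is hypothetical and is not part of the original script.

```shell
# Hypothetical helper (not from the original Perl script):
# print a file to stdout, decompressing it first when the name ends in "gz".
read_maybe_gz() {
  case "$1" in
    *gz) gzip -dc -- "$1" ;;   # compressed input: decompress to stdout
    *)   cat -- "$1" ;;        # plain input: pass through unchanged
  esac
}
```

A caller can then treat both kinds of input the same way, for example `read_maybe_gz ids.txt.gz | grep '^>'`, where `ids.txt.gz` is a placeholder file name.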
Download genomes from NCBI using a bash script/command
...gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.fna.gz|' > genomic_file_fungi
# -...
...gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.fna.gz|' > genomic_file_bacteria
...gov/genomes/all/.+/)(GCF_.+)|\1\2/\2_genomic.fna.gz|' > genomic_file_protozoa
...
2523 days ago
Unzip all the genome files and remove all FASTA headers except the first one
#!/bin/bash
gzip -d *.gz
FILES=$(pwd)/*
for f in $FILES
do
  echo "Processing $f file..."
  if [[ $f =~ \.fna$ ]]; then
    awk ' /^>/ && FNR > 1 {next} {print $0} ' $f...
2522 days ago
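The `awk` filter in the excerpt skips every line that starts with `>` unless it is the first line of the file, which merges a multi-record FASTA file into a single record. The excerpt is truncated before it shows where the output goes, so the sketch below only wraps the same filter in a function that writes to stdout. The name `keep_first_header` is hypothetical.

```shell
# Hypothetical wrapper around the awk filter shown in the excerpt.
# Prints the file with every header line removed except one on line 1.
keep_first_header() {
  awk '/^>/ && FNR > 1 { next } { print }' "$1"
}
```

For example, `keep_first_header genome.fna > genome.single.fna` would write the merged record to a new file. Both file names are placeholders.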
Download the gff files from NCBI using bash script/command
...ll/.+/)(GCF_.+)|\1\2/\2_genomic.gff.gz|' > genomic_file_fungi
# -...
...ll/.+/)(GCA_.+)|\1\2/\2_genomic.gff.gz|' > genomic_file_bacteria
...ll/.+/)(GCF_.+)|\1\2/\2_genomic.gff.gz|' > genomic_file_protozoa
...
wget --input $f.head
gunzip *.gz
#cat $f
cd ..
done
...
2514 days ago
Convert FASTQ to FASTA in Perl
use Bio::SeqIO;
# convert .fastq.gz to .fasta
open my $zcat, 'zcat seq.fastq.gz |' or die $!;
my $in=Bio::SeqIO->new(-fh=>$zcat, -format=>'fastq');
my $out=Bio::SeqIO->new(-f...
2322 days ago
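The BioPerl excerpt is truncated before the output stream is configured. For comparison, here is a shell sketch of the same conversion that uses `awk` instead of `Bio::SeqIO`. It is a different technique, not the author's. It assumes strict four-line FASTQ records (header, sequence, `+`, qualities) with no wrapped sequence lines. The function name `fastq_to_fasta` is hypothetical.

```shell
# Hypothetical awk-based converter (not the BioPerl approach above).
# Assumes each FASTQ record is exactly four lines.
fastq_to_fasta() {
  awk '
    NR % 4 == 1 { print ">" substr($0, 2) }  # header: replace leading "@" with ">"
    NR % 4 == 2 { print }                    # sequence line, copied as-is
  '
}
```

It reads from stdin, so it can follow the same decompression step as the excerpt: `zcat seq.fastq.gz | fastq_to_fasta > seq.fasta`. The output name `seq.fasta` is a placeholder.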
Download genomes in batch from NCBI
curl 'ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt' |
  awk -F'\t' '!/^#/ {print $20}' |
  sed -r 's|(ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)(GCA/)([0-9]{3}/)([0-9]{3}/)([0-9]{3}/)(GCA_.+)|\1\2\3\4\5\6/\6_genomic.fna.gz|' > genomic_file
2254 days ago
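The `sed` step above turns each assembly directory path into the URL of its `_genomic.fna.gz` file by repeating the last path component as the file-name prefix. The sketch below isolates that rewrite so it can be tried without network access. It uses a looser pattern than the one-liner (any final path component, not only `GCA_...` under three numeric directories), and the function name `to_fna_url` and the accession in the usage line are made up for illustration.

```shell
# Hypothetical helper: append "<last component>/<last component>_genomic.fna.gz"
# to each directory path read from stdin.
to_fna_url() {
  sed -E 's|^(.*/)([^/]+)$|\1\2/\2_genomic.fna.gz|'
}
```

For instance, `printf '%s\n' 'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/000/000/GCA_000000000.1_example' | to_fna_url` prints the same path followed by `/GCA_000000000.1_example_genomic.fna.gz`. The resulting list can then be passed to a downloader, for example `wget --input-file genomic_file`.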
Perl script to find the distance between all the contigs and scaffolds
    ...}
    close $fh;
    return \%sequences, \@allIds;
}
sub read_fh {
    my $filename = shift @_;
    my $filehandle;
    if ($filename =~ /gz$/) {
        open $filehandl...
2131 days ago