Find and replace in multi-FASTA or FASTA headers with a Perl one-liner
You have a FASTA file and you are told to replace every "|" with "_":
perl -i -p -e "s/\|/_/g" genome.fasta
-i = in-place editing
-p = loop over lines and print each line (after processing)
-e = command-line script
1616 days ago
Samtools commands for bioinformaticians!
...### sorting bam file by genome position samtools sort sal_sej....ed.bam.bam ### identifying genome variants (mpileup command) # -g...call format) file # -f : use reference genome given samtools mpi...lize (realign) indels # -f : reference fasta, needed to left alig...
1604 days ago
Bash script to align short reads against a reference genome!
bwa mem -t 40 -R '@RG\tID:K12\tSM:K12' \ E.coli_K12_MG1655.fa SRR1770413_1.fastq.gz S...eads the read group K12 and the sample name K12" #reference and FASTQs E.coli_K12_MG1655....770413_2.fastq.gz --- this just specifies the base refere...
1560 days ago
Pack a Perl program with its dependencies on Ubuntu!
#Follow steps to create your own executable ./web jit@jit-HP-P...released... Evaluation of genome assembly software based on l...the fruit fly genome with the human genome reveals that about sixt...also found that two-thirds of human genes known to be involved in...
1512 days ago
Install the Ragout genome assembler
$ conda install -c bioconda ragout Collecting package metadata (repodata.json): done Solving environment: done ## Package Plan ## environment location: /home/...
1465 days ago
Get GC content across the entire CDS!
#look at GC across the entire CDS.
gffread -x - -g | \
seqtk comp - | \
awk -v OFS="\t" '{ print $1, "0", $2, ($4 + $5) / $2 }'
1394 days ago
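The awk step above divides the C and G counts (columns 4 and 5 of the seqtk comp table) by the sequence length (column 2). As a rough illustration of the same arithmetic without gffread or seqtk, the sketch below computes the GC fraction per record directly from a FASTA file with awk alone. It assumes the CDS sequences have already been extracted; the file cds.fa and its two records are invented for the example.

```shell
# Hypothetical CDS FASTA in a scratch directory (records are made up).
workdir=$(mktemp -d)
cds="$workdir/cds.fa"
printf '>tx1\nATGC\nGGCC\n>tx2\nATATATAT\n' > "$cds"

# Print: name, 0, length, GC fraction (same column layout as the pipeline above).
gc_table=$(awk -v OFS="\t" '
  function emit() { if (name != "") print name, "0", len, (len ? gc / len : 0) }
  /^>/ { emit(); name = substr($1, 2); len = 0; gc = 0; next }   # new record
  {
    seq = toupper($0)
    len += length(seq)             # count bases before gsub strips any
    gc  += gsub(/[GC]/, "", seq)   # gsub returns the number of G/C removed
  }
  END { emit() }                   # flush the last record
' "$cds")
echo "$gc_table"
```

For the toy input this prints tx1 with length 8 and GC 0.75, and tx2 with length 8 and GC 0.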
Bash script to get exon fragments from genome files!
#Exons are already defined in the GTF file, so we simply need to print lines that are marked exonic.
gunzip -c genome_file.gtf.gz | awk 'BEGIN{OFS="\t";} $3=="exon" {print $1,$4-1,$5}' | bedtools sort | bedtools merge -i - | gzip > my_exon.bed.gz
1366 days ago
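To make the coordinate handling concrete: GTF is 1-based and inclusive, BED is 0-based and half-open, which is why the awk step prints $4-1 and leaves $5 unchanged. The sketch below runs that same awk filter on a toy GTF and then merges overlapping exons. Note that the merge here is a plain sort plus awk stand-in for bedtools sort and bedtools merge, used only so the example runs without bedtools; the toy GTF records are invented.

```shell
# Toy GTF (tab-separated, 1-based inclusive coordinates); values are made up.
workdir=$(mktemp -d)
gtf="$workdir/toy.gtf"
printf '1\tsrc\tgene\t100\t500\t.\t+\t.\tgene_id "g1";\n'  > "$gtf"
printf '1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n' >> "$gtf"
printf '1\tsrc\texon\t150\t300\t.\t+\t.\tgene_id "g1";\n' >> "$gtf"
printf '1\tsrc\texon\t400\t500\t.\t+\t.\tgene_id "g1";\n' >> "$gtf"

# 1) keep exon lines and convert to BED, 2) sort, 3) merge overlapping intervals.
merged=$(awk 'BEGIN{FS=OFS="\t"} $3=="exon" {print $1,$4-1,$5}' "$gtf" |
  sort -k1,1 -k2,2n |
  awk 'BEGIN{FS=OFS="\t"}
       NR==1 {c=$1; s=$2+0; e=$3+0; next}
       $1==c && $2+0<=e {if ($3+0>e) e=$3+0; next}   # overlaps or touches: extend
       {print c,s,e; c=$1; s=$2+0; e=$3+0}           # gap: emit and start anew
       END {if (NR) print c,s,e}')
echo "$merged"
```

The two overlapping exons (100-200 and 150-300) collapse into one BED interval 99-300, and the third exon stays separate as 399-500.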
Bash script to get intergenic regions from genome files!
#For the intergenic region, we will require the size of the chromosomes. wget http:/...| sed 's/^chr//' | sed 's/Cp/Pt/' > tmp mv tmp xxx.chrom.sizes gunzip -c genome_file.gtf.gz | awk 'BEGIN{OFS...
1366 days ago
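The rest of this snippet is truncated, so the following is not the original script. It is only a sketch of why chromosome sizes are needed: intergenic space is the complement of the genic intervals, and the last gap on each chromosome can only be closed if its length is known. The sketch assumes a two-column sizes file and a sorted BED file of genic intervals; both files and their contents are invented for illustration.

```shell
# Hypothetical inputs in a scratch directory (all values are made up).
workdir=$(mktemp -d)
sizes="$workdir/toy.chrom.sizes"
genes="$workdir/genes.bed"
printf '1\t1000\n2\t500\n' > "$sizes"            # chromosome name, length
printf '1\t99\t300\n1\t399\t500\n' > "$genes"    # sorted genic intervals (BED)

intergenic=$(awk 'BEGIN{FS=OFS="\t"}
  NR==FNR { size[$1]=$2+0; order[++n]=$1; next }   # first file: sizes
  {
    if (!($1 in pos)) pos[$1]=0                    # start of chromosome
    if ($2+0 > pos[$1]) print $1, pos[$1], $2      # gap before this interval
    if ($3+0 > pos[$1]) pos[$1]=$3+0               # advance past the interval
  }
  END {
    for (i=1; i<=n; i++) {                         # close the tail of each chromosome
      c=order[i]; p=(c in pos) ? pos[c] : 0
      if (p < size[c]) print c, p, size[c]
    }
  }' "$sizes" "$genes" | sort -k1,1 -k2,2n)
echo "$intergenic"
```

Chromosome 2 has no genic intervals in the toy data, so it is reported as a single intergenic interval spanning its full length.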
Create 10,000 random SNPs in a genome!
(base) ➜ dupStudy git:(master) ✗ perl ../simuG.pl -refseq SGDref.R64-2-1.dups.fa -snp_count 10000 -prefix simuSNP...ts introduced during simulation: simuSNP.refseq2simseq.map.txt Generating refere...
1212 days ago