Generates a genome coverage plot with R
library(CoverageView)
##draw a coverage plot for a test case BAM file
#get a BAM test file
treatBAMfile...
2065 days ago
Generate simulated polyploid genome !
#Generate 3% divergence
msbar -point 4 -count 16558 toy.fasta > toymutated3percent.fasta
#Cat both files
cat toy.fasta toymutated3percent.fasta > toyheterozygo...
2063 days ago
Setting up falconUnzip conda environments for genome assembly !
➜ Analysis_Results conda create -n denovo_asm
Solving environment: done
## Package Plan ##
environment location: /home/urbe/anaconda3/envs/denovo_asm
Procee...
2000 days ago
Perl script to run in parallel !
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkMana...
my ($sequence_data_ref) = parse_genome_files($ARGV[0]);
my %genome=%{$sequence_data_ref};
my...
foreach my $chr_set (keys %genome) { $count++...
FILE or die $!; }
sub parse_genome_files...
1696 days ago
Find and replace in multi-FASTA or FASTA headers with a Perl one-liner
You have a FASTA file and want to replace every "|" with "_":
perl -i -p -e "s/\|/_/g" genome.fasta
-i = in-place editing
-p = loop over lines and print each line (after processing)
-e = command-line script
1625 days ago
Samtools commands for bioinformaticians !
## count mapped reads
samtools view -c -F 260 mapping_file.bam
### converting sa...42 -c sal_sej.bam
### sorting bam file by genome position
samtools sort sal_s...s index sal_sej_sorted.bam.bam
### identifying genome varian...
1613 days ago
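The `-F 260` filter in the read-counting command is a bitmask. In the SAM flag definitions, 4 marks an unmapped read and 256 a secondary alignment, so 260 excludes both. A small sketch of that arithmetic in plain shell follows. The helper name `keep_read` and the sample flag values are illustrative assumptions, not part of samtools.

```shell
UNMAPPED=4      # SAM flag 0x4: read is unmapped
SECONDARY=256   # SAM flag 0x100: secondary alignment

# Combine the two bits into the mask passed to "samtools view -F".
exclude_mask=$((UNMAPPED | SECONDARY))
echo "exclude mask: $exclude_mask"

# Hypothetical helper: -F keeps a read only when none of the mask bits are set.
keep_read() {
  [ $(( $1 & exclude_mask )) -eq 0 ]
}

# Try a few example flag values.
for flag in 0 16 99 4 256 355; do
  if keep_read "$flag"; then
    echo "flag $flag: counted"
  else
    echo "flag $flag: excluded"
  fi
done
```

Flags 0, 16 and 99 pass the filter. Flags 4, 256 and 355 (99 plus the secondary bit) are excluded.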
Bash script to download SRA file !
#We can use the sratoolkit to directly pull the sequence data (in paired FASTQ format) from the archive. fastq-dump is in the SRA toolkit. It allows directly downloading d...
1569 days ago
Bash script to align short reads against a reference genome !
bwa mem -t 40 -R '@RG\tID:K12\tSM:K12' \
  E.coli_K12_MG1655.fa SRR1770413_1.fastq.gz SRR1770413_2.fastq.gz \
  | samtools view -b - > SRR1770413.raw.bam
sambamba so...
1569 days ago
To convert just one specific read group to fastq
# Stop script on error.
set -uex
# The SRR BioProject number for the sequencing data.
PROJECT=PRJNA257197
# The number of datasets to subselect from the project....
1552 days ago