Finding Kmers from fasta sequence file
Save the sequence in sample.fa:

    >test
    TAATGCCATGGGATGTT

Then count and dump the 3-mers:

    jellyfish count -m 3 -s 100000 sample.fa -o sample.jf
    jellyfish dump -c sample.jf

It returns:

    TGT 1
    GAT 1
    GGG 1
    GGA 1
    CAT 1
    TGC 1
    TAA 1
    GCC 1
    CCA 1
    GTT 1
    TGG 1
    ATG 3
    AAT 1

2027 days ago
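As a cross-check on the jellyfish output, the same 3-mer counts can be reproduced in plain Python. This is only a minimal sketch and not how jellyfish works internally; the helper name `count_kmers` is made up for illustration. It counts k-mers on the forward strand only, which matches the command above because the `-C` (canonical) flag is not used.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer on the forward strand (hypothetical helper)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


counts = count_kmers("TAATGCCATGGGATGTT", 3)
# Print one "KMER COUNT" pair per line, sorted by k-mer for readability.
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

The order of lines differs from `jellyfish dump`, but the counts agree: ATG occurs three times and the other twelve 3-mers once each.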
Perl script to split fasta sequence and create overlaps
    #!/usr/bin/perl
    use strict;
    use warnings;
    my $len = 5000;
    my $over = 200;
    my $seq_id = $ARGV[0];
    my $seqFile = $ARGV[1];
    my $seq;
    open(my $fh, "...

2027 days ago
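The preview is cut off before the splitting logic, so the following is only a sketch of one common way to do it, not the author's script. It assumes each chunk is `length` bases long and that consecutive chunks share `overlap` bases, so the window advances by `length - overlap`. The function name and parameters are hypothetical; the defaults mirror `$len = 5000` and `$over = 200` from the preview.

```python
def split_with_overlap(sequence, length=5000, overlap=200):
    """Yield (start, chunk) pairs; consecutive chunks share `overlap` bases.

    Assumption: the window advances by length - overlap each step.
    """
    if overlap >= length:
        raise ValueError("overlap must be smaller than length")
    if not sequence:
        return
    step = length - overlap
    start = 0
    while True:
        yield start, sequence[start:start + length]
        # Stop once the current window reaches the end of the sequence.
        if start + length >= len(sequence):
            break
        start += step


# Small demonstration with a short sequence and tiny window sizes.
for start, chunk in split_with_overlap("ACGTACGTAC", length=4, overlap=1):
    print(start, chunk)
```

The last chunk may be shorter than `length` when the sequence does not divide evenly.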
Perl script to run in parallel!
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Parallel::ForkManager;
    use Bio::SeqIO;
    my ($sequence_data_ref) = parse_genome_files($ARGV[0]);
    my %genome = %{$sequence_data_ref};
    my $n_processes...
    my $file=shift;
    my (%sequence_...->length;
    my $GCcount = $sequence =...

1742 days ago
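Most of the script is elided, so this is a rough Python analogue of the visible idea (compute a GC count per sequence across several workers), not a port of the Perl code. It uses the standard-library `concurrent.futures` module in place of `Parallel::ForkManager`. The names `gc_count`, `gc_counts_parallel`, and the `executor_cls` parameter are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def gc_count(sequence):
    """Number of G and C bases in one sequence (case-insensitive)."""
    upper = sequence.upper()
    return upper.count("G") + upper.count("C")


def gc_counts_parallel(genome, n_workers=4, executor_cls=ProcessPoolExecutor):
    """Map gc_count over a {name: sequence} dict using a pool of workers.

    Pass executor_cls=ThreadPoolExecutor to use threads instead of processes,
    for example where starting child processes is not possible.
    """
    names = list(genome)
    with executor_cls(max_workers=n_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        counts = list(pool.map(gc_count, (genome[name] for name in names)))
    return dict(zip(names, counts))
```

When using the default process pool from a script, call `gc_counts_parallel` under an `if __name__ == "__main__":` guard so that worker processes do not re-run the top-level code.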
Samtools commands for bioinformaticians!
    ## count mapped reads
    samtools view -c -F 260 mapping_file.bam
    ### converting sam file into fasta
    samtools fa...input_data/ref.fasta \
        vars.vcf \
        -o vars_indels_realigned.vcf
    ### show alignment...

1658 days ago
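The value 260 in `-F 260` is the sum of two SAM flag bits: 4 (read unmapped) and 256 (secondary alignment). `-F` drops any read with at least one of those bits set, so the count covers primary and supplementary alignments of mapped reads. The sketch below shows the same bit test in Python; the function name `passes_exclude_filter` and the example flag values are illustrative only.

```python
UNMAPPED = 0x4      # SAM flag bit 4: read is unmapped
SECONDARY = 0x100   # SAM flag bit 256: secondary alignment
EXCLUDE = UNMAPPED | SECONDARY  # 4 + 256 = 260


def passes_exclude_filter(flag, exclude=EXCLUDE):
    """Mimic `samtools view -F`: keep a read only if no excluded bit is set."""
    return (flag & exclude) == 0


# Illustrative flag values: 0 and 16 are mapped primary reads, 99 is a
# properly paired mapped read, 4 is unmapped, 256 is a secondary alignment.
example_flags = [0, 16, 99, 4, 256]
kept = [flag for flag in example_flags if passes_exclude_filter(flag)]
print(len(kept), kept)
```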
Perl subroutine to create k-mers!
    sub k_mers {
        my ($sequence, $k) = @_;
        my $len = length($sequence);
        my @result = ();
        for (my $i = 0; $i...

1623 days ago
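The preview stops inside the loop header, so the rest of the subroutine is not shown. A Python sketch of what such a function typically does, assuming it returns every overlapping substring of length `k` in order of position, could look like this. The assumption about the loop bound (`len - k + 1` windows) is mine, not taken from the source.

```python
def k_mers(sequence, k):
    """Return all overlapping k-mers of `sequence`, left to right.

    Assumption: there are len(sequence) - k + 1 windows; a sequence
    shorter than k yields an empty list.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


print(k_mers("ATGCA", 3))
```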
Bash script to download an SRA file!
    #We can use the sratoolkit to directly pull the sequence data (in paired FASTQ format) from the archive. fastq-dump is in the SRA toolkit. It allows directly downloading data from...

1615 days ago
Bash script to align short reads against a reference genome!
    bwa mem -t 40 -R '@RG\tID:K12\tSM:K12' \
        E.coli_K12_MG1655.fa SRR1770413_1.fastq.gz...R1770413.bam
    ##Breaking it down by line:
    #alignment with bwa: bwa mem -t $threads...(bwa finds the indexes using this) and the input alignment...

1615 days ago
Pack a Perl program with its dependencies on Ubuntu!
    #Follow steps to create your own executable
    ./web
    jit@jit-HP-Pro-3...

1567 days ago
Commands to install conda on Ubuntu!
    jit@jit-HP-Pro-3335-MT:~/Downloads$ mkdir jittmp...cryptographic functions with asymmetric algorithm...nd RIPEMD160), and various encryption algorithms (AES, DES, RSA, ElGamal, etc....m-1.0.2-py37hd81dba3_0 ...
    installing: multipledispatch-0.6.0-py37_0 ...
    insta...

1567 days ago
Python script to check sequence length in multifasta file
    #!/usr/bin/python
    from Bio import SeqIO
    import sys
    cmdargs = str(sys.argv)
    for seq_record in SeqIO.parse(str(sys.argv[1]), "fasta"):
        output_line = '%s\t%i' % \
            (seq_record.id, len(seq_record))
        print(output_line)

1271 days ago
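Where Biopython is not installed, the same per-record lengths can be computed with the standard library alone. This is a simplified sketch rather than a replacement for `SeqIO.parse`: it assumes a well-formed multi-FASTA input, and the function name `fasta_lengths` is hypothetical. As in Biopython, the record ID is taken to be the header text up to the first whitespace.

```python
def fasta_lengths(lines):
    """Yield (record_id, length) for each record in an iterable of FASTA lines."""
    record_id, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, length
            header = line[1:].split()
            record_id = header[0] if header else ""
            length = 0
        elif record_id is not None:
            # Sequence lines may be wrapped; add up every line of the record.
            length += len(line)
    if record_id is not None:
        yield record_id, length


example = [">seq1 first record", "ACGT", "AC", "", ">seq2", "GG"]
for record_id, length in fasta_lengths(example):
    print("%s\t%i" % (record_id, length))
```

An open file handle can be passed directly, for example `fasta_lengths(open(path))`, because the function only iterates over lines.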