Command line to download the BLAST protein database
# Download all available nr (protein) database volumes as a single file
# Database location: NCBI, where all databases...ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.*.tar.gz'
# cat them into one
cat nr.*.tar.g...
957 days ago
Extract values using IDs
# Awk script
awk 'NR==FNR{tgts[$1]; next} $1 in tgts' file1 file2
Look:
$ cat file1
11002 10995 48981 79600
$ cat file2
10993 item 0 110...
908 days ago
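The two-file idiom above can be tried on a small example. This is only a sketch: the file names (demo_ids.txt, demo_table.txt, demo_matched.txt) and the table rows are made up for illustration and are not the original poster's data.

```shell
# Hypothetical sample data (not from the original post).
printf '%s\n' 11002 10995 48981 79600 > demo_ids.txt
printf '%s\n' '10993 itemA 0' '11002 itemB 1' '48981 itemC 2' '50000 itemD 3' > demo_table.txt

# While reading the first file (NR==FNR), store each ID as an array key.
# While reading the second file, print rows whose first field is a stored key.
awk 'NR==FNR { tgts[$1]; next } $1 in tgts' demo_ids.txt demo_table.txt > demo_matched.txt

cat demo_matched.txt
# 11002 itemB 1
# 48981 itemC 2
```

Rows come out in the order of the second file, not the order of the ID list. If the ID file is empty, NR==FNR stays true for the second file as well, so nothing is printed.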
Extract all FASTA sequences except those with the given IDs
awk 'BEGIN{while((getline < "...") > 0)l[">"$1]=1}/^>/{f=!l[$1]}f...ids.fa omi_single_id.txt > omi_single_id.fa
# cat omi and all the rest
cat omi_single_id.fa filtered_wit...mated.fa > omi_kmer19_formated_numbered.fa
# cat all *.rd files
cat *.rd > all...
842 days ago
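Since the preview above is cut off, here is a self-contained sketch of the same exclusion idea. It passes the ID list as the first input file instead of reading it with getline, which is a substitution and not necessarily what the original post did. The file names (demo_all.fa, demo_exclude_ids.txt, demo_filtered.fa) and sequences are hypothetical.

```shell
# Hypothetical FASTA file with three records (seq1 spans two sequence lines).
printf '%s\n' '>seq1 desc' 'ACGT' 'ACGA' '>seq2' 'GGCC' '>seq3' 'TTAA' > demo_all.fa
# Hypothetical list of IDs to exclude, one per line, without the leading ">".
printf '%s\n' 'seq2' > demo_exclude_ids.txt

# First file: remember each ID with ">" prepended so it matches a header's $1.
# Second file: on every header line decide whether to keep the record,
# then print each line (header or sequence) while the flag is set.
awk 'NR==FNR { drop[">" $1]; next }
     /^>/     { keep = !($1 in drop) }
     keep' demo_exclude_ids.txt demo_all.fa > demo_filtered.fa

cat demo_filtered.fa
# >seq1 desc
# ACGT
# ACGA
# >seq3
# TTAA
```

Because the flag only changes on header lines, multi-line records are kept or dropped as a whole. Matching is on the first whitespace-delimited word of the header, so descriptions after the ID are ignored.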
Bash script to extract sequences by IDs
Use a Perl one-liner, grep and seqtk...es:
# Create test input:
cat > in.fasta BGI_novel_T016313...3g025570.2.1
TTCAAGTGTTAGTTTCACATCAT
>BGI_novel_T018109 Solyc0...817 BGI_novel_G001220
GCCCAAGTCATAGGTAGTGCCTG
>BGI_novel_T0161...> out.fasta
cat out.fasta
Output:
>BGI_no...
837 days ago
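The Perl and seqtk parts of the answer above are truncated, so the sketch below covers only a grep-based variant, and it assumes every record is exactly two lines (one header, one sequence line). The file names (demo_in.fasta, demo_want_ids.txt, demo_out.fasta) and records are hypothetical.

```shell
# Hypothetical two-line-per-record FASTA input.
printf '%s\n' '>seqA first' 'ACGT' '>seqB second' 'GGCC' '>seqC third' 'TTAA' > demo_in.fasta
# Hypothetical list of wanted IDs, one per line.
printf '%s\n' 'seqA' 'seqC' > demo_want_ids.txt

# -F: treat IDs as fixed strings; -w: match whole words only;
# -f: read patterns from a file; -A1: also print the line after each match.
# The second grep removes the "--" separators that -A inserts between groups.
grep -A1 -w -F -f demo_want_ids.txt demo_in.fasta | grep -v '^--$' > demo_out.fasta

cat demo_out.fasta
# >seqA first
# ACGT
# >seqC third
# TTAA
```

This matches an ID anywhere on a line, so a header that mentions a wanted ID in its description would also be selected. For wrapped (multi-line) sequences, an awk filter or a dedicated tool such as seqtk is the safer choice.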