<![CDATA[BOL: All]]>

https://bioinformaticsonline.com/snippets?offset=90

https://bioinformaticsonline.com/snippets/view/43680/update-the-linux-os
Mon, 27 Dec 2021 06:35:22 -0600
<![CDATA[Update the Linux OS !]]>
#To update the linux OS -- run the following
sudo -- sh -c 'apt-get update; apt-get upgrade -y; apt-get dist-upgrade -y; apt-get autoremove -y; apt-get autoclean -y'
#OR
sudo apt-get update && sudo apt-get upgrade
]]>
Abhi

https://bioinformaticsonline.com/snippets/view/43678/update-conda-version
Sat, 25 Dec 2021 01:55:08 -0600
<![CDATA[Update conda version !]]>
Lenovo-ideapad-320-15ISK:~/VANSH$ conda update -n base conda
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/Jit/anaconda3

  added / updated specs:
    - conda

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    backports.functools_lru_cache-1.6.4|     pyhd3eb1b0_0           9 KB
    conda-4.11.0               |   py38h06a4308_0        14.4 MB
    conda-package-handling-1.7.3|   py38h27cfd23_1         884 KB
    xmltodict-0.12.0           |     pyhd3eb1b0_0          13 KB
    ------------------------------------------------------------
                                           Total:        15.3 MB

The following packages will be UPDATED:

  backports.functoo~                     1.6.1-pyhd3eb1b0_0 --> 1.6.4-pyhd3eb1b0_0
  conda              conda-forge::conda-4.10.3-py38h578d9b~ --> pkgs/main::conda-4.11.0-py38h06a4308_0
  conda-package-han~                   1.7.2-py38h03888b9_0 --> 1.7.3-py38h27cfd23_1

The following packages will be DOWNGRADED:

  xmltodict                                     0.12.0-py_0 --> 0.12.0-pyhd3eb1b0_0

Proceed ([y]/n)?
y

Downloading and Extracting Packages
conda-package-handli | 884 KB    | ######################################## | 100%
conda-4.11.0         | 14.4 MB   | ######################################## | 100%
backports.functools_ | 9 KB      | ######################################## | 100%
xmltodict-0.12.0     | 13 KB     | ######################################## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
]]>
Jit

https://bioinformaticsonline.com/snippets/view/43669/bowtie2-mapping
Mon, 20 Dec 2021 05:46:02 -0600
<![CDATA[Bowtie2 Mapping !]]>
bowtie2-build toy_dataset_contig_for_mapping.fasta toy_dataset_contig_for_mapping.btindex
bowtie2 -x toy_dataset_contig_for_mapping.btindex -f -U toy_dataset_reads_for_mapping.fasta -S toy_dataset_mapped_species1.sam
#samtools sort reads the SAM file written by bowtie2 and writes a sorted BAM
samtools sort toy_dataset_mapped_species1.sam -o toy_dataset_mapped_species1_sorted.bam
samtools index toy_dataset_mapped_species1_sorted.bam
]]>
Jit

https://bioinformaticsonline.com/snippets/view/43636/extract-fasta-header-with-ids
Fri, 10 Dec 2021 09:58:58 -0600
<![CDATA[Extract fasta header with ids !]]>
#Extract all the FASTA header names with certain IDs
kraken --db ../../../../DATABASE/minikraken_20171019_8GB.tgz out.fa
more out.fa_class.txt | grep "227859" | awk '{print $2}' > all_real_ids.txt
minimap2 -t 36 -k19 -w5 -A1 -B2 -O3,13 -E2,1 -s200 -z200 -N50 --min-occ-floor=100 finaal_output.fasta finaal_output.fasta > finaal_self_align.paf
]]>
Surabhi Chaudhary

https://bioinformaticsonline.com/snippets/view/43617/omicron-sequences-accession-number
Thu, 02 Dec 2021 06:39:30 -0600
<![CDATA[Omicron Sequences accession number !]]>
EPI_ISL_6647956
EPI_ISL_6647957
EPI_ISL_6647958
EPI_ISL_6647959
EPI_ISL_6647960
EPI_ISL_6647962
EPI_ISL_6647961
Search the IDs in https://www.epicov.org/epi3/frontend
]]>
Surabhi Chaudhary

https://bioinformaticsonline.com/snippets/view/43613/run-pango-on-your-multifasta-file
Tue, 30 Nov 2021 01:41:44 -0600
<![CDATA[Run Pango on your multifasta file !]]>
#More at https://cov-lineages.org/resources/pangolin/usage.html
(base) [jnarayan@hn1 FASTA]$ conda activate pangolin
(pangolin) [jnarayan@hn1 FASTA]$ ls
Input_for_Cova_all_samples_combined.fa
(pangolin) [jnarayan@hn1 FASTA]$ pangolin .DS_Store Input_for_Cova_all_samples_combined.fa
(pangolin) [jnarayan@hn1 FASTA]$ pangolin --update
pangolin already latest release (v3.1.16)
pangolearn updated to 2021-11-18
constellations updated to v0.0.24
scorpio already latest release (v0.3.14)
pango-designation updated to v1.2.103
(pangolin) [jnarayan@hn1 FASTA]$ pangolin .DS_Store Input_for_Cova_all_samples_combined.fa
(pangolin) [jnarayan@hn1 FASTA]$ pangolin Input_for_Cova_all_samples_combined.fa
All dependencies satisfied.
The query file is:/home/jnarayan/RF_DATA/FASTA/Input_for_Cova_all_samples_combined.fa
** Running sequence QC **
Number of sequences detected: 320
Total passing QC: 293
Data files found:
Trained model: /home/jnarayan/anaconda3/envs/pangolin/lib/python3.8/site-packages/pangoLEARN/data/decisionTree_v1.joblib
Header file: /home/jnarayan/anaconda3/envs/pangolin/lib/python3.8/site-packages/pangoLEARN/data/decisionTreeHeaders_v1.joblib
Designated hash: /home/jnarayan/anaconda3/envs/pangolin/lib/python3.8/site-packages/pangoLEARN/data/lineages.hash.csv
Job stats:
job                     count    min threads    max threads
--------------------  -------  -------------  -------------
add_failed_seqs             1              1              1
align_to_reference          1              1              1
all                         1              1              1
generate_report             1              1              1
get_constellations          1              1              1
hash_sequence_assign        1              1              1
pangolearn                  1              1              1
scorpio                     1              1              1
total                       8              1              1

loading model 11/30/2021, 13:10:05
/home/jnarayan/anaconda3/envs/pangolin/lib/python3.8/site-packages/sklearn/base.py:324: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.24.2 when using version 1.0.1. This might lead to breaking code or invalid results. Use at your own risk.
For more info please refer to: https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
  warnings.warn(
processing block of 293 sequences 11/30/2021, 13:10:08
/home/jnarayan/anaconda3/envs/pangolin/lib/python3.8/site-packages/sklearn/base.py:438: UserWarning: X has feature names, but DecisionTreeClassifier was fitted without feature names
  warnings.warn(
complete 11/30/2021, 13:10:09
Output file written to: /home/jnarayan/RF_DATA/FASTA/lineage_report.csv
(pangolin) [jnarayan@hn1 FASTA]$ ls
Input_for_Cova_all_samples_combined.fa lineage_report.csv
]]>
Jit

https://bioinformaticsonline.com/snippets/view/43612/extract-fasta-sequences-with-ids-in-another-file
Sun, 28 Nov 2021 03:46:22 -0600
<![CDATA[Extract fasta sequences with ids in another file !]]>
#IDs are in test.txt, one ID per line
#Sequences are in test.fa
grep -w -A 2 -f test.txt test.fa --no-group-separator
#seqtk
seqtk subseq test.fa test.txt
#faSomeRecords
faSomeRecords in.fa listFile out.fa
#seqkit
seqkit grep -n -f list.txt sequences.fas > newfile2.fas
]]>
Surabhi Chaudhary

https://bioinformaticsonline.com/snippets/view/43603/extract-the-values-using-ids
Mon, 22 Nov 2021 20:07:29 -0600
<![CDATA[Extract the values using ids !]]>
#Awk script
awk 'NR==FNR{tgts[$1]; next} $1 in tgts' file1 file2
Example:
$ cat file1
11002
10995
48981
79600
$ cat file2
10993 item 0
11002 item 6
10995 item 7
79600 item 7
439481 item 5
272557 item 7
224325 item 7
84156 item 6
572546 item 7
693661 item 7
$ awk 'NR==FNR{tgts[$1]; next} $1 in tgts' file1 file2
11002 item 6
10995 item 7
79600 item 7
]]>
Surabhi Chaudhary

https://bioinformaticsonline.com/snippets/view/43602/split-the-string-with-underscore-and-store-values-in-array-with-awk
Mon, 22 Nov 2021 19:02:31 -0600
<![CDATA[Split the string with underscore and store values in array with AWK !]]>
more enriched_ids | grep "WP_" | awk '{split($2,a,"_"); print a[4]"_"a[5]}'
#Other extraction
more enriched_ids | grep "WP_" | awk '{split($2,a,"_"); print a[4]"_"a[5]}' > enriched_ids_list
awk 'NR==FNR{tgts[$1]; next} $1 in tgts' enriched_ids_list result/GO.out > enriched_GO.out.xls
]]>
Surabhi Chaudhary

https://bioinformaticsonline.com/snippets/view/43601/awk-build-in-commands
Mon, 22 Nov 2021 18:50:44 -0600
<![CDATA[Awk build in commands !]]>
Built-In Variables In Awk
Awk’s built-in variables include the field variables $1, $2, $3, and so on ($0 is the entire line), which hold the individual words or pieces, called fields, that a line of text is split into.
#NR: NR holds a running count of the input records read so far. Records are usually lines. Awk runs the pattern/action statements once for each record in a file.
#NF: NF holds the number of fields in the current input record.
#FS: FS holds the field separator used to divide the input line into fields. The default is white space, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
#RS: RS holds the current record separator. Since, by default, an input line is the input record, the default record separator is a newline.
#OFS: OFS holds the output field separator, which separates the fields when Awk prints them. The default is a single space. Whenever print has several arguments separated by commas, it prints the value of OFS between them.
#ORS: ORS holds the output record separator, which terminates each output record when Awk prints it. The default is a newline character. print automatically appends the contents of ORS to whatever it is given to print.
]]>
Surabhi Chaudhary
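
The built-in variables described above can be tried out with a few one-liners. This is only a minimal sketch: the colon-separated sample records and the shell variable names (counts, joined, records) are made up for illustration and are not part of the original snippet.

```shell
# Hypothetical sample input: three colon-separated records (illustration only).
sample='alice:30:NY
bob:25:LA
carol:41:SF'

# FS is set in BEGIN so each line is split on ':' instead of white space.
# NR is the number of the current record, NF the number of fields in it.
counts=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = ":" } { print NR, NF }')

# OFS is written between the comma-separated arguments of print,
# and ORS is written after each print in place of the default newline.
joined=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = ":"; OFS = "\t"; ORS = ";" } { print $1, $3 }')

# RS changes what counts as a record: here every ';' ends a record.
records=$(printf 'a b;c d e;f' | awk 'BEGIN { RS = ";" } { print NR ": " NF }')

echo "$counts"
echo "$joined"
echo "$records"
```

Setting the separators in BEGIN matters because Awk splits each record as it is read, so a change made inside the main action would only take effect from the next record onward.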