<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44640?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/44640?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43620/ncbi-datasets-cli-quickstart-command-line-tools</guid>
	<pubDate>Tue, 07 Dec 2021 02:51:26 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43620/ncbi-datasets-cli-quickstart-command-line-tools</link>
	<title><![CDATA[ncbi-datasets-cli -- Quickstart: command line tools !]]></title>
	<description><![CDATA[<p><span>Install and use the NCBI Datasets command line tools</span></p>
<p>The NCBI Datasets command line tools are&nbsp;<a href="https://www.ncbi.nlm.nih.gov/datasets/docs/v1/reference-docs/command-line/datasets/">datasets</a>&nbsp;and&nbsp;<a href="https://www.ncbi.nlm.nih.gov/datasets/docs/v1/reference-docs/command-line/dataformat/">dataformat</a>.</p>
<p>Use&nbsp;<span>datasets</span>&nbsp;to download biological sequence data across all domains of life from NCBI.</p>
<p>Use&nbsp;<span>dataformat</span>&nbsp;to convert metadata from&nbsp;<a href="https://jsonlines.org/" target="_blank">JSON Lines</a>&nbsp;format to other formats.</p>
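<p>To make the JSON Lines conversion concrete, here is a minimal Python sketch of the general idea: read one JSON object per line and write selected fields as tab-separated values. It is not how <span>dataformat</span> itself is implemented. The function name <span>jsonl_to_tsv</span>, the field names, and the sample records are all hypothetical; real NCBI Datasets reports use their own field names.</p>

```python
import csv
import io
import json


def jsonl_to_tsv(jsonl_text, fields):
    """Convert JSON Lines text to TSV, keeping only the requested top-level fields."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)  # header row
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # JSON Lines files may contain blank lines; skip them
        record = json.loads(line)  # each non-blank line is one JSON object
        writer.writerow([record.get(name, "") for name in fields])
    return out.getvalue()


# Hypothetical records: the keys below are placeholders, not real report fields.
sample = (
    '{"accession": "ACC_0001", "organism": "Example species A"}\n'
    '{"accession": "ACC_0002", "organism": "Example species B"}\n'
)
tsv = jsonl_to_tsv(sample, ["accession", "organism"])
print(tsv)
```

<p>For real reports, <span>dataformat</span> is the tool built for this job; a hand-rolled converter like the one above is mainly useful for quickly inspecting flat records.</p>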
<p><strong>Conda download:</strong></p>
<p>https://anaconda.org/conda-forge/ncbi-datasets-cli</p>
<p><strong>Bulk download:</strong></p>
<p>&nbsp;https://www.ncbi.nlm.nih.gov/datasets/builder/?tax_id=29979</p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/datasets/docs/v1/quickstarts/command-line-tools/" rel="nofollow">https://www.ncbi.nlm.nih.gov/datasets/docs/v1/quickstarts/command-line-tools/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33219/dbcan-a-web-server-and-database-for-automated-carbohydrate-active-enzyme-annotation</guid>
	<pubDate>Mon, 29 May 2017 05:39:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33219/dbcan-a-web-server-and-database-for-automated-carbohydrate-active-enzyme-annotation</link>
	<title><![CDATA[dbCAN: a web server and DataBase for automated Carbohydrate-active enzyme ANnotation]]></title>
	<description><![CDATA[<p><a href="http://csbl.bmb.uga.edu/dbCAN/index.php">dbCAN</a>&nbsp;is a web server and&nbsp;<span style="text-decoration: underline;">D</span>ata<span style="text-decoration: underline;">B</span>ase for&nbsp;<a href="http://csbl.bmb.uga.edu/dbCAN/annotate.php"><strong>automated&nbsp;<span style="text-decoration: underline;">C</span>arbohydrate-active enzyme&nbsp;<span style="text-decoration: underline;">AN</span>notation</strong></a>, funded by the&nbsp;<a href="http://bioenergycenter.org/">BioEnergy Science Center of the DOE</a>. Similar resources on the web include&nbsp;<a href="http://www.cazy.org/" target="_blank">CAZy database</a>&nbsp;and&nbsp;<a href="http://cricket.ornl.gov/cgi-bin/cat.cgi" target="_blank">CAT</a>. All data in dbCAN are generated based on the family classification from&nbsp;<a href="http://www.cazy.org/" target="_blank">CAZy database</a>&nbsp;while it has the following&nbsp;<strong><span style="text-decoration: underline;">unique features</span></strong>&nbsp;compared with CAZy database and CAT:</p>
<ul>
<li>dbCAN provides the capability of&nbsp;<a href="http://csbl.bmb.uga.edu/dbCAN/annotate.php">automated and comprehensive CAZyme annotation</a>&nbsp;of a given genome submitted by the user;</li>
<li>dbCAN provides an explicitly defined&nbsp;<span style="text-decoration: underline;">signature domain</span>&nbsp;for each and every CAZyme family along with its location in all the relevant full-length CAZyme proteins in all sequenced&nbsp;<a href="http://csbl.bmb.uga.edu/dbCAN/genome.php">genomes</a>;</li>
<li>dbCAN provides the most complete set of&nbsp;<span style="text-decoration: underline;">metagenomic CAZyme</span>&nbsp;genes published so far and represents the first step towards discovering novel CAZyme catalysts in metagenomes;</li>
<li>dbCAN provides a&nbsp;<span style="text-decoration: underline;">subfamily classification</span>&nbsp;of the existing CAZyme families based on sequence similarities;</li>
<li>dbCAN makes all pre-computed data freely available to the public, including sequence alignments,&nbsp;<a href="http://csbl.bmb.uga.edu/dbCAN/download/">hidden Markov models (HMMs)</a>&nbsp;and phylogenies of the signature domain regions in each and every CAZyme family and subfamily.</li>
</ul>
<p><a href="http://csbl.bmb.uga.edu/dbCAN/help.php">dbCAN</a>&nbsp;is updated regularly as the&nbsp;<a href="http://www.cazy.org/" target="_blank">CAZy database</a>&nbsp;creates new families based on the latest literature.</p><p>Address of the bookmark: <a href="http://csbl.bmb.uga.edu/dbCAN/index.php" rel="nofollow">http://csbl.bmb.uga.edu/dbCAN/index.php</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/41496/new-machine-learning-packages-in-r</guid>
	<pubDate>Fri, 27 Mar 2020 12:11:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/41496/new-machine-learning-packages-in-r</link>
	<title><![CDATA[New Machine Learning Packages in R]]></title>
	<description><![CDATA[<h3 id="machine-learning">Machine Learning</h3><p><a href="https://cran.r-project.org/package=autokeras">autokeras</a>&nbsp;v1.0.1: Implements an interface to&nbsp;<a href="https://autokeras.com/">AutoKeras</a>, an open source software library for automated machine learning. See the&nbsp;<a href="https://cran.r-project.org/web/packages/autokeras/readme/README.html">README</a>&nbsp;for an example.</p><p><a href="https://cran.r-project.org/package=MTPS">MTPS</a>&nbsp;v0.1.9: Implements functions to predict multiple outcomes simultaneously based on revised stacking algorithms, as described in&nbsp;<a href="https://doi.org/10.1093/bioinformatics/btz531">Xing et al. (2019)</a>. See the&nbsp;<a href="https://cran.r-project.org/web/packages/MTPS/vignettes/Guide.html">vignette</a>&nbsp;to get started.</p><p><a href="https://cran.r-project.org/package=quanteda.textmodels">quanteda.textmodels</a>&nbsp;v0.9.1: Implements methods for scaling models and classifiers based on sparse matrix objects representing textual data. It includes implementations of the&nbsp;<a href="https://doi.org/10.1017/S0003055403000698">Laver et al. (2003)</a>&nbsp;wordscores model, the&nbsp;<a href="https://arxiv.org/abs/1710.08963">Perry &amp; Benoit (2017)</a>&nbsp;class affinity scaling model, and the&nbsp;<a href="https://doi.org/10.1111/j.1540-5907.2008.00338.x">Slapin &amp; Proksch (2008)</a>&nbsp;wordfish model. See the&nbsp;<a href="https://cran.r-project.org/web/packages/quanteda.textmodels/vignettes/textmodel_performance.html">vignette</a>&nbsp;to get started.</p><p><a href="https://cran.r-project.org/package=SeqDetect">SeqDetect</a>&nbsp;v1.0.7: Implements the automaton model found in&nbsp;<a href="https://ieeexplore.ieee.org/document/8910574">Krleža, Vrdoljak &amp; Brčić (2019)</a>&nbsp;to detect and process sequences.
See the&nbsp;<a href="https://cran.r-project.org/web/packages/SeqDetect/vignettes/SequentialDetector.pdf">vignette</a>&nbsp;for examples and theory.</p><p><a href="https://cran.r-project.org/package=studyStrap">studyStrap</a>&nbsp;v1.0.0: Implements multi-study learning algorithms such as Merging, Study-Specific Ensembling (Trained-on-Observed-Studies Ensemble), the Study Strap, and the Covariate-Matched Study Strap, and offers over 20 similarity measures. See&nbsp;<a href="https://doi.org/10.1101/856385">Kishida et al. (2019)</a>&nbsp;for background and the&nbsp;<a href="https://cran.r-project.org/web/packages/studyStrap/vignettes/vignette.html">vignette</a>&nbsp;for how to use the package.</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35915/iupac-codes</guid>
	<pubDate>Tue, 13 Mar 2018 05:16:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35915/iupac-codes</link>
	<title><![CDATA[IUPAC codes]]></title>
	<description><![CDATA[<p>IUPAC codes</p><p>DNA:</p><table border="1" cellpadding="5">
<tbody>
<tr><td>Nucleotide code</td><td>Base</td></tr>
<tr><td>A</td><td>Adenine</td></tr>
<tr><td>C</td><td>Cytosine</td></tr>
<tr><td>G</td><td>Guanine</td></tr>
<tr><td>T (or U)</td><td>Thymine (or Uracil)</td></tr>
<tr><td>R</td><td>A or G</td></tr>
<tr><td>Y</td><td>C or T</td></tr>
<tr><td>S</td><td>G or C</td></tr>
<tr><td>W</td><td>A or T</td></tr>
<tr><td>K</td><td>G or T</td></tr>
<tr><td>M</td><td>A or C</td></tr>
<tr><td>B</td><td>C or G or T</td></tr>
<tr><td>D</td><td>A or G or T</td></tr>
<tr><td>H</td><td>A or C or T</td></tr>
<tr><td>V</td><td>A or C or G</td></tr>
<tr><td>N</td><td>Any base</td></tr>
<tr><td>. or -</td><td>Gap</td></tr>
</tbody>
</table><p>Protein:</p><table border="1" cellpadding="5">
<tbody>
<tr><td>Amino acid code</td><td>Three-letter code</td><td>Amino acid</td></tr>
<tr><td>A</td><td>Ala</td><td>Alanine</td></tr>
<tr><td>B</td><td>Asx</td><td>Aspartic acid or Asparagine</td></tr>
<tr><td>C</td><td>Cys</td><td>Cysteine</td></tr>
<tr><td>D</td><td>Asp</td><td>Aspartic acid</td></tr>
<tr><td>E</td><td>Glu</td><td>Glutamic acid</td></tr>
<tr><td>F</td><td>Phe</td><td>Phenylalanine</td></tr>
<tr><td>G</td><td>Gly</td><td>Glycine</td></tr>
<tr><td>H</td><td>His</td><td>Histidine</td></tr>
<tr><td>I</td><td>Ile</td><td>Isoleucine</td></tr>
<tr><td>K</td><td>Lys</td><td>Lysine</td></tr>
<tr><td>L</td><td>Leu</td><td>Leucine</td></tr>
<tr><td>M</td><td>Met</td><td>Methionine</td></tr>
<tr><td>N</td><td>Asn</td><td>Asparagine</td></tr>
<tr><td>P</td><td>Pro</td><td>Proline</td></tr>
<tr><td>Q</td><td>Gln</td><td>Glutamine</td></tr>
<tr><td>R</td><td>Arg</td><td>Arginine</td></tr>
<tr><td>S</td><td>Ser</td><td>Serine</td></tr>
<tr><td>T</td><td>Thr</td><td>Threonine</td></tr>
<tr><td>V</td><td>Val</td><td>Valine</td></tr>
<tr><td>W</td><td>Trp</td><td>Tryptophan</td></tr>
<tr><td>X</td><td>Xaa</td><td>Any amino acid</td></tr>
<tr><td>Y</td><td>Tyr</td><td>Tyrosine</td></tr>
<tr><td>Z</td><td>Glx</td><td>Glutamine or Glutamic acid</td></tr>
</tbody>
</table>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37788/s-plot2-creates-an-interactive-two-dimensional-heatmap-of-sequences</guid>
	<pubDate>Fri, 28 Sep 2018 05:36:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37788/s-plot2-creates-an-interactive-two-dimensional-heatmap-of-sequences</link>
	<title><![CDATA[S-plot2: creates an interactive, two-dimensional heatmap of sequences]]></title>
	<description><![CDATA[<p><span>S-plot2 creates an interactive, two-dimensional heatmap capturing the similarities and dissimilarities in nucleotide usage between genomic sequences (partial or complete). In S-plot2, whole eukaryotic chromosomes and smaller prokaryotic genomes can be efficiently compared. The tool includes functionality to extract, analyze, and automate BLAST queries of regions of interest within the heatmap. This facilitates the investigation of quickly evolving coding regions, novel coding regions, and laterally transferred elements.</span></p>
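<p><span>To illustrate what a heatmap of nucleotide-usage similarity can look like in principle, here is a small Python sketch. It is an assumption-laden toy, not S-plot2's actual algorithm: it cuts two sequences into fixed windows, counts dinucleotides in each window, and fills a matrix with the Pearson correlation between every pair of window profiles. All function names, the window size, the choice of k, and the example sequences are hypothetical.</span></p>

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Return k-mer frequencies of seq, ordered over all A/C/G/T k-mers."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def windows(seq, size):
    """Split seq into consecutive non-overlapping windows of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def usage_matrix(seq_a, seq_b, size=8, k=2):
    """matrix[i][j] = k-mer usage correlation of window i of seq_a and window j of seq_b."""
    prof_a = [kmer_profile(w, k) for w in windows(seq_a, size)]
    prof_b = [kmer_profile(w, k) for w in windows(seq_b, size)]
    return [[pearson(a, b) for b in prof_b] for a in prof_a]


# Made-up toy sequences; real inputs would be genomes or chromosomes.
matrix = usage_matrix("ACGTACGTGGGGCCCC", "ACGTACGTATATATAT")
for row in matrix:
    print(["%.2f" % value for value in row])
```

<p><span>Each row of the printed matrix corresponds to one window of the first sequence and each column to one window of the second; plotting such a matrix as colors gives a two-dimensional heatmap in which windows with similar composition stand out.</span></p>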
<p><span>http://www.putonti-lab.com/uploads/4/5/3/0/45307835/s-plot2_tutorial.pdf</span></p>
<p><span>http://journals.sagepub.com/doi/pdf/10.1177/1176934318797354</span></p><p>Address of the bookmark: <a href="https://bitbucket.org/lkalesinskas/splot" rel="nofollow">https://bitbucket.org/lkalesinskas/splot</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43656/special-nucleotide-characters-symbols</guid>
	<pubDate>Thu, 16 Dec 2021 23:37:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43656/special-nucleotide-characters-symbols</link>
	<title><![CDATA[Special Nucleotide Characters / Symbols !]]></title>
	<description><![CDATA[<h2 style="text-align: center;">Nucleotide symbols</h2><table style="margin: auto;" width="95%" border="1" cellpadding="5">
<tbody>
<tr>
<td align="center">Nucleotide symbol</td>
<td align="center">Full Name</td>
</tr>
<tr>
<td align="center">A</td>
<td align="center">Adenine</td>
</tr>
<tr>
<td align="center">C</td>
<td align="center">Cytosine</td>
</tr>
<tr>
<td align="center">G</td>
<td align="center">Guanine</td>
</tr>
<tr>
<td align="center">T</td>
<td align="center">Thymine</td>
</tr>
<tr>
<td align="center">U</td>
<td align="center">Uracil</td>
</tr>
<tr>
<td align="center">R</td>
<td align="center">Guanine / Adenine (purine)</td>
</tr>
<tr>
<td align="center">Y</td>
<td align="center">Cytosine / Thymine (pyrimidine)</td>
</tr>
<tr>
<td align="center">K</td>
<td align="center">Guanine / Thymine</td>
</tr>
<tr>
<td align="center">M</td>
<td align="center">Adenine / Cytosine</td>
</tr>
<tr>
<td align="center">S</td>
<td align="center">Guanine / Cytosine</td>
</tr>
<tr>
<td align="center">W</td>
<td align="center">Adenine / Thymine</td>
</tr>
<tr>
<td align="center">B</td>
<td align="center">Guanine / Thymine / Cytosine</td>
</tr>
<tr>
<td align="center">D</td>
<td align="center">Guanine / Adenine / Thymine</td>
</tr>
<tr>
<td align="center">H</td>
<td align="center">Adenine / Cytosine / Thymine</td>
</tr>
<tr>
<td align="center">V</td>
<td align="center">Guanine / Cytosine / Adenine</td>
</tr>
<tr>
<td align="center">N</td>
<td align="center">Adenine / Guanine / Cytosine / Thymine</td>
</tr>
</tbody>
</table>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/4193/bioinformatics-101-running-blast</guid>
	<pubDate>Tue, 03 Sep 2013 14:59:50 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/4193/bioinformatics-101-running-blast</link>
	<title><![CDATA[Bioinformatics 101 -  Running BLAST]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/CYnjROfGXv8" frameborder="0" allowfullscreen></iframe>How to format the database for BLAST, run the command, view the output file, and use BioPerl and Perl to parse the output. By David Francis, Ohio State University. Delivered live at the Tomato Disease Workshop 2010. For more information, please visit http://www.extension.org/pages/32521/bioinformatics-101-video.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/4851/blast</guid>
	<pubDate>Wed, 25 Sep 2013 10:56:23 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/4851/blast</link>
	<title><![CDATA[BLAST]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/g0nSH17psDc" frameborder="0" allowfullscreen></iframe>Dr. Rob Edwards describes how BLAST works]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30976/brig</guid>
	<pubDate>Thu, 16 Feb 2017 13:14:25 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30976/brig</link>
	<title><![CDATA[BRIG]]></title>
	<description><![CDATA[<p>BRIG is a free cross-platform (Windows/Mac/Unix) application that can display circular comparisons between a large number of genomes, with a focus on handling genome assembly data. The application is available at: <a href="http://sourceforge.net/projects/brig">http://sourceforge.net/projects/brig</a></p>
<p>If you have any questions or comments, post them on&nbsp;<a href="http://sourceforge.net/tracker/?group_id=328245">one of the trackers</a>&nbsp;on BRIG&rsquo;s SourceForge page: <a href="http://sourceforge.net/tracker/?group_id=328245">http://sourceforge.net/tracker/?group_id=328245</a>.</p>
<p>Features:</p>
<ul>
<li>Images show similarity between a central reference sequence and other sequences as concentric rings.</li>
<li>BRIG will perform all BLAST comparisons and file parsing automatically via a simple GUI.</li>
<li>Contig boundaries and read coverage can be displayed for draft genomes; customized graphs and annotations can be displayed.</li>
<li>Using a user-defined set of genes as input, BRIG can display gene presence, absence, truncation or sequence variation in a set of complete genomes, draft genomes or even raw, unassembled sequence data.</li>
<li>BRIG also accepts SAM-formatted read-mapping files, enabling genomic regions present in unassembled sequence data from multiple samples to be compared simultaneously.</li>
</ul>
<p>&nbsp;</p><p>Address of the bookmark: <a href="http://brig.sourceforge.net/" rel="nofollow">http://brig.sourceforge.net/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/35923/basic-command-line-to-run-blast</guid>
	<pubDate>Wed, 14 Mar 2018 05:10:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/35923/basic-command-line-to-run-blast</link>
	<title><![CDATA[Basic command-line to run BLAST]]></title>
	<description><![CDATA[<p>&nbsp;</p><p>The goal of this tutorial is to run you through a demonstration of the command line, which you may not have seen or used much before.</p><p>All of the commands below can be copied and pasted.</p><div id="install-software"><h2>Install software<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#install-software" title="Permalink to this headline"></a></h2><p>Copy and paste the following command:</p><div><div><pre>sudo apt-get update &amp;&amp; sudo apt-get -y install python ncbi-blast+
</pre></div></div><p>This updates the software list and installs the Python programming language and NCBI BLAST+.</p></div><div id="get-data"><h2>Get Data<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#get-data" title="Permalink to this headline"></a></h2><p>Grab some data to play with. Grab some cow and human RefSeq proteins:</p><div><div><pre>wget ftp://ftp.ncbi.nih.gov/refseq/B_taurus/mRNA_Prot/cow.1.protein.faa.gz
wget ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/human.1.protein.faa.gz
</pre></div></div><p>This is only the first part of the human and cow protein files - there are 24 files total for human.</p><p>Both files are gzipped, so let&rsquo;s unzip them:</p><div><div><pre>gunzip *gz
ls
</pre></div></div><p>Take a look at the head of each file:</p><div><div><pre>head cow.1.protein.faa
head human.1.protein.faa
</pre></div></div><p>These are protein sequences in FASTA format. FASTA format is something many of you have probably seen in one form or another &ndash; it&rsquo;s pretty ubiquitous. It&rsquo;s just a text file containing records; each record starts with a line beginning with a &lsquo;&gt;&rsquo;, followed by one or more lines of sequence text.</p><p>Note that the files are in FASTA format, even though they end in &ldquo;.faa&rdquo; instead of the usual &ldquo;.fasta&rdquo;. This is NCBI&rsquo;s way of denoting a FASTA file with amino acids instead of nucleotides.</p><p>How many sequences are in each one?</p><div><div><pre>grep -c '^&gt;' cow.1.protein.faa
grep -c '^&gt;' human.1.protein.faa
</pre></div></div><p>This grep command uses the -c flag, which reports the number of lines that match the pattern. Here the pattern is a regular expression that matches only lines beginning with a &gt;.</p><p>These files are a bit too big, so let&rsquo;s take a smaller set for practice. Let&rsquo;s take the first two sequences of the cow proteins, which we can see are on the first 6 lines:</p><div><div><pre>head -6 cow.1.protein.faa &gt; cow.small.faa
</pre></div></div></div><div id="blast"><h2>BLAST<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#blast" title="Permalink to this headline"></a></h2><p>Now we can blast these two cow sequences against the set of human sequences. First, we need to tell blast about our database. BLAST needs to do some pre-work on the database file prior to searching, which makes the search run a lot faster. Because you installed your own version of the software, you need to tell the shell where the software is located. Use the full path and the makeblastdb command:</p><div><div><pre>makeblastdb -in human.1.protein.faa -dbtype prot
ls
</pre></div></div><p>Note that this makes a lot of extra files, with the same name as the database plus new extensions (.pin, .psq, etc). To make blast work, these files, called index files, must be in the same directory as the fasta file.</p><p><br /> blastp [-h] [-help] [-import_search_strategy filename]<br /> [-export_search_strategy filename] [-task task_name] [-db database_name]<br /> [-dbsize num_letters] [-gilist filename] [-seqidlist filename]<br /> [-negative_gilist filename] [-negative_seqidlist filename]<br /> [-entrez_query entrez_query] [-db_soft_mask filtering_algorithm]<br /> [-db_hard_mask filtering_algorithm] [-subject subject_input_file]<br /> [-subject_loc range] [-query input_file] [-out output_file]<br /> [-evalue evalue] [-word_size int_value] [-gapopen open_penalty]<br /> [-gapextend extend_penalty] [-qcov_hsp_perc float_value]<br /> [-max_hsps int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]<br /> [-xdrop_gap_final float_value] [-searchsp int_value]<br /> [-sum_stats bool_value] [-seg SEG_options] [-soft_masking soft_masking]<br /> [-matrix matrix_name] [-threshold float_value] [-culling_limit int_value]<br /> [-best_hit_overhang float_value] [-best_hit_score_edge float_value]<br /> [-window_size int_value] [-lcase_masking] [-query_loc range]<br /> [-parse_deflines] [-outfmt format] [-show_gis]<br /> [-num_descriptions int_value] [-num_alignments int_value]<br /> [-line_length line_length] [-html] [-max_target_seqs num_sequences]<br /> [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]<br /> [-use_sw_tback] [-version]</p><p>Now we can run the blast job. We will use blastp, which is appropriate for protein to protein comparisons.</p><div><div><pre>blastp -query cow.small.faa -db human.1.protein.faa
</pre></div></div><p>This prints a lot of information to the terminal screen, which is difficult to save and use later. BLAST also gives the option of saving the output to a file.</p><div><div><pre>blastp -query cow.small.faa -db human.1.protein.faa -out cow_vs_human_blast_results.txt
ls
</pre></div></div><p>Take a look at the results using less. Note that there can be more than one match between the query and the same subject. These are referred to as high-scoring segment pairs (HSPs).</p><div><div><pre>less cow_vs_human_blast_results.txt
</pre></div></div><p>So how do you know about all the options, such as the flag to create an output file? Let&rsquo;s take a look at the help pages. Unfortunately there are no man pages (those are usually reserved for shell commands, but some software authors will provide them as well), but there is a text help output:</p><div><div><pre>blastp -help
</pre></div></div><p>To scroll through slowly</p><div><div><pre>blastp -help | less
</pre></div></div><p>To quit the less screen, press the q key.</p><p>Parameters of interest include -evalue (the default is 10!) and -outfmt.</p><p>Let&rsquo;s filter for more statistically significant matches and use a different output format:</p><div><div><pre>blastp \
-query cow.small.faa \
-db human.1.protein.faa \
-out cow_vs_human_blast_results.tab \
-evalue 1e-5 \
-outfmt 7
</pre></div></div><p>I broke the long single command into several lines by &ldquo;escaping&rdquo; each newline. The backslash at the end of a line tells the command line &ldquo;Wait, I&rsquo;m not done yet!&rdquo;, so it waits for the next line of the command before executing.</p><p>Check out the results with less.</p><p>Let&rsquo;s try a medium-sized data set next:</p><div><div><pre>head -199 cow.1.protein.faa &gt; cow.medium.faa
</pre></div></div><p>How many sequences does this set contain?</p><div><div><pre>grep -c '^&gt;' cow.medium.faa
</pre></div></div><p>Let&rsquo;s run the blast again, but this time let&rsquo;s return only the best hit for each query.</p><div><div><pre>blastp \
-query cow.medium.faa \
-db human.1.protein.faa \
-out cow_vs_human_blast_results.tab \
-evalue 1e-5 \
-outfmt 6 \
-max_target_seqs 1
</pre></div></div></div><div id="summary"><h2>Summary<a href="http://angus.readthedocs.io/en/2016/running-command-line-blast.html#summary" title="Permalink to this headline"></a></h2><p>Review:</p><ul>
<li>command line programs such as blast use flags to specify what to do and how to do it</li>
<li>blast options can be found by typing&nbsp;<cite>blastp -help</cite></li>
<li>break a command up over many lines by using a backslash (<cite>\</cite>) to &ldquo;escape&rdquo; the newline</li>
</ul><p>&nbsp;</p><p>Blastn</p><p>blastn [-h] [-help] [-import_search_strategy filename]<br /> [-export_search_strategy filename] [-task task_name] [-db database_name]<br /> [-dbsize num_letters] [-gilist filename] [-seqidlist filename]<br /> [-negative_gilist filename] [-negative_seqidlist filename]<br /> [-entrez_query entrez_query] [-db_soft_mask filtering_algorithm]<br /> [-db_hard_mask filtering_algorithm] [-subject subject_input_file]<br /> [-subject_loc range] [-query input_file] [-out output_file]<br /> [-evalue evalue] [-word_size int_value] [-gapopen open_penalty]<br /> [-gapextend extend_penalty] [-perc_identity float_value]<br /> [-qcov_hsp_perc float_value] [-max_hsps int_value]<br /> [-xdrop_ungap float_value] [-xdrop_gap float_value]<br /> [-xdrop_gap_final float_value] [-searchsp int_value]<br /> [-sum_stats bool_value] [-penalty penalty] [-reward reward] [-no_greedy]<br /> [-min_raw_gapped_score int_value] [-template_type type]<br /> [-template_length int_value] [-dust DUST_options]<br /> [-filtering_db filtering_database]<br /> [-window_masker_taxid window_masker_taxid]<br /> [-window_masker_db window_masker_db] [-soft_masking soft_masking]<br /> [-ungapped] [-culling_limit int_value] [-best_hit_overhang float_value]<br /> [-best_hit_score_edge float_value] [-window_size int_value]<br /> [-off_diagonal_range int_value] [-use_index boolean] [-index_name string]<br /> [-lcase_masking] [-query_loc range] [-strand strand] [-parse_deflines]<br /> [-outfmt format] [-show_gis] [-num_descriptions int_value]<br /> [-num_alignments int_value] [-line_length line_length] [-html]<br /> [-max_target_seqs num_sequences] [-num_threads int_value] [-remote]<br /> [-version]</p><p>DESCRIPTION<br /> Nucleotide-Nucleotide BLAST 2.7.0+</p></div>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>

</channel>
</rss>