<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/1182?offset=10</link>
	<atom:link href="https://bioinformaticsonline.com/related/1182?offset=10" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/10409/check-linux-server-configuration</guid>
	<pubDate>Tue, 06 May 2014 01:10:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/10409/check-linux-server-configuration</link>
	<title><![CDATA[Check Linux server configuration !!]]></title>
	<description><![CDATA[<p>Bioinformaticians use servers for computational analysis. Sometimes we need to check the server details before running our programs or tools. Here are some basic commands you can use to gather system/server information.<br /><br />To check which version of the operating system is installed on the server, you can use the following commands:-<br />=================================================================<br />1. cat /etc/issue<br />[root@localhost ~]# cat /etc/issue<br />Red Hat Enterprise Linux Server release 5.5 (Tikanga)<br />Kernel \r on an \m<br /><br />2. cat /etc/redhat-release<br />[root@localhost ~]# cat /etc/redhat-release<br />Red Hat Enterprise Linux Server release 5.5 (Tikanga)<br /><br /><br />3. lsb_release -a<br />[root@localhost ~]# lsb_release -a<br />LSB Version:&nbsp;&nbsp;&nbsp; :core-3.1-ia32:core-3.1-noarch:graphics-3.1-ia32:graphics-3.1-noarch<br />Distributor ID: RedHatEnterpriseServer<br />Description:&nbsp;&nbsp;&nbsp; Red Hat Enterprise Linux Server release 5.5 (Tikanga)<br />Release:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5.5<br />Codename:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Tikanga<br /><br /><br /><br />To check whether the operating system is 32-bit or 64-bit:-<br />================================<br /># uname -i<br />[root@localhost ~]# uname -i<br />i386<br />(i386 indicates that the server is running a 32-bit operating system)<br /><br />[root@localhost ~]# uname -i<br />x86_64<br />(x86_64 indicates that the server is running a 64-bit operating system)<br /><br />To see the processor/CPU information:-<br />=============================<br /># cat /proc/cpuinfo<br />[root@localhost ~]# cat /proc/cpuinfo<br />processor&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0<br />vendor_id&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : GenuineIntel<br />cpu family&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 6<br />model&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 15<br />model name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : Intel(R) 
Xeon(R) CPU&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5130&nbsp; @ 2.00GHz<br />stepping&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 6<br />cpu MHz&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1995.087<br />cache size&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 4096 KB<br />physical id&nbsp;&nbsp;&nbsp;&nbsp; : 0<br />siblings&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 2<br />core id&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0<br />cpu cores&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 2<br />apicid&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 0<br />fdiv_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br />hlt_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br />f00f_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br />coma_bug&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : no<br />fpu&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : yes<br />fpu_exception&nbsp;&nbsp; : yes<br />cpuid level&nbsp;&nbsp;&nbsp;&nbsp; : 10<br />wp&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : yes<br />flags&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx tm2 ssse3 cx16 xtpr lahf_lm<br />bogomips&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 3990.17<br />(Each logical processor gets its own entry, numbered from zero; an output that lists only processor 0 indicates that the system has a single processor)<br /><br /><br /><br /><br />To check memory information:-<br />===========================<br /># free -m<br />[root@localhost ~]# free -m<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; total&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; used&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; free&nbsp;&nbsp;&nbsp;&nbsp; shared&nbsp;&nbsp;&nbsp; buffers&nbsp;&nbsp;&nbsp;&nbsp; cached<br />Mem:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
5066&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3513&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1552&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 612&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2319<br />-/+ buffers/cache:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 582&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4484<br />Swap:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1983&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1983<br /><br /><br /><br /># cat /proc/meminfo<br />[root@localhost ~]# cat /proc/meminfo<br />MemTotal:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5187752 kB<br />MemFree:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1639300 kB<br />Buffers:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 627024 kB<br />Cached:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2374944 kB<br />SwapCached:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0 kB<br />Active:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2458788 kB<br />Inactive:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 920964 kB<br />HighTotal:&nbsp;&nbsp;&nbsp;&nbsp; 4325164 kB<br />HighFree:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1561936 kB<br />LowTotal:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 862588 kB<br />LowFree:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 77364 kB<br />SwapTotal:&nbsp;&nbsp;&nbsp;&nbsp; 2031608 kB<br />SwapFree:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2031608 kB<br />Dirty:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 704 kB<br />Writeback:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0 kB<br />AnonPages:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 377892 kB<br />Mapped:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 35328 kB<br />Slab:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 153036 kB<br />PageTables:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 6316 kB<br />NFS_Unstable:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0 kB<br />Bounce:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0 kB<br 
/>CommitLimit:&nbsp;&nbsp; 4625484 kB<br />Committed_AS:&nbsp;&nbsp; 977132 kB<br />VmallocTotal:&nbsp;&nbsp; 116728 kB<br />VmallocUsed:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4492 kB<br />VmallocChunk:&nbsp;&nbsp; 112124 kB<br />HugePages_Total:&nbsp;&nbsp;&nbsp;&nbsp; 0<br />HugePages_Free:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0<br />HugePages_Rsvd:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 0<br />Hugepagesize:&nbsp;&nbsp;&nbsp;&nbsp; 2048 kB<br /><br /><br />To check the model and serial number of the server:-<br />=======================================<br />[root@localhost ~]# dmidecode | egrep -i "product name|Serial number"<br />Product Name: PowerEdge R710<br />Serial Number: AB8CDE1<br /><br /><br />To check the host name:-<br />=====================<br />[root@localhost ~]# uname -n<br />localhost<br /><br />[root@localhost ~]# hostname<br />localhost<br /><br />To check the kernel version:-<br />========================<br />[root@localhost ~]# uname -r<br />2.6.18-238.9.1.el5PAE</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/24297/bioinformatics-walkin-at-nii</guid>
  <pubDate>Fri, 04 Sep 2015 21:48:15 -0500</pubDate>
  <link>https://bioinformaticsonline.com/opportunity/view/24297/bioinformatics-walkin-at-nii</link>
  <title><![CDATA[Bioinformatics WalkIn at NII]]></title>
  <description><![CDATA[
<p>ADVERTISEMENT OF WALK-IN-INTERVIEW</p>

<p>NAME OF THE POST : Bioinformatician (Part time 3 days in a week) (One Position only)</p>

<p>DURATION : One Year</p>

<p>NAME OF THE PROJECT : Next generation sequencing facility</p>

<p>EDUCATIONAL QUALIFICATIONS : At least a Master's degree in Bioinformatics and a Bachelor's degree in any stream of life sciences</p>

<p>REQUIREMENTS :</p>

<p>Around 5 years of experience and a proven track record in next generation sequencing data analysis (supported by publications in peer-reviewed journals), with the ability to analyze transcriptomics, ChIP-seq, and small RNA-seq data.</p>

<p>Should have the ability to analyze raw primary data generated by Illumina next generation sequencing platforms and to create / troubleshoot custom analysis pipelines.</p>

<p>Should have the ability to handle all downstream secondary and tertiary data analysis using commercially available as well as open source software (transcriptomics, ChIP-seq, small RNA-seq).</p>

<p>Apart from these, the applicant should have knowledge of the following:</p>

<p>Programming: Perl and Python.<br />
Operating system: Linux and Windows.<br />
NGS Analysis tools: Maq, BWA, Bowtie, SAMtools, BEDTools, MACS, Galaxy, FastQC, Bismark, MEDIPS, Tophat, Cufflinks, AvadisNGS, CLC Genomics Workbench, BaseSpace, Trinity.<br />
Statistics: Microsoft Excel and R.<br />
Database: MySQL.<br />
Genome Browser: UCSC, Ensembl, IGV, IGB.<br />
Motif Analysis Tools: MEME Suite, Transfac and RSAT.<br />
Functional Annotation Tools: DAVID, GeneCodis, GeneCards.<br />
Networking Tools: Cytoscape.</p>

<p>EMOLUMENTS : The incumbent will be paid a fee of Rs. 2000/- per sitting/ per day.</p>

<p>SCIENTIST NAME : Dr. Arnab Mukhopadhyay, Staff Scientist V, Next generation sequencing facility</p>

<p>SCIENTIST’S E-MAIL ID : arnab@nii.ac.in</p>

<p>WALK IN INTERVIEW ON : 18th September, 2015</p>

<p>REGISTRATION OF CANDIDATES: 10.30 AM to 11.00 AM</p>

<p>PLEASE NOTE-<br />
1. CANDIDATE MAY FILL UP APPLICATION IN THE PRESCRIBED FORMAT ALONG WITH NECESSARY DOCUMENTS FOR VERIFICATION.<br />
2. APPLICATIONS CONTAINING INCOMPLETE INFORMATION SHALL NOT BE ENTERTAINED.<br />
3. DATE OF PASSING THE EXAMINATIONS MUST BE INDICATED CLEARLY.<br />
4. ONLY REGISTERED CANDIDATES WILL BE INTERVIEWED.<br />
5. NO TA/DA WILL BE PAID FOR ATTENDING THE INTERVIEW.</p>

<p>PRESCRIBED FORM<br />
1. NAME<br />
2. FATHER’S NAME<br />
3. MOTHER’S NAME<br />
4. DATE OF BIRTH<br />
5. SEX (MALE/FEMALE)<br />
6. CATEGORY (SC/ ST/ OBC/ PH)<br />
7. ADDRESS a. (CORRESPONDENCE) b. (PERMANENT)<br />
8. E MAIL, TELEPHONE NO. &amp; MOBILE No (if any)<br />
9. ACADEMIC &amp; PROFESSIONAL QUALIFICATIONS: NAME OF EXAMINATION PASSED WITH SUBJECTS, YEAR OF PASSING, BOARD/ UNIVERSITY, PERCENTAGE/ DIVISION, REMARKS<br />
10. PAST EXPERIENCE &amp; PRESENT EMPLOYMENT, IF ANY<br />
11. CANDIDATES SHOULD STATE CLEARLY WHETHER THEY HAVE BEEN AWARDED PH.D DEGREE OR THESIS HAS BEEN SUBMITTED.<br />
12. HAVE YOU APPLIED FOR A POSITION EARLIER IN THE INSTITUTE? IF SO:- (1) THE DETAILS OF THE PROJECT AND PROJECT INVESTIGATOR (2) IF CALLED FOR INTERVIEW, RESULTS THEREOF</p>

<p>More at http://www1.nii.res.in/sites/default/files/walkininterview-18sept2015.pdf</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35525/linux-commands-cheat-sheet-for-bioinformatics-and-computational-biology-professionals</guid>
	<pubDate>Mon, 05 Feb 2018 18:50:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35525/linux-commands-cheat-sheet-for-bioinformatics-and-computational-biology-professionals</link>
	<title><![CDATA[Linux Commands Cheat Sheet for Bioinformatics and Computational Biology Professionals]]></title>
	<description><![CDATA[<p><span>The purpose of this cheat sheet is to introduce biologists and bioinformaticians to the frequently used tools for NGS analysis and to give them practice in writing one-liners.</span></p><ul>
<li><span></span><span><strong>File System</strong></span><span><strong><br /> </strong></span><span>ls</span><span>&nbsp;&mdash; list items in current directory</span><span><br /> </span><span>ls -l</span><span>&nbsp;&mdash; list items in current directory and show in long format to see permissions, size, and modification date</span><span><br /> </span><span>ls -a</span><span>&nbsp;&mdash; list all items in current directory, including hidden files</span><span><br /> </span><span>ls -F</span><span>&nbsp;&mdash; list all items in current directory and show directories with a slash and executables with a star</span><span><br /> </span><span>ls dir</span><span>&nbsp;&mdash; list all items in directory dir</span><span><br /> </span><span>cd dir</span><span>&nbsp;&mdash; change directory to dir</span><span><br /> </span><span>cd ..</span><span>&nbsp;&mdash; go up one directory</span><span><br /> </span><span>cd /</span><span>&nbsp;&mdash; go to the root directory</span><span><br /> </span><span>cd ~</span><span>&nbsp;&mdash; go to your home directory</span><span><br /> </span><span>cd -</span><span>&nbsp;&mdash; go to the last directory you were just in</span><span><br /> </span><span>pwd</span><span>&nbsp;&mdash; show present working directory</span><span><br /> </span><span>mkdir dir</span><span>&nbsp;&mdash; make directory dir</span><span><br /> </span><span>rm file</span><span>&nbsp;&mdash; remove file</span><span><br /> </span><span>rm -r dir</span><span>&nbsp;&mdash; remove directory dir recursively</span><span><br /> </span><span>cp file1 file2</span><span>&nbsp;&mdash; copy file1 to file2</span><span><br /> </span><span>cp -r dir1 dir2</span><span>&nbsp;&mdash; copy directory dir1 to dir2 recursively</span><span><br /> </span><span>mv file1 file2</span><span>&nbsp;&mdash; move (rename) file1 to file2</span><span><br /> </span><span>ln -s file link</span><span>&nbsp;&mdash; create symbolic link to file</span><span><br /> </span><span>touch 
file</span><span>&nbsp;&mdash; create or update file</span><span><br /> </span><span>cat file</span><span>&nbsp;&mdash; output the contents of file</span><span><br /> </span><span>less file</span><span>&nbsp;&mdash; view file with page navigation</span><span><br /> </span><span>head file</span><span>&nbsp;&mdash; output the first 10 lines of file</span><span><br /> </span><span>tail file</span><span>&nbsp;&mdash; output the last 10 lines of file</span><span><br /> </span><span>tail -f file</span><span>&nbsp;&mdash; output the contents of file as it grows, starting with the last 10 lines</span><span><br /> </span><span>vim file</span><span>&nbsp;&mdash; edit file</span><span><br /> </span><span>alias name='command'</span><span>&nbsp;&mdash; create an alias for a command (bash syntax)</span><span><br /> </span></li>
<li><span></span><span><strong>System</strong></span><span><strong><br /> </strong></span><span>shutdown</span><span>&nbsp;&mdash; shut down machine</span><span><br /> </span><span>reboot</span><span>&nbsp;&mdash; restart machine</span><span><br /> </span><span>date</span><span>&nbsp;&mdash; show the current date and time</span><span><br /> </span><span>whoami</span><span>&nbsp;&mdash; who you are logged in as</span><span><br /> </span><span>finger user</span><span>&nbsp;&mdash; display information about user</span><span><br /> </span><span>man command</span><span>&nbsp;&mdash; show the manual for command</span><span><br /> </span><span>df</span><span>&nbsp;&mdash; show disk usage</span><span><br /> </span><span>du</span><span>&nbsp;&mdash; show directory space usage</span><span><br /> </span><span>free</span><span>&nbsp;&mdash; show memory and swap usage</span><span><br /> </span><span>whereis app</span><span>&nbsp;&mdash; show possible locations of app</span><span><br /> </span><span>which app</span><span>&nbsp;&mdash; show which app will be run by default</span><span><br /> </span></li>
<li><span></span><span><strong>Process Management</strong></span><span><strong><br /> </strong></span><span>ps</span><span>&nbsp;&mdash; display your currently active processes</span><span><br /> </span><span>top</span><span>&nbsp;&mdash; display all running processes</span><span><br /> </span><span>kill pid</span><span>&nbsp;&mdash; kill process id pid</span><span><br /> </span><span>kill -9 pid</span><span>&nbsp;&mdash; force kill process id pid</span><span><br /> </span></li>
<li><span></span><span><strong>Permissions</strong></span><span><strong><br /> </strong></span><span>ls -l</span><span>&nbsp;&mdash; list items in current directory and show permissions</span><span><br /> </span><span>chmod ugo file</span><span>&nbsp;&mdash; change permissions of file to ugo - u is the user's permissions, g is the group's permissions, and o is everyone else's permissions. The values of u, g, and o can be any number between 0 and 7.</span><span><br /> </span><span>7</span><span>&nbsp;&mdash; full permissions</span><span><br /> </span><span>6</span><span>&nbsp;&mdash; read and write only</span><span><br /> </span><span>5</span><span>&nbsp;&mdash; read and execute only</span><span><br /> </span><span>4</span><span>&nbsp;&mdash; read only</span><span><br /> </span><span>3</span><span>&nbsp;&mdash; write and execute only</span><span><br /> </span><span>2</span><span>&nbsp;&mdash; write only</span><span><br /> </span><span>1</span><span>&nbsp;&mdash; execute only</span><span><br /> </span><span>0</span><span>&nbsp;&mdash; no permissions</span><span><br /> </span><span>chmod 600 file</span><span>&nbsp;&mdash; you can read and write - good for files</span><span><br /> </span><span>chmod 700 file</span><span>&nbsp;&mdash; you can read, write, and execute - good for scripts</span><span><br /> </span><span>chmod 644 file</span><span>&nbsp;&mdash; you can read and write, and everyone else can only read - good for web pages</span><span><br /> </span><span>chmod 755 file</span><span>&nbsp;&mdash; you can read, write, and execute, and everyone else can read and execute - good for programs that you want to share</span><span><br /> </span></li>
<li><span></span><span><strong>Networking</strong></span><span><strong><br /> </strong></span><span>wget url</span><span>&nbsp;&mdash; download the file at url</span><span><br /> </span><span>curl -O url</span><span>&nbsp;&mdash; download the file at url and save it under its remote name (without -O, curl writes to standard output)</span><span><br /> </span><span>scp user@host:file dir</span><span>&nbsp;&mdash; secure copy a file from remote server to the dir directory on your machine</span><span><br /> </span><span>scp file user@host:dir</span><span>&nbsp;&mdash; secure copy a file from your machine to the dir directory on a remote server</span><span><br /> </span><span>scp -r user@host:dir dir</span><span>&nbsp;&mdash; secure copy the directory dir from remote server to the directory dir on your machine</span><span><br /> </span><span>ssh user@host</span><span>&nbsp;&mdash; connect to host as user</span><span><br /> </span><span>ssh -p port user@host</span><span>&nbsp;&mdash; connect to host on port as user</span><span><br /> </span><span>ssh-copy-id user@host</span><span>&nbsp;&mdash; add your key to host for user to enable a keyed or passwordless login</span><span><br /> </span><span>ping host</span><span>&nbsp;&mdash; ping host and output results</span><span><br /> </span><span>whois domain</span><span>&nbsp;&mdash; get information for domain</span><span><br /> </span><span>dig domain</span><span>&nbsp;&mdash; get DNS information for domain</span><span><br /> </span><span>dig -x host</span><span>&nbsp;&mdash; reverse lookup host</span><span><br /> </span><span>lsof -i tcp:1337</span><span>&nbsp;&mdash; list all processes running on port 1337</span><span><br /> </span></li>
<li><span></span><span><strong>Searching</strong></span><span><strong><br /> </strong></span><span>grep pattern files</span><span>&nbsp;&mdash; search for pattern in files</span><span><br /> </span><span>grep -r pattern dir</span><span>&nbsp;&mdash; search recursively for pattern in dir</span><span><br /> </span><span>grep -rn pattern dir</span><span>&nbsp;&mdash; search recursively for pattern in dir and show the line number found</span><span><br /> </span><span>grep -r pattern dir --include='*.ext'</span><span>&nbsp;&mdash; search recursively for pattern in dir and only search in files with .ext extension</span><span><br /> </span><span>command | grep pattern</span><span>&nbsp;&mdash; search for pattern in the output of command</span><span><br /> </span><span>find dir -name file</span><span>&nbsp;&mdash; search dir and its subdirectories on the live filesystem for files named file</span><span><br /> </span><span>locate file</span><span>&nbsp;&mdash; find all instances of file using the indexed database built by the updatedb command. Much faster than find, but only as current as the last updatedb run</span><span><br /> </span><span>sed -i 's/day/night/g' file</span><span>&nbsp;&mdash; find all occurrences of day in a file and replace them with night - s means substitute and g means global - sed also supports regular expressions</span><span><br /> </span></li>
<li><span></span><span><strong>Compression</strong></span><span><strong><br /> </strong></span><span>tar cf file.tar files</span><span>&nbsp;&mdash; create a tar named file.tar containing files</span><span><br /> </span><span>tar xf file.tar</span><span>&nbsp;&mdash; extract the files from file.tar</span><span><br /> </span><span>tar czf file.tar.gz files</span><span>&nbsp;&mdash; create a tar with Gzip compression</span><span><br /> </span><span>tar xzf file.tar.gz</span><span>&nbsp;&mdash; extract a tar using Gzip</span><span><br /> </span><span>gzip file</span><span>&nbsp;&mdash; compresses file and renames it to file.gz</span><span><br /> </span><span>gzip -d file.gz</span><span>&nbsp;&mdash; decompresses file.gz back to file</span><span><br /> </span></li>
<li><span></span><span><strong>Shortcuts</strong></span><span><strong><br /> </strong></span><span>ctrl+a</span><span>&nbsp;&mdash; move cursor to beginning of line</span><span><br /> </span><span>ctrl+e</span><span>&nbsp;&mdash; move cursor to end of line</span><span><br /> </span><span>alt+f</span><span>&nbsp;&mdash; move cursor forward 1 word</span><span><br /> </span><span>alt+b</span><span>&nbsp;&mdash; move cursor backward 1 word</span><span><br /> </span></li>
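<li><span><strong>Example: reading octal permissions</strong></span><br /> The digits listed under Permissions are sums of read (4), write (2) and execute (1). The sketch below shows that arithmetic in bash. The function names digit_to_rwx and mode_to_rwx are made up for this illustration and are not standard commands; it is a minimal sketch, assuming a plain three-digit mode with no special bits.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a standard command): turn one octal digit
# (0-7) into its rwx form, for example 5 becomes r-x.
digit_to_rwx() {
    local d=$1 out=""
    # Bit 4 = read, bit 2 = write, bit 1 = execute.
    if [ $(( d & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $(( d & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $(( d & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s\n' "$out"
}

# Hypothetical helper: expand a three-digit mode such as 755 into the
# nine permission characters that ls -l shows after the file type.
mode_to_rwx() {
    local mode=$1 result="" i
    for i in 0 1 2; do
        result="${result}$(digit_to_rwx "${mode:$i:1}")"
    done
    printf '%s\n' "$result"
}

mode_to_rwx 755   # prints rwxr-xr-x
mode_to_rwx 644   # prints rw-r--r--
```

The first three characters belong to the user, the next three to the group and the last three to everyone else, which matches the u, g, o order used by chmod.</li>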
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</guid>
	<pubDate>Mon, 02 Jun 2014 18:03:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</link>
	<title><![CDATA[Next generation sequencing in R or bioconductor environment]]></title>
	<description><![CDATA[<p>Many R and Bioconductor packages are available for NGS data analysis. Some of them are described below.</p><h3><a name="TOC-Biostrings" id="TOC-Biostrings"></a>Biostrings</h3><p>The Biostrings package from Bioconductor provides an advanced environment for efficient sequence management and analysis in R. It contains many speed- and memory-efficient string containers, string matching algorithms, and other utilities for fast manipulation of large sets of biological sequences. The objects and functions provided by Biostrings form the basis for many other sequence analysis packages. <a href="http://bioconductor.org/packages/release/bioc/html/Biostrings.html">Documentation</a></p><div><div style="text-align: left;"><div style="color: #000000;"><h4><a name="TOC-IRanges-Overview" id="TOC-IRanges-Overview"></a>IRanges Overview</h4><p>IRanges provides the low-level infrastructure and containers for handling sets of integer ranges within Bioconductor's BioC-Seq domain. Its classes and methods provide support for many more high-level packages like GenomicRanges, ShortRead, Rsamtools, etc. <a href="http://bioconductor.org/packages/release/bioc/html/IRanges.html">Documentation</a></p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-GenomicRanges-Overview" id="TOC-GenomicRanges-Overview"></a>GenomicRanges Overview</h4><p>The <em>GenomicRanges</em> package serves as the foundation for representing genomic locations within the Bioconductor project. 
It is built upon the <em>IRanges</em> infrastructure and defines three major data containers - <em>GRanges, GRangesList</em> and <em>GappedAlignments</em> - which support other important BioC-Seq packages including <em>ShortRead, Rsamtools, rtracklayer, GenomicFeatures</em> and <em>BSgenome</em>.&nbsp; Compared to the IRanges container, the GRanges/<em>GRangesList</em> classes are more flexible and can be extended to store additional information about sequence ranges, such as chromosome identifiers (sequence space), strand information and annotation data. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p></div></div></div></div><h3><a name="TOC-Motif-Discovery" id="TOC-Motif-Discovery"></a>Motif Discovery</h3><h4><a name="TOC-cosmo" id="TOC-cosmo"></a>cosmo</h4><p>The cosmo package searches a set of unaligned DNA sequences for a shared motif that may function as a transcription factor binding site. The algorithm extends the popular motif discovery tool MEME (Bailey and Elkan, 1995) in that it allows the search to be supervised by specifying a set of constraints that the motif to be discovered must satisfy. <a href="http://bioconductor.org/packages/release/bioc/html/cosmo.html">Documentation</a></p></div><div>
<div style="color: #0000ff;"><h4><a name="TOC-BCRANK" id="TOC-BCRANK"></a>BCRANK</h4><p>BCRANK is a method that takes a ranked list of genomic regions as input and outputs short DNA sequences that are overrepresented in some part of the list. The algorithm was developed for detecting transcription factor (TF) binding sites in a large number of enriched regions from high-throughput ChIP-chip or ChIP-seq experiments, but it can be applied to any ranked list of DNA sequences. <a href="http://bioconductor.org/packages/release/bioc/html/BCRANK.html">Documentation</a></p>
<p>rGADEM: <a href="http://bioconductor.org/packages/devel/bioc/html/rGADEM.html">Documentation</a></p><p>MotIV: <a href="http://bioconductor.org/packages/devel/bioc/html/MotIV.html">Documentation</a></p></div><h3><a name="TOC-ShortRead" id="TOC-ShortRead"></a>ShortRead</h3><p>The ShortRead package provides input, quality control, filtering, parsing, and manipulation functionality for short read sequences produced by high throughput sequencing technologies. While support is provided for many sequencing technologies, this package is primarily focused on Solexa/Illumina reads. <a href="http://bioconductor.org/packages/release/bioc/html/ShortRead.html">Documentation</a></p><h3><a name="TOC-Rsamtools" id="TOC-Rsamtools"></a>Rsamtools</h3><p>Rsamtools provides functions for parsing and inspecting samtools BAM formatted binary alignment data. SAM/BAM is quickly becoming a universal standard alignment format, and is now supported by a wide variety of alignment tools. <a href="http://bioconductor.org/help/bioc-views/2.7/bioc/html/Rsamtools.html">Documentation</a></p>
<p><a href="http://samtools.sourceforge.net/">Samtools Website</a><br /> <a href="http://bio-bwa.sourceforge.net/">BWA (Burrows-Wheeler Alignment) Website</a><br /><span style="color: #0000ff;"></span></p>
<div style="color: #000000;">&nbsp;</div></div><div>
<p><span style="color: #000000;">Additional tools for SNP analysis:&nbsp;</span></p>
<p><a href="http://bioconductor.org/help/bioc-views/release/bioc/html/snpMatrix.html">snpMatrix</a></p><h3><a name="TOC-BSgenome" id="TOC-BSgenome"></a>BSgenome</h3><p>BSgenome provides an object-oriented infrastructure for interacting with a Biostrings-based genome sequence. BSgenome packages exist for many common genomes, and can be created to represent custom genomes. See the "How to forge a BSgenome data package" vignette for instructions to create a new BSgenome package if a prebuilt package does not exist for your organism. <a href="http://bioconductor.org/packages/release/bioc/html/BSgenome.html">Documentation</a></p><h3><a name="TOC-rtracklayer" id="TOC-rtracklayer"></a>rtracklayer</h3><p>rtracklayer provides an interface for exporting annotation feature data to various genome browsers and file formats (such as GFF). See the Small RNA Profiling exercise for an example of using rtracklayer to visualize alignment coverage. <a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">Documentation</a></p><h3><a name="TOC-biomaRt" id="TOC-biomaRt"></a>biomaRt</h3><p>The biomaRt package provides an interface to a growing collection of databases implementing the BioMart software suite (http://www.biomart.org). The package enables online retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas. This data is retrieved automatically via the Internet, so it's recommended that you cache the data locally, or check versions if your code will be adversely affected by updates to these data. <a href="http://bioconductor.org/packages/release/bioc/html/biomaRt.html">Documentation</a></p><h3><a name="TOC-ChIP-Seq-Analysis-Packages" id="TOC-ChIP-Seq-Analysis-Packages"></a>ChIP-Seq Analysis Packages</h3><p>Bioconductor provides various packages for analyzing and visualizing ChIP-Seq data. Only a small selection of these packages is introduced here. 
Additional useful introductions to this topic are: <a href="http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/">BioC ChIP-seq Case Study</a> and BioC <a href="http://www.bioconductor.org/help/course-materials/2009/SeattleNov09/ChIP-seq/">ChIP-Seq</a>.</p><h4><a name="TOC-chipseq" id="TOC-chipseq"></a>chipseq</h4><p>The chipseq package combines a variety of HT-Seq packages into a pipeline for ChIP-Seq data analysis. <a href="http://bioconductor.org/packages/release/bioc/html/chipseq.html">Documentation</a></p><h4><a name="TOC-BayesPeak" id="TOC-BayesPeak"></a>BayesPeak</h4><p>BayesPeak is a peak calling package for identifying DNA binding sites of proteins in ChIP-Seq experiments. Its algorithm uses hidden Markov models (HMM) and Bayesian statistical methods. The following sample code introduces the identification of peaks with the BayesPeak package as well as the incorporation of read coverage information obtained by the chipseq package. <a href="http://bioconductor.org/packages/release/bioc/html/BayesPeak.html">Documentation</a> [ <a href="http://www.biomedcentral.com/1471-2105/10/299">Publication</a> ]</p><h4><a name="TOC-PICS" id="TOC-PICS"></a>PICS</h4><p>The PICS package applies probabilistic inference to aligned-read ChIP-Seq data in order to identify regions bound by transcription factors. PICS identifies enriched regions by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model. The following sample code uses the test data set from the above BayesPeak package in order to compare the results from both methods by identifying their consensus peak set. 
<a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">Documentation</a> [ <a href="http://www.hubmed.org/display.cgi?uids=20528864">Publication</a> ]</p><h4><a name="TOC-ChIPpeakAnno" id="TOC-ChIPpeakAnno"></a>ChIPpeakAnno</h4><p>The ChIPpeakAnno package provides batch annotation of the peaks identified from either ChIP-seq or ChIP-chip experiments. It includes functions to retrieve the sequences around peaks, obtain enriched Gene Ontology (GO) terms, find the nearest gene, exon, miRNA or custom features such as most conserved elements and other transcription factor binding sites supplied by users. The package leverages the biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest and stats packages. <a href="http://bioconductor.org/packages/release/bioc/html/ChIPpeakAnno.html">Documentation</a></p><h4><a name="TOC-Additional-ChIP-Seq-Packages" id="TOC-Additional-ChIP-Seq-Packages"></a>Additional ChIP-Seq Packages</h4><p>DiffBind: <a href="http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html">Documentation</a></p><p>MOSAICS: <a href="http://bioconductor.org/packages/devel/bioc/html/mosaics.html">Documentation</a></p><p>iSeq: <a href="http://bioconductor.org/packages/release/bioc/html/iSeq.html">Documentation</a></p><p>ChIPseqR: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPseqR.html">Documentation</a></p><p>ChIPsim: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPsim.html">Documentation</a></p><p>CSAR: <a href="http://www.bioconductor.org/packages/devel/bioc/html/CSAR.html">Documentation</a></p><p>ChIP-Seq Pipeline: <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">PICS</a>, rGADEM and MotIV (<a href="http://www.rglab.org/pics-and-bioconductor/">developer web site</a>)</p><p>SPP: <a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/">ChIP-seq processing pipeline</a></p><p><a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/tutorial.html">SPP 
Tutorial</a></p><p><a href="http://liulab.dfci.harvard.edu/MACS/index.html">MACS</a></p><p><a href="http://gmdd.shgmo.org/Computational-Biology/ChIP-Seq/download/SIPeS">SIPeS</a></p><h3><a name="TOC-RNA-Seq-Analysis" id="TOC-RNA-Seq-Analysis"></a>RNA-Seq Analysis</h3><h4><a name="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-" id="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-"></a>Counting Reads that Overlap with Annotation Ranges&nbsp;</h4><p>The GenomicRanges package provides support for importing into R short read alignment data in BAM format (via Rsamtools) and associating them with genomic feature ranges, such as exons or genes. This way one can quantify the number of reads aligning to annotated genomic regions. The package defines general purpose containers for storing genomic intervals as well as more specialized containers for storing alignments against a reference genome. The two main functions for read counting provided by this infrastructure are <span>countOverlaps <span style="color: #000000;"><span>and</span></span> summarizeOverlaps</span>. For their proper usage, it is important to read the corresponding <a href="http://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf">PDF manual</a>. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-DESeq" id="TOC-Differential-Gene-Expression-Analysis-with-DESeq"></a>Differential Gene Expression Analysis with DESeq</h4><p>The DESeq package contains functions to call differentially expressed genes (DEGs) in count tables based on a model using the negative binomial distribution. 
It expects as input a data frame with the raw read counts per region/gene of interest (rows) for each test sample (columns).&nbsp; Such a count table can be imported into R or generated from BAM alignment files using the <span>countOverlaps</span> function as introduced above. <a href="http://www.bioconductor.org/packages/release/bioc/html/DESeq.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-edgeR" id="TOC-Differential-Gene-Expression-Analysis-with-edgeR"></a>Differential Gene Expression Analysis with edgeR</h4><p>The edgeR package uses empirical Bayes estimation and exact tests based on the negative binomial distribution to call differentially expressed genes (DEGs) in count data.&nbsp;</p>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/edgeR.html">Documentation</a></p>
<p><span style="color: #000000;">A variety of additional R packages are available for normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG): <br /> </span></p><p><a href="http://bioconductor.org/packages/devel/bioc/html/easyRNASeq.html">easyRNASeq</a> (simplifies read counting per genome feature)</p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html">DEXSeq</a> (Inference of differential exon usage);&nbsp;<a href="http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html">parathyroidSE</a> explains how to generate exon read counts in R</p><p><a href="http://bioconductor.org/packages/release/bioc/html/DEGseq.html">DEGseq</a></p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/baySeq.html">baySeq</a> (also see: <a href="http://www.bioconductor.org/packages/release/bioc/html/segmentSeq.html">segmentSeq</a>)</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a> (<a href="http://www.hubmed.org/display.cgi?uids=20167110">Bullard et al. 2010</a>)</p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-Detection-of-Alternative-Splice-Junctions" id="TOC-Detection-of-Alternative-Splice-Junctions"></a>Detection of Alternative Splice Junctions</h4>
<p><span style="color: #000000;">Another utility of RNA-Seq experiments is the analysis of splice junctions. The following software suggestions provide this utility:</span></p>
<p><a href="http://woldlab.caltech.edu/rnaseq/">ERANGE<br /> </a><a href="http://tophat.cbcb.umd.edu/">TopHat</a></p><p><a href="http://biogibbs.stanford.edu/%7Ekinfai/SpliceMap/">SpliceMap</a></p><p><a href="http://solidsoftwaretools.com/gf/project/splitseek/">SplitSeek</a></p><h3><a name="TOC-DNA-Methylation-Data-Analysis" id="TOC-DNA-Methylation-Data-Analysis"></a>DNA-Methylation Data Analysis</h3><div><ul>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/help/course-materials/2012/BiocEurope2012/mattia_pelizzola_methylPipe.pdf">methylPipe</a></span></li>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/packages/devel/bioc/html/bsseq.html">bsseq</a></span></li>
<li><a href="http://www.bioconductor.org/packages/devel/bioc/html/BiSeq.html">BiSeq</a></li>
<li>Many more under <a href="http://www.bioconductor.org/packages/devel/BiocViews.html#___DNAMethylation">BiocViews</a></li>
</ul></div></div></div><h3><a name="TOC-HT-Seq-Data-Visualization" id="TOC-HT-Seq-Data-Visualization"></a>HT-Seq Data Visualization</h3>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/ggbio.html">ggbio</a>: ggplot2 extension for genomics data (<a href="http://tengfei.github.com/ggbio/">online manual</a>)</p><p><a href="http://www.bioconductor.org/packages/devel/bioc/html/Gviz.html">Gviz</a>:&nbsp;Plotting data and annotation information along genomic coordinates</p><p><a href="http://bioconductor.org/packages/release/bioc/html/HilbertVis.html">HilbertVis</a>: Hilbert genome plots</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/GenomeGraphs.html">GenomeGraphs</a>: Plotting genomic information from Ensembl</p><p><a href="http://www.hubmed.org/display.cgi?uids=18507856">TileQC</a>: Flow Cell Quality Visualization</p><p><a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">rtracklayer</a>: R interface to genome browsers</p><p><a href="http://genoplotr.r-forge.r-project.org/">genoPlotR</a>: Plotting maps of genes and genomes</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a>: Tools for storing, accessing, analyzing and visualizing genomic data.</p><p>&nbsp;</p><p>To install all of the packages used above:</p><blockquote><p>source("http://bioconductor.org/biocLite.R")<br />biocLite()<br />biocLite(c("ShortRead", "Biostrings", "IRanges", "BSgenome", "rtracklayer", "biomaRt", "chipseq", "ChIPpeakAnno", "Rsamtools", "BayesPeak", "PICS", "GenomicRanges", "DESeq", "edgeR", "leeBamViews", "GenomicFeatures", "BSgenome.Celegans.UCSC.ce2"))</p></blockquote></div>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/88/regular-expression-cheat-sheet</guid>
	<pubDate>Tue, 09 Jul 2013 17:38:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/88/regular-expression-cheat-sheet</link>
	<title><![CDATA[Regular Expression Cheat Sheet]]></title>
	<description><![CDATA[<p><span>Regular expressions are the soul of the Perl language, and for a bioinformatician they are a magic wand for taming gigantic string data. We did not find a good, user-friendly regular expression cheat sheet, so we wrote our own.&nbsp;</span><span>The Regular Expressions Cheat Sheet is a quick reference guide for regular expressions, including symbols, ranges, grouping, assertions and some sample patterns to get you started.</span></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/88" length="14944" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/783/perl-module-installation</guid>
	<pubDate>Fri, 12 Jul 2013 11:19:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/783/perl-module-installation</link>
	<title><![CDATA[Perl Module Installation]]></title>
	<description><![CDATA[<p>Nice step-by-step information on Perl module installation.</p><p>Address of the bookmark: <a href="http://bioinformaticsonline.com/blog/view/710/how-to-install-perl-modules-manually-using-cpan-command-and-other-quick-ways" rel="nofollow">http://bioinformaticsonline.com/blog/view/710/how-to-install-perl-modules-manually-using-cpan-command-and-other-quick-ways</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/2376/citrus-perl</guid>
	<pubDate>Wed, 14 Aug 2013 14:57:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/2376/citrus-perl</link>
	<title><![CDATA[Citrus Perl]]></title>
	<description><![CDATA[<p>Citrus Perl is a binary distribution of Perl created for GUI application developers. The distribution includes <a href="http://wxperl.sourceforge.net">wxPerl</a>, the Perl wrapper for <a href="http://www.wxwidgets.org">wxWidgets</a>. Where supported by the operating system, wxWidgets is available as a package for the 2.8.x stable branch and the 2.9.x development branch.</p><p>Address of the bookmark: <a href="http://www.citrusperl.com/" rel="nofollow">http://www.citrusperl.com/</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/4037/perl-and-bioperl-tutorials</guid>
	<pubDate>Wed, 28 Aug 2013 05:51:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/4037/perl-and-bioperl-tutorials</link>
	<title><![CDATA[Perl and BioPerl Tutorials]]></title>
	<description><![CDATA[<p>This bookmark was created to keep useful Perl and BioPerl tutorial links in one place. Feel free to share it and add more useful tutorial links here.&nbsp;</p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="http://cbb.sjtu.edu.cn/course/database/beginning.pdf" rel="nofollow">http://cbb.sjtu.edu.cn/course/database/beginning.pdf</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11181/perl-one-liner-for-bioinformatician</guid>
	<pubDate>Fri, 30 May 2014 05:49:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11181/perl-one-liner-for-bioinformatician</link>
	<title><![CDATA[Perl one-liner for bioinformatician !!!]]></title>
	<description><![CDATA[<p>With the emergence of NGS technologies and the resulting flood of sequencing data, most bioinformaticians munge and wrangle massive amounts of genomics text. Although there are several "standardized" file formats (FASTQ, SAM, VCF, etc.) and tools for manipulating them (fastx toolkit, samtools, vcftools, etc.), there are still times when knowing a few Perl one-liners is extremely helpful.</p><p>Perl one-liners are small Perl programs that fit in a single line of code and do one thing really well. These things include changing line spacing, numbering lines, doing calculations, converting and substituting text, deleting and printing certain lines, parsing logs, editing files in-place, doing statistics, carrying out system administration tasks, updating a bunch of files at once, and many more. Perl one-liners will make you a shell warrior. Anything that took you minutes to solve will now take you seconds!<br /><br />perl -pe '$\="\n"'&nbsp; &nbsp;<br />#double space a file<br /><br />perl -pe '$_ .= "\n" unless /^$/' <br />#double space a file except blank lines<br /><br />perl -pe '$_.="\n"x7' <br />#add 7 blank lines after every line<br /><br />perl -ne 'print unless /^$/' <br />#remove all blank lines<br /><br />perl -lne 'print if length($_) &lt; 20' <br />#print all lines shorter than 20 characters<br /><br />perl -00 -pe '' <br />#collapse runs of multiple blank lines into a single blank line<br /><br />perl -00 -pe '$_.="\n"x4' <br />#Expand single blank lines into 4 consecutive blank lines<br /><br />perl -pe '$_ = "$. $_"'<br />#Number all lines in a file<br /><br />perl -pe '$_ = ++$a." $_" if /./' <br />#Number only non-empty lines in a file<br /><br />perl -ne 'print ++$a." $_" if /./' <br />#Number and print only non-empty lines in a file<br /><br />perl -pe '$_ = ++$a." $_" if /regex/' <br />#Number only lines that match a pattern<br /><br />perl -ne 'print ++$a." 
$_" if /regex/' <br />#Number and print only lines that match a pattern<br /><br />perl -ne 'printf "%-5d %s", $., $_ if /regex/' <br />#Print lines that match a pattern, prefixed with the line number left-aligned in a 5-character field (perl -ne 'printf "%-5d %s", $., $_' : for all the lines)<br /><br />perl -le 'print scalar(grep{/./}&lt;&gt;)' <br />#print the total number of non-empty lines in a file<br /><br />perl -lne '$a++ if /regex/; END {print $a+0}' <br />#print the total number of lines that match the pattern<br /><br />perl -alne 'print scalar @F' <br />#print the number of fields (words) in each line<br /><br />perl -alne '$t += @F; END { print $t}' <br />#Find total number of words in the file<br /><br />perl -alne 'map { /regex/ &amp;&amp; $t++ } @F; END { print $t }' <br />#find total number of fields that match the pattern<br /><br />perl -lne '/regex/ &amp;&amp; $t++; END { print $t }' <br />#Find total number of lines that match a pattern<br /><br />perl -le '$n = 20; $m = 35; ($m,$n) = ($n,$m%$n) while $n; print $m' <br />#calculate the GCD (greatest common divisor) of two numbers, here 20 and 35<br /><br />perl -le '$a = $n = 20; $b = $m = 35; ($m,$n) = ($n,$m%$n) while $n; print $a*$b/$m' <br />#calculate the LCM (least common multiple) of 20 and 35<br /><br />perl -le '$n=10; $min=5; $max=15; $, = " "; print map { int(rand($max-$min))+$min } 1..$n' <br />#Generates 10 random integers from 5 to 14 ($max itself is never returned)<br /><br />perl -le 'print map { ("a".."z","0".."9")[rand 36] } 1..8'<br />#Generates an 8-character password from the letters a to z and the digits 0 &ndash; 9<br /><br />perl -le 'print map { ("a","t","g","c")[rand 4] } 1..20'<br />#Generates a random nucleotide sequence 20 bases long<br /><br />perl -le 'print "a"x50'<br />#generate a string of 50 &lsquo;a&rsquo; characters<br /><br />perl -le 'print join ", ", map { ord } split //, "hello world"'<br />#print the ASCII value of each character in the string hello world<br /><br />perl -le '@ascii = (99, 111, 100, 105, 110, 103); print pack("C*", 
@ascii)'<br />#converts ASCII values into a character string<br /><br />perl -le '@odd = grep {$_ % 2 == 1} 1..100; print "@odd"'<br />#Generates an array of odd numbers from 1 to 100<br /><br />perl -le '@even = grep {$_ % 2 == 0} 1..100; print "@even"'<br />#Generates an array of even numbers from 1 to 100<br /><br />perl -lpe 'y/A-Za-z/N-ZA-Mn-za-m/' file <br />#ROT13-encode the entire file (rotate every letter by 13 positions)<br /><br />perl -nle 'print uc' <br />#Convert all text to uppercase<br /><br />perl -nle 'print lc' <br />#Convert all text to lowercase<br /><br />perl -nle 'print ucfirst lc' <br />#Uppercase only the first letter of each line and lowercase the rest<br /><br />perl -ple 'y/A-Za-z/a-zA-Z/' <br />#Convert upper case to lower case and vice versa<br /><br />perl -ple 's/(\w+)/\u$1/g' <br />#Capitalize the first letter of every word<br /><br />perl -pe 's|\n|\r\n|' <br />#Convert Unix newlines into DOS newlines<br /><br />perl -pe 's|\r\n|\n|' <br />#Convert DOS newlines into Unix newlines<br /><br />perl -pe 's|\n|\r|' <br />#Convert Unix newlines into Mac newlines<br /><br />perl -pe '/regexp/ &amp;&amp; s/foo/bar/' <br />#Substitute foo with bar, but only on lines that match regexp.</p><p>Reference/Sources:</p><p>http://genomics-array.blogspot.in/2010/11/some-unixperl-oneliners-for.html</p><p><a href="http://genomespot.blogspot.com/2013/08/a-selection-of-useful-bash-one-liners.html">http://genomespot.blogspot.com/2013/08/a-selection-of-useful-bash-one-liners.html</a></p><p><a href="http://biowize.wordpress.com/2012/06/15/command-line-magic-for-your-gene-annotations/">http://biowize.wordpress.com/2012/06/15/command-line-magic-for-your-gene-annotations/</a></p><p><a href="http://genomics-array.blogspot.com/2010/11/some-unixperl-oneliners-for.html">http://genomics-array.blogspot.com/2010/11/some-unixperl-oneliners-for.html</a></p><p><a href="http://bioexpressblog.wordpress.com/2013/04/05/split-multi-fasta-sequence-file/">http://bioexpressblog.wordpress.com/2013/04/05/split-multi-fasta-sequence-file/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/22571/pattern-matching-problem-solution-with-perl</guid>
	<pubDate>Tue, 09 Jun 2015 23:58:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/22571/pattern-matching-problem-solution-with-perl</link>
	<title><![CDATA[Pattern Matching Problem Solution with Perl]]></title>
	<description><![CDATA[<p>Problem at http://rosalind.info/problems/1c/</p><p>#Find all occurrences of a pattern in a string.<br />#Given: Strings Pattern and Genome.<br />#Return: All starting positions in Genome where Pattern appears as a substring. Use 0-based indexing.<br /><br />use strict;<br />use warnings;<br /><br />my $string="GATATATGCATATACTT";<br />my $subStr="ATAT";<br />my $kmer=length($subStr);<br /><br />kmerMatch ($string, $subStr, $kmer);<br /><br />sub kmerMatch { #Check for exactly matching kmers with a sliding window<br />my ($string, $myStr, $kmer)=@_;<br />for (my $aa=0; $aa&lt;=(length($string)-$kmer); $aa++) {<br />&nbsp;&nbsp;&nbsp; my $myWin=substr&nbsp; $string, $aa,$kmer; #window of length $kmer starting at position $aa<br />&nbsp;&nbsp;&nbsp; if ($myWin eq $myStr) {<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; #print "$myWin eq $myStr\n";<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; print "$aa "; #separate positions with a space so they do not run together<br />&nbsp;&nbsp;&nbsp; }<br />}<br />print "\n";<br />}</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>