BOL: Related items

Software developed in the Pevsner lab

Robert M Willioms — Mon, 06 Oct 2014 12:41:26 -0500

DRAGON: Database Referencing of Array Genes Online

SNOMAD: Standardization and Normalization of Microarray Data

SNPduo: SNP Analysis Between Two Individuals

SNPtrio: Analyzing and Visualizing Inheritance Patterns in Trios

SNPscan: Data Analysis and Visualization of SNP Data

pediSNP: Analyze SNP Data From a Pedigree of Two Generations

kcoeff: Calculate Cotterman Coefficients of SNP Genotype Data

triPOD: Detects chromosomal abnormalities in parent-child trio-based microarray data

Address of the bookmark: http://pevsnerlab.kennedykrieger.org/php/?q=software

Bioinformatics WalkIn at NII

Fri, 04 Sep 2015 21:48:15 -0500

ADVERTISEMENT OF WALK-IN-INTERVIEW

NAME OF THE POST : Bioinformatician (Part time 3 days in a week) (One Position only)

DURATION : One Year

NAME OF THE PROJECT : Next generation sequencing facility

EDUCATIONAL QUALIFICATIONS : At least a Master's degree in Bioinformatics and a Bachelor's degree in any stream of life sciences

REQUIREMENTS :

Around 5 years of experience and a proven track record in next generation sequence data analysis (supported by publications in peer-reviewed journals), and the ability to analyze transcriptomics, ChIP-seq, and small RNA-seq data.

Should have the ability to analyze raw primary data generated by Illumina next generation sequencing platforms and create / troubleshoot custom analysis pipelines.

Should have the ability to handle all downstream secondary and tertiary data analysis using commercially available as well as open source software (transcriptomics, ChIP-seq, small RNA-seq).

Apart from these, the applicant should have knowledge of the following:

Programming: Perl and Python
Operating system: Linux and Windows
NGS analysis tools: Maq, BWA, Bowtie, SAMtools, BEDTools, MACS, Galaxy, FastQC, Bismark, MEDIPS, TopHat, Cufflinks, AvadisNGS, CLC Genomics Workbench, BaseSpace, Trinity
Statistics: Microsoft Excel and R
Database: MySQL
Genome browsers: UCSC, Ensembl, IGV, IGB
Motif analysis tools: MEME Suite, TRANSFAC and RSAT
Functional annotation tools: DAVID, GeneCodis, GeneCards
Networking tools: Cytoscape

EMOLUMENTS : The incumbent will be paid a fee of Rs. 2000/- per sitting/ per day.

SCIENTIST NAME : Dr. Arnab Mukhopadhyay,

Staff Scientist V, Next generation sequencing facility

SCIENTIST’S E-MAIL ID : arnab@nii.ac.in

WALK IN INTERVIEW ON : 18th September, 2015

REGISTRATION OF CANDIDATES: 10.30 AM to 11.00 AM

PLEASE NOTE:

1. CANDIDATE MAY FILL UP APPLICATION IN THE PRESCRIBED FORMAT ALONG WITH NECESSARY DOCUMENTS FOR VERIFICATION.
2. APPLICATIONS CONTAINING INCOMPLETE INFORMATION SHALL NOT BE ENTERTAINED.
3. DATE OF PASSING THE EXAMINATIONS MUST BE INDICATED CLEARLY.
4. ONLY REGISTERED CANDIDATES WILL BE INTERVIEWED.
5. NO TA/DA WILL BE PAID FOR ATTENDING THE INTERVIEW.

PRESCRIBED FORM

1. NAME
2. FATHER’S NAME
3. MOTHER’S NAME
4. DATE OF BIRTH
5. SEX (MALE/FEMALE)
6. CATEGORY (SC/ ST/ OBC/ PH)
7. ADDRESS a. (CORRESPONDENCE) b. (PERMANENT)
8. E MAIL, TELEPHONE NO. & MOBILE No (if any)
9. ACADEMIC & PROFESSIONAL QUALIFICATIONS
   NAME OF EXAMINATION PASSED WITH SUBJECTS | YEAR OF PASSING | BOARD/ UNIVERSITY | PERCENTAGE/ DIVISION | REMARKS
10. PAST EXPERIENCE & PRESENT EMPLOYMENT, IF ANY
11. CANDIDATES SHOULD STATE CLEARLY WHETHER THEY HAVE BEEN AWARDED PH.D DEGREE OR THESIS HAS BEEN SUBMITTED.
12. HAVE YOU APPLIED FOR A POSITION EARLIER IN THE INSTITUTE? IF SO:- (1) THE DETAILS OF THE PROJECT AND PROJECT INVESTIGATOR (2) IF CALLED FOR INTERVIEW, RESULTS THEREOF

More at http://www1.nii.res.in/sites/default/files/walkininterview-18sept2015.pdf

Postdoctoral Fellowship in Bioinformatics at pesolelab

Thu, 01 Oct 2015 07:20:48 -0500

Job Description: Bioinformatics postdoc positions are available in the area of genomics with main focus on exome and RNAseq technologies by ultra high-throughput sequencing platforms. Successful applicants should have the following qualities:

1) demonstrated experience in Bioinformatics research,
2) programming experience (python and/or R, C and C++ are very welcome),
3) knowledge of Linux/Unix environment,
4) experience in handling deep-seq data,
5) highly motivated and hard working, and
6) interested in working with a multi-disciplinary team combining bioinformatics, genomics, and computational biology approaches with experimental biology.

Our research interest covers different areas of bioinformatics and genomics in order to achieve a deeper understanding of gene and genome structure and function (please look at our PubMed publications for more details about our research http://www.ncbi.nlm.nih.gov/pubmed/?term=pesole+g).

Interested applicants should email the curriculum vitae to Prof. Graziano Pesole at graziano.pesole@uniba.it or Dr. Ernesto Picardi at Ernesto.picardi@uniba.it.

Start date: immediate

Duration: up to 24 months
Contact Person (Referent): Ernesto Picardi
Ref. E-Mail: ernesto.picardi@uniba.it
Tel: +390805443308
Fax: +390805443317

Group Web Page: http://www.pesolelab.it/

HOMER: Software for motif discovery and next-gen sequencing analysis

Neel — Tue, 26 Apr 2016 03:48:23 -0500

This tutorial covers topics that are independent of HOMER but important to understand before diving head first into more advanced analysis tools such as HOMER.

Setting up your computing environment
Retrieving and storing sequencing files (your own data or from public sources)
Checking sequence quality, trimming, general sequence manipulation
Mapping reads to a reference genome
Manipulating SAM/BAM alignment files
Visualizing data in a genome browser

RNA-Seq

Microarray

Basic analysis of Affymetrix Gene Expression Arrays using R/Bioconductor

General Tips for Data Analysis

Excel workarounds, adding gene annotation, X-Y plots tips, etc.

Address of the bookmark: http://homer.salk.edu/homer/basicTutorial/

Release Notes for Genome Workbench 2.10.5

Jit — Thu, 12 May 2016 13:49:41 -0500

New Features in latest release

New ProSplign tool integrated with Genome Workbench (Tutorial, Video)
New export function for BAM/cSRA coverage graphs (Tutorial)
New export function for alignments in GFF3 format (Tutorial)
Tree View: implemented new export mode based on selections (tutorial coming)
Tree View: added support for distance based circular trees
Tree View: new rooting mode (Midpoint Root) results in more balanced trees (Tutorial)
Tree View: added possibility to right-click on an edge between two nodes and "Place Root at Middle of Branch" – to re-root at mid-branch (Tutorial)

Tools for Searching Repeats And Palindromic Sequences

Radha Agarkar — Sat, 21 May 2016 22:32:25 -0500

What are genomic interspersed repeats?

In the mid 1960s, scientists discovered that many genomes contain stretches of highly repetitive DNA sequences (see Reassociation Kinetics Experiments and C-Value Paradox). These sequences were later characterized and placed into five categories:

Simple Repeats - Duplications of simple sets of DNA bases (typically 1-5 bp) such as A, CA, CGG, etc.
Tandem Repeats - Typically found at the centromeres and telomeres of chromosomes, these are duplications of more complex 100-200 base sequences.
Segmental Duplications - Large blocks of 10-300 kilobases which have been copied to another region of the genome.
Interspersed Repeats
    Processed Pseudogenes, Retrotranscripts, SINES - Non-functional copies of RNA genes which have been reintegrated into the genome with the assistance of a reverse transcriptase.
    DNA Transposons
    Retrovirus Retrotransposons
    Non-Retrovirus Retrotransposons (LINES)
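
As a rough illustration of the first category, the sketch below scans a sequence for short units repeated back to back. The function name, the regular expression and the thresholds (units of 1-5 bp, at least three copies) are assumptions chosen for this example; it is not how any repeat-finding tool mentioned on this page works internally.

```python
import re


def find_simple_repeats(seq, max_unit=5, min_copies=3):
    """Return (start, unit, copies) for short units repeated in tandem.

    Hypothetical helper: max_unit and min_copies are illustrative defaults.
    """
    # The lazy group tries the shortest unit first; \1{n,} then requires
    # at least min_copies - 1 further copies immediately after it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


print(find_simple_repeats("TTCACACACAGG"))  # [(2, 'CA', 4)]
```

Positions are 0-based, and bases other than A, C, G and T (for example N) simply break a run.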

Currently, up to 50% of the human genome is known to be repetitive in nature, and this number is expected to increase as detection methods improve.

In genetics, by contrast, the term palindrome refers to a sequence of nucleotides along a DNA (deoxyribonucleic acid) or RNA (ribonucleic acid) strand that contains the same series of nitrogenous bases regardless of the direction from which the strand is analyzed. Akin to a language palindrome, wherein a word or phrase is spelled the same left-to-right as right-to-left (e.g., the word RADAR or the phrase "able was I ere I saw elba"), with genetic palindromes it does not matter whether the nucleic acid strand is read starting from the 3' (three prime) end or the 5' (five prime) end of the strand.

Recent research on palindromes centers on understanding palindrome formation during gene amplification. Other studies have attempted to relate palindrome formation to molecular mechanisms involved in double stranded breaks and in the formation of inverted repeats. Assisted by high speed computers, other groups of scientists link palindrome formation to the conservation of genetic information.

Related to the direction of transcription by RNA polymerase, DNA strands have upstream and downstream termini defined by differing chemical groups at each end. The ends of each strand of DNA or RNA are termed the 5' end (a phosphate group bound to the 5' carbon) and the 3' end (a hydroxyl group on the 3' carbon) to indicate a polarity within the molecule. Using the letters A, T, C, G to represent the nitrogenous bases adenine, thymine, cytosine, and guanine found in DNA, and the letters A, U, C, G to represent the nitrogenous bases adenine, uracil, cytosine, and guanine found in RNA (note that uracil in RNA replaces the thymine found in DNA), geneticists usually represent DNA by a series of base codes (e.g., 5' AATCGGATTGCA 3'). The base codes are usually arranged from the 5' end to the 3' end.

Because of specific base pairing in DNA (i.e., adenine (A) always bonds with thymine (T) and cytosine (C) always bonds with guanine (G)), the complementary strand to the sequence 5' AATCGGATTGCA 3' would be 3' TTAGCCTAACGT 5'.
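
The base-pairing rule can be sketched in a few lines of Python. The function names here are made up for illustration, and the sketch assumes plain DNA letters (A, C, G, T) with no ambiguity codes.

```python
# Translation table for Watson-Crick pairing: A<->T, C<->G.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Pair each base in place; the result reads 3'->5' if seq is 5'->3'."""
    return seq.upper().translate(PAIRING)


def reverse_complement(seq):
    """The complementary strand written in the conventional 5'->3' order."""
    return complement(seq)[::-1]


print(complement("AATCGGATTGCA"))          # TTAGCCTAACGT
print(reverse_complement("AATCGGATTGCA"))  # TGCAATCCGATT
```

The first result matches the 3' TTAGCCTAACGT 5' strand given in the text. The second is the same strand rewritten from its 5' end.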

With palindromes, the sequences on the complementary strands read the same in the 5' to 3' direction. For example, a sequence of 5' GAATTC 3' on one strand would be complemented by a 3' CTTAAG 5' strand. When either strand is read from its 5' end, the sequence is GAATTC. A different kind of example is the sequence 5' CGAAGC 3', which still reads CGAAGC when the order of its letters is reversed on a single strand; this is a mirror repeat in the literal sense, since its complementary strand reads GCTTCG from the 5' end.
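
The two examples above behave differently under the two possible tests, which a short sketch makes concrete. The function names are hypothetical. The first check asks whether a sequence equals its own reverse complement (the sense in which GAATTC is a palindrome); the second asks only whether the letters read the same backwards on one strand.

```python
def is_reverse_complement_palindrome(seq):
    """True if both strands read the same from their 5' ends."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    seq = seq.upper()
    return seq == "".join(pairs[base] for base in reversed(seq))


def is_mirror_repeat(seq):
    """True if the letters on a single strand read the same backwards."""
    seq = seq.upper()
    return seq == seq[::-1]


for s in ("GAATTC", "CGAAGC"):
    print(s, is_reverse_complement_palindrome(s), is_mirror_repeat(s))
# GAATTC True False
# CGAAGC False True
```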

Palindromes are important sequences within nucleic acids. Often they are the site of binding for specific enzymes (e.g., restriction endonucleases) designed to cut the DNA strands at specific locations (i.e., at palindromes).

Palindromes may arise from breakage and chromosomal inversions that form inverted repeats that complement each other. When a palindrome results from an inversion, it is often referred to as an inverted repeat. For example, the sequence 5' CGAAGC 3', if inverted (reversed 180°), still reads CGAAGC.

The European Molecular Biology Open Software Suite (EMBOSS) includes some basic tools for finding tandem repeats and inverted repeats (see B.6.22. Applications in group Nucleic:repeats). There are many on-line services providing the EMBOSS tools, for example:

Wageningen Bioinformatics Webportal EMBOSS explorer
Mobyle@Pasteur
Soaplab2 Web Services at Vital-IT

For more sophisticated repeat finding you will want to look at tools using Repbase for example:

Other nucleotide repeat finding methods found by a couple of web searches:

Cytoscape

Anjana — Mon, 23 May 2016 02:32:00 -0500

Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. A lot of Apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.

Address of the bookmark: http://www.cytoscape.org/

Anvio

Shruti Paniwala — Thu, 16 Jun 2016 18:15:41 -0500

In a nutshell

Anvi’o is an analysis and visualization platform for ‘omics data.

Please find the methods paper here: https://peerj.com/articles/1319/

Anvi’o would not have been possible without the help of many people who directly or indirectly contributed to its development. Here is the acknowledgements section of our methods paper

Project page: http://merenlab.org/projects/anvio

Paper: https://peerj.com/articles/1839/

Address of the bookmark: https://github.com/meren/anvio

Kraken: ultrafast metagenomic sequence classification using exact alignments

Jit — Mon, 27 Jun 2016 11:01:44 -0500

Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/.
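
To give a feel for what classification by exact k-mer matching means, here is a deliberately simplified toy in Python. It is not Kraken's algorithm: the published method maps each k-mer to the lowest common ancestor of the genomes containing it and scores paths in the taxonomy tree, whereas this sketch just counts k-mers unique to one reference and takes a majority vote. All names, sequences and the choice k=5 are invented for the example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon with the most unambiguous exact k-mer hits, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer)
        if taxa and len(taxa) == 1:  # ignore k-mers shared between taxa
            votes[next(iter(taxa))] += 1
    return votes.most_common(1)[0][0] if votes else None


# Made-up reference sequences, far shorter than real genomes.
references = {
    "taxon_A": "ACGTACGTTAGCTAGCTAGGATCC",
    "taxon_B": "TTGACCGGTTAACCGGATATCGCG",
}
index = build_index(references, k=5)
print(classify("GCTAGCTAGG", index, k=5))  # taxon_A
print(classify("CCGGATATCG", index, k=5))  # taxon_B
```

Because lookups are dictionary hits rather than alignments, the cost per read grows only with the number of k-mers in the read, which is the intuition behind the speed figures quoted above.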

Krona

https://sourceforge.net/p/krona/home/krona/

Address of the bookmark: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053813/

Fancy Oneliner for Bioinformatics !!

Poonam Mahapatra — Thu, 07 Jul 2016 12:05:50 -0500

This webpage lists some of the one-liners that we frequently use in metagenomic analyses. You can click on the following links to browse through different topics. You can copy/paste the commands as they are in your terminal screen, provided you follow the same naming conventions and folder structures as we have. We are sharing these codes with the intention that if they are useful and help you in your analyses, then we will be appropriately credited as considerable effort has been put into devising them.

Address of the bookmark: http://userweb.eng.gla.ac.uk/umer.ijaz/bioinformatics/oneliners.html