BOL: All Site Activity

- Rahul Nayak@rahul
Rahul Nayak commented on a bookmark R and Bioconductor Tutorial in the group R and Bioconductor 3183 days ago

Learn R by yourself: www.datasciencecentral.com/profiles/blogs/learning-r-in-seven-simple-steps
- Neel@neelam
Neel bookmarked Picard 3183 days ago

Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the Hts-specs repository. See especially the SAM specification and the VCF...

http://broadinstitute.github.io/picard/
- Poonam Mahapatra@poonam
Poonam Mahapatra bookmarked Easyfig 3183 days ago

Easyfig has moved to github, for newer releases of Easyfig please visit our new webpage - https://mjsull.github.io/Easyfig. Easyfig is a Python application for creating linear comparison figures of multiple genomic loci with an easy-to-use...

http://easyfig.sourceforge.net/
- Poonam Mahapatra@poonam
Poonam Mahapatra commented on the blog Computer simulation of genetic mechanism !! 3183 days ago

Thanks for the list. I came across GPOPSIM: a simulation tool for whole-genome genetic data ( http://bmcgenet.biomedcentral.com/articles/10.1186/s12863-015-0173-4 ), which seems to be a useful tool for the methodological and...
- Jit@jit.aber
Jit bookmarked GATB : Genome Analysis Toolbox with de-Bruijn graph 3184 days ago

The Genome Analysis Toolbox with de-Bruijn graph (GATB) provides a set of highly efficient algorithms to analyse NGS data sets. These methods enable the analysis of data sets of any size on multi-core desktop computers, including very huge...

https://gatb.inria.fr/
- Abhi@abhinav
Abhi bookmarked RASTtk : algorithm for building custom annotation pipelines and annotating batches of genomes 3185 days ago

The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes...

http://rast.nmpdr.org/
- Abhi@abhinav
Abhi posted a new ad in the Opportunity Bioinformatics Faculty at TNU 3185 days ago
- Jit@jit.aber
Jit bookmarked Smash: An alignment-free method to find and visualise rearrangements between pairs of DNA sequences 3186 days ago

Smash is a completely alignment-free method/tool to find and visualise genomic rearrangements. The detection is based on conditional exclusive compression, namely using a FCM (Markov model), of high context order (typically 20). For...

http://bioinformatics.ua.pt/software/smash/
- Jit@jit.aber
Jit bookmarked MEDEA: Comparative Genomic Visualization with Adobe Flash 3186 days ago

As the number of sequenced and annotated genomes grows larger, the need to understand, compare, and contrast the data becomes increasingly important. Using the power of the human visual system to detect trends and spot outliers is necessary in such...

http://www.broadinstitute.org/annotation/medea/
- Jit@jit.aber
Jit bookmarked CANU: Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing. 3186 days ago

Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). The software is currently alpha level, feel free to use and report issues encountered. Canu is...

https://github.com/marbl/canu
Comments
- Poonam Mahapatra@poonam
  
  Poonam Mahapatra 2430 days ago
  Canu is one of the best de novo assemblers available for long reads - it’s a fork and updated version of the Celera assembler that was used to assemble the human genome.
  It is quite a complex beast that has HPC integration built in - though you can turn this off. However, large assembly jobs are best run in parallel, making HPC integration essential. This can get tough if your cluster has a non-standard configuration.
  Run canu without any options to get help:
  canu
  This produces:
  usage: canu [-version] \
              [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
              -p <assembly-prefix> \
              -d <assembly-directory> \
              genomeSize=<number>[g|m|k] \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
  By default, all three stages (correct, trim, assemble) are computed.
  To compute only a single stage, use:
    -correct        - generate corrected reads
    -trim           - generate trimmed reads
    -assemble       - generate an assembly
    -trim-assemble  - generate trimmed reads and then assemble them
  The assembly is computed in the (created) -d <assembly-directory>, with most
  files named using the -p <assembly-prefix>.
  The genome size is your best guess of the genome size of what is being assembled.
  It is used mostly to compute coverage in reads. Fractional values are allowed:
  '4.7m' is the same as '4700k' and '4700000'
  A full list of options can be printed with '-options'. All options can be
  supplied in an optional sepc file.
  Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
  Reads are specified by the technology they were generated with:
    -pacbio-raw <files>
    -pacbio-corrected <files>
    -nanopore-raw <files>
    -nanopore-corrected <files>
  Complete documentation at http://canu.readthedocs.org/en/latest/
  Canu has three stages which it runs in order:
  
  Correct
  
  Trim
  
  Assemble
  
  By default canu runs these one after the other, but they can be run individually.
  An example “full pipeline” command would be:
  canu -p meta \
    -d meta \
    genomeSize=40m \
    useGrid=false \
    -nanopore-raw /vol_b/public_data/minion_brown_metagenome/brown_metagenome.2D.10.fasta
  This puts output in directory meta with prefix “meta”. We estimate the genome size, tell canu NOT to use HPC (as we don’t have one for porecamp) and give it some ONT data as fasta.
  This runs pretty quickly but doesn’t assemble anything. It’s a low coverage synthetic metagenome, so no surprise. It does produce corrected reads though! These could be used in the metagenomics practical (hint!)
  Now try the E. coli subset:
  canu -p ecoli -d ecoli genomeSize=4.8m useGrid=false -nanopore-raw /vol_b/public_data/minion_ecoli_sample/ecoli_sample.template.fasta
  This one will take a bit longer ;)
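The two commands above follow the same pattern, so they can be assembled programmatically. The following is a minimal Python sketch, not part of canu itself: the function name `build_canu_command` and its parameters are hypothetical, and it only uses flags that appear in the help text and commands quoted in this comment (`-p`, `-d`, `genomeSize=`, `useGrid=`, the stage flags and the read-type flags). It builds the argument list and prints it; it does not run canu.

```python
import shlex

# Stage and read-type flags as listed in the canu usage text quoted above.
STAGES = {"correct", "trim", "assemble", "trim-assemble"}
READ_TYPES = {"pacbio-raw", "pacbio-corrected", "nanopore-raw", "nanopore-corrected"}


def build_canu_command(prefix, directory, genome_size, read_type, read_files,
                       stage=None, use_grid=False):
    """Return a canu argument list (hypothetical helper, for illustration only)."""
    if stage is not None and stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if read_type not in READ_TYPES:
        raise ValueError(f"unknown read type: {read_type!r}")
    if not read_files:
        raise ValueError("at least one read file is required")

    cmd = ["canu"]
    if stage is not None:
        # Restrict the run to one stage; by default all three stages run.
        cmd.append(f"-{stage}")
    cmd += [
        "-p", prefix,
        "-d", directory,
        f"genomeSize={genome_size}",
        # useGrid=false keeps the run local, as in the commands above.
        f"useGrid={'true' if use_grid else 'false'}",
        f"-{read_type}",
        *read_files,
    ]
    return cmd


# Rebuild the E. coli command from the comment above and show it as a shell line.
ecoli_cmd = build_canu_command(
    prefix="ecoli",
    directory="ecoli",
    genome_size="4.8m",
    read_type="nanopore-raw",
    read_files=["/vol_b/public_data/minion_ecoli_sample/ecoli_sample.template.fasta"],
)
print(shlex.join(ecoli_cmd))
```

Passing the returned list to `subprocess.run(ecoli_cmd, check=True)` would start the assembly if canu is installed and on the PATH. Keeping the arguments as a list avoids shell quoting problems with file paths.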
- Rahul Nayak@rahul
  
  Rahul Nayak 2354 days ago
  ➜ bin git:(master) ✗ ./canu
  usage: canu [-version] [-citation] \
  [-correct | -trim | -assemble | -trim-assemble] \
  [-s <assembly-specifications-file>] \
  -p <assembly-prefix> \
  -d <assembly-directory> \
  genomeSize=<number>[g|m|k] \
  [other-options] \
  [-pacbio-raw |
  -pacbio-corrected |
  -nanopore-raw |
  -nanopore-corrected] file1 file2 ...
  example: canu -d run1 -p godzilla genomeSize=1g -nanopore-raw reads/*.fasta.gz
  
  To restrict canu to only a specific stage, use:
  -correct - generate corrected reads
  -trim - generate trimmed reads
  -assemble - generate an assembly
  -trim-assemble - generate trimmed reads and then assemble them
  The assembly is computed in the -d <assembly-directory>, with output files named
  using the -p <assembly-prefix>. This directory is created if needed. It is not
  possible to run multiple assemblies in the same directory.
  The genome size should be your best guess of the haploid genome size of what is being
  assembled. It is used primarily to estimate coverage in reads, NOT as the desired
  assembly size. Fractional values are allowed: '4.7m' equals '4700k' equals '4700000'
  Some common options:
  useGrid=string
  - Run under grid control (true), locally (false), or set up for grid control
  but don't submit any jobs (remote)
  rawErrorRate=fraction-error
  - The allowed difference in an overlap between two raw uncorrected reads. For lower
  quality reads, use a higher number. The defaults are 0.300 for PacBio reads and
  0.500 for Nanopore reads.
  correctedErrorRate=fraction-error
  - The allowed difference in an overlap between two corrected reads. Assemblies of
  low coverage or data with biological differences will benefit from a slight increase
  in this. Defaults are 0.045 for PacBio reads and 0.144 for Nanopore reads.
  gridOptions=string
  - Pass string to the command used to submit jobs to the grid. Can be used to set
  maximum run time limits. Should NOT be used to set memory limits; Canu will do
  that for you.
  minReadLength=number
  - Ignore reads shorter than 'number' bases long. Default: 1000.
  minOverlapLength=number
  - Ignore read-to-read overlaps shorter than 'number' bases long. Default: 500.
  A full list of options can be printed with '-options'. All options can be supplied in
  an optional sepc file with the -s option.
  Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
  Reads are specified by the technology they were generated with, and any processing performed:
  -pacbio-raw <files> Reads are straight off the machine.
  -pacbio-corrected <files> Reads have been corrected.
  -nanopore-raw <files>
  -nanopore-corrected <files>
  Complete documentation at http://canu.readthedocs.org/en/latest/
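The help text above says that fractional genome sizes are allowed and that '4.7m' equals '4700k' equals '4700000'. The following is a minimal Python sketch of that equivalence. It is not canu's own parser: the function name `genome_size_to_bases` is hypothetical, the `k` and `m` multipliers follow from the example in the help text, and treating `g` as 1,000,000,000 and accepting only lowercase suffixes are assumptions.

```python
import re
from decimal import Decimal

# Multipliers implied by the help text ('4.7m' == '4700k' == '4700000').
# The value for 'g' is an assumption by extension of the same pattern.
_MULTIPLIERS = {"": 1, "k": 1_000, "m": 1_000_000, "g": 1_000_000_000}

# A number with an optional fractional part, followed by an optional suffix.
_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)([gmk]?)$")


def genome_size_to_bases(text):
    """Convert a genomeSize string such as '4.7m' to a number of bases."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid genome size: {text!r}")
    number, suffix = match.groups()
    # Decimal avoids floating-point rounding on fractional values.
    return int(Decimal(number) * _MULTIPLIERS[suffix])


for size in ("4.7m", "4700k", "4700000"):
    print(size, genome_size_to_bases(size))
```

All three strings give the same number of bases, which is why any of them can be passed as `genomeSize=`.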
- Jit@jit.aber
Jit answered the question Comparison of mapping tools ! 3186 days ago

You should check the segemehl algorithm paper http://bioinformatics.oxfordjournals.org/content/early/2014/03/13/bioinformatics.btu146.full.pdf+html , in which they compare the mapping tools. For further detail of the Algo...
- Jit@jit.aber
Jit commented on a bookmark ALE: a Generic Assembly Likelihood Evaluation Framework for Assessing the Accuracy of Genome and... 3186 days ago

Thanks for reporting the updated tool for assembly validation. You can also try the following methods/pipelines: CEGMA (formally discontinued but still useful), BUSCO (we have issues with fish, seems not to be tailored to that group of...
- Neel@neelam
Neel bookmarked mrFAST: Micro Read Fast Alignment Search Tool 3186 days ago

mrFAST is a read mapper that is designed to map short reads to reference genome with a special emphasis on the discovery of structural variation and segmental duplications. mrFAST maps short reads with respect to user defined error threshold,...

http://mrfast.sourceforge.net/manual.html
- Neel@neelam
Neel bookmarked HOMER: Software for motif discovery and next-gen sequencing analysis 3186 days ago

This tutorial covers topics independently of HOMER, and represents knowledge which is important to know before diving head first into more advanced analysis tools such as HOMER. Setting up your computing environment Retrieving and storing...

http://homer.salk.edu/homer/basicTutorial/
- Neel@neelam
Neel bookmarked ALE: a Generic Assembly Likelihood Evaluation Framework for Assessing the Accuracy of Genome and... 3186 days ago

Assembly Likelihood Evaluation (ALE) framework that overcomes these limitations, systematically evaluating the accuracy of an assembly in a reference-independent manner using rigorous statistical methods. This framework is comprehensive, and...

http://sc932.github.io/ALE/about.html
Comments
- Jit@jit.aber
  
  Jit 3186 days ago
  Thanks for reporting the updated tool for assembly validation. You can also try the following methods/pipelines:
  
  CEGMA (formally discontinued but still useful)
  
  BUSCO (we have issues with fish, seems not to be tailored to that group of organisms, developers tell us they are fixing it)
  
  linkage map? or other map (RAD-tag based). (software?)
  
  BioNanoGenomics can be used for QC also
  
  Use a genome browser to get a feeling for your results, e.g. IGV; add assembly, BAM files, annotation, transcripts mapped and browse
- Jitendra Prajapati@jprajapati81
Jitendra Prajapati commented on a bookmark Venn Diagrams on R Studio 3187 days ago

"How can I generate a Venn diagram in R?" by UCLA is also useful: http://www.ats.ucla.edu/stat/r/faq/venn.htm
- Jitendra Prajapati@jprajapati81
Jitendra Prajapati bookmarked Venn Diagrams on R Studio 3187 days ago

First step: Install & load “VennDiagram” package. # install.packages('VennDiagram') library(VennDiagram) Second step: Load data Add filepath if “catdoge.csv” is not in working-directory. d <-...

http://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html
Comments
- Jitendra Prajapati@jprajapati81
  
  Jitendra Prajapati 3187 days ago
  "How can I generate a Venn diagram in R?" by UCLA is also useful: http://www.ats.ucla.edu/stat/r/faq/venn.htm
- Jit@jit.aber
  
  Jit 2573 days ago
  Six-way Venn diagram
  https://stackoverflow.com/questions/32440128/nice-looking-five-sets-venn-diagrams
- Jit@jit.aber
Jit commented on a page titled Worldwide funding agencies to fund your bioinformatics research !! 3190 days ago

Thanks for such useful links. I found the Sofja Kovalevskaja Award very competitive, with a lot of scope for bioinformaticians. Sofja Kovalevskaja Award – Become a research group leader in Germany € 1.65 million for young researchers from...
- Abhimanyu Singh@abhimanyu
Abhimanyu Singh posted a new ad in the ResearchLabs Desai Lab 3191 days ago
- Jitendra Prajapati@jprajapati81
Jitendra Prajapati created a page Worldwide funding agencies to fund your bioinformatics research !! 3193 days ago

Are you seeking funding for research or training in a particular area? Check out the following agencies ... National Science Foundation: For the love of science! Head here when searching for ways to pay for that gargantuan geology or bigtime...
Comments
- Jit@jit.aber
  
  Jit 3033 days ago
  Bioinformatics funding for Japan
  Promoting science and technology is a key engine to materialize a bright future of Asia and it is vitally important to enhance the exchange of youths in Asian countries and Japan who will play a crucial role in the field of science and technology.
  
  Based on this concept, “Japan-Asia Youth Exchange Program in Science” (SAKURA Exchange Program in Science) is the program for enhancing exchanges between Asia and Japan of the youths who will play a crucial role in the future field of science and technology through the close collaboration of industry-academia-government by facilitating short-term visits of competent Asian youths to Japan. This program aims at raising the interest of Asian youths toward the leading Japanese science and technologies at Japanese universities, research institutions and private companies.
  More at http://www.ssp.jst.go.jp/EN/outline/index.html
- Shruti Paniwala@shruti
  
  Shruti Paniwala 2990 days ago
  The Arturo Falaschi ICGEB Fellowship Programmes for PhD, PostDoc and Short term courses
  The Arturo Falaschi ICGEB Fellowships programme offers long and short-term fellowships for scientists who are nationals of ICGEB Member States to perform research in Trieste, New Delhi or Cape Town.
  More at http://www.icgeb.org/fellowships.html
- Neel@neelam
  
  Neel 1809 days ago
  Try this https://www.birac.nic.in/big.php if you dream of starting your own company
  https://www.uniraj.ac.in/uic/