BOL: Related items

Project Associate-I | Project Associate-II | Senior Project Associate @ IGIB

Thu, 05 Aug 2021 16:11:32 -0500

Experience in Next Generation Sequencing (NGS) applications and interest in genomics/clinical/translational applications, OR good computational programming skills and a deep interest in working at the interface of genomics and clinical applications.

Project Scientist-I
Experimental/computational analysis experience in high-throughput genomics/clinical applications.

Project Manager
Experience in handling large biological projects involving high-throughput genomics/clinical applications.

Scientific Administrative Assistant
Lab Work.

More at https://vinodscaria.genomes.in/positionsopen

Biology, Computers Collide in High-Demand Field of Bioinformatics

Mon, 25 Aug 2014 00:56:10 -0500

Dr. Shivas Amin calls bioinformatics a "collision of biology and computers." Students learn how to use computers and skills in math and biology to analyze genome and proteome projects to prepare for high-demand jobs in the life sciences. Learn more about Amin and hear from student Medina Baitemirova and alumnus Lukas Simon about the fast-growing field of bioinformatics.

The "Ifs" and "Buts" of NGS Quality Control and Trimming

BioStar — Thu, 02 Jan 2025 20:11:07 -0600

Next-Generation Sequencing (NGS) has revolutionized biological research, providing vast amounts of data for a wide range of applications. However, the reliability of NGS analyses heavily depends on the quality of raw sequencing data. Quality control (QC) and trimming are critical preprocessing steps that can make or break your downstream analyses. In this blog, we explore the "ifs" (why you should perform QC and trimming) and the "buts" (challenges or considerations) of this vital step in NGS workflows.

The "Ifs" of NGS QC and Trimming

Ensures Data Integrity
If you want to minimize errors in downstream analyses, QC and trimming remove low-quality reads and bases, ensuring high-confidence data. This step is essential for reliable variant calling, assembly, and other applications.
Removes Contaminants
If adapter sequences or contaminants are present in the raw reads, trimming can eliminate them. This prevents issues like misalignment or incorrect biological interpretations, ensuring cleaner data for analysis.
Improves Mapping and Assembly
If your goal is better alignment to a reference genome or improved de novo assembly, trimming low-quality bases and adapters is critical. High-quality reads map more efficiently and generate more accurate assemblies.
Reduces Computational Load
If you want to save computational resources, trimming reduces the dataset size, which speeds up processing and analysis. Clean datasets mean less computational time spent on processing low-quality data.
Prepares for Standardized Analyses
If your project involves multiple datasets, QC and trimming ensure uniformity across them. This standardization makes comparisons valid and reproducible, particularly in large collaborative studies.

The "Buts" of NGS QC and Trimming

Risk of Over-Trimming
But excessive trimming can lead to the loss of informative sequences, reducing read depth and potentially discarding biologically relevant data. This is especially critical in studies with limited sequencing depth.
Bias Introduction
But trimming algorithms might introduce biases, especially if they inadvertently remove sequences with specific biological patterns. This can skew results and compromise biological insights.
Loss of Context in Paired-End Reads
But trimming one read in a pair more than the other can lead to loss of pairing information. This complicates downstream analyses that rely on paired-end data, such as structural variant detection.
Time and Resource Intensive
But running QC and trimming for large datasets can be computationally expensive and time-consuming. As sequencing depth increases, preprocessing becomes a bottleneck in the analysis pipeline.
Variable Standards
But the criteria for trimming (e.g., quality threshold, minimum read length) can vary between tools and datasets. This variability may affect reproducibility and comparability of results across studies.
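
The paired-end concern above can be illustrated with a minimal sketch. It assumes reads are held in memory as (name, sequence, quality) tuples and that both mates have already been trimmed; the function name `filter_pairs` and the `min_len` default are hypothetical, not taken from any particular tool. The point is that the length filter is applied to the pair as a unit, so the two output streams never fall out of sync.

```python
from typing import Iterable, Iterator, Tuple

# A read is represented as (name, sequence, quality string); this layout is
# an assumption made for the example, not a standard API.
Read = Tuple[str, str, str]


def filter_pairs(
    pairs: Iterable[Tuple[Read, Read]], min_len: int = 36
) -> Iterator[Tuple[Read, Read]]:
    """Yield only the pairs in which BOTH mates are at least min_len long.

    Dropping the whole pair when either mate is too short keeps the
    forward and reverse reads in the same order and count.
    """
    for mate1, mate2 in pairs:
        if len(mate1[1]) >= min_len and len(mate2[1]) >= min_len:
            yield mate1, mate2


if __name__ == "__main__":
    example = [
        (("r1/1", "ACGTACGT", "IIIIIIII"), ("r1/2", "ACGT", "IIII")),
        (("r2/1", "ACGTACGT", "IIIIIIII"), ("r2/2", "TTTTAAAA", "IIIIIIII")),
    ]
    for m1, m2 in filter_pairs(example, min_len=6):
        print(m1[0], m2[0])
```

Real trimmers typically offer a choice here: discard the orphaned mate, or write it to a separate "unpaired" file. Which is appropriate depends on the downstream tool.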

Balancing the "Ifs" and "Buts"

To maximize the benefits of QC and trimming while mitigating the challenges, consider the following best practices:

Use QC Tools Wisely: Start with tools like FastQC to identify quality issues in your raw data. Visualizing quality metrics helps tailor your trimming parameters.
Choose Reliable Trimming Tools: Tools like Trimmomatic, Cutadapt, and BBduk offer adaptive and customizable trimming options. Select one that aligns with your dataset and project goals.
Set Reasonable Parameters: Avoid over-trimming by setting quality thresholds and minimum read lengths that balance data retention and quality improvement.
Test Downstream Effects: Validate the impact of QC and trimming on downstream analyses, such as alignment efficiency, variant calling accuracy, or assembly quality.
Document Your Workflow: Maintain detailed records of the parameters and tools used for QC and trimming. This ensures reproducibility and enables better troubleshooting.
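
To make the "quality threshold" and "minimum read length" parameters concrete, here is a minimal sketch of trailing-base quality trimming. It assumes Phred+33 encoded quality strings, and it is deliberately simpler than the algorithms in Trimmomatic, Cutadapt, or BBDuk; the function names and default values are illustrative only.

```python
from typing import List, Optional, Tuple


def phred33(qual: str) -> List[int]:
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in qual]


def trim_3prime(
    seq: str, qual: str, min_q: int = 20, min_len: int = 36
) -> Optional[Tuple[str, str]]:
    """Remove trailing bases whose quality is below min_q.

    Returns the trimmed (sequence, quality) pair, or None when the read
    ends up shorter than min_len and should be discarded.
    """
    scores = phred33(qual)
    end = len(seq)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' decodes to Q40 and '#' to Q2 under Phred+33.
    print(trim_3prime("ACGTACGT", "IIIII###", min_q=20, min_len=4))
```

Raising `min_q` or `min_len` discards more data, which is the over-trimming trade-off described above, so it is worth recording the values used alongside the results.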

Conclusion

NGS quality control and trimming are essential steps to ensure reliable and accurate data for analysis. While the "ifs" highlight the clear benefits of these steps, the "buts" remind us of the potential pitfalls. By adopting best practices and carefully balancing these considerations, you can optimize your preprocessing workflow and unlock the full potential of your sequencing data.

A comprehensive atlas of human gene activity released !!!

Abhimanyu Singh — Tue, 02 Sep 2014 14:20:24 -0500

A large international consortium of researchers has produced the first comprehensive, detailed map of the way genes work across the major cells and tissues of the human body. The findings describe the complex networks that govern gene activity, and the new information could play a crucial role in identifying the genes involved with disease.

We are able to pinpoint the regions of the genome that can be active in a disease and in normal activity, whether it’s in a brain cell, the skin, in blood stem cells or in hair follicles. This is a major advance that will greatly increase our ability to understand the causes of disease across the body.

The research is outlined in a series of papers published March 27, 2014, two in the journal Nature and 16 in other scholarly journals. The work is the result of years of concerted effort among 250 experts from more than 20 countries as part of FANTOM 5 (Functional Annotation of the Mammalian Genome). The FANTOM project, led by the Japanese institution RIKEN, is aimed at building a complete library of human genes.

Researchers studied human and mouse cells using a new technology called Cap Analysis of Gene Expression (CAGE), developed at RIKEN, to discover how 95% of all human genes are switched on and off. These “switches” — called “promoters” and “enhancers” — are the regions of DNA that manage gene activity. The researchers mapped the activity of 180,000 promoters and 44,000 enhancers across a wide range of human cell types and tissues and, in most cases, found they were linked with specific cell types.

Reference: www.kurzweilai.net/first-comprehensive-atlas-of-human-gene-activity-released

My commonly used commands in Bioinformatics

Rahul Nayak — Thu, 26 Jul 2018 04:58:45 -0500

FYI, I've found it useful to use MUMmer to extract the specific changes that Racon makes, so I can evaluate them individually:

minimap -t 24 assembly.fasta long_reads.fastq.gz | racon -t 24 long_reads.fastq.gz - assembly.fasta racon_assembly.fasta
nucmer -p nucmer assembly.fasta racon_assembly.fasta
show-snps -C -T -r nucmer.delta

This reports Racon's changes in a table. You can exclude indels with the -I option in show-snps.
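
If you want a quick tally of the changes by type, something like the following sketch may help. It assumes the tab-delimited (-T) output has the reference position in the first column, the reference base in the second, and the assembly-after-Racon base in the third, with indels shown as "." in one of the two base columns; check this against your own show-snps output before relying on it. The function name and the sample rows are made up for illustration.

```python
from typing import Dict, Iterable


def classify_changes(lines: Iterable[str]) -> Dict[str, int]:
    """Count substitutions, insertions and deletions in show-snps -T output.

    Assumed column layout (verify against your own output):
    position in reference, reference base, query base, position in query, ...
    A '.' in a base column is treated as a gap, i.e. an indel.
    """
    counts = {"substitution": 0, "insertion": 0, "deletion": 0}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header lines do not start with an integer position, so skip them.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        ref_base, qry_base = fields[1], fields[2]
        if ref_base == ".":
            counts["insertion"] += 1   # base present only in the query
        elif qry_base == ".":
            counts["deletion"] += 1    # base present only in the reference
        else:
            counts["substitution"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical rows, not real Racon results.
    sample = [
        "[P1]\t[SUB]\t[SUB]\t[P2]",
        "101\tA\tG\t101\t50\t101\t1\t1\tcontig_1\tcontig_1",
        "250\t.\tT\t251\t20\t250\t1\t1\tcontig_1\tcontig_1",
        "400\tC\t.\t400\t10\t400\t1\t1\tcontig_1\tcontig_1",
    ]
    print(classify_changes(sample))
```

In practice you would redirect the show-snps output to a file and pass the open file handle to `classify_changes`.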

This process (Racon -> MUMmer -> SNP table) solves the problem I originally raised in this issue. So as far as I'm concerned, you can close this issue (or keep it open if you still want to implement some kind of variant table).

Which of the following programming languages is best for a bioinformatics beginner?

Manisha Mishra — Thu, 04 Sep 2014 07:51:16 -0500

I will be doing NGS in the course of my research work and I would like to learn a programming language that is compatible with most bioinformatics tools and software. I mainly want to do de novo assembly, read mapping, read alignment, and expression analysis. Recommendations are welcome. Which languages would you recommend to a student wishing to enter the world of bioinformatics?

Referee: Genome assembly quality scores

Jit — Sun, 04 Nov 2018 16:44:30 -0600

Modern genome sequencing technologies provide a succinct measure of quality at each position in every read; however, all of this information is lost in the assembly process. Referee summarizes the quality information from the reads that map to a site in an assembled genome to calculate a quality score for each position in the genome assembly.

We accomplish this by first calculating genotype likelihoods for every site. For a given site in a diploid genome, there are 10 possible genotypes (AA, AC, AG, AT, CC, CG, CT, GG, GT, TT). Referee takes as input the genotype likelihoods calculated for all 10 genotypes given the called reference base at each position.
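
As a rough illustration of what "genotype likelihoods for all 10 genotypes" means, the sketch below uses a common textbook error model: each read base is drawn from one of the two alleles with equal probability, and a base with Phred quality Q is wrong with probability 10^(-Q/10), the error being spread evenly over the other three bases. This is an assumption made for the example and is not claimed to be the exact model or code used by Referee; the function name is hypothetical.

```python
import math
from itertools import combinations_with_replacement
from typing import Dict, Sequence

BASES = "ACGT"
# The 10 unordered diploid genotypes: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT.
GENOTYPES = ["".join(g) for g in combinations_with_replacement(BASES, 2)]


def genotype_log_likelihoods(
    bases: Sequence[str], quals: Sequence[int]
) -> Dict[str, float]:
    """Return log10 P(reads | genotype) for each of the 10 genotypes.

    bases: the base observed in each read covering the site.
    quals: the matching Phred quality scores (integers).
    """
    result = {}
    for genotype in GENOTYPES:
        total = 0.0
        for base, q in zip(bases, quals):
            err = 10 ** (-q / 10)
            # Average over the two alleles of the genotype.
            p = sum((1 - err) if base == allele else err / 3
                    for allele in genotype) / 2
            total += math.log10(p)
        result[genotype] = total
    return result


if __name__ == "__main__":
    lls = genotype_log_likelihoods("AAAC", [30, 30, 30, 30])
    best = max(lls, key=lls.get)
    print(best, round(lls[best], 3))
```

A per-site quality score can then be derived by comparing the likelihood of genotypes that contain the called base with those that do not; how Referee does this is described in its documentation.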

Referee is a program to calculate a quality score for every position in a genome assembly. This allows for easy filtering of low quality sites for any downstream analysis.

https://github.com/gwct/referee

Address of the bookmark: https://gwct.github.io/referee/#

Research Scientist – Bioinformatics at Sidra Medical and Research Center

Wed, 10 Sep 2014 14:35:35 -0500

Sidra Medical and Research Center (Doha, Qatar) is looking for talented Research Scientists (Bioinformatics / NGS Data Analysis).

Research Scientists within the Bioinformatics Program are involved in research related to cutting-edge genomics and the analysis of omics data. The research will apply concepts, theories and best practices from the bioinformatics discipline to the analysis of biological and other biomedical data. The role may also involve designing databases, algorithms and/or computational methods for analyzing genomics and other omics data. The scientist will work closely with the Translational Medicine Program within a state-of-the-art research setting.

Please check the details of the opening and apply here: http://careers.sidra.org/sidra/Vacan...acancyID=60181

jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator

Abhimanyu Singh — Fri, 26 Jul 2019 00:58:12 -0500

jackalope simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina and Pacific Biosciences (PacBio) platforms. It can either read reference genomes from FASTA files or simulate new ones. Genomic variants can be simulated using summary statistics, phylogenies, Variant Call Format (VCF) files, and coalescent simulations, the latter of which can include selection, recombination, and demographic fluctuations. jackalope can simulate single, paired-end, or mate-pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/PCR duplicates. All outputs can be written to standard file formats.

A swift, versatile phylogenomic and high-throughput sequencing simulator https://jackalope.lucasnell.com

Address of the bookmark: https://github.com/lucasnell/jackalope

INTERNSHIP @ NIPGR

Sat, 13 Sep 2014 16:02:35 -0500

Applications are invited from suitable candidates for a six-month 'Training Fellowship' at the National Institute of Plant Genome Research (NIPGR).

About National Institute Of Plant Genome Research (NIPGR) http://www.nipgr.res.in/

The National Institute of Plant Genome Research is an autonomous institution supported by the Department of Biotechnology, Government of India. It aims to become a premier institution for plant genomic research in the country. It was established as part of the national effort to meet the challenges of fast-paced international genomic research and to seize the long-term opportunities it offers.

About the Internship:

The selected intern(s) will work in the area of Bioinformatics under the BTISNET program of DBT in the Distributed Information Sub center (DISC) facility at NIPGR, New Delhi, under the supervision of Dr. Gitanjali Yadav, Scientist, NIPGR.

Who can apply:

Students currently pursuing the final year of a Master's degree (or equivalent) in Bioinformatics/Biotechnology, with a strong interest in Computational Biology and a first class/division throughout their academic career, may apply.