BOL: Related items

Darwin-WGA: A Co-processor Provides Increased Sensitivity in Whole Genome Alignments with High Speedup

Jit — Sat, 13 Apr 2019 08:55:31 -0500

Darwin-WGA is the first hardware accelerator for whole genome alignment (WGA), and it accelerates the gapped filtering stage. It also employs GACT-X, a novel algorithm used in the extension stage to align arbitrarily long genome sequences using a small on-chip memory. GACT-X provides better-quality alignments with a 2× improvement in memory and speed over the previously published GACT algorithm. Implemented on an FPGA, Darwin-WGA provides up to a 24× improvement in performance per dollar for WGA over iso-sensitive software.

https://stanford.edu/~yatisht/pubs/darwin-wga.pdf

Address of the bookmark: https://github.com/gsneha26/Darwin-WGA

U-Plot: Genome U-Plot sample implementation

Rahul Nayak — Tue, 03 Mar 2020 01:39:12 -0600

The Genome U-Plot is a JavaScript tool that visualizes chromosomal abnormalities in the human genome using a U-shaped layout.

Address of the bookmark: https://github.com/gaitat/GenomeUPlot

SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs

Neel — Sun, 20 Dec 2020 01:39:54 -0600

SneakySnake is presented by its authors as the first and only pre-alignment filtering algorithm that works efficiently on modern CPU, FPGA, and GPU architectures. It speeds up sequence alignment calculation by more than two orders of magnitude for both short (Illumina) and long (ONT and PacBio) reads. Described by Alser et al. (preliminary version at https://arxiv.org/abs/1910.09020).
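To illustrate the general idea of a pre-alignment filter (cheaply rejecting read/reference pairs that cannot align within an edit threshold, so that the expensive aligner runs on fewer pairs), here is a minimal sketch. It is not the SneakySnake algorithm; it uses the classic q-gram lemma as a stand-in technique, and the function names and the default q=3 are assumptions made for illustration.

```python
from collections import Counter


def qgram_profile(seq: str, q: int) -> Counter:
    """Count every overlapping substring of length q in seq."""
    return Counter(seq[i:i + q] for i in range(len(seq) - q + 1))


def passes_qgram_filter(read: str, ref: str, max_edits: int, q: int = 3) -> bool:
    """Return False only when the pair cannot be within max_edits edits.

    q-gram lemma: two strings within edit distance E share at least
    max(len(read), len(ref)) - q + 1 - q * E q-grams (as multisets).
    Pairs that pass may still fail the full alignment later.
    """
    shared = sum((qgram_profile(read, q) & qgram_profile(ref, q)).values())
    required = max(len(read), len(ref)) - q + 1 - q * max_edits
    return shared >= required


# Hypothetical toy sequences: one substitution apart, then unrelated.
print(passes_qgram_filter("ACGTACGTAC", "ACGTTCGTAC", max_edits=1))
print(passes_qgram_filter("AAAAAAAAAA", "CCCCCCCCCC", max_edits=1))
```

A filter like this never discards a true match under the stated threshold, but it can let false candidates through; the value of a pre-alignment filter comes from how many of those it rejects per unit of compute.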

Address of the bookmark: https://github.com/CMU-SAFARI/SneakySnake

Simons Genome Diversity Project

Jit — Sat, 08 May 2021 21:55:25 -0500

Complete genome sequences from more than one hundred diverse human populations

All genomes in the dataset were sequenced to at least 30x coverage using Illumina technology. The sequencing reads were mapped and genotyped using a customized procedure that was optimized for population genetic analysis. The researchers eliminated bias of alleles toward matching the human genome reference sequence, and determined genotypes on a single-sample basis to avoid preferential calling of genotypes from populations that had more individuals represented.

Address of the bookmark: https://www.simonsfoundation.org/simons-genome-diversity-project/

HISAT2 Index Files Download

LEGE — Wed, 15 Sep 2021 22:17:49 -0500

A resource for downloading all HISAT2-related files, including prebuilt genome indexes.

Please cite:

Kim, D., Paggi, J.M., Park, C. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915 (2019). https://doi.org/10.1038/s41587-019-0201-4

Address of the bookmark: http://daehwankimlab.github.io/hisat2/download/#h-sapiens

HIV genome database

Rahul Nayak — Fri, 21 Jan 2022 05:40:15 -0600

HIV resources

Address of the bookmark: https://www.hiv.lanl.gov/components/sequence/HIV/search/search.html

Human Complete Genome

Shruti Paniwala — Wed, 06 Jul 2022 06:42:55 -0500

Telomere-to-telomere consortium

We have sequenced the CHM13hTERT human cell line with a number of technologies. Human genomic DNA was extracted from the cultured cell line. As the DNA is native, modified bases will be preserved. The data includes 30x PacBio HiFi, 120x coverage of Oxford Nanopore, 70x PacBio CLR, 50x 10X Genomics, as well as BioNano DLS and Arima Genomics HiC. Most raw data is available from this site, with the exception of the PacBio data which was generated by the University of Washington/PacBio and is available from NCBI SRA.

A UCSC browser is available for v2.0 (as well as legacy v1.0 and v1.1 versions). An interactive dotplot visualization of all genomic repeats is also available from resgen.io. Known issues identified in the assembly are tracked at CHM13 issues.

More at https://github.com/marbl/CHM13

Address of the bookmark: https://www.science.org/doi/10.1126/science.abj6987

Genome Context Viewer (GCV)

LEGE — Sun, 21 May 2023 19:33:43 -0500

The Genome Context Viewer (GCV) is a web-app that visualizes genomic context data provided by third party services. Specifically, it uses functional annotations as a unit of search and comparison. By adopting a common set of annotations, data-store operators can deploy federated instances of GCV, allowing users to compare genomes from different providers in a single interface.

Address of the bookmark: https://github.com/legumeinfo/gcv

Entire Human Genome Sequencing

LEGE — Tue, 02 Apr 2024 01:19:29 -0500

Cost-effective whole human genome sequencing has revolutionized the landscape of genetic research and personalized medicine by making comprehensive genetic analysis accessible to a wider population. Through advancements in sequencing technologies, such as next-generation sequencing (NGS), costs have significantly decreased, enabling researchers and healthcare providers to analyze an individual's complete genetic makeup with greater efficiency and affordability. This has profound implications for disease diagnosis, prognosis, and treatment, as it allows for the identification of genetic predispositions and the customization of healthcare interventions based on an individual's unique genetic profile. Moreover, as the cost continues to decline, the potential for population-scale genomic studies and large-scale screening programs becomes increasingly feasible, promising to further enhance our understanding of human genetics and improve healthcare outcomes on a global scale.

Here are a few companies that offer whole genome sequencing:

https://mynucleus.com/

https://myome.com/

https://nebula.org/whole-genome-sequencing-dna-test/

piRNA and Bioinformatics: Decoding the Guardians of the Genome

LEGE — Sat, 07 Dec 2024 02:15:11 -0600

In the symphony of small RNAs, PIWI-interacting RNAs (piRNAs) stand out as the protectors of genomic integrity. These small, non-coding RNAs play critical roles in silencing transposable elements, regulating gene expression, and maintaining germline stability. The rise of bioinformatics has revolutionized our understanding of piRNAs, enabling researchers to decipher their biogenesis, functions, and evolutionary significance.

What Are piRNAs?

piRNAs are the largest class of small non-coding RNAs, typically 24–32 nucleotides in length. Unlike microRNAs (miRNAs) and small interfering RNAs (siRNAs), piRNAs do not rely on Dicer enzymes for maturation. Instead, they are processed from long single-stranded precursors and associate with PIWI proteins, a subclass of the Argonaute protein family.
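As a rough illustration of how the length range above is used in practice, the sketch below keeps only reads whose length falls in the 24–32 nt window quoted in the text. The function name and the toy reads are hypothetical, and length alone does not prove that a read is a piRNA; it only narrows the candidate set.

```python
from typing import Iterable, List

PIRNA_MIN_LEN = 24  # lower bound quoted in the text
PIRNA_MAX_LEN = 32  # upper bound quoted in the text


def filter_by_pirna_length(
    reads: Iterable[str],
    min_len: int = PIRNA_MIN_LEN,
    max_len: int = PIRNA_MAX_LEN,
) -> List[str]:
    """Keep reads whose length lies within [min_len, max_len], inclusive."""
    return [read for read in reads if min_len <= len(read) <= max_len]


# Hypothetical toy reads of 22, 24, 28, and 40 nt.
reads = ["ACGT" * 5 + "AC", "ACGT" * 6, "ACGT" * 7, "ACGT" * 10]
candidates = filter_by_pirna_length(reads)
print([len(read) for read in candidates])
```

The 22 nt read, which is closer to the typical miRNA size, and the 40 nt read are both dropped, leaving the 24 nt and 28 nt candidates.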

The primary functions of piRNAs include:

Silencing Transposable Elements: By targeting transposons, piRNAs prevent genomic instability, particularly in germline cells.
Regulating Gene Expression: piRNAs modulate gene expression at transcriptional and post-transcriptional levels.
Epigenetic Modulation: They guide epigenetic modifications, such as DNA methylation, to specific genomic loci.

Challenges in piRNA Research

Studying piRNAs is fraught with challenges, including:

Short Length: Their small size complicates sequencing and alignment.
Lack of Sequence Conservation: Unlike miRNAs, piRNAs exhibit limited sequence conservation across species.
Complex Biogenesis: The intricate pathways of piRNA generation require sophisticated computational tools to unravel.

Bioinformatics: Illuminating the World of piRNAs

Bioinformatics has emerged as an indispensable tool for studying piRNAs, facilitating their discovery, annotation, and functional analysis. Here's how bioinformatics is transforming piRNA research:

1. Identification and Annotation

The discovery of piRNAs relies on next-generation sequencing (NGS) data. Bioinformatics tools such as piRNApredictor and Piano predict piRNAs from sequence features, while cluster-detection tools such as proTRAC locate piRNA clusters in the genome. Databases like piRBase and piRNAdb curate information about known piRNAs, their sequences, and associated proteins.

2. Mapping and Alignment

piRNAs often originate from repetitive regions, making their alignment challenging. Tools like Bowtie and STAR handle the unique mapping requirements of piRNAs, enabling accurate identification of piRNA clusters in genomes.
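The difficulty with repetitive regions can be seen in a toy example: a short read copied from a repeat matches the reference at several positions, so its true origin is ambiguous. The sketch below uses exact string matching only; it is not how Bowtie or STAR work internally, and the sequences and function name are hypothetical.

```python
from typing import List


def find_all_hits(read: str, reference: str) -> List[int]:
    """Return every 0-based position where read matches reference exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Step by one so that overlapping matches are also reported.
        start = reference.find(read, start + 1)
    return hits


# Toy reference containing three copies of the same repeat unit.
repeat = "ACGTTGCA"
reference = "GG" + repeat + "TTT" + repeat + "CC" + repeat

multi_hits = find_all_hits(repeat, reference)      # read taken from the repeat
unique_hits = find_all_hits("GGACG", reference)    # read spanning unique flank
print(len(multi_hits), len(unique_hits))
```

A read drawn from the repeat maps to three places, while a read that extends into unique flanking sequence maps to one. This is why the handling of multi-mapping reads matters so much when locating piRNA clusters.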

3. Functional Analysis

Bioinformatics approaches predict piRNA functions by analyzing their interactions with transposons, genes, and epigenetic marks. Algorithms such as TargetFinder and RIblast explore piRNA-mRNA interactions, shedding light on regulatory networks.
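At its simplest, looking for a piRNA target means searching a transcript for a site that is complementary to the piRNA. The sketch below does this for perfect complementarity only. It is an illustrative assumption, not the method used by TargetFinder or RIblast, and real target prediction usually tolerates mismatches. All names and sequences are hypothetical.

```python
from typing import List

# Watson-Crick complement table for RNA bases.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_complementary_sites(pirna: str, transcript: str) -> List[int]:
    """Return 0-based transcript positions fully complementary to pirna."""
    site = reverse_complement_rna(pirna)
    positions = []
    start = transcript.find(site)
    while start != -1:
        positions.append(start)
        start = transcript.find(site, start + 1)
    return positions


# Toy example: a 5 nt guide and a short transcript holding one target site.
print(reverse_complement_rna("UGAUC"))
print(find_complementary_sites("UGAUC", "AAGAUCAGG"))
```

The guide is reverse-complemented once and then searched for as a plain substring, which keeps the sketch short; a more realistic scorer would weight a seed region and allow a limited number of mismatches.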

4. Evolutionary Studies

piRNAs are evolutionarily diverse, reflecting their roles in species-specific genomic defense. Comparative genomics tools help trace the evolution of piRNA clusters and their associated PIWI proteins across species.

5. Epigenomic Insights

piRNAs are key players in epigenetic regulation. Bioinformatics pipelines integrate piRNA data with chromatin immunoprecipitation sequencing (ChIP-seq) and DNA methylation data to uncover their role in shaping the epigenome.

Case Study: piRNAs in Germline Integrity

One of the hallmark functions of piRNAs is the suppression of transposable elements in the germline. For example, in Drosophila melanogaster, piRNAs target retrotransposons like gypsy and copia. Bioinformatics analyses revealed that these piRNAs guide PIWI proteins to transposon-derived RNA, ensuring genome stability during gametogenesis.

Clinical Relevance of piRNAs

Recent studies suggest that piRNAs may serve as biomarkers for diseases such as cancer, infertility, and neurodegenerative disorders. For instance:

Cancer: Dysregulated piRNA expression has been linked to tumorigenesis, making them potential targets for cancer therapies.
Infertility: Aberrant piRNA pathways are implicated in male infertility due to their role in spermatogenesis.
Neurodegeneration: piRNAs may regulate neuronal gene expression, highlighting their potential in neurological research.

Future Directions

The integration of bioinformatics with emerging technologies offers exciting opportunities for piRNA research:

Single-Cell Sequencing: Unveiling cell-specific piRNA expression and function.
Machine Learning: Predicting piRNA functions and targets with greater accuracy.
CRISPR-Based Tools: Editing piRNA clusters to explore their roles in vivo.

Conclusion

piRNAs are the unsung guardians of the genome, safeguarding genetic material from transposable elements and contributing to gene regulation and epigenetic programming. Bioinformatics has opened the floodgates of discovery, unraveling the complexities of piRNAs and their myriad roles in biology and disease.

As we continue to decode the piRNA landscape, these small RNAs promise to unveil big secrets about genome stability, evolution, and human health, cementing their place as a fascinating frontier in molecular biology.