Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve such foundational processes and translate emerging...
My main topics of interest are:
The impact of non-tree-like evolution, such as horizontal gene transfer and hybridization, on species biology
Evolution and adaptation of animals in the absence of sexual reproduction and the underlying...
daehwankimlab.github.io - Resource for downloading all HISAT2-related files
Please cite:
Kim, D., Paggi, J.M., Park, C. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915...
github.com - Generate unique k-mers for every contig in a FASTA file.
A contig's unique k-mers are the k-mer keys (e.g. ATCGATCCTTAAGG) that are present in that contig only and in no other contig, on either the forward or the reverse strand.
This tool...
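The definition above can be illustrated with a minimal Python sketch. It is not the linked tool's implementation: the function names (`reverse_complement`, `canonical`, `unique_kmers`) and the dictionary-based input are hypothetical, chosen only for illustration. It assumes that collapsing each k-mer with its reverse complement into one canonical key is an acceptable way to handle both strands, and it skips any k-mer containing a base other than A, C, G, or T.

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return one key for a k-mer and its reverse complement
    (the lexicographically smaller of the two)."""
    return min(kmer, reverse_complement(kmer))


def unique_kmers(contigs, k):
    """Map each contig name to the canonical k-mers found in that contig only.

    contigs: dict of contig name -> sequence (hypothetical input format;
    a real tool would parse these from a FASTA file).
    """
    owners = defaultdict(set)  # canonical k-mer -> names of contigs containing it
    for name, seq in contigs.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                owners[canonical(kmer)].add(name)

    unique = {name: set() for name in contigs}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one contig, on either strand
            unique[next(iter(names))].add(kmer)
    return unique
```

For example, with `{"c1": "ACGTA", "c2": "TACGG"}` and `k=3`, every 3-mer of `c1` also occurs in `c2` on one strand or the other, so only `c2` retains a unique key.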
www.genome.gov - This meeting's objective was to obtain a big-picture look at the current state of the field of comparative genomics, with a focus on commonalities across genomic investigations into humans, model organisms (both traditional and...
https://js.cgview.ca/ - CGView.js is a Circular Genome Viewing tool for visualizing and interacting with small genomes. This software is an adaptation of the Java program CGView.
CGView.js is the genome viewer of Proksee, an expert system for genome...
github.com - LRCstats is an open-source pipeline for benchmarking DNA long-read correction algorithms on reads output by third-generation sequencing technologies, such as machines produced by Pacific Biosciences. The reads produced by third-generation...
github.com - INC-Seq reads enabled accurate species-level classification, identification of species at 0.1% abundance and robust quantification of relative abundances, providing a cheap and effective approach for pathogen detection and microbiome profiling...
bitbucket.org - SimLoRD is a read simulator for third-generation sequencing reads and is currently focused on the Pacific Biosciences SMRT error model.
Reads are simulated from both strands of a provided or randomly generated reference sequence.
The reference...