Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve such foundational processes and translate emerging...
My main topics of interest are:
The impact of non-tree-like evolution, such as horizontal gene transfer and hybridization, on species biology
Evolution and adaptation of animals in the absence of sexual reproduction and the underlying...
daehwankimlab.github.io - Resource for downloading all HISAT2-related files
Please cite:
Kim, D., Paggi, J.M., Park, C. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915...
github.com - Generate unique k-mers for every contig in a FASTA file.
A unique k-mer set consists of k-mer keys (e.g. ATCGATCCTTAAGG) that are present in exactly one contig and absent from every other contig, on both the forward and reverse strands.
This tool...
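The definition of a unique k-mer above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names (`reverse_complement`, `canonical`, `unique_kmers`) are hypothetical, and it assumes that strand-independence is handled by storing each k-mer in canonical form (the lexicographically smaller of the k-mer and its reverse complement). Contigs are assumed to be already parsed into a name-to-sequence dictionary.

```python
# Hypothetical sketch: per-contig unique k-mers, strand-independent.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def unique_kmers(contigs, k):
    """Map each contig name to the canonical k-mers found in no other contig.

    contigs: dict of contig name -> sequence (assumed input format).
    """
    # canonical k-mer -> owning contig name, or None once seen in 2+ contigs
    owners = {}
    for name, seq in contigs.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip windows containing N or other ambiguity codes
            key = canonical(kmer)
            if key not in owners:
                owners[key] = name
            elif owners[key] != name:
                owners[key] = None  # shared between contigs, so not unique

    result = {name: set() for name in contigs}
    for key, name in owners.items():
        if name is not None:
            result[name].add(key)
    return result
```

For example, `unique_kmers({"c1": "AAACG", "c2": "CGTTG"}, 3)` keeps only `AAA` for `c1` and `CAA` for `c2`, because `AAC`/`GTT` and `ACG`/`CGT` are reverse complements of each other and therefore count as shared. A k-mer repeated within a single contig still counts as unique to that contig under this reading of the definition.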
www.genome.gov - This meeting's objective was to obtain a big-picture look at the current state of the field of comparative genomics, with a focus on commonalities across genomic investigations into humans, model organisms (both traditional and...
lncRNAs are the hidden gems of the genome, and bioinformatics is the key to unearthing their full potential. As research progresses, lncRNAs could pave the way for novel diagnostics, targeted therapies, and personalized medicine, revolutionizing...
github.com - Third-generation sequencing (TGS) technologies are highly promising, but their long and noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long...
github.com - The pipeline is built on the popular workflow framework Nextflow and comprises four core procedures: read alignment, assembly, identification, and quantification. It contains various unique features such as well-designed...
code.google.com - SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing...