Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve such foundational processes and translate emerging...
My main topics of interest are:
The impact of non-tree-like evolution, such as horizontal gene transfer and hybridization, on species biology
Evolution and adaptation of animals in the absence of sexual reproduction and the underlying...
daehwankimlab.github.io - Resource for downloading all HISAT2-related files
Please cite:
Kim, D., Paggi, J.M., Park, C. et al. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915...
github.com - Generate unique k-mers for every contig in a FASTA file.
A contig's unique k-mers are the k-mer keys (e.g. ATCGATCCTTAAGG) that are present in that contig only and absent from every other contig, on both the forward and reverse strands.
This tool...
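The definition above can be illustrated with a minimal Python sketch. It is not the tool's own code: the function names (`reverse_complement`, `canonical`, `unique_kmers`) and the dictionary input are hypothetical. The sketch assumes that a k-mer and its reverse complement count as the same key, that windows containing characters other than A, C, G, T are skipped, and that a k-mer repeated within a single contig still counts as unique to that contig.

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Collapse a k-mer and its reverse complement into one key
    (the lexicographically smaller of the two)."""
    return min(kmer, reverse_complement(kmer))


def unique_kmers(contigs, k):
    """Map each contig name to the canonical k-mers found in no other contig.

    contigs: dict of contig name -> sequence (hypothetical input format).
    """
    owners = defaultdict(set)  # canonical k-mer -> names of contigs containing it
    for name, seq in contigs.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: skip windows with ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                owners[canonical(kmer)].add(name)

    result = {name: set() for name in contigs}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one contig, on either strand
            result[next(iter(names))].add(kmer)
    return result
```

For example, with contigs `AAAC`, `GTTT` and `ACGG` and k = 3, the first two share all their k-mers through reverse complementarity, so only the third contig keeps any unique k-mers.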
www.genome.gov - This meeting's objective was to obtain a big-picture view of the current state of comparative genomics, with a focus on commonalities across genomic investigations into humans, model organisms (both traditional and...
https://js.cgview.ca/ - CGView.js is a Circular Genome Viewing tool for visualizing and interacting with small genomes. This software is an adaptation of the Java program CGView.
CGView.js is the genome viewer of Proksee, an expert system for genome...
musket.sourceforge.net - Musket is a well-established, leading next-generation sequencing read error correction algorithm targeting Illumina sequencing. This corrector employs the k-mer spectrum approach and introduces three correction techniques in a multistage...
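As a rough illustration of the general k-mer spectrum idea (not Musket's actual algorithm or its correction techniques), the sketch below counts every k-mer across a set of reads and flags positions in a read whose k-mer is rarer than a threshold. The function names and the `min_count` threshold are hypothetical, and the sketch assumes that rare k-mers are the ones most likely to contain sequencing errors.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def untrusted_positions(read, counts, k, min_count):
    """Return start indices of k-mers in `read` seen fewer than `min_count` times.

    Assumption: k-mers below the (hypothetical) threshold are treated as
    likely to contain an error; a real corrector would go on to repair them.
    """
    return [
        i
        for i in range(len(read) - k + 1)
        if counts[read[i:i + k]] < min_count
    ]
```

With three copies of the read `ACGTAC` and one copy of `ACGAAC`, using k = 3 and `min_count` = 2, the k-mers of the odd read that overlap its fourth base (start indices 1, 2 and 3) are flagged, while the shared leading k-mer `ACG` is not.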
www.zbh.uni-hamburg.de - Tallymer is based on enhanced suffix arrays. This gives much greater flexibility in the choice of k-mer size. Tallymer can process large data sets of several billion bases. We used it in a variety of applications to study the...
sourceforge.net - SuRankCo is machine-learning-based software that scores and ranks contigs from de novo assemblies of next-generation sequencing data. It trains on alignments of contigs to known reference genomes and predicts scores and rankings for contigs which...